Internationalization in JSF with UTF-8 encoded properties files

29 October 2010, by: Development

Introduction

To internationalize your (web) application, the normal approach is to use the ResourceBundle API in combination with properties files which contain the externalized text. The ResourceBundle API will load the proper text based on the current Locale and the default (fallback) locale. This sounds nicer than it actually is. Deep under the covers, the ResourceBundle API uses the Properties#load(InputStream) method to load the properties files. By default, this method uses the ISO-8859-1 encoding, which is explicitly mentioned in its javadoc as well. Here’s the relevant extract:

The load(InputStream) / store(OutputStream, String) methods work the same way as the load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in ISO 8859-1 character encoding. Characters that cannot be directly represented in this encoding can be written using Unicode escapes; only a single ‘u’ character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.

In a nutshell, you can’t have something like the following as UTF-8 in your properties files:

some.dutch.text = Één van de wijken van Curaçao heet Saliña.

When incorrectly decoded using ISO-8859-1, it would end up as mojibake:

Ã?Ã©n van de wijken van CuraÃ§ao heet SaliÃ±a.

For example, the character ç ‘LATIN SMALL LETTER C WITH CEDILLA’ (U+00E7) is encoded in UTF-8, which is a variable-width multi-byte encoding, as the two bytes 0xC3 and 0xA7. In ISO-8859-1, which is an 8-bit single-byte encoding, those two bytes are separately represented as Ã and §, see also this ISO-8859-1 codepage. Note that the second byte of É ‘LATIN CAPITAL LETTER E WITH ACUTE’ (U+00C9), which is 0x89, doesn’t represent a printable character in ISO-8859-1. In many environments such a character will be replaced by a question mark ?.
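The byte-level mix-up described above can be reproduced in a few lines of Java. This is only a minimal sketch: the class name MojibakeDemo and its helper methods are made up for illustration, and it assumes Java 7 or newer because of StandardCharsets. The non-ASCII characters are written as Unicode escapes so that the outcome doesn’t depend on the encoding of the source file itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the article's own code.
class MojibakeDemo {

    // Encodes the text as UTF-8 and then (wrongly) decodes those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Formats the UTF-8 bytes of the text as space-separated uppercase hex.
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String cedilla = "\u00E7"; // LATIN SMALL LETTER C WITH CEDILLA

        // The single character becomes two bytes in UTF-8.
        System.out.println(utf8Hex(cedilla)); // prints "C3 A7"

        // Decoded as ISO-8859-1, each byte turns into a character of its own:
        // U+00C3 (A with tilde) followed by U+00A7 (section sign).
        System.out.println(misdecode(cedilla).equals("\u00C3\u00A7")); // prints "true"
    }
}
```

The comparison is done with equals() instead of printing the mojibake directly, because what a console shows for those characters depends on the console’s own encoding.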

The old solution

The common approach is (was) to escape the characters outside the ASCII range into Unicode escape sequences like \uXXXX, where XXXX is the Unicode code point of the character in hexadecimal. So, for example, the ç has to be escaped as \u00E7. Doing this manually is a pain, so you can use the JDK-supplied native2ascii tool to convert a UTF-8 encoded properties file to an ISO-8859-1 encoded properties file.

/path/to/jdk/bin/native2ascii -encoding UTF-8 text_nl.utf8.properties text_nl.properties

This way the above example will end up like

some.dutch.text = \u00c9\u00e9n van de wijken van Cura\u00e7ao heet Sali\u00f1a.
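To make the escaping concrete, here is a rough Java approximation of what the conversion does to a single line, followed by a round trip through Properties#load(InputStream). It is a sketch only: the class UnicodeEscapeDemo and its methods are hypothetical names, it is not how native2ascii is implemented, and it assumes Java 8 or newer (for UncheckedIOException).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, only approximating the escaping step.
class UnicodeEscapeDemo {

    // Replaces every char outside the ASCII range with a backslash-u escape
    // holding its four-digit lowercase hex value.
    static String escapeNonAscii(String text) {
        StringBuilder escaped = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 0x7F) {
                escaped.append(String.format("\\u%04x", (int) c));
            } else {
                escaped.append(c);
            }
        }
        return escaped.toString();
    }

    // Loads properties content through load(InputStream), which decodes the
    // bytes as ISO-8859-1 and then resolves any escape sequences.
    static String loadValue(String propertiesContent, String key) {
        Properties properties = new Properties();
        byte[] bytes = propertiesContent.getBytes(StandardCharsets.ISO_8859_1);
        try {
            properties.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties.getProperty(key);
    }

    public static void main(String[] args) {
        String original = "Cura\u00E7ao";
        String line = "some.dutch.text = " + escapeNonAscii(original);

        // The escaped line is pure ASCII, so ISO-8859-1 decoding cannot damage it.
        System.out.println(line);

        // Loading the escaped line yields the original text again.
        System.out.println(loadValue(line, "some.dutch.text").equals(original)); // prints "true"
    }
}
```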

The better solution

Although using the native2ascii tool is to a certain degree automatable (Ant, Maven, etc.), this is still a maintainability pain and prone to human error. Fortunately, since the i18n enhancements in Java 1.6, the Properties API has a new method Properties#load(Reader), which you can feed with an InputStreamReader wherein you can specify the character encoding. To use this in a ResourceBundle, you can make use of the (also new since Java 1.6) ResourceBundle.Control class, wherein you can control the loading of the properties files yourself. Therein you can use the (also new) PropertyResourceBundle constructor taking a Reader instead of an InputStream.
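Before wiring this into a ResourceBundle, the difference between the two loading styles can be seen in isolation. The sketch below feeds the same UTF-8 bytes to Properties#load(InputStream), Properties#load(Reader) and the Reader-based PropertyResourceBundle constructor. The class ReaderLoadingDemo and its method names are invented for this illustration, the “file” is simulated in memory, and Java 8 or newer is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.PropertyResourceBundle;

// Hypothetical demo class comparing stream-based and reader-based loading.
class ReaderLoadingDemo {

    // Simulates a properties file that was saved as UTF-8.
    static InputStream utf8File(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    // load(InputStream) decodes the bytes as ISO-8859-1.
    static String viaInputStream(String content, String key) {
        try (InputStream stream = utf8File(content)) {
            Properties properties = new Properties();
            properties.load(stream);
            return properties.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // load(Reader) leaves the choice of encoding to the caller.
    static String viaReader(String content, String key) {
        try (Reader reader = new InputStreamReader(utf8File(content), StandardCharsets.UTF_8)) {
            Properties properties = new Properties();
            properties.load(reader);
            return properties.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The same kind of Reader can be handed to PropertyResourceBundle.
    static String viaBundle(String content, String key) {
        try (Reader reader = new InputStreamReader(utf8File(content), StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader).getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String content = "some.dutch.text = Cura\u00E7ao";
        String key = "some.dutch.text";

        System.out.println(viaReader(content, key).equals("Cura\u00E7ao"));            // prints "true"
        System.out.println(viaBundle(content, key).equals("Cura\u00E7ao"));            // prints "true"
        System.out.println(viaInputStream(content, key).equals("Cura\u00C3\u00A7ao")); // prints "true"
    }
}
```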

Here’s a JSF-targeted example of such a ResourceBundle class:

package com.example.faces.i18n;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Enumeration;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

import javax.faces.context.FacesContext;

public class Text extends ResourceBundle {

    protected static final String BUNDLE_NAME = "com.example.faces.i18n.text";
    protected static final String BUNDLE_EXTENSION = "properties";
    protected static final Control UTF8_CONTROL = new UTF8Control();

    public Text() {
        setParent(ResourceBundle.getBundle(BUNDLE_NAME, 
            FacesContext.getCurrentInstance().getViewRoot().getLocale(), UTF8_CONTROL));
    }

    @Override
    protected Object handleGetObject(String key) {
        return parent.getObject(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        return parent.getKeys();
    }

    protected static class UTF8Control extends Control {
        @Override
        public ResourceBundle newBundle
            (String baseName, Locale locale, String format, ClassLoader loader, boolean reload)
                throws IllegalAccessException, InstantiationException, IOException
        {
            // The below code is copied from default Control#newBundle() implementation.
            // Only the PropertyResourceBundle line is changed to read the file as UTF-8.
            String bundleName = toBundleName(baseName, locale);
            String resourceName = toResourceName(bundleName, BUNDLE_EXTENSION);
            ResourceBundle bundle = null;
            InputStream stream = null;
            if (reload) {
                URL url = loader.getResource(resourceName);
                if (url != null) {
                    URLConnection connection = url.openConnection();
                    if (connection != null) {
                        connection.setUseCaches(false);
                        stream = connection.getInputStream();
                    }
                }
            } else {
                stream = loader.getResourceAsStream(resourceName);
            }
            if (stream != null) {
                try {
                    bundle = new PropertyResourceBundle(new InputStreamReader(stream, "UTF-8"));
                } finally {
                    stream.close();
                }
            }
            return bundle;
        }
    }

}

Using it in JSF is extraordinarily simple: just define the fully qualified class name instead of the fully qualified bundle name as resource-bundle base-name in faces-config.xml:

    <application>
        <locale-config>
            <default-locale>en</default-locale>
            <supported-locale>nl</supported-locale>
            <supported-locale>es</supported-locale>
        </locale-config>
        <resource-bundle>
            <base-name>com.example.faces.i18n.Text</base-name>
            <var>text</var>
        </resource-bundle>
    </application>

Note the var element. This enables you to reference the bundle as #{text} without the need for an f:loadBundle tag in every view.

This way you can keep your properties files UTF-8 all the way. In the above code example, their location is defined by the BUNDLE_NAME constant in the Text class. It expects them to be inside the com.example.faces.i18n package with the filenames text.properties (for generic content, this file is mandatory anyway), text_en.properties (for English), text_nl.properties (for Dutch) and text_es.properties (for Spanish).
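If it’s unclear which file a given locale maps to, the same toBundleName() and toResourceName() methods that the UTF8Control above calls can be invoked on a stock Control. The following sketch prints the expected classpath locations; the class BundleFileNames and the method resourceNameFor are hypothetical helpers for this illustration only, and Java 7 or newer is assumed for Locale.forLanguageTag.

```java
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical helper class showing the default bundle naming scheme.
class BundleFileNames {

    // Maps a bundle base name and a locale to the classpath resource name
    // that a properties-based Control will look for.
    static String resourceNameFor(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        String bundleName = control.toBundleName(baseName, locale);
        return control.toResourceName(bundleName, "properties");
    }

    public static void main(String[] args) {
        String baseName = "com.example.faces.i18n.text";

        // The root locale maps to the generic fallback file.
        System.out.println(resourceNameFor(baseName, Locale.ROOT));
        // prints "com/example/faces/i18n/text.properties"

        // A language-only locale gets the language code appended.
        System.out.println(resourceNameFor(baseName, Locale.forLanguageTag("nl")));
        // prints "com/example/faces/i18n/text_nl.properties"
    }
}
```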

Changing the locale in JSF/Facelets

As an extra bonus, here’s some code which shows in its simplest form how you could change the locale (actually, the language) from the view side. It’s targeted at JSF 2.x / Facelets, but can be done just as well on JSF 1.x and/or JSP with minor changes.

The view:

<!DOCTYPE html>
<html lang="#{localeBean.language}"
    xmlns:f="http://java.sun.com/jsf/core" 
    xmlns:h="http://java.sun.com/jsf/html">
<f:view locale="#{localeBean.locale}">
    <h:head>
        <title>JSF/Facelets i18n example</title>
    </h:head>
    <h:body>
        <h:form>
            <h:selectOneMenu value="#{localeBean.language}" onchange="submit()">
                <f:selectItem itemValue="en" itemLabel="English" />
                <f:selectItem itemValue="nl" itemLabel="Nederlands" />
                <f:selectItem itemValue="es" itemLabel="Español" />
            </h:selectOneMenu>
        </h:form>
        <p><h:outputText value="#{text['some.text']}" /></p>
    </h:body>
</f:view>
</html>

The bean:

package com.example.faces;

import java.util.Locale;

import javax.faces.bean.ManagedBean;
import javax.faces.bean.SessionScoped;
import javax.faces.context.FacesContext;

@ManagedBean
@SessionScoped
public class LocaleBean {

    private Locale locale = FacesContext.getCurrentInstance().getViewRoot().getLocale();

    public Locale getLocale() {
        return locale;
    }

    public String getLanguage() {
        return locale.getLanguage();
    }

    public void setLanguage(String language) {
        this.locale = new Locale(language);
    }

}

That’s it!

More background information about the world of characters and bytes as it is seen by computers and humans can be found in this article: Unicode – How to get the characters right?.

Bauke Scholtz

6 comments to “Internationalization in JSF with UTF-8 encoded properties files”

  1. Merve Özbey says:

    Thank you Bauke for your solution, it worked fine in our case. We are grateful to you.

  2. Anonymous says:

    Thx a lot!

  3. JSF odkazy | BRAIN SNIPPETS says:

    [...] i18n & UTF8 property file [...]

  4. Andras Liter says:

    A great solution! Thanks for sharing.

  5. Hesam says:

    Thnx Man! it worked for me.

  6. Struts2: UTF .properties | SuperBlog says:

    [...] http://jdevelopment.nl/internationalization-jsf-utf8-encoded-properties-files/ [...]
