Producing CSV files from Java for French and German

I’ve been applying some limited internationalization to my work’s US-centric Java-based web app: producing CSV files that give intelligible results when opened in Excel, without having to switch your Windows regional settings from “French (France)” or “German (Germany)” to “English (United States)”, as our long-suffering (and English-speaking) customers have to do.

I’m using Java’s built-in Locale, DateFormat, and NumberFormat classes. Here are a few things I’ve learned along the way.

Excel can expect ; (semi-colon) as the separator in a “comma-separated values” file

The character that Excel expects as the separator in a CSV file is controlled by the “list separator” setting in Windows regional settings. The possible values are , (comma) and ; (semi-colon).

Here are instructions for finding the “list separator” setting in different versions of Windows:

Note that even if the Windows regional settings are “French (France)”, the user can change the “list separator” setting between ; (semi-colon) and , (comma). So I added a separate “list separator” user preference in our web app.

Excel can’t handle Java’s French month abbreviations

I’d hoped that Java’s DateFormat for Locale “French (France)” would produce formatted dates that can be parsed by Excel with Windows regional settings of “French (France)”. Sadly, that’s not the case for the MEDIUM date format. Java puts a ‘.’ character at the end of month abbreviations e.g. “1 sept. 2011”, which Excel can’t handle; it only parses the months that have four letters or less, such as “mai”.

The solution was to use the LONG date format. This causes Java to emit “1 septembre 2011”, which Excel parses and renders as “01-sept-11”.

Excel expects UTF16-LE character encoding, not UTF-8

Accented (non-ASCII) characters in month names were garbled in Excel. Great, a character encoding mismatch. It seems that Excel expects UTF16-LE character encoding, which is easy enough to set using an OutputStreamWriter:

Writer writer = new BufferedWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-16LE"));

although you’ll need to catch UnsupportedEncodingException, unless you use Guava’s Charsets.

Published 21 August 2011, tags: ,

Comments

You can email me at stephen#viles.nz (change # to @) or tweet me at @svilesnz.

Articles