I’ve been applying some limited internationalization to my work’s US-centric Java-based web app: producing CSV files that give intelligible results when opened in Excel, without having to switch your Windows regional settings from “French (France)” or “German (Germany)” to “English (United States)”, as our long-suffering (and English-speaking) customers have to do.
The character that Excel expects as the separator in a CSV file is controlled by the “list separator” setting in Windows regional settings. The possible values are , (comma) and ; (semi-colon).
Here are instructions for finding the “list separator” setting in different versions of Windows:
Note that even if the Windows regional settings are “French (France)”, the user can change the “list separator” setting between ; (semi-colon) and , (comma). So I added a separate “list separator” user preference in our web app.
I’d hoped that Java’s DateFormat for Locale “French (France)” would produce formatted dates that can be parsed by Excel with Windows regional settings of “French (France)”. Sadly, that’s not the case for the MEDIUM date format. Java puts a ‘.’ character at the end of month abbreviations e.g. “1 sept. 2011”, which Excel can’t handle; it only parses the months that have four letters or less, such as “mai”.
The solution was to use the LONG date format. This causes Java to emit “1 septembre 2011”, which Excel parses and renders as “01-sept-11”.
Accented (non-ASCII) characters in month names were garbled in Excel. Great, a character encoding mismatch. It seems that Excel expects UTF16-LE character encoding, which is easy enough to set using an OutputStreamWriter:
Writer writer = new BufferedWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-16LE"));