18

I have a console program whose umlauts and other special characters are printed as ?'s on Macs. Here's a simple test program:

public static void main( String[] args ) {
    System.out.println("höhößüä");
    System.console().printf( "höhößüä" );
}

On a default Mac console (with default UTF-8 encoding), this prints:

 h?h????
 h?h????

But after manually setting the Mac terminal's encoding to "Mac OS Roman", it correctly printed

 höhößüä
 höhößüä

Note that on Windows systems, only the System.console() call prints correctly:

 h÷h÷▀³õ
 höhößüä

So how do I make my program...rolleyes..."run everywhere"?

2 Answers

13

Try the following command-line argument when starting your application:

-Dfile.encoding=utf-8

This changes the default encoding of the JVM for I/O operations.

You can also try:

System.setOut(new PrintStream(System.out, true, "utf-8"));
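
For what it's worth, here is a minimal sketch of that second option as a complete program. The class name Utf8Console and the helper utf8() are made up for illustration. The string is written with \u escapes so the result does not depend on the encoding of the source file, and the sketch assumes the terminal itself really is set to UTF-8; if it isn't, the bytes will still be misinterpreted.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Hypothetical helper: wraps any stream so that printed text is encoded as UTF-8,
    // independent of the JVM's default charset.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Replace stdout once at startup; later println calls go through the wrapper.
        System.setOut(utf8(System.out));
        System.out.println("h\u00f6h\u00f6\u00df\u00fc\u00e4"); // "höhößüä"
    }
}
```

Doing the replacement once at the top of main() means the rest of the program can keep calling System.out.println unchanged.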

4 Comments

The problem with this is that I'm shipping this console program as a jar, so I'm looking for a solution that doesn't require the user to add their own command-line arguments.
@Epaga - use a bat file or some wrapper for the jar?
-Dfile.encoding=utf-8 is not a standard option and should be avoided.
@Bozho The second option let me print UTF-8 (Hebrew) to my console. However, my ultimate problem is the mapping of servlet URL patterns. When I use Hebrew URLs in the URL patterns, the mappings are displayed as gibberish (I checked in the jconsole MBeans). I thought that fixing the console printing would also fix the mappings, but it doesn't. Do you have any other suggestions?
10

Epaga: have a look right here. You can set the output encoding on a PrintStream; you just have to determine, or be absolutely sure, which encoding the console is using.

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Test {
    public static void main(String[] argv) throws UnsupportedEncodingException {
        String unicodeMessage =
            "\u7686\u3055\u3093\u3001\u3053\u3093\u306b\u3061\u306f";

        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(unicodeMessage);
    }
}

To determine the console encoding, you could run the system command "locale" and parse its output, which on a German UTF-8 system looks like this:

LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL=
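
As a rough sketch of that idea: instead of spawning "locale", the following reads the same values from the environment variables LC_ALL, LC_CTYPE and LANG and pulls the charset out of the part after the dot. The class and method names (LocaleCharset, charsetName, consoleCharset) are hypothetical, and it assumes a POSIX-style environment; where those variables are unset (typically on Windows) it simply falls back to the JVM default charset.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class LocaleCharset {
    // Hypothetical helper: extracts the codeset from a locale value such as
    // de_DE.UTF-8 or "de_DE.ISO-8859-15@euro" (quotes as printed by `locale` are ignored).
    static String charsetName(String localeValue) {
        if (localeValue == null) {
            return null;
        }
        String value = localeValue.replace("\"", "");
        int dot = value.indexOf('.');
        if (dot < 0) {
            return null; // e.g. "C" or "de_DE": no codeset part
        }
        String name = value.substring(dot + 1);
        int at = name.indexOf('@'); // strip a modifier such as "@euro"
        if (at >= 0) {
            name = name.substring(0, at);
        }
        return name.isEmpty() ? null : name;
    }

    // Applies the usual precedence: LC_ALL, then LC_CTYPE, then LANG.
    static Charset consoleCharset(String lcAll, String lcCtype, String lang) {
        for (String value : new String[] { lcAll, lcCtype, lang }) {
            if (value == null || value.isEmpty()) {
                continue; // unset or empty: look at the next variable
            }
            String name = charsetName(value);
            if (name != null) {
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Unknown or malformed charset name: use the default below.
                }
            }
            break; // the first non-empty variable decides
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Charset cs = consoleCharset(
            System.getenv("LC_ALL"), System.getenv("LC_CTYPE"), System.getenv("LANG"));
        PrintStream out = new PrintStream(System.out, true, cs.name());
        out.println("h\u00f6h\u00f6\u00df\u00fc\u00e4"); // "höhößüä"
    }
}
```

Note that this only tells you what the environment claims. A terminal whose encoding was changed by hand, as in the question, can still disagree with it.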

3 Comments

2 problems: 1) how do I know which encoding is set in the console? 2) is there any way to still work with System.console() ?
No, you cannot keep your code using System.console(), and I don't consider that critical. PrintStream also has everything you need, and it works properly. +1 for the answer from me.
1) You could use the system command "locale" and parse its output.
