0

I tried to pass a UTF-8 String through a Java Socket.

The String contains a mix of English and Greek.

My problem is that when the message passes through the socket, Greek characters turn into "?".

I already tried to set the InputStream character set to UTF-8.

Below is my attempt; any help would be appreciated.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        try {
            String msg = "This is a test - Αυτο ειναι μια δοκιμη";
            ServerSocket serverSocket = new ServerSocket(9999);

            Thread host = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        try {
                            Socket socket = serverSocket.accept();

                            if (socket != null) {
                                BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));

                                while (true) {
                                    String line = bufferedReader.readLine();

                                    if (line != null) {
                                        System.out.println(line);
                                    } else if(bufferedReader.read() < 0) {
                                        break;
                                    }
                                }
                            }
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            });

            host.start();

            Socket socket = new Socket("127.0.0.1", 9999);
            PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true);
            printWriter.println(msg);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Edit 1

I build and run my code through IntelliJ IDEA, and that is where I found the problem.

After @Ihar Sadounikau's comment I also updated my JDK and tried building and running through PowerShell, but the problem persists.

This is my result:

& 'C:\Program Files\Java\jdk-13.0.2\bin\java.exe' Main
This is a test - ??τ? ε??α? ??α δ?????

  • Works perfectly without any question mark. Try to recompile the class. I used Java 11 for the test; not sure whether that matters. Commented Mar 8, 2020 at 22:49
  • Javadoc of PrintWriter: "Creates a new PrintWriter from an existing OutputStream. This convenience constructor creates the necessary intermediate OutputStreamWriter, which will convert characters into bytes using the default character encoding." (assuming the console can display the characters) Commented Mar 8, 2020 at 23:40

2 Answers

4

With this line: PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true); you are converting a byte stream (i.e., InputStream / OutputStream) into a character stream (i.e., Reader / Writer). Any time you do that without specifying the encoding, you get the platform default, which is unlikely to be what you want.

You (and @IharSadounikau) are seeing different results because the platform default differs from one machine to the next. That is one of the reasons you really do not want to rely on it, ever: a bug where the code only works when the platform default encoding matches that of the developer's machine is generally untestable.

Try new PrintWriter(socket.getOutputStream(), true, StandardCharsets.UTF_8).
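As a rough illustration of that suggestion, here is a minimal sketch that names UTF-8 explicitly on both the writing and the reading side and then compares the strings in memory, so the console plays no part in the check. The class name Utf8SocketDemo, the roundTrip helper and the use of an OS-assigned port are assumptions made for this example, not part of the question's code. The three-argument PrintWriter constructor needs Java 10 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class Utf8SocketDemo {

    // Sends one line over a loopback socket and returns what the receiving side decoded.
    static String roundTrip(String msg) throws Exception {
        // Port 0 lets the OS pick a free port (the question hard-codes 9999).
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            CompletableFuture<String> received = CompletableFuture.supplyAsync(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             // Decode incoming bytes as UTF-8, never the platform default.
                             new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
                    return in.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 // Encode outgoing characters as UTF-8 too, so both ends agree.
                 PrintWriter out = new PrintWriter(
                         client.getOutputStream(), true, StandardCharsets.UTF_8)) {
                out.println(msg);
            }
            return received.get(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        // Unicode escapes (Greek "Αυτο") keep the literal independent of the
        // encoding javac assumes for the source file.
        String msg = "This is a test - \u0391\u03C5\u03C4\u03BF";
        String echoed = roundTrip(msg);
        // Compare in memory instead of eyeballing console output.
        System.out.println(msg.equals(echoed));
    }
}
```

If this prints true but the same text still shows up as question marks when printed directly, the socket is no longer the part that loses the characters.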


2 Comments

OK, I updated my JDK, updated my Gradle, changed all my run/builds and implemented your solution, but I still get question marks. Am I missing something?
Just print the line. A one-liner: System.out.println("Greek chars here"). Does that show the question marks? Then it is PowerShell that is misconfigured or incapable of printing these.
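One way to run that check without trusting the console at all is to print the string's code points in plain ASCII, then print the text itself through a stream with an explicit encoding. The sketch below is an assumption-laden illustration: the class name ConsoleCheck and the codePoints helper are made up for this example, and whether the last line displays correctly still depends on the terminal (on Windows, for instance, on the active code page and font). The PrintStream constructor taking a Charset needs Java 10 or later.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleCheck {

    // Renders every code point as U+XXXX, so the output is pure ASCII
    // and cannot be mangled by the console's encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Greek "Αυτο", written with escapes so the source file encoding does not matter.
        String greek = "\u0391\u03C5\u03C4\u03BF";

        // The charset that encoding-less constructors fall back to on this machine.
        System.out.println("default charset: " + Charset.defaultCharset());

        // If these are the expected values, the String itself is intact.
        System.out.println(codePoints(greek));

        // Same text, but encoded as UTF-8 regardless of the platform default.
        PrintStream utf8Out = new PrintStream(System.out, true, StandardCharsets.UTF_8);
        utf8Out.println(greek);
    }
}
```

If the code points are right but the last line is garbled or shows question marks, the data is fine and only the display is at fault.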
0

Maybe this will help:

String msgEncode = URLEncoder.encode(msg, "UTF-8");
printWriter.println(msgEncode);

And:

String line = bufferedReader.readLine();
String msgDecode = URLDecoder.decode(line, "UTF-8");
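For illustration, here is a self-contained sketch of that round trip. Percent-encoding turns the message into pure ASCII, which survives most platform default encodings, so it works around the charset mismatch between writer and reader instead of removing it. The class name UrlEncodeRoundTrip and its helper methods are assumptions for this example; it uses the Charset overloads of URLEncoder.encode and URLDecoder.decode (Java 10 or later), which avoid the checked UnsupportedEncodingException of the "UTF-8" string variants shown above.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncodeRoundTrip {

    // Turns the message into ASCII-only text (non-ASCII becomes %XX escapes, spaces become '+').
    static String encode(String msg) {
        return URLEncoder.encode(msg, StandardCharsets.UTF_8);
    }

    // Reverses encode(): the receiving side calls this on each line it reads.
    static String decode(String line) {
        return URLDecoder.decode(line, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Greek "Αυτο", written with escapes so the source file encoding does not matter.
        String msg = "test - \u0391\u03C5\u03C4\u03BF";

        String wire = encode(msg);
        // Everything that would go over the socket is now plain ASCII.
        System.out.println(wire);
        // The decoded text matches the original.
        System.out.println(msg.equals(decode(wire)));
    }
}
```

The cost is a larger payload (each Greek letter becomes six ASCII characters) and the requirement that both ends agree on this extra encoding step.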
