32

I have problem in reading text file with utf-8 encoding I'm using java with netbeans 7.2.1 platform

I already configured the java project to handle UTF-8 javaproject==>right click==>properties==>source==>UTF-8

but still get the unknown character output: ����� �������� ���� �

the code:

File fileDirs = new File("C:\\file.txt");

BufferedReader in = new BufferedReader(
new InputStreamReader(new FileInputStream(fileDirs), "UTF-8"));

String str;

while ((str = in.readLine()) != null) {
    System.out.println(str);
}

any other ideas?

thanks

3
  • What is the encoding of System.out? What's your system encoding? Commented Feb 17, 2013 at 5:13
  • Are you sure, the input file is UTF-8 encoded? Commented Feb 17, 2013 at 6:49
  • 3
    thank you all for your comments. I found the solution to the problem.the text file was with ANSI encoding with arabic character. so to solve : BufferedReader in = new BufferedReader( new InputStreamReader(new FileInputStream(fileDirs), "windows-1256"));--thanks all Commented Feb 17, 2013 at 12:40

5 Answers 5

43

Use

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.UnsupportedEncodingException;     
    public class test {
    public static void main(String[] args){

    try {
        File fileDir = new File("PATH_TO_FILE");

        BufferedReader in = new BufferedReader(
           new InputStreamReader(new FileInputStream(fileDir), "UTF-8"));

        String str;

        while ((str = in.readLine()) != null) {
            System.out.println(str);
        }

                in.close();
        } 
        catch (UnsupportedEncodingException e) 
        {
            System.out.println(e.getMessage());
        } 
        catch (IOException e) 
        {
            System.out.println(e.getMessage());
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}

You need to put UTF-8 in quotes

Sign up to request clarification or add additional context in comments.

1 Comment

Bad practice to put in.close before the catch. Should be in a finally block. Also can use multi catch format in Java 8
13

You need to specify the encoding of the InputStreamReader using the Charset parameter.

Charset inputCharset = Charset.forName("ISO-8859-1");
InputStreamReader isr = new InputStreamReader(fis, inputCharset));

This is work for me. i hope to help you.

1 Comment

Lg g3 = work with utf-8 but not ISO-8859-1, and my ASUS = works with ISO-8859-1 but not in utf-8 ...
10

You are reading the file right but the problem seems to be with the default encoding of System.out. Try this to print the UTF-8 string-

PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(str);

Comments

5

I ran into the same problem every time it finds a special character marks it as ��. to solve this, I tried using the encoding: ISO-8859-1

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"),"ISO-8859-1"));

while ((line = br.readLine()) != null) {

}

I hope this can help anyone who sees this post.

1 Comment

There's no ISO-8859-1 encoding set in the code. The encoding is the default
4

Ok, I am definitively late to the party but if you are still looking for an optimal solution I would use the following ( for Java 8 )

    Charset inputCharset = Charset.forName("ISO-8859-1");
    Path pathToFile = ....
    try (BufferedReader br = Files.newBufferedReader( pathToFile, inputCharset )) {
        ...
     }

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.