When I parse a text file, I get this error at around line 200,000:

java.nio.charset.UnmappableCharacterException: Input length = 1

The error crashes my program. Here is the code:

val bufferedSource = io.Source.fromFile(path)
for (line <- bufferedSource.getLines.drop(1)) {
  line.split('|').toList.drop(1)
}

If I understand correctly, the error comes from io.Source.fromFile(path). How can I skip the bad rows?

  • Possible duplicate of How to resolve java.nio.charset.UnmappableCharacterException in Scala 2.8.0? Commented Feb 12, 2017 at 21:20
  • I get this error not at the first row but at around row 200k. Commented Feb 12, 2017 at 21:23
  • It still seems to be a character encoding issue. Either you need to handle non-ASCII characters or make sure that your file contains only ASCII values. Commented Feb 12, 2017 at 21:56
  • My solution is: import java.nio.charset.CodingErrorAction; import scala.io.Codec; implicit val codec = Codec("cp1251"); codec.onMalformedInput(CodingErrorAction.REPLACE); codec.onUnmappableCharacter(CodingErrorAction.REPLACE) Commented Feb 12, 2017 at 22:07

1 Answer

Unfortunately, you have to deal with encoding issues yourself. One of these two encodings often works for me:

import scala.io.Codec

// Read the file as UTF-8 instead of the platform default encoding
val bufferedSource = io.Source.fromFile(path, enc = Codec.UTF8.name)
for (line <- bufferedSource.getLines.drop(1)) {
  line.split('|').toList.drop(1)
}

or

val bufferedSource = io.Source.fromFile(path, enc = Codec.ISO8859.name)
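
If neither encoding fits the whole file, another option is the one the asker describes in the comments: keep one charset but tell the decoder to replace bad bytes instead of throwing. Below is a minimal sketch of that idea, not a definitive implementation. The names LenientRead, lenientCodec and parseFile are made up for illustration, and the UTF-8 default is only an assumption (pass "cp1251" or whatever your file really uses). Note that this does not skip the bad rows. It keeps them and substitutes the decoder's replacement character for the undecodable bytes.

```scala
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}

// Hypothetical helper object, named here only for illustration.
object LenientRead {

  // Build a codec that substitutes a replacement character for bytes
  // it cannot decode, rather than throwing an exception.
  def lenientCodec(charsetName: String): Codec =
    Codec(charsetName)
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)

  // Same parsing as in the question: drop the header line, split each
  // remaining line on '|' and drop the first field.
  // Assumption: UTF-8 by default; pass e.g. "cp1251" if that is the real encoding.
  def parseFile(path: String, charsetName: String = "UTF-8"): List[List[String]] = {
    val source = Source.fromFile(path)(lenientCodec(charsetName))
    try source.getLines().drop(1).map(_.split('|').toList.drop(1)).toList
    finally source.close() // always release the file handle
  }
}
```

Usage would look like LenientRead.parseFile(path, "cp1251"). If you really need to drop the affected rows, you could filter out lines containing the replacement character afterwards, but that assumes the character never appears legitimately in your data.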