Is it possible to convert an Erlang UTF-8 binary string (like <<"HELLO">>) to lowercase without converting it to a list and back?

2 Answers

If you know how to lowercase a single Unicode character, and the key words here are "without converting it to a list and back", then the answer could be:

<< <<(unicode_to_lower(C))/utf8>> || <<C/utf8>> <= <<"HELLO">> >>.
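The comprehension above assumes that a unicode_to_lower/1 function already exists. A minimal sketch of what such a helper might look like follows. The module name lower_demo and the function binary_to_lower/1 are hypothetical, and the mapping deliberately covers only ASCII A-Z and the basic Cyrillic range U+0410..U+042F; a complete Unicode case mapping needs full case tables and is not attempted here.

```erlang
-module(lower_demo).
-export([unicode_to_lower/1, binary_to_lower/1]).

%% Hypothetical helper: maps a single code point to its lowercase form.
%% Only two ranges are handled (an assumption made for illustration):
%%   ASCII A-Z            -> a-z
%%   Cyrillic U+0410-042F -> U+0430-044F
%% In both ranges the lowercase letter is 32 code points higher.
%% Every other code point is returned unchanged.
unicode_to_lower(C) when C >= $A, C =< $Z -> C + 32;
unicode_to_lower(C) when C >= 16#0410, C =< 16#042F -> C + 32;
unicode_to_lower(C) -> C.

%% Decodes one code point at a time with /utf8, lowercases it, and
%% re-encodes it with /utf8, so no intermediate list is built.
binary_to_lower(Bin) when is_binary(Bin) ->
    << <<(unicode_to_lower(C))/utf8>> || <<C/utf8>> <= Bin >>.
```

With this sketch, lower_demo:binary_to_lower(<<"HELLO">>) returns <<"hello">>.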

4 Comments

@Kay: Having a working implementation of unicode_to_lower/1 is implied by the answer.
I knew I was missing something really simple! Thanks!
Note: This will only work on a very small subset (the ASCII range)! For some "values" you have to peek at the following bytes (I think up to 6 bytes). en.wikipedia.org/wiki/UTF-8
@bsmr It will work for more than just ASCII:
1> [ C || <<C/utf8>> <= unicode:characters_to_binary("ПРИВЕТ") ].
[1055,1056,1048,1042,1045,1058]
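The point of the last comment can be checked by comparing a /utf8 segment with a plain byte segment. The sketch below uses a hypothetical module utf8_segments_demo: the first function yields whole code points, the second yields the raw UTF-8 bytes.

```erlang
-module(utf8_segments_demo).
-export([code_points/1, raw_bytes/1]).

%% Each <<C/utf8>> match consumes one complete UTF-8 sequence
%% (one to four bytes) and binds C to the decoded code point.
code_points(Bin) when is_binary(Bin) ->
    [C || <<C/utf8>> <= Bin].

%% Each <<B>> match consumes exactly one byte, so multi-byte
%% characters show up as several separate integers.
raw_bytes(Bin) when is_binary(Bin) ->
    [B || <<B>> <= Bin].
```

For Bin = unicode:characters_to_binary([1055,1056]), code_points(Bin) returns [1055,1056] while raw_bytes(Bin) returns [208,159,208,160], so the decoding of the following bytes is already done by the /utf8 match.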
string:lowercase/1, available since Erlang/OTP 20, works with binaries:

1> string:lowercase(<<"HELLO">>).
<<"hello">>
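The same call should also handle non-ASCII input, because string:lowercase/1 treats a binary as UTF-8 encoded text and returns a binary when given one. A small sketch, assuming OTP 20 or later and a UTF-8 encoded source file (the module name lowercase_demo is hypothetical):

```erlang
-module(lowercase_demo).
-export([lower/1, check/0]).

%% Thin wrapper around string:lowercase/1 (requires OTP 20 or later).
lower(Bin) when is_binary(Bin) ->
    string:lowercase(Bin).

%% The /utf8 specifier stores the literals as UTF-8 bytes.
%% Returns true if the Cyrillic input was lowercased as expected.
check() ->
    Lower = lower(<<"ПРИВЕТ"/utf8>>),
    is_binary(Lower) andalso Lower =:= <<"привет"/utf8>>.
```

In the shell, a binary with non-ASCII characters is typically printed as a list of byte values, so comparing against an expected binary is easier than reading the printed result.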
