
I am running into an issue where the JSON produced by a Ruby script cannot be parsed by JavaScript's JSON.parse. Consider the following example:

# Ruby
require 'json'
hash = {}
hash["key"] = "value with \u001a unicode"
hash.to_json
=> '{"key":"value with \u001a unicode"}'

// JavaScript
JSON.parse('{"key":"value with \u001a unicode"}')
=> JSON.parse: bad control character in string literal at line 1 column 2 of the JSON data

The issue is the Unicode character \u001a. The solution would be to escape \u001a to \\u001a, but the \u001a is inserted into the string automatically by Ruby, and I can't reliably post-process the result. Any ideas about how to solve this?

Please note that I wish to call JSON.parse inside a JavaScript execution environment, not inside Ruby's interpreter.

  • I ran your code and I'm actually getting this as output: => "{\"key\":\"value with \\u001a unicode\"}" Commented Apr 24, 2015 at 21:21
  • I ran your code also and it worked fine. Commented Apr 24, 2015 at 21:21
  • You are looking at the output in the terminal. \\u001a in the terminal is the physical string \u001a. Ruby displays the backslash as \\ so you can tell the difference between the single character \u001a and the six-character string also written \u001a. Commented Apr 24, 2015 at 21:29
  • Also note that JSON.parse should be called inside a JavaScript execution environment, not inside the Ruby interpreter. Commented Apr 24, 2015 at 21:31

3 Answers


The short version is that you're interpreting your string as a Javascript expression before attempting to decode it as JSON.

U+001A is a control character. RFC 4627 explicitly disallows unescaped control characters U+0000-U+001F in quoted strings. Your problem here is not that the JSON is invalid, but that you are unescaping your control characters before attempting to parse them as JSON.

When you dump the string "\u001a" from Ruby and copy and paste it into a Javascript interpreter, the escape sequence translates to an unescaped control character, which is not a valid character in JSON! Non-prohibited characters work just fine - you can happily JSON.parse('["\u0020"]'), for example.

However, if you don't interpret the string as Javascript, and instead read it as raw bytes, it will parse correctly.

$ irb
irb(main):001:0> require 'json'
=> true
irb(main):003:0> open("out.json", "w") {|f| f.print JSON.dump(["\u001a"]) }
=> nil

$ node -e 'require("fs").readFile("out.json", function(err, data) { console.log(JSON.parse(data)); });'
[ '\u001a' ]

If you're going to be copy-pasting, you need to copy an escaped version of the string, so that when the string literal is evaluated by your Javascript engine, the double-escaped sequences unescape back to escape sequences rather than raw control characters. So, rather than copying the output of JSON.dump(["\u001a"]), you should be copying the output of puts JSON.dump(["\u001a"]).inspect, which will correctly escape any escape sequences in the string.
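For example, a quick irb sketch of the difference (exact prompts aside, this is the output I'd expect from Ruby's JSON generator):

irb(main):001:0> require 'json'
=> true
irb(main):002:0> puts JSON.dump(["\u001a"])           # pasting this into a JS string literal loses the backslash
["\u001a"]
=> nil
irb(main):003:0> puts JSON.dump(["\u001a"]).inspect   # this form survives as a JS string literal
"[\"\\u001a\"]"
=> nil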


2 Comments

Is there a way to write out the properly escaped version of the string? I'm writing the string out to a file and then someone else is reading the file in and copying the string into a JavaScript file (programmatically).
If you're writing it out with Javascript, JSON.stringify(json_string). If you're writing it with Ruby, JSON.dump(json_string).
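A minimal Ruby sketch of that suggestion, assuming the file contents will later be pasted into JavaScript source as a string literal (the file name is just illustrative):

require 'json'

json_text = { "key" => "value with \u001a unicode" }.to_json
# json_text is the JSON document {"key":"value with \u001a unicode"}
# Encode the JSON text itself as a JSON string so its backslashes survive
# being embedded in JavaScript source as a string literal:
File.write("payload.txt", JSON.dump(json_text))
# payload.txt now contains: "{\"key\":\"value with \\u001a unicode\"}"

Reading that back into a JavaScript file as a string literal and calling JSON.parse on it then behaves like the file-based example above.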

For me, the following Ruby code gives "{\"key\":\"value with \\u001a unicode\"}" as output.

And JSON.parse is also able to parse it, and gives Object {key: "value with unicode"}.

1 Comment

You're looking at the output in the terminal. It escapes the displayed string so you can see the characters. Otherwise, how could you tell the difference between \\u001a and \u001a? So \\u001a is the literal string \u001a without Unicode escaping. To see the difference, compare the results of "\\u001a".size and "\u001a".size. Notice that the length of \\u001a is 6, not 7, meaning that Ruby is displaying the backslash escaped.
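For reference, a quick irb check of that size comparison:

irb(main):001:0> "\u001a".size    # a single control character
=> 1
irb(main):002:0> "\\u001a".size   # backslash, u, 0, 0, 1, a
=> 6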

According to the RFC:

JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.

I ran your code in irb and got the following:

1.9.3-p484 :001 > require 'json'
 => true
1.9.3-p484 :002 >
1.9.3-p484 :003 >   hash = {}
 => {}
1.9.3-p484 :004 > hash["key"] = "value with \u001a unicode"
 => "value with \u001A unicode"
1.9.3-p484 :005 > hash.to_json
 => "{\"key\":\"value with \\u001a unicode\"}"

Then running the returned string in a javascript console, I get the following:

> JSON.parse("{\"key\":\"value with \\u001a unicode\"}")
> Object {key: "value with  unicode"}

It is returning an object. To get the value with the unicode character, access the key on the parsed object:

> str = JSON.parse("{\"key\":\"value with \\u001a unicode\"}")
> Object {key: "value with  unicode"}
> str.key
> "value with  unicode"

4 Comments

JSON.parse should be executed in a JavaScript execution environment, not inside Ruby's interpreter.
@Max actually, that works too. Just copy-pasted it into Chrome's console. Those are even different languages!.. Whatever.
@D-side take a look at the accepted answer if you'd like an explanation of why Ruby's console output works. The console output is not the exact string returned by the to_json call.
@Max I actually know. I've hit a similar issue before when escaping shell commands in Ruby. The general rule here is: know when and how many times your input will be unescaped.
