//string with correct json format

{"reaction":"\ud83d\udc4d","user":{"id":"xyz"}}

//after JSON.parse()

{ reaction: '👍', user: [Object] }

What I want is to keep the reaction value in its escaped form (\ud83d\udc4d), but JSON.parse() decodes it into the actual emoji character.
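One way to get the escaped text back is to parse normally and then re-escape the value afterwards. The following is only a minimal sketch of that idea and is not taken from the thread; escapeNonAscii is a hypothetical helper name, and it assumes that every UTF-16 code unit above 0x7f should be written out as a \uXXXX escape.

```javascript
// Hypothetical helper: rewrite every non-ASCII UTF-16 code unit as a \uXXXX escape.
// Without the "u" flag the regex matches surrogate halves one at a time,
// so an emoji becomes two escapes, as in the original JSON text.
function escapeNonAscii(text) {
  return text.replace(/[\u0080-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// The JSON text from the question, written as a JS string literal.
const questionJson = '{"reaction":"\\ud83d\\udc4d","user":{"id":"xyz"}}';

const parsedQuestion = JSON.parse(questionJson);
const encodedReaction = escapeNonAscii(parsedQuestion.reaction);

console.log(encodedReaction);
```

The parsed object still holds the real character; only the copy passed through the helper is turned back into escape text.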

Update

In the end I decided to leave JSON.parse() alone and fix the database issue, as @Brad suggested. I changed the database format, but that alone was not enough to fix the problem. What I found is that every statement must now start with SET NAMES utf8mb4; followed by the query, and the connection options must include {charset : 'utf8mb4', multipleStatements: true}. Without proper node-mysql documentation it was quite hard to find the best answer, but I learned a lot along the way. Thank you.
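The setup described in the update might look roughly like the sketch below. Only charset and multipleStatements come from the update; the host, user, database and table names are placeholders, and withUtf8mb4 is a hypothetical helper. The driver calls themselves are left out so the snippet runs without a database.

```javascript
// Connection options; only charset and multipleStatements come from the update above.
// host, user and database are placeholder values.
const connectionOptions = {
  host: "localhost",
  user: "app_user",
  database: "app_db",
  charset: "utf8mb4",
  multipleStatements: true, // needed because each call sends two statements
};

// Hypothetical helper: prefix a query with SET NAMES utf8mb4.
function withUtf8mb4(sql) {
  return `SET NAMES utf8mb4; ${sql}`;
}

// "reactions" is a made-up table name used for illustration only.
const statement = withUtf8mb4("INSERT INTO reactions (reaction) VALUES (?)");

// The options object would be passed to the driver when opening the connection,
// and the statement to its query call.
console.log(statement);
```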

  • Works fine for me? Running on Chrome 63. Commented Feb 23, 2018 at 23:06
  • Ah, that's what you meant. Commented Feb 23, 2018 at 23:08
  • JSON.parse won’t be encoding anything; it looks to me like it’s the console, or whatever way you're logging the data out. What’s the intended use of the parsed data? Commented Feb 23, 2018 at 23:09
  • @Adminy What are you viewing your database with, and why don't you want it to show up as a bunch of question marks? It seems best to store the actual characters as-is, even if the tool you're viewing your DB with doesn't know how to display them. Commented Feb 23, 2018 at 23:24
  • @Adminy Fix your character encoding. You're addressing this problem in completely the wrong way. Leave JSON alone. Commented Feb 24, 2018 at 0:44

1 Answer

If you don't want parse to decode that string, you could escape the backslashes, e.g. "\\ud83d\\udc4d".
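As a rough sketch of that suggestion, the backslashes can be doubled in the raw text before it reaches JSON.parse. The variable names are illustrative, and the regex assumes the only backslashes that need protecting are the ones that introduce \u escapes.

```javascript
// The incoming JSON text, written as a JS string literal so that it
// contains the literal characters \ud83d\udc4d.
const incomingJson = '{"reaction":"\\ud83d\\udc4d","user":{"id":"xyz"}}';

// Double each backslash that starts a \u escape, so JSON.parse sees \\u
// and yields a literal backslash followed by "u" instead of decoding it.
const protectedJson = incomingJson.replace(/\\u/g, "\\\\u");

const keptEscaped = JSON.parse(protectedJson);

console.log(keptEscaped.reaction);
```

Applied to the whole string like this, the replacement affects every property, not just reaction.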

Do you control where that data comes from? Perhaps you could provide a "replacer" in JSON.stringify to escape those, or a "reviver" in JSON.parse.

What options do you have for exercising control over the stringify or parse?

Apply a reviver

// myJson is assumed to hold the incoming JSON string.
const myReviver = (key, val) => key === "reaction" ? val.replace(/\\/g, "\\\\") : val;

const safeObj = JSON.parse(myJson, myReviver);

CAUTION: This does not work as hoped. JSON.parse decodes the \uxxxx escape before the reviver is called, so the reviver receives the actual character and there are no backslashes left to escape!
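The caution can be checked with a small experiment that records what the reviver actually receives. This is just an illustrative sketch; sampleJson and seenByReviver are made-up names.

```javascript
// The JSON text from the question, as a JS string literal.
const sampleJson = '{"reaction":"\\ud83d\\udc4d","user":{"id":"xyz"}}';

// Record the value handed to the reviver for the "reaction" key.
const seenByReviver = [];
JSON.parse(sampleJson, (key, val) => {
  if (key === "reaction") seenByReviver.push(val);
  return val;
});

// The recorded value is already the decoded emoji: two UTF-16 code units, no backslash.
console.log(seenByReviver[0].length, seenByReviver[0].includes("\\"));
```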

Multiple escaping

Following a chat with the OP, it transpired that adding several levels of escaped backslashes to the property containing the Unicode characters did eventually lead to the desired value being stored in the database. Several processing steps were each unescaping a level of backslashes until the real character was eventually exposed.

This is brittle and far from advisable, but it did help to identify what was/wasn't to blame.

NO backslashes

This appears to be the best solution: strip all backslashes from the data before it is decoded into Unicode characters or processed in any way, which essentially stores deactivated "uxxxxuxxxx" codes in the database.

Those codes can be revived at the point of rendering by reinserting the backslashes with a regular expression:

database_field.replace(/(u[0-9a-fA-F]{4})/g, "\\$1");

Ironically, that skips the Unicode decoding, so you actually end up with the escaped string that was wanted in the first place. To force it to deliver the character that was previously seen, it can be run through JSON.parse:

emoji = JSON.parse(`{"utf": "${myUtfString}"}`).utf;
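Put together, the strip, restore and decode steps above might look like the following sketch. The variable names are illustrative, and it assumes the stored field holds nothing but the deactivated codes.

```javascript
// Escape text as it arrives: the literal characters \ud83d\udc4d.
const incomingReaction = "\\ud83d\\udc4d";

// 1. Strip the backslashes before storing, leaving "ud83dudc4d".
const storedField = incomingReaction.replace(/\\/g, "");

// 2. At render time, reinsert a backslash in front of each uXXXX code.
const restoredEscapes = storedField.replace(/(u[0-9a-fA-F]{4})/g, "\\$1");

// 3. If the real character is wanted, let JSON.parse decode the escapes.
const renderedEmoji = JSON.parse(`{"utf": "${restoredEscapes}"}`).utf;

console.log(storedField, restoredEscapes, renderedEmoji);
```

Note that the regex in step 2 also matches ordinary text that happens to look like a code, such as "ucafe", so this only seems safe for fields known to contain just the codes.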

7 Comments

To answer your question, I have control over the string that I parse, but I have to manipulate the string itself; I can't change the format it comes in.
So the solution you are proposing is that I escape the backslashes with more backslashes? I can try that.
Yes, my basic answer is to escape your backslashes. I'd personally prefer to do that at stringify, but it sounds like you only have the option during parse, so I have provided an example of a reviver.
I was about to say I did data.replace(/\\/g, "\\\\") and it worked, but thanks for your solution!
Sadly, I'm starting to think that the reviver isn't the solution after all. It seems that the character gets converted within the string before the reviver is able to manipulate it, so applying the replacement to the raw string may be the only way. In that case it is worth using a more robust regex that escapes only that property.