
In a Java servlet I'm doing:

protected void handleRequests(HttpServletRequest request, HttpServletResponse response) throws IOException {

  PrintWriter pw = response.getWriter();

  /*...*/

  Vector<String> ret = new Vector<>();
  for(...) {
    ret.add(">žd¿ [?²„·ÜðÈ ‘");
  }

  /*JSONArray*/ responseArray.put(responseArray.length(), ret);


  /*...*/

  pw.println(responseArray);

  pw.close();
}

In the web page's client-side JavaScript I make an XMLHttpRequest, and the reply comes back wrong. It looks like: >?d¿ [\u001a?²\u201e·ÜðÈ \u2018

(for the above >žd¿ [?²„·ÜðÈ ‘ input)

Then I tried on the servlet:

ret.add(URLEncoder.encode(">žd¿ [?²„·ÜðÈ ‘", "UTF-8"));

and I get:

%3E%C5%BEd%C2%BF%C2%A0%5B%7F%1A%3F%C2%B2%E2%80%9E%C2%B7%C3%9C%C3%B0%C3%88%C2%A0%E2%80%98

in javascript, then I apply:

unescape(reply.replace(/\+/g,' ')) (the replace is there because + signs are not converted to spaces)

which nets me:

>žd¿ [?²â??·Ã?ðÃ? â

What am I doing wrong?

(Some other questions tell me the servlet should send the response as UTF-8. But when do I encode to UTF-8: before placing the strings inside the JSON object (I use org.json), or after, by calling .toString() on the JSON response array and converting that to UTF-8 before PrintWriter.println?)

P.S. Not all of this code is mine. I inherited it, and I'm lacking some of the theoretical background.

Edit: calling decodeURIComponent(reply).replace(/\+/g,' ') in JavaScript seems to do the trick. But I could not find out how the output of URLEncoder.encode differs from what decodeURIComponent expects. Is the +/space the only mismatch?

5 Comments
  • You shouldn't have to URL encode the string at all. You're not using it in a URL, right? Commented Jul 24, 2014 at 15:07
  • No, I'm displaying it only. If I don't URL encode it, I get, as shown >?d¿ [\u001a?²\u201e·ÜðÈ \u2018. After a JSON.parse I get >?d¿ [?²„·ÜðÈ ‘ which is close but not quite... Commented Jul 24, 2014 at 15:16
  • Make sure that your HTTP response has the right "Content-Type" header too - it has to include "charset=UTF-8" Commented Jul 24, 2014 at 15:22
  • @Pointy response.setCharacterEncoding( "UTF-8" ); did the trick. Thanks! If you add as reply I'll accept it. Commented Jul 24, 2014 at 15:31
  • Ha ha I'm on a roll today; this is the second UTF-8 issue that's come up here :) Commented Jul 24, 2014 at 15:32
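
Some background on why response.setCharacterEncoding("UTF-8") helps: as far as I know, a servlet container that is not told otherwise encodes the output of getWriter() as ISO-8859-1, which has no byte for characters such as ž, so they are written as ?. The call also has to come before response.getWriter(), because it has no effect afterwards. The sketch below imitates this with plain JDK classes instead of the servlet API. WriterCharsetDemo and roundTrip are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mimics what the response writer does.
class WriterCharsetDemo {

    // Writes text through a PrintWriter that encodes with the given charset,
    // then decodes the resulting bytes with the same charset (the client
    // decodes with whatever charset the Content-Type header declares).
    static String roundTrip(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(bytes, charset));
        pw.print(text);
        pw.close();
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String text = "\u017ed"; // "ž" followed by "d"

        // ISO-8859-1 cannot represent U+017E, so the encoder substitutes '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).equals("?d"));

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
    }
}
```

In the servlet itself this would mean calling response.setCharacterEncoding("UTF-8") (or setting a Content-Type that includes charset=UTF-8) at the top of handleRequests, before getWriter(). The strings then need no URL encoding at all: they stay ordinary Java strings inside the JSON array and are converted to bytes only when the writer sends them.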

1 Answer


decodeURIComponent gives the expected result:

decodeURIComponent("%3E%C5%BEd%C2%BF%C2%A0%5B%7F%1A%3F%C2%B2%E2%80%9E%C2%B7%C3%9C%C3%B0%C3%88%C2%A0%E2%80%98");
">žd¿ [?²„·ÜðÈ ‘"
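
A likely reason unescape produced garbage where decodeURIComponent did not: unescape turns every %XX into one character with that code (effectively Latin-1), while decodeURIComponent reassembles consecutive %XX bytes as UTF-8. The same difference can be reproduced on the Java side with URLDecoder, which takes the charset as a parameter. This is only a sketch; PercentDecodeDemo and decode are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class comparing two ways to interpret percent-escapes.
class PercentDecodeDemo {

    // Decodes percent-escapes, interpreting the resulting bytes in the
    // named charset. Wraps the checked exception for convenience.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%C5%BE"; // the UTF-8 bytes of "ž" (U+017E)

        // Bytes read as UTF-8: one character, U+017E.
        System.out.println(decode(encoded, "UTF-8").equals("\u017e"));

        // Bytes read as ISO-8859-1: two characters, U+00C5 and U+00BE,
        // which is the kind of result unescape() gives.
        System.out.println(decode(encoded, "ISO-8859-1").equals("\u00c5\u00be"));
    }
}
```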

1 Comment

Seems so. But I still need to do .replace(/\+/g,' ') afterwards: decodeURIComponent does not perfectly decode the output of URLEncoder.encode. Are there other differences besides +/space?
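
On the +/space question: URLEncoder.encode produces the application/x-www-form-urlencoded form, in which a space becomes + and a literal + becomes %2B. decodeURIComponent leaves + untouched, and to my knowledge that is the only mismatch in this direction, since everything else URLEncoder emits is either an unreserved character or a %XX escape. One way to sidestep it, if URL encoding is wanted at all, is to turn + into %20 on the server. A minimal sketch, where JsFriendlyEncoder and encodeForDecodeURIComponent are hypothetical names:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not part of the original servlet code.
class JsFriendlyEncoder {

    // Percent-encodes s as UTF-8 and rewrites the form-encoding "+" as "%20".
    // Any "+" left in URLEncoder's output stands for a space, because a
    // literal plus sign has already been encoded as "%2B".
    static String encodeForDecodeURIComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Space becomes %20 and the literal plus stays %2B.
        System.out.println(encodeForDecodeURIComponent("a b+c")); // a%20b%2Bc
    }
}
```

If the replacement is done in JavaScript instead, it is safer to apply .replace(/\+/g,' ') before decodeURIComponent rather than after, so that a decoded %2B is not turned into a space.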
