2

I have a binary value that is URL-encoded and then POSTed to an HttpServlet. The code below shows how I first attempted to extract this data. It is very simple, except that the result is a String, not bytes.

This seemed to work at first, except that an extra byte appeared three bytes from the end. What I eventually figured out was that my data was being treated as text: it was decoded into a Unicode String and then re-encoded as UTF-8, which changes any byte that is not plain ASCII.

So, other than getting the entire POST body and parsing it myself, how can I extract my data without it being treated as a string once the URL encoding has been decoded? Have I misunderstood the specs for posted data in general, or is this a Java/Tomcat-specific issue?

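protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {

    // Receive/parse the request. getParameter() has already decoded the
    // value into a String using the request's character encoding.
    String requestStr = request.getParameter("request");
    // getBytes() with no argument re-encodes with the platform default charset.
    byte[] rawRequestMsg = requestStr.getBytes();
    // ...
}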

Here is a snippet of the Python test script I'm using for the request:

    urlRequest = urllib.urlencode( {'request': rawRequest} )

    connection = urllib.urlopen(self.url, data = urlRequest)
    result = connection.readlines()
    connection.close()
2
  • Can you show an example of what you posted and what you received? Commented Jan 7, 2010 at 0:32
  • URL encoding is, as the name implies, for URLs. Binary data should be encoded with e.g. Base64. Commented Jan 7, 2010 at 0:42

3 Answers

3

There are two possible solutions:

  • ASCII-encode your data before POSTing it. Base64 would be a sensible choice. Decode it in your servlet and you have your original binary again.

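  • Use form content type multipart/form-data ( http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4 ) to send your binary data as a stream of bytes; then your servlet can call servletRequest.getInputStream() to read the data in, again as a binary stream. (getReader() would decode the body into characters, which is the problem you are trying to avoid.) Note that the stream contains the whole multipart body, so the part boundaries and headers still have to be parsed out.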
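For the first option (Base64), here is a minimal sketch of the round trip. It assumes Java 8 or later for java.util.Base64; the class and method names are made up for illustration, and in a real servlet the String passed to decodeParameter would come from request.getParameter("request").

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64ParamSketch {

    // Client side: turn arbitrary bytes into URL-safe ASCII text.
    public static String encodeForPost(byte[] raw) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    // Servlet side: the parameter value is pure ASCII, so the charset the
    // container used to build the String no longer affects the result.
    public static byte[] decodeParameter(String param) {
        return Base64.getUrlDecoder().decode(param);
    }

    public static void main(String[] args) {
        // Hypothetical payload containing bytes outside the ASCII range.
        byte[] original = {0x00, (byte) 0xE9, (byte) 0xFF, 0x10, (byte) 0x80};
        String param = encodeForPost(original);
        byte[] restored = decodeParameter(param);
        System.out.println(param + " round trip ok: " + Arrays.equals(original, restored));
    }
}
```

The URL-safe alphabet is used here so the value survives in a query string as well as in a POST body; the standard alphabet would also work as long as both sides agree on it.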


1 Comment

I think you're right and that using multipart/form-data is the correct answer.
2

I think this should work (it treats the request as being in a single-byte encoding, so the conversion to String and back is completely reversible):

String someSingleByteEncoding = "ISO-8859-1";
request.setCharacterEncoding(someSingleByteEncoding);
String requestStr = request.getParameter("request"); 
byte[] rawRequestMsg = requestStr.getBytes(someSingleByteEncoding);
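To see why the charset passed to getBytes matters, here is a small stand-alone sketch with no servlet involved. The class name and the payload are hypothetical; it only assumes that the container produced the String by decoding the bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTripSketch {

    // Mimics a container that decoded the percent-decoded bytes as ISO-8859-1.
    public static String decodeAsLatin1(byte[] wire) {
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] wire = {0x01, (byte) 0xE9, 0x02, 0x03}; // hypothetical payload
        String asString = decodeAsLatin1(wire);

        // Re-encoding with the same single-byte charset restores every byte.
        byte[] sameCharset = asString.getBytes(StandardCharsets.ISO_8859_1);
        // Re-encoding as UTF-8 turns U+00E9 into two bytes (0xC3 0xA9).
        byte[] utf8 = asString.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(wire, sameCharset)); // true
        System.out.println(utf8.length);                      // 5
    }
}
```

The extra byte in the UTF-8 case is the same kind of growth described in the question: every byte at or above 0x80 becomes two bytes.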

3 Comments

This is working, but I'm not sure if I should consider it 'right' or not. This is for a web API to be exposed inside the company to a variety of people in a variety of languages.
I'm going back to this as the 'right' answer. In large part because it will also allow calls to be done via GET as well as POST. The binary blobs in question are small (Protocol Buffer structs) and flexibility in calls to the server is important.
You'll really need to document that the string has to be encoded with the very same charset prior to sending. To prepare for world domination, I would recommend using UTF-8 for that, on both sides.
0

You can do this with a servlet wrapper (HttpServletRequestWrapper): catch the request and snatch the request body before it's decoded.

But the best way is probably to send the data as a file upload (multipart/form-data content type).
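If you go the first route and grab the raw body yourself, the value can be percent-decoded straight to bytes without ever becoming a String in some guessed charset. This is only a sketch: the class and method names are invented, and it assumes you have already read the body from request.getInputStream() (before anything calls getParameter) and turned it into an ASCII string.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class RawFormValueSketch {

    // Decodes an application/x-www-form-urlencoded value directly to bytes.
    public static byte[] percentDecode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("Truncated escape at " + i);
                }
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad escape at " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c == '+') {
                out.write(' '); // form encoding uses '+' for a space
            } else {
                out.write(c);   // unreserved ASCII character, copied as-is
            }
        }
        return out.toByteArray();
    }

    // Finds "name=value" in a raw body such as "a=1&request=%00%E9".
    // Returns null if the parameter is absent.
    public static byte[] rawParameter(String body, String name) {
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                return percentDecode(pair.substring(eq + 1));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] value = rawParameter("a=1&request=%00%E9%FF", "request");
        System.out.println(Arrays.toString(value));
    }
}
```

A wrapper built this way has to cache the body, since the container cannot read the input stream a second time to populate its own parameter map.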
