I've got an ASP.NET web service that looks something like this:

[WebMethod]
public static void DoSomethingWithStrings(string stringA, string stringB)
{
    // and so on
}

A third-party application is supposed to call this web service. However, this application encodes strings as UTF-8, and all umlauts are replaced by '??' on the receiving side. I can view the call, and the special characters are encoded correctly:

<?xml version="1.0" encoding="utf-8" ?>
<!-- ... -->
<SoapCall>
    <DoSomethingWithStrings>
        <stringA>Ä - Ö - Ü</stringA>
        <stringB>This is a test</stringB>
    </DoSomethingWithStrings>
</SoapCall>

When I simply print the strings inside the web service method, I get the following output:

?? - ?? - ??

This is a test

How can I configure the WebService to accept UTF-8 encoded strings?

Update

Fiddler also tells me that the Content-Type charset of the HTTP request is UTF-8.

Update 2

For debugging purposes, I added the following code to global.asax:

public void Application_BeginRequest(object sender, EventArgs e)
{
    using (var reader = new System.IO.StreamReader(Request.InputStream))
    {
        string str = reader.ReadToEnd();
    }
}

This reads the actual SOAP call. The StreamReader's encoding is set to UTF-8. The SOAP call looks correct:

<?xml version="1.0" encoding="UTF-8" ?> 
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
    <SOAP-ENV:Body>
        <DoSomethingWithStrings xmlns="http://www.tempuri.org/">
            <stringA>Ä - Ö - Ü</stringA>
            <stringB>This is a test!</stringB>
        </DoSomethingWithStrings>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

In the web.config file the globalization settings are set correctly:

<globalization requestEncoding="UTF-8" responseEncoding="UTF-8" culture="de-DE" uiCulture="de-DE" />

So it looks like something that deserializes the SOAP message does not use UTF-8 but ASCII encoding.

2
  • Just a technical point, but encrypted is not the right adjective. As UTF-8 is just a character encoding, encoded would be more accurate. Commented Dec 4, 2012 at 14:45
  • Of course you are right, sorry for this mix up :) Commented Dec 4, 2012 at 14:54

4 Answers

6

It finally turned out that something went wrong while accepting the HTTP messages. I don't actually know what manipulates the HTTP request, but I found a workaround. Even though Fiddler showed me the correct content type (text/xml; charset=utf-8), in my Application_BeginRequest handler Request.RequestContext.HttpContext.Request.ContentType was just text/xml, which led to a fallback to the default (ASCII) encoding within the ASMX serializer. I've added the following code to the Application_BeginRequest handler, and everything works for now.

if (Request.RequestContext.HttpContext.Request.ContentType.Equals("text/xml"))
{
    Request.RequestContext.HttpContext.Request.ContentType = "text/xml; charset=UTF-8";
}
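A slightly more defensive variant of the same check, as a minimal sketch: it compares the media type case-insensitively and only appends a charset when none is declared. ContentTypeFixup and EnsureUtf8Charset are hypothetical names introduced here for illustration, and the sketch assumes the client really does send UTF-8.

```csharp
using System;

public static class ContentTypeFixup
{
    // Hypothetical helper (not part of ASP.NET): returns the content type
    // unchanged unless it is text/xml without a charset parameter, in which
    // case "; charset=UTF-8" is appended.
    public static string EnsureUtf8Charset(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType))
            return contentType;

        // The media type is everything before the first ';'.
        string mediaType = contentType.Split(';')[0].Trim();
        bool isTextXml = string.Equals(
            mediaType, "text/xml", StringComparison.OrdinalIgnoreCase);
        bool hasCharset = contentType.IndexOf(
            "charset=", StringComparison.OrdinalIgnoreCase) >= 0;

        if (!isTextXml || hasCharset)
            return contentType;

        // Assumption: the payload is UTF-8, as Fiddler showed for this client.
        return contentType.TrimEnd(' ', ';') + "; charset=UTF-8";
    }
}
```

Inside Application_BeginRequest this could be used as Request.ContentType = ContentTypeFixup.EnsureUtf8Charset(Request.ContentType); so that requests which already declare a charset are left alone.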

Thanks for your help!

3 Comments

Good to hear you got it fixed. I wonder why it doesn't fall back to reading the encoding attribute from the XML, though. I was so sure any automatic process would get it right, since all the info is there and only a manual override would have screwed it up.
Exactly the reason why I asked the question... I have no idea why the SOAP message (which in general IS an XML message) does not get parsed like an XML file. However... thanks for your help :-)
Did the same with a custom SoapExtension. Within the ProcessMessage method I appended "charset=UTF-8" before deserialization.
0

Try this:

  byte[] bytes=Encoding.UTF8.GetBytes(yourString);

Note: a string is never itself UTF-8 or any other byte encoding. An encoding only applies when a string is converted to or from bytes.

1 Comment

Strings (System.String) are sequences of Unicode characters (msdn.microsoft.com/en-us/library/system.string.aspx). Maybe I can get the bytes of a UTF-8 encoded string this way, but inside this string the umlauts are already replaced. I've updated the question.
0

The SOAP call is being decoded as ASCII somewhere. Each umlaut is encoded in UTF-8 as 2 bytes with the high bit set, and each of those bytes turns into ? when decoded as ASCII, hence the ??.

So, something like this is happening:

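// The client encodes each umlaut as two UTF-8 bytes, both above 0x7F.
byte[] bytesSentFromClient = Encoding.UTF8.GetBytes("Ä - Ö - Ü");
// ASCII cannot represent bytes above 0x7F, so each one decodes to '?'.
string theStringIThenReceiveInMyMethod = Encoding.ASCII.GetString(bytesSentFromClient);
Console.WriteLine(theStringIThenReceiveInMyMethod);
// Output: ?? - ?? - ??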

To verify this is happening for sure, you should compare stringA == "Ä - Ö - Ü" rather than printing it somewhere.
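One way to make that check independent of how a console or log file renders non-ASCII text is to dump the UTF-16 code units of the received string. The following is only a sketch; StringInspector and DumpCodeUnits are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Linq;

public static class StringInspector
{
    // Formats every UTF-16 code unit as U+XXXX. The result is pure ASCII,
    // so it survives any console or log encoding.
    public static string DumpCodeUnits(string value)
    {
        return string.Join(" ", value.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

Logging StringInspector.DumpCodeUnits(stringA) inside the web method should show U+00C4 where an intact Ä arrived, and U+003F U+003F where the two UTF-8 bytes were decoded as ASCII.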

I guess you could start by doing a project-wide search for "ASCII" and then work from there if you find anything.

You could also try adding the following under the <system.web> element in the Web.config file:

<globalization requestEncoding="utf-8" responseEncoding="utf-8"/>

1 Comment

Interesting point! Thanks for your research. I tried to decode the request input stream and updated the question (Update 2). Also searching the whole solution for ASCII does not give any matches.
0

I had the same problem. My ASMX web service converted my UTF-8 to ASCII or, better said, to ??????. Your post helped me a lot. The solution I found was to change the version of the SOAP protocol from 1.1 to 1.2. That is, this SOAP 1.1 request:

POST /WebService1.asmx HTTP/1.1
Host: www.tempuri.org
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://www.tempuri.org/HelloWorld"

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <HelloWorld xmlns="http://www.tempuri.org/">
        <inputParam>Привет</inputParam>
    </HelloWorld>
  </soap:Body>
</soap:Envelope>

had the problem. But when I changed my request to SOAP 1.2:

POST /WebService1.asmx HTTP/1.1
Host: www.tempuri.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: length

<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <HelloWorld xmlns="http://www.tempuri.org/">
       <inputParam>Привет</inputParam>
    </HelloWorld>
  </soap12:Body>
</soap12:Envelope>

The issue was solved.
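If the client is written in .NET, the SOAP 1.2 request body could be built roughly like this. It is a minimal sketch rather than a complete client: Soap12Request and BuildContent are hypothetical names, and the envelope simply mirrors the HelloWorld example above.

```csharp
using System;
using System.Net.Http;
using System.Security;
using System.Text;

public static class Soap12Request
{
    // Builds a SOAP 1.2 body for the HelloWorld example. StringContent
    // encodes the text as UTF-8 and sets the header
    // "Content-Type: application/soap+xml; charset=utf-8".
    public static StringContent BuildContent(string inputParam)
    {
        string envelope =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<soap12:Envelope xmlns:soap12=\"http://www.w3.org/2003/05/soap-envelope\">" +
            "<soap12:Body>" +
            "<HelloWorld xmlns=\"http://www.tempuri.org/\">" +
            // Escape XML special characters; non-ASCII text is kept as-is.
            "<inputParam>" + SecurityElement.Escape(inputParam) + "</inputParam>" +
            "</HelloWorld>" +
            "</soap12:Body>" +
            "</soap12:Envelope>";

        return new StringContent(envelope, Encoding.UTF8, "application/soap+xml");
    }
}
```

The content can then be sent with HttpClient.PostAsync to the service URL, for example http://www.tempuri.org/WebService1.asmx from the request above.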
