1

I've come across a website that seems to resist being read via HTTP/Get with C# but works fine in Firefox, Postman and Powershell and I would really love to understand why.

Here is the repro-case in powershell

Invoke-WebRequest -UseBasicParsing -Uri https://www.tripadvisor.de/

This call works, it gets a 200 and powershell dumps the result.

When I do the same in C#, the request hangs until it times out, I'll never get a response.

Here is the c# code. It is not that complicated

var client = new HttpClient();
var r = await client.GetAsync("https://www.tripadvisor.de/");
r.EnsureSuccessStatusCode();

In a .net6.0 application with top level statements this is actually the entire program code. I've tested this in a .net framework 4.8 console app (of course with an async Task main), same behaviour.

Tested on Windows 11 and Windows 10.

Can somebody reproduce this behaviour?

P.S.: I am sure this is not an async/await or deadlock issue, because plenty of other sites do work correctly

**Edit: ** Working hypothesis is that the server simply ignores those request by some metric because it assumes that this request come from an invalid client (non-browser, scraper, flooder).

What I have also tried

I've been trying to make this C# code work by passing a HttpClientHandler with Tls1.2

using var handler = new HttpClientHandler() { SslProtocols = SslProtocols.Tls12 };
var client = new HttpClient(handler);

Same result.

I've been sending ALL the additional headers, that for example firefox sends (User-Agent, Accept, Accept-Encoding, etc), no luck.

I've added a certificate callback to the handler just in case, but sslPolicyErrors had no errors anyway:

handler.ServerCertificateCustomValidationCallback = (message, cert, chain, sslPolicyErrors) => true;

The parallel stack shows, that the code tries to receive a full TLS frame (I am just guessing)

tls-stack

Can somebody test the basic C# code and validate the behaviour? Is this an issue with HttpClient? Am I missing something? Is this server configured in a weird way?

Update: The old, and deprecated WebRequest approach worked for a short time:

var res = WebRequest.Create("https://www.tripadvisor.de/");
res.Method = "GET";
var resp = res.GetResponse();

var content = await new StreamReader(resp.GetResponseStream()).ReadToEndAsync();

Console.WriteLine(content);

While it resulted in several 200, it appears, that later the server black-listed me because I was probably lacking indicators of a "real" user. I understand, that this is some scraper / flood control mechanism. What I am missing is, why Invoke-WebRequest did not cause to be blocked.

Note: There is no http version of this page to test against :/

I've also tried ConfigureAwait(false) but from my understanding this cannot be the issue because other websites do work.

12
  • Have you looked at stackoverflow.com/questions/10343632/… ? And have you tried to do a ConfigureAwait(false)? Commented Aug 25, 2023 at 8:10
  • You just proved that HttpClient just works. WebClient uses HttpClient undernead in .NET Core. Even HttpWebRequest is nothing more than a compatibility wrapper over HttpClient Commented Aug 25, 2023 at 8:12
  • @Lucero I tried, this does not appear to be the solution. I'll update the post Commented Aug 25, 2023 at 8:13
  • 2
    Have you actually checked what's going on in the Network tab when you request tripadvisor.de? I see a redirect to www.tripadvisor.de, 20 secs of small responses and a never ending GET to tripadvisor.de. I don't know whether that's eg a long request to allow server push (pointless in Chrome) or a trap for crawlers Commented Aug 25, 2023 at 8:35
  • 2
    If you have a working request (browser) and a failing request (app), run them both through Fiddler and compare. If the requests are truly identical, then there must be a stateful firewall of sorts, and you'll need to send all the other requests the working client does; the other requests are acting as a "key". It's also possible you may have to simulate delays. In the extreme case, the key may be dynamic and you'd need to embed a browser or at least a JS interpreter to determine the key on each request. Commented Aug 25, 2023 at 10:20

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.