I am trying to query an API that's accessible via HTTP (no authorization). To speed things up, I tried to use a Parallel.ForEach loop, but the longer it runs, the more errors pop up.

More and more requests fail to return a response. I know the API provider isn't limiting me because I can request the very same failed URLs in my browser. Also, different URLs fail each time, so malformed requests don't seem to be the cause.

The error doesn't seem to occur when I use a single-threaded foreach loop.

My malfunctioning loop is below:

    Parallel.ForEach(this.urlArray, singleUrl => {
        this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    });

A normal foreach loop works fine but is very slow:

    foreach (string singleUrl in this.urlArray) {
        this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    }

Also: I had a solution in PHP where I spawned several "fetchers" simultaneously, and it never hung. It seems strange to me that PHP would handle multithreaded retrieval better than C#, so I must be missing something obvious.

What is the fastest way to query the API without these strange failures?

3 Comments

  • Wouldn't it be easier to use the async version of that call? Commented Dec 8, 2014 at 12:44
  • Do you mean together with Parallel.ForEach or with a normal foreach loop? Commented Dec 8, 2014 at 12:58
  • With the normal foreach, and let the WebClient instance handle the completion. Commented Dec 8, 2014 at 12:59

1 Answer

Hi, did you try to speed up your code with async downloads, as in this question (see the accepted answer):

DownloadStringAsync wait for request completion

You could loop through your URIs and get a callback for each successful download.
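
A minimal sketch of that idea, using the Task-based `WebClient.DownloadStringTaskAsync` rather than the event-based `DownloadStringAsync` from the linked question. The names `AsyncFetcher`, `DownloadOneAsync`, `FetchAllAsync` and the `fetch` parameter are made up for illustration, and the sketch assumes every URL can be requested at once and that a single failed download may fail the whole batch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

public static class AsyncFetcher
{
    // Downloads one URL with its own WebClient, because a single WebClient
    // instance does not support several concurrent operations.
    public static async Task<string> DownloadOneAsync(string url)
    {
        using (var client = new WebClient())
        {
            return await client.DownloadStringTaskAsync(url);
        }
    }

    // Starts all downloads without blocking a thread per request, waits for
    // them, then fills the dictionary from a single thread so no lock is needed.
    // 'fetch' is a hypothetical hook: pass DownloadOneAsync for real requests.
    public static async Task<Dictionary<string, string>> FetchAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch)
    {
        // Duplicate URLs are dropped so each key is written once.
        List<string> urlList = urls.Distinct().ToList();

        // Task.WhenAll returns the results in the same order as the tasks.
        string[] bodies = await Task.WhenAll(urlList.Select(fetch));

        var responses = new Dictionary<string, string>();
        for (int i = 0; i < urlList.Count; i++)
        {
            responses[urlList[i]] = bodies[i];
        }
        return responses;
    }
}
```

With the question's array this could be called as `var responses = await AsyncFetcher.FetchAllAsync(this.urlArray, AsyncFetcher.DownloadOneAsync);`. If any download throws, awaiting the `Task.WhenAll` call rethrows, so you may want a try/catch inside `DownloadOneAsync` if partial results are acceptable.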

EDIT: I have seen that you use

this.apiResponseBlob = DL

When you use multithreading, every thread tries to write to that variable. This could be a reason for your bug. Try using a local variable of that type inside the loop, or use

lock{}

so that only one thread can write to this variable at a time. http://msdn.microsoft.com/de-de/library/c5kehkcz.aspx

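For example:

    Parallel.ForEach(this.urlArray, singleUrl => {
        // Local variable: each thread gets its own copy instead of sharing a field.
        var apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        // Lock on one shared object. Locking on singleUrl would give each URL
        // its own lock and would not protect the dictionary.
        lock (this.responsesDictionary) {
            this.responsesDictionary.Add(singleUrl, apiResponseBlob);
        }
    });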
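
As an alternative to locking, here is a rough sketch that collects the results in a `ConcurrentDictionary`, which is safe to write to from several threads. The names `ParallelFetcher`, `DownloadOne`, `FetchAll`, the `fetch` parameter and the default of 4 parallel requests are assumptions for illustration, not something taken from the question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

public static class ParallelFetcher
{
    // Downloads one URL synchronously and disposes the WebClient afterwards.
    public static string DownloadOne(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    // Runs 'fetch' for every URL in parallel and stores each result under its URL.
    // 'fetch' is a hypothetical hook: pass DownloadOne for real requests.
    public static ConcurrentDictionary<string, string> FetchAll(
        IEnumerable<string> urls, Func<string, string> fetch, int maxParallel = 4)
    {
        var responses = new ConcurrentDictionary<string, string>();

        // Cap the number of simultaneous requests (4 is an arbitrary choice).
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            // Local variable, so threads never share the downloaded text.
            string body = fetch(url);

            // The indexer setter is thread-safe and overwrites on duplicate URLs.
            responses[url] = body;
        });

        return responses;
    }
}
```

Usage would be something like `var responses = ParallelFetcher.FetchAll(this.urlArray, ParallelFetcher.DownloadOne);`. Exceptions thrown inside the loop body surface from `Parallel.ForEach` as an `AggregateException`, so failed downloads are not silently lost.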

3 Comments

What about when the WebClient is inside another class? Can I still use async by passing this class just one URL?
Yes, you can, but you have to use an event for the response. Something like this (written from memory ;) and not tested):

    public class MyDownloader {
        public event EventHandler<EventArgs> DlFinished;

        public void DLAsync(Uri url) {
            var client = new WebClient();
            client.DownloadStringCompleted += (sender, e) => {
                // doSomeThing is a placeholder for your own handling of the response.
                doSomeThing(e.Result);
                // Raise the event only if someone has subscribed.
                if (this.DlFinished != null) this.DlFinished(this, EventArgs.Empty);
            };
            client.DownloadStringAsync(url);
        }
    }

Usage from another class:

    MyDownloader loader = new MyDownloader();
    loader.DlFinished += CallbackFunction;
    loader.DLAsync(uri);

Hope this is readable ;) More info about custom events here: stackoverflow.com/questions/6644247/simple-custom-event
