I have a C# Script Task in an SSIS package designed to geocode data through my company's proprietary system. It currently works like this:
1) Pull a query of addresses and put the rows in a DataTable.
2) Loop through that table and, for each row, build the request, send the request, wait for the response, then insert the result back into the database.
The issue is that each call takes a long time to return, because before going out and getting a new address, the API side checks an existing database (string match) to see whether the address already exists. Only if it does not exist does it go out and get new data from a service like Google.
Because I'm doing one at a time, it makes it easy to keep the ID field with the record when I go back to insert it into the database.
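To make the current flow concrete, here is a minimal sketch of the one-at-a-time loop. The names `SequentialGeocoder`, `Run`, and the `Id`/`Address` columns are placeholders I made up for illustration, and the `geocode` delegate stands in for the real "build request, send request, wait for response" step, so this is not the actual package code.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SequentialGeocoder
{
    // geocode is a hypothetical stand-in for the real API call.
    public static List<KeyValuePair<int, string>> Run(
        DataTable addresses, Func<string, string> geocode)
    {
        var results = new List<KeyValuePair<int, string>>();
        foreach (DataRow row in addresses.Rows)
        {
            // Assumed column names; the real query may differ.
            int id = (int)row["Id"];
            string address = (string)row["Address"];

            // Blocks here until the API answers, so the ID is still in scope.
            string response = geocode(address);

            // In the real task, the database insert for this id happens here.
            results.Add(new KeyValuePair<int, string>(id, response));
        }
        return results;
    }
}
```

Because each iteration finishes before the next one starts, the ID and the response never get separated, but the total run time is the sum of every call's wait.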
Now comes the issue at hand... I was told to configure this as multi-thread or asynchronous. Here is the page I was reading on here about this topic: ASP.NET Multithreading Web Requests
    var urls = new List<string>();
    var results = new ConcurrentBag<OccupationSearch>();
    Parallel.ForEach(urls, url =>
    {
        // Each iteration requests its own url (the lambda parameter).
        WebRequest request = WebRequest.Create(url);
        string response = new StreamReader(request.GetResponse().GetResponseStream()).ReadToEnd();
        var result = new JsonSerializer().Deserialize<OccupationSearch>(new JsonTextReader(new StringReader(response)));
        results.Add(result);
    });
Perhaps I'm thinking about this wrong, but if I send two requests (A and B) and, let's say, B actually returns first, how can I ensure that when I go back to update my database I'm updating the correct record? Can I send the ID with the API call and have it returned?
My thought is to create an array of requests, burn through them without waiting for each response, and collect the returned values in another array that I will then loop through for my insert statement.
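Here is a rough sketch of what I have in mind, in case it helps show where I might be going wrong. It assumes the ID can simply travel with the address into each parallel iteration, so that the order in which responses come back does not matter. `ParallelGeocoder`, `Run`, `geocode`, and `maxParallel` are names I invented for the example; `geocode` again stands in for the real WebRequest call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelGeocoder
{
    // Input is (ID, address) pairs; output maps each ID to its response.
    public static ConcurrentDictionary<int, string> Run(
        IEnumerable<KeyValuePair<int, string>> addressesById,
        Func<string, string> geocode,
        int maxParallel)
    {
        var results = new ConcurrentDictionary<int, string>();

        // Cap the number of concurrent calls so the API is not flooded.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(addressesById, options, item =>
        {
            // item.Key is this iteration's ID; it stays paired with the
            // response no matter which request finishes first.
            string response = geocode(item.Value);
            results[item.Key] = response;
        });

        // Parallel.ForEach returns only after every iteration has completed.
        return results;
    }
}
```

After `Run` returns, I would loop over the dictionary on a single thread and run the insert for each key, so the database write itself stays sequential.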
Is this a good way of going about this? I've never used Parallel.ForEach, and all the info I find on it is too technical for me to visualize and apply to my situation.
Note: I am using WebRequest and NOT HttpClient.