I'm trying to implement a web-scraping crawler as part of my ASP.NET MVC project. It gathers a large amount of data from many different URLs using Html Agility Pack. The problem is that when I run the function on the remote server, I get "The connection was reset" after about a minute. It works better when I run it locally. I have access to the remote IIS. Any suggestions for solving this problem, or any alternatives?

2
  • Are you trying to retrieve all of the URLs in one request to your page? Commented Jan 9, 2011 at 9:17
  • Yes, I think so. There is a loop that generates the URLs and tries to fetch them, but with this time limit it only gets through a few of them. Commented Jan 9, 2011 at 9:20

2 Answers

1

If you have a long-running process in ASP.NET, it is best to run it on a separate thread rather than inside the request itself, so the request can return before the connection times out.

See this and this - related questions and this MSDN article.
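As a rough illustration of that idea, here is a minimal sketch of starting the crawl on a background task and letting the request return at once. `CrawlJobs`, `Start`, `Processed`, `IsFinished` and the `scrapeOne` delegate are hypothetical names made up for this example, not part of ASP.NET or Html Agility Pack. It assumes .NET 4 or later for `Task` and `ConcurrentDictionary`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: runs a crawl outside the request and tracks progress.
public static class CrawlJobs
{
    // jobId -> number of URLs processed so far
    private static readonly ConcurrentDictionary<Guid, int> Progress =
        new ConcurrentDictionary<Guid, int>();

    // jobId -> the background task doing the work
    private static readonly ConcurrentDictionary<Guid, Task> Tasks =
        new ConcurrentDictionary<Guid, Task>();

    // Returns immediately; the loop continues on a dedicated background thread.
    public static Guid Start(IEnumerable<string> urls, Action<string> scrapeOne)
    {
        var id = Guid.NewGuid();
        Progress[id] = 0;
        Tasks[id] = Task.Factory.StartNew(() =>
        {
            foreach (var url in urls)
            {
                try { scrapeOne(url); }
                catch (Exception) { /* assumption: log the failure and move on */ }
                Progress.AddOrUpdate(id, 1, (key, n) => n + 1);
            }
        }, TaskCreationOptions.LongRunning);
        return id;
    }

    // Number of URLs attempted so far, or -1 for an unknown job id.
    public static int Processed(Guid id)
    {
        int n;
        return Progress.TryGetValue(id, out n) ? n : -1;
    }

    public static bool IsFinished(Guid id)
    {
        Task t;
        return Tasks.TryGetValue(id, out t) && t.IsCompleted;
    }
}
```

A controller action could call `CrawlJobs.Start(...)`, return the job id to the browser, and expose a second action that reports `Processed(id)` for polling. Keep in mind that IIS can recycle the application pool and take background threads with it, so a very long crawl may be safer in a separate process such as a Windows service.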


0

Connection and network problems can cause errors like this. To keep one slow or failing URL from blocking the rest, you could parallelize the work across separate threads.
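One way this might look, as a sketch rather than a definitive implementation: `Parallel.ForEach` fetches several URLs at once and records failures per URL instead of aborting the whole run. `ParallelScraper`, `FetchAll` and the `fetch` delegate are hypothetical names for this example. The delegate stands in for whatever actually downloads the page, which keeps the sketch independent of the network.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: fetches many URLs in parallel with per-URL error isolation.
public static class ParallelScraper
{
    public static ConcurrentDictionary<string, string> FetchAll(
        IEnumerable<string> urls,
        Func<string, string> fetch,                        // assumed: returns the page content
        int maxParallel,
        ConcurrentDictionary<string, Exception> failures)  // filled with the URLs that threw
    {
        var pages = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            try
            {
                pages[url] = fetch(url);
            }
            catch (Exception ex)
            {
                // A bad URL is recorded and skipped; the other URLs keep going.
                failures[url] = ex;
            }
        });

        return pages;
    }
}
```

In the scenario from the question, `fetch` could wrap a call such as Html Agility Pack's `HtmlWeb.Load(url)` and return the extracted text. Capping `maxParallel` at a small number keeps the crawler from opening too many connections to the remote sites at once.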
