
I need to load multiple SQL statements from SQL Server into DataTables. Most of the statements return some 10,000 to 100,000 records, and each takes up to a few seconds to load. My guess is that this is simply due to the amount of data that needs to be shoved around. The statements themselves don't take much time to process.

So I tried to use Parallel.For() to load the data in parallel, hoping that the overall processing time would decrease. I do get a 10% performance increase, but that is not enough. A reason might be that my machine is only a dual core, which limits the benefit here. The server on which the program will be deployed has 16 cores, though.
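Roughly, the loading loop looks like this (simplified sketch; the connection string and the statement list stand in for the real ones):

    using System.Data;
    using System.Data.SqlClient;
    using System.Threading.Tasks;

    DataTable[] LoadAll(string connStr, string[] statements)
    {
        var tables = new DataTable[statements.Length];
        Parallel.For(0, statements.Length, i =>
        {
            // Each iteration gets its own connection; SqlConnection is not thread-safe.
            using (var conn = new SqlConnection(connStr))
            using (var adapter = new SqlDataAdapter(statements[i], conn))
            {
                var table = new DataTable();
                adapter.Fill(table); // Fill opens and closes the connection itself.
                tables[i] = table;
            }
        });
        return tables;
    }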

My question is: how can I improve performance further? Would asynchronous data service queries (BeginExecute, etc.) be a better solution than PLINQ? Or maybe some other approach?

The SQL Server is running on the same machine. This is also the case on the deployment server.

EDIT: I've run some tests using a DataReader instead of a DataTable. This already decreased the load times by about 50%. Great! Still, I am wondering whether parallel processing with BeginExecute would improve the overall load time on a multiprocessor machine. Does anybody have experience with this? Thanks for any help on this!
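For reference, the DataReader variant looks roughly like this (the query and column types are made up for illustration; my real statements return many string columns):

    using System.Collections.Generic;
    using System.Data.SqlClient;

    List<KeyValuePair<int, string>> LoadRows(string connStr, string sql)
    {
        var rows = new List<KeyValuePair<int, string>>();
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                // Typed getters avoid the per-row overhead that a DataTable adds.
                while (reader.Read())
                    rows.Add(new KeyValuePair<int, string>(
                        reader.GetInt32(0), reader.GetString(1)));
            }
        }
        return rows;
    }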

UPDATE: I found that about half of the loading time was consumed by processing the SQL statements. In SQL Server Management Studio the statements took only a fraction of the time, but somehow they take much longer through ADO.NET. So by using DataReaders instead of loading DataTables, and by adapting the SQL statements, I've come down to about 25% of the initial loading time. Loading the DataReaders in parallel threads with Parallel.For() does not make an improvement here. So for now I am happy with the result and will leave it at that. Maybe when we update to .NET 4.5 I'll give asynchronous DataReader loading a try.
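For the record, the .NET 4.5 variant I would try looks roughly like this (untested sketch; OpenAsync, ExecuteReaderAsync and ReadAsync are the 4.5 additions, and the single-column query is hypothetical):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Threading.Tasks;

    async Task<List<string>> LoadColumnAsync(string connStr, string sql)
    {
        var values = new List<string>();
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(sql, conn))
        {
            await conn.OpenAsync();
            using (var reader = await cmd.ExecuteReaderAsync())
            {
                while (await reader.ReadAsync())
                    values.Add(reader.GetString(0));
            }
        }
        return values;
    }

    // All statements could then be overlapped without blocking threads:
    // var results = await Task.WhenAll(statements.Select(s => LoadColumnAsync(connStr, s)));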

4 Comments
  • please show some source code... Commented Nov 23, 2012 at 16:52
  • there is always a good chance that loading 1000s of records is a sign of bad design ... (unless you do it for caching for example) Commented Nov 23, 2012 at 17:06
  • And describe your use case... how timely does the data need to be? What caching options do you have? Can you restrict the elements you need to load, reducing network traffic? Can you optimize your server-side database to facilitate the query? Commented Nov 23, 2012 at 17:12
  • I am writing a search engine. I do cache the largest results already, but that does not really solve the problem since the users can search just about anything. Basically the data needs to be returned asap since the user waits for the search results. Commented Nov 23, 2012 at 17:34

3 Answers


My guess is that this is simply due to the amount of data that needs to be shoved around.

No, it is due to using a SLOW framework. I am pulling nearly a million rows into a dictionary in less than 5 seconds in one of my apps. DataTables are SLOW.
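For illustration, a minimal sketch of pulling rows straight into a dictionary with a plain SqlDataReader (the query and column names are invented):

    using System.Collections.Generic;
    using System.Data.SqlClient;

    Dictionary<int, string> LoadLookup(string connStr)
    {
        var map = new Dictionary<int, string>();
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand("SELECT Id, Name FROM SomeTable", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    map[reader.GetInt32(0)] = reader.GetString(1); // no DataRow allocated per row
            }
        }
        return map;
    }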


4 Comments

Loading them into a dictionary would be just fine to me. I don't use the DataTables for additional processing anyway. Would be great if there was another way to load these!
Well, then do it. I use BlToolkit as a lightweight ORM and pull them into a dictionary. DataTables are about the slowest way to deal with data. Put the data into objects, load them.
Thanks for your advice! I've looked at the source code of BlToolkit and it is using a DataReader itself to load data into dictionaries or lists. My queries might take a little longer than yours since I read a lot of columns with string data. Probably I won't get faster than using a DataReader for now. Or maybe still if I load several DataReaders in parallel. But that only seems to be well supported in .NET 4.5, and I am still on .NET 4.
BlToolkit will be a lot faster in PROGRAMMING though - and that is money, straight and directly (fewer hours spent). THAT is why ORMs are used - NOTHING in .NET is, at the end, NOT using a DataReader.

You have to change the nature of the problem. Let's be honest: who needs to view 10,000 to 100,000 records per request? I think no one.

You should consider handling paging, and in your case paging should be done on SQL Server. To make this clear, let's say you have a stored procedure named "GetRecords". Modify this stored procedure to accept a page parameter and return only the data relevant to that specific page (let's say 100 records) plus the total page count. Inside the app, just show those 100 records (they will fly) and handle the selected page index.
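For example, a hypothetical paged query (OFFSET/FETCH needs SQL Server 2012 or later; on older versions a ROW_NUMBER() wrapper does the same job; table and column names are invented):

    using System.Data.SqlClient;

    // Fetch one page of 100 records; only those rows cross the wire.
    const string sql = @"
        SELECT Id, Title
        FROM   Records
        ORDER  BY Id
        OFFSET @PageIndex * @PageSize ROWS
        FETCH NEXT @PageSize ROWS ONLY;";

    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@PageIndex", pageIndex);
        cmd.Parameters.AddWithValue("@PageSize", 100);
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // bind the row to the page's grid
            }
        }
    }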

Hope this helps, best regards!

4 Comments

-1. There are quite a few valid reasons to pull 10,000+ rows per request, especially when you do further processing. I regularly process multi-million-row datasets. It really depends on the application.
@TomTom I can't agree with your concept. Pulling 10k-100k rows on an ASP page request just kills performance. Isn't that exactly what we are talking about right now!? Besides that, data operations are much more effective on the SQL Server side than inside a .NET application.
I do a lot of further processing with the requests, and the end result is indeed paged. I've done all the SQL-side optimizations I can think of, but the source table has 10 million records. Yes, I could rewrite the whole thing in T-SQL, but what a pain that would be. Another solution would be to do the entire processing in the SQL Server CLR (load the assembly into the database). Still, that would be a major redesign, which I want to avoid.
@GregorPrimar Well, on an ASP.NET page possibly, but I do write software that processes millions of rows in a grid. When you visualize 3 years of 5-second data for fast horizontal scrolling then - yes, that is a lot of rows. Start doing mathematical analysis on it to write out the results and - yes, we are talking gigabytes of data.

Do you often have to load these requests? If so, why not use a distributed cache?

1 Comment

I do cache the largest results already, but that does not really solve the problem since the users can search just about anything.
