
I have a long-running process in C# that hits a SQL table anywhere from 10 to 200 times. When the process exceeds about 50 hits, querying more than about 100,000 rows from the same table each time, it throws a System.OutOfMemoryException at the code below, specifically at the bottom where it converts the IQueryable to a List:

var cht = from p in _db.TickerData
          where p.Time >= Convert.ToDateTime(start) &&
          p.Time <= Convert.ToDateTime(end)
          orderby p.Time
          select p;

_prices = cht.ToList();    // <-- this is where the System.OutOfMemoryException occurs

What can I do to prevent this error?

  • I have tried creating an index on the [Time] column in SQL Server Express. The only current index is the primary key on the [Id] column. Commented Sep 20, 2015 at 15:34
  • So it runs out of memory when you try to load 100,000 rows into memory? How do you think you might fix this? Commented Sep 20, 2015 at 15:39
  • The question is, what are you doing with _prices? After that we can tell you how to fix the problem. Commented Sep 20, 2015 at 18:41

5 Answers


First off:

specifically at the bottom where it converts the IQueryable to a List

Yes, that's where you'd expect the Out Of Memory condition to occur.

The assignment of cht above doesn't actually hit the database; all it does is declare the shape of the query. This is called deferred execution and LINQ uses it all over the place. It means "we don't actually process anything until your code needs it."

Calling ToList, though, essentially says "the code needs it, all of it, right now." So that's where it sends the query to the database, pulls back all the results at once, uses LINQ magic to turn them into CLR objects, and stuffs them all into a List<T> for you.
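To make that concrete, here's a minimal sketch of where the work actually happens (assuming the entity type behind _db.TickerData is called TickerData):

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);

// Composing the query: no database round-trip happens here.
IQueryable<TickerData> query = _db.TickerData
    .Where(p => p.Time >= startTime && p.Time <= endTime)
    .OrderBy(p => p.Time);

// Still no round-trip: adding operators only reshapes the query.
var limited = query.Take(1000);

// THIS is where the SQL is generated, executed, and every row is
// materialized into memory at once.
List<TickerData> prices = limited.ToList();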

Having said that, this is just a hunch, but it's possible that your LINQ provider doesn't know what Convert.ToDateTime is. If it doesn't know how to handle that, it won't put it into the WHERE clause in the query it executes, and instead it will load the entire table and filter it client-side, which might be why you crash when the table gets too big, rather than when the result set gets too big.

To verify this, use a profiler for your database to intercept the query, and see if the WHERE clause looks like you'd expect. If it's not translating right, try this instead:

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);
var cht = from p in _db.TickerData
          where p.Time >= startTime && p.Time <= endTime
          orderby p.Time
          select p;
_prices = cht.ToList();

If that doesn't help, you're probably just pulling back too much data, and you'll have to deal with that the same way you'd handle processing too much data in any other context.


2 Comments

The problem was occurring due to Convert.ToDateTime(). Thanks @Mason
That problem, however, was only a subset of the real problem, which was using ToList instead of IEnumerable to create a class object; switching seemed to solve the larger issue. Hope that is an accurate description of how to use IEnumerable. That answer was mentioned by @Ivan below in the comments.

The data you are trying to retrieve is too large for your list. The exception occurs on ToList() because that's exactly where the query executes. What do you want to achieve with such a large list? Possible solutions are:

1) Restrict your search with more criteria. Don't load the whole data set at once; load a part of it, and another part only if you really need it.

2) Use a data structure other than a list if you want to load the whole data set into memory; take a look at ConcurrentDictionary.
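A related approach, mentioned in the comments elsewhere on this question, is to avoid materializing at all and stream the rows instead. A minimal sketch, reusing the question's names (ProcessPrice is a hypothetical per-row handler):

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);

// No ToList(): the query stays deferred, and enumerating it streams
// rows from the data reader one at a time instead of building a
// 100,000-element list in memory.
var cht = _db.TickerData
             .Where(p => p.Time >= startTime && p.Time <= endTime)
             .OrderBy(p => p.Time);

foreach (var price in cht)
{
    ProcessPrice(price);  // hypothetical per-row handler
}

Note that with a change-tracking ORM context, the materialized entities can still accumulate in the change tracker, so this helps most when tracking is disabled or the context is short-lived.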



Your problem is that the query returns a very large set of data which needs to be stored in your process's memory. Too much data => OutOfMemoryException. That's normal. What's not normal is trying to do such a thing. Instead, you could limit the result set with some extra filtering, or break the large result set into smaller ones, maybe like so:

DateTime startDateTime = Convert.ToDateTime(start);
DateTime endDateTime = Convert.ToDateTime(end);
int fetched = 0;
int totalFetched = 0;

do
{
    // fetch a batch of 1000 records
    var _prices = _db.TickerData.Where(p => p.Time >= startDateTime && p.Time <= endDateTime)
                                .OrderBy(p => p.Time)
                                .Skip(totalFetched)
                                .Take(1000)
                                .ToList();

    // do here whatever you want with your batch of 1000 records
    fetched = _prices.Count;
    totalFetched += fetched;
}
while (fetched > 0);

This way you can process any amount of data, in batches.

EDIT: fixed some issues, as reported by @Code.me in the comments section.

EDIT: I suggest you set up an index at the database level on the Time column, if you haven't already, to speed up these queries.

4 Comments

I see 3 issues here. First, Take(1000) will always fetch the first 1000 records. You need to Skip(1000) after first iteration and continue fetching. Second, _prices is out of scope in the while statement. Third, IEnumerable<T> doesn't have a Count property. You need to invoke .Count() extension method on it.
@Code.me Thank you for spotting those problems. Fixed first two issues (post was made in a hurry...). As for third, ToList() returns a List<T> which does have the Count property.
You are correct. I always work with IEnumerable<T>. I forgot that it returns IList<T>.
@Code.me public static List<TSource> ToList<TSource>( this IEnumerable<TSource> source ) It really is returning a List<T>, not IList<T>.

Because of deferred execution, the query will execute when you call ToList() on it. Since loading all the data at once will consume too much memory, it's good practice to process it in batches.

The code below will let you fetch 1000 records (or whatever number works best for you) at a time and process them.

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);

IEnumerable<T> prices = new List<T>();  // whatever T is

var currentFetched = 0;
var totalFetched = 0;

do
{
    var cht = _db.TickerData.Where(p => p.Time >= startTime && p.Time <= endTime)
                        .OrderBy(p => p.Time)
                        .Skip(totalFetched)
                        .Take(1000)
                        .ToList();

    currentFetched = cht.Count();
    totalFetched += currentFetched;

    // prices = prices.Concat(cht).ToList();
    // ^ This might throw an exception later when the list becomes too big,
    // so it's better to process the currently fetched batch here instead.
}
while (currentFetched > 0);

6 Comments

This implementation retrieves records 1-1000, then 1001-2000, and then the same 1001-2000 over and over, because after the first fetch you will always skip only the first 1000 records.
Ah, thanks for noticing it. I accidentally deleted the part for calculating total fetched when fixing an earlier typo. Fixed it now.
I don't see how this answer helps. If the OP wanted to process the query result, he could do that himself by turning it into an IEnumerable instead of using ToList. Also, loading 100K records into memory is not such a big deal nowadays.
@Ivan I am going to try your suggestion on IEnumerable.
@Ivan the real issue I was having was solved by creating an IEnumerable class instead of using ToList... but you did not give me an answer, only a comment on someone else's answer, so I could not mark it correct!

What I did with mine is just let it return an IQueryable object. You can still enumerate it like a list, but execution is deferred and it performs a lot better.
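For illustration, a hedged sketch of what returning a deferred query might look like, with names assumed from the question:

// Returns a composable, deferred query rather than a materialized list;
// nothing executes until the caller enumerates or further filters it.
public IQueryable<TickerData> GetPrices(DateTime startTime, DateTime endTime)
{
    return _db.TickerData
              .Where(p => p.Time >= startTime && p.Time <= endTime)
              .OrderBy(p => p.Time);
}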
