1

Disclaimer: I have little experience with linq.

One of my tasks at my work is to maintain an e commerce web site. Yesterday, one of our customers started complaining of a timeout that would occur when they tried to create a feed file for google. Turns out, if the user has more than 9,000 items to put in their feed file, our code takes at least one minute to execute.

I couldn't find the source of the problem by running the debugger, so I fired up a profiler (ANTS) and let it do its thing. It found the source of our problem, a foreach loop that contains a bit of linq code. Here is the code:

var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
List<google_Category> retCats = new List<google_Category>(numCategories);
int added = 0;

//this line was flagged by the profiler as taking 48.5% of total run time
foreach (google_ProductMapping pm in (from pm in productMappings orderby pm.MappingType descending select pm))
{
    if (pm.GoogleCategoryId.HasValue && pm.GoogleCategoryId > 0)
    {
        //this line was flagged as 36% of the total time
        retCats.Add(pm.google_Category);
    }

    else if (pm.GoogleCategoryMappingId.HasValue && pm.GoogleCategoryMappingId > 0)
    {
        retCats.Add(pm.google_CategoryMapping.google_Category);
    }
    else
    {
        continue;
    }

    if (++added >= numCategories)
    {
        break;
    }
}

Do any of you more experienced devs have any ideas? I was toying with trying to replace all the linq with sql, but I am unsure if that is the best course of action here (if it was written with linq, there must be a reason for it).

4
  • don't do the linq in the loop. store the results in a variable and use it. Commented Feb 28, 2011 at 23:15
  • Bump up the DataContext.CommandTimeout for a quick fix. Commented Mar 1, 2011 at 0:45
  • May end of having to do that. Like to avoid it if possible Commented Mar 1, 2011 at 0:48
  • @mad: That will not make any difference at all. Commented Mar 1, 2011 at 14:39

4 Answers 4

2

If you can filter out the results you don't want anyway your query should be faster - you are using orderby hence all these results use up processing in your query since they all have to be evaluated:

 productMappings.Where( pm => (pm.GoogleCategoryMappingId.HasValue
                                && pm.GoogleCategoryMappingId > 0)
                              ||(pm.GoogleCategoryMappingId.HasValue && 
                                 pm.GoogleCategoryMappingId > 0)
                      )
                .OrderBy(...)

Also you should limit the number of results returned by the query since you only use up to numCategories. So add a

.Take(numCategories)

to your query, instead of checking within the foreach loop.

Sign up to request clarification or add additional context in comments.

6 Comments

I'm going to give this a try, let you know what happens
The added where clause boosted the time out from 48%, to 78%. The take didn't seem to do much either.
how big is that table? Is there a proper index on MappingType (and the other columns in the where clause for that matter)?
The table has 411,086 entries. It doesn't look like MappingType has an index, nor do I have the permissions to create one.
if you don't have a proper index your options of optimizing that query are very limited - it'll probably always result in a full table scan
|
1

The reason retCats.Add(pm.google_Category); takes so long is because you are referecing a lazily loaded object which does another round trip to the server. If you can refactor that so you only take a local copy of the Id instead of the whole object it will speed that part up.

If you do need to take the whole object, then investigate how you can pull it down in a single query when getting the productMappings. How to do this will depend on what LINQ wrapper you are using on your SQL.

1 Comment

Once I get the first bottle neck fixed I will look into this more. Thank you
0

Not knowing your database schema it's really hard to tell. A couple of ideas:

1) Run the query through the Database Engine Tuning Advisor. Maybe the query needs some indexes?

2) pre-processing this information and putting it in another table or file. That way when google requests it it won't timeout.

Comments

0

This should probably work:

var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
var categories = from pm in productMappings
                 where pm.GoogleCategoryId > 0 ||
                       pm.GoogleCategoryMappingId > 0
                 orderby pm.MappingType descending
                 select pm.google_Category ??
                        pm.google_CategoryMapping.google_Category;

return categories.Take(numCategories);

It would work best if GetGoogleProductMappingsByID would return an IQueryable (if applicable). If so, LINQ will convert the entire statement into a T-SQL command and that would be far faster than in memory LINQ.

Feel free to add a .ToList() to the last statement to get it into the same return type as in your code (and to force execution of the LINQ statement).

Checking for both .HasValue and > 0 is useless. Checking for Id > 0 is enough.
For more info: http://msdn.microsoft.com/en-us/library/2cf62fcy.aspx (operators)

3 Comments

I'll give this a go right now. Agreed on the .hasvalue. Everything in the table was either a 1 or 0
compile error: A query body must end with a select clause or a group clause. The odd thing I see the select clause, so I don't know what is making the compiler complain
Ah I think it was my mistake of typing 'desc' instead of 'descending'. Sorry I'm typing this without Visual Studio, so no syntax validation :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.