I'm wondering if someone can help simplify this procedure and improve its performance.

We have data on grants. 'Donors' give funds to 'Recipients' and we want to show the top 15 recipients for each donor over 3 periods: CurrentYear-20, CurrentYear-10 and CurrentYear. We publish an annual report and show percentage shares of World and GeoZone totals for each donor.

I have "inherited" this code, which was written by one of my predecessors. Until we switched to using a view, execution time was around 15-30 minutes. Currently, it runs in just under FOUR hours (scheduled as a SQL Server Agent job)! Management are not happy. For various reasons, the view must continue to be used; it currently has just under 900,000 rows, with data from the 1950s onwards. We currently run this report for 30 (large) donors, and more are added each year.

To help improve performance, I have thought about using a CTE, SUM() OVER (PARTITION BY ...), or a combination of the two, but I'm not sure how to go about it.

Could someone point me in the right direction?

Here is the process:

  • create a table (variable) to hold the top 15 recipients for the current donor
  • create a table (variable) to hold the list of donors
  • populate the donor table with the donors in the order they appear in the report
  • loop through the donor table and, for each donor:
    • put the donor ID for this donor into a temp table
    • loop 3 times (for CurrentYear-20, CurrentYear-10, CurrentYear):
      • calculate the share totals for each of 18 regions/zones
      • print the values for each section in the report
    • get the next donor ID

As you may see from the above, the calculations are run 54 times (18x3) for each donor!

Here is the code (simplified):

-- @LatestYear is passed as a parameter, hardcoded here for simplicity
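DECLARE @LatestYear SMALLINT ,
    @CurrentYear SMALLINT ,
    @DonorID SMALLINT ,
    @totalWorld NUMERIC(10, 2) ,
    @totalGeoZones NUMERIC(10, 2) ,   -- used below; type assumed to match @totalWorld
    @LoopCounter TINYINT ,
    @DonorName VARCHAR(100)  
    -- @Separator, @Heading1, @Heading2 and @Heading3 are used below; their declarations are omitted from this simplified version
SELECT  @LatestYear = 2012  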

    -- create a table to hold list of top 15 recipients for each donor and their 'share' of ODA.  
DECLARE @Top15 TABLE
(
  Country VARCHAR(100) ,
  Percentage REAL
)  

    -- create a table to hold list of donors, ordered as they need to appear in the report.  
DECLARE @PageOrder TABLE
(
  DonorID SMALLINT ,
  DonorName VARCHAR(100) ,
  SortOrder SMALLINT IDENTITY(1, 1)
)

    -- create a table to store the "focus" donor.  
DECLARE @CurrentDonor TABLE ( DonorID SMALLINT )

INSERT  INTO @PageOrder
        SELECT  DonorID ,
                DonorName
        FROM    dbo.LookupDonor
        ORDER BY DonorName;  

    -- cursor to loop through the donors in SortOrder
DECLARE DonorCursor CURSOR
FOR
    SELECT  DonorID ,
            DonorName
    FROM    @PageOrder
    ORDER BY DonorName;
OPEN DonorCursor
FETCH NEXT FROM DonorCursor INTO @DonorID, @DonorName

WHILE @@fetch_status = 0 
    BEGIN

        INSERT  INTO pubOutput
                ( XMLText )
                SELECT  @DonorName;

    -- Populate the DonorID table (emptied first so it only ever holds the current donor)
        DELETE  FROM @CurrentDonor
        INSERT  INTO @CurrentDonor
        VALUES  ( @DonorID )

    /* The following loop is invoked 3 times. The first time through, the year will be 20 years before the latest year,
    the second time through, 10 years before. The last time through the year will be the latest year.
    */

        SET @LoopCounter = 1
        WHILE @LoopCounter <= 3 
            BEGIN
                SELECT  @CurrentYear = CASE @LoopCounter
                                         WHEN 1 THEN @LatestYear - 20
                                         WHEN 2 THEN @LatestYear - 10
                                         ELSE @LatestYear
                                       END

        -- calculate the world total for the current years (year,year-1) for all recipients
                SELECT  @totalWorld = SUM(Amount)
                FROM    dbo.vData2 d
                        INNER  JOIN ( SELECT    RecipientID
                                      FROM      dbo.RecipientGroup
                                      WHERE     GroupID = 160
                                    ) c ON d.RecipientID = c.RecipientID
                        INNER  JOIN @CurrentDonor z ON d.DonorID = z.DonorID
                WHERE   d.year IN ( @CurrentYear - 1, @CurrentYear )

        -- calculate the GeoZones total for the current years (year,year-1) 
                SELECT  @totalGeoZones = SUM(Amount)
                FROM    dbo.vData2 d
                        INNER  JOIN ( SELECT    RecipientID
                                      FROM      dbo.GeoZones
                                      WHERE     GeoZoneID = 100
                                    ) x ON d.RecipientID = x.RecipientID
                        INNER  JOIN @CurrentDonor z ON d.DonorID = z.DonorID
                WHERE   d.year IN ( @CurrentYear - 1, @CurrentYear )

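        -- Find the top 15 recipients for the current donor
        -- (clear the previous rows first, otherwise @Top15 keeps accumulating across periods and donors)
                DELETE  FROM @Top15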
                INSERT  INTO @Top15
                        SELECT TOP 15
                                r.RecipientName ,
                                ( ISNULL(SUM(Amount), 0) / @totalWorld ) * 100
                        FROM    dbo.vData2 d
                                INNER JOIN dbo.LookupRecipient r ON r.RecipientID = d.RecipientID
                                INNER JOIN @CurrentDonor z ON d.DonorID = z.DonorID
                        WHERE   d.year IN ( @CurrentYear - 1, @CurrentYear )
                        GROUP BY r.RecipientName
                        ORDER BY 2 DESC

        -- Print the top 15 recipients and total
                INSERT  INTO pubOutput
                        (
                          XMLText
                        )
                        SELECT  country + @Separator + CAST(percentage AS VARCHAR)
                        FROM    @Top15
                        ORDER BY percentage DESC
                INSERT  INTO pubOutput
                        (
                          XMLText
                        )
                        SELECT  @Heading1 + @Separator + CAST(SUM(Percentage) AS VARCHAR)
                        FROM    @Top15

    -- Breakdown by Regions
        -- Region1
                IF @totalWorld IS NOT NULL 
                    INSERT  INTO pubOutput
                            (
                              XMLText
                            )
                            SELECT  'Region1' + @Separator
                                    + CAST(( ISNULL(SUM(Amount), 0) / @totalWorld ) * 100 AS VARCHAR)
                            FROM    dbo.vData2 d
                                    INNER JOIN ( SELECT RecipientID
                                                 FROM   dbo.RecipientGroup
                                                 WHERE  RegionID = 1
                                               ) c ON d.RecipientID = c.RecipientID
                                    INNER JOIN @CurrentDonor z ON d.DonorID = z.DonorID
                            WHERE   d.year IN ( @CurrentYear - 1, @CurrentYear )

                ELSE    -- force output of sub-total heading
                    INSERT  INTO pubOutput
                            (
                              XMLText
                            )
                            SELECT  @Heading2 + @Separator + '--'

        -- Region2-8
        /* similar syntax as Region1 above, for all Regions 2-8 */

        -- Total Regions
                INSERT  INTO pubOutput
                        (
                          XMLText
                        )
                        SELECT  @Heading2 + @Separator + CAST(@totalWorld AS VARCHAR)

    -- Breakdown by GeoZones 1-7
        -- GeoZone1
                INSERT  INTO pubOutput
                        (
                          XMLText
                        )
                        SELECT  'GeoZone1' + @Separator
                                + CAST(( ISNULL(SUM(Amount), 0) / @totalGeoZones ) * 100 AS VARCHAR)
                        FROM    dbo.vData2 d
                                INNER JOIN ( SELECT RecipientID
                                             FROM   dbo.GeoZones
                                             WHERE  GeoZoneID = 1
                                           ) m ON d.RecipientID = m.RecipientID
                                INNER JOIN @CurrentDonor z ON d.DonorID = z.DonorID
                        WHERE   d.year IN ( @CurrentYear - 1, @CurrentYear )

        -- GeoZones2-7
        /* similar syntax as GeoZone1 above for GeoZones 2-7 */

        -- Total GeoZones - currently hard-coded as 100, due to minor rounding errors
                INSERT  INTO pubOutput
                        (
                          XMLText
                        )
                        SELECT  @Heading3 + @Separator + '100'

                SET @LoopCounter = @LoopCounter + 1

            END -- year loop

    -- Get the next donor from the cursor
        FETCH NEXT FROM DonorCursor 
    INTO @DonorID, @DonorName

    END
 -- donorcursor

    -- Cleanup
CLOSE DonorCursor
DEALLOCATE DonorCursor

Many thanks in advance for any help you may be able to provide.

  • P.S. We are using SQL2008R2, but will soon be migrating to SQL2012. Commented Feb 27, 2014 at 11:57
  • I wouldn't remove cursors "just because"; I've had several projects where cursors actually improved performance and reduced complexity (the SQL query optimizer must be tamed!). I would first copy the database locally and run the query in Management Studio with the "Include Actual Execution Plan" option turned on, to help identify the bottleneck query. My bet is the vData2 view, and you might consider creating a simplified derivation of the view tuned just for this batch. Commented Feb 27, 2014 at 12:10
  • I am sorry, I don't see why you would need to loop over the DonorID. You have the top 15 donors already, so shouldn't all the looping just be a join to the top 15 and then a GROUP BY on donor ID, or am I missing something? Commented Feb 27, 2014 at 12:37
  • I would look at the view and the base table. 54 loops is not that many. What in the loop is taking the time? You can index a view. You have some static selects used in joins that you could run once and put into #temp tables, declaring a PK on each #temp table as it helps the join. Commented Feb 27, 2014 at 13:22
  • In cases like this, it would be great to have a script that creates the structures with some bogus data (i.e. a link to a .SQL or .BAK file). I think many people, like me, love a challenge, but the groundwork to set up and reproduce the issue from zero is too time-consuming. It could take you a couple of hours, but the win would probably save you dozens of them. Commented Feb 27, 2014 at 16:27
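
For reference, the set-based idea raised in the comments (a join plus GROUP BY instead of one pass per donor) might look roughly like the sketch below. It reuses the table and column names from the question's simplified code. The Periods CTE, the column aliases and the choice of denominator are assumptions made for illustration: the denominator here is the donor's total across every recipient in the view, whereas the original restricts it to the GroupID = 160 set, so the percentages would not match the report without adding that filter. It also groups by RecipientID rather than RecipientName.

```sql
-- Sketch only: names come from the question's simplified code;
-- the Periods CTE and the aliases are illustrative assumptions.
DECLARE @LatestYear SMALLINT;
SET @LatestYear = 2012;

WITH Periods AS (
    -- the three report years: latest-20, latest-10, latest
    SELECT PeriodYear = @LatestYear - 20
    UNION ALL SELECT @LatestYear - 10
    UNION ALL SELECT @LatestYear
),
Totals AS (
    -- one row per donor, period and recipient (two-year sum: year-1 and year)
    SELECT  d.DonorID,
            p.PeriodYear,
            d.RecipientID,
            Amount = SUM(d.Amount)
    FROM    dbo.vData2 d
            INNER JOIN Periods p
                ON d.[year] IN (p.PeriodYear - 1, p.PeriodYear)
    GROUP BY d.DonorID, p.PeriodYear, d.RecipientID
),
Ranked AS (
    -- donor/period total and rank computed in the same pass
    SELECT  t.DonorID, t.PeriodYear, t.RecipientID, t.Amount,
            DonorTotal = SUM(t.Amount) OVER (PARTITION BY t.DonorID, t.PeriodYear),
            rn = ROW_NUMBER() OVER (PARTITION BY t.DonorID, t.PeriodYear
                                    ORDER BY t.Amount DESC)
    FROM    Totals t
)
SELECT  r.DonorID,
        r.PeriodYear,
        lr.RecipientName,
        Percentage = 100.0 * r.Amount / NULLIF(r.DonorTotal, 0)
FROM    Ranked r
        INNER JOIN dbo.LookupRecipient lr ON lr.RecipientID = r.RecipientID
WHERE   r.rn <= 15
ORDER BY r.DonorID, r.PeriodYear, r.rn;
```

Both ROW_NUMBER() and SUM() OVER (PARTITION BY ...) are available in SQL Server 2008 R2, so this shape does not depend on the 2012 migration.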

1 Answer

Cursors are generally best avoided, and a WHILE loop could replace this one. However, given the complexity of the query, keep the cursor for now.

To improve performance another way, check how many rows each of the queries below returns:

  1. SELECT RecipientID FROM dbo.RecipientGroup WHERE GroupID=160
  2. SELECT RecipientID FROM dbo.GeoZones WHERE GeoZoneID=100
  3. SELECT RecipientID FROM dbo.RecipientGroup WHERE RegionID=1

I suggest creating three temp tables from the queries above "outside" the cursor and using them inside it.
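
A minimal sketch of that suggestion follows. The temp table names and the INT type for RecipientID are assumptions; the filters and column names are taken from the question's code.

```sql
-- Run once, before the cursor is opened.
-- Assumptions: RecipientID is an INT and is never NULL; the #table names are made up.
CREATE TABLE #WorldRecipients   (RecipientID INT NOT NULL PRIMARY KEY);
CREATE TABLE #GeoZoneRecipients (RecipientID INT NOT NULL PRIMARY KEY);
CREATE TABLE #Region1Recipients (RecipientID INT NOT NULL PRIMARY KEY);

-- DISTINCT guards the primary key against repeated RecipientIDs
INSERT INTO #WorldRecipients (RecipientID)
SELECT DISTINCT RecipientID FROM dbo.RecipientGroup WHERE GroupID = 160;

INSERT INTO #GeoZoneRecipients (RecipientID)
SELECT DISTINCT RecipientID FROM dbo.GeoZones WHERE GeoZoneID = 100;

INSERT INTO #Region1Recipients (RecipientID)
SELECT DISTINCT RecipientID FROM dbo.RecipientGroup WHERE RegionID = 1;

-- Inside the loop, each derived table is then replaced by its temp table, e.g.:
-- SELECT @totalWorld = SUM(d.Amount)
-- FROM   dbo.vData2 d
--        INNER JOIN #WorldRecipients c ON d.RecipientID = c.RecipientID
--        INNER JOIN @CurrentDonor z    ON d.DonorID = z.DonorID
-- WHERE  d.[year] IN (@CurrentYear - 1, @CurrentYear);
```

The primary key gives the optimizer an index to join on, and the lookup rows are read from their base tables once rather than on every pass through the loop.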

Hope this helps!

4 Comments

Thanks. This is something I could try. However, there are more than 3 different groups of recipients: we have 18! In that case, would there be any improvement in using 18 different temp tables?
Depending on your cursor, you can use FAST_FORWARD to improve its performance.
@HermantPune,@Blam Thanks for your suggestions. I took your advice and removed all the static selects, placing them in table variables before the loop. I also went one step further and inserted the data from the view into a table variable before the loop. All table variables have primary keys. The CURSOR and WHILE loops remain for now. This now gives remarkable performance: the whole thing executes in 20-40 SECONDS!
@jean Thanks. I have also added FAST_FORWARD to the CURSOR declaration.
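
The changes described in these comments might be sketched as follows. This is an illustration under assumptions, not the asker's actual code: the table variable names, column types and the GroupKey labels are hypothetical, and the region and zone ranges (1-8 and 1-7) are taken from the comments in the question's code. A single keyed lookup table is shown as one possible alternative to 18 separate tables.

```sql
DECLARE @LatestYear SMALLINT;
SET @LatestYear = 2012;

-- 1. Copy only the six years the report needs out of the view, once.
--    Assumes DonorID and RecipientID are never NULL in the view.
DECLARE @Data TABLE
(
    DonorID     SMALLINT       NOT NULL,
    RecipientID INT            NOT NULL,
    [Year]      SMALLINT       NOT NULL,
    Amount      NUMERIC(18, 2) NULL,
    PRIMARY KEY (DonorID, [Year], RecipientID)
);

INSERT INTO @Data (DonorID, RecipientID, [Year], Amount)
SELECT  d.DonorID, d.RecipientID, d.[year], SUM(d.Amount)
FROM    dbo.vData2 d
WHERE   d.[year] IN (@LatestYear - 21, @LatestYear - 20,
                     @LatestYear - 11, @LatestYear - 10,
                     @LatestYear - 1,  @LatestYear)
GROUP BY d.DonorID, d.RecipientID, d.[year];

-- 2. One lookup table for all recipient groups instead of one table per group.
DECLARE @GroupMember TABLE
(
    GroupKey    VARCHAR(20) NOT NULL,   -- hypothetical label, e.g. 'Region1', 'GeoZone1'
    RecipientID INT         NOT NULL,
    PRIMARY KEY (GroupKey, RecipientID)
);

INSERT INTO @GroupMember (GroupKey, RecipientID)
SELECT 'Region' + CAST(RegionID AS VARCHAR(10)), RecipientID
FROM   dbo.RecipientGroup
WHERE  RegionID BETWEEN 1 AND 8
UNION   -- UNION also removes duplicate rows, protecting the primary key
SELECT 'GeoZone' + CAST(GeoZoneID AS VARCHAR(10)), RecipientID
FROM   dbo.GeoZones
WHERE  GeoZoneID BETWEEN 1 AND 7;

-- 3. Read-only, forward-only cursor over the donors.
DECLARE @DonorID SMALLINT, @DonorName VARCHAR(100);

DECLARE DonorCursor CURSOR LOCAL FAST_FORWARD
FOR
    SELECT DonorID, DonorName
    FROM   dbo.LookupDonor
    ORDER BY DonorName;

OPEN DonorCursor;
FETCH NEXT FROM DonorCursor INTO @DonorID, @DonorName;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- per-donor work goes here, reading from @Data and @GroupMember
    -- instead of dbo.vData2 and the repeated derived tables
    FETCH NEXT FROM DonorCursor INTO @DonorID, @DonorName;
END
CLOSE DonorCursor;
DEALLOCATE DonorCursor;
```

With a keyed lookup like @GroupMember, the per-region and per-zone totals for a donor and period could come from one GROUP BY GroupKey query rather than a separate statement per group.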
