
I just need advice on how to speed up my code. I'm supposed to compute, on a yearly basis, how students' grades change from one year to the next, expressed as percentages. Keep in mind that I have around 100k-150k records per year.

Basically, the end result looks like this: as of 20150131, 2% of the students who had grade A finished with grade B, and so on.

  Grade    Date     B    C
   A      20150131  2%   3%
   B      20150131  88%  85%
   C      20150131  10%  12%
   A      20140131  2%   3%
   B      20140131  88%  85%
   C      20140131  10%  12%
   A      20130131  2%   3%
   B      20130131  88%  85%
   C      20130131  10%  12% 

The input looks like this: just the student and their grade on a given date.

 Student    Date    Grade
   1      20150131   A
   2      20150131   C
   3      20150131   A
   1      20140131   B
   2      20140131   B
   3      20140131   A

My code looks like this:

WHILE @StartDateInt > @PeriodSpan
BEGIN
while @y <= @CategoriesCount
 BEGIN

    set @CurrentGr = (Select Grade from #Categories where RowID = @y)
    set @CurrentGrCount = (Select COUNT(Students) from #TempTable where Period = @PeriodSpan and Grade = @CurrentGr)       
    set @DefaultCurrentGr = (Select Grade from #Categories where RowID = @y)

    insert into Grade_MTRX (Student, Period, Grades_B, SessionID)            
    select temp1.Grade, @PeriodNextSpan as Period, COUNT(Grades_B)/@CurrentGrCount as 'Grades_B', @SessionID               
    from #TempTable temp1
    join #TempTable temp2 on temp1.Student = temp2.Student and temp1.Period + 10000 = temp2.Period
    where temp1.Grade = @CurrentGr and temp2.Grade = 'C' and temp1.Period = @PeriodSpan       
    group by temp1.Grade, temp1.Period   

    update Grade_MTRX set Grades_C = (
    select COUNT(Grades_C)/@CurrentGrCount
    from #TempTable
    where Grade = 'C' and Period = @PeriodNextSpan)
    where Category = @CurrentGr and Period = @PeriodNextSpan

 end
end

I understand that SQL Server doesn't like WHILE loops and that they kill its performance. But I'm using a WHILE loop inside a WHILE loop: going over the years and, for each grade, counting students. First I insert one row for the current grade, and then I keep updating that row until it's fully populated.

I do understand this is really bad, but that's why I'm here: to learn a better way to accomplish this.

Thank you in advance!

  • Share the input and the desired output and you will get an answer without a loop. Commented Oct 7, 2015 at 7:37
  • You should learn about GROUP BY and subqueries. Commented Oct 7, 2015 at 7:40
  • Will you only have A, B and C grades? Commented Oct 7, 2015 at 8:19
  • Yes, the exercise keeps it simple: just A, B and C. Commented Oct 7, 2015 at 8:31
  • SQL Server doesn't like or dislike WHILE loops. Where's the declaration of #TempTable? Does it have indexes? How much data are you scanning and aggregating? Where are the SET commands that update the variables in the WHILE loop conditions so the loops can exit? Commented Oct 7, 2015 at 9:15

1 Answer


150,000 records per year is really nothing. Let's say you had this Grade table:

CREATE TABLE Grade(
 student_id INT,
 date INT,
 grade CHAR);

With this info:

student_id date grade
    1      2013   A
    1      2014   A
    1      2015   B
    2      2013   B
    2      2014   A
    2      2015   C
    3      2013   C
    3      2014   A
    3      2015   B

Then if you just run a query like:

SELECT this_year.date,
       last_year.grade AS last_year,
       this_year.grade AS this_year,
       COUNT(*) AS total,
       -- share of all students who have a grade recorded for this year
       (100.0 * COUNT(*)) / (SELECT COUNT(*) FROM Grade WHERE date = this_year.date) AS "percent"
 FROM Grade AS this_year
  INNER JOIN Grade AS last_year ON this_year.date = last_year.date + 1
    AND this_year.student_id = last_year.student_id
 GROUP BY this_year.date, this_year.grade, last_year.grade
 ORDER BY this_year.date, last_year.grade, this_year.grade;

you end up with these results:

 date | last_year | this_year | total |       percent       
------+-----------+-----------+-------+---------------------
 2014 | A         | A         |     1 | 33.3333333333333333
 2014 | B         | A         |     1 | 33.3333333333333333
 2014 | C         | A         |     1 | 33.3333333333333333
 2015 | A         | B         |     2 | 66.6666666666666667
 2015 | A         | C         |     1 | 33.3333333333333333
(5 rows)
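
If the goal is the matrix layout from the question (one row per starting grade and year, one column per finishing grade), conditional aggregation is one way to get there in a single pass. The sketch below is only an illustration. It reuses the Grade table and sample rows from above, hard-codes the grades A, B and C, and assumes each percentage is relative to the students who held the row's grade the previous year, which is what the loop in the question divides by. The pct_a/pct_b/pct_c column names and the Python sqlite3 wrapper are my own additions, there only to make the example runnable. With the question's YYYYMMDD integers the join would use + 10000 instead of + 1.

```python
import sqlite3

# Sample rows from the Grade table above: (student_id, date, grade).
SAMPLE_ROWS = [
    (1, 2013, "A"), (1, 2014, "A"), (1, 2015, "B"),
    (2, 2013, "B"), (2, 2014, "A"), (2, 2015, "C"),
    (3, 2013, "C"), (3, 2014, "A"), (3, 2015, "B"),
]

# One row per (year, last year's grade); one percentage column per
# finishing grade. COUNT(*) is the number of students who held the
# row's grade last year and also have a grade this year.
MATRIX_QUERY = """
SELECT last_year.grade AS grade,
       this_year.date  AS period,
       100.0 * SUM(CASE WHEN this_year.grade = 'A' THEN 1 ELSE 0 END) / COUNT(*) AS pct_a,
       100.0 * SUM(CASE WHEN this_year.grade = 'B' THEN 1 ELSE 0 END) / COUNT(*) AS pct_b,
       100.0 * SUM(CASE WHEN this_year.grade = 'C' THEN 1 ELSE 0 END) / COUNT(*) AS pct_c
FROM Grade AS this_year
INNER JOIN Grade AS last_year
    ON this_year.date = last_year.date + 1
   AND this_year.student_id = last_year.student_id
GROUP BY this_year.date, last_year.grade
ORDER BY this_year.date, last_year.grade;
"""


def grade_matrix(rows):
    """Load rows into an in-memory Grade table and return the matrix rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Grade (student_id INT, date INT, grade CHAR)")
    conn.executemany("INSERT INTO Grade VALUES (?, ?, ?)", rows)
    result = conn.execute(MATRIX_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in grade_matrix(SAMPLE_ROWS):
        print(row)
```

On the sample data, the 2015 row for grade A comes out as roughly 0 / 66.7 / 33.3, because all three students had an A in 2014 and two of them moved to B. The whole table is produced by one set-based statement, so no loop over years or grades is needed.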

A few million rows, or even tens of millions, shouldn't be any real trouble for this kind of query. If you need it to be faster still, look into window functions, which are available in Postgres, Oracle, and SQL Server.
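
As a rough sketch of the window-function idea: the query below counts each (year, last year's grade, this year's grade) transition once, then uses SUM(...) OVER (PARTITION BY ...) to get the denominator instead of a correlated subquery. It assumes the Grade table and sample rows from above. It also assumes the percentage should be relative to the students who held the same grade last year, so the numbers differ from the result table above, which divides by all students in the year. The transitions, period and pct names and the Python sqlite3 wrapper are my own additions to make it runnable; the SQL itself sticks to constructs that SQL Server also supports.

```python
import sqlite3

# Sample rows from the Grade table above: (student_id, date, grade).
ROWS = [
    (1, 2013, "A"), (1, 2014, "A"), (1, 2015, "B"),
    (2, 2013, "B"), (2, 2014, "A"), (2, 2015, "C"),
    (3, 2013, "C"), (3, 2014, "A"), (3, 2015, "B"),
]

# Step 1 (CTE): count students per year-over-year grade transition.
# Step 2: divide each count by the total for the same year and
#         starting grade, computed with a window function.
WINDOW_QUERY = """
WITH transitions AS (
    SELECT this_year.date  AS period,
           last_year.grade AS last_grade,
           this_year.grade AS this_grade,
           COUNT(*)        AS total
    FROM Grade AS this_year
    INNER JOIN Grade AS last_year
        ON this_year.date = last_year.date + 1
       AND this_year.student_id = last_year.student_id
    GROUP BY this_year.date, last_year.grade, this_year.grade
)
SELECT period, last_grade, this_grade, total,
       100.0 * total / SUM(total) OVER (PARTITION BY period, last_grade) AS pct
FROM transitions
ORDER BY period, last_grade, this_grade;
"""


def transition_percentages(rows):
    """Load rows into an in-memory Grade table and run the window query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Grade (student_id INT, date INT, grade CHAR)")
    conn.executemany("INSERT INTO Grade VALUES (?, ?, ?)", rows)
    result = conn.execute(WINDOW_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in transition_percentages(ROWS):
        print(row)
```

Whether this beats the subquery version depends on the data and the indexes, so it is worth comparing both plans on real data. An index on (student_id, date) is a reasonable first thing to try for the self-join.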
