improve performance of stored procedure in SQL Developer - nested loop

Question

I've got a program that assigns employees with a "preferred vehicle" and at the end I make sure that no employees have the same vehicle.

The first loop gets a list of all preferred vehicles that occur more than once, and the nested loop goes through every employee that has that vehicle. In the end, only one employee will have it set as his/her preferred vehicle

--Loop through to see if a pref_veh occurs more than once 
FOR every_duplicate_veh IN 
(SELECT PREFERRED_VEH FROM FLEET_USER GROUP BY PREFERRED_VEH HAVING COUNT (PREFERRED_VEH) > 1)
LOOP      
  max_count:=0; 

  --Loop through all the employees with the duplicate vehicle
  FOR every_employee IN (SELECT EMPLOYEE_ID FROM FLEET_USER WHERE PREFERRED_VEH=every_duplicate_veh.PREFERRED_VEH)
  LOOP

    --Find which employee is assigned the vehicle the most 
    SELECT COUNT(ASSIGN_VEH_ID) INTO assigned_veh_count FROM FLEET_TRANSACTION 
    WHERE ASSIGN_VEH_ID=every_duplicate_veh.PREFERRED_VEH AND DRIVER_EMP_ID=every_employee.EMPLOYEE_ID
    AND SYSDATE - 30 <= RESERV_START_DT;   

    IF assigned_veh_count>max_count THEN
      max_count:=assigned_veh_count; 
      preferred_employee_id:=every_employee.EMPLOYEE_ID; 
    END IF; 

    --Reset the employee's preferred vehicle to NULL
    UPDATE FLEET_USER SET PREFERRED_VEH = NULL WHERE EMPLOYEE_ID = every_employee.EMPLOYEE_ID;
    INSERT INTO FLEET_PREF_VEH_LOG VALUES (SYSDATE, every_employee.EMPLOYEE_ID, NULL);

  END LOOP;

  --One employee will get the preferred vehicle 
  UPDATE FLEET_USER SET PREFERRED_VEH = every_duplicate_veh.PREFERRED_VEH  WHERE EMPLOYEE_ID = preferred_employee_id;
  INSERT INTO FLEET_PREF_VEH_LOG VALUES (SYSDATE, preferred_employee_id, every_duplicate_veh.PREFERRED_VEH);
  COMMIT;

END LOOP;

FLEET_USER is a table with thousands of rows...My goal is to eliminate the nested loop... can I do that? I'm still pretty new to sql so I would really appreciate any advice/point out anything I've missed

Wouldn't it be better to set the value correctly to start with, without duplicates, rather than try to clean up later? Within this section you can be updating the same row multiple times at the moment, I think, as an aside; you blank all the matching rows, then put one back to how it was to start with. It looks you want the preferred vehicle to be set to the most-used in the last 30 days, and if you were the second-highest user of that vehicle you end up with no preference, right? — Alex Poole
– Alex Poole, Commented Jan 26, 2015 at 19:39
May want to check out this site: codereview.stackexchange.com. You might have better luck. — macoms01
– macoms01, Commented Jan 26, 2015 at 19:46

Allan · Accepted Answer · 2015-04-09 15:17:56Z

First off, the insert into FLEET_PREF_VEH_LOG strikes me as being better served as a trigger, rather than a piece of a procedure. With it offloaded that way, I think you can reduce this to a single SQL statement.

I don't have your objects or data, so this is totally untested, but I think it's a close approximation to your existing code and should result in a performance improvement.

MERGE INTO fleet_user fu_m
USING      (SELECT employee_id,
                   cnt,
                   MAX (cnt) OVER (PARTITION BY fm_s.preferred_veh) 
                       AS max_cnt
            FROM   (SELECT   fu.employee_id, fu.preferred_veh, COUNT (*) AS cnt
                    FROM     fleet_user fu
                             JOIN fleet_transaction ft
                                ON     ft.assign_veh_id = fu.preferred_veh
                                   AND ft.driver_emp_id = fu.employee_id
                    WHERE        EXISTS
                                    (SELECT   preferred_veh
                                     FROM     fleet_user fu2
                                     WHERE    fu2.preferred_veh =
                                                 fu.preferred_veh
                                     GROUP BY preferred_veh
                                     HAVING   COUNT (preferred_veh) > 1)
                             AND SYSDATE - 30 <= ft.reserv_start_dt
                    GROUP BY fu.employee_id, fu.preferred_veh)) fm_s
ON         (fm_m.employee_id = fm_s.employee_id)
WHEN MATCHED THEN
   UPDATE SET preferred_veh = NULL
      WHERE      max_cnt <> cnt;

Another advantage to this method is should prevent unnecessary records from being written to FLEET_PREF_VEH_LOG. With your current code (by my reading), the preferred user's preference is getting cleared and then set back to its original value. This results in two FLEET_PREF_VEH_LOG records when nothing has actually changed.

As pointed out in the comments, this answer does not cover tied counts. The code in the question does, but it's non-deterministic: in the case of a tie whichever employee is processed last wins. Due to the vagaries of database and the lack of an ORDER BY, this is not guaranteed to be the same every time.

I'd prefer to add a deterministic means to break ties, even if it's arbitrary:

MERGE INTO fleet_user fu_m
USING      (SELECT employee_id,
                   cnt,
                   MAX (cnt) OVER (PARTITION BY fm_s.preferred_veh) 
                       AS max_cnt,
                   ROW_NUMBER () OVER (PARTITION BY cnt ORDER BY employee_id) 
                       AS tie_order
            FROM   (SELECT   fu.employee_id, fu.preferred_veh, COUNT (*) AS cnt
                    FROM     fleet_user fu
                             JOIN fleet_transaction ft
                                ON     ft.assign_veh_id = fu.preferred_veh
                                   AND ft.driver_emp_id = fu.employee_id
                    WHERE        EXISTS
                                    (SELECT   preferred_veh
                                     FROM     fleet_user fu2
                                     WHERE    fu2.preferred_veh =
                                                 fu.preferred_veh
                                     GROUP BY preferred_veh
                                     HAVING   COUNT (preferred_veh) > 1)
                             AND SYSDATE - 30 <= ft.reserv_start_dt
                    GROUP BY fu.employee_id, fu.preferred_veh)) fm_s
ON         (fm_m.employee_id = fm_s.employee_id)
WHEN MATCHED THEN
   UPDATE SET preferred_veh = NULL
      WHERE      max_cnt <> cnt or tie_order <> 1;

ROW_NUMBER () OVER (PARTITION BY cnt order by employee_id) will give each employee with the same count a sequential number, from lowest employee_id to highest. When there's no tie, all of the values will be 1, so there query will work exactly as before. But when there is a tie, every member of the tie except the one with the lowest employee_id will be set to null.

Obviously, you could change the ORDER BY to whatever arbitrary criteria you choose.

I have changed both above queries to satisfy the latest issue raised in the comments (I think). By only getting the largest count by vehicle number, we ensure that we are processing each vehicle separately.

this is great, but it does not accommodate the scenario if there is a tie between two or more employees.
What if there is more than one duplicate vehicle? For example, if employees A and B have vehicle ABC and the max count for ABC is 18, but employees C and D and vehicle DEF, and the max count for this vehicle is 15... this code will only show that the max count is 18. And now C and D both won't have DEF because 15!=18.
Got it - instead of partition by one, partition by preferred_veh. This gave me the right numbers. 18 for vehicle ABC and 15 for vehicle DEF.

Collectives™ on Stack Overflow

improve performance of stored procedure in SQL Developer - nested loop

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related