I have an Azure SQL database about 9GB in size. It serves a web app that handles about 135K requests per hour. Most of the data is transient: it lives in the database for anywhere from a few minutes to five days and is then deleted. About 10GB moves through the database per day.
I tried to run a delete query on one table to remove about 250,000 of its 350,000 records. About 10 percent of the records have one or two nvarchar(max) values large enough to be stored in LOB storage.
Over the weekend, I tried to delete them all at once. The query ran for four hours before I canceled it, and then it spent another eight hours rolling back. Bad move; I really wasn't expecting it to be that bad.
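A sketch of what the all-at-once attempt amounts to, not the exact statement that ran. It assumes the table is dbo.tblJobs and that the cutoffs are the same ones used in the batched version of the job.

```sql
-- Sketch only: assumes the target table is dbo.tblJobs and the same
-- cutoff conditions as the batched version of the maintenance job.
-- Everything is deleted in one statement, so it is a single transaction
-- and canceling it rolls back all of the work done so far.
delete from dbo.tblJobs
where (datediff(day, SchedDate, getdate()) > 60)
   or (datediff(day, ModifiedDate, getdate()) > 3 and ToBeRemoved = 1)
```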
Then I tried another approach: deleting in batches of 1,000 rows, each batch in its own transaction. This batch job ran at night, when the web app was handling about 100K requests per hour. The Id column of tblJobs is a uniqueidentifier and is the primary key.

    -- Declarations are not shown in the original snippet; the types here are assumed from usage.
    declare @tableIds table (id uniqueidentifier)
    declare @tableIdsTmp table (id uniqueidentifier)
    declare @maintLogStr nvarchar(max), @maintLogId uniqueidentifier
    declare @dowaits bit = 1

    -- Collect the Ids of all obsolete rows up front.
    insert @tableIds
    select Id from dbo.tblJobs with (nolock)
    where (datediff(day, SchedDate, getdate()) > 60)
       or (datediff(day, ModifiedDate, getdate()) > 3 and ToBeRemoved = 1)

    set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records count @tableIds: ' + convert(nvarchar(12), (select count(1) from @tableIds))
    insert dbo.admin_MaintenanceLog(LogEntry) values(@maintLogStr)

    -- One log row, updated in place with the remaining count after each batch.
    set @maintLogId = newid()
    set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records beginning loop...'
    insert dbo.admin_MaintenanceLog(Id, LogEntry) values(@maintLogId, @maintLogStr)

    while exists(select * from @tableIds)
    begin
        delete @tableIdsTmp
        begin transaction
            -- Take the next batch of up to 1000 Ids.
            insert @tableIdsTmp select top 1000 id from @tableIds
            -- Delete the batch from tblJobs, then remove those Ids from the work list.
            delete p from @tableIdsTmp i join dbo.tblJobs p on i.id = p.Id
            delete x from @tableIdsTmp t join @tableIds x on t.id = x.id
            set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records remaining count @tableIds: ' + convert(nvarchar(12), (select count(1) from @tableIds))
            update dbo.admin_MaintenanceLog set LogEntry = @maintLogStr, RecordCreated = getdate() where Id = @maintLogId
        commit transaction
        -- Optional one-second pause between batches.
        if @dowaits = 1 waitfor delay '00:00:01.000'
    end

SchedDate, ModifiedDate and ToBeRemoved are not indexed, so gathering the Ids into @tableIds took about 3 minutes, which is not bad.
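A side note on the filter itself: because both date columns are wrapped in datediff, an index on them could not be used for a seek even if one were added. Below is a minimal sketch of the same cutoffs with the date arithmetic moved off the columns. It assumes SchedDate and ModifiedDate are datetime columns; the variable @today is introduced here for illustration and is not part of the original job.

```sql
-- Sketch only: assumes SchedDate and ModifiedDate are datetime columns.
-- datediff(day, col, getdate()) > 60 counts midnight boundaries, so it is
-- true exactly when col falls before midnight of the day 60 days ago.
declare @today date = cast(getdate() as date)

select Id
from dbo.tblJobs
where SchedDate < dateadd(day, -60, @today)
   or (ModifiedDate < dateadd(day, -3, @today) and ToBeRemoved = 1)
```

This only affects the step that gathers the Ids; it does not change the delete loop.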
According to the log entries, it then took 1 hour 55 minutes to delete 11,000 records from tblJobs (roughly 96 records per minute, or about 10 minutes per batch of 1,000), at which point the job, which was called from a remote machine, timed out.
Why is it taking so long? What can I do to speed it up?