2

I have a table TableA whose PK is AId. I have another table, TableB, whose only column is BId. The BId values are a subset of the AId values. We cannot use an FK (for various reasons).

How can I delete the TOP 1000 rows in TableB together with their related rows in TableA (where the same ID exists in TableA)?

One solution is to SELECT TOP (1000) from TableB into a temp table, use it to delete the data in TableA, and then delete the rows in TableB.

But there should be a more efficient way to do this.

I also tried:

DELETE TOP (1000) FROM tableA WHERE AId IN (SELECT BId FROM TableB)

But how can I make sure that the rows deleted correspond to the top 1000 BIds?

Thanks

2
  • I reckon two delete steps; I don't know any other way. So DELETE FROM TABLEA WHERE KEY IN (SELECT TOP 1000 KEY FROM TABLEB) ... DELETE FROM TABLEB WHERE KEY IN (SELECT TOP 1000 KEY FROM TABLEB) ... provided that TableB is untouched in the next minute or two. Otherwise #temptables should be the way to go. Commented Jun 6, 2017 at 21:40
  • Without an order by you can't guarantee the TOP returns the same set of data. Commented Jun 7, 2017 at 0:42

3 Answers

3

One way to go would be to create a table variable, store the 1000 IDs in it, and reference that in both deletes. Of course, I have no idea whether they're integers or not, so change the data type if required. There are other ways, as I'm sure others will point out, but this is my standard approach when I need consistency.

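-- Holds the batch of IDs so both deletes work on the same set.
DECLARE @keyValues TABLE (keyValue INT);

-- Capture up to 1000 IDs that exist in both tables.
INSERT INTO @keyValues (keyValue)
SELECT TOP (1000) AID FROM TABLEA WHERE AID IN (SELECT BID FROM TABLEB);

DELETE FROM TABLEB WHERE BID IN (SELECT keyValue FROM @keyValues);
DELETE FROM TABLEA WHERE AID IN (SELECT keyValue FROM @keyValues);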

2 Comments

Thanks. I'm very curious whether there is a more efficient way to do this. Obviously, whether it is a temp table or a table variable, it will take some 'extra' memory/CPU.
You need to store the intermediate list of IDs somewhere. The only other place to store it is in the source table itself (by marking the rows directly).
1

In general, one DELETE statement can change data only in one table. So, in general, you need two separate DELETE statements to delete rows from two tables.

Technically, you don't have to write the BId values to a temp table; you can just read them twice from TableB.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

DELETE FROM TableA
WHERE TableA.AId IN
(
    SELECT TOP(1000) TableB.BId
    FROM TableB
    ORDER BY TableB.BId
)
;

DELETE FROM TableB
WHERE TableB.BId IN
(
    SELECT TOP(1000) TableB.BId
    FROM TableB
    ORDER BY TableB.BId
)
;

COMMIT TRANSACTION;

You need to make sure that the set of 1000 BId are the same in both queries, which means that you have to use ORDER BY with TOP. Also, you should do something to prevent changes to TableB by another process between two DELETE statements.

In the end, taking care of concurrency issues (for example, by setting the transaction isolation level to serializable) may incur more overhead than writing these 1000 Ids into a temp table.
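
For comparison, here is a minimal sketch of the temp-table variant mentioned above. It assumes SQL Server; the temp table name #BatchIds is hypothetical, and its column type is simply inherited from TableB.BId. Because the IDs are captured once, both deletes work on the same set without needing the serializable isolation level.

```sql
BEGIN TRANSACTION;

-- Capture the batch once; ORDER BY makes TOP deterministic.
SELECT TOP (1000) BId
INTO #BatchIds               -- hypothetical temp table name
FROM TableB
ORDER BY BId;

-- Both deletes read the same captured set of IDs.
DELETE FROM TableA
WHERE AId IN (SELECT BId FROM #BatchIds);

DELETE FROM TableB
WHERE BId IN (SELECT BId FROM #BatchIds);

COMMIT TRANSACTION;

DROP TABLE #BatchIds;
```

Whether this is faster than reading TableB twice under SERIALIZABLE would need to be measured on the actual data.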


On the other hand, you said "BId is actually is a subset of AId". So, TableA is master, TableB is detail.

Technically, you can define a foreign key with the ON DELETE CASCADE option. It is not clear to me from the question why you can't use a foreign key constraint.

ALTER TABLE TableB WITH CHECK ADD CONSTRAINT [FK_TableB_TableA] FOREIGN KEY(BId)
REFERENCES TableA(AId) ON DELETE CASCADE
GO

ALTER TABLE TableB CHECK CONSTRAINT [FK_TableB_TableA]
GO

Then you will delete only from TableA and child rows from TableB would be deleted by the foreign key constraint. Whether that would be more efficient than two explicit DELETEs is hard to tell - you need to test yourself on your hardware.

In this case one explicit DELETE is enough:

DELETE FROM TableA
WHERE TableA.AId IN
(
    SELECT TOP(1000) TableB.BId
    FROM TableB
    ORDER BY TableB.BId
)
;

Because it is a single statement, this takes care of the concurrency issues as well.

7 Comments

You can probably wrap that in some kind of transaction + isolation level to make sure it's the same set.
@Nick.McDermaid, yes, two separate deletes should be in a transaction.
Will reading them twice be slower than saving the TOP 1000 in a table variable and reading them from the table variable?
The reason why we cannot use an FK is that B is a subset of A, but sometimes another process deletes rows from A first. We don't really have control over it (in that case, B is no longer a subset of A).
@urlreader, writing is usually slower than reading. With temp table you still read twice and write once. Without temp table you only read twice. Second read is most likely from cache anyway. But, actual performance should be tested on your hardware. There are so many things that can affect it.
0

Other possible ways:

1) DELETE TOP (1000) from TableB with Output clause (into temp table), and then Delete from TableA where AID is in that output table.

or

2) Implement DELETE trigger for TableB, so when any rows are deleted from TableB, they are also deleted from TableA automatically.
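
Option 1 could be sketched roughly as follows. This assumes SQL Server and that BId is an INT (change the type if needed); the names @DeletedIds and Batch are hypothetical. Since DELETE TOP does not accept ORDER BY, the sketch deletes through a CTE so that the batch is a well-defined "top 1000".

```sql
-- Assumption: BId is INT; adjust the type to match TableB.
DECLARE @DeletedIds TABLE (BId INT);

BEGIN TRANSACTION;

-- Delete an ordered batch from TableB and capture the deleted IDs.
WITH Batch AS
(
    SELECT TOP (1000) BId
    FROM TableB
    ORDER BY BId
)
DELETE FROM Batch
OUTPUT deleted.BId INTO @DeletedIds (BId);

-- Remove the matching rows from TableA, if they still exist.
DELETE FROM TableA
WHERE AId IN (SELECT BId FROM @DeletedIds);

COMMIT TRANSACTION;
```

Option 2 might look like the sketch below; the trigger name trg_TableB_Delete is hypothetical. Note that the trigger would fire for every delete on TableB, not only for this batch job.

```sql
CREATE TRIGGER trg_TableB_Delete   -- hypothetical name
ON TableB
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- "deleted" is the pseudo-table holding the rows just removed from TableB.
    DELETE a
    FROM TableA AS a
    INNER JOIN deleted AS d
        ON d.BId = a.AId;
END;
```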
