1

I have this problem that's been killing me for a couple days now.

We have one table of all processed orders and another for all orders that come in.

We need to cross-reference the orders in the new table, which is continually updated, against the orders already completed in the primary table, so that we don't complete the same order multiple times.

After we get a batch of new orders, this is the query that I currently run in an attempt to cross-reference it with the table of completed orders:

$sql = "DELETE
FROM
    `orders_new`
WHERE
    `order` IN (
        SELECT DISTINCT
            `order`
        FROM
            `orders_all`
    )
AND `name` IN (
    SELECT DISTINCT
        `name`
    FROM
        `orders_all`
)
AND `jurisdiction` IN (
    SELECT DISTINCT
        `jurisdiction`
    FROM
        `orders_all`
)";

As you can probably tell, I want to delete rows from the "orders_new" table where a row with the same order, name, and jurisdiction already exists in the "orders_all" table.

Is this the right way to handle this sort of query?

2
  • What is in the order field?? Is it an ID or a product name or product code? Commented Dec 16, 2016 at 19:07
  • It is an order ID, which is provided by customer/client and we cannot change. Commented Dec 16, 2016 at 19:14

3 Answers

1

Well, the right way depends on many things. First, though, I do not like your division into two tables. I would instead introduce a column identifying the order's state, referencing a table of possible states: "new", "in process", and "completed". That way each order is stored as exactly one record, as it should be.

Your query might be OK, but you should check its performance. Take a look at: https://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join (not exactly your case, but very similar).

Another thing: why do you use DISTINCT? That would imply that "order" is not a unique identifier.

Based on your edit, you identify an order by the composite key "order", "name", "jurisdiction". Is this really the key, the whole key, and nothing but the key, so help you Codd? If not, you could delete a bunch of records you meant to keep. But even if it is, your query would delete every order whose order, name, and jurisdiction can each be found in the completed-orders table IN DIFFERENT RECORDS. So your query is wrong.
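To see the over-deletion concretely, here is a minimal sketch that runs the question's three-IN DELETE against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL purely so the example is self-contained, and the sample rows and the helper name `run_original_delete` are made up for illustration.

```python
import sqlite3

def run_original_delete(new_rows, all_rows):
    """Run the question's three-IN DELETE and return what is left in orders_new."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE orders_all ("order" TEXT, name TEXT, jurisdiction TEXT)')
    con.execute('CREATE TABLE orders_new ("order" TEXT, name TEXT, jurisdiction TEXT)')
    con.executemany("INSERT INTO orders_all VALUES (?, ?, ?)", all_rows)
    con.executemany("INSERT INTO orders_new VALUES (?, ?, ?)", new_rows)
    # Each IN clause is checked independently of the other two.
    con.execute('''
        DELETE FROM orders_new
        WHERE "order" IN (SELECT "order" FROM orders_all)
          AND name IN (SELECT name FROM orders_all)
          AND jurisdiction IN (SELECT jurisdiction FROM orders_all)
    ''')
    remaining = con.execute(
        'SELECT "order", name, jurisdiction FROM orders_new'
    ).fetchall()
    con.close()
    return remaining

# Hypothetical data: no completed order equals the incoming one as a whole,
# but each of its three values appears somewhere in orders_all.
completed = [("A1", "Smith", "TX"), ("B2", "Jones", "CA"), ("C3", "Lee", "NY")]
incoming = [("A1", "Jones", "NY")]

remaining = run_original_delete(incoming, completed)
print(remaining)  # prints [] : the new, never-completed order was deleted
```

The incoming order is removed even though it was never completed, because "A1", "Jones", and "NY" are each matched by a different row of orders_all.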

That said, a variant of your query might be

DELETE `orders_new`
FROM
    `orders_new`
    INNER JOIN
    `orders_all` ON `orders_all`.`order` = `orders_new`.`order`
                    AND `orders_all`.`name` = `orders_new`.`name`
                    AND `orders_all`.`jurisdiction` = `orders_new`.`jurisdiction`

But, the real problem is your ER model.
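The single-table design suggested at the top of this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database via Python's sqlite3 instead of MySQL, the table names `orders` and `order_states`, the surrogate `id` column, and the helper functions are hypothetical, and the UNIQUE constraint assumes that order, name, and jurisdiction together really do identify an order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
con.executescript('''
    -- Lookup table of allowed states (names taken from the answer).
    CREATE TABLE order_states (state TEXT PRIMARY KEY);
    INSERT INTO order_states VALUES ('new'), ('in process'), ('completed');

    -- One row per order; the state column replaces the second table.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,               -- hypothetical surrogate key
        "order" TEXT NOT NULL,
        name TEXT NOT NULL,
        jurisdiction TEXT NOT NULL,
        state TEXT NOT NULL DEFAULT 'new' REFERENCES order_states(state),
        UNIQUE ("order", name, jurisdiction)  -- assumed natural key
    );
''')

def receive(con, order, name, jurisdiction):
    """Store an incoming order; an already-known order is silently skipped."""
    con.execute(
        'INSERT OR IGNORE INTO orders ("order", name, jurisdiction) VALUES (?, ?, ?)',
        (order, name, jurisdiction),
    )

def complete(con, order, name, jurisdiction):
    """Mark an existing order as completed."""
    con.execute(
        'UPDATE orders SET state = ? WHERE "order" = ? AND name = ? AND jurisdiction = ?',
        ("completed", order, name, jurisdiction),
    )

receive(con, "A1", "Smith", "TX")
complete(con, "A1", "Smith", "TX")
receive(con, "A1", "Smith", "TX")  # same order arrives again and is ignored

rows = con.execute('SELECT "order", name, jurisdiction, state FROM orders').fetchall()
print(rows)
```

With this layout there is nothing to cross-reference or delete: a repeated order collides with the unique constraint and the existing row keeps its state. MySQL spells the same idea `INSERT IGNORE`.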

5 Comments

His query is most likely not OK, but I second the sentiment that two tables for this is the wrong way to have gone.
"order" is not a unique identifier, but the IDs are provided to us by the customer, so we don't have a say over that.
@justin In that case you should introduce a surrogate key. But... even a client should have a way to uniquely identify an order.
In your query, should "order" be "order_all"?
+1 for merging these two tables, lots of data replication atm. I would +1 again for the single key - every order should be uniquely identified for you, your client, your boss... and in this case the database... if you cannot have a PK with a unique constraint (on the 3 fields), maybe a UUID?
0

No, your query will delete any record for which the same order, name, and jurisdiction appear anywhere in orders_all, even if they appear in different records. In other words, a row in orders_new will be deleted if one row in orders_all has the same order, a different one has the same name, and a third one has the same jurisdiction. You are very likely to delete far more than you want to. Instead, this would be more appropriate:

DELETE FROM `orders_new`
WHERE (`order`, `name`, `jurisdiction`) IN (
  SELECT `order`, `name`, `jurisdiction`
  FROM `orders_all`
)

or maybe

DELETE FROM `orders_new` 
WHERE EXISTS (
  SELECT 1
  FROM `orders_all` AS oa
  WHERE oa.`order` = `orders_new`.`order`
    AND oa.`name` = `orders_new`.`name`
    AND oa.`jurisdiction` = `orders_new`.`jurisdiction`
)
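As a quick check of the EXISTS form, here is a minimal sketch run against an in-memory SQLite database via Python's sqlite3. SQLite is used only so the example is self-contained (the identifier quoting differs from MySQL's backticks), and the sample rows and the helper name `delete_completed` are made up for illustration.

```python
import sqlite3

def delete_completed(new_rows, all_rows):
    """Delete from orders_new only rows that match one orders_all row on all three columns."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE orders_all ("order" TEXT, name TEXT, jurisdiction TEXT)')
    con.execute('CREATE TABLE orders_new ("order" TEXT, name TEXT, jurisdiction TEXT)')
    con.executemany("INSERT INTO orders_all VALUES (?, ?, ?)", all_rows)
    con.executemany("INSERT INTO orders_new VALUES (?, ?, ?)", new_rows)
    # The correlated subquery requires all three values to match in the SAME row.
    con.execute('''
        DELETE FROM orders_new
        WHERE EXISTS (
            SELECT 1
            FROM orders_all AS oa
            WHERE oa."order" = orders_new."order"
              AND oa.name = orders_new.name
              AND oa.jurisdiction = orders_new.jurisdiction
        )
    ''')
    remaining = con.execute(
        'SELECT "order", name, jurisdiction FROM orders_new ORDER BY "order", name'
    ).fetchall()
    con.close()
    return remaining

# Hypothetical data: the first incoming order is already completed; the second
# only shares individual values with three different completed orders.
completed = [("A1", "Smith", "TX"), ("B2", "Jones", "CA"), ("C3", "Lee", "NY")]
incoming = [("A1", "Smith", "TX"), ("A1", "Jones", "NY")]

remaining = delete_completed(incoming, completed)
print(remaining)  # prints [('A1', 'Jones', 'NY')]
```

Only the exact duplicate is removed. One caveat that applies to plain `=` comparisons in general: a row with NULL in any of the three columns never matches, so such rows would stay in orders_new.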

0

You should convert that to a DELETE - JOIN construct like

DELETE `orders_new`
FROM `orders_new`
INNER JOIN `orders_all` ON `orders_new`.`order` = `orders_all`.`order`
AND `orders_new`.`name` = `orders_all`.`name`
AND `orders_new`.`jurisdiction` = `orders_all`.`jurisdiction`;

1 Comment

Shouldn't these be "AND" conditions rather than "OR" since we want to delete rows where all three columns match values?
