I have run into this scenario a couple of times, but it does not happen consistently, even when testing against the same databases. I am merging two separate databases, both structured exactly the same, into a single database. When inserting records from one database into the other, I see rows duplicated in the target database even though each row exists only once in the source and not at all in the target.

Example:

DB1..Customer

Cust_ID | Last_Name | First_Name | Phone    | Email             | Field1
1       | Smith     | John       | 111-1111 | [email protected] |

DB2..Customer

Cust_ID | Last_Name | First_Name | Phone    | Email             | Field1
1       | Jones     | Steve      | 222-2222 | [email protected] |
2       | Smith     | Tom        | 333-3333 | [email protected] |

When I run my query:

INSERT INTO DB1..Customer (Last_Name, First_Name, Phone, Email, Field1)
    SELECT
        Last_name, First_Name, Phone, Email, Cust_ID 
    FROM
        DB2..Customer DB2 
    WHERE 
        DB2.Cust_ID NOT IN (SELECT DB2.Cust_ID 
                            FROM DB2..Customer DB2 
                            INNER JOIN DB1..Customer DB1 ON DB1.Last_Name = DB2.Last_Name 
                                                         AND DB1.First_Name = DB2.First_Name 
                                                         AND DB1.Email = DB2.Email)

Results:

DB1..Customer

Cust_ID | Last_Name | First_Name | Phone    | Email             | Field1
1       | Smith     | John       | 111-1111 | [email protected] |
2       | Jones     | Steve      | 222-2222 | [email protected] | 1
3       | Jones     | Steve      | 222-2222 | [email protected] | 1
4       | Jones     | Steve      | 222-2222 | [email protected] | 1
5       | Jones     | Steve      | 222-2222 | [email protected] | 1
6       | Smith     | Tom        | 333-3333 | [email protected] | 2
7       | Smith     | Tom        | 333-3333 | [email protected] | 2
8       | Smith     | Tom        | 333-3333 | [email protected] | 2

I notice the duplicates when I run a count on the Field1 column: some DB2..Customer.Cust_ID values appear there more than once. Since Cust_ID is the primary key, my query should send each value into Field1 only once.
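
The count described above can be sketched roughly as follows. This is only an illustration of the check, not the exact query that was run; the Copies alias is made up for the example, and the query assumes that rows which did not come from DB2 have a NULL Field1 (if they hold an empty string instead, they will show up as one extra group).

```sql
-- Count how many target rows carry each source Cust_ID in Field1.
-- Every Cust_ID copied from DB2 should appear exactly once, so any row
-- returned here identifies a source record that was inserted more than once.
SELECT Field1, COUNT(*) AS Copies
FROM DB1..Customer
WHERE Field1 IS NOT NULL      -- assumption: original DB1 rows have NULL here
GROUP BY Field1
HAVING COUNT(*) > 1
ORDER BY Copies DESC;
```

Against the results shown above, this would be expected to list Field1 = 1 and Field1 = 2.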

Any ideas or suggestions on why this may be occurring? My last run of the query duplicated some rows up to four times. It seems to me SQL Server is caught in a bit of a loop, searching for the customer while also writing them to the target database at the same time.

  • SQL Server is not caught in a loop. Commented Feb 3, 2017 at 3:23

2 Answers

Left joining is a little slower, but easier to read and does what you want.

INSERT INTO DB1..Customer(
  Last_Name
, First_Name
, Phone
, Email
, Field1)
SELECT
  B.Last_name
, B.First_Name
, B.Phone
, B.Email
, B.Cust_ID
FROM
  DB2..Customer B
    LEFT JOIN
    DB1..Customer A ON
  A.Last_Name = B.Last_Name
  AND
  A.First_Name = B.First_Name
  AND
  A.Email = B.Email
  AND
  A.Phone = B.Phone
WHERE A.Cust_ID IS NULL;

3 Comments

I had NOT EXISTS in there as a second method and explained that it would perform better. However we would have to assume that he abused field1 since day one, there was a unique index on it, and it was the same data type. NOT EXISTS would not perform better if we had to reference the main query's fields (Last_Name , First_Name, Email, Phone) to make a WHERE NOT EXISTS (SELECT 1 FROM ... WHERE Last_Name = A.Last_Name ...
It works in the example from the link you provided because those are the PKs.
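
For reference, the NOT EXISTS form discussed in the comment above might look roughly like this. It is a sketch only: it assumes the same four matching columns as the LEFT JOIN in this answer, reuses the A and B aliases, and makes no claim about which version performs better.

```sql
-- Insert only those DB2 customers that have no matching row in DB1.
-- The match uses the same columns as the LEFT JOIN version (assumption).
INSERT INTO DB1..Customer (Last_Name, First_Name, Phone, Email, Field1)
SELECT B.Last_Name, B.First_Name, B.Phone, B.Email, B.Cust_ID
FROM DB2..Customer AS B
WHERE NOT EXISTS (
    SELECT 1
    FROM DB1..Customer AS A
    WHERE A.Last_Name  = B.Last_Name
      AND A.First_Name = B.First_Name
      AND A.Email      = B.Email
      AND A.Phone      = B.Phone
);
```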

Could you try changing the aliases used in the outer query and sub-query to be different? I don't have multiple instances at hand to test, but I wonder if it is being interpreted as a correlated subquery.

Try the following query, which uses DB1_Inner/DB2_Inner/DB2_Outer to differentiate the aliases:

Insert into DB1..Customer (Last_Name, First_Name, Phone, Email, Field1)
SELECT Last_name, First_Name, Phone, Email, Cust_ID 
from DB2..Customer DB2_Outer 
Where DB2_Outer.Cust_ID not in 
    (Select DB2_Inner.Cust_ID 
     from DB2..Customer DB2_Inner  
     Inner Join DB1..Customer DB1_Inner 
     on DB1_Inner.Last_Name=DB2_Inner.Last_Name 
         and DB1_Inner.First_Name=DB2_Inner.First_Name 
         and DB1_Inner.Email=DB2_Inner.Email) 
