I have run into this scenario a couple of times, but it does not occur consistently on the same databases while testing. I have two separate databases, both structured exactly the same, that I am merging into a single database. When inserting records from one database into the other, I am seeing rows duplicated in the target database even though each exists only once in the source and not at all in the target.
Example:
DB1..Customer
Cust_ID | Last_Name | First_Name | Phone | Email | Field1
1 | Smith | John | 111-1111 | [email protected] |
DB2..Customer
Cust_ID | Last_Name | First_Name | Phone | Email | Field1
1 | Jones | Steve | 222-2222 | [email protected] |
2 | Smith | Tom | 333-3333 | [email protected] |
When I run my query:
INSERT INTO DB1..Customer (Last_Name, First_Name, Phone, Email, Field1)
SELECT
    Last_Name, First_Name, Phone, Email, Cust_ID
FROM
    DB2..Customer DB2
WHERE
    DB2.Cust_ID NOT IN (SELECT DB2.Cust_ID
                        FROM DB2..Customer DB2
                        INNER JOIN DB1..Customer DB1 ON DB1.Last_Name = DB2.Last_Name
                            AND DB1.First_Name = DB2.First_Name
                            AND DB1.Email = DB2.Email)
Results:
DB1..Customer
Cust_ID | Last_Name | First_Name | Phone | Email | Field1
1 | Smith | John | 111-1111 | [email protected] |
2 | Jones | Steve | 222-2222 | [email protected] | 1
3 | Jones | Steve | 222-2222 | [email protected] | 1
4 | Jones | Steve | 222-2222 | [email protected] | 1
5 | Jones | Steve | 222-2222 | [email protected] | 1
6 | Smith | Tom | 333-3333 | [email protected] | 2
7 | Smith | Tom | 333-3333 | [email protected] | 2
8 | Smith | Tom | 333-3333 | [email protected] | 2
I notice the duplicates when I run a count on the Field1 column: some DB2..Customer.Cust_ID values appear more than once. Since Cust_ID is the primary key in DB2, each value should flow into the Field1 column only once per my query.
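A count check of the kind described above could look roughly like the following. This is only a sketch: the Copies alias is illustrative, and it assumes Field1 is NULL for rows that were already in DB1 before the merge.

```sql
-- Count how many times each DB2 Cust_ID was written into Field1 of the target.
-- Assumption: rows that originated in DB1 have Field1 = NULL and are skipped.
SELECT Field1, COUNT(*) AS Copies   -- "Copies" is an illustrative alias
FROM DB1..Customer
WHERE Field1 IS NOT NULL
GROUP BY Field1
HAVING COUNT(*) > 1                 -- keep only Cust_IDs inserted more than once
ORDER BY Copies DESC;
```

Against the sample results above, this would return Field1 = 1 with 4 copies and Field1 = 2 with 3 copies.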
Any ideas or suggestions on why this may be occurring? My last run of the query duplicated some items up to 4 times. It seems to me that SQL is caught in a bit of a loop, searching for the record while also writing it to the target db at the same time.