0

We have a rule engine that is categorizing articles. Currently there is an issue, where we get the following error:

Cannot insert duplicate key row in object 'dbo.article_category' with unique index 'IX_article_category_no_duplicates'. The duplicate key value is (123, 456). The statement has been terminated.

The creation of the entry in article_category happens in a stored procedure. The main part of the procedure is this:

MERGE article_category AS [target]
USING (
    SELECT articleId, @categoryId, @creator, @now, @ruleId, 2 
    FROM @articleIdList
) AS [source] (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    ON (
        target.articleId = source.articleId 
        AND target.categoryId= source.categoryId
    )
WHEN NOT MATCHED THEN
    INSERT (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    VALUES (source.articleId, source.categoryId, source.creator, source.createDate, source.ruleId, source.assignmentTypeId);

When two rules try to insert the same category for the same article, this exception above happens.

How is it possible to stop this from happening? I thought that using the merge statement instead of if not exists (...) should prevent this from happening?

Are there other possibilities to prevent this exception?

Thanks in advance

3
  • 2
    Does @articleIdList have any rows which would cause the duplication? The merge clause is only checked once for the dataset, not one row at a time. Commented Feb 19, 2019 at 12:03
  • Not sure if it's a duplicate but Aaron Bertrand's answer to this post can help you understand and prevent the race-condition that happens when two inserts of the same key are executed simultaneously. Commented Feb 19, 2019 at 12:06
  • @Larnu the @articleIdList is unique per call of the stored procedure Commented Feb 19, 2019 at 14:02

3 Answers 3

1

Independent from merge, if the data you are inserting/updating violates unique constraint, you can have this error on both update or insert.

the merge ... using does not handle Unique Index Constraints. Merge makes insertion if it does not match with the condition held on using statement, otherwise make update operation.

In your case,

target.articleId = source.articleId 
    AND target.categoryId= source.categoryId

does not guarantee uniqueness thus you are getting unique index contraint expception.

Sign up to request clarification or add additional context in comments.

Comments

0

Yes as you answered the not exists clause will help instead of merge, plus modify the stored proc in a way that any given time only one condition will have access to insert the row.. This is a classy case of dead lock

1 Comment

with the "not exists" clause the duplicate error still happened :(
0

You need to use a SOURCE that doesn't have duplicates by your UNIQUE columns. You can use a ROW_NUMBER() with a PARTITION BY <unique column> and a posterior filter to remove potentially duplicates before the actual insert:

;WITH DuplicateRanking AS
(
    SELECT 
        articleId = articleId, 
        categoryId = @categoryId, 
        creator = @creator, 
        createDate = @now, 
        ruleId = @ruleId, 
        assignmentTypeId = 2 ,
        DuplicateRanking = ROW_NUMBER() OVER (
            PARTITION BY 
                articleId       -- Your unique columns here
            ORDER BY 
                (SELECT NULL))  -- Your desired order here (will determine which row gets inserted)
    FROM 
        @articleIdList  
)
MERGE article_category AS [target]
USING DuplicateRanking AS [source]
    ON (
        target.articleId = source.articleId AND 
        target.categoryId= source.categoryId
    )
WHEN NOT MATCHED AND [source].DuplicateRanking = 1 -- Only insert 1 row per unique column set 
    THEN
    INSERT (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    VALUES (source.articleId, source.categoryId, source.creator, source.createDate, source.ruleId, source.assignmentTypeId);

3 Comments

@xeraphim it's not possible to insert duplicates with this code unless you aren't correctly declaring the PARTITION BY column/values or there's another clause of the merge that does an update. Is your unique constraint on columns articleId and categoryId?
yes it's on articleId and categoryId. Can the problem not be that this procedure is executed multiple times in milliseconds?
The merge ensures that the comparison/join is done on the same transaction and scope as the operations it will do. If its failing due to other records then there must be records already inserted with these values even if this insert fails. Do you have other operations on the same transaction? Are you sure that this merge is causing the error?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.