I have a SQL Server database with the following tables:
CREATE TABLE [Document](
    [IdDocument] [int] NOT NULL,
    [DocumentType] [int] NOT NULL,
    CONSTRAINT PK_Document PRIMARY KEY CLUSTERED (IdDocument)
)
CREATE TABLE [ExternalKey](
    [IdExternalKey] [int] NOT NULL,
    [RefDocument] [int] NOT NULL,
    [EntityType] [int] NOT NULL,
    [Value] [int] NOT NULL,
    CONSTRAINT PK_ExternalKey PRIMARY KEY CLUSTERED (IdExternalKey),
    CONSTRAINT FK_RefDocument FOREIGN KEY (RefDocument) REFERENCES Document(IdDocument),
    CONSTRAINT UC_ExternalKey UNIQUE (RefDocument, EntityType, Value)
)
- Documents are mapped to physical files on a drive. Each document has a type (e.g. 2 = IDENTITY_CARD).
- These documents are linked to external entities through a one-to-many relation (one document, many external keys). Each external key has an entity type (e.g. 50 = PERSON) and an ID (e.g. 213235).
The database is quite big: Document has 10M rows and ExternalKey has 40M.
Example of data inside ExternalKey:
IdExternalKey  RefDocument  EntityType  Value
1              1            50          3421
2              1            50          9524
3              1            60          7893
4              2            50          1752
5              2            50          8979
I want to filter documents based on their type and on the entities they are linked to (simplified query):
WHERE (DocumentType IN (10, 20, ...)
AND ExternalKeys ANY (EntityType = 50 AND Value IN (4,5,6,7,8,9,...)))
OR (DocumentType IN (80, 90, ...)
AND ExternalKeys ANY (EntityType = 60 AND Value IN (110,120,130,...)))
The best I could come up with is this:
SELECT TOP 50 IdDocument
FROM Document d
WHERE
(d.DocumentType IN (SELECT Id FROM @listA1) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 60 AND ex.Value IN (SELECT Id FROM @listB1))) OR
(d.DocumentType IN (SELECT Id FROM @listA2) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 61 AND ex.Value IN (SELECT Id FROM @listB2))) OR
...
(d.DocumentType IN (SELECT Id FROM @listA3) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 59 AND ex.Value IN (SELECT Id FROM @listB3)))
ORDER BY IdDocument
The table-valued parameters contain a lot of IDs (about 100K in total). The table type definition is:
CREATE TYPE int_list_type AS TABLE(Id int NOT NULL PRIMARY KEY)
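For context, a minimal sketch of how such a list can be declared and filled before it is used in the query. The values are the illustrative ones from the simplified filter above (10 and 20 for DocumentType, 4 to 9 for Value), not real data, and local variables stand in here for the table-valued parameters.

```sql
-- Illustrative values only, taken from the simplified filter;
-- the real lists hold about 100K IDs in total.
DECLARE @listA1 int_list_type;  -- document types
DECLARE @listB1 int_list_type;  -- entity IDs

INSERT INTO @listA1 (Id) VALUES (10), (20);
INSERT INTO @listB1 (Id) VALUES (4), (5), (6), (7), (8), (9);

-- Quick sanity check on the contents of one list.
SELECT COUNT(*) AS TypeCount FROM @listA1;
```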
The query is slow, and its performance varies a lot depending on the sort order (from about a dozen seconds to several minutes). For example, sorting documents by IdDocument is much faster than sorting them by DocumentType.
What I have tried so far:
- Creating indexes on RefDocument, EntityType, Value (multi-column index). I have tried different column orders.
- Replacing the table-valued parameters (e.g. @listA1) with hardcoded values in the query: parsing time explodes (as expected).
- Replacing IN with EXISTS: performance is similar.
- Replacing IN with JOIN: a lot slower.
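To make the attempts above concrete, here is a rough sketch of one index column order and of a single OR branch rewritten with EXISTS. The index name IX_ExternalKey_Entity_Value_Doc is hypothetical, the column order shown is only one of the orders tried, and the lists are declared empty just so the batch is self-contained.

```sql
-- Hypothetical index name; one possible column order among those tried.
-- Note: UC_ExternalKey already provides a unique index on
-- (RefDocument, EntityType, Value).
CREATE NONCLUSTERED INDEX IX_ExternalKey_Entity_Value_Doc
    ON ExternalKey (EntityType, Value, RefDocument);
GO

-- Stand-ins for the table-valued parameters (left empty in this sketch).
DECLARE @listA1 int_list_type;
DECLARE @listB1 int_list_type;

-- One OR branch expressed with EXISTS instead of IN.
SELECT TOP 50 d.IdDocument
FROM Document d
WHERE EXISTS (SELECT 1 FROM @listA1 a WHERE a.Id = d.DocumentType)
  AND EXISTS (
        SELECT 1
        FROM ExternalKey ex
        WHERE ex.RefDocument = d.IdDocument
          AND ex.EntityType = 60
          AND EXISTS (SELECT 1 FROM @listB1 b WHERE b.Id = ex.Value)
      )
ORDER BY d.IdDocument;
```

The full query repeats the second block once per OR branch, each with its own pair of lists and its own EntityType.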
Is there a more efficient way to perform this filter? Any tips?
Here is the query plan for 2 OR clauses (each OR clause is in RED)

