
I have two SQLite tables (list1 and list2), each with a single text column (val). I want to efficiently find all combinations where list2.val is a substring of list1.val.

Currently I have this solution:

import sqlite3

list1 = ["this is string1", "this is string2", "this is string3"]
list2 = ["string1", "string2"]

in_memory = sqlite3.connect(':memory:')
c = in_memory.cursor()
c.execute('CREATE TABLE list1 (val text NOT NULL)')
c.execute('CREATE TABLE list2 (val text NOT NULL)')

for v in list1:
    c.execute("INSERT INTO list1 VALUES (?)", (v, ))

for v in list2:
    c.execute("INSERT INTO list2 VALUES (?)", (v, ))

l = [*c.execute("SELECT list1.val, list2.val FROM list1, list2 WHERE instr(list1.val, list2.val)")]
print(l)

Prints correctly:

[('this is string1', 'string1'), ('this is string2', 'string2')]

Is there a more efficient SQL solution than iterating over every list1.val and list2.val combination and checking whether one contains the other?

2 Comments

  • Full Text Search might be of interest. Commented Jun 16, 2019 at 23:06
  • @Shawn Yes, I looked at it but couldn't get it to work (I'm an SQL beginner). Could you please provide an example? Commented Jun 16, 2019 at 23:08

1 Answer


You can phrase this as a single query:

select l1.val, l2.val
from list1 l1 join
     list2 l2
     on l1.val like '%' || l2.val || '%';

Doing the loop inside the database is slightly more efficient than doing the loop yourself, because only matching rows are returned and you don't have the overhead of multiple queries.
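To check that the join gives the same rows as the instr() version from the question, here is a minimal Python sketch using the question's sample data. Note two differences that matter on other data: LIKE is case-insensitive for ASCII letters by default, and any % or _ inside list2.val acts as a wildcard, while instr() compares the text literally.

```python
import sqlite3

list1 = ["this is string1", "this is string2", "this is string3"]
list2 = ["string1", "string2"]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE list1 (val text NOT NULL)")
con.execute("CREATE TABLE list2 (val text NOT NULL)")
con.executemany("INSERT INTO list1 VALUES (?)", [(v,) for v in list1])
con.executemany("INSERT INTO list2 VALUES (?)", [(v,) for v in list2])

# Join phrased with LIKE, as in the answer.
like_rows = con.execute(
    "SELECT l1.val, l2.val FROM list1 l1 JOIN list2 l2 "
    "ON l1.val LIKE '%' || l2.val || '%'"
).fetchall()

# Same join phrased with instr(), as in the question.
instr_rows = con.execute(
    "SELECT l1.val, l2.val FROM list1 l1 JOIN list2 l2 "
    "ON instr(l1.val, l2.val) > 0"
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted lists.
print(sorted(like_rows) == sorted(instr_rows))
```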

However, this will still be doing nested loops over both tables. Because the pattern starts with a wildcard and changes for every row of list2, such a query cannot take advantage of traditional indexes.
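You can see this for yourself with EXPLAIN QUERY PLAN. The sketch below is only an illustration: the index name list1_val_idx is made up for the example, and the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE list1 (val text NOT NULL)")
con.execute("CREATE TABLE list2 (val text NOT NULL)")
# Hypothetical index, added only to show that the planner cannot seek with it.
con.execute("CREATE INDEX list1_val_idx ON list1 (val)")

plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT l1.val, l2.val FROM list1 l1 JOIN list2 l2 "
    "ON l1.val LIKE '%' || l2.val || '%'"
).fetchall()

# The human-readable description of each step is the last column of a plan row.
details = [row[-1] for row in plan]
for d in details:
    print(d)
```

I would expect every step to be reported as a SCAN rather than a SEARCH. The index may still be read as a covering index, but only as a full scan, not as a lookup.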


3 Comments

Thanks for your answer! I looked at the FTS5 SQLite extension, but I couldn't get it to work. Is it possible to use this SQLite full-text index in this case?
@AndrejKesely Possibly not. For instance, full text search looks for words in the text, and your code is looking for generic strings. I also think that match requires that the pattern be a constant, which would suggest your approach of a nested loop at the application level.
@AndrejKesely Regarding "Possibly not. For instance, full text search looks for words in the text, and your code is looking for generic strings": see the manual, section "3.2. FTS5 Phrases": "FTS queries are made up of phrases. A phrase is an ordered list of one or more tokens." So the search term does not have to be a single "word"; SQLite supports searching for generic strings as well.
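Since an FTS5 example was asked for, here is a rough sketch of what the comments above are discussing. It rests on several assumptions: the SQLite build behind Python's sqlite3 module has FTS5 compiled in (the function returns None otherwise), the table name list1_fts is made up for the example, and the default tokenizer is used. It matches whole tokens and phrases, not arbitrary substrings, so "string" would not match "this is string1". It is therefore not a drop-in replacement for instr() or LIKE.

```python
import sqlite3

list1 = ["this is string1", "this is string2", "this is string3"]
list2 = ["string1", "string2"]


def fts5_phrase_matches(texts, phrases):
    """Return (text, phrase) pairs where the phrase occurs as whole tokens."""
    con = sqlite3.connect(":memory:")
    try:
        # list1_fts is a hypothetical name for the full-text copy of list1.
        con.execute("CREATE VIRTUAL TABLE list1_fts USING fts5(val)")
    except sqlite3.OperationalError:
        return None  # this SQLite build was compiled without FTS5
    con.execute("CREATE TABLE list2 (val text NOT NULL)")
    con.executemany("INSERT INTO list1_fts (val) VALUES (?)", [(t,) for t in texts])
    con.executemany("INSERT INTO list2 VALUES (?)", [(p,) for p in phrases])
    # Each list2 value is wrapped in double quotes (with embedded quotes
    # doubled) so FTS5 treats it as a single phrase. CROSS JOIN keeps list2
    # as the outer loop, so MATCH receives one list2 value per lookup.
    sql = """
        SELECT list1_fts.val, list2.val
        FROM list2 CROSS JOIN list1_fts
        WHERE list1_fts MATCH ('"' || replace(list2.val, '"', '""') || '"')
    """
    return con.execute(sql).fetchall()


rows = fts5_phrase_matches(list1, list2)
print(rows)
```

The outer loop over list2 is still there, but each lookup goes through the full-text index instead of scanning every row of list1.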
