0

Below is a dummy example to demonstrate how the query results are different, the real query is more more complex so the query structure may seem overkill in this example. Set up a connection to a sqlite3 database and add these records to start with:

import sqlite3

connection = sqlite3.connect(
    'file:test_database',
    detect_types=sqlite3.PARSE_DECLTYPES,
    isolation_level=None,
    check_same_thread=False,
    uri=True
)

cursor = connection.cursor()

tableA_records = [(1, 202003), (2, 202003), (3, 202003), (4, 202004), (5, 202004), (6, 202004), (7, 202004), (8, 202004), ]
tableB_records = [(1, 202004), (2, 202004), (3, 202004), (4, 202004), (5, 202004),]

tableA_ddl = """
    create table tableA
    (
        ID           int,
        RunYearMonth int
    );
"""

tableB_ddl = """
    create table tableB
    (
        ID           int,
        RunYearMonth int
    );
"""

cursor.execute(tableA_ddl)
cursor.execute(tableB_ddl)

cursor.executemany("INSERT INTO tableA VALUES (?, ?)", tableA_records)
cursor.executemany("INSERT INTO tableB VALUES (?, ?)", tableB_records)

Now we have two tables (A and B) with 8 and 5 records, respectively. I want to count records that have the same ID and date between the two when the date is 202004.

I have this query now:

SELECT COUNT(*)
    FROM (
        SELECT *
        FROM `tableA`
        WHERE `RunYearMonth` = 202004
    ) AS `A`
    INNER JOIN (
        SELECT *
        FROM `tableB`
        WHERE `RunYearMonth` = 202004
    ) AS `B`
      ON `A`.`ID` = `B`.`ID`
      AND `A`.`RunYearMonth` = `B`.`RunYearMonth`

This, as expected, returns 2 when run in a sqlite console.

When run in Python, though, you get a different result.

q = """
SELECT COUNT(*)
    FROM (
        SELECT *
        FROM `tableA`
        WHERE `RunYearMonth` = 202004
    ) AS `map1`
    INNER JOIN (
        SELECT *
        FROM `tableB`
        WHERE `RunYearMonth` = 202004
    ) AS `map2`
      ON `map1`.`ID` = `map2`.`ID`
      AND `map1`.`RunYearMonth` = `map2`.`RunYearMonth`
"""
cursor.execute(q)
print(cursor.fetchall())

This returns instead 5 which effectively ignores the WHERE clauses in the subqueries and the join condition they have the same RunYearMonth, there are records 1-5 in both.

What would cause this difference? Does Python not simply pass the query string through?

Pertinent versions:

sqlite3.version == 2.6.0
sqlite3.sqlite_version == 3.31.1
sys.version == 3.6.5
2
  • I'm getting 2 for the query run in Python. My SQLite version is 3.22.0 and my Python version is 3.8.0 Commented Apr 24, 2020 at 3:34
  • 2
    @mechanical_meat The relevant bug was introduced in 3.31. Commented Apr 24, 2020 at 4:54

1 Answer 1

1

I created a test database using your first script, and then opened it in a sqlite3 shell. Your query returns 5 in it, not the 2 you're getting. After changing it to show all the rows, not just the count, it results in:

ID          RunYearMonth  ID          RunYearMonth
----------  ------------  ----------  ------------
1           202003        1           202004
2           202003        2           202004
3           202003        3           202004
4           202004        4           202004
5           202004        5           202004

I'm not sure why those rows from tableA with RunYearMonth of 202003 are getting included; I'd think they would be filtered out by the subquery's WHERE.

This appears to be a bug in Sqlite3 - using an older version (3.11.0) gives the expected results, and a slight tweak to the query to remove the AND map1.RunYearMonth = map2.RunYearMonth produces the correct results on 3.31.1.


Anyways, that query can be cleaned up significantly, like so:

SELECT count(*)
FROM tableA AS A
JOIN tableB AS B ON A.ID = B.ID
                AND A.RunYearMonth = B.RunYearMonth
WHERE A.RunYearMonth = 202004;

which does return the expected count of 2.

Sign up to request clarification or add additional context in comments.

1 Comment

Turns out this was already found and fixed in February for the upcoming 3.32 release: sqlite.org/src/timeline?c=c9a8defcef35 It only affects 3.31.0 and 3.31.1 from the sounds of it.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.