I'm trying to store a playlist in an SQLite database. To save storage space, I want to keep song artists and titles in their own tables and use both tables' rowids to store the plays.

The structure of the database looks like this:

import sqlite3

conn = sqlite3.connect('playlist.db')
c = conn.cursor()
sqlcommand= 'CREATE TABLE if not EXISTS artists(artist text PRIMARY KEY)'
c.execute(sqlcommand)
sqlcommand= 'CREATE TABLE if not EXISTS titles(title text PRIMARY KEY)'
c.execute(sqlcommand)
sqlcommand= 'CREATE TABLE if not EXISTS plays(PlayDate date, PlayTime datetime, ArtistID integer, TitleID integer )'
c.execute(sqlcommand)

Each song play event is stored like this:

sqlcommand= 'INSERT or IGNORE INTO artists VALUES ("' + FoundArtist + '")'
c.execute(sqlcommand)
sqlcommand= 'INSERT or IGNORE INTO titles VALUES ("' + FoundTitle + '")'
c.execute(sqlcommand)

sqlcommand= 'SELECT ROWID from artists where artist = "'+FoundArtist+'"'
c.execute(sqlcommand)
ArtistID = str(c.fetchone()[0])

sqlcommand= 'SELECT ROWID from titles where title = "'+FoundTitle+'"'
c.execute(sqlcommand)
TitleID = str(c.fetchone()[0])

sqlcommand= 'INSERT or IGNORE INTO plays VALUES ("' + FoundPlayDate + '",' + \
                                             '"'+ FoundPlayTime + '",' + \
                                                  ArtistID + ',' + \
                                                  TitleID + ')'

c.execute(sqlcommand)

This works, but is rather slow.

How can I combine the two SELECT commands and the final INSERT command so that they run as one?

I'm running Python 2.7.6 on a Raspberry Pi (with Raspbian/Debian wheezy).

2 Answers

You cannot rely on the rowid values when this column does not appear in the table definition; any VACUUM might change them. So your tables should be defined like this:

CREATE TABLE artists(ID INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE titles(ID INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE plays(
    PlayDate DATE,
    PlayTime DATETIME,
    /* ... */
    ArtistID INTEGER REFERENCES artists(ID),
    TitleID INTEGER REFERENCES titles(ID)
);

In theory, the ID lookup could be done with subqueries, like this:

INSERT INTO plays VALUES('...', '...',
                         (SELECT ID FROM artists WHERE name = '...'),
                         (SELECT ID FROM titles WHERE name = '...'));
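
From Python, the subquery form might look like the following. This is only a sketch under assumptions: it uses an in-memory database, the table definitions shown above, and a hypothetical helper named store_play that is not part of the original code. The names are passed as parameters rather than concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
c = conn.cursor()
c.execute("CREATE TABLE artists(ID INTEGER PRIMARY KEY, name TEXT UNIQUE)")
c.execute("CREATE TABLE titles(ID INTEGER PRIMARY KEY, name TEXT UNIQUE)")
c.execute("CREATE TABLE plays(PlayDate DATE, PlayTime DATETIME, "
          "ArtistID INTEGER REFERENCES artists(ID), "
          "TitleID INTEGER REFERENCES titles(ID))")


def store_play(cur, play_date, play_time, artist, title):
    """Hypothetical helper: store one play, resolving both IDs via subqueries."""
    # Make sure both names exist; duplicates are skipped by the UNIQUE constraint.
    cur.execute("INSERT OR IGNORE INTO artists(name) VALUES (?)", (artist,))
    cur.execute("INSERT OR IGNORE INTO titles(name) VALUES (?)", (title,))
    # The two ID lookups and the insert happen in a single statement.
    cur.execute(
        "INSERT INTO plays VALUES (?, ?, "
        "(SELECT ID FROM artists WHERE name = ?), "
        "(SELECT ID FROM titles WHERE name = ?))",
        (play_date, play_time, artist, title))


store_play(c, "2014-05-01", "12:00:00", "Example Artist", "Example Title")
store_play(c, "2014-05-01", "12:05:00", "Example Artist", "Another Title")
conn.commit()

rows = c.execute(
    "SELECT ArtistID, TitleID FROM plays ORDER BY PlayTime").fetchall()
```

This folds the two SELECT statements into the final INSERT, but each play still costs three statements in total.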

However, the database must already look up each name when you try to insert it. For names that already exist, you can avoid one of these lookups by reading the ID first (which also checks whether the name exists) and inserting only when it is missing:

c.execute("SELECT ID FROM artists WHERE name = ?", (FoundArtist,))
row = c.fetchone()
if row is None:
    c.execute("INSERT INTO artists(name) VALUES(?)", (FoundArtist,))
    artistID = c.lastrowid
else:
    artistID = row[0]

c.execute("SELECT ID FROM titles WHERE name = ?", (FoundTitle,))
row = c.fetchone()
if row is None:
    c.execute("INSERT INTO titles(name) VALUES(?)", (FoundTitle,))
    titleID = c.lastrowid
else:
    titleID = row[0]

c.execute("INSERT INTO plays VALUES(?,?,?,?)",
          (FoundPlayDate, FoundPlayTime, artistID, titleID))
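
Since the same lookup-then-insert pattern is repeated for artists and titles, it could be factored into one function. The sketch below is a hedged illustration, not part of the original answer: get_or_create_id and the sample data are hypothetical, and it uses an in-memory database. It also commits once for the whole batch; the question does not show where it commits, so whether that helps the reported slowness is an assumption, although committing after every single play generally costs more than one commit per batch.

```python
import sqlite3


def get_or_create_id(cur, table, name):
    """Hypothetical helper: return the ID for name, inserting it if missing."""
    # The table name is interpolated into the SQL text (it cannot be bound as
    # a parameter), so only accept the two known table names.
    if table not in ("artists", "titles"):
        raise ValueError("unexpected table: %r" % (table,))
    cur.execute("SELECT ID FROM %s WHERE name = ?" % table, (name,))
    row = cur.fetchone()
    if row is not None:
        return row[0]
    cur.execute("INSERT INTO %s(name) VALUES (?)" % table, (name,))
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
c = conn.cursor()
c.executescript("""
    CREATE TABLE artists(ID INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE titles(ID INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE plays(
        PlayDate DATE,
        PlayTime DATETIME,
        ArtistID INTEGER REFERENCES artists(ID),
        TitleID INTEGER REFERENCES titles(ID)
    );
""")

# Made-up sample plays: (date, time, artist, title)
found_plays = [
    ("2014-05-01", "12:00:00", "Example Artist", "Example Title"),
    ("2014-05-01", "12:05:00", "Example Artist", "Another Title"),
    ("2014-05-01", "12:09:00", "Other Artist", "Example Title"),
]

for play_date, play_time, artist, title in found_plays:
    artist_id = get_or_create_id(c, "artists", artist)
    title_id = get_or_create_id(c, "titles", title)
    c.execute("INSERT INTO plays VALUES(?,?,?,?)",
              (play_date, play_time, artist_id, title_id))

conn.commit()  # one commit for the whole batch

stored = c.execute(
    "SELECT ArtistID, TitleID FROM plays ORDER BY PlayTime").fetchall()
```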

2 Comments

Good point with the VACUUM comment. I'll test this later in my free time and mark it solved when this works for me. Thanks.
I had problems with c.execute("SELECT ID FROM titles WHERE name = ?", (FoundTitle,)): I got the error "sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings", so I'm back to assembling the commands as strings. The rest works fine, so I will mark this solved.

You don't need to pull rowids from artists and titles. You've set the artist and title names as primary keys. They are the ids, and you don't need to query them because you already have them.

All you really need to do is to alter your plays schema like this:

CREATE TABLE plays (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    founddatetime TEXT DEFAULT CURRENT_TIMESTAMP,
    artist TEXT,
    title TEXT,
    FOREIGN KEY(artist) REFERENCES artists(artist),
    FOREIGN KEY(title) REFERENCES titles(title)
);

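The FOREIGN KEY directive ensures that each artist and title in plays corresponds to an entry in artists and titles, respectively. Note that SQLite only enforces this when foreign key support is switched on for the connection with PRAGMA foreign_keys = ON; it is off by default.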
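
A minimal sketch of checking that the constraint is actually enforced, assuming Python's sqlite3 module and an in-memory database. SQLite leaves foreign key enforcement off unless PRAGMA foreign_keys = ON is issued on each connection, so the sketch turns it on first; the "Unknown" names are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
# Enforcement is per connection and must be enabled before it has any effect.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE artists(artist TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE titles(title TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE plays (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        founddatetime TEXT DEFAULT CURRENT_TIMESTAMP,
        artist TEXT,
        title TEXT,
        FOREIGN KEY(artist) REFERENCES artists(artist),
        FOREIGN KEY(title) REFERENCES titles(title)
    )
""")

# Neither name exists in the parent tables, so this insert should be refused.
try:
    conn.execute("INSERT INTO plays(artist, title) VALUES (?, ?)",
                 ("Unknown Artist", "Unknown Title"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

enforcement_on = conn.execute("PRAGMA foreign_keys").fetchone()[0]
```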

Then adding a record would look a bit like this:

c.execute('INSERT OR IGNORE INTO artists(artist) VALUES (?)', [FoundArtist])
c.execute('INSERT OR IGNORE INTO titles(title) VALUES (?)', [FoundTitle])
c.execute('INSERT INTO plays(founddatetime, artist, title) VALUES (?, ?, ?)',
          [FoundPlayDateTime, FoundArtist, FoundTitle])

# commit() is a method of the connection, not of the cursor
conn.commit()

1 Comment

Thanks for the reply. I set the artist and title columns as primary keys just to avoid inserting duplicates. (There is definitely a better way of doing this, but I don't know that syntax yet.) I treat rowid as the real primary key, because the database size shrinks by 50% if I use only IDs in the plays table. This also makes queries a lot faster.
