I have a large table (`table1`, columns `longkey`, `name`, `info`) with ~1,000,000 rows on which I need to perform the following:
- Select all rows in which `info` is NULL or `""`
- Execute a conversion function in Python 3 that converts `name` to `info` (let's call it `conversion(name)`)
- Update the row with the new `info` value
What is the fastest way to perform this update? Are there any SQLite3 settings which could be activated to improve performance?
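For reference, settings along these lines are what I had in mind, though I'm not sure whether they actually help for this kind of bulk update (the file path here is just a stand-in):

```python
import sqlite3, tempfile, os

# stand-in path for the real database file
path = os.path.join(tempfile.mkdtemp(), "table1.db")
db = sqlite3.connect(path)

# WAL reduces fsync overhead for bulk writes; NORMAL synchronous
# is considered safe in WAL mode and speeds up commits
mode = db.execute("PRAGMA journal_mode = WAL;").fetchone()[0]
db.execute("PRAGMA synchronous = NORMAL;")
```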
My current research has suggested the following with the SQLite3 library:
    cursor = db.cursor()
    cursor.execute("SELECT longkey, name FROM table1 WHERE info IS NULL OR info = '';")
    rows = cursor.fetchall()

    items = []
    for row in rows:
        # Convert name to info
        info = conversion(row[1])
        items.append((info, row[0]))

    cursor.executemany('UPDATE table1 SET info = ? WHERE longkey = ?;', items)
The problem with this, of course, is that `rows` holds the entire result set in memory at once, which is enormous and very memory intensive.
I have considered using multiple cursors, but that does not seem like a good solution.
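One alternative I have sketched is to process the rows in batches, re-running the SELECT with a LIMIT each time; since updated rows no longer match the WHERE clause, the loop eventually drains. The batch size is a guess, `conversion` is just a stand-in here (it upper-cases the name), and the in-memory database only exists to make the sketch runnable:

```python
import sqlite3

def conversion(name):
    # stand-in for the real conversion function
    return name.upper()

# demo database so the sketch is self-contained
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table1 (longkey INTEGER PRIMARY KEY, name TEXT, info TEXT)")
db.executemany("INSERT INTO table1 (name, info) VALUES (?, ?)",
               [("alice", None), ("bob", ""), ("carol", "done")])

BATCH = 10_000  # tune to trade memory for round trips

while True:
    # only one batch is ever held in memory
    rows = db.execute(
        "SELECT longkey, name FROM table1 "
        "WHERE info IS NULL OR info = '' LIMIT ?;", (BATCH,)
    ).fetchall()
    if not rows:
        break
    items = [(conversion(name), longkey) for longkey, name in rows]
    db.executemany("UPDATE table1 SET info = ? WHERE longkey = ?;", items)
    db.commit()
```

This keeps the memory footprint bounded by `BATCH`, at the cost of repeating the SELECT once per batch.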
Edit: Is using `connection.create_function(name, num_params, func)` a possible solution to this?
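What I mean is something like the following, where the whole update runs as a single SQL statement with the Python function called row by row inside SQLite (again, `conversion` is a stand-in and the in-memory table is just for illustration):

```python
import sqlite3

def conversion(name):
    # stand-in for the real conversion function
    return name.upper()

# demo database so the sketch is self-contained
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table1 (longkey INTEGER PRIMARY KEY, name TEXT, info TEXT)")
db.executemany("INSERT INTO table1 (name, info) VALUES (?, ?)",
               [("alice", None), ("bob", ""), ("carol", "done")])

# register the Python function so SQL can call it per row
db.create_function("conversion", 1, conversion)
db.execute("UPDATE table1 SET info = conversion(name) "
           "WHERE info IS NULL OR info = '';")
db.commit()
```

No result set is ever materialised in Python, so memory usage stays flat regardless of how many rows match.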
How can I optimise this process to be fast but not extremely memory intensive?