
I have a large table (table1; its columns are name, longkey, info) with ~1,000,000 rows, on which I need to perform the following:

  1. Select all rows in which info is NULL or ""
  2. Execute a conversion function in Python 3 that converts name to info (let's call it conversion(name))
  3. Update the row with the new info value

What is the fastest way to perform this update? Are there any SQLite3 settings which could be activated to improve performance?

My current research has suggested the following with the SQLite3 library:

cursor = db.cursor()
cursor.execute("SELECT longkey, name FROM table1 WHERE info IS NULL OR info = '';")
rows = cursor.fetchall()  # materialises every matching row in memory at once
items = []
for row in rows:
    # Convert name to info
    info = conversion(row[1])
    items.append((info, row[0]))  # executemany expects a sequence of tuples
cursor.executemany('UPDATE table1 SET info = ? WHERE longkey = ?;', items)
db.commit()

The problem with this, of course, is that fetchall() builds the entire rows list in memory at once, which is enormous and very memory intensive for a million rows.

I have considered using multiple cursors, but this does not seem to be a good solution.
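For reference, here is a chunked variant I have sketched (untested; it reuses db, cursor, conversion and longkey from above, and relies on conversion() never returning NULL or '', since otherwise the same rows would match forever and the loop would not terminate):

BATCH = 10000  # rows per chunk; tune for memory vs. round trips

while True:
    cursor.execute(
        "SELECT longkey, name FROM table1 "
        "WHERE info IS NULL OR info = '' LIMIT ?;", (BATCH,))
    rows = cursor.fetchall()
    if not rows:
        break  # nothing left to convert
    # Updated rows no longer match the WHERE clause, so each pass
    # of the loop picks up the next batch automatically.
    items = [(conversion(name), longkey) for longkey, name in rows]
    cursor.executemany('UPDATE table1 SET info = ? WHERE longkey = ?;', items)
    db.commit()  # keep each transaction (and its memory) small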

Edit: Is using connection.create_function(name, num_params, func) a possible solution to this?

How can I optimise this process to be fast but not extremely memory intensive?

1 Answer


When you register the function with the connection and call it directly from SQL, the rows are updated one by one inside SQLite, so no intermediate Python list is ever built:

db.create_function('conversion', 1, conversion)
cursor.execute("UPDATE table1 SET info = conversion(name) "
               "WHERE info IS NULL OR info = '';")
db.commit()

1 Comment

Thanks. Will this be the most efficient method for the update, then?
