
I have a huge (1GB+) database dump that I want to load into new databases on other servers. I tried parsing it line by line and executing each line with mysql, but the dump doesn't split evenly into one command per line, so it just fails on the incomplete statements.

filename = '/var/test.sql'
fp = open(filename)
while True:
    a = fp.readline()
    if not a:
        break
    cursor.execute(a)  # fails most of the time: a single line is rarely a complete statement

It is also far too large to load the entire thing into memory and execute in one call. Furthermore, the Python MySQLdb module does not support the source command.

EDITED

The file includes a bunch of INSERT and CREATE statements. Where it's failing is on the INSERTs into large tables that contain raw text. The raw text is full of semicolons and newlines, so it's hard to split commands on those.

  • It may help if you provide sample code from test.sql. Commented May 13, 2011 at 12:12
  • Please translate "huge" into megabytes. Please show what test.sql looks like. Commented May 13, 2011 at 12:46
  • Huge is around 1 GB, but could theoretically be 5 GB or 10 GB. Commented May 13, 2011 at 18:15
  • A quick silly question: doesn't MySQL supply an executable program that will do bulk CREATE and INSERT statements from your file (presumably a lot faster than a Python script that parses the file and executes one statement at a time)? Commented May 14, 2011 at 0:13

3 Answers


Any reason you can't spawn out a process to do it for you?

import subprocess

with open(filename) as fd:
    subprocess.Popen(['mysql', '-u', username, '-p{}'.format(password),
                      '-h', hostname, database], stdin=fd).wait()

You may want to tailor that a little, as the password will be exposed to ps.
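To keep the password out of the process list entirely, one option is to pass it through the environment instead of argv; the mysql client reads the MYSQL_PWD variable. A minimal sketch (build_mysql_invocation and load_dump are illustrative names, not part of any library):

```python
import os
import subprocess

def build_mysql_invocation(username, password, hostname, database):
    """Build (argv, env) for the mysql client, keeping the password
    off the command line so it never shows up in ps output."""
    argv = ['mysql', '-u', username, '-h', hostname, database]
    # MYSQL_PWD is honoured by the mysql client; for production use,
    # a --defaults-extra-file is generally considered more robust.
    env = dict(os.environ, MYSQL_PWD=password)
    return argv, env

def load_dump(filename, username, password, hostname, database):
    """Stream the dump file straight into the client's stdin."""
    argv, env = build_mysql_invocation(username, password, hostname, database)
    with open(filename, 'rb') as fd:
        return subprocess.call(argv, stdin=fd, env=env)
```

The client does its own statement splitting, so the 1GB+ file is streamed through stdin rather than parsed in Python.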



Assuming queries do end on line boundaries, you could just add lines together until they make a complete query.

Something like:

filename = '/var/test.sql'
fp = open(filename)
lines = ''
while True:
    a = fp.readline()
    if not a:
        break
    try:
        cursor.execute(lines + a)
        lines = ''
    except Exception:  # incomplete statement: keep accumulating lines
        lines += a

If it's only insert statements, you could look for lines ending in ; where the next line starts with "INSERT".

filename = '/var/test.sql'
fp = open(filename)
lines = ''
while True:
    a = fp.readline()
    if not a:
        break
    if lines.strip().endswith(';') and a.lstrip().lower().startswith('insert'):
        cursor.execute(lines)
        lines = a
    else:
        lines += a
# Catch the last statement
if lines.strip():
    cursor.execute(lines)

edit: replaced trim() with strip() and realised we don't need to execute the line a separately in the second code example.

3 Comments

Thanks RJ. Good idea — I tried that, but the dumps contain text that includes semicolons and newlines.
@john machin @Xavier, yes, I meant strip. I wanted to strip the newlines away before checking whether lines ends with ;. Unfortunately, Python is not the language I've been programming in most recently.
@JiminyCricket It has lines that end in ";" where the next line starts with "insert" in the data. What about the first method that just tries adding more lines to the query until it works?

I think sometimes we should choose another way to do the job effectively. For large data I prefer to use this tool: http://www.mysqldumper.net/

