0

The Python code I have looks like this

f = open('data.tsv', 'r')

for line in f:
    a = line.split("\t")
    e_id = a[1]
    name = a[2]
    start = a[3]
    end = a[4]
    loc = a[5]
    tags = a[6]
    url = a[7]
    cur = con.cursor(mdb.cursors.DictCursor)
    cur.execute("INSERT INTO data_table VALUES (" + e_id + "," + name + "," + start + "," + end + "," + loc + "," + tags + "," + url + ");")

The "loc" piece of data very often has a comma in it because a lot of it is formatted "City, State", so MySQL interprets it as a value. I can't directly comment the commas out because they're being stored as variables. Is there a way around this?

4
  • I imagine that your data is safe since you're not escaping it in anyway? Commented May 31, 2013 at 14:41
  • Yes, it is. And adding ' around the values makes the code say I exceed the column count Commented May 31, 2013 at 14:47
  • Do you have any ' inside of any other values? If your data is all correctly input then you will not get that warning. Unless you do not have 7 columns in the table. Commented May 31, 2013 at 14:56
  • 1
    Regardless of whether the data is 'safe' or not, use the cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url)) syntax suggested by Miquel - it'll save you a lot of headaches. Commented May 31, 2013 at 15:00

3 Answers 3

3

Be careful when building queries in this format. Before and after concatenating an string you need to put commas. For example,

INSERT INTO data_table VALUES ('" + e_id + "','" + name + "','" + start + "','" + end + "','" + loc + "','" + tags + "','" + url + "');")

I hope it helps to solve your issue.

Also, you can use specific python syntax to perform the inserts:

cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url)); 
Sign up to request clarification or add additional context in comments.

Comments

2

Why is no one talking about prepared statements? This is exactly the use case. It's an easier syntax, and it's completely safe.

cur.execute("INSERT INTO data_table VALUES (?,?,?,?,?,?,?)", (e_id,name,start,end,loc,tags,url))

(syntax may be slightly different. See Does Python support MySQL prepared statements? )

1 Comment

Ah, I just saw Miguel's answer. However prepared statements aren't python-specific. They're supported by the database.
1

Your true problem likes in that your "City, State" line is a sequence of characters. You cannot directly insert a sequence of characters in SQL like this:

INSERT INTO test VALUES (My String);

Instead, you treat it like a string. MySQL expects strings to be wrapped in single quotes, so you would change the above to:

INSERT INTO test VALUES ('My String');

Now you've got the string 'My String' stored. Now, generating that dynamically isn't much different, except you're going to have to make sure it's wrapped in single quotes - like:

loc = "City, State"
sql = "INSERT INTO test VALUES ('" + loc + "');"

Notice the single quotes around where I'm inserting location.

The rest is additional information

This is a dangerous operation though because I'm allowing any value to directly enter my database, even if that values is "'); DROP DATABASE test; -- " which would do some damage. You'd think it's harmless but after plugged in you get:

INSERT INTO test VALUES(''); DROP DATABASE test; -- ');

And so now I've just lost all of my data. To fix this you want to escape values before putting them in your database which is as simple as MySQLdb.escape_string(str). Then you just:

loc = "'); DROP DATABASE test; -- "
loc = MySQLdb.escape_string(loc);
sql = "INSERT INTO test VALUES ('" + loc + "');"

And the result is:

INSERT INTO test VALUES ('\'); DROP DATABASE test; -- ');

Which will not result in any kind of damage to your database. This is not the only vulnerability that you open yourself to and it's a very simplified example.

2 Comments

Fine answer, but it lacks a pointer to the right direction, namely how to do it: namely with cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url)).
@glglgl I didn't demonstrate that as it was already done. I chose to use the code he was using and explain why it was incorrect instead of show him another way to do it. That way he can understand what was wrong with the original snippet of code. I'm not teaching Python, I'm answering a question with already written code.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.