This is probably quite simple, but I can't get there...

How can I store HTML code in a SQLite database?

I'm using TEXT as the data type for the column in the DB (should it be BLOB?).

I'm getting weird errors (and the errors change with the same input, so I think it has something to do with escaping).

MY CODE:

con = sqlite.connect(bd)
cur = con.cursor()
temp=cur.execute ('SELECT * from posts').fetchall()
#temp[Z][1] = ID
#temp[Z][4] = URL
i=0
while i< len (temp):
    if temp[i][0]==None:
        try:
            html = urllib2.urlopen(str(temp[i][4])).read()
        except:
            html=None
        #sql = 'UPDATE posts SET html = "' + str(html) + '" WHERE  id = ' +  str(temp[i][1])
        #cur.execute( 'UPDATE posts SET html = ? WHERE  id = ?' ,(html,temp[i][1]) )
        cur.execute("UPDATE posts SET html = '" + str(html) + "' WHERE  id = " +  str(temp[i][1]))
        con.commit()
        print temp[i][4]
    i=i+1

The errors:

1 -

OperationalError: near "2": syntax error WARNING: Failure executing file: Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) Type "copyright", "credits" or "license" for more information.

2-

ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

P.S. I would rather store it as text (human-readable) than as a blob, but if a blob is the easier way, I'm all for it.

Thanks

1 Answer

Try:

cur.execute(
    "UPDATE posts SET html = ? WHERE id = ?", (html ,temp[i][1]))

Use parameterized arguments to allow sqlite3 to escape the quotes for you. (It also helps prevent SQL injection.)
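As a minimal, self-contained sketch of why this works, the snippet below stores HTML containing both single and double quotes and reads it back unchanged. The in-memory database, the three-column posts table, and the sample row are assumptions made for the demo, not the schema from the question.

```python
import sqlite3

# Assumption: a throwaway in-memory database and a simplified posts table.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, url TEXT, html TEXT)")
cur.execute("INSERT INTO posts (id, url) VALUES (?, ?)",
            (1, "http://example.com/"))

# HTML with both quote styles, which would break a query built by
# string concatenation.
html = u"""<p class="note">It's 2 o'clock</p>"""

# The driver binds the values itself, so no manual escaping is needed.
cur.execute("UPDATE posts SET html = ? WHERE id = ?", (html, 1))
con.commit()

stored = cur.execute("SELECT html FROM posts WHERE id = ?", (1,)).fetchone()[0]
print(stored == html)
```

The same placeholders also handle html being None: it is stored as SQL NULL rather than as the literal text 'None'.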

Regarding the ProgrammingError: html should be a unicode object rather than a byte string (a str object in Python 2). When you open the URL:

response=urllib2.urlopen(str(temp[i][4]))

Look at the content type header:

content_type=response.headers.getheader('Content-Type')
print(content_type)

It might say something like

'text/html; charset=utf-8'

in which case you should decode the html string with the utf-8 codec:

html = response.read().decode('utf-8')

This will make html a unicode object, and (hopefully) address the ProgrammingError.
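Not every page is served as utf-8, so one option is to read the charset from the Content-Type header instead of hard-coding it. The sketch below is only an illustration: charset_from_content_type and decode_body are hypothetical helper names, and falling back to utf-8 with errors replaced is an assumption rather than a rule. It works on a bytes value directly so it can be tried without a network connection.

```python
def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or default.

    Hypothetical helper: does simple parsing of 'type/subtype; key=value'.
    """
    if not content_type:
        return default
    # Skip the media type itself and look at the parameters after it.
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.strip().lower() == "charset" and value:
            return value.strip().strip("\"'")
    return default


def decode_body(raw_bytes, content_type):
    """Decode the raw response body into a unicode string.

    Assumption: undecodable bytes are replaced instead of raising an error.
    """
    return raw_bytes.decode(charset_from_content_type(content_type), "replace")


# Stand-in for response.read() and the Content-Type header value.
raw = u"<p>caf\u00e9</p>".encode("utf-8")
html = decode_body(raw, "text/html; charset=utf-8")
```

With a real response, raw would be response.read() and the header value would be the content_type string obtained above. A charset name that Python does not recognise would still raise LookupError, which this sketch does not handle.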
