0

I want to save the following html into mysql database.

 <table border="0" cellpadding="2" cellspacing="7" style="vertical-align:top;"><tr><td     width="80" align="center" valign="top"><font style="font-size:85%;font-family:arial,sans-serif"><a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNFJZ2wrps7WAV59VWbRP_g0aQ4oew&amp;url=http://www.usatoday.com/money/companies/story/2012-06-09/cost-lunch-warren-buffett/55476718/1"><img src="//nt3.ggpht.com/news/tbn/n9IJTSGY5UhSzM/6.jpg" alt="" border="1" width="80" height="80" /><br /><font size="-2">USA TODAY</font></a></font></td><td valign="top" class="j"><font style="font-size:85%;font-family:arial,sans-serif"><br /><div style="padding-top:0.8em;"><img alt="" height="1" width="1" /></div><div class="lh"><a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNGyE1zK-V0YHEoCAnXZhUxElVsH_Q&amp;url=http://www.washingtonpost.com/national/private-lunch-with-investor-warren-buffett-sells-for-35-million-in-charity-auction/2012/06/08/gJQAEp07OV_story.html"><b>Private lunch with investor Warren Buffett sells for $3.5 million in charity <b>...</b></b></a><br /><font size="-1"><b><font color="#6f6f6f">Washington Post</font></b></font><br /><font size="-1">OMAHA, Neb. — The cost to dine with investor Warren Buffett has apparently spiked in value, with one deep-pocketed bidder forking over nearly $3.5 million during a charity auction Friday night. The annual auction for a private lunch with the Nebraska <b>...</b></font><br /><font size="-1"><a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNHIe3njmIFMpS9E6-co2zswxEsBZQ&amp;url=http://www.google.com/hostednews/ap/article/ALeqM5jJKkBW1l-UDJC2cTIbtdsnf3HVPA?docId%3D83206c782d534f83ac832145ba2d9c1d">Cost to lunch with Warren Buffett: $3.5 million</a><link rel="syndication-source" href="www.ap.org/83206c782d534f83ac832145ba2d9c1d" /><font size="-1" color="#6f6f6f"><nobr>The Associated Press</nobr></font></font><br /><font size="-1"><a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNEiOqIjoDlaluVXLlwIYI-_3wCABw&amp;url=http://www.sfgate.com/cgi-bin/article.cgi?f%3D/g/a/2012/06/08/bloomberg_articlesM4UMRV0D9L3701-M5BYZ.DTL">Buffett Lunch Auction Raises Record $3.46 Million for Charity</a><font size="-1" color="#6f6f6f"><nobr>San Francisco Chronicle</nobr></font></font><br /><font size="-1"><a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNE07-nsDiD0R0w_g7Juy6bU_oPv7w&amp;url=http://money.cnn.com/2012/06/08/investing/buffett-auction-lunch/">Bidding for lunch with Buffett hits $400000</a><font size="-1" color="#6f6f6f"><nobr>CNN</nobr></font></font><br /><font size="-1" class="p"></font><br /><font class="p" size="-1"><a class="p" href="http://news.google.com/news/more?ned=us&amp;topic=b&amp;ncl=divwXhAe6uG0iLMj4RF1YfpKIhlsM"><nobr><b>all 446 news   articles&nbsp;&raquo;</b></nobr></a></font></div></font></td></tr></table>

There is no error report but there is no data in DB.

I'm using the following code to escape html. I use Python 2.6.6

  def get_feeds():
    import time,datetime
    import MySQLdb
    import cgi
    import xml.etree.ElementTree as etree
    from urllib import urlopen
    db=MySQLdb.connect("localhost","test","test","headline")
    cursor = db.cursor()
    URL = "http://news.google.com/news?cf=all&ned=us&hl=en&topic=b&output=rss"
    tree = etree.parse(urlopen(URL))
    items = tree.findall('.//item')
    for item in items:
            link = item.find('link')


            title = item.find('title')
            title_text = title.text
            #print(title_text)
            desc = item.find('description')
            desc_text = cgi.escape(desc.text)
            #print(desc_text)
            timestamp = item.find('pubDate')
            timestamp2 = time.mktime(time.strptime(timestamp.text, '%a, %d %b %Y %H:%M:%S GMT'))
            guid = item.find('guid')
            guid_text = guid.text
            link = item.find('link')
            link_text = link.text
            #save item to DB
            sql = 'insert into headline(guid,title,link,description,timestamp) values ("%s","%s","%s","%s","%d")' % \
                     (guid_text,title_text,link_text,desc_text,timestamp2)

            #try:
            cursor.execute(sql)
            db.commit()
            #except:
            #       db.rollback()   

    db.close()

if __name__ == '__main__':
    get_feeds()
13
  • Be sure you .commit() after any insert statement(s). Commented Jun 9, 2012 at 5:45
  • I have try: cursor.execute(sql) db.commit() except: db.rollback() Commented Jun 9, 2012 at 5:45
  • I'm new in Python. If data can not be insert, why there is no error message? Commented Jun 9, 2012 at 5:47
  • don't use the naked except:. it is hiding the real error message. Commented Jun 9, 2012 at 5:48
  • @Sophia: please post the code. Commented Jun 9, 2012 at 5:49

1 Answer 1

1

We have been working through other issues in the comments, but this one is too long to put in a comment...

You need to pass your data as a second argument to .execute().
This will allow MySQLdb to handle quoting appropriately for you.
It will also make your code more resistant to SQL-injection attack.

Also, note that all parameter placeholders are %s regardless of type as specified in the DB-API.
Example:

sql = '''insert into headline (guid,title,link,description,timestamp) 
                       values ("%s","%s","%s","%s","%s");'''
cursor.execute(sql, (guid_text,title_text,link_text,desc_text,timestamp2))

To summarize a couple of the aforementioned issues:

  1. A naked except: was hiding the true error.
  2. The character set of the MySQL database was changed and the appropriate charset kwarg was used when calling MySQLdb.connect().
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.