I have Scrapy code that scrapes a website and writes the results to MySQL:
import MySQLdb.cursors
from twisted.enterprise import adbapi
# pipeline methods (class statement omitted)
def __init__(self, stats):
    self.dbpool = adbapi.ConnectionPool(<dbnam>, host=<host>, user=<user>, port=<port>, passwd=<pwd>, db=<dbname>, cursorclass=MySQLdb.cursors.DictCursor, charset='utf8', use_unicode=True)
def process_item(self, item, spider):
    # run the insert in a pool thread and log any failure
    query = self.dbpool.runInteraction(self._conditional_insert, item)
    query.addErrback(self.handle_error)
The spider extracts a list of numbers from a table:
item['numbers'] = sites.xpath('//*[@id="numbers-0"]/tbody/tr/td/text()').extract()
I'm scraping the following content: 10″ 11″ 12″ etc. My code returns the following:
'numbers': [u'10\u2033', u'11\u2033', u'12\u2033'],
Inserting this into a MySQL database raises an error, which I'm guessing is due to a Unicode issue:
tx.execute("""INSERT INTO numbers ('{0}')""".format(", ".join(item['numbers'])))
Could you please help me get the insert to succeed? Better still, how can I remove the special character '\u2033' from the list?
Thanks in advance!
pymysql in place. And to install it, run sudo pip install PyMySQL.
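As for removing '\u2033' from the list, here is a minimal sketch of one way to do it, together with a parameterized insert so the values are never formatted into the SQL string. The helper names clean_numbers and build_insert and the column name number are hypothetical, not from the question, so adjust them to your schema. It also assumes each scraped number should become its own row.

```python
# -*- coding: utf-8 -*-
DOUBLE_PRIME = u'\u2033'  # the inch mark that follows each scraped number


def clean_numbers(values):
    """Remove the double-prime character and surrounding whitespace."""
    return [v.replace(DOUBLE_PRIME, u'').strip() for v in values]


def build_insert(values):
    """Return (sql, params) for a parameterized single-column insert.

    The column name `number` is an assumption made for illustration.
    """
    sql = "INSERT INTO numbers (number) VALUES (%s)"
    params = [(v,) for v in values]  # one tuple per row
    return sql, params


numbers = [u'10\u2033', u'11\u2033', u'12\u2033']
cleaned = clean_numbers(numbers)
sql, params = build_insert(cleaned)
```

Inside _conditional_insert the result could then be passed along as tx.executemany(sql, params), assuming the underlying cursor supports executemany as DB-API cursors normally do. Letting the driver bind the parameters also avoids the quoting problems that come from building the statement with format().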