The problem I am facing is that my Scrapy code, specifically the pipeline, raises a programming error: mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement

This is my code for the pipeline:

import csv
from scrapy.exceptions import DropItem
from scrapy import log
import sys
import mysql.connector

class CsvWriterPipeline(object):

    def __init__(self):
        self.connection = mysql.connector.connect(host='localhost', user='test', password='test', db='test')
        self.cursor = self.connection.cursor()

    def process_item(self, item, spider):
        self.cursor.execute("SELECT title, url FROM items WHERE title= %s", item['title'])
        result = self.cursor.fetchone()
        if result:

            log.msg("Item already in database: %s" % item, level=log.DEBUG)
        else:
            self.cursor.execute(
                "INSERT INTO items (title, url) VALUES (%s, %s)",
                (item['title'][0], item['link'][0]))
            self.connection.commit()

            log.msg("Item stored : " % item, level=log.DEBUG)
        return item

    def handle_error(self, e):
        log.err(e)

It gives me this exact error when I run the spider: http://hastebin.com/xakotugaha.py

As you can see, it clearly crawls, so I doubt there is anything wrong with the spider.

I am currently using the Scrapy web crawler with a MySQL database. Thanks for your help.

1 Answer

The error is happening while you are making the SELECT query. There is a single placeholder in the query, but item['title'] is a list of strings - it has multiple values, so the connector receives more parameters than there are placeholders to use them:

self.cursor.execute("SELECT title, url FROM items WHERE title= %s", item['title'])

The root problem is actually coming from the spider. Instead of returning a single item with multiple links and titles, you need to return a separate item for every link and title.


Here is the code of the spider that should work for you:

import scrapy

from scrapycrawler.items import DmozItem


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["snipplr.com"]

    def start_requests(self):
        for i in range(1, 146):
            yield self.make_requests_from_url("https://snipt.net/public/?page=%d" % i)

    def parse(self, response):
        for sel in response.xpath('//article/div[2]/div/header/h1/a'):
            item = DmozItem()
            item['title'] = sel.xpath('text()').extract()
            item['link'] = sel.xpath('@href').extract()
            yield item
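
With one item per link and title, the pipeline's queries line up with the data. For completeness, here is a minimal sketch of how process_item could look after the change (my sketch, not part of the original answer), indexing into the one-element lists that extract() returns:

    def process_item(self, item, spider):
        # each item now carries exactly one title and one link
        title = item['title'][0]
        link = item['link'][0]
        self.cursor.execute("SELECT title, url FROM items WHERE title = %s", (title,))
        if self.cursor.fetchone():
            log.msg("Item already in database: %s" % item, level=log.DEBUG)
        else:
            self.cursor.execute(
                "INSERT INTO items (title, url) VALUES (%s, %s)",
                (title, link))
            self.connection.commit()
            log.msg("Item stored: %s" % item, level=log.DEBUG)
        return item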

3 Comments

I see, how would I go about doing this? It worked before but for some reason stopped working. This is the spider code if you need it: hastebin.com/yalivovifo.py
Ah, is there any difference between for sel in response.xpath('//article/div[2]/div/header/h1/a'): and this? sel = Selector(response)
@CharlieC it's just that you need to yield item instances from inside the loop, for every link. Hope that makes things work.
