I couldn't find any answer to my problem so I hope it will be ok to ask here.
I am trying to scrape cinema shows and keep getting the following error.
What really confuses me is that the problem apparently lies in the pipeline. However, I have a second spider for an opera house with exactly the same code (only the place is different), and it works just fine. "Shows" and "Place" refer to my Django models. I've changed their fields to CharFields, so it's not a problem with a wrong date/time format.
I also tried using a dedicated Scrapy item, "KikaItem", instead of "ShowItem" (which is shared with my opera spider), but the error remains.
    class ScrapyKika(object):
        def process_item(self, ShowItem, spider):
            place, created = Place.objects.get_or_create(name="kino kika")
            show = Shows.objects.update_or_create(
                time=ShowItem["time"],
                date=ShowItem["date"],
                place=place,
                defaults={'title': ShowItem["title"]}
            )
            return ShowItem
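Since the error points at the pipeline, one check that does not need Django is whether every item actually carries the three fields that process_item reads. Looking up a field that was never populated on a Scrapy item raises KeyError, just as with a plain dict. The following is only a sketch: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and plain dicts with made-up values stand in for scraped items.

```python
# Fields that process_item reads from each item.
REQUIRED_FIELDS = ("title", "date", "time")


def missing_fields(item):
    """Return the required field names that are absent or empty in item."""
    return [name for name in REQUIRED_FIELDS if not item.get(name)]


# Plain dicts with made-up values stand in for scraped items (assumption).
complete = {"title": "some film", "date": "1 May", "time": "18:00"}
partial = {"title": "some film"}

print(missing_fields(complete))
print(missing_fields(partial))
```

Calling a helper like this at the top of process_item and logging its result would show whether the spider delivers incomplete items before the ORM is touched.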
Here is my spider code. I expect the problem is somewhere in here, because I used a different approach than in the opera spider. However, I am not sure what could be wrong.
    import scrapy
    from ..items import ShowItem, KikaItemLoader


    class KikaSpider(scrapy.Spider):
        name = "kika"
        allowed_domains = ["http://www.kinokika.pl/dk.php"]
        start_urls = [
            "http://www.kinokika.pl/dk.php"
        ]

        def parse(self, response):
            divs = response.xpath('//b')
            for div in divs:
                l = KikaItemLoader(item=ShowItem(), response=response)
                l.add_xpath("title", "./text()")
                l.add_xpath("date", "./ancestor::ul[1]/preceding-sibling::h2[1]/text()")
                l.add_xpath("time", "./preceding-sibling::small[1]/text()")
                return l.load_item()
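Whichever way the final return is indented, parse hands back at most one item, because return ends the callback. Scrapy callbacks may also be written as generators that yield one item per node. The difference can be sketched in plain Python, with made-up strings standing in for the `<b>` nodes; both function names are hypothetical.

```python
def parse_with_return(nodes):
    """Mimics a callback that returns inside the loop."""
    for node in nodes:
        return {"title": node}  # leaves the function on the first node


def parse_with_yield(nodes):
    """Mimics a generator callback that emits one item per node."""
    for node in nodes:
        yield {"title": node}


# Made-up stand-ins for the selected <b> nodes (assumption).
nodes = ["film a", "film b", "film c"]

first_only = parse_with_return(nodes)
all_items = list(parse_with_yield(nodes))

print(first_only)
print(len(all_items))
```

Another thing that may be worth checking: the loader is built with response=response, so the relative XPaths beginning with ./ are evaluated against the whole response and not against div. ItemLoader also accepts a selector argument (selector=div), which would anchor them to the current node.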
Here is the ItemLoader:
    class KikaItemLoader(ItemLoader):
        title_in = MapCompose(strip_string, lowercase)
        title_out = Join()
        time_in = MapCompose(strip_string)
        time_out = Join()
        date_in = MapCompose(strip_string)
        date_out = Join()
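For readers without the rest of the project, the processors above can be approximated in plain Python. This is a rough sketch only: `strip_string` and `lowercase` are not shown in the post, so the definitions here are guesses, and `map_compose` and `join` are simplified stand-ins for Scrapy's MapCompose and Join (the real MapCompose also drops None results and flattens returned lists).

```python
def strip_string(value):
    """Guessed helper: trim surrounding whitespace."""
    return value.strip()


def lowercase(value):
    """Guessed helper: convert to lower case."""
    return value.lower()


def map_compose(*functions):
    """Simplified stand-in: apply each function to every value, in order."""
    def process(values):
        for function in functions:
            values = [function(value) for value in values]
        return values
    return process


def join(values, separator=" "):
    """Simplified stand-in: join the values with a space by default."""
    return separator.join(values)


title_in = map_compose(strip_string, lowercase)

# Made-up raw text nodes, as an XPath might extract them (assumption).
raw = ["  Some Film ", " (2D) "]
print(join(title_in(raw)))
```

The point of the sketch is that the processors only clean up values that were extracted; if an XPath matches nothing, there is nothing for them to clean.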
Thank you for your time and sorry for any misspellings :)
