I have an HTML body with 4 divs, each containing text. I use Scrapy Selectors to extract the text and write it to a CSV file. However, if a div has no text, the selector skips it. That's a problem because each result needs to line up with its column in the CSV, so empty divs need to come back as empty strings.
Desired result is:
blah,blah,,blah
Because of this requirement, the obvious one-liner does not work (Z is the HTML body):
csvfile.writerow(Selector(text=Z).xpath('//div/text()').extract())
It gives:
blah,blah,blah
Current code is:
jl = []
for sl in Selector(text=Z).xpath('//div'):
    g = sl.xpath('./text()').extract()
    jl.append(g)
csvfile.writerow(jl)
This almost works, but I get a list of lists back (the empty div yields an empty list, not an empty string):
[[u'blah'], [u'blah'], [], [u'blah']]
instead of what's desired:
blah,blah,,blah
If I attempt to flatten the list:
csvfile.writerow(sum(jl, []))
I'm back where I started: the empty lists contribute nothing to the flattened result, so the empty divs disappear again.
blah,blah,blah
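The flattening behaviour can be reproduced without Scrapy. Here jl is a stand-in for the list the loop above builds; the last line is one possible fix (a sketch, not tested against the real page): take the first item of each per-div list, defaulting to an empty string when the list is empty.

```python
# Simulate the per-div extraction results: the third div was empty,
# so .extract() returned an empty list for it rather than ''.
jl = [[u'blah'], [u'blah'], [], [u'blah']]

# sum(jl, []) concatenates the inner lists; the empty list adds
# nothing, so the empty div's slot silently vanishes.
flat = sum(jl, [])
print(flat)  # ['blah', 'blah', 'blah']

# Keep the slot instead: first item of each inner list, or '' if empty.
row = [g[0] if g else u'' for g in jl]
print(row)  # ['blah', 'blah', '', 'blah']
```

With this per-div default, csvfile.writerow(row) should produce blah,blah,,blah as desired.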