Encoding Issue when using Item in Scrapy in Python

Question

My Item class is as follows:

class SkeletonItem(scrapy.Item):
    name = scrapy.Field()

In the parse() function of spider class, the logic is as follows:

soup = BeautifulSoup(response.body, 'html.parser')
si = SkeletonItem()
si['name'] = li.find("div", {"class": "info-panel"}).find("h2").text.encode('utf-8')
print si['name'] # str'中文'
print si         # str'\u1234\u5678'
return si

As we can see, printing si['name'] displays the text correctly, whereas printing si as a whole shows only Unicode escape sequences. As a result, when I write si to a file, the file also contains only the escape sequences instead of the readable characters.

Could anyone suggest how to fix this? Many thanks.

Djunzu · Accepted Answer · 2016-06-14 00:37:24Z


I think you will find your answer here.
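
A minimal sketch of one likely cause, assuming the item is serialized as JSON when it is written to file (the question does not say which format is used). Python's json module escapes non-ASCII characters by default, and passing ensure_ascii=False keeps them readable. The plain dict standing in for SkeletonItem, the helper write_item, and the file name in the usage note are hypothetical, chosen only for illustration.

```python
# -*- coding: utf-8 -*-
import io
import json

# A plain dict stands in for SkeletonItem (an assumption, so this runs without Scrapy).
# The value is kept as text rather than UTF-8 encoded bytes.
item = {"name": u"中文"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(item)

# ensure_ascii=False leaves the characters as they are.
readable = json.dumps(item, ensure_ascii=False)


def write_item(path, data):
    """Append one item to `path` as a UTF-8 encoded JSON line (hypothetical helper)."""
    with io.open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(data, ensure_ascii=False) + u"\n")
```

Calling write_item("items.jl", item) would then store the characters themselves rather than their escapes. If the file comes from Scrapy's own feed exports instead, recent Scrapy versions provide a FEED_EXPORT_ENCODING setting that can be set to 'utf-8' for a similar effect.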
