I'm writing a web app that crawls data from Sony's PlayStation Store. I've found the JSON file that has the data I want, but how do I use Scrapy to store only certain elements of that JSON?
Here's part of the JSON data:
{
    "age_limit": 0,
    "attributes": {
        "facets": {
            "platform": [
                {"name": "PS4™", "count": 96, "key": "ps4"},
                {"name": "PS3™", "count": 5, "key": "ps3"},
                {"name": "PS Vita", "count": 7, "key": "vita"}
            ]
        }
    }
}
I only want the "count" value for the entry whose "name" is "PS4™". How would I get this in Scrapy? Here is my Scrapy code thus far:
import json

import scrapy
from crossbuy.items import PS4Vita

class PS4VitaSpider(scrapy.Spider):
    name = "ps4vita"  # Name of the spider, to be used when crawling
    allowed_domains = ["store.playstation.com"]  # Where the spider is allowed to go
    # Scrapy reads start_urls (plural), and it must be a list
    start_urls = ["https://store.playstation.com/chihiro-api/viewfinder/US/en/999/STORE-MSF77008-9_PS4PSVCBBUNDLE?size=30&gkb=1&geoCountry=US"]

    def parse(self, response):
        # json.loads needs the response body, not the Response object itself
        jsonresponse = json.loads(response.body)
        pass  # To be changed later
Thanks!
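How about a list comprehension?
[p["count"] for p in jsonresponse["attributes"]["facets"]["platform"] if p["name"] == "PS4™"]
It returns a list (here [96]), so index it with [0] if you want the bare number.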
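To make that concrete, here is a minimal sketch run against the sample from the question, outside of Scrapy. The helper name platform_count is hypothetical, and matching on the "key" field instead of "name" is my own assumption (it avoids depending on the ™ character); the sample string is just the snippet posted above, not the full API response.

```python
import json

# Sample payload copied from the question (only the part that was posted).
SAMPLE = """
{
    "age_limit": 0,
    "attributes": {
        "facets": {
            "platform": [
                {"name": "PS4™", "count": 96, "key": "ps4"},
                {"name": "PS3™", "count": 5, "key": "ps3"},
                {"name": "PS Vita", "count": 7, "key": "vita"}
            ]
        }
    }
}
"""


def platform_count(data, key):
    """Hypothetical helper: return the count for the platform with this key, or None."""
    # Chained .get() calls tolerate a response that lacks one of the nested keys.
    platforms = data.get("attributes", {}).get("facets", {}).get("platform", [])
    # next() with a default returns None instead of raising when nothing matches.
    return next((p["count"] for p in platforms if p.get("key") == key), None)


data = json.loads(SAMPLE)
print(platform_count(data, "ps4"))  # prints 96
```

Inside the spider's parse method you would presumably call it on the decoded response, e.g. platform_count(json.loads(response.body), "ps4"), and put the result into your item.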