I am trying to extract pricing and other attributes from this JS-Code:
<script type="application/ld+json">
{
"@context": "http://schema.org/",
"@type": "Product",
"name": "Rolex Cellini Time 50505",
"image": [
"https://chronexttime.imgix.net/S/1/S1006/S1006_58774a90efd04.jpg?w=1024&auto=format&fm=jpg&q=75&usm=30&usmrad=1&h=1024&fit=clamp" ],
"description": "Werk: automatic; Herrenuhr; Gehäusegröße: 39; Gehäuse: rose-gold; Armband: leather; Glas: sapphire; Jahr: 2018; Lieferumfang: Originale Box, Originale Papiere, Herstellergarantie",
"mpn": "S1006",
"brand":{
"@type": "Thing",
"name": "Rolex"
},
"offers":{
"@type": "Offer",
"priceCurrency": "EUR",
"price": "11500",
"itemCondition": "http://schema.org/NewCondition",
"availability": "http://schema.org/InStock",
"seller":{
"@type": "Organization",
"name": "CHRONEXT Service Germany GmbH"
}
}
}
</script>
Alternatively this code might do it as well:
<script type="text/javascript">
window.articleInfo = {
'id': 'S1006',
'model': 'Cellini Time',
'brand': 'Rolex',
'reference': '50505',
'priceLocal': '11500',
'currencyCode': 'EUR'
};
There is a lot of other JS code on the same page, so I am not sure how to address this particular script with XPath.
I tried this:
response.xpath('//script[contains(.,"price")]/text()').extract_first()
but the result contains a bunch of values, while I am only looking for the price of 11500. Later on I would also like to get, e.g., the name and condition.
"""//script/substring-before(substring-after(., '"price": '), ',') | //script/substring-before(substring-after(., "'priceLocal': "), ",") """response.xpath('''//script/substring-before(substring-after(., '"price": '), ',')''').extract_first()