
I cannot extract the thumbnail link with XPath from the following script/JSON:

<script type="application/ld+json">{"@context":"https://schema.org","@type":"VideoObject","name":"XYZ","description":"Text","thumbnailUrl":"https://www.thumbnail.com/thumb.jpg","uploadDate":"2000-01-01","contentUrl":"https://www.example.com"}</script>

I need the thumbnail link (https://www.thumbnail.com/thumb.jpg) that follows "thumbnailUrl", but I do not know how the selector should continue after

//script[@type="application/ld+json"]/...

Please help.

  • If your XPath processor supports XPath 3.1, then parse-json(//script[@type="application/ld+json"])?thumbnailUrl would do. Commented Apr 23, 2022 at 18:38

1 Answer


When you have a file named "script.html" with the content:

<script type="application/ld+json">{"@context":"https://schema.org","@type":"VideoObject","name":"XYZ","description":"Text","thumbnailUrl":"https://www.thumbnail.com/thumb.jpg","uploadDate":"2000-01-01","contentUrl":"https://www.example.com"}</script>

then:

xidel -s -e "//script[1]"  script.html  >temp.json
xidel -s -e '$json."thumbnailUrl"' temp.json

should output:

https://www.thumbnail.com/thumb.jpg

tested with: Xidel 0.9.8 (on Windows)

EDIT:

It is also possible in one step:

xidel -s -e "json(//script)/thumbnailUrl"  script.html
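
For anyone without Xidel or an XPath 3.1 processor, the same two steps (select the script element, then parse its text as JSON) can be sketched with Python's standard library. This is only a rough sketch and not part of the tested Xidel commands above: it swaps XPath for `html.parser`, and the names `LdJsonCollector` and `thumbnail_url` are made up for illustration.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element.

    Hypothetical helper, roughly equivalent to selecting
    //script[@type="application/ld+json"] and reading its text.
    """

    def __init__(self):
        super().__init__()
        self._inside = False  # True while inside a matching <script>
        self.blocks = []      # raw JSON text, one entry per matching element

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._inside:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def thumbnail_url(html):
    """Return thumbnailUrl from the first ld+json block that has one, else None."""
    collector = LdJsonCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        data = json.loads(raw)  # step 2: parse the element text as JSON
        if isinstance(data, dict) and "thumbnailUrl" in data:
            return data["thumbnailUrl"]
    return None


# Same markup as script.html in the answer.
html = (
    '<script type="application/ld+json">{"@context":"https://schema.org",'
    '"@type":"VideoObject","name":"XYZ","description":"Text",'
    '"thumbnailUrl":"https://www.thumbnail.com/thumb.jpg",'
    '"uploadDate":"2000-01-01","contentUrl":"https://www.example.com"}</script>'
)
print(thumbnail_url(html))  # https://www.thumbnail.com/thumb.jpg
```

Matching on the type attribute, as in the question's selector, keeps ordinary JavaScript script elements from being fed to the JSON parser.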

3 Comments

There's no need for a temporary file when you can do: xidel -s input.htm -e "json(//script[@type='application/ld+json'])/thumbnailUrl". And for the latest xidel release that's parse-json().
@Reino: I was not able to get an example using parse-json() working. Probably because of the mixed HTML/JSON.
parse-json() will only work with v0.9.9.
