I'm trying to extract inline JavaScript that differs across thousands of URLs and is nested at various levels within the markup.

As I familiarize myself with XPath syntax, I'm wondering whether anyone knows a good way to target one specific inline script. For example:

<script type="text/javascript"> ...data_#...</script>
<script type="text/javascript"> ...data_#...</script>
<script type="text/javascript"> ...data_n...</script>
<script type="text/javascript"> ...data_#...</script>
<script type="text/javascript"> ...data_#...</script>

The only unique identifier within the <script>...data_n...</script> that I am trying to extract is that it contains:

var tabsRelated = ...

Within the confines of XPath, does anyone know a way to find the script that contains that variable and select the entire script? Something like:

//script[inner.text contains='var tabsRelated'

(I know this syntax is not valid.)

  • possible duplicate of xpath to get Node containing text Commented Nov 7, 2011 at 19:30
  • The question I am asking refers to a more complex problem. In the cited discussion, text() seems to apply only to HTML elements. I am unable to use it to isolate the inline JavaScript mentioned above. Commented Nov 7, 2011 at 20:09
  • XPath has no concept of JavaScript. It's just plain text as far as string searching is concerned. Find the script nodes, and check whether their text value contains the string you want. Commented Nov 7, 2011 at 20:10

1 Answer

Use:

//script[contains(., $someDistinguishingValue)]

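where $someDistinguishingValue should be replaced with the actual value. Here . is the script element itself, and contains() tests its string value, which is the script's text content. For the case in the question this would be //script[contains(., 'var tabsRelated')]. The XPath expression may, for example, be generated dynamically as a string, and that string then evaluated as an XPath expression using the available XPath API (such as the DOM method SelectNodes()).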
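
Generating the expression as a string could look roughly like the sketch below. It is only an illustration in Python: the helper names xpath_literal and script_xpath are made up for this example, and the quoting logic is there because XPath 1.0 string literals have no escape character, so a value containing both quote types has to be assembled with concat().

```python
def xpath_literal(value: str) -> str:
    """Return value as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types occur: split on single quotes and rejoin with concat(),
    # emitting each single quote as the double-quoted literal "'".
    pieces = []
    for i, part in enumerate(value.split("'")):
        if i > 0:
            pieces.append('"\'"')
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


def script_xpath(needle: str) -> str:
    """Build the XPath that selects <script> elements whose text contains needle."""
    return f"//script[contains(., {xpath_literal(needle)})]"


print(script_xpath("var tabsRelated"))
# prints: //script[contains(., 'var tabsRelated')]
```

The resulting string can then be handed to whichever XPath 1.0 engine is in use. Some APIs also accept XPath variables directly, in which case the $someDistinguishingValue form can be kept and the value bound at evaluation time instead of being spliced into the string.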
