-2

I have a .js file containing text like the snippet below. I want to extract all of the href URLs and assign each one to a variable inside a loop for further processing. How can I do this? Thanks very much.

 document.write('<tr bgcolor="#6691BC">');
 document.write('<td width="15" height="25">&nbsp;</td>');
 document.write('<td width="690" height="25" class="headertext">');

 document.write('<a href="../myspace.com/index.html" class="headerLink" style="color: #ffffff;">My Space</a>&nbsp;&nbsp;|&nbsp;&nbsp;');

 document.write('<a href="../technotes.com/index.html" class="headerLink" style="color: #ffffff;">Tech Notes</a>&nbsp;&nbsp;|&nbsp;&nbsp;');

 document.write('<td width="15" height="25">&nbsp;</td>');
 document.write('</tr>');
  • XSLT is not a Javascript parser, and arbitrary JS is not valid XML. Commented Feb 4, 2014 at 3:54
  • XSLT can solve the Towers of Hanoi, so I don't think substring matching is beyond its capabilities, but it won't be easy or pretty... Commented Feb 4, 2014 at 3:56
  • -1 XSLT is for XML only. Commented Feb 4, 2014 at 9:09

2 Answers

1

I would take a different approach: first convert your HTML into a single XHTML string (note that the original is missing a </td>, and each & needs to be escaped as &amp;).

var xhtml = [
'<tr bgcolor="#6691BC">',
  '<td width="15" height="25">&amp;nbsp;</td>',
  '<td width="690" height="25" class="headertext">',
    '<a href="../myspace.com/index.html" class="headerLink" style="color: #ffffff;">My Space</a>&amp;nbsp;&amp;nbsp;|',
    '<a href="../technotes.com/index.html" class="headerLink" style="color: #ffffff;">Tech Notes</a>',
  '</td>',
  '<td width="15" height="25"><a id="JustAnAnchor">Anchor</a></td>',
'</tr>'].join("");

document.write(xhtml);

You'll then need to solve the challenge of applying the XSLT transform in JavaScript.
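One possible way to do that in a browser is the built-in DOMParser and XSLTProcessor APIs. The sketch below is only an illustration: the helper name transformToText and its two parameters are made up for this example, and it assumes the code runs in a browser page, since neither API exists in plain Node.js.

```javascript
// Hypothetical helper: apply an XSLT stylesheet (given as a string) to an
// XML string and return the text result. Browser-only.
function transformToText(xmlString, xsltString) {
  if (typeof DOMParser === "undefined" || typeof XSLTProcessor === "undefined") {
    throw new Error("DOMParser and XSLTProcessor are only available in browsers");
  }

  // Parse both inputs as XML documents.
  const parser = new DOMParser();
  const xmlDoc = parser.parseFromString(xmlString, "application/xml");
  const xsltDoc = parser.parseFromString(xsltString, "application/xml");

  // Load the stylesheet and run the transform.
  const processor = new XSLTProcessor();
  processor.importStylesheet(xsltDoc);

  // With <xsl:output method="text"/>, the fragment should contain only text.
  const fragment = processor.transformToFragment(xmlDoc, document);
  return fragment.textContent;
}
```

It would be called as transformToText(xhtml, stylesheetSource), where xhtml is the string built above and stylesheetSource is a hypothetical variable holding the stylesheet text.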

The following XSLT extracts the href from every <a href> element and writes the values out as a comma-delimited list that you can then use back in JavaScript. (There should be no need to remove the trailing comma.)

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <xsl:apply-templates select="//a[@href]"></xsl:apply-templates>
    </xsl:template>

    <xsl:template match="a">'<xsl:value-of select="@href"/>',</xsl:template>
</xsl:stylesheet>

Output:

'../myspace.com/index.html','../technotes.com/index.html',
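As a rough sketch of using that output back in JavaScript, the quoted items can be pulled into an array and looped over. The helper name parseQuotedList is invented for this example, and it assumes no href contains a single quote. Because it matches only the quoted items, the trailing comma is ignored.

```javascript
// Text produced by the stylesheet (copied from the Output line above).
const xsltOutput = "'../myspace.com/index.html','../technotes.com/index.html',";

// Hypothetical helper: return the contents of each single-quoted item.
// Assumes the items themselves contain no single quotes.
function parseQuotedList(text) {
  return Array.from(text.matchAll(/'([^']*)'/g), (match) => match[1]);
}

const hrefs = parseQuotedList(xsltOutput);

for (const href of hrefs) {
  // Process each URL here; logging stands in for real work.
  console.log(href);
}
```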

0

XSLT cannot easily parse JavaScript. It's the wrong tool for the job.

Here are some approaches you could pursue:

(1) Run the JavaScript, capture the resulting document, then use XSLT on that. This may be troublesome if the document is not well-formed XML.

(2) Use regular expressions, e.g. grep, perl -e, or the JavaScript match function.

(3) Run the JavaScript, then use document.querySelectorAll('*[href]') to grab all the elements with an href and work from there.
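Approach (2) might look something like the following in JavaScript. This is a minimal sketch, not a general HTML parser: the helper name extractHrefs is made up, the sample input is adapted from the question, and the pattern assumes every href value is wrapped in double quotes, as it is in the question's file.

```javascript
// Sample input adapted from the question's .js file.
const source = [
  `document.write('<a href="../myspace.com/index.html" class="headerLink" style="color: #ffffff;">My Space</a>');`,
  `document.write('<a href="../technotes.com/index.html" class="headerLink" style="color: #ffffff;">Tech Notes</a>');`,
  `document.write('<td width="15" height="25">&nbsp;</td>');`,
].join("\n");

// Hypothetical helper: collect the value of every double-quoted href attribute.
function extractHrefs(text) {
  const pattern = /href\s*=\s*"([^"]*)"/g;
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

for (const url of extractHrefs(source)) {
  // Process each URL here; logging stands in for real work.
  console.log(url);
}
```

In practice the source string would come from reading the .js file rather than from an inline literal.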
