
I have a function that removes HTML content using a regular expression:

a.replace( /<.*?>/g, "");

However, any spaces in the tag's content remain. For example:

<a href='site.com'>    testing</a>

This keeps the leading spaces. Similarly, for something like this:

<a href='site.com'>    $20</a>

I would like the function to return only 20. So, the question is:

How do I modify the regular expression so that $ and spaces get removed as well?

  • stackoverflow.com/questions/1732348/… Commented Aug 11, 2011 at 17:30
  • great thread, very popular, but it doesn't answer my question :p Commented Aug 11, 2011 at 17:47
  • @luquita: He's got a point though, you really should be using DOM methods for this kind of thing. Commented Aug 11, 2011 at 18:16
  • Simply use a.innerText = ""; or $(a).text(""); Regex is not the tool you're looking for. Commented Aug 11, 2011 at 18:39
  • Using DOM is a good point here. For example, jQuery("<a href='site.com'> $20</a>").text() returns " $20" (StackOverflow strips the spaces), which is easier to process. Continuing with jQuery(…).text().replace(/[\s$]*/, '') results in 20. Commented Aug 11, 2011 at 18:53

3 Answers


You could extend your expression and use:

a.replace( /(?:\s|\$)*<.*?>(?:\s|\$)*/g, "");

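The pattern now includes (?:\s|\$), which matches either a whitespace character (\s) or a literal dollar sign (\$). The backslash before the $ is necessary because an unescaped $ is an anchor that matches the end of the input (or the end of a line in multiline mode). Putting ?: directly after the opening parenthesis makes the group non-capturing: it groups the alternatives for matching but is not returned as a captured group.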

The group appears twice so that whitespace and $ signs are removed both before and after each tag.
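As a quick check of the explanation above, here is a minimal sketch that applies the extended pattern to the two inputs from the question. The helper name `stripTags` is hypothetical and only used for illustration. Note that `replace` returns a new string rather than modifying `a`, so the result has to be assigned or returned.

```javascript
// Hypothetical helper; the pattern is the one given in this answer.
function stripTags(input) {
  // Remove each tag together with any whitespace or "$" directly around it.
  return input.replace(/(?:\s|\$)*<.*?>(?:\s|\$)*/g, "");
}

const a = "<a href='site.com'>    testing</a>";
const b = "<a href='site.com'>    $20</a>";

console.log(stripTags(a)); // "testing"
console.log(stripTags(b)); // "20"

// Caveat: whitespace that separates text from a tag is removed as well.
console.log(stripTags("Price: <b>$20</b>")); // "Price:20"
```

If the space after "Price:" matters, the leading group can be dropped so that only characters after a tag are removed.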


3 Comments

And the ?: is there to make it more l33t. :-)
As mentioned, the ?: is there so that the whitespace-or-$ group does not count as a capturing group. It does not matter for replacing, but if somebody wants to capture some groups, it prevents surprises. I was thinking the asker might want to reuse the pattern and extend it to search for elements of the string. Without the ?:, the asker would then get unexpected additional captured groups.
You can just replace the (?:\s|\$) with [\s$] ([\s\$] in some regex flavors).

Alternatively:

a.replace( /<.*?[>][\s$]*/g, "");
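A small sketch of how this variant behaves, assuming the same inputs as in the question (the name `stripAfterTag` is hypothetical). It only removes whitespace and `$` that directly follow a tag, so such characters elsewhere in the string are left alone.

```javascript
// Hypothetical helper wrapping the pattern from this answer.
const stripAfterTag = (s) => s.replace(/<.*?[>][\s$]*/g, "");

// Whitespace and "$" right after the opening tag are removed.
console.log(stripAfterTag("<a href='site.com'>    $20</a>")); // "20"

// Whitespace and "$" that do not directly follow a tag are kept.
console.log(stripAfterTag("  $20 <b>x</b>")); // "  $20 x"
```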


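Or, to also remove whitespace and dollar signs when no HTML tag is present (note that this removes every whitespace character and $ in the string, including spaces between words):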

a.replace( /(<.*?>)|([\s$])/g, "");
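To illustrate the trade-off, here is a short sketch (the name `stripAll` is hypothetical). Because the second alternative matches any single whitespace character or `$` anywhere in the input, it works without tags but also joins words together.

```javascript
// Hypothetical helper wrapping the pattern from this answer.
const stripAll = (s) => s.replace(/(<.*?>)|([\s$])/g, "");

// Works even when there is no tag at all.
console.log(stripAll("   $20")); // "20"

// Works on the input from the question.
console.log(stripAll("<a href='site.com'>    $20</a>")); // "20"

// Side effect: spaces between words are removed too.
console.log(stripAll("<p>two words</p>")); // "twowords"
```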
