Before I start I just want to say that I am completely new to Regex so please be gentle with me. Any comments about Regex in general will be greatly appreciated.

I have written the code below:

var str = "<blah blah more <b>test</b>>";

var reg1 = "<(?!b>)(?!/b>)";
str = str.replace(new RegExp(reg1), "&lt;"); 

var reg2 = ">(?<!b>)(?<!/b>)";
str = str.replace(new RegExp(reg2), "&gt;"); 

alert(str);

I have checked the regexes using http://regexr.com?2toe2 and they do what I want, which is to match any < or > but only if it is not part of an HTML tag. I have only covered <b> for now.

Now if you run this code, http://jsfiddle.net/ashburlaczenko/JdATY/9/, the alert is never executed. I put an alert after the first replace and it was displayed, so the error is in the second stage.

Can anyone help me? Please remember these Regexs are my first attempt.

Thank you in advance.

Edit:

<blah blah more <b>test</b>><another <b>blah</b> blah <b>test</b>>

Should become

&lt;blah blah more <b>test</b>&gt;&lt;another <b>blah</b> blah <b>test</b>&gt;

Hope this is clearer.

  • Check out this website: rubular.com. It's really helpful when constructing regexes. Commented Jul 28, 2013 at 7:52

2 Answers


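JavaScript doesn't support look-behind assertions in regular expressions, which is why the second stage fails: new RegExp throws a syntax error on (?<!...) and the script stops before the alert. (Look-behind was only added to the language later, in ES2018.) You can mimic it, with a little help from this blog post, but it's still not that great.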
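
One way to avoid look-behind altogether is to match the allowed tags and the stray brackets with a single alternation, then decide in a replacement callback which matches to escape. This is only a minimal sketch, assuming the only tags to keep are <b> and </b> as in the question; the function name escapeStrayBrackets is made up for illustration.

```javascript
// Hypothetical helper (not part of the original answer): escapes every < and >
// except those that belong to a literal <b> or </b> tag.
function escapeStrayBrackets(input) {
    // Alternation order matters: the tag alternative is tried first, so the
    // brackets of <b> and </b> are consumed as part of a whole-tag match.
    return input.replace(/<\/?b>|[<>]/g, function (match) {
        if (match === '<') return '&lt;';
        if (match === '>') return '&gt;';
        return match; // a whole <b> or </b> tag, left untouched
    });
}

var str = "<blah blah more <b>test</b>><another <b>blah</b> blah <b>test</b>>";
console.log(escapeStrayBrackets(str));
// &lt;blah blah more <b>test</b>&gt;&lt;another <b>blah</b> blah <b>test</b>&gt;
```

Because the tags are matched as whole units, no assertion about what precedes a > is needed. Supporting more tags would mean extending the first alternative of the pattern.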


1 Comment

Thank you for your answer. Do you know how I could write this without using look-behind assertions?
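// Callback for the first replace:
// $1 = first bracket, $2 = text in between, $3 = repeated bracket.
function escapeHtml($0, $1, $2, $3)
{
    var result = $1 == '<' ? '&lt;' + $2 + $3 : $1 + $2 + '&gt;';
    return result;
}

// Escape the outer bracket of <...< and >...> pairs.
str = str.replace(/([<>])([^<>]*?)(\1)/g, escapeHtml);
// Escape a leading > that has no < before it.
str = str.replace(/^([^<]*?)>/, '$1&gt;');
// Escape a trailing < that has no > after it.
str = str.replace(/<([^>]*?)$/, '&lt;$1');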

This one works for the examples given.

The first replace does the following:

  • It looks for repetitions of < or > that don't have < or > in between them e.g. <...< or >...>.
  • The first bracket is $1, the text in between is $2, and the last bracket is $3.
  • If $1 is <, then $1 is escaped, otherwise $3 is escaped.

The second replace escapes the first > in the string if no < occurs anywhere before it.

The third replace escapes a < if no > occurs anywhere after it, up to the end of the string.
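
To see which step handles which case, the three replaces can be wrapped in a function and run on a few inputs. The wrapper name escapeBrackets and the short test strings are illustrative assumptions, not part of the answer.

```javascript
// Callback for the first replace: $1 is the first bracket, $2 the text in
// between, $3 the repeated bracket.
function escapeHtml($0, $1, $2, $3) {
    return $1 == '<' ? '&lt;' + $2 + $3 : $1 + $2 + '&gt;';
}

// Hypothetical wrapper around the three replaces.
function escapeBrackets(str) {
    str = str.replace(/([<>])([^<>]*?)(\1)/g, escapeHtml); // <...< and >...>
    str = str.replace(/^([^<]*?)>/, '$1&gt;');             // leading stray >
    str = str.replace(/<([^>]*?)$/, '&lt;$1');             // trailing stray <
    return str;
}

console.log(escapeBrackets("<blah blah more <b>test</b>>"));
// &lt;blah blah more <b>test</b>&gt;   (first replace, two matches)
console.log(escapeBrackets("a > b"));    // a &gt; b   (second replace)
console.log(escapeBrackets("a < b"));    // a &lt; b   (third replace)
console.log(escapeBrackets("<i>x</i>")); // <i>x</i>   (unchanged)
```

The last call shows a limit of this approach: it never looks at tag names, so any balanced pair of brackets is treated as a tag and left alone, not only <b> and </b>.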

3 Comments

That works for the example, but I wasn't very clear about what I need. The regex should replace every < with &lt; unless it's part of an HTML tag, and for now we can say the only valid HTML tags are <b> and </b>. The same should apply to >. Hope this is clearer.
Also, do you know whether vbscript regexs can contain look-behinds?
I don't even know if vbscript has regexps sorry! Have you got a bigger example, so I can see what's wrong with my 2nd one.
