1

Hey, this may have been asked elsewhere, but I couldn't seem to find it.

Essentially, I'm just trying to remove the <a> tags from a string using regex in JavaScript.

So I have this:

<a href="www.google.com">This is google</a>

and I want the output to be just "This is google". How would this be done in JavaScript using regex? Thanks in advance!

SOLUTION:

OK, so the solution my boss provided goes as follows:

The best way to do that is in two parts. First, remove all the closing tags. Then focus on removing the opening tags. Should be as straightforward as:

/<a\s+.*?>(.*)<\/a>/

Here .*? is the non-greedy version of "match anything", so the opening tag ends at the first >, and the parenthesized group (.*) captures the link text.
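For anyone who wants to see that in use, here is a minimal sketch, assuming the input is a plain string with simple links (no nested tags, no > inside attribute values). The function names stripAnchorTags and stripAnchorTagsTwoPass are made up for illustration. The first one is the pattern above with two changes of my own: the g and i flags, and a non-greedy inner group so that several links in one string are handled one at a time. The second one follows the two-part description literally.

```javascript
// Hypothetical helper: replace each <a ...>text</a> with just its text.
function stripAnchorTags(html) {
  // $1 refers to the captured link text.
  return html.replace(/<a\s+.*?>(.*?)<\/a>/gi, '$1');
}

// Hypothetical helper: the two-part version.
function stripAnchorTagsTwoPass(html) {
  return html
    .replace(/<\/a>/gi, '')      // part one: drop every closing tag
    .replace(/<a\s+.*?>/gi, ''); // part two: drop every opening tag
}

var single = stripAnchorTags('<a href="www.google.com">This is google</a>');
console.log(single); // This is google

var several = stripAnchorTagsTwoPass('x <a href="a">one</a> and <a href="b">two</a>');
console.log(several); // x one and two
```

Like the original pattern, both versions require whitespace after <a, so a bare <a> with no attributes is left untouched.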

2
  • Do you get this as a string, or is this part of your HTML code? Commented Aug 20, 2015 at 17:10
  • @GeorgeRappel as a string Commented Aug 20, 2015 at 17:16

4 Answers

2

This shouldn't be done with regex at all, but like this for example:

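// Collect every <a> element on the page and read its contents.
var a = document.querySelectorAll('a');
var texts = [].slice.call(a).map(function(val){
   return val.innerHTML;
});
console.log(texts); // ["this is google"] for the markup below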
<a href="www.google.com">this is google</a>

If you only have a string containing multiple <a href...> elements, you can create an element first:

var a_string = '<a href="www.google.com">this is google</a><a href="www.yahoo.com">this is yahoo</a>',
el = document.createElement('p');
// Let the browser parse the string inside a detached element.
el.innerHTML = a_string;
var a = el.querySelectorAll('a');
var texts = [].slice.call(a).map(function(val){
   return val.innerHTML;
});
console.log(texts); // ["this is google", "this is yahoo"]


0

I don't know your case, but if you're using JavaScript you might be able to get the inside of the element with innerHTML. For example, element.innerHTML might output This is google.

The reasoning is that Regex really isn't meant to parse HTML.

If you really, really want a Regexp, here you go:

var pattern = />(.*)</;
var string  = '<a href="www.google.com">This is google</a>';
var matches = pattern.exec(string);
console.log(matches[1]); // This is google

This uses a capture group to get the stuff between > and <. This won't work with every case, I guarantee it.
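To make one of those failing cases concrete, here is a small sketch using a two-link input that I made up for illustration. The greedy .* runs from the first > to the last <, so it swallows the markup between the links. Making it non-greedy (.*?) is one possible mitigation, not a general fix.

```javascript
var twoLinks = '<a href="a">one</a><a href="b">two</a>';

// Greedy: .* extends to the last "<" in the string.
var greedyResult = />(.*)</.exec(twoLinks)[1];
console.log(greedyResult); // one</a><a href="b">two

// Non-greedy: .*? stops at the first "<" after the first ">".
var lazyResult = />(.*?)</.exec(twoLinks)[1];
console.log(lazyResult); // one
```

Even the non-greedy form only returns the first link's text and still breaks on nested tags, which is the underlying reason to prefer a real parser when one is available.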


0

Try this with a lookahead. Get the first capturing group.

(?=>).([^<]+)
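A short sketch of how that pattern might be used from JavaScript, with the question's string as input (the variable names are mine). The lookahead (?=>) only checks that the next character is >, and the . that follows consumes it, so in this case the pattern behaves the same as writing a literal > in front of the group.

```javascript
var input = '<a href="www.google.com">This is google</a>';

// Lookahead version: group 1 is everything after ">" up to the next "<".
var viaLookahead = /(?=>).([^<]+)/.exec(input);
console.log(viaLookahead[0]); // >This is google
console.log(viaLookahead[1]); // This is google

// Equivalent pattern without the lookahead.
var viaLiteral = />([^<]+)/.exec(input);
console.log(viaLiteral[1]); // This is google
```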



-1

One more way, using capturing groups. You basically match the whole element, but grab just one group:

    var re = /<a href=.+>(.+)<\/a>/;
    var str = '<a href="http://www.somesite.com">this is google</a>';
    var m = re.exec(str);

    // exec returns null when nothing matches, so check before reading the group.
    if (m !== null) {
        console.log(m[1]); // this is google
    }

https://regex101.com/r/rL0bT6/1 Note: code created by regex101.

Demo: http://jsfiddle.net/ry83mhwc/
