Remove a href tags from a string using regex

Question

Hey, this may have been asked elsewhere, but I couldn't seem to find it.

Essentially, I'm just trying to remove the a tags from a string using regex in JavaScript.

So I have this:

<a href="www.google.com">This is google</a>

and I want the output to just be "This is google". How would this be done in JavaScript using regex? Thanks in advance!

SOLUTION:

OK, so the solution my boss gave me is as follows:

The best way to do that is in two parts. One is to remove all closing tags. Then you’re going to want to focus on removing the opening tag. Should be as straightforward as:

/<a\s+.*?>(.*)<\/a>/

Here .*? is the non-greedy version of "match anything", and the capture group (.*) holds the link text.
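A minimal sketch of how that pattern might be applied with String.prototype.replace, assuming the input is a plain string with simple, non-nested links on one line. The helper name stripAnchorTags is hypothetical, and two changes are assumptions beyond the original pattern: the inner group is made non-greedy and the g flag is added so that several links in one string are handled.

```javascript
// Hypothetical helper: replaces each <a ...>text</a> with just its text.
// Assumes simple, non-nested anchors; this is not a general HTML parser.
function stripAnchorTags(input) {
  // $1 refers to the first capture group, i.e. the link text.
  return input.replace(/<a\s+.*?>(.*?)<\/a>/g, '$1');
}

const single = stripAnchorTags('<a href="www.google.com">This is google</a>');
const multiple = stripAnchorTags(
  'See <a href="www.google.com">Google</a> or <a href="www.yahoo.com">Yahoo</a>.'
);

console.log(single);   // This is google
console.log(multiple); // See Google or Yahoo.
```

Note that . does not match line breaks by default, so a link whose text spans several lines would be left untouched by this sketch.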

Do you get this as a string, or is it part of your HTML code?
– George Rappel, Commented Aug 20, 2015 at 17:10

baao · Accepted Answer · 2015-08-20 17:15:26Z

2

This shouldn't be done with regex at all. Query the DOM instead, for example:

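// Collect every <a> element on the page.
var a = document.querySelectorAll('a');
// Convert the NodeList to an array and read each link's inner HTML.
var texts = [].slice.call(a).map(function(val){
   return val.innerHTML;
});
console.log(texts);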

<a href="www.google.com">this is google</a>

If you only have a string containing multiple <a href...> tags, you can create an element first:

var a_string = '<a href="www.google.com">this is google</a><a href="www.yahoo.com">this is yahoo</a>',
    el = document.createElement('p');
// Let the browser parse the string by assigning it to a detached element.
el.innerHTML = a_string;
var a = el.querySelectorAll('a');
var texts = [].slice.call(a).map(function(val){
   return val.innerHTML;
});
console.log(texts);

edited Aug 20, 2015 at 17:15

answered Aug 20, 2015 at 17:07

baao

Community · Accepted Answer · 2017-05-23 11:50:36Z

0

I don't know your case, but if you're using JavaScript you might be able to get the contents of the element with innerHTML. So, element.innerHTML might output This is google.

The reasoning is that regex really isn't meant to parse HTML.

If you really, really want a Regexp, here you go:

var pattern = />(.*)</;
var string  = '<a href="www.google.com">This is google</a>';
var matches = pattern.exec(string);
console.log(matches[1]); // This is google

This uses a match group to get the stuff inside > and <. This won't work with every case, I guarantee it.
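To illustrate one case where this breaks, here is a small sketch. The variable names and the two-link input are made up for the example. Because .* is greedy, the capture runs from the first > to the last < in the string.

```javascript
// The greedy pattern from the answer above.
const greedy = />(.*)</;

// Works for a single link.
const one = greedy.exec('<a href="www.google.com">This is google</a>')[1];

// With two links, .* runs from the first '>' to the last '<',
// so the capture swallows the markup in between.
const two = greedy.exec('<a href="a.com">first</a><a href="b.com">second</a>')[1];

console.log(one); // This is google
console.log(two); // first</a><a href="b.com">second
```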

edited May 23, 2017 at 11:50

answered Aug 20, 2015 at 17:12

Leroy

Arunesh Singh · Accepted Answer · 2015-08-20 17:15:53Z

0

Try this with a lookahead, and take the first capturing group.

(?=>).([^<]+)
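A sketch of how this pattern might be used with exec, assuming the same sample string as in the question. The lookahead only asserts a > that the following . then consumes, so the simpler pattern />([^<]+)/ should capture the same text. It is shown alongside for comparison, and the variable names are illustrative.

```javascript
const str = '<a href="www.google.com">This is google</a>';

// Pattern from the answer: assert '>', consume it with '.',
// then capture everything up to the next '<'.
const withLookahead = /(?=>).([^<]+)/.exec(str);

// Equivalent pattern without the lookahead.
const plain = />([^<]+)/.exec(str);

console.log(withLookahead[1]); // This is google
console.log(plain[1]);         // This is google
```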

answered Aug 20, 2015 at 17:15

Arunesh Singh

sinisake · Accepted Answer · 2015-08-20 17:14:44Z

-1

Here is one more way, using a capturing group. You match the whole element but grab just the link text:

    var re = /<a href=.+>(.+)<\/a>/;
    var str = '<a href="http://www.somesite.com">this is google</a>';
    var m = re.exec(str);

    // exec returns null when nothing matches, so guard before reading the group.
    if (m !== null) {
        console.log(m[1]); // this is google
    }

https://regex101.com/r/rL0bT6/1 Note: code created by regex101.

Demo: http://jsfiddle.net/ry83mhwc/

answered Aug 20, 2015 at 17:14

sinisake
