PHP Regex - How to append to a URL (in a string variable with a lot of text) where there is an <a href>

Question

I am working on an automation for building landing pages.

A user copy/pastes from a Word doc into a TinyMCE textarea, which creates the <a href> tags in the output.

So if I copy/paste something like this:

This is my Website.

from a Word doc, the output after submitting the form will look like this:

This is my <a href="http://www.google.com">Website</a>.

I want to append to every link within an <a href> tag (only within an <a href> tag!) something like this:

?utm=foo_foo_foo

so it will look like this:

This is my <a href="http://www.google.com?utm=foo_foo_foo">Website</a>.

P.S.: URLs may or may not end with "/". This shouldn't matter; it should work both ways.

P.S.2: TinyMCE adds the tags by itself (in case you missed me mentioning it). I just need to append to a string that looks like this:

$string = 'This is my <a href="http://www.google.com">Website</a>.';

Any code? Have you tried anything? Also, please let us know whether you are generating these hrefs yourself when building the page, or you already have a page and want to change all of its anchor tags.
– Mubin, Commented Nov 1, 2015 at 12:25
No code. I went through regex and preg_replace tutorials, but everything in them is basic and doesn't fit my needs. I'm also not sure I understand your second question.
– Imnotapotato, Commented Nov 1, 2015 at 12:29

chris85 · Accepted Answer · 2015-11-01 12:39:00Z

You should use a parser, not a regex for this.

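$html = 'This is my <a href="http://www.google.com">Website</a>.';
// Parse the HTML string into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
// Collect every <a> element; other tags with an href are not touched.
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    // Append the tracking parameter to the existing href value.
    $link->setAttribute('href', $link->getAttribute('href') . '?utm=foo_foo_foo');
}
// Serialize the whole document back to an HTML string.
echo $dom->saveHTML();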

Output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>This is my <a href="http://www.google.com?utm=foo_foo_foo">Website</a>.</p></body></html>
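
The output above includes the DOCTYPE and the html/body wrapper that loadHTML() adds. If only the original fragment is wanted back, one possible approach is to wrap the input in a container element and serialize just that container's children. This is a sketch rather than part of the original answer: the function name appendToLinks and the wrapper <div> are assumptions made for illustration.

```php
<?php
// Hypothetical helper (the name is an assumption): appends $suffix to every
// <a href> in an HTML fragment and returns the fragment without the
// DOCTYPE/html/body wrapper.
function appendToLinks(string $html, string $suffix): string
{
    $dom = new DOMDocument();
    // Wrap the fragment so its nodes can be found again after parsing.
    $dom->loadHTML('<div>' . $html . '</div>');

    foreach ($dom->getElementsByTagName('a') as $link) {
        // Skip anchors that have no href attribute (e.g. named anchors).
        if ($link->hasAttribute('href')) {
            $link->setAttribute('href', $link->getAttribute('href') . $suffix);
        }
    }

    // Serialize only the children of the wrapper <div>.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo appendToLinks(
    'This is my <a href="http://www.google.com">Website</a>.',
    '?utm=foo_foo_foo'
);
```

For the sample string, this should give back the original sentence with only the href changed. Like the code above, it still assumes the URL has no ? in it yet.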

If you had to use a regex, you could do:

$html = 'This is my <a href="http://www.google.com">Website</a>.';
echo preg_replace('~href=("|\')(.+?)\1~', 'href=$1$2?utm=foo_foo_foo$1', $html);

Output:

This is my <a href="http://www.google.com?utm=foo_foo_foo">Website</a>.

Both of these approaches assume the URL never already contains a ?.
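
To lift that restriction, the parameter can be joined with & when the URL already has a query string, and placed before any #fragment. The following is a minimal sketch under those assumptions, not the answer's own code: addQueryParam is a hypothetical helper name, and the parameter is passed without its leading ?.

```php
<?php
// Hypothetical helper: adds a "key=value" parameter to a URL, using "&" if a
// query string already exists and keeping any "#fragment" at the end.
function addQueryParam(string $url, string $param): string
{
    // Split off the fragment, if any, so the parameter goes before it.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }

    // Pick the separator based on what the URL already contains.
    if (strpos($url, '?') === false) {
        $sep = '?';   // no query string yet
    } elseif (substr($url, -1) === '?' || substr($url, -1) === '&') {
        $sep = '';    // query string is open-ended, nothing to add
    } else {
        $sep = '&';   // extend the existing query string
    }

    return $url . $sep . $param . $fragment;
}

$html = 'See <a href="http://www.google.com/?q=php#top">this</a> '
      . 'and <a href="http://www.google.com">that</a>.';
$dom = new DOMDocument();
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('a') as $link) {
    if ($link->hasAttribute('href')) {
        $link->setAttribute(
            'href',
            addQueryParam($link->getAttribute('href'), 'utm=foo_foo_foo')
        );
    }
}
echo $dom->saveHTML();
```

In the serialized HTML the & inside an href is written as &amp;, which is the correct escaping for an attribute value; browsers read it back as a plain &.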

edited Nov 1, 2015 at 12:39

answered Nov 1, 2015 at 12:35

chris85

8 Comments

Imnotapotato Over a year ago

Oooh, OK. I'm new to PHP. Can you give me a short explanation before I try it and start exploring "parsers"? Just guessing: parsers parse text/variables... so what makes them different from regex? From what I know, a regex searches for a pattern in a text and can change/add stuff to it.

Imnotapotato Over a year ago

I am free to use whatever I want in this code, so tell me what you think is better and why. Either way, I'm going to keep researching both now. Learning something new every day B|.

chris85 Over a year ago

There's a longer write-up on parsers here: stackoverflow.com/questions/3577641/…. Parsers are cleaner and have predefined functions. Also, if a failure occurs with a parser it won't destroy the content, whereas with a regex things can go very wrong.

Imnotapotato Over a year ago

Yes, the parser's code looks more organized, to be honest. Thanks, I'll try this in a second and let you know if it works.

chris85 Over a year ago

See my previous comment.
