Using RegEx to Capture All Links & In Between Text From A String

Question

<Link to: http://www.someurl(.+)> maybe some text here(.*) <Link: www.someotherurl(.+)> maybe even more text(.*)

Given that this is all on one line, how can I match, or better yet extract, all full URLs and text? For this example, I wish to extract:

http://www.someurl(.+) . maybe some text here(.*) . www.someotherurl(.+) . maybe even more text(.*)

Basically, <Link.*:.* would start each link capture and > would end it. All text after that capture should be captured as well, up to the next link capture if there is one.

I have tried:

preg_match_all('/<Link.*?:.*?(https|http|www)(.+?)>(.*?)/', $v1, $m4);

but I need a way to capture the text after the closing >. The problem is that there may or may not be another link after the first one (of course there could also be no links to begin with!).

It might be easier to try preg_split using a pattern for a full URL.
– CrayonViolent, Commented Dec 10, 2013 at 20:58

CrayonViolent · Accepted Answer · 2013-12-10 21:10:55Z

2

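$string = "<Link to: http://www.someurl(.+)> maybe some text here(.*) <Link: www.someotherurl(.+)> maybe even more text(.*)";
// Split on each <Link ...: url> tag. The capture group keeps the URL in the result,
// and PREG_SPLIT_NO_EMPTY drops the empty piece before the first tag.
$parts = preg_split('~<link(?: to)?:\s*([^>]+)>~i', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
echo "<pre>";
print_r($parts);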

output:

Array
(
    [0] => http://www.someurl(.+)
    [1] =>  maybe some text here(.*) 
    [2] => www.someotherurl(.+)
    [3] =>  maybe even more text(.*)
)
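
With PREG_SPLIT_NO_EMPTY the result is a flat list, so it is not always obvious which entries are links and which are text (for example when text precedes the first link, or two links are adjacent). A minimal sketch of one way to pair them up, assuming the same pattern as above: leave out PREG_SPLIT_NO_EMPTY so that odd indexes are always URLs and even indexes are always text. The helper name split_links and the trimming are additions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the answer): pairs each link with the text that follows it.
function split_links(string $input): array
{
    // Without PREG_SPLIT_NO_EMPTY the layout is fixed:
    // index 0 = text before the first link, odd index = URL, next even index = text after it.
    $parts = preg_split('~<link(?: to)?:\s*([^>]+)>~i', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

    $pairs = [];
    for ($i = 1; $i < count($parts); $i += 2) {
        $pairs[] = [
            'link' => trim($parts[$i]),
            'text' => trim($parts[$i + 1] ?? ''), // empty when nothing follows the link
        ];
    }
    return $pairs;
}

$string = "<Link to: http://www.someurl(.+)> maybe some text here(.*) <Link: www.someotherurl(.+)> maybe even more text(.*)";
$pairs = split_links($string);
print_r($pairs);
```

A string with no links yields an empty array, which covers the "no links to begin with" case from the question.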

Casimir et Hippolyte · Accepted Answer · 2013-12-10 21:10:45Z

0

You can use this pattern:

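// $txt is the input string. The i flag lets "<link" match "<Link" in the sample input,
// and \K drops the tag prefix from the overall match.
preg_match_all('~<link\b[^:]*:\s*\K(?<link>[^\s>]++)[^>]*>\s*(?<text>[^<]++)~i',
               $txt, $matches, PREG_SET_ORDER);

foreach($matches as $match) {
    printf("<br/>link: %s\n<br/>text: %s", $match['link'], $match['text']);
}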
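
To try this against the sample line from the question, the pattern can be wrapped in a small function that returns link/text pairs. This is only a sketch: the function name extract_links is hypothetical, the `i` flag is used so that `<link` matches the capitalised `<Link` in the sample, and, as an assumption of this sketch, the text group is relaxed to `[^<]*+` so that a link with no text after it is still returned.

```php
<?php
// Hypothetical wrapper around the named-group pattern; returns an array of link/text pairs.
function extract_links(string $txt): array
{
    // i flag: case-insensitive tag name. [^<]*+ (instead of [^<]++) allows empty text.
    $pattern = '~<link\b[^:]*:\s*\K(?<link>[^\s>]++)[^>]*>\s*(?<text>[^<]*+)~i';

    $results = [];
    if (preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // rtrim removes the space left before the next tag
            $results[] = ['link' => $match['link'], 'text' => rtrim($match['text'])];
        }
    }
    return $results;
}

$txt = "<Link to: http://www.someurl(.+)> maybe some text here(.*) <Link: www.someotherurl(.+)> maybe even more text(.*)";
$results = extract_links($txt);
print_r($results);
```

Unlike the split approach, text that appears before the first link is ignored here, since every match has to start at a link tag.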
