I'm trying to find ALL images in my blog posts with a regex. The code below finds an image only if the markup is clean and the src attribute comes right after the img tag name. However, I also have images with other attributes, such as height and width, in front of src, and my regex does not pick those up. Any ideas?

The following code matches images that look like this:

<img src="blah_blah_blah.jpg">

But not images that look like this:

<img width="290" height="290" src="blah_blah_blah.jpg">

Here is my code:

$pattern = '/<img\s+src="([^"]+)"[^>]+>/i';

preg_match($pattern, $data, $matches);

echo $matches[1];
  • Why don't you parse the HTML using something like SimpleHTMLDOM and then grab the IMG tags that way? It's more reliable. (Commented Dec 6, 2013 at 16:03)

5 Answers

Answer (score 4)

Use DOM or another parser for this; don't try to parse HTML with regular expressions.

$html = <<<DATA
<img width="290" height="290" src="blah.jpg">
<img src="blah_blah_blah.jpg">
DATA;

$doc = new DOMDocument();
$doc->loadHTML($html); // load the html

$xpath = new DOMXPath($doc);
$imgs  = $xpath->query('//img');

foreach ($imgs as $img) {
   echo $img->getAttribute('src') . "\n";
}

Output

blah.jpg
blah_blah_blah.jpg
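Real blog markup is often messier than the two tags above, and loadHTML() can emit libxml warnings on it. As a minimal sketch (the helper name extractImageSources and the sample markup are hypothetical, not part of the answer), the same XPath approach can be wrapped in a function that collects parser errors internally and returns the src values as an array:

```php
<?php
// Hypothetical helper, not from the answer: returns every non-empty src
// attribute found on <img> elements in an HTML fragment.
function extractImageSources(string $html): array
{
    // loadHTML() rejects an empty string, so handle that case up front.
    if (trim($html) === '') {
        return [];
    }

    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $sources = [];
    // Only <img> elements that actually carry a src attribute.
    foreach ($xpath->query('//img[@src]') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Made-up sample input for illustration.
$html = '<p>Post</p><img width="290" height="290" src="blah.jpg"><img src="blah_blah_blah.jpg">';
print_r(extractImageSources($html));
```

Returning an array rather than echoing keeps the parsing separate from whatever the blog code does with the image list afterwards.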

2 Comments

Wow, thanks! I love the solution, and it works perfectly well. I'm learning a lot, and reading up on DOM parsing! Thanks!
Yup, and it's much safer than regex too.

Answer (score 3)

Have you considered using DOMDocument instead of a regex?

$doc = new DOMDocument();
$doc->loadHTML('<img src="http://example.com/img/image.jpg" ... />');
$imageTags = $doc->getElementsByTagName('img');

foreach($imageTags as $tag) {
    echo $tag->getAttribute('src');
}
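Since the question mentions width and height, it may help to see that the same loop can read any attribute, wherever it appears in the tag. This is only a sketch with made-up sample markup; the $images array and its keys are assumptions for illustration:

```php
<?php
// Made-up sample input: one tag with extra attributes, one without.
$html = '<img width="290" height="290" src="a.jpg"><img src="b.jpg">';

$doc = new DOMDocument();
$doc->loadHTML($html);

$images = [];
foreach ($doc->getElementsByTagName('img') as $tag) {
    // hasAttribute() distinguishes a missing attribute from an empty one.
    $images[] = [
        'src'    => $tag->getAttribute('src'),
        'width'  => $tag->hasAttribute('width') ? (int) $tag->getAttribute('width') : null,
        'height' => $tag->hasAttribute('height') ? (int) $tag->getAttribute('height') : null,
    ];
}

print_r($images);
```

Attribute order in the markup does not matter here, which is exactly the case the original regex could not handle.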

1 Comment

Thanks, I had no idea that I could parse HTML like this! Learning some PHP every day!

Answer (score 1)

You'd be better off using a parser, but here is a way to do it with a regex:

$pattern = '/<img\s.*?src="([^"]+)"/i';

Answer (score 1)

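The problem is that your pattern requires src to come immediately after the whitespace that follows <img. Allow other attributes in between with [^>]*?, and relax the trailing [^>]+ to [^>]* so that a tag which ends right after the src attribute still matches. Try this instead:

$pattern = '/<img\s+[^>]*?src="([^"]+)"[^>]*>/i';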

preg_match($pattern, $data, $matches);

echo $matches[1];
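Note that preg_match() stops at the first match, while the question asks for all images. A minimal sketch using preg_match_all() follows; the sample $data is made up, and the pattern ends in [^>]* (zero or more characters) so that tags closing right after src are matched too:

```php
<?php
// Made-up sample input containing both tag shapes from the question.
$data = '<p><img src="one.jpg"> text <img width="290" height="290" src="two.jpg"></p>';

// [^>]*? lets other attributes precede src without leaving the current tag.
$pattern = '/<img\s+[^>]*?src="([^"]+)"[^>]*>/i';

// preg_match_all() returns the number of full matches (or false on error);
// $matches[1] holds the first capture group of every match.
$count = preg_match_all($pattern, $data, $matches);
$sources = $count ? $matches[1] : [];

print_r($sources);
```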

Answer (score 1)

Try this:

$pattern = '/<img\s.*?src=["\']([^"\']+)["\']/i';

This accepts either single or double quotes around the value and allows the src attribute to appear at any position in the tag.
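One caveat with this kind of pattern: .*? is not confined to a single tag, and the closing quote is not tied to the opening one. The sketch below compares the pattern with a stricter variant that is my own assumption rather than part of the answer: it stays inside the tag with [^>]*? and uses a backreference (\1) so the value must end with the same quote character that opened it. The sample markup is made up.

```php
<?php
// Made-up sample input: single quotes, an <img> without src followed by a
// <script src>, and a double-quoted value that contains an apostrophe.
$data = <<<'HTML'
<img class='hero' src='a b.jpg'><img alt="no source"><script src="app.js"></script><img src="it's.png">
HTML;

// Pattern as given in the answer: .*? may run past the end of the tag.
$loose = '/<img\s.*?src=["\']([^"\']+)["\']/i';
preg_match_all($loose, $data, $m);
$looseSources = $m[1];

// Stricter variant (an assumption): [^>]*? cannot leave the tag, and \1
// requires the closing quote to equal the opening one.
$strict = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is';
preg_match_all($strict, $data, $m);
$strictSources = $m[2];

print_r($looseSources);  // includes app.js and a truncated "it"
print_r($strictSources); // only the two real image sources
```

Even the stricter variant is still a regex over HTML, so it will miss unquoted attribute values and can be fooled by comments or scripts; a DOM parser remains the more robust choice.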
