If I have a string like so:

Hi this is a photo of me <img src='myself.jpg' alt='pic of me' />. Another pic of me <img src='abc.jpg'/>

How can I turn that into:

Hi this is a photo of me (pic of me). Another pic of me (image)

Basically, I want to remove all images from a string and replace each one with the value of its alt attribute if it has one. If it doesn't, it should just say 'image'.

  • Where's your code? What have you tried? Commented Dec 15, 2013 at 9:34
  • Best would be to use an HTML DOM parser library. Commented Dec 15, 2013 at 9:35
  • Javascript would be good but in this case I need it for php. Commented Dec 15, 2013 at 9:36
  • Don't understand that comment. There are HTML parsers for PHP. Commented Dec 15, 2013 at 9:42
  • Oh thought you meant dom as in javascript wise. Commented Dec 15, 2013 at 10:06

2 Answers

I'd use a DOM parser instead of regex. Here's how:

  • Load the HTML string using loadHTML()
  • Use getElementsByTagName() to get all the images
  • Loop through them and check if the image has an alt attribute.
    • If the image has an alt attribute, set $replacement to the value of that attribute, wrapped in parentheses.
    • If the image doesn't have an alt attribute, set $replacement to (image).
  • Create a text node from $replacement with createTextNode() and use replaceChild() to swap it in for the image node.

Code:

$html = <<<HTML
Hi this is a photo of me <img src='myself.jpg' alt='pic of me' /> 
another pic of me <img src='abc.jpg'/> 
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');

// The node list is live, so walk it backwards: replacing a node
// removes it from the list and would shift the remaining indexes.
$i = $images->length - 1;

while ($i > -1) {
    $node = $images->item($i);

    if ($node->hasAttribute('alt')) {
        $replacement = '('.$node->getAttribute('alt').')';
    }
    else {
        $replacement = '(image)';
    }

    // Swap the <img> element for a plain text node
    $text = $dom->createTextNode($replacement."\n");
    $node->parentNode->replaceChild($text, $node);

    $i--;
}

echo strip_tags($dom->saveHTML());

Output:

Hi this is a photo of me (pic of me)
another pic of me (image)
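
If this is needed in more than one place, the same steps can be wrapped in a function. The following is only a sketch: the function name imagesToAltText is made up for illustration, it assumes a non-empty, ASCII-only input string (non-ASCII text needs extra encoding handling with loadHTML()), and it treats an empty alt attribute the same as a missing one, which is my assumption rather than something the question specifies.

```php
<?php
// Hypothetical helper (name is an assumption, not from the answer above).
function imagesToAltText(string $html): string
{
    $dom = new DOMDocument();

    // Silence libxml warnings about fragments and unknown markup
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list into an array so that replacing
    // nodes does not disturb the iteration.
    foreach (iterator_to_array($dom->getElementsByTagName('img')) as $img) {
        $alt = trim($img->getAttribute('alt')); // '' when the attribute is missing
        $label = $alt !== '' ? $alt : 'image';
        $img->parentNode->replaceChild($dom->createTextNode('(' . $label . ')'), $img);
    }

    // textContent returns the text without any markup, so strip_tags() is not needed
    return trim($dom->documentElement->textContent);
}

echo imagesToAltText("Hi this is a photo of me <img src='myself.jpg' alt='pic of me' />. Another pic of me <img src='abc.jpg'/>");
```

Reading textContent instead of calling saveHTML() also decodes entities such as &amp;amp;, which may or may not be what you want in the result.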

7 Comments

Is this faster than using regex?
Regular expressions will likely be faster. But does it matter which is faster if one gives you incorrect results?
Ok but with your example what does it do if there is no alt attribute on the image?
@Nuvolari: If there is no alt attribute, it will now simply display the rest of the text. You can test it out yourself, btw.
Ok how can I do it so that if there is no alt it displays: '(image)' instead ?

Something like this should work:

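preg_match_all('/<img[^>]*>/', $yourString, $matches);

// $matches[0] holds the full <img ...> tags; $matches itself is an array of arrays
foreach ($matches[0] as $match)
{
   $replacement = 'image';

   // Use the alt value if the tag has a single-quoted, non-empty alt attribute
   if (preg_match('/alt=\'([^\']+)\'/', $match, $matches2))
      $replacement = $matches2[1];

   $yourString = str_replace($match, '('.$replacement.')', $yourString);
}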

What it does: finds all img tags and collects them in the $matches array, then loops through them looking for an alt value. If one exists, the img tag is replaced with (ALT VALUE); otherwise it's replaced with (image).
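
The same idea can be written in one pass with preg_replace_callback(). This is a rough sketch rather than a robust solution: the function name replaceImgTags is made up for illustration, and it assumes alt values are quoted with either single or double quotes and never contain a > character. It will also pick up attributes whose names merely end in alt (for example data-alt), which is one of the reasons a DOM parser is the safer choice.

```php
<?php
// Hypothetical helper (name is an assumption, not from the answer above).
function replaceImgTags(string $text): string
{
    $result = preg_replace_callback(
        '/<img\b[^>]*>/i',
        function (array $m): string {
            // Capture the opening quote in group 1 and reuse it as the closing quote
            if (preg_match('/\balt\s*=\s*(["\'])(.*?)\1/i', $m[0], $alt)
                && trim($alt[2]) !== '') {
                return '(' . trim($alt[2]) . ')';
            }
            // No alt attribute, or an empty one
            return '(image)';
        },
        $text
    );

    // preg_replace_callback() returns null on failure; fall back to the input
    return $result ?? $text;
}

echo replaceImgTags("Hi this is a photo of me <img src='myself.jpg' alt='pic of me' />. Another pic of me <img src='abc.jpg'/>");
// prints: Hi this is a photo of me (pic of me). Another pic of me (image)
```

Because every tag is replaced inside the callback, there is no separate str_replace() step, and a tag without an alt attribute always ends up as (image).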

5 Comments

If the image doesn't contain an alt tag, what will this display?
There's not gonna be a match so you won't replace anything
Use $matches[1] ? $matches[1] : 'image'
And make the alt=... stuff optional in the regexp so the if succeeds.
I guess first of all you have to define what the algorithm should do in that case. I'm sure we'll come up with something here :)
