I have a bit of a situation. The site I'm working on has two sections: the mobile site and the main site. Both fetch content from the same database table. It's a blog site. When admins create content with images using the text editor (CKEditor), a style attribute is attached to the resulting img tag, so the output looks like this:

<img alt="some content" src="some location" style="width:520px; height:600px;" />

This works great on the main site, but on the mobile site the images are poorly scaled and stretched. I have a thumbnailing script that could address that, but I need a way to get the src attribute before the page loads and a way to remove the style attribute.

I tried to do this using regex:

$str=$blog_post_column_from_database

$pattern=array ('#\<img alt="(.*?)" src="(.*)" style="(.*?)" /> #' );

$replacement=array ( '<img src="$my_thumbnailer_here.php?src=\\2" width="100%" />' );

$a=(string)$str; //converts text to string to avoid code lines from executing

return preg_replace($pattern,$replacement,$a);

What am I doing wrong? Regex is not my strong point. Thanks.

  • Regexes on HTML should be avoided. Use DOM instead. Commented Nov 20, 2013 at 15:25
  • stackoverflow.com/questions/5517255/… Commented Nov 20, 2013 at 15:25
  • @MarkResølved Thanks for the link. It works well, but it does not give me a way to insert my thumbnail variable, and I don't really know how to adapt it. :) Commented Nov 20, 2013 at 15:46
  • @MarcB Thanks for the link. I will look into that, but I need a quick fix for now. Once I get the hang of PHP DOM I will switch. Thanks all the same. Commented Nov 20, 2013 at 15:47

2 Answers

As already suggested in the comments, you'll be better off using PHP's DOMDocument.

Something like this should do the trick:

example: http://3v4l.org/Gv4dp

//get new domdoc instance
$dom=new DOMDocument();

//load your html
$dom->loadHTML($your_html);

//get all images
$imgs = $dom->getElementsByTagName("img");

//iterate over those
foreach($imgs as $img){
    //remove style attribute
    $img->removeAttribute('style');
    //prefix src attribute with scriptname
    $img->setAttribute( 'src' , 'thumbnail.php?img=' . $img->getAttribute('src') );
}

//output modified html
echo $dom->saveHTML();

You might want to remove the doctype and the <html> and <body> elements, which are added when the document is saved as HTML, by replacing the last line with:

echo preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), '', $dom->saveHTML()));

see removing doctype while saving domdocument
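
As an alternative to stripping the wrapper with string replacement, the steps above can be sketched as a single function that serializes only the children of <body>. This is a minimal sketch, not a definitive implementation: the function name rewriteImages is hypothetical, 'thumbnail.php?img=' is the placeholder used above, and the urlencode() call is an addition of mine so that a src containing & or ? survives as one query parameter. Character encoding of non-ASCII content is not handled here.

```php
<?php
// Hypothetical helper: rewrite every <img> in an HTML fragment for the mobile site.
// 'thumbnail.php?img=' is a placeholder for the real thumbnailer URL.
function rewriteImages(string $html, string $thumbnailer = 'thumbnail.php?img='): string
{
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings
    // for imperfect editor markup.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // Drop the fixed dimensions set by the editor.
        $img->removeAttribute('style');
        // Route the image through the thumbnailer (urlencode is an assumption).
        $img->setAttribute('src', $thumbnailer . urlencode($img->getAttribute('src')));
    }

    // Serialize only what sits inside <body>, so no doctype/html/body wrapper is emitted.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Usage with the markup from the question:
echo rewriteImages('<p>Hi</p><img alt="some content" src="pic.jpg" style="width:520px; height:600px;" />');
```

Passing a node to saveHTML() serializes just that node, which is why the wrapper elements never reach the output.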

Try this regex:

$pattern=array ('#<img alt="(.*?)" src="(.*)" style="(.*?)" />#' );

The backslash at the beginning and the space at the end have been removed.

To do this properly, you should first find all img tags and then modify each one.

Your regex will not work when the alt attribute is missing or when the attributes appear in a different order.
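
The "find every img tag first, then change it" idea could look roughly like the sketch below, which does not depend on attribute order or on alt being present. The function name rewriteImgTags is hypothetical, 'thumbnail.php?src=' stands in for the real thumbnailer URL, and the urlencode() call is my own assumption. It is still a regex over HTML, so it will break on unusual markup such as a > inside an attribute value.

```php
<?php
// Hypothetical helper; 'thumbnail.php?src=' is a placeholder for the real thumbnailer URL.
function rewriteImgTags(string $html, string $thumbnailer = 'thumbnail.php?src='): string
{
    // Step 1: match each whole <img ...> tag, whatever attributes it carries.
    $result = preg_replace_callback('#<img\b[^>]*>#i', function (array $m) use ($thumbnailer) {
        // Step 2: pull the src value out of that one tag (double or single quotes).
        if (!preg_match('#\ssrc\s*=\s*(["\'])(.*?)\1#i', $m[0], $src)) {
            return $m[0]; // no src found: leave the tag untouched
        }
        // Step 3: rebuild the tag without style, as in the question's replacement.
        $url = $thumbnailer . urlencode(html_entity_decode($src[2]));
        return '<img src="' . $url . '" width="100%" />';
    }, $html);

    // preg_replace_callback returns null on a regex engine error; fall back to the input.
    return $result ?? $html;
}

// Usage: the attributes are in a different order than in the question.
echo rewriteImgTags('<p>x</p><img style="width:520px" src="a/b.jpg" alt="t" />');
```

Like the replacement in the question, this drops alt and every other attribute; keeping them would need extra handling inside the callback.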
