I have some images whose markup comes in several forms, like this:

<a href="" class="link-img" alt="">
  <img editable="true" style="display: block; cursor: default;" class="main-image" 
    width="538" height="auto" src="src" alt="large image">
</a>

and like this:

<a href="" class="link link-img">
 <img src="src" style="width: 100%; display: block; cursor: pointer;" editable="true" 
  class="main-image imageLink" width="" height="auto" alt="">
</a>

My code to select the image src is:

$c = preg_replace('/<a href="(.+)" class="link link-img" alt="(.+)"><img src="(.+)"><\/a>/i'
,'<% link url="$1" caption="<img style=max-width:500px; src=$8 >" html="true" %>',$c);

I have tried this several times, but the code doesn't work. If anyone has an idea, I would really appreciate it.

  • To extract data from HTML, use an HTML parser and not some regexp. Commented Jan 16, 2015 at 16:43
  • You need to use something like PHP's DOMDocument. Commented Jan 16, 2015 at 16:45
  • But I need that on the server side. Commented Jan 16, 2015 at 16:50
  • Good thing PHP's DOMDocument isn't for use on the client side, then. Commented Jan 16, 2015 at 16:53
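
For reference, the parser approach suggested in the comments could look roughly like the sketch below. It runs entirely on the server in PHP. The file names `first.jpg` and `second.jpg` are hypothetical placeholders (the question only shows `src="src"`), and the XPath expression `//a/img` assumes each image is a direct child of its link, as in the two samples above.

```php
<?php
// Sample markup based on the question; the src values are hypothetical placeholders.
$html = <<<'HTML'
<a href="" class="link-img" alt="">
  <img editable="true" style="display: block; cursor: default;" class="main-image"
    width="538" height="auto" src="first.jpg" alt="large image">
</a>
<a href="" class="link link-img">
 <img src="second.jpg" style="width: 100%; display: block; cursor: pointer;" editable="true"
  class="main-image imageLink" width="" height="auto" alt="">
</a>
HTML;

// Collect any libxml parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <img> that is a direct child of an <a>, whatever the attribute order.
$xpath = new DOMXPath($doc);
$sources = [];
foreach ($xpath->query('//a/img') as $img) {
    $sources[] = $img->getAttribute('src');
}

print_r($sources);
```

Because the parser reads attributes by name, the order of `src`, `class`, and `style` inside the tag no longer matters, which is the part the original pattern tripped over.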

1 Answer

Try this pattern to grab the src from an image: src="([^"]+)"

EDIT: see the regex here: https://www.regex101.com/r/yF8tJ1/1

CODE EXAMPLE:

$re = "/src=\"([^\"]+)\"/"; 
$str = "<a href=\"\" class=\"link-img\" alt=\"\">\n  <img editable=\"true\" style=\"display: block; cursor: default;\" class=\"main-image\" \n    width=\"538\" height=\"auto\" src=\"src\" alt=\"large image\">\n</a>\n\n<a href=\"\" class=\"link link-img\">\n <img src=\"src\" style=\"width: 100%; display: block; cursor: pointer;\" editable=\"true\" \n  class=\"main-image imageLink\" width=\"\" height=\"auto\" alt=\"\">\n</a>"; 

// After this call, $matches[1] holds the captured src values, one per match
preg_match_all($re, $str, $matches);
4 Comments

I suggest a different regex. /src=((['"]?)(?:[^\2]|\2+)*\2)/g. This one matches broken HTML attributes as close as I can get to the behavior of HTML/XML parsers.
Actually, use /src=((['"]?)(?:[^\2]|\2+)+?\2)/gi. This one works properly. An input like <img src="test"> will match src="test", <img src="test"" > will match src="test"" and <img src='test""/><span>My new 'test'</span><img src='55'> will match src='test""/><span>My new ' and src='55.
Sorry for flooding this! I've made a new regex: src=([^"'>]+|(['"]?)(?:[^\2]|\2+)+?\2). This one works now with <img src=55>, which matches src=55.
Just trying to do the impossible. Some people would get really angry at such a regex. But hey, it works!
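
If the goal in these comments is to cover double-quoted, single-quoted, and unquoted values, one possible alternative is to give each quoting style its own branch. Inside a character class, PCRE does not treat `\2` as a backreference, so `[^\2]` may not behave as intended. The pattern below is an assumption of mine, not something from the thread, and the sample string is made up for illustration.

```php
<?php
// Assumed pattern (not from the thread): one alternative per quoting style.
//   group 1: double-quoted value, group 2: single-quoted value, group 3: unquoted value
$re = '/\bsrc\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))/i';

// Made-up sample input covering the three quoting styles.
$str = '<img src="a.png"><img src=\'b.png\'><img src=55>';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

$values = [];
foreach ($matches as $m) {
    // With PREG_SET_ORDER, trailing groups that did not take part in the match are
    // dropped, so the last element is the value from whichever branch matched.
    $values[] = end($m);
}

print_r($values);
```

This is still a regex working on HTML, so it only makes sense for markup you control; for anything less predictable, a parser remains the safer choice.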
