I have an HTML file with several tables, and I'm trying to extract the link and image from each one. I'm using the PHP Simple HTML DOM Parser.

Here's the HTML file to parse:

<h1>Title</h1>
<p>Text</p>

<table cellspacing="0" cellpadding="0" border="0">
    <tbody>
        <tr><td>
            <a href="http://www.google.com/some_url">
                <img width="100" height="100" border="0" src="http://google.com/some_image.jpg"/>
            </a>
        </td></tr>
    </tbody>
</table>

<h2>Title</h2>
<p>Text</p>

<table cellspacing="0" cellpadding="0" border="0">
    <tbody>
        <tr><td>
            <a href="http://www.google.com/this_url">
                <img width="100" height="100" border="0" src="http://google.com/this_image.jpg"/>
            </a>
        </td></tr>
    </tbody>
</table>

<p>Text</p>
<p>Text</p>

And this is the output I need:

<a href="http://www.google.com/some_url">
    <img width="100" height="100" border="0" src="http://google.com/some_image.jpg"/>
</a>

<a href="http://www.google.com/this_url">
    <img width="100" height="100" border="0" src="http://google.com/this_image.jpg"/>
</a>

Here's the PHP part, but it doesn't work the way I want:

<?php

// Include the library
include('simple_html_dom.php');

// Retrieve the DOM from a given URL
$html = file_get_html('http://google.com');

// Find all images & links
foreach($html->find('img') as $IMGelement)
foreach($html->find('a') as $Aelement)
echo '<a href="' . $Aelement->href . '"><img src="' . $IMGelement->src . '" /><br>';

?>
  • Where are you using the PHP Simple HTML DOM Parser? I don't see it; this looks like plain HTML. I also don't see a question here about the issue you are having with the parsing. Commented May 29, 2015 at 20:11
  • Please find the PHP code above. It's not really working the way I want, though. Commented May 29, 2015 at 20:46
  • Okay, and what happens with the current code vs. what do you want to happen? Commented May 29, 2015 at 20:47
  • The current code generates a list of every picture paired with every link. What I want is to extract only the link and image from each table, as marked in the HTML above with "<!-- Output in need: -->". Commented May 29, 2015 at 20:52

1 Answer

I think you want to find each <img> inside an <a> tag and read the link from its parent:

foreach ($html->find('a img') as $IMGelement) {
    // parent() is the enclosing <a> here, since each <img> sits directly inside its link
    echo '<a href="' . $IMGelement->parent()->href . '"><img src="' . $IMGelement->src . '" /></a><br>';
}
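
If only the links inside tables are wanted, one option is to scope the query to table descendants. The sketch below is not the Simple HTML DOM approach from this answer: it swaps in PHP's built-in DOMDocument and DOMXPath so that it runs without the library. The markup is a shortened copy of the HTML from the question, and the extra link inside the <p> (the "ignored" URLs) is a hypothetical addition, there only to show that non-table links are skipped.

```php
<?php
// Sketch with PHP's bundled DOM extension (not Simple HTML DOM).
// The "ignored" link in the <p> is a made-up example of markup to skip.
$markup = <<<HTML
<h1>Title</h1>
<p><a href="http://www.google.com/ignored"><img src="http://google.com/ignored.jpg"/></a></p>
<table><tbody><tr><td>
  <a href="http://www.google.com/some_url">
    <img width="100" height="100" border="0" src="http://google.com/some_image.jpg"/>
  </a>
</td></tr></tbody></table>
<table><tbody><tr><td>
  <a href="http://www.google.com/this_url">
    <img width="100" height="100" border="0" src="http://google.com/this_image.jpg"/>
  </a>
</td></tr></tbody></table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$pairs = [];

// Match every <img> that is a direct child of an <a> located anywhere inside a <table>
foreach ($xpath->query('//table//a/img') as $img) {
    $pairs[] = [
        'href' => $img->parentNode->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
    ];
}

// Print one link/image pair per table match
foreach ($pairs as $pair) {
    echo '<a href="' . htmlspecialchars($pair['href']) . '">'
       . '<img src="' . htmlspecialchars($pair['src']) . '" /></a>' . "\n";
}
```

Collecting the pairs into an array before printing keeps the extraction separate from the output format, so the same loop can feed a template or a JSON response instead of echo.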
11 Comments

This generates a list of all images, but without the link inside the "a href"-tag...
@CorruptNetwork $IMGelement->parent()->href shows the link just fine.
Yes, this works! Thanks a lot! Is there also a way to extract only the images/links within the specific tables? I have a lot of other HTML in that file that I don't need (see the HTML above).
@CorruptNetwork Write a condition to select them; I'll give it a try. By the way, it would be good to accept the answer.
Glad to help! And what about your additions?