Scraping a website using PHP "Simple HTML Dom Parser"

Question

I'm having trouble figuring out how to use PHP Simple HTML DOM Parser for pulling information from a website.

require('simple_html_dom.php');
$html = file_get_html('https://example.com');

$ret = array();
foreach($html->find(".project-card-mini-wrap") as $element)  { 
   echo $element;   
}

The output of $element is:

<div class="project-card-mini-wrap"> 
<a class="project_item block mb2 green-dark"    href="/projects/andrewkostirev/kostirev-the-real-you">
<div class="project_thumbnail hover-group border border-box mb1"> 
     <img alt="Project image" class="hover-zoomin fit" src="https://ksr-ugc.imgix.net/projects/2123706/photo-original.png?v=1444253259&amp;w=218&amp;h=162&amp;fit=crop&amp;auto=format&amp;q=92&amp;s=9d6c437e96b720dce82fc9b598b3e8ae" /> 
    <div class="funding_tag highlight">10 days to go</div> 
   <div class="hover-zoomout bg-green-90"> 
   <p class="white p2 h5">A clothing brand like never seen before</p> 
</div> 
</div> 
<div class="project_name h5 bold"> KOSTIREV - THE REAL YOU </div>
</a>
</div>

This is the information I'd like to pull from the website:
1: Link href
2: Image src
3: Project name

I'm voting to close this question as off-topic because SO is not a code writing service.
– Niki van Stein, commented Nov 20, 2015 at 8:18

jaggedsoft · Accepted Answer · 2015-11-20 09:17:02Z

2

Hopefully this provides some insight for you and for other users of PHP Simple HTML DOM Parser:

foreach($html->find(".project-card-mini-wrap") as $element)  { 
   echo "Project name: ",$element->find('.project_name',0)->innertext,"<br/>\n";
   echo "Image source: ",$element->find('img',0)->src,"<br/>\n";
   echo "Link: ",$element->find('a',0)->href,"<br/>\n";
}

Produces this output:

Project name: KOSTIREV - THE REAL YOU 
Image source: https://ksr-ugc.imgix.net/projects/2123706/photo-original.png?v=1444253259&w=218&h=162&fit=crop&auto=format&q=92&s=9d6c437e96b720dce82fc9b598b3e8ae
Link: /projects/andrewkostirev/kostirev-the-real-you

edited Nov 20, 2015 at 9:17

answered Nov 20, 2015 at 8:23

jaggedsoft

2 Comments

M Arfan Over a year ago

I tried this, but I am getting this error: Notice: Trying to get property of non-object

jaggedsoft Over a year ago

@MArfan find() needs a second parameter here: without an index it returns an array of matches, so pass 0 to grab the first element it finds. Example updated.
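To make the point in that comment concrete, here is a minimal sketch, not the answerer's own code. It assumes simple_html_dom.php is on the include path as in the question, and it parses a trimmed copy of the question's markup with str_get_html() instead of fetching a live page. The image URL and the second, empty card are made up for illustration. As far as I know, find() with an index returns a single element or null, so checking for null before reading ->href or ->src avoids the "non-object" notice.

```php
<?php
// Assumes simple_html_dom.php sits next to this script, as in the question.
require 'simple_html_dom.php';

// Trimmed copy of the question's markup, parsed from a string so the
// sketch does not depend on a live site. The image URL is a placeholder.
$markup = <<<'HTML'
<div class="project-card-mini-wrap">
<a class="project_item" href="/projects/andrewkostirev/kostirev-the-real-you">
<img alt="Project image" src="https://example.com/photo-original.png" />
<div class="project_name h5 bold"> KOSTIREV - THE REAL YOU </div>
</a>
</div>
<div class="project-card-mini-wrap"><p>card without link or image</p></div>
HTML;

$html = str_get_html($markup);

$ret = array();
foreach ($html->find('.project-card-mini-wrap') as $card) {
    // With an index, find() returns one element or null.
    $name = $card->find('.project_name', 0);
    $img  = $card->find('img', 0);
    $link = $card->find('a', 0);

    // Skip cards that lack any of the three pieces instead of
    // dereferencing null (the source of the "non-object" notice).
    if ($name === null || $img === null || $link === null) {
        continue;
    }

    $ret[] = array(
        'name' => trim($name->plaintext),
        'src'  => $img->src,
        'href' => $link->href,
    );
}

// Without an index, find() returns an array of matches.
$all = $html->find('.project-card-mini-wrap');

print_r($ret);
```

Collecting the values into $ret (the array the question declares but never fills) rather than echoing them keeps the extraction separate from whatever you do with the results afterwards.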

Dustin Maddox · Answer · 2018-02-03 03:31:00Z

-1

I tried this and it worked, thanks for the help! Here is something I made using primewire.ag as an example. The goal was to extract all the links on a given page.

<?php

require('simple_html_dom.php');

// Create DOM from URL or file
$html = file_get_html('http://www.primewire.ag/watch-2805774-Star-Wars-The-Last-Jedi-online-free');


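// Find all movie links and prepend the site prefix to each href
$linkPrefix = 'http://primewire.ag';
foreach ($html->find(".movie_version_link") as $linkClass) {
    $anchor = $linkClass->find('a', 0);
    if ($anchor === null) {
        continue; // no <a> inside this element
    }
    echo "Link: ", $linkPrefix, $anchor->href, "<br/>\n";
}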
?>
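Prepending a fixed prefix works as long as every href on the page is root-relative. A slightly more defensive sketch is below. absolute_url() is a hypothetical helper, not part of Simple HTML DOM, and it only covers the simple cases (already-absolute, protocol-relative, and plain relative paths) rather than full RFC 3986 resolution. It uses type declarations, so it assumes PHP 7 or later, and the example.com URLs are placeholders.

```php
<?php
/**
 * Hypothetical helper: join a site prefix and an href taken from a page.
 * Handles only the simple cases; it is not a full RFC 3986 resolver.
 */
function absolute_url(string $prefix, string $href): string
{
    // Already absolute (has a scheme such as http: or https:)? Leave it alone.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $href)) {
        return $href;
    }
    // Protocol-relative URL: reuse the prefix's scheme.
    if (strpos($href, '//') === 0) {
        $scheme = parse_url($prefix, PHP_URL_SCHEME) ?: 'http';
        return $scheme . ':' . $href;
    }
    // Otherwise join with exactly one slash between prefix and path.
    return rtrim($prefix, '/') . '/' . ltrim($href, '/');
}

echo absolute_url('https://example.com', '/watch/123'), "\n";
echo absolute_url('https://example.com/', 'watch/123'), "\n";
echo absolute_url('https://example.com', 'https://other.example/x'), "\n";
```

In the loop above, the echo line could then pass $linkPrefix and the href through absolute_url() instead of concatenating them directly.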

edited Feb 3, 2018 at 3:31

answered Feb 3, 2018 at 3:25

Dustin Maddox

Edwin M · Answer · 2021-01-28 19:12:13Z

-2

This is also a good library for scraping and traversing HTML:

https://github.com/paquettg/php-html-parser

edited Jan 28, 2021 at 19:12

DisappointedByUnaccountableMod

answered Jan 27, 2021 at 4:19

Edwin M
