
I have a site that I need to parse.

First, I have to collect all the catalog URLs on the page, then visit each of those URLs and parse the product URLs on each catalog page, and finally visit each product URL and get the element ('.description div').

I'm using Simple HTML DOM.

But I have a problem at the point where I try to go through the URLs I parsed in the first step: I'm getting an empty page.

include 'simple_html_dom.php';
$catalogs = file_get_html('http://optnow.ru/catalog');
$catalogLink = [];
if(!empty($catalogs)) {
    foreach( $catalogs->find('div.cat-name a') as $catalog) {
         $catalogUrl = 'http://optnow.ru/' . $catalog->href . '?page=0';
         $catalogLink[] = $catalogUrl;
         $catalogHtml = file_get_html($catalogUrl);
         $productsLink = $catalogHtml->find('.link-pv-name');
         print_r($productsLink->href);
    }
}

Where is my mistake?

Thanks.


1 Answer


You need to collect the links into an array first and then loop over that array with a second foreach:

include 'simple_html_dom.php';
$catalog = file_get_html('http://optnow.ru/catalog');
$catalogLink = [];
if(!empty($catalog)) {
    foreach( $catalog->find('div.cat-name a') as $catalogHref) {
         $myLink = 'http://optnow.ru/' . $catalogHref->href . '?page=0';
         $catalogLink[] = $myLink;
         echo '<pre>';
         print_r($myLink);
         echo '</pre>';
    }
    foreach ($catalogLink as $catalogSingleLink ) {
         if(!empty($catalogSingleLink)) {
             $catalogHtml = file_get_html($catalogSingleLink);
             // find() returns an array of elements, so loop over the products
             foreach ($catalogHtml->find('.link-pv-name') as $catalogProduct) {
                 echo $catalogProduct->href . '<br>';
             }
         }
    }
}
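
The loop above stops at printing the product links. A rough sketch of the full crawl described in the question — following each product link and reading '.description div' — could look like the code below. It assumes the .link-pv-name elements carry relative hrefs that can be prefixed with http://optnow.ru/, which is only a guess based on the question's code, so adjust the URL handling and selectors to the real markup.

include 'simple_html_dom.php';

$catalog = file_get_html('http://optnow.ru/catalog');
if (empty($catalog)) {
    die('Could not load the catalog page');
}

foreach ($catalog->find('div.cat-name a') as $catalogHref) {
    // level 1: one listing page per catalog
    $catalogHtml = file_get_html('http://optnow.ru/' . $catalogHref->href . '?page=0');
    if (empty($catalogHtml)) {
        continue;
    }

    // level 2: find() returns an array, so iterate the product links
    foreach ($catalogHtml->find('.link-pv-name') as $product) {
        // assumption: product hrefs are relative to the site root
        $productHtml = file_get_html('http://optnow.ru/' . $product->href);
        if (empty($productHtml)) {
            continue;
        }

        // level 3: the description block the question asks for
        foreach ($productHtml->find('.description div') as $description) {
            echo $description->plaintext . "\n";
        }

        $productHtml->clear(); // free memory between pages
    }

    $catalogHtml->clear();
}

Calling clear() after each page keeps memory usage down, which matters when crawling many pages with Simple HTML DOM.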

4 Comments

If I print $catalogSingleLink, I just get all the URLs run together as one string, like optnow.ru/categories/istochniki-pitaniya?page=0http://optnow.ru/…?
OK, but I need to get one URL at a time, then the next one on each iteration. Do I need to make a for() loop and iterate over each URL?
@Frunky No, I mean that you'll get each URL individually. The code should work as you expect. Try it.
I really don't know why I'm getting an empty result. I'll update the question now, thanks.
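
If the output is still empty, one thing worth checking (a guess, since the failing page isn't shown) is whether file_get_html() is returning false for some URLs — it does that when the HTTP request fails or the response exceeds the library's built-in size limit, and calling find() on that false value stops the script, which can look like an empty page when error display is off. A minimal check:

include 'simple_html_dom.php';

$url  = 'http://optnow.ru/catalog';
$html = file_get_html($url);

if ($html === false) {
    // request failed or the page was larger than the library's size limit
    echo 'Failed to load ' . $url . "\n";
} else {
    echo 'Loaded ' . $url . ', found '
        . count($html->find('div.cat-name a')) . " catalog links\n";
}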
