simple html dom scraping large html file

Question

I need to scrape a large HTML file (e.g. http://www.indianrail.gov.in/mail_express_trn_list.html) using Simple HTML DOM. I started with a simple script:

<?php
require "simple_html_dom.php";
echo file_get_html('http://www.indianrail.gov.in/mail_express_trn_list.html')->plaintext;
?>

This shows nothing, just a blank page, and the following notices appear in the Apache error.log file:

 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3
 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3

At the same time, all other pages (e.g. http://www.indianrail.gov.in/special_trn_list.html) work fine with the same script.

Have you tried using file_get_contents instead of file_get_html? php.net/manual/en/function.file-get-contents.php
– Funk Forty Niner, Commented Jul 30, 2013 at 5:51
I am able to replicate the issue. I will dig deeper and let you know.
– DevZer0, Commented Jul 30, 2013 at 5:54
@krizna These answers on SO may be of help: stackoverflow.com/a/6006379/1415724 and stackoverflow.com/a/6519443/1415724
– Funk Forty Niner, Commented Jul 30, 2013 at 6:02

DevZer0 · Accepted Answer · 2013-07-30 06:02:27Z

21

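The issue appears to be the MAX_FILE_SIZE constant defined in simple_html_dom.php. When the downloaded page is longer than this limit, file_get_html() returns false instead of an object, which would explain the "Trying to get property of non-object" notice.

You can adjust the limit by editing the define('MAX_FILE_SIZE', 600000); line in the simple_html_dom.php file.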
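To make this failure visible rather than ending up with a blank page, it can help to check the size of the page before handing it to the parser. The following is only a minimal sketch: fits_size_limit() is a hypothetical helper, not part of simple_html_dom, and the 600000 limit is assumed from the define() line quoted above. It roughly mirrors the size guard that file_get_html() appears to apply in the 1.5 release of the library.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): roughly mirrors the
// library's size guard so an oversized page can be reported explicitly.
function fits_size_limit(string $contents, int $limit): bool
{
    // An empty body, or one longer than the limit, is rejected.
    return $contents !== '' && strlen($contents) <= $limit;
}

// Assumed limit, taken from the default define() quoted above.
$limit = 600000;

// Stand-ins for downloaded pages; real code would use file_get_contents($url).
$small = str_repeat('a', 1000);
$large = str_repeat('a', $limit + 1);

var_dump(fits_size_limit($small, $limit)); // bool(true)
var_dump(fits_size_limit($large, $limit)); // bool(false)
```

With the library itself, the equivalent precaution is to test the return value, for example if ($html === false), before reading ->plaintext.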


3 Comments

krizna Over a year ago

I tried define('MAX_FILE_SIZE', 6000000000000000000); but no luck, still the same error. Thanks.

DevZer0 Over a year ago

Define a realistic number. I set it to 12600000.

krizna Over a year ago

It seems to be working, but I'm getting a different error now: exit signal Segmentation fault (11)
