I need to scrape a large HTML file (e.g. http://www.indianrail.gov.in/mail_express_trn_list.html) using Simple HTML DOM. I started with a simple script:

<?php
require "simple_html_dom.php";
echo file_get_html('http://www.indianrail.gov.in/mail_express_trn_list.html')->plaintext;
?>

This outputs nothing, just a blank page, and the following notices appear in the Apache error.log file:

 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3
 PHP Notice:  Trying to get property of non-object in /var/www/index.php on line 3

Meanwhile, other pages (e.g. http://www.indianrail.gov.in/special_trn_list.html) work fine with the same script.

1 Answer

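The issue appears to be the MAX_FILE_SIZE constant defined in simple_html_dom. When the downloaded page is larger than that limit, file_get_html() returns false instead of a DOM object, which would explain the "Trying to get property of non-object" notice.

You can adjust the limit by editing the define('MAX_FILE_SIZE', 600000); line in the simple_html_dom.php file.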
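To confirm that the size limit is really the cause before editing the library, something like the following rough sketch may help. The function names exceeds_parser_limit and page_plaintext are hypothetical helpers, not part of simple_html_dom, and the sketch assumes that simple_html_dom.php sits in the same directory and that file_get_html() returns false when it cannot produce a DOM object.

```php
<?php
// Hypothetical helper, not part of simple_html_dom: true when the document
// is longer than the parser's limit (600000 is the value quoted above).
function exceeds_parser_limit(string $html, int $limit = 600000): bool
{
    return strlen($html) > $limit;
}

// Hypothetical wrapper around the original script. Assumes simple_html_dom.php
// is in the same directory and that file_get_html() returns false on failure.
function page_plaintext(string $url): ?string
{
    require_once __DIR__ . '/simple_html_dom.php';

    $dom = file_get_html($url);
    if ($dom === false) {
        // Fetch failed, body was empty, or the page is over the size limit.
        return null;
    }

    return $dom->plaintext;
}
```

Calling page_plaintext() and checking for null avoids the "non-object" notice, and running exceeds_parser_limit() on the result of file_get_contents() for the same URL should show whether the page is over the limit and by how much.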

3 Comments

I tried define('MAX_FILE_SIZE', 6000000000000000000); but no luck, still the same error. Thanks.
Define a realistic number; I set it to 12600000.
It seems to be working, but I'm getting a different error now: exit signal Segmentation fault (11)
