I am trying to build a crawler that gets the movie URLs from an IMDb list. I can collect all the links on the page into an array, and I want to select only the ones with "title" in them.

preg_match_all($pattern, "[125] => href=\"/chart/2000s?mode=popular\" [126] => href=\"/title/tt0111161/\" ", $matches);

where $pattern='/title/'.

I am getting the following error:

Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in C:\xampp\htdocs\phpProject1\index.php on line 53

Any idea on how to go about this? Thanks a lot.

2 Answers

Use a DOM Parser:

// Load the Simple HTML DOM library, which provides file_get_html() (path assumed)
include 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.example.com/');

// Find all links containing "title" as part of their href
$links = $html->find('a[href*=title]');

// Loop through the links and print each href
foreach ($links as $link) {
    echo $link->href . '<br>';
}

http://simplehtmldom.sourceforge.net/manual.htm
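
If adding a third-party library is not an option, roughly the same thing can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM. The sample markup and the variable names below are made up for illustration; a real crawler would load the downloaded page instead.

```php
<?php
// Hypothetical sample markup standing in for the downloaded list page.
$html = '<ul>
  <li><a href="/chart/2000s?mode=popular">2000s chart</a></li>
  <li><a href="/title/tt0111161/">A movie</a></li>
</ul>';

$doc = new DOMDocument();
// Real pages are rarely valid HTML; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <a> whose href contains "/title/".
$xpath = new DOMXPath($doc);
$movieUrls = [];
foreach ($xpath->query('//a[contains(@href, "/title/")]') as $anchor) {
    $movieUrls[] = $anchor->getAttribute('href');
}

print_r($movieUrls);
```

The XPath `contains()` test plays the same role as the `a[href*=title]` selector above.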

Are you sure $pattern is '/title/' at the time when preg_match_all is called?

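The warning you are getting is raised when the pattern passed to preg_match_all (the first argument) is not properly delimited. '/title/' is valid, because the slashes act as delimiters, but a bare 'title' is not: PHP would take the leading "t" as the delimiter, and a delimiter must not be alphanumeric.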
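
To illustrate the delimiter rule, here is a minimal sketch that runs against the sample string from the question. The variable names and the `tt\d+` part of the second pattern are assumptions added for the example, not something taken from the original code.

```php
<?php
// Sample input from the question: link attributes flattened into one string.
$subject = '[125] => href="/chart/2000s?mode=popular" [126] => href="/title/tt0111161/" ';

// '/title/' is correctly delimited: "/" opens and closes the pattern,
// so the regex itself is just the word "title".
$count = preg_match_all('/title/', $subject, $matches);

// To match the slashes as well, choose another delimiter such as "#"
// so the slashes in the path need no escaping.
$countHrefs = preg_match_all('#href="(/title/tt\d+/)"#', $subject, $titleLinks);

// When the text to find comes from a variable, wrap it in delimiters explicitly.
// preg_quote() escapes regex metacharacters and the chosen delimiter.
$needle = '/title/';
$pattern = '#' . preg_quote($needle, '#') . '#';
$countQuoted = preg_match_all($pattern, $subject, $quoted);

print_r($titleLinks[1]);
```

Dumping `$pattern` with `var_dump()` right before the call on line 53 is a quick way to check what is actually being passed.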
