
I am trying to extract matches from an HTML file. It contains several statements that look like this:

links(6)  = "chicas-con-juguetes.asp"

I am trying to extract them with this function:

    public static function extract_all_vid_links($html){
        $pattern="links([0-9])  = \"(.*).asp\"";
        preg_match_all( "/$pattern/i", $html ,$out, PREG_SET_ORDER);

        print_r($out);
        //    foreach($out as $values){
        //       echo $values[0]."<br/>";
        //   }
    }

But it is not working. Why?

Modified:

    $pattern="links\([0-9]\)  = \"(.*).asp\"";
    preg_match_all( "/$pattern/i", $html ,$out, PREG_SET_ORDER);

Still doesn't work.

4 Comments
  • You need to escape the ( and ) characters if you don't want them to create a new match group: links\(\d\). Commented Jun 3, 2012 at 12:00
  • hakre, that looks like an answer, not a comment :-) Commented Jun 3, 2012 at 12:02
  • Are there always two spaces? If not, use \s* in their place. Commented Jun 3, 2012 at 12:06
  • I thought about it, and no, it is irrelevant. Jeroen's answer worked. Commented Jun 3, 2012 at 12:08

2 Answers

2

You need to escape the ( and ) characters if you don't want them to create a new match group: links\(\d\).

This fixes the problem as you described it (so your description probably does not match your actual data):

<?php
header('Content-Type: text/plain;');

$html = 'links(6)  = "chicas-con-juguetes.asp"';

// \( and \) match literal parentheses; \. matches a literal dot.
$pattern = "links\(\d\)  = \"(.*)\.asp\"";
$r = preg_match_all("/$pattern/i", $html, $matches);

var_dump($r, $matches);

Output:

int(1)
array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(37) "links(6)  = "chicas-con-juguetes.asp""
  }
  [1]=>
  array(1) {
    [0]=>
    string(19) "chicas-con-juguetes"
  }
}
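
To see why the escaping matters, here is a small sketch that runs the unescaped pattern from the question next to an escaped one on the same sample line. The variable names ($line, $unescaped, $escaped) are only illustrative, and the dot before "asp" is escaped here as an extra precaution that the original pattern did not have.

```php
<?php
// Sample line taken from the question.
$line = 'links(6)  = "chicas-con-juguetes.asp"';

// Unescaped: ( and ) open and close a capture group, so this pattern
// looks for the text "links6", which never occurs in $line.
$unescaped = '/links([0-9])  = "(.*)\.asp"/i';

// Escaped: \( and \) match the literal parentheses around the index.
$escaped = '/links\([0-9]\)  = "(.*)\.asp"/i';

$unescapedCount = preg_match_all($unescaped, $line, $m1);
$escapedCount   = preg_match_all($escaped, $line, $m2);

var_dump($unescapedCount); // number of matches with the unescaped pattern
var_dump($escapedCount);   // number of matches with the escaped pattern
var_dump($m2[1]);          // captured file names, without the extension
```

If the escaped pattern still finds nothing in your real data, the input most likely differs from the sample line (extra whitespace, different quoting, HTML entities), which is worth checking before changing the pattern further.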

4 Comments

I think your answer worked. Why did you use \d and not [0-9]? Does it matter?
No, in this pattern (no /u modifier) \d matches the same characters as [0-9].
@DmitryMakovetskiyd: The string is probably not what you said it is. Perhaps you want to scrape actual HTML and only pasted what you see in the browser? Or does your actual data differ in some other way?
I had to use file_get_contents to get the HTML, but you are right, I forgot to escape the () brackets. That is where the error was.
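
As a quick check of the \d versus [0-9] question, the following sketch runs both forms over a few made-up inputs (the $samples values are illustrative, not from the post). It assumes the patterns are used without the /u modifier, as in the answer above; with /u, \d may also accept digits outside 0-9, and that case is not exercised here.

```php
<?php
// Illustrative inputs; none of these come from the original post.
$samples = ['6', '42', 'x', '', '007'];

foreach ($samples as $s) {
    $withD     = preg_match('/^\d+$/', $s);
    $withClass = preg_match('/^[0-9]+$/', $s);
    // Without the /u modifier both patterns accept the same ASCII digits.
    printf("%-5s \\d: %d  [0-9]: %d\n", var_export($s, true), $withD, $withClass);
}
```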
1
preg_match_all('/links\([0-9]+\)  = "([^.]*?)\.asp"/i', $html, $out, PREG_SET_ORDER);

You need to escape the () characters. Also, I added a + after [0-9] so that numbers above 9 also work.
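
Putting the suggestions from this thread together, a more tolerant version could look like the sketch below. The function name extract_vid_links and the second sample line ("otro-video.asp") are hypothetical, added only for illustration. It assumes PHP 7 or later (for the type declarations) and that the file names never contain a double quote.

```php
<?php
/**
 * Hypothetical helper: returns the .asp targets keyed by their links(N) index.
 */
function extract_vid_links(string $html): array
{
    // \( \)    literal parentheses around the index
    // [0-9]+   one or more digits, so links(10) also matches
    // \s*      any amount of whitespace around "="
    // [^"]*    file name, never crossing the closing quote
    $pattern = '/links\(([0-9]+)\)\s*=\s*"([^"]*\.asp)"/i';

    $links = [];
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // $match[1] is the index, $match[2] the file name
            $links[(int) $match[1]] = $match[2];
        }
    }
    return $links;
}

// First line is from the question; the second is made-up sample data.
$html = <<<'HTML'
links(6)  = "chicas-con-juguetes.asp"
links(12) = "otro-video.asp"
HTML;

print_r(extract_vid_links($html));
```

Using [^"]* instead of .* keeps each match inside its own pair of quotes, which matters when several statements appear on one line.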
