PHP: Get specific links with preg_match_all()

Question

I want to extract specific links from a website.

The links look like this:

<a href="1494761,offer-mercedes-used.html">

The links always have the same form, except for the brand name (mercedes in this case).

This works fine so far but only delivers the first part of the link:

preg_match_all('/((\d{7}),offer-)/s',$inhalt,$results);

And this returns the first link followed by the rest of the whole page :(

preg_match_all('/((\d{7}).*html)/s',$inhalt,$results);

Any ideas?

Note that I use preg_match_all() and not preg_match().

Thanks, Chama

mario · Accepted Answer · 2012-03-24 16:49:56Z

1

While .*? would work (lazy rather than greedy matching), in both cases you should specify a more precise pattern.

Here [\w.-]+ would do. But [^">]+ might also be feasible, if the HTML source is consistent (or you specifically wish to ignore other variations).

preg_match_all('/((\d{7}),offer-[\w.-]+)/s',$inhalt,$results);
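
To make the difference concrete, here is a minimal sketch comparing the greedy pattern from the question, the lazy variant, and the character-class variant. The sample markup in $inhalt is invented for illustration (only the first link comes from the question), and the result variable names are arbitrary.

```php
<?php
// Hypothetical sample page; $inhalt is the variable name used in the question.
$inhalt = '<a href="1494761,offer-mercedes-used.html">A</a>' . "\n"
        . '<a href="1494762,offer-bmw-used.html">B</a>';

// Greedy: .* runs on to the LAST "html" in the input
// (the /s flag lets the dot match newlines too).
preg_match_all('/(\d{7}).*html/s', $inhalt, $greedy);

// Lazy: .*? stops at the first "html" after each seven-digit number.
preg_match_all('/(\d{7}).*?html/s', $inhalt, $lazy);

// Character class: [\w.-]+ cannot run past quotes, spaces or angle brackets.
preg_match_all('/(\d{7}),offer-[\w.-]+/', $inhalt, $precise);

print_r($greedy[0]);   // one long match spanning both links
print_r($lazy[0]);     // one match per link
print_r($precise[0]);  // one match per link
```

With this input the lazy and character-class patterns return the same two links, but the character class is the safer choice because it cannot spill over into surrounding markup even when no later "html" is nearby.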


Mark Achée · Accepted Answer · 2012-03-24 16:52:56Z

1

Trying to parse xml/html with regex generally isn't a good idea, but if you're sure it will always be formatted well, this should return any links in the content.

/<a href="([^">]+)">/

This will match the example pattern you gave more closely, but I'm not sure what variations you might have:

/<a href="([0-9]{7},offer-[a-z]+-used\.html)">/
// [7 numbers],offer-[at least one letter]-used.html
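
As a rough usage sketch of the stricter pattern, the snippet below adds a second capture group for the brand name and uses PREG_SET_ORDER so that each match arrives as one array. The sample markup and the extra group are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical sample; the first and third links have the shape from the question.
$inhalt = '<a href="1494761,offer-mercedes-used.html">Mercedes</a>'
        . '<a href="about.html">About</a>'
        . '<a href="1494762,offer-bmw-used.html">BMW</a>';

// Group 1 captures the whole href value, group 2 only the brand name.
$pattern = '/<a href="(\d{7},offer-([a-z]+)-used\.html)">/';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $inhalt, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m[2], ' => ', $m[1], "\n";
}
```

Note that [a-z]+ would skip brand names containing hyphens, digits or uppercase letters; if such names can occur, the class would need widening, for example to [\w-]+.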
