
I am trying to extract the number 203 from this sample.

Here is the sample I am running the regex against:

<span class="crAvgStars" style="white-space:no-wrap;"><span class="asinReviewsSummary" name="B00KFQ04CI" ref="cm_cr_if_acr_cm_cr_acr_pop_" getargs="{&quot;tag&quot;:&quot;&quot;,&quot;linkCode&quot;:&quot;sp1&quot;}">

<a href="https://www.amazon.com/Moto-1st-Gen-Screen-Protector/product-reviews/B00KFQ04CI/ref=cm_cr_if_acr_cm_cr_acr_img/181-2284807-1957201?ie=UTF8&linkCode=sp1&showViewpoints=1" target="_top"><img src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/customer-reviews/ratings/stars-4-5._CB192238104_.gif" width="55" alt="4.3 out of 5 stars" align="absbottom" title="4.3 out of 5 stars" height="12" border="0" /></a>&nbsp;</span>(<a href="https://www.amazon.com/Moto-1st-Gen-Screen-Protector/product-reviews/B00KFQ04CI/ref=cm_cr_if_acr_cm_cr_acr_txt/181-2284807-1957201?ie=UTF8&linkCode=sp1&showViewpoints" target="_top">203 customer reviews</a>)</span>

Here is the code I am using, which does not work:

preg_match('/^\D*(\d+)customer reviews.*$/',$results[0], $clean_results);
echo "<pre>";
print_r( $clean_results);
echo "</pre>";
//expecting 203

It just returns:

<pre>array ()</pre>
  • '/(\d+) customer reviews\b/' Commented Feb 25, 2016 at 23:15
  • The result of preg_match is an array, so to print it you have to use print_r, not echo. The first parenthesis group is in $clean_results[1] Commented Feb 25, 2016 at 23:17
  • Awesome! This is the answer. I'm happy to mark this! @WiktorStribiżew Commented Feb 25, 2016 at 23:18
  • @fusion3k He's using print_r. It's showing an empty array. Commented Feb 25, 2016 at 23:19
  • Does a faster operation than preg_match exist? Thanks @fusion3k I will be storing it via $clean_results[1] Commented Feb 25, 2016 at 23:21

1 Answer

Your regexp has two problems.

First, the string contains other numbers before the review count (such as 4.3 out of 5 stars and height="12"). Because the pattern is anchored with ^ and \D* matches only non-digit characters, it can succeed only if there are no digits anywhere between the beginning of the string and the number of reviews.

Second, you have no space between (\d+) and customer reviews, but the input string has a space there.
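To see both problems in isolation, here is a small sketch. The `$html` value is a shortened stand-in for the sample (an assumption made for readability, not the full input), and the variable names are only illustrative.

```php
<?php
// Shortened stand-in for the sample HTML; it keeps the earlier digits
// ("4.3", "5", "12") that trip up the anchored \D* prefix.
$html = '<img alt="4.3 out of 5 stars" height="12" /></a>'
      . '(<a href="#" target="_top">203 customer reviews</a>)';

// Original pattern: anchored at ^, \D* cannot cross the "4" in "4.3",
// and there is no space between (\d+) and "customer".
$original = preg_match('/^\D*(\d+)customer reviews.*$/', $html, $m1);

// Adding only the missing space is not enough: the anchored \D* still
// stops at the first digit in the string.
$spaceOnly = preg_match('/^\D*(\d+) customer reviews.*$/', $html, $m2);

// Dropping the anchors and \D* as well lets the engine find the match
// wherever it occurs in the string.
$fixed = preg_match('/(\d+) customer reviews/', $html, $m3);

var_dump($original);       // int(0)
var_dump($spaceOnly);      // int(0)
var_dump($fixed);          // int(1)
var_dump($m3[1] ?? null);  // string(3) "203"
```

Fixing either problem on its own still leaves the pattern failing, which is why both changes are needed.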

There's no need to match the parts of the string before and after the number of customer reviews; just match the part you care about.

preg_match('/(\d+) customer reviews/',$results[0], $clean_results);
$num_reviews = $clean_results[1];
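If the input might not always contain a review count, it may be worth checking the return value of preg_match before reading index 1. The following is a minimal sketch under that assumption; `extractReviewCount` is a hypothetical helper name and `$sample` is a shortened stand-in for the real HTML.

```php
<?php
// Hypothetical helper: returns the review count as an int,
// or null when the pattern does not match.
function extractReviewCount(string $html): ?int
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match('/(\d+) customer reviews/', $html, $matches) === 1) {
        return (int) $matches[1]; // first capture group holds the digits
    }
    return null;
}

// Shortened stand-in for the sample HTML from the question.
$sample = '(<a href="#" target="_top">203 customer reviews</a>)</span>';

var_dump(extractReviewCount($sample));                        // int(203)
var_dump(extractReviewCount('<span>no reviews yet</span>'));  // NULL
```

Note that (\d+) captures only a contiguous run of digits, so a count written with a thousands separator, such as "1,203 customer reviews", would yield 203 with this pattern.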

