I am having a bit of trouble extracting only the number from a specific part of some HTML. I am parsing a page, and the content I get back looks like this:

<div class="priceitem"> 1,098&nbsp;USD <span id="XUwt-price-mb-aE068a15dcca8E168a15dcca8-tooltipIcon" class="tooltip-icon afterPrice info-icon"> <svg class="" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 200" width="100%" height="100%"><use xlink:href="#common-icon-icon-info"></use></svg> </span> <br></div>

I am using simplehtmldom to get the content, so everything inside priceitem is output with it. Can I somehow use preg_match to match a pattern, or preg_replace, to get only the price number, such as 1,098?

The price can change: sometimes it will be only 29 USD, which is output as 29&nbsp;USD, and sometimes it will be 305&nbsp;USD. Above 1,000 it contains a comma, which I don't really need.

Here is my attempt so far:

foreach($html->find('div.priceitem') as $element) {
    $pricenum = preg_match("/([^\s]+)/","", $element->innertext);
    echo $pricenum;
}

2 Answers

Here's a pattern that should get you all possible prices:

(\d{1,3}(?:,\d{1,3})*)+(?=&nbsp;USD)

The idea is that the numbers come in blocks of 1-3 digits; further groups with a leading comma are allowed, but not required, after the first block. &nbsp;USD serves as an anchor.
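
As a rough sketch of how that pattern could be applied with preg_match, here is a small standalone script. The sample strings are made up to resemble the innertext from the question, and the variable names are only illustrative. Note that preg_match expects the pattern, then the subject, then a variable to receive the matches. The whole match ($m[0]) is used here, because a repeated capture group keeps only its last repetition.

```php
<?php
// Hypothetical sample strings, shaped like the innertext shown in the question.
$samples = [
    ' 1,098&nbsp;USD <span class="tooltip-icon"></span> <br>',
    ' 29&nbsp;USD <br>',
    ' 305&nbsp;USD <br>',
];

// The pattern from this answer, wrapped in delimiters.
$pattern = '/(\d{1,3}(?:,\d{1,3})*)+(?=&nbsp;USD)/';

$prices = [];
foreach ($samples as $text) {
    // preg_match returns 1 when the pattern matches and fills $m.
    if (preg_match($pattern, $text, $m) === 1) {
        // $m[0] is the whole match, e.g. "1,098" (comma still included).
        $prices[] = $m[0];
    }
}

print_r($prices);
```

Inside the loop from the question, `$text` would be `$element->innertext` instead of a hard-coded string.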


However, if you are only interested in the integer value, removing the comma is still the best option: str_replace(',', '', $string);

1 Comment

This is awesome, thank you. I had just found a solution using a combination of preg_match and preg_replace, but this is more foolproof.

For int values, it makes more sense to remove the commas first and then preg_match for /\d+/.
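
A minimal sketch of that approach might look like the following. The helper name extractIntPrice is hypothetical (it is not part of PHP or simplehtmldom), and the code assumes the price is the first run of digits in the string, which holds for the markup in the question because the price comes before the span.

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function extractIntPrice(string $text): ?int
{
    // Drop thousands separators first so "1,098" becomes "1098".
    $clean = str_replace(',', '', $text);

    // Assumption: the first run of digits in the string is the price.
    if (preg_match('/\d+/', $clean, $m) === 1) {
        return (int) $m[0];
    }

    return null; // no digits found
}

var_dump(extractIntPrice(' 1,098&nbsp;USD <br>'));
var_dump(extractIntPrice(' 29&nbsp;USD <br>'));
```

Returning null when nothing matches keeps a missing price distinguishable from a price of 0.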
