I am trying to parse HTML with simple_html_dom.php. The HTML I am trying to parse is shown below. I can successfully grab each product name: Product 1, Product 2, Product 3, etc.

I would also like to grab the price inside the element with id itemprice_0 for each product. This is where I am running into issues. Here is my code:

<?php
require_once 'simple_html_dom.php';

$html = file_get_html('https://www.webaddress.com');

foreach($html->find('span.productName') as $e)
echo $e.'<br />'; //successfully displays all product names

foreach($html->find('#itemprice_0') as $e)
echo $e; //doesn't display the item prices

foreach($html->find('.dollar') as $e)
echo $e; //doesn't display the dollar amounts
?>

Here is the HTML:

<span class="productName">Product 1</span>  

<p class="price">
<strike>
<span class="dollar-symbol">$</span>  
<span class="dollar">15</span><span class="dot">.</span>  
<span class="cents">99</span></strike>
</p>  

<p class="salePrice" id='itemprice_0'>  
<span class="dollar-symbol">$</span>  
<span class="dollar">13</span><span class="dot">.</span>  
<span class="cents">99</span>  
</p>
  • I think you're missing the innertext. Try echo $e->innertext; Commented Oct 3, 2018 at 14:28
  • foreach($html->find('.salePrice') as $e) echo $e->children(2)->plainText; Commented Oct 3, 2018 at 14:32
  • Thank you both for providing suggestions. Both innertext and children(2)->plainText were unsuccessful. Commented Oct 3, 2018 at 15:54

3 Answers

1

I accessed the salePrice class and echoed out the dollar amount.

foreach($html->find('span.productName') as $e) {
    echo $e.'<br />'; //successfully displays all product names
}

// Braces keep both steps inside each loop; plaintext drops the tags,
// and preg_replace removes the whitespace between the spans.
foreach($html->find('p.price') as $e) {
    echo 'Regular Price: ' . preg_replace('/\s+/', '', $e->plaintext) . '<br />';
}

foreach($html->find('p.salePrice') as $e) {
    echo 'Sale Price: ' . preg_replace('/\s+/', '', $e->plaintext) . '<br />';
}

I also removed the whitespace.

Result:

Product 1
Regular Price: $15.99
Sale Price: $13.99

I also made the loop look for the itemprice_0 id only, and got the same result:

foreach($html->find('p[id=itemprice_0]') as $e) {
    echo 'Sale Price: ' . preg_replace('/\s+/', '', $e->plaintext) . '<br />';
}

Same Result:

Product 1
Regular Price: $15.99
Sale Price: $13.99

Is this what you were looking for?


2 Comments

Hmmm... This is what the code returned for me (I understand there are missing line breaks): "Product 1 Regular Price: Product 1Sale Price: Product 1". Is it possible to grab the price which contains the id='itemprice_0'?
Edited my answer with the itemprice_0 id.
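
A side note on the output quoted in the comment above. A foreach without braces repeats only the single statement that follows it, and the loop variable keeps its last value once the loop ends. The sketch below uses plain arrays as hypothetical stand-ins for the results of $html->find(...), and it assumes the price selector matched nothing. Under that assumption it produces a "Sale Price: Product 1" line much like the one reported. It is not a confirmed diagnosis of the real page.

```php
<?php
// Hypothetical stand-ins for what $html->find(...) might return.
$names  = ['Product 1'];
$prices = []; // assumption: the price selector matched nothing

function labelWithoutBraces(array $names, array $prices): string
{
    $out = '';
    foreach ($names as $e)
        $out .= $e . "\n";

    // Only the next statement belongs to this loop.
    foreach ($prices as $e)
        $e = str_replace(' ', '', $e);
    // This line runs once, after the loop, with whatever $e last held.
    $out .= 'Sale Price: ' . $e . "\n";

    return $out;
}

function labelWithBraces(array $names, array $prices): string
{
    $out = '';
    foreach ($names as $e) {
        $out .= $e . "\n";
    }
    // Both steps now run once per matched price, and not at all if none matched.
    foreach ($prices as $e) {
        $out .= 'Sale Price: ' . str_replace(' ', '', $e) . "\n";
    }
    return $out;
}

echo labelWithoutBraces($names, $prices);
echo labelWithBraces($names, $prices);
```

With braces, an empty match prints nothing for the price, which makes it easier to see that the selector itself is the thing to investigate.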
1

itemprice_0 is an id, and ids should be unique. If you want to select more than one element, use a class selector. In simple_html_dom you can fetch nested elements like this (untested):

<?php
require_once 'simple_html_dom.php';

$html = file_get_html('https://www.webaddress.com');

foreach($html->find('.salePrice') as $price){
    // Passing an index makes find() return a single element instead of an array
    echo $price->find('.dollar', 0)->plaintext;
    echo '.';
    echo $price->find('.cents', 0)->plaintext;
}

2 Comments

itemprice_0 is not unique in this HTML. It appears within each product, whether it is within class="price" or class="salePrice". What would your suggestion be for grabbing the price which contains the id="itemprice_0"?
The id attribute must always be unique in HTML. Could you edit your question and include the parent HTML element?
0

You can use the following solution to solve your problem:

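$domd = new DOMDocument();
// $html must be the page source as a string; @ hides warnings from invalid markup
@$domd->loadHTML($html);
$xp = new DOMXPath($domd);
// Note: contains() also matches class="dollar-symbol"
foreach ($xp->query('//*[contains(@class,"dollar")]') as $e) {
    var_dump($e->textContent);
}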
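
Because contains(@class,"dollar") also matches the dollar-symbol spans, a narrower query may be closer to what the question asks for. Here is a minimal sketch. It assumes the markup looks like the sample in the question and that the id may be repeated across products. The function name extractSalePrices and the $sample string are made up for illustration. It selects every p with id="itemprice_0" by attribute, then reads the dollar and cents spans inside each one using an exact class match.

```php
<?php
// Hypothetical helper: returns sale prices such as "13.99" from an HTML string.
function extractSalePrices(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings (e.g. repeated ids) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $prices = [];
    // An attribute test matches every element carrying the id, even if it repeats.
    foreach ($xpath->query('//p[@id="itemprice_0"]') as $p) {
        // @class="dollar" is an exact match, so "dollar-symbol" is skipped.
        $dollar = $xpath->evaluate('string(.//span[@class="dollar"])', $p);
        $cents  = $xpath->evaluate('string(.//span[@class="cents"])', $p);
        $prices[] = trim($dollar) . '.' . trim($cents);
    }
    return $prices;
}

// Sample markup copied from the question.
$sample = <<<'HTML'
<span class="productName">Product 1</span>
<p class="price"><strike>
<span class="dollar-symbol">$</span>
<span class="dollar">15</span><span class="dot">.</span>
<span class="cents">99</span></strike></p>
<p class="salePrice" id='itemprice_0'>
<span class="dollar-symbol">$</span>
<span class="dollar">13</span><span class="dot">.</span>
<span class="cents">99</span>
</p>
HTML;

print_r(extractSalePrices($sample));
```

If the prices are injected by JavaScript after the page loads, neither this nor simple_html_dom will see them in the downloaded source, so it is worth checking the raw HTML first.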
