I'm trying to get the value of Internet Data Volume Balance from the HTML below. The script should echo 146.30 MB.

I'm new to all of this and am still working through the tutorials.

How can this be done?

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Account Status</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text">You exceeded your allowed credit.</FONT></div></td>
</tr> 

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Period Free Time Remaining</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text">0:00:00 hours</FONT></div></td>
</tr> 

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Internet Data Volume Balance</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text" style="text-transform:none;">146.30 MB</FONT></div></td>
</tr> 
  • I think you'll find that while you can use regex to parse HTML, it's not usually advisable. DOM or SimpleXML will likely be much better options in this situation. Commented Apr 24, 2012 at 12:49
  • Could you point me to a good resource? Commented Apr 24, 2012 at 12:50
  • stackoverflow.com/questions/1732348/… Simple HTML DOM Parser is exactly what the name suggests: a simple way to parse HTML (simplehtmldom.sourceforge.net). There are quite a few other HTML parsing options for PHP too. Commented Apr 24, 2012 at 12:51

2 Answers

If you already have phpQuery installed, or are willing to install it, you can use that:

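// Load the saved page into phpQuery
phpQuery::newDocumentFileHTML('htmlpage.html');
// :eq() is zero-based, so the sixth cell in the snippet above is index 5
echo pq('td:eq(5)')->text();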

PHP can interact with the DOM just like JavaScript can. This is far more robust than parsing the markup with regular expressions, which most people will tell you is the wrong approach anyway:

Loading from an HTML File

// Start by creating a new document
$doc = new DOMDocument();
// I've loaded the table into an external file, and am loading it into the $doc
$doc->loadHTMLFile( 'htmlpage.html' );
// Since you have six table cells, I'm calling up all of them
$cells = $doc->getElementsByTagName("td");
// I'm grabbing the sixth cell's textContent property
echo $cells->item(5)->textContent;

This code will output "146.30 MB" to the screen.

Loading from a String

If you have the HTML stored in a string, you can load that into your document as well. Just swap the method that loads a file for the one that loads a string:

$str = "<table><tr><td>Foo</td></tr>...</table>";
$doc->loadHTML( $str );

We would then proceed with the same code as above to select the cells, and show their textContent in the output.
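
Picking the cell by position breaks as soon as the page gains or loses a row. A minimal sketch of an alternative, assuming the label text stays stable: find the cell that holds the label with DOMXPath and read the cell next to it. The helper name findValueByLabel and the trimmed-down $html are made up for illustration and are not part of the original page.

```php
<?php
// Hypothetical helper: return the text of the <td> that follows the <td> containing $label.
// Assumes $label contains no double quotes, since it is placed inside an XPath string literal.
function findValueByLabel(string $html, string $label): ?string
{
    $doc = new DOMDocument();
    // @ silences parser warnings about legacy markup for this one call
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Match the cell whose whitespace-normalized text equals the label,
    // then step to its first following <td> sibling in the same row
    $query = sprintf('//td[normalize-space(.) = "%s"]/following-sibling::td[1]', $label);
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null; // label not found
    }
    return trim($nodes->item(0)->textContent);
}

// Shortened copy of the table from the question (attributes trimmed)
$html = <<<HTML
<table>
<tr>
    <td><B><FONT class="tplus_text">Period Free Time Remaining</FONT></B></td>
    <td><FONT class="tplus_text">0:00:00 hours</FONT></td>
</tr>
<tr>
    <td><B><FONT class="tplus_text">Internet Data Volume Balance</FONT></B></td>
    <td><FONT class="tplus_text">146.30 MB</FONT></td>
</tr>
</table>
HTML;

echo findValueByLabel($html, 'Internet Data Volume Balance'), PHP_EOL;
```

The function returns null when the label is missing, so the caller can tell "not found" apart from an empty cell.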

Check out the DOMDocument Class.

3 Comments

I am using curl to get the content of the page. It is a protected page. Can I load the output from curl directly into loadHTMLFile?
@devilived Yes, you can load the HTML from a string too. Use $doc->loadHTML($str) for that, where $str is your HTML.
It works, thanks. But I am getting a few warnings:

Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: expecting ';' in Entity, line: 295 in /home/premiu59/public_html/phpcurl/scraper.php on line 27
Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Opening and ending tag mismatch: tr and table in Entity, line: 484 in /home/premiu59/public_html/phpcurl/scraper.php on line 27
146.30 MB
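
Warnings like these come from libxml complaining about malformed markup in the fetched page. The value is still extracted. One common way to keep them off the screen is libxml_use_internal_errors(), which makes libxml collect its errors instead of raising PHP warnings. A minimal sketch, assuming a made-up $str that is deliberately sloppy (a bare "&" and an unclosed row); it only loosely imitates the real page:

```php
<?php
// Hypothetical input: a bare "&" and a <tr> that is never closed
$str = '<table><tr><td>Usage & Balance</td><td>146.30 MB</td></table>';

// Collect libxml errors internally instead of emitting PHP warnings;
// the call returns the previous setting so it can be restored afterwards
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($str);

// Grab whatever the parser complained about (useful for logging), then clear the buffer
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Same selection approach as in the answer: take a cell by position
$cells = $doc->getElementsByTagName('td');
echo $cells->item(1)->textContent, PHP_EOL;
```

How many entries end up in $errors depends on the libxml version, so treat that array as diagnostic output rather than something to rely on.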
