
I am trying to extract the total yearly solar irradiation, along with other values, from a table that I fetch with curl from the European PVGIS service.

The table I get is:

<table class=data_table border="1" width="300" >
<tr> <td> Jan </td><td align="right">2.27</td><td align="right">70.3</td><td align="right">2.86</td><td align="right">88.5</td></tr>
<tr> <td> Feb </td><td align="right">2.79</td><td align="right">78.0</td><td align="right">3.56</td><td align="right">99.7</td></tr>
<tr> <td> Mar </td><td align="right">3.59</td><td align="right">111</td><td align="right">4.74</td><td align="right">147</td></tr>
<tr> <td> Apr </td><td align="right">4.23</td><td align="right">127</td><td align="right">5.68</td><td align="right">171</td></tr>
<tr> <td> May </td><td align="right">4.46</td><td align="right">138</td><td align="right">6.13</td><td align="right">190</td></tr>
<tr> <td> Jun </td><td align="right">4.53</td><td align="right">136</td><td align="right">6.38</td><td align="right">191</td></tr>
<tr> <td> Jul </td><td align="right">4.74</td><td align="right">147</td><td align="right">6.70</td><td align="right">208</td></tr>
<tr> <td> Aug </td><td align="right">4.59</td><td align="right">142</td><td align="right">6.53</td><td align="right">202</td></tr>
<tr> <td> Sep </td><td align="right">4.32</td><td align="right">130</td><td align="right">5.96</td><td align="right">179</td></tr>
<tr> <td> Oct </td><td align="right">3.63</td><td align="right">113</td><td align="right">4.87</td><td align="right">151</td></tr>
<tr> <td> Nov </td><td align="right">2.64</td><td align="right">79.1</td><td align="right">3.41</td><td align="right">102</td></tr>
<tr> <td> Dec </td><td align="right">2.15</td><td align="right">66.5</td><td align="right">2.72</td><td align="right">84.3</td></tr>
<tr><td colspan=5> </td></tr>
<tr><td><b> Yearly average </b></td><td align="right"><b>3.67 </b></td><td align="right"><b>111 </b></td></td><td align="right"><b>4.97 </b></td><td align="right"><b>151 </b></td></tr>
<tr><td><b>Total for year</b></td><td align="right" colspan=2 ><b>  1340 </b> </td> <td align="right" colspan=2 ><b>  1810 </b> </td> </tr>
</table>

As you can see, the total values are in the last row (<tr>) of the table. Specifically, the total yearly value I need is in the second cell (<td>) of that row.

Now, I have tried to use txt2reg tools to build a regular expression, but without success, as I don't know how to target the last row of the above-mentioned table.

If I strip all the TR and TD tags I get one long string of numbers, but at that point I can no longer tell the values apart.

Do you guys have some suggestions?

Thank you very much.

EDIT

I did the following, but I get an error. The error is:

Catchable fatal error: Argument 1 passed to DOMXPath::__construct() must be an instance of DOMDocument, instance of DOMElement given in C:\Users\test\www2\test_pvgis.php on line 49

And the code is:

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($varResponse);

$table = $doc->getElementsByTagName('table')->item(1); 

print_r($table);


$xpath = new DOMXpath($table);

$lastRow = $xpath->query("(//tr)[last()]");

// look for td elements inside the last row we isolated above
// path for td elements is relative
$cells = $xpath->query('./td',$lastRow[0]);

// you can also store the values for later use
foreach($cells as $key=>$cell){
    //we are ignoring the first key, since it holds the "Total for year" bit

    if ($key != 0){
        $store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
    }
}
print_r($store);

The error is raised here: $xpath = new DOMXpath($table); but I have no idea why. Any clue?

  • Don't use regex. PHP offers native implementations for handling HTML: stackoverflow.com/questions/3577641/… Commented Oct 30, 2015 at 11:19
  • Thank you!! I didn't know about it. Is there a way to target the content to be loaded by a specific element class or ID? Commented Oct 30, 2015 at 11:26
  • remove parenthesis in the XPath query. Commented Oct 30, 2015 at 12:31

1 Answer


Edit

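Assuming the page contains more than one table, select the relevant one by its index. Note that item() is zero-based, so item(1) picks the second table; use item(0) if the first one is the relevant one.
You need to pass a DOMDocument instance, not a DOMElement, to the DOMXpath constructor.
So create the object from $doc: $xpath = new DOMXpath($doc);
Then, when you query for the last row, pass the $table element as the second parameter so the query is evaluated relative to that table.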


Here's an example using DOMDocument and DOMXpath

// start edit
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($varResponse);

$table = $doc->getElementsByTagName('table')->item(1); 

print_r($table);

$xpath = new DOMXpath($doc);

$lastRow = $xpath->query("(./tr)[last()]",$table);
// end edit

// look for td elements inside the last row we isolated above
// path for td elements is relative
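$cells = $xpath->query('./td',$lastRow->item(0)); // item(0), not [0]: fixes 'Cannot use object of type DOMNodeList as array'

// you can also store the values for later use
$store = array(); // initialize so print_r() below never sees an undefined variable
foreach($cells as $key=>$cell){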
    //we are ignoring the first key, since it holds the "Total for year" bit

    if ($key != 0){
        $store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
    }
}
print_r($store);
/*
outputs
Array
(
    [0] => 1340
    [1] => 1810
)
*/
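
Regarding the comment about targeting a specific class or ID: since the table in the question carries class=data_table, one option is to let XPath pick the table by that attribute instead of by position. The following is only a minimal sketch under that assumption. The $varResponse string here is a shortened, hypothetical stand-in for the real curl response, and $totals is just an illustrative variable name.

```php
<?php
// Hypothetical stand-in for the curl response; only the table markup matters here.
$varResponse = '<html><body><table><tr><td>other table</td></tr></table>'
    . '<table class=data_table border="1" width="300">'
    . '<tr><td> Dec </td><td align="right">2.15</td><td align="right">66.5</td>'
    . '<td align="right">2.72</td><td align="right">84.3</td></tr>'
    . '<tr><td><b>Total for year</b></td><td align="right" colspan=2><b>  1340 </b> </td>'
    . ' <td align="right" colspan=2><b>  1810 </b> </td> </tr>'
    . '</table></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings about non-strict HTML
$doc->loadHTML($varResponse);

// DOMXPath must be built from the DOMDocument, not from an element
$xpath = new DOMXPath($doc);

// Last row of the table with class "data_table", skipping the first cell
// (the "Total for year" label). This assumes the class attribute holds
// exactly that one value.
$cells = $xpath->query('//table[@class="data_table"]//tr[last()]/td[position() > 1]');

$totals = array();
foreach ($cells as $cell) {
    $totals[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
}
print_r($totals);
```

For the table in the question this should yield the same two yearly totals as the code above. If the element had an id instead, a predicate such as [@id="..."] would work the same way.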

5 Comments

Cool... I followed your example and edited my question. It seems I have a problem with the code that I cannot figure out.
I edited my answer to match your last question update.
I don't get it...I still get this error: Fatal error: Cannot use object of type DOMNodeList as array in C:\Users\test\www2\test_pvgis.php on line 54
$cells = $xpath->query('./td',$lastRow[0]);
