PHP DOM parser get data from a span

Question

I am trying to use the DOM to get the days and times, and also the rooms, from the HTML below. My script already extracts everything else; these two are the ones I'm having trouble with:

                    </td><td class="call">
                    <span>12549<br/></span><a href="http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet?bookstore_id-1=584&term_id-1=201190&crn-1=12549" target="_blank">View Book Info</a>
                    </td><td>
                    <span id="ctl10_gv_sectionTable_ctl03_lblDays">F:1000AM - 1125AM<br />T:230PM - 355PM</span>


                    </td><td class="room">
                    <span id="ctl10_gv_sectionTable_ctl03_lblRoom">KUPF106<br />KUPF106</span>
                    </td><td class="status"><span id="ctl10_gv_sectionTable_ctl03_lblStatus" class="red">Closed</span></td><td class="max">20</td><td class="now">49</td><td class="instructor">
                    <a href="https://directory.njit.edu/PersDetails.aspx?persid=SCHOENKA" target="_blank">Schoenebeck Kar</a>
                    </td><td class="credits">3.00</td>

        </tr><tr class="sectionRow">
            <td class="section">
                    101<br />

Here is what I have so far for finding the days:

    $tracker = 0;
    // DAYS AND TIMES
    $number = 3;
    $digit = "0";
    while ($tracker < $numSections) {
        $strNum = strval($number);
        $zero = strval($digit);
        $start = "ctl10_gv_sectionTable_ctl";
        $end = "_lblDays";
        $id = $start.$zero.$strNum.$end;
        //$days = $html->find('span.$id');
        $days=$html->getElementByTagName('span')->getElementById($id);
        echo "Days : ";
        echo $days[0] . '<br>';

        $tracker++;
        $number++;
        if ($number > 9) {
            $digit = "1";
            $number = 0;
        }
    }

As you can see from the HTML, the site I'm parsing has fairly unique IDs for some of its spans (e.g. ctl10_gv_sectionTable_ctl03_lblRoom). I only posted one section's HTML block; the markup for the next class section is identical except for the "ctl03" part. That is what the extra code above handles, so don't be thrown off by it.

I've tried a few different ways but cannot seem to get the days (e.g. "1000AM - 1125AM") or the rooms (e.g. KUPF106). The rest is pretty simple to grab, but these two don't have class identifiers or even a td identifier. I think I just need to know how to use the value in $id as the specific span id I am looking for. If so, can someone show me how to do that?

Francis Avila · Accepted Answer · 2011-11-29 07:29:37Z

2

This:

    $html->getElementByTagName('span')->getElementById($id);

makes no sense. In PHP's DOM extension the method is getElementsByTagName (plural), and it returns a DOMNodeList, which does not have a getElementById method.

I think you mean $html->getElementById($id), but I can't be sure because I don't know what $html is.

Once you have the element, you can get the text value with $element->textContent if you don't need to walk among the text nodes.
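As a minimal sketch of that suggestion, assuming the page has been loaded into a DOMDocument (the question's $html turned out to be a simple_html_dom object, so this is the built-in DOM route instead): the $markup string below is a hypothetical fragment trimmed from the question's HTML, and the variable names are made up for illustration. Because the `<br />` contributes no text, textContent runs the two meeting times together, so the sketch also walks the text nodes to keep them apart.

```php
<?php
// Hypothetical sample: a single cell trimmed from the HTML in the question.
$markup = '<table><tr><td>'
    . '<span id="ctl10_gv_sectionTable_ctl03_lblDays">F:1000AM - 1125AM<br />T:230PM - 355PM</span>'
    . '</td></tr></table>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($markup);
libxml_clear_errors();

$span = $dom->getElementById('ctl10_gv_sectionTable_ctl03_lblDays');

$allText = '';
$meetings = [];
if ($span !== null) {
    // textContent concatenates every descendant text node.
    $allText = $span->textContent;

    // Walking the child nodes keeps the pieces on either side of <br /> separate.
    foreach ($span->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE && trim($child->nodeValue) !== '') {
            $meetings[] = trim($child->nodeValue);
        }
    }
}

echo $allText, "\n";
print_r($meetings);
```

The null check matters because getElementById returns null when no element has that id.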

Have you considered using DOMXPath for your parsing task? It's probably much easier and clearer.
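A rough sketch of what that could look like, assuming the only part of the id that changes between sections is the two-digit "ctlNN" piece described in the question. The second row and its room value are invented for illustration, and sprintf's %02d stands in for the question's manual zero-padding.

```php
<?php
// Hypothetical sample: two sections whose ids differ only in the ctlNN part.
// The second row (ctl04, FMH203) is made up for this sketch.
$markup = '<table>'
    . '<tr><td><span id="ctl10_gv_sectionTable_ctl03_lblRoom">KUPF106<br />KUPF106</span></td></tr>'
    . '<tr><td><span id="ctl10_gv_sectionTable_ctl04_lblRoom">FMH203</span></td></tr>'
    . '</table>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$rooms = [];
for ($n = 3; $n <= 4; $n++) {
    // %02d zero-pads the counter: 3 becomes "03", 12 stays "12".
    $id = sprintf('ctl10_gv_sectionTable_ctl%02d_lblRoom', $n);

    // Match the span whose id attribute equals the generated id.
    $nodes = $xpath->query("//span[@id='$id']");
    if ($nodes !== false && $nodes->length > 0) {
        $rooms[$id] = $nodes->item(0)->textContent;
    }
}

print_r($rooms);
```

Checking the node list length before calling item(0) avoids dereferencing null when a section is missing.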


5 Comments

tomsseisums Over a year ago

I'd avoid the statement about DOMXPath being easier, not to mention about it being cleaner. It is more powerful, but easier? Huh...

user1070764 Over a year ago

Yeah, I figured that line wasn't going to do what I wanted; it was a last attempt at getting it. And $html is the HTML of whatever site I need: "$html = file_get_html($fp);". I did look into XPath a little and it didn't seem easier, but I'm going to try your suggestion now, thanks.

Francis Avila Over a year ago

@Tom, I think XPath is both easier and clearer. Using the DOM is a mess for anything more complex than getElementById.

Francis Avila Over a year ago

@user1070764, is $html really just a string? You need to load that into a DOMDocument! How is any of your other parsing working?

user1070764 Over a year ago

@francisAvilla, about $html: I guess so. After trying DOMDocument and XPath a few different ways without getting them to work with what I was doing, I found simple_html_dom.php, which worked like a charm without any examples of, or need for, a DOMDocument. On another note, your solution worked, thank you. I didn't even need the textContent line, so it was just that one line; I really was overthinking it. Thanks again.

pguardiario · Answer · 2011-11-29 09:30:27Z

0

Simple HTML DOM should be avoided unless you're using PHP version 4 or earlier. The built-in DOM functions in PHP 5 use the much more reliable libxml2 library.

The proper way to iterate over that HTML is to first identify the rows to iterate over and then write XPath expressions that pull the data relative to each row.

    // Parse the raw HTML string; the @ suppresses warnings about malformed markup.
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Each section is one <tr class="sectionRow">; query its spans relative to that row.
    foreach ($xpath->query("//tr[@class='sectionRow']") as $row) {
        echo $xpath->query(".//span[contains(@id,'Days')]", $row)->item(0)->nodeValue."\n";
        echo $xpath->query(".//span[contains(@id,'Room')]", $row)->item(0)->nodeValue."\n";
        echo $xpath->query(".//span[contains(@id,'Status')]", $row)->item(0)->nodeValue."\n";
    }
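A slightly more defensive variant of the same row-relative idea might look like the sketch below. It assumes some rows may lack one of the spans, in which case item(0) would be null, and it selects the spans' text() nodes so the values on either side of a `<br />` stay separate. The helper name spanTexts and the sample $markup are hypothetical, not part of the original page or any library.

```php
<?php
// Hypothetical sample: one full row from the question plus a row with no matching spans.
$markup = '<table>'
    . '<tr class="sectionRow">'
    . '<td><span id="ctl10_gv_sectionTable_ctl03_lblDays">F:1000AM - 1125AM<br />T:230PM - 355PM</span></td>'
    . '<td class="room"><span id="ctl10_gv_sectionTable_ctl03_lblRoom">KUPF106<br />KUPF106</span></td>'
    . '<td class="status"><span id="ctl10_gv_sectionTable_ctl03_lblStatus" class="red">Closed</span></td>'
    . '</tr>'
    . '<tr class="sectionRow"><td class="section">101</td></tr>'
    . '</table>';

// Hypothetical helper: trimmed text nodes of spans under $row whose id contains $suffix.
function spanTexts(DOMXPath $xpath, DOMNode $row, string $suffix): array
{
    $texts = [];
    $nodes = $xpath->query(".//span[contains(@id,'$suffix')]/text()", $row);
    foreach ($nodes as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts; // Empty array when the row has no such span.
}

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$sections = [];
foreach ($xpath->query("//tr[@class='sectionRow']") as $row) {
    $sections[] = [
        'days'   => spanTexts($xpath, $row, 'Days'),
        'rooms'  => spanTexts($xpath, $row, 'Room'),
        'status' => spanTexts($xpath, $row, 'Status'),
    ];
}

print_r($sections);
```

Returning arrays rather than echoing keeps the parsing separate from the output, which may help if this becomes part of a larger script.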


1 Comment

user1070764 Over a year ago

Thanks for that. For now I just want this to work because it's a small part of a bigger project, but I am going to want to optimize it, so thanks for this example.
