
Hi, I am screen-scraping a weather website whose divs use inline styles and have no class or id. Here is their code:

<div class="TodaysForecastContainer">

                    <div class="TodaysForecastContainerInner">
                        <div style="font-size:12px;"><u>This morning</u></div>
                        <div style="position:absolute;top:17px;left:3px;">
                            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                                <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">        
                            </a>                    </div>
                        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
                            Sunny Breaks                            </div>
                    </div>

                    <div class="TodaysForecastContainerInner">
                        <div style="font-size:12px;"><u>This afternoon</u></div>
                        <div style="position:absolute;top:17px;left:3px;">
                            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                                <img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">       
                            </a>                    </div>
                        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
                            Mix of Sun and Cloud                            </div>
                    </div>

The problem is the absolutely positioned inline styles, and the divs have no class or id. I was hoping I could add a class name and remove the inline style on the div with "This morning", on the div containing the image (also removing the link), and on the div with the description (e.g. Sunny Breaks). I want to change every TodaysForecastContainerInner this way, since there are about 4 forecasts, making each one similar to:

<div class="day">This morning</div><div class="thumbnail"><img src="sample.jpg"></div><div class="description">Sunny Breaks</div>

I was using:

foreach($html->find('.TodaysForecastContainerInner div') as $e)
echo $e->innertext . '<br>';

which removes all the divs, leaving me with the u and img tags. I just can't style the div with the description; I use the img and u tags to style the other two divs. I'm just a beginner at PHP. I hope someone can give me advice. Thank you so much.

2 Answers


Check out the phpQuery library. It can do jQuery-like manipulation using PHP. This code essentially accomplishes what you are trying to do:

<?php

include 'phpQuery-onefile.php';

$text = <<<EOF
<div class="TodaysForecastContainer">
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This morning</u></div>
        <div style="position:absolute;top:17px;left:3px;">
                <a href="forecastPublicExtended.asp#Period0" target="_blank">
                        <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">        
                </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Sunny Breaks
        </div>
    </div>
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This afternoon</u></div>
        <div style="position:absolute;top:17px;left:3px;">
            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                <img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">       
            </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Mix of Sun and Cloud
        </div>
    </div>
</div>
EOF;

$doc = phpQuery::newDocumentHTML( $text );

$containers = pq('.TodaysForecastContainerInner', $doc);
foreach( $containers as $container ) {
    $div = pq('div', $container);

    $div->eq(0)->removeAttr('style')->addClass('day')->html( pq( 'u', $div->eq(0) )->html() );  
    $div->eq(1)->removeAttr('style')->addClass('thumbnail')->html( pq( 'img', $div->eq(1))->removeAttr('height')->removeAttr('width')->removeAttr('alt') );
    $div->eq(2)->removeAttr('style')->addClass('description');  
}

print $doc;

Result:

<div class="TodaysForecastContainer">
  <div class="TodaysForecastContainerInner">
    <div class="day">This morning</div>
    <div class="thumbnail"><img src="./images/wimages/b_cloudy.gif"></div>
    <div class="description">
      Sunny Breaks
    </div>
  </div>
  <div class="TodaysForecastContainerInner">
    <div class="day">This afternoon</div>
    <div class="thumbnail"><img src="./images/wimages/b_pcloudy.gif"></div>
    <div class="description">
      Mix of Sun and Cloud
    </div>
  </div>
</div>
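If installing phpQuery is not an option, the same restructuring can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is a swapped-in technique, not the approach used in the answer above. It assumes each TodaysForecastContainerInner holds exactly three child divs in the order day, image, description; the names $classes and $out are made up for the illustration, and the input is a shortened copy of the markup from the question.

```php
<?php
// Shortened copy of the scraped markup (one forecast block only).
$html = <<<EOF
<div class="TodaysForecastContainer">
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This morning</u></div>
        <div style="position:absolute;top:17px;left:3px;">
            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">
            </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Sunny Breaks
        </div>
    </div>
</div>
EOF;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Assumption: the three child divs always appear in this order.
$classes = ['day', 'thumbnail', 'description'];
$out = '';

foreach ($xpath->query('//div[@class="TodaysForecastContainerInner"]') as $container) {
    $i = 0;
    // './div' selects only the direct child divs of this container.
    foreach ($xpath->query('./div', $container) as $div) {
        if (!isset($classes[$i])) {
            break;                  // ignore any unexpected extra divs
        }
        $img = $xpath->query('.//img', $div)->item(0);
        if ($img !== null) {
            // Keep only the image source; the link, size and alt attributes are dropped.
            $content = '<img src="' . htmlspecialchars($img->getAttribute('src')) . '">';
        } else {
            // Plain text of the div, without the <u> wrapper or surrounding whitespace.
            $content = htmlspecialchars(trim($div->textContent));
        }
        $out .= '<div class="' . $classes[$i] . '">' . $content . '</div>' . "\n";
        $i++;
    }
}

echo $out;
```

Rather than editing the scraped document in place, this version builds fresh markup, so none of the inline styles or the link can leak through. For the shortened input above it should print one day, one thumbnail and one description div, each on its own line.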

2 Comments

Hi, thanks for your answer. I'm getting a syntax error on $text = <<<EOF. By the way, what is EOF? Sorry to be so new at this.
Hrm...I just tried it again by pasting and copying and it worked for me. No biggie, though. The "$text = <<<EOF ... EOF;" captures everything between <<<EOF and EOF; and assigns it as a string into $text. That was just a quick example. As long as you have the HTML in a variable as a string, the phpQuery code should work. The key piece is: $doc = phpQuery::newDocumentHTML( $text ); You can populate $text however you like.
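To illustrate what the heredoc syntax does, here is a minimal sketch; the variable $same exists only for the comparison. The heredoc assigns exactly the same string that an ordinary quoted string would. One possible cause of a syntax error like the one reported (an assumption, since the failing file is not shown) is whitespace around the closing marker: in PHP versions before 7.3 the closing EOF; must start in the first column with nothing else on that line.

```php
<?php
// Heredoc: everything between the "<<<EOF" line and the closing "EOF;" line
// becomes one string. The closing marker is kept in the first column here,
// which is required in PHP versions before 7.3.
$text = <<<EOF
<div class="TodaysForecastContainerInner">
    <div style="font-size:12px;"><u>This morning</u></div>
</div>
EOF;

// The same content written as an ordinary double-quoted string
// (assuming this file uses Unix "\n" line endings).
$same = "<div class=\"TodaysForecastContainerInner\">\n"
      . "    <div style=\"font-size:12px;\"><u>This morning</u></div>\n"
      . "</div>";

var_dump($text === $same);
```

Copying code from a web page can add spaces before or after the closing marker, so that is worth checking first when the paste does not parse.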

It's easier to do it on the client than on the server.

This jQuery + JavaScript will clear the inline styles and apply a class name to each div:

$(document).ready(function() {
    var target = $('.TodaysForecastContainerInner div');
    for (var x = 0; x < target.length; x++) {
        target.eq(x).attr('style', '');     // blank out the inline style
        target.eq(x).addClass("A_" + x);    // numbered class: A_0, A_1, A_2, ...
    }
});

Result:

<div class="TodaysForecastContainerInner">
    <div style="" class="A_0"><u>This morning</u></div>
    <div style="" class="A_1">
        <a target="_blank" href="forecastPublicExtended.asp#Period0">
            <img height="50px" width="50px" alt="weather image" src="./images/wimages/b_cloudy.gif">        
        </a>                    </div>
    <div style="" class="A_2">
        Sunny Breaks                            </div>
</div>

You can then use a stylesheet to make it look the way you want.

3 Comments

Thank you for your reply. Unfortunately I am not getting your output. Here is my test site: j2sdesign.com/rgw/article/20101222/NEWS01/712229951/0/example/…. Can you show me how you do it? Here is my PHP: // find all foreach($html->find('.TodaysForecastContainerInner div') as $e) echo $e->innertext . '<br>';
Also, I'm getting an error on $(document).ready(function() { var target = $( and I don't know what's wrong.
Here is the source; you can view the PHP I put in: <!-- j2sdesign.com/rgw/article/20101222/NEWS01/712229951/0/example/…
