I am trying to scrape data from a website called flightradar24.

My code looks up the name of the airport, and I also want to scrape the "Arrivals" table. Scraping the name works, because it is just a plain h1 element. When I try to scrape the table, though, I don't get any values, only the names of the objects (maybe because JavaScript is involved?).

Is there any way to scrape this part of the page? (Python 2.7)

I tried this:

import urllib2, sys
from BeautifulSoup import BeautifulSoup

site= "https://www.flightradar24.com/data/airports/bud/arrivals"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(site,headers=hdr)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)
name = soup.find('h1' , attrs={'class' : 'airport-name'})
print name

table = soup.find('div', { "class" : "row cnt-schedule-table" })
print table

This is what I get when I print the table:

<div class="row cnt-schedule-table"><label class="m-b-m">ARRIVALS</label><table class="table table-condensed table-hover data-table m-n-t-15"><thead><tr class="hidden-xs hidden-sm"><th class="w-80">TIME</th><th class="w-90">FLIGHT</th><th>FROM</th><th>AIRLINE</th><th class="w-120">AIRCRAFT</th><th class="w-10"></th><th class="w-160">STATUS</th></tr><tr ng-cloak="ng-cloak" data-ng-class="{hidden: btnLoadEarlier === false}" ng-show="(isFetching == false &amp;&amp; airportView.schedule.arrivals.data.length &gt; 0)"> 0)"&gt;<td colspan="7" class="text-center"><button data-mode="arrivals" data-page="-1" data-timestamp="{{currentUtcTimestampRender / 1000}}" ng-click="loadMoreFlights($event)" data-current-page="{{airportView.schedule.arrivals.page.current}}" data-loading-text='&lt;i class="fa fa-circle-o-notch fa-spin"&gt;&lt;/i&gt; Loading earlier flights...' class="btn btn-table-action btn-flights-load">Load earlier flights</button></td></tr></thead><tbody><tr ng-cloak="ng-cloak" class="loader" ng-show="(isFetching == true)"><td colspan="7" class="text-center"><i class="fa fa-spinner fa-pulse"></i> Loading...</td></tr><tr ng-cloak="ng-cloak" ng-show="(isFetching == false &amp;&amp; airportView.schedule.arrivals.data.length == 0)"><td colspan="7" class="text-center">Sorry, we don't have any information about flights for this airport</td></tr><tr ng-cloak="ng-cloak" class="hidden-md hidden-lg" ng-repeat="objFlight in airportView.schedule.arrivals.data track by $index" ng-show="(isFetching == false)"><td colspan="7" class="state-block-{{objFlight.flight.status.generic.status.color || 'gray'}}"><div class="row"><div class="col-xs-12 col-sm-12 p-xxs"><span ng-bind-html="objFlight.flight.statusMessage.text | unsafe"></span> {{objFlight.flight.status.generic.eventTime.utc * 1000 || '' | date: timeFormat: timeZone}}</div></div><div class="row"><div class="col-xs-3 col-sm-3 p-xxs"><i class="fa fa-clock-o"></i> <span>{{objFlight.flight.time.scheduled.arrival * 1000 || '-' | date: 
timeFormat : timeZone}}</span></div><div class="col-xs-3 col-sm-3 p-xxs"><i class="fa fa-tag"></i> <a class="notranslate" ng-href="/data/flights/{{objFlight.flight.identification.number.default | lowercase}}">{{objFlight.flight.identification.number.default}}</a></div><div class="col-xs-6 col-sm-6 p-xxs"><i class="fa fa-map-marker"></i> <span ng-bind-html="objFlight.flight.airport.origin.position.region.city || '-' | unsafe">{{objFlight.flight.airport.origin.position.region.city}} </span><a class="notranslate" ng-href="/data/airports/{{objFlight.flight.airport.origin.code.iata | lowercase}}" title="{{objFlight.flight.airport.origin.name}}, {{objFlight.flight.airport.origin.position.country.name}}">({{objFlight.flight.airport.origin.code.iata}})</a></div></div><div class="row"><div class="col-xs-3 col-sm-3 p-xxs" title="{{objFlight.flight.aircraft.model.text || ''}}"><i class="fa fa-plane"></i> {{objFlight.flight.aircraft.model.code || '-'}}</div><div class="col-xs-3 col-sm-3 p-xxs"><a ng-show="(objFlight.flight.aircraft.registration)" class="notranslate" ng-href="/data/aircraft/{{objFlight.flight.aircraft.registration | lowercase}}">{{objFlight.flight.aircraft.registration}}</a></div><div class="col-xs-6 col-sm-6 p-xxs">{{ objFlight.flight.airline.name || '-'}}</div></div></td></tr><tr ng-cloak="ng-cloak" class="hidden-xs hidden-sm" ng-repeat="objFlight in airportView.schedule.arrivals.data track by $index" ng-show="(isFetching == false)" data-date="{{(objFlight.flight.time.scheduled.arrival * 1000) | date: 'EEEE, MMM dd' : timeZone}}" tbl-render-directive="tbl-render-directive"><td>{{objFlight.flight.time.scheduled.arrival * 1000 || '-' | date: timeFormat : timeZone}}</td><td class="p-l-s cell-flight-number"><a class="chevron-toggle" ng-if="(objFlight.flight.identification.codeshare != null)" data-codeshare="{{objFlight.flight.identification.codeshare}}"></a> <a class="notranslate" ng-href="/data/flights/{{objFlight.flight.identification.number.default | 
lowercase}}">{{objFlight.flight.identification.number.default}}</a></td><td><div ng-show="(objFlight.flight.airport.origin)"><span class="hide-mobile-only">{{objFlight.flight.airport.origin.position.region.city}} </span><a class="fs-10 fbold notranslate" ng-href="/data/airports/{{objFlight.flight.airport.origin.code.iata | lowercase}}" title="{{objFlight.flight.airport.origin.name}}, {{objFlight.flight.airport.origin.position.country.name}}">({{objFlight.flight.airport.origin.code.iata}})</a></div><div ng-show="!(objFlight.flight.airport.origin)">-</div></td><td ng-bind-html=" objFlight.flight.airline.name || '-' | unsafe" title="{{ objFlight.flight.airline.name || ''}}" class="cell-airline"></td><td><span class="notranslate" ng-show="(objFlight.flight.aircraft.model.code)">{{objFlight.flight.aircraft.model.code}} </span><a ng-show="(objFlight.flight.aircraft.registration)" class="fs-10 fbold notranslate" ng-href="/data/aircraft/{{objFlight.flight.aircraft.registration | lowercase}}">({{objFlight.flight.aircraft.registration}}) </a><span ng-if="(!objFlight.flight.aircraft.model.code &amp;&amp; !objFlight.flight.aircraft.registration)">-</span></td><td><div class="state-block {{objFlight.flight.status.generic.status.color || 'gray'}}"></div></td><td><span ng-bind-html="objFlight.flight.statusMessage.text | unsafe"></span> {{objFlight.flight.status.generic.eventTime.utc * 1000 || '' | date: timeFormat: timeZone}}</td></tr></tbody><tfoot><tr ng-cloak="ng-cloak" data-ng-class="{hidden: btnLoadLater === false }" ng-show="(isFetching == false &amp;&amp; airportView.schedule.arrivals.data.length &gt; 0 &amp;&amp; airportView.schedule.arrivals.page.current &lt; airportView.schedule.arrivals.page.total)"> 0 &amp;&amp; airportView.schedule.arrivals.page.current &lt; airportView.schedule.arrivals.page.total)"&gt;<td colspan="7" class="text-center"><button data-mode="arrivals" data-page="2" data-timestamp="{{currentUtcTimestampRender / 1000 | int}}" 
ng-click="loadMoreFlights($event)" data-current-page="{{airportView.schedule.arrivals.page.current}}" data-loading-text='&lt;i class="fa fa-circle-o-notch fa-spin"&gt;&lt;/i&gt; Loading later flights...' class="btn btn-table-action btn-flights-load">Load later flights</button></td></tr><tr ng-cloak="ng-cloak" ng-show="(isFetching == false)"><td colspan="7">* All times are in {{(airportView.schedule.arrivals.data &amp;&amp; timeZone.toUpperCase() == 'UTC' ? 'UTC' : 'local')}} timezone</td></tr></tfoot></table></div>
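
The `{{...}}` placeholders and `ng-*` attributes in the output above look like AngularJS template markup, which would mean the rows are filled in by JavaScript in the browser and are not present in the HTML that urllib2 downloads. A minimal sketch of a check for this, assuming a helper named `looks_client_rendered` (hypothetical, not part of any library) and a shortened fragment of the printed table as input:

```python
import re

# Patterns that suggest unrendered AngularJS template markup.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}", re.DOTALL)
NG_ATTRIBUTE = re.compile(r"\bng-(?:repeat|show|if|bind-html|click)=")


def looks_client_rendered(html):
    """Return True if the markup still contains template placeholders."""
    return bool(PLACEHOLDER.search(html) or NG_ATTRIBUTE.search(html))


# Shortened fragment of the table printed above.
sample = (
    '<tr ng-repeat="objFlight in airportView.schedule.arrivals.data">'
    "<td>{{objFlight.flight.identification.number.default}}</td></tr>"
)

print(looks_client_rendered(sample))
# Plain markup without template syntax (illustrative text only).
print(looks_client_rendered('<h1 class="airport-name">Example Airport</h1>'))
```

If the check reports template markup, parsing the downloaded HTML with BeautifulSoup will only ever return the template, so the values have to come from wherever the page's script loads them.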

The code from the article linked in the answer doesn't work either:

import urllib2
from bs4 import BeautifulSoup
import json

# new url      
url = 'https://www.flightradar24.com/data/airports/bud/arrivals'

# read all data
page = urllib2.urlopen(url).read()

# convert json text to python dictionary
data = json.loads(page)

print(data['row cnt-schedule-table'])

  • "I get only the objects names" - What do you mean by that? What is the exact output you get? What is the HTML you're examining with this? (The value of page, or some subset thereof.) Why do you think JavaScript is involved? Commented Jul 24, 2017 at 13:34
  • I am new to programming, so it was only a guess. I have edited my question so you can see what I get when I print the result. Commented Jul 24, 2017 at 13:36

1 Answer

Here is another Stack Overflow article with a solution to a very similar problem. It seems that you need to request the URL that the page loads its data from, rather than the one you would normally open in a browser.
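
Note that calling `json.loads` on the normal airport page cannot work, because that URL returns HTML and the call raises a ValueError. The data URL itself is not given here; it can usually be found in the Network tab of the browser's developer tools while the table loads. As a rough sketch of the parsing step, the bindings in the printed template (for example `objFlight.flight.identification.number.default`) hint at how each flight record is nested. The code below assumes the JSON response mirrors those paths, including a top-level `airportView` key, which is a guess. The helper names and the sample values are made up for illustration.

```python
import json


def get_path(obj, path, default="-"):
    """Follow a dotted path through nested dicts, returning default on a gap."""
    for key in path.split("."):
        if not isinstance(obj, dict) or obj.get(key) is None:
            return default
        obj = obj[key]
    return obj


def extract_arrivals(payload):
    """Collect a few fields per flight from airportView.schedule.arrivals.data."""
    # Assumption: the response uses the same nesting as the template bindings.
    flights = get_path(payload, "airportView.schedule.arrivals.data", default=[])
    rows = []
    for item in flights:
        rows.append({
            "flight": get_path(item, "flight.identification.number.default"),
            "from": get_path(item, "flight.airport.origin.code.iata"),
            "airline": get_path(item, "flight.airline.name"),
            "scheduled": get_path(item, "flight.time.scheduled.arrival"),
        })
    return rows


# Made-up response text standing in for what the data URL might return.
text = json.dumps({"airportView": {"schedule": {"arrivals": {"data": [
    {"flight": {
        "identification": {"number": {"default": "XX123"}},
        "airport": {"origin": {"code": {"iata": "AAA"}}},
        "airline": {"name": "Example Air"},
        "time": {"scheduled": {"arrival": 1500900000}},
    }},
    {"flight": {
        "identification": {"number": {"default": "YY456"}},
        "airport": {"origin": {"code": {"iata": "BBB"}}},
        "airline": None,
        "time": {"scheduled": {"arrival": 1500903600}},
    }},
]}}}})

for row in extract_arrivals(json.loads(text)):
    print("%s %s %s" % (row["flight"], row["from"], row["airline"]))
```

Missing or null fields fall back to "-", the same fallback the template uses for empty values. With a real response, `text` would be replaced by the body read from the data URL.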

1 Comment

I tried the code from that article too (I added it to my question), but it still doesn't work.
