Scraping a table that appears on click with Python

Question

I want to scrape information from this page.

Specifically, I want to scrape the table which appears when you click "View all" under the "TOP 10 HOLDINGS" (you have to scroll down on the page a bit).

I am new to web scraping and have tried using BeautifulSoup for this. However, there seems to be a problem because of the "onclick" handler that I need to take into account. In other words, the HTML I scrape directly from the page doesn't include the table I want.

I am a bit confused about my next step: should I use something like selenium or can I deal with the issue in an easier/more efficient way?

Thanks.

My current code:

import requests
from bs4 import BeautifulSoup

my_url = 'http://www.etf.com/SHE'
page = requests.get(my_url)
htmltxt = page.text

# Parses only the HTML the server returns; no JavaScript is executed.
soup = BeautifulSoup(htmltxt, "html.parser")
print(soup)

API: 'http://www.etf.com/view_all/holdings/SHE', but you must send my_url as the referer.
– t.m.adam, Commented Sep 10, 2017 at 14:52

t.m.adam · Accepted Answer · 2017-09-10 15:11:02Z

You can get a JSON response from the API at http://www.etf.com/view_all/holdings/SHE. The table you're looking for is stored as an HTML string under the 'view_all' key.

import requests
from bs4 import BeautifulSoup as Soup

url = 'http://www.etf.com/SHE'
api = "http://www.etf.com/view_all/holdings/SHE"
# Send an AJAX-style request with the ETF page as the referer.
headers = {'X-Requested-With': 'XMLHttpRequest', 'Referer': url}
page = requests.get(api, headers=headers)
# The JSON payload holds the table as an HTML string under 'view_all'.
htmltxt = page.json()['view_all']
soup = Soup(htmltxt, "html.parser")
# One list of cell texts per table row.
data = [[td.text for td in tr.find_all('td')] for tr in soup.find_all('tr')]

print('\n'.join(': '.join(row) for row in data))
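If you want numbers rather than printed text, the `data` rows built above can be post-processed. The following is only a sketch under assumptions: it supposes each useful row is a `[name, weight]` pair whose weight looks like "4.52%", which the original answer does not confirm. The helper name `rows_to_weights` and the sample rows are made up for illustration.

```python
from typing import Dict, List


def rows_to_weights(rows: List[List[str]]) -> Dict[str, float]:
    """Map holding name -> weight in percent, skipping rows that don't fit."""
    weights: Dict[str, float] = {}
    for row in rows:
        if len(row) != 2:
            continue  # e.g. rows without <td> cells, which come back empty
        name, raw = (cell.strip() for cell in row)
        try:
            weights[name] = float(raw.rstrip("%"))
        except ValueError:
            continue  # value was not a plain percentage
    return weights


# Made-up sample rows standing in for the scraped `data` list.
sample = [
    [],
    ["Example Corp A", "4.52%"],
    ["Example Corp B", " 3.10% "],
    ["Other", "n/a"],
]
print(rows_to_weights(sample))
```

Rows that don't match the assumed shape are skipped silently, so check the result against the printed table before relying on it.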

2 Comments

Emjora Over a year ago

Follow-up question: On the same page, I would like to retrieve the data for sectors and countries. Both are in the same format. Can I use a similar API for these? And how can I see whether an API is available?

t.m.adam Over a year ago

@Emjora You'll have to inspect the network traffic: open the browser's developer tools (Inspect > Network > XHR) and look at the requests the page makes. The API for sectors should be http://www.etf.com/etf-chart/top-10-sectors-json/SHE, and for countries http://www.etf.com/etf-chart/top-10-countries-json/SHE.
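The three endpoints mentioned in this thread follow a similar pattern, so they could be wrapped in a small helper. This is a rough sketch, not a tested client: the names `ENDPOINTS`, `build_request` and `fetch_json` are hypothetical, and it assumes the sectors and countries endpoints accept the same `X-Requested-With` and `Referer` headers as the holdings one, which the thread does not confirm. The layout of the sectors and countries JSON is not described here either, so the sketch stops at returning the decoded payload.

```python
from typing import Dict, Tuple

BASE = "http://www.etf.com"

# Path templates taken from the answer and comment; {ticker} is the fund symbol.
ENDPOINTS: Dict[str, str] = {
    "holdings": "/view_all/holdings/{ticker}",
    "sectors": "/etf-chart/top-10-sectors-json/{ticker}",
    "countries": "/etf-chart/top-10-countries-json/{ticker}",
}


def build_request(kind: str, ticker: str) -> Tuple[str, Dict[str, str]]:
    """Return (api_url, headers) for one of the endpoints listed above."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown endpoint kind: {kind!r}")
    api_url = BASE + ENDPOINTS[kind].format(ticker=ticker)
    headers = {
        "X-Requested-With": "XMLHttpRequest",
        # Assumption: every endpoint wants the fund page as referer.
        "Referer": f"{BASE}/{ticker}",
    }
    return api_url, headers


def fetch_json(kind: str, ticker: str):
    """Perform the request; needs network access and the requests package."""
    import requests  # imported here so build_request works without it

    api_url, headers = build_request(kind, ticker)
    response = requests.get(api_url, headers=headers, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


# Example usage (requires network access):
# payload = fetch_json("sectors", "SHE")
# print(type(payload).__name__)
```

Printing only the payload's type at first is a cautious way to discover its structure before writing any parsing code for it.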
