I want to scrape information from this page.

Specifically, I want to scrape the table which appears when you click "View all" under the "TOP 10 HOLDINGS" (you have to scroll down on the page a bit).

I am new to web scraping and have tried using BeautifulSoup to do this. However, the table only seems to be loaded by an "onclick" handler, which my code does not take into account. In other words, the HTML I scrape directly from the page doesn't include the table I want to obtain.

I am a bit confused about my next step: should I use something like Selenium, or can I deal with the issue in an easier or more efficient way?

Thanks.

My current code:

from bs4 import BeautifulSoup
import requests


Soup = BeautifulSoup
my_url = 'http://www.etf.com/SHE'
page = requests.get(my_url)
htmltxt = page.text

soup = Soup(htmltxt, "html.parser")
print(soup)
  • The API is 'http://www.etf.com/view_all/holdings/SHE', but you must send my_url as the Referer header. Commented Sep 10, 2017 at 14:52

1 Answer

You can get a JSON response from the API at http://www.etf.com/view_all/holdings/SHE. The HTML for the table you're looking for is stored under the 'view_all' key.

import requests
from bs4 import BeautifulSoup as Soup

url = 'http://www.etf.com/SHE'
api = "http://www.etf.com/view_all/holdings/SHE"
# The endpoint expects an XHR-style request with the fund page as the Referer.
headers = {'X-Requested-With': 'XMLHttpRequest', 'Referer': url}
page = requests.get(api, headers=headers)
# The JSON payload holds the table as an HTML fragment under 'view_all'.
htmltxt = page.json()['view_all']
soup = Soup(htmltxt, "html.parser")
# Collect the cell text of every table row.
data = [[td.text for td in tr.find_all('td')] for tr in soup.find_all('tr')]

print('\n'.join(': '.join(row) for row in data))
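If you want numbers rather than printed text, the rows in data can be post-processed. The following is only a sketch: it assumes each useful row has exactly two cells, a holding name and a percentage string such as "4.50%", which the thread does not confirm. The function name rows_to_weights and the sample rows are hypothetical.

```python
def rows_to_weights(rows):
    """Turn [name, 'x.xx%'] rows into a {name: float} mapping.

    Assumption: useful rows have exactly two cells; anything else
    (for example an empty header row) is skipped.
    """
    weights = {}
    for row in rows:
        if len(row) != 2:
            continue
        name, pct = (cell.strip() for cell in row)
        # Drop the trailing percent sign before converting.
        weights[name] = float(pct.rstrip('%'))
    return weights


# Hypothetical sample rows, not real data from the site.
sample = [[], ["Alpha Corp", " 4.50% "], ["Beta Inc", "3.25%"]]
weights = rows_to_weights(sample)
```

Check the real cell contents first; if the percentage column uses a different format, the float conversion will raise a ValueError.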

2 Comments

Follow-up question: On the same page, I would like to retrieve the data for sectors and countries. Both are in the same format. Can I use a similar API for these? And how can I see whether an API is available?
@Emjora You'll have to inspect the network traffic in your browser's developer tools (Inspect > Network > XHR) and look at the requests the page makes. The API for sectors should be http://www.etf.com/etf-chart/top-10-sectors-json/SHE, and for countries http://www.etf.com/etf-chart/top-10-countries-json/SHE.
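The three endpoints mentioned in this thread follow the same pattern, so the URL and headers can be built by a small helper. This is a rough sketch: the names api_url and xhr_headers are made up for illustration, and it assumes the sectors and countries endpoints accept the same headers as the holdings one, which is not confirmed here.

```python
BASE = "http://www.etf.com"

# Paths taken from the answer and comment above; {ticker} is e.g. "SHE".
ENDPOINTS = {
    "holdings": "/view_all/holdings/{ticker}",
    "sectors": "/etf-chart/top-10-sectors-json/{ticker}",
    "countries": "/etf-chart/top-10-countries-json/{ticker}",
}


def api_url(kind, ticker):
    """Return the API URL for one of the known endpoint kinds."""
    return BASE + ENDPOINTS[kind].format(ticker=ticker)


def xhr_headers(ticker):
    """Headers that mimic the page's own XHR call (Referer = fund page)."""
    return {
        "X-Requested-With": "XMLHttpRequest",
        "Referer": f"{BASE}/{ticker}",
    }
```

With requests, a call would then look like requests.get(api_url("sectors", "SHE"), headers=xhr_headers("SHE")). The structure of the sectors and countries JSON is not shown in this thread, so print the response and inspect it before writing any parsing code.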
