I'm trying to scrape a specific HTML table from a website. The table contains the following columns:

  1. Balance
  2. Addresses
  3. % Addresses (Total)
  4. Coins
  5. USD
  6. % Coins (Total)

The code that I am using is below:

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://app.intotheblock.com/coin/AMP/deep-dive?group=ownership&chart=all"

r = requests.get(url)
html = r.text

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, "html.parser")
table = soup.find('table', {"class": "sc-lhVmIH fYUufF sc-cmTdod gUHuZc"})

# This is the line that raises the error
rows = table.find_all('tr')
data = []
for row in rows[1:]:  # skip the header row
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])

result = pd.DataFrame(data, columns=['Balance','Addresses','% Addresses (Total)','Coins','USD','% Coins (Total)'])

print(result)

I inspected the webpage to find the class of the table, but with the class I picked I keep getting an error on the line "rows = table.find_all('tr')". This tells me that I am not selecting the right class for the table I want to scrape.

I also wrote code that logs in to the website automatically (enters the credentials, clicks the login button) and navigates to the page I want to scrape, but the table still comes back empty. The class I chose appeared right after the table data, so I assumed it was the correct one to use.

The page I am trying to scrape is below:

Link: https://app.intotheblock.com/coin/AMP/deep-dive?group=ownership&chart=all

The website requires you to sign up before you can see the data, so I will post a picture below of the table and its code from the website to show the class I chose for the table. I would greatly appreciate any help, as I am stuck on what I am doing wrong here.

[Screenshot: the table and its markup in the browser inspector]

1 Answer

The table is rendered with JavaScript, so it is not in the HTML that requests receives. It is easier to get the data from the underlying API.
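
A quick way to confirm this is to check whether the raw HTML contains any table element at all. The sketch below uses only the standard library. `count_tables` is a hypothetical helper name, and both HTML strings are made-up stand-ins (a bare JS app shell and a rendered page), not the site's real markup.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts <table> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1


def count_tables(html: str) -> int:
    # Hypothetical helper: number of <table> elements present in the raw HTML.
    counter = TableCounter()
    counter.feed(html)
    counter.close()
    return counter.tables


# Made-up stand-in for what requests.get() might return from a JS-rendered app
app_shell = '<html><body><div id="root"></div><script src="/main.js"></script></body></html>'
# Made-up stand-in for the same page after a browser has run the JavaScript
rendered = '<html><body><table><tr><td>cell</td></tr></table></body></html>'

print(count_tables(app_shell))
print(count_tables(rendered))
```

If the count is 0 for `r.text`, then `soup.find('table', ...)` returns None, and calling `find_all` on None is what raises the error.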

In Google Chrome, open Inspect > Network > XHR and search for "api". Find the request you need, then build a Python request that sends the same authorization token and receives the JSON, like this:

import requests

url = "https://services.intotheblock.com/api/internal/metrics/coin/8bdae7d9-b8ff-41a1-8229-2dd07f047845/ownership/holdings_distribution_matrix"

# Copy the token from the Authorization request header shown in the Network tab
headers = {
    'Authorization': 'Bearer some long text'
}

response = requests.get(url, headers=headers)
response.raise_for_status()  # stop early on an HTTP error status
data = response.json()
print(data)


4 Comments

Is there any way to read in the HTML as text? I realize the easier way would be to use an API, but the company's API can be quite expensive for a piece of data that I would like to analyze first. Is there really no way to read in the HTML page as text, because it needs JS?
Would a solution like this work? Link to Scraping Dynamic Webpage
Then just scrape it with a heavier solution like Selenium, which you use to automate the login process. Let me know if this helps: stackoverflow.com/questions/60899709/…
Yes! This helped me so much. I finally got it using XPath, thank you very much!
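
For anyone following the Selenium route: once the browser has rendered the page, `driver.page_source` holds the full HTML, and the table rows can be pulled out of that string. Below is a minimal sketch that uses the standard library's html.parser instead of the XPath approach mentioned above. `TableRows` and `extract_rows` are hypothetical names, and the sample markup and cell values are made up, since the real page sits behind a login.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    # Hypothetical helper: returns one list of cell strings per table row.
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample; in practice this would be driver.page_source from Selenium
sample = """
<table>
  <tr><th>Balance</th><th>Addresses</th></tr>
  <tr><td> A </td><td>B</td></tr>
</table>
"""

header, *body = extract_rows(sample)
print(header)
print(body)
```

The header row and the body rows can then be passed to `pd.DataFrame(body, columns=header)` as in the question.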
