Python Pandas - read_html No tables Found

Question

I am very new to Python and am trying to do my own data analysis.

I am trying to parse data from this website: https://www.tsn.ca/nhl/statistics

I wanted to get the table in a data frame format.

I tried this:

```python
import pandas as pd

players_list_unclean = pd.read_html('https://www.sportsnet.ca/hockey/nhl/players/?season=2021&?seasonType=reg&tab=Skaters')
```

I get the following error:

```
    raise ValueError("No tables found")
ValueError: No tables found
```

I can see there is a table on the page, but for some reason it is not being read.

I found another Stack Overflow answer recommending Selenium:

pandas read_html ValueError: No tables found

However, when I tried to adapt the code below, I could not find the table ID in the HTML page source. Does anyone know another way to do this? I have tried other websites, but I ultimately run into the same issue.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://www.wunderground.com/personal-weather-station/dashboard?ID=KMAHADLE7#history/tdata/s20170201/e20170201/mcustom.html")
# Locate the table by its id attribute (this id belongs to the page in the linked answer).
elem = driver.find_element(By.ID, "history_table")

head = elem.find_element(By.TAG_NAME, "thead")
body = elem.find_element(By.TAG_NAME, "tbody")

list_rows = []

# find_elements (plural) returns a list of rows that can be iterated.
for items in body.find_elements(By.TAG_NAME, "tr"):
    list_cells = []
    for item in items.find_elements(By.TAG_NAME, "td"):
        list_cells.append(item.text)
    list_rows.append(list_cells)
driver.close()
```

Willow · Accepted Answer · 2022-02-23 03:10:59Z

2

If you right-click the table and choose Inspect, you will see that the "table" on that page is not actually built with the HTML <table> element.

From the Pandas documentation:

This function searches for <table> elements and only for <tr> and <th> rows and <td> elements within each <tr> or <th> element in the table.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html

I don't think read_html will work on this page. You will probably need to find another data source.
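As a rough illustration of the point above, here is a minimal sketch that counts real <table> elements in a piece of HTML using only the standard library. The `TableCounter` class and `count_tables` helper are hypothetical names made up for this example, not part of pandas, and the check only approximates what read_html looks for. It also only sees the HTML it is given, so content added later by JavaScript would not show up.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts opening <table> tags in the HTML fed to it."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self.tables += 1


def count_tables(html):
    """Return how many <table> elements appear in the given HTML string."""
    parser = TableCounter()
    parser.feed(html)
    return parser.tables


# A grid built from <div> elements looks like a table in the browser,
# but contains no <table> element.
div_grid = '<div class="table"><div class="row"><div>Player</div></div></div>'
real_table = "<table><tr><th>Player</th></tr><tr><td>A</td></tr></table>"

print(count_tables(div_grid))
print(count_tables(real_table))
```

If the count is zero for the HTML you downloaded, a "No tables found" error from read_html is the expected outcome.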


pguardiario · Accepted Answer · 2022-02-23 04:08:07Z

0

There's no table, but you're in luck: the page loads its data with a fetch request to this JSON endpoint:

https://datacrunch.9c9media.ca/statsapi/sports/hockey/leagues/nhl/sortablePlayerSeasonStats/skater?brand=tsn&type=json&seasonType=regularSeason&season=2021
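A minimal sketch of how that endpoint could be turned into a DataFrame, under some assumptions: the layout of the JSON is not shown here, so the hypothetical helper `find_records` simply looks for the first list of objects in the response, and `fetch_text` and `load_dataframe` are likewise names invented for this example. The User-Agent header is a guess at what the server may want, not a documented requirement.

```python
import json
from urllib.request import Request, urlopen

URL = (
    "https://datacrunch.9c9media.ca/statsapi/sports/hockey/leagues/nhl/"
    "sortablePlayerSeasonStats/skater?brand=tsn&type=json"
    "&seasonType=regularSeason&season=2021"
)


def fetch_text(url, timeout=30):
    """Download the response body as text."""
    # Assumption: a browser-like User-Agent may be needed; adjust if not.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def find_records(payload):
    """Return the first non-empty list of dicts found in decoded JSON.

    The real layout of the response is an assumption, so this searches
    nested dicts and lists instead of hard-coding a key name.
    """
    if isinstance(payload, list) and payload and all(
        isinstance(item, dict) for item in payload
    ):
        return payload
    if isinstance(payload, dict):
        children = payload.values()
    elif isinstance(payload, list):
        children = payload
    else:
        return []
    for child in children:
        found = find_records(child)
        if found:
            return found
    return []


def load_dataframe(url=URL):
    """Fetch the JSON, locate the records, and build a DataFrame."""
    import pandas as pd  # imported here so the helpers above need only the stdlib

    records = find_records(json.loads(fetch_text(url)))
    # json_normalize flattens nested objects into dotted column names.
    return pd.json_normalize(records)
```

Calling `load_dataframe()` needs network access and pandas installed. If the response turns out to have a known key for the player list, indexing that key directly would be simpler than searching for it.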


6 Comments

confused Over a year ago

Does that mean I just have to manually clean the data from the html file?

pguardiario Over a year ago

No, that's JSON data; you parse it with json.loads.

confused Over a year ago

Ok, and how did you get this fetch data from the link you sent above?

pguardiario Over a year ago

The same way: with requests, or however you are already getting the HTML.

confused Over a year ago

Sorry, I am not quite following. I was just using read_html. I am guessing you can't use that to get the fetch data?
