Scrape Table Data Into Dataframe

Question

An example URL is 'http://www.hockey-reference.com/players/c/crosbsi01/gamelog/2016'

The table I am trying to grab is named Regular Season.

What I used to do in previous instances was something like this:

import requests
from bs4 import BeautifulSoup, NavigableString
import pandas as pd


url = 'http://www.hockey-reference.com/players/o/ovechal01/gamelog/2016'
resultsPage = requests.get(url)
soup = BeautifulSoup(resultsPage.text, "html5lib")
comment = soup.find(text=lambda x: isinstance(x, NavigableString) and "Regular Season  Table" in x)
df = pd.read_html(comment)

That's the approach I took with a site similar to this one; however, I'm unable to locate the table properly on this page. I'm not sure what I'm missing.

Padraic Cunningham · Accepted Answer · 2016-10-20 19:45:55Z

There is one table which you can get using the id:

import requests
from bs4 import BeautifulSoup


url = 'http://www.hockey-reference.com/players/o/ovechal01/gamelog/2016'
resultsPage = requests.get(url)
soup = BeautifulSoup(resultsPage.text, "html5lib")
table = soup.select_one("#gamelog")
print(table)
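Since the goal is a DataFrame rather than the raw tag, here is a minimal sketch of turning the selected table into one. It assumes a small inline HTML snippet as a hypothetical stand-in for the downloaded page (the real page's markup and columns will differ) and uses the built-in "html.parser" so that it runs without html5lib.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, not the real site markup.
html = """
<table id="other"><tr><td>ignore me</td></tr></table>
<table id="gamelog">
  <caption>Regular Season Table</caption>
  <thead><tr><th>Date</th><th>G</th><th>A</th></tr></thead>
  <tbody>
    <tr><td>2015-10-10</td><td>1</td><td>0</td></tr>
    <tr><td>2015-10-13</td><td>2</td><td>1</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.select_one("#gamelog")

# Column names come from the header cells, data from each body row.
headers = [th.get_text(strip=True) for th in table.select("thead th")]
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.select("tbody tr")
]

df = pd.DataFrame(rows, columns=headers)
# Cells are scraped as strings, so convert the numeric columns explicitly.
df[["G", "A"]] = df[["G", "A"]].astype(int)
print(df)
```

Building the frame by hand keeps control over which cells are used, at the cost of doing the type conversion yourself.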

Or, using just pandas (imported as pd):

# read_html returns a list of DataFrames, so take the first match
df = pd.read_html(url, attrs={'id': 'gamelog'})[0]
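To see how the attrs filter behaves without hitting the network, the sketch below runs pd.read_html on a hypothetical inline snippet containing two tables. The HTML is made up for illustration, and it assumes that a parser backend for read_html (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the page: two tables, only one has id="gamelog".
html = """
<table id="other">
  <thead><tr><th>X</th></tr></thead>
  <tbody><tr><td>0</td></tr></tbody>
</table>
<table id="gamelog">
  <thead><tr><th>Date</th><th>G</th><th>A</th></tr></thead>
  <tbody>
    <tr><td>2015-10-10</td><td>1</td><td>0</td></tr>
    <tr><td>2015-10-13</td><td>2</td><td>1</td></tr>
  </tbody>
</table>
"""

# attrs narrows the search to tables whose attributes match.
# The result is always a list, even when only one table matches.
tables = pd.read_html(StringIO(html), attrs={"id": "gamelog"})
df = tables[0]

print(len(tables))
print(df)
```

Wrapping the markup in StringIO avoids the deprecation warning that recent pandas versions raise for literal HTML strings.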

Your code could never work because the NavigableString you are searching for is inside a caption tag, <caption>Regular Season Table</caption>, not the table itself. You would need to call .find_previous to get the table:

comment = soup.find(text=lambda x: isinstance(x, NavigableString) and "Regular Season  Table" in x)
table = comment.find_previous("table")

You could also use table = comment.parent.parent but find_previous is a better approach.
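A small offline sketch of the two navigation options, using a hypothetical inline snippet rather than the real page. It assumes a bs4 version that accepts the string= argument (the newer spelling of text=) and uses the built-in "html.parser". One plausible reason to prefer find_previous is that it does not depend on how deeply the caption is nested.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup.
html = """
<table id="gamelog">
  <caption>Regular Season Table</caption>
  <thead><tr><th>Date</th><th>G</th></tr></thead>
  <tbody><tr><td>2015-10-10</td><td>1</td></tr></tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# This finds the text node inside <caption>, not the table itself.
caption_text = soup.find(string=lambda s: "Regular Season Table" in s)

# Option 1: walk backwards through the document to the nearest <table> tag.
table = caption_text.find_previous("table")

# Option 2: climb two levels (text -> <caption> -> <table>).
# This only works while the caption is a direct child of the table.
same_table = caption_text.parent.parent

print(table["id"])
print(table is same_table)
```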

edited Oct 20, 2016 at 19:45

answered Oct 20, 2016 at 19:40

Padraic Cunningham
