Sorry if this is not the place for this question, but I'm not sure where else to ask.

I'm trying to scrape data from rotogrinders.com and I'm running into some challenges.

In particular, I want to scrape past NHL game data from URLs of this format (the date can be changed to get other days' data): https://rotogrinders.com/game-stats/nhl-skater?site=draftkings&date=11-22-2016

However, the data on that page is split across several pages, and I'm not sure how to make my script get the full data set that appears after clicking the "all" button at the bottom of the page.

Is there a way to do this in Python? Perhaps some library that will allow button clicks? Or is there some way to get the data without actually clicking the button by being clever about the URL/request?

  • "Perhaps some library that will allow button clicks?" Selenium. Commented Nov 25, 2016 at 19:01
  • What have you done so far? If you show some code or attempts at the task, people are more willing to help. Commented Nov 25, 2016 at 19:02

1 Answer

Actually, things are not that complicated in this case. When you click "All", no network requests are issued. All the data is already there, inside a script tag in the HTML; you just need to extract it.
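To make the idea concrete without touching the network, here is a minimal offline sketch. The HTML fragment, the player names and the helper name extract_players are made up for illustration; only the "var data = [...]" shape is taken from the page described here, and the real markup may differ.

```python
import json
import re

# Made-up page fragment that mimics the layout described above; the real
# page's markup and field names may differ.
SAMPLE_HTML = """
<html><body>
<script>
var data = [{"player": "Example Skater A", "goals": 1}, {"player": "Example Skater B", "goals": 0}];
var other = 42;
</script>
</body></html>
"""

# Capture the array assigned to `data`, stopping at the first "];" that
# ends a line.
DATA_PATTERN = re.compile(r"var data = (\[.*?\]);$", re.MULTILINE | re.DOTALL)


def extract_players(html):
    """Return the embedded `data` array as a list of dicts (hypothetical helper)."""
    match = DATA_PATTERN.search(html)
    if match is None:
        raise ValueError("no 'var data = [...]' assignment found")
    return json.loads(match.group(1))


players = extract_players(SAMPLE_HTML)
print([p["player"] for p in players])
```

Raising an explicit error when the pattern is missing makes it easier to notice if the site changes how it embeds the data.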

Working code using requests (to download the page content), BeautifulSoup (to parse HTML and locate the desired script element), re (to extract the desired "player" array from the script) and json (to load the array string into a Python list):

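import json
import re

import requests
from bs4 import BeautifulSoup

url = "https://rotogrinders.com/game-stats/nhl-skater?site=draftkings&date=11-22-2016"
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error

soup = BeautifulSoup(response.content, "html.parser")

# the player array is assigned to a JavaScript variable named "data"
pattern = re.compile(r"var data = (\[.*?\]);$", re.MULTILINE | re.DOTALL)

# locate the script element whose contents match the pattern
# (older BeautifulSoup versions spell this argument text=)
script = soup.find("script", string=pattern)

# pull out the array literal and parse it as JSON
data = pattern.search(script.string).group(1)
data = json.loads(data)

# printing player names for demonstration purposes
for player in data:
    print(player["player"])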

Prints:

Jeff Skinner
Jordan Staal
...
William Carrier
A.J. Greer
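
Since the question mentions changing the date in the URL to get other days, the URLs for a range of dates could be generated along these lines. This is only a sketch: build_url and urls_for_range are hypothetical helper names, and it assumes the site accepts the same MM-DD-YYYY query format for every date, which is not verified here.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = "https://rotogrinders.com/game-stats/nhl-skater"


def build_url(day, site="draftkings"):
    """Build the stats URL for one day, using the MM-DD-YYYY format from the question."""
    query = urlencode({"site": site, "date": day.strftime("%m-%d-%Y")})
    return f"{BASE_URL}?{query}"


def urls_for_range(start, end):
    """Return one URL per day from start to end, both inclusive."""
    days = (end - start).days
    return [build_url(start + timedelta(days=n)) for n in range(days + 1)]


for url in urls_for_range(date(2016, 11, 20), date(2016, 11, 22)):
    print(url)
```

Each generated URL can then be passed to requests.get in place of the hard-coded one; pausing between requests is a sensible courtesy to the site.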

1 Comment

Thanks a lot! I had heard about BeautifulSoup but hadn't had much luck when I'd used it before. Clearly I need to read more of the documentation to really grasp everything it can do. Thanks again for the help.
