I am scraping from this page: https://www.pro-football-reference.com/years/2018/week_1.htm

It is a list of game scores for American football. I want to open the link to the stats for the first game. The text displayed for that link is "Final". My code so far:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup


#assigning url
my_url = "https://www.pro-football-reference.com/years/2018/week_1.htm"

# opening up connection, grabbing the page
raw_page = uReq(my_url)
page_html = raw_page.read()
raw_page.close()

# html parsing
page_soup = soup(page_html,"html.parser")

#find all games on page
games = page_soup.findAll("div",{"class":"game_summary expanded nohover"})

link = games[0].find("td",{"class":"right gamelink"})
print(link)

When I run this, I receive the following output:

<a href="/boxscores/201809060phi.htm">Final</a>

How do I assign only the link's href value (i.e. "/boxscores/201809060phi.htm") to a variable?

1 Answer

link = games[0].find("td",{"class":"right gamelink"}).find('a')

print(link['href'])
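
For anyone who wants to try this without hitting the site, here is a minimal sketch of the same idea run against an inline HTML snippet. The `SAMPLE_HTML` string and the `first_game_link` helper are hypothetical names made up for illustration, and the snippet only approximates the page's markup based on the class names in the question. It also assumes you want a full URL, so it joins the relative href onto the page address with `urljoin`.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

BASE_URL = "https://www.pro-football-reference.com/years/2018/week_1.htm"

# Hypothetical, trimmed-down stand-in for one game block on the page.
SAMPLE_HTML = """
<div class="game_summary expanded nohover">
  <table><tr>
    <td class="right gamelink"><a href="/boxscores/201809060phi.htm">Final</a></td>
  </tr></table>
</div>
"""

def first_game_link(html, base_url=BASE_URL):
    """Return the absolute URL of the first game's box score, or None."""
    page_soup = BeautifulSoup(html, "html.parser")
    # class_ matches any element whose class list contains this value
    game = page_soup.find("div", class_="game_summary")
    if game is None:
        return None
    cell = game.find("td", class_="gamelink")
    anchor = cell.find("a") if cell else None
    if anchor is None or not anchor.has_attr("href"):
        return None
    # tag["href"] reads the attribute; urljoin resolves it against the page URL
    return urljoin(base_url, anchor["href"])

box_score_url = first_game_link(SAMPLE_HTML)
print(box_score_url)
```

The `None` checks are there because `find` returns `None` when nothing matches, and indexing `None` would raise a `TypeError`. If you only need the relative path, `anchor["href"]` on its own is enough.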

1 Comment

Perfect. Thank you!
