Pulling the href from a link when web scraping using Python

Question

I am scraping from this page: https://www.pro-football-reference.com/years/2018/week_1.htm

The page lists game scores for American football. I want to open the link to the stats for the first game; the text displayed for that link is "Final". My code so far:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup


#assigning url
my_url = "https://www.pro-football-reference.com/years/2018/week_1.htm"

# opening up connection, grabbing the page
raw_page = uReq(my_url)
page_html = raw_page.read()
raw_page.close()

# html parsing
page_soup = soup(page_html,"html.parser")

#find all games on page
games = page_soup.findAll("div",{"class":"game_summary expanded nohover"})

link = games[0].find("td",{"class":"right gamelink"})
print(link)

When I run this, I receive the following output:

<a href="/boxscores/201809060phi.htm">Final</a>

How do I assign only the href value (i.e. "/boxscores/201809060phi.htm") to a variable?

fernand0 · Accepted Answer · 2018-09-15 15:40:42Z

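# find the table cell, then the <a> tag inside it
link = games[0].find("td",{"class":"right gamelink"}).find('a')

# a tag's attributes can be read like dictionary keys
print(link['href'])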
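
To make the approach above easier to try without hitting the site, here is a minimal sketch that runs against a small inline HTML fragment. The fragment, the helper name first_game_href, and the use of urljoin to build an absolute URL are assumptions added for illustration; they are not part of the original answer. The helper also returns None instead of raising if the cell or the link is missing.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics one game summary on the page
sample_html = """
<div class="game_summary expanded nohover">
  <table><tr>
    <td class="right gamelink"><a href="/boxscores/201809060phi.htm">Final</a></td>
  </tr></table>
</div>
"""

# URL of the page being scraped, used to resolve the relative href
base_url = "https://www.pro-football-reference.com/years/2018/week_1.htm"


def first_game_href(soup_obj):
    """Return the href of the first game link, or None if it is missing."""
    cell = soup_obj.find("td", {"class": "right gamelink"})
    if cell is None:
        return None
    anchor = cell.find("a")
    if anchor is None:
        return None
    # .get() returns None when the attribute is absent
    return anchor.get("href")


page_soup = BeautifulSoup(sample_html, "html.parser")
href = first_game_href(page_soup)

# The href is site-relative, so join it with the page URL before opening it
full_url = urljoin(base_url, href) if href else None

print(href)
print(full_url)
```

Since the href on this page is relative, the joined URL is what you would pass to urlopen to fetch the box score page.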

1 Comment

jbenfleming Over a year ago

Perfect. Thank you!
