I am getting a KeyError: 'title' in my web scraping program and I am not sure what the issue is. When I use inspect element on the webpage, I can see the element that I am trying to find:

import pandas as pd
import requests
from bs4 import BeautifulSoup
import re

url = 'https://www.ncaagamesim.com/college-basketball-predictions.asp'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

table = soup.find('table')

# Get column names
headers = table.find_all('th')
cols = [x.text for x in headers]

# Get all rows in table body
table_rows = table.find_all('tr')

rows = []
# Grab the text of each td, and put into a rows list
for each in table_rows[1:]:
    odd_avail = True
    data = each.find_all('td')
    time = data[0].text.strip()
    try:
        matchup, odds = data[1].text.strip().split('\xa0')
        odd_margin = float(odds.split('by')[-1].strip())
    except:
        matchup = data[1].text.strip()
        odd_margin = '-'
        odd_avail = False
    odd_team_win = data[1].find_all('img')[-1]['title']

    sim_team_win = data[2].find('img')['title']
    sim_margin = float(re.findall(r"\d+\.\d+", data[2].text)[-1])

    if odd_avail:
        if odd_team_win == sim_team_win:
            diff = sim_margin - odd_margin
        else:
            diff = -1 * odd_margin - sim_margin
    else:
        diff = '-'

    row = {cols[0]: time, 'Matchup': matchup, 'Odds Winner': odd_team_win, 'Odds': odd_margin,
           'Simulation Winner': sim_team_win, 'Simulation Margin': sim_margin, 'Diff': diff}
    rows.append(row)

df = pd.DataFrame(rows)
print(df.to_string())
# df.to_csv('odds.csv', index=False)

I am getting the error on the line that sets sim_team_win. It takes data[2], which is the third column on the website, and reads the img title to get the team name. Is the error caused by the img being nested inside another div? Also, when I run this code it does not print the "Odds" column values, which are stored in the odd_margin variable. Is something wrong with how that variable is set? Thanks in advance for the help!

1 Answer

As for the img title not being found: if you look at the row for New Mexico @ Dixie State, there is no image in the third column, and there is no img title in the page source either.
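
To avoid the crash without a blanket try/except, a lookup that returns None when the image or its title attribute is absent may be enough. Below is a minimal sketch; img_title is a hypothetical helper name, and the HTML is made-up markup standing in for the real table, not copied from the site.

```python
from bs4 import BeautifulSoup


def img_title(cell, index=0):
    """Return the title of the img at `index` inside `cell`, or None if missing."""
    images = cell.find_all('img')
    if not images:
        return None  # the cell contains no <img> at all
    try:
        img = images[index]
    except IndexError:
        return None  # fewer images than expected
    # Tag.get returns None for a missing attribute instead of raising KeyError
    return img.get('title')


# Hypothetical markup: one cell with a titled image, one with an untitled
# image, and one with no image
html = """
<table>
  <tr><td><img src="a.png" title="Team A"> 71.3</td></tr>
  <tr><td><img src="b.png"> 65.0</td></tr>
  <tr><td>no logo here</td></tr>
</table>
"""
cells = BeautifulSoup(html, 'html.parser').find_all('td')
titles = [img_title(td) for td in cells]
print(titles)
```

In the question's loop this would be used as sim_team_win = img_title(data[2]) and odd_team_win = img_title(data[1], -1), with rows where the result is None either skipped or filled with '-'.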

As for the Odds column: once I wrapped the sim_team_win assignment in a try/except, I got all the Odds values in the table.
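
If the Odds column still shows '-' for every row, one possibility (which I have not confirmed against the live page) is that the split on '\xa0' returns something other than exactly two pieces, so the unpacking raises ValueError and the bare except silently falls back to '-'. Below is a sketch of a more tolerant parse, assuming the margin is always the number following the word "by", as the original odds.split('by') implies. parse_odds_margin and the sample strings are hypothetical.

```python
import re

# Assumed pattern: the margin is the number that follows the standalone word "by"
MARGIN_RE = re.compile(r'\bby\s*(-?\d+(?:\.\d+)?)')


def parse_odds_margin(cell_text):
    """Return the number after 'by' as a float, or None if there is none."""
    # Treat non-breaking spaces as ordinary spaces so their count does not matter
    text = cell_text.replace('\xa0', ' ')
    match = MARGIN_RE.search(text)
    return float(match.group(1)) if match else None


# Made-up cell texts: one separator, two separators, and no odds listed
samples = [
    'Team A @ Team B\xa0Team B by 5.5',
    'Team A @ Team B\xa0\xa0Team B by 3',
    'Team A @ Team B',
]
margins = [parse_odds_margin(s) for s in samples]
print(margins)
```

Printing data[1].text with repr() for a few rows should show what the cell really contains and whether this assumption holds.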

3 Comments

Thanks, I just added a try/except for the simulation winner and this fixed my problem. Thank you for pointing that out. And what do you mean by getting all the Odds values after the sim_team_win assignment?
No problem. I meant that I was seeing all the Odds values after I fixed the sim_team_win error.
Oh, okay. I'm still not seeing the Odds values after fixing the sim_team_win assignment. Did you change anything else?
