I'll start by saying that I know this question gets asked a lot. I've read the other answers and ruled out the usual causes:

I'm not using += for the assignment;

I've tried explicitly assigning each variable within the function to ensure they're not empty, in case the other work the function does fails;

They're not global variables, and I don't want them to be - they're just internal variables that I use to work out what I'm eventually returning.

##  Gets the data from external website - refreshes whenever the programme is called. 
## Urllib2 required module 
##  csv to make life easier handling the data 

import urllib2
import csv
import sys  
import math
# import sqlite3    #don't need this just now, will probably run Django with MySQL when it comes to it
# import MySQLdb Likewise, don't need this just now. 
#python3
import atexit
from time import time
from datetime import timedelta

def secondsToStr(t):
    return str(timedelta(seconds=t))

line = "="*40
def log(s, elapsed=None):
    print(line)
    print(secondsToStr(time()), '-', s)
    if elapsed:
        print("Elapsed time:", elapsed)
    print(line)
    print()

def endlog():
    end = time()
    elapsed = end-start
    log("End Program", secondsToStr(elapsed))

def now():
    return secondsToStr(time())

start = time()
atexit.register(endlog)
log("Start Program")
def open_external_source():
    # Checks if the core file's been modified since the last time we used it - if it hasn't, then we skip all of the file reading stuff. 
    #need to change this to just pull the headers the first time.  
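    # urlopen's second positional argument is the POST body, not the HTTP method, so it is omitted for a plain GET
    master_data_file = urllib2.urlopen("http://www.football-data.co.uk/mmz4281/1213/E0.csv")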
    print master_data_file
    headers = master_data_file.info()
    last_mod = headers["last-modified"]
    settings = open("settings.csv", "r+")
    historic_last_mod = settings.readline() #this only works when the setting is a 1 line file
    settings.close()  # close before returning; a close() placed after the returns is never reached
    print "Local file version: " + historic_last_mod 
    print "Server file version: " + last_mod
    if last_mod == historic_last_mod:
        print "It's the same, file not loaded"
        return True
    else:
        return False

#the if statement's commented out because it was messing up the variables into the function
#if open_external_source == False:
master_data_file = urllib2.urlopen("http://www.football-data.co.uk/mmz4281/1213/E0.csv")
data = list(tuple(rec) for rec in csv.reader(master_data_file, delimiter=','))
print len(data)
print "printing full file"
print data
league_list = ["Arsenal", "Chelsea", "Liverpool", "Man City", "Man United", "Newcastle", "Newcastle", "Norwich","Reading","Southampton", "Stoke", "Sunderland", "Swansea", "Tottenham", "West Brom", "West Ham", "Wigan"]

league_stats = league_list

#for teams in league_list: - come back to this, will do this as a split and append. 


#call the next set of functions to skip the data reading stuff 
#This is the data reading section, that puts the data into our system
#If we do proceed, then we redo all of the calculations, and read the data file in again, in case of any corrections, etc.  

#Column references:
#Home Goals 4
#Away Goals 5
#Full Time Result 6
#Home Shots 10
#Away Shots 11
#Home Shots on Target 12
#Away Shots on Target 13


#Calculates the average for a given team at home, columns are 4 Home Goals, 5 Away Goa
def CalcAverageHome(team, column, data):
    total = 0
    count = 0
    n=0
    for row in data:
        if  data[count][2] == team:
            total += int(data[count][column])
            n+=1
        count += 1      
    try:
        average = float(total) / n
    except ZeroDivisionError:
        average = 'Not played'
    return average

def CalcAverageAway(team, column, data):
    total = 0
    count = 0
    n=0
    for row in data:
        if  data[count][3] == team:
            total += int(data[count][column])
            n+=1
        count += 1      
    try:
        average = float(total) / n
    except ZeroDivisionError:
        average = 'Not played'  
    return average


home_team = "Chelsea"
away_team = "Newcastle" 
print "Here's the Average number of goals scored Home"
home_goals = CalcAverageHome(home_team, 4, data)
away_goals = CalcAverageAway(home_team, 5, data)
home_conceded = CalcAverageHome(home_team, 5, data) 
away_conceded = CalcAverageAway(away_team, 4, data)
adjusted_home = home_goals * away_conceded
adjusted_away = away_goals * home_conceded

print home_team, home_goals, home_conceded, adjusted_home
print away_team, away_goals, away_conceded, adjusted_away

print "starting to try and work the league averages out here." 

def poisson_probability(actual, mean):
    # naive:   math.exp(-mean) * mean**actual / factorial(actual)

    # iterative, to keep the components from getting too large or small:
    p = math.exp(-mean)
    for i in xrange(actual):
        p *= mean
        p /= i+1
    return p

for i in range (10):
    print str((100*poisson_probability(i,adjusted_home)))+"%"


league_list = ["Arsenal", "Chelsea", "Liverpool", "Man City", "Man United", "Newcastle", "Newcastle", "Norwich","Reading","Southampton", "Stoke", "Sunderland", "Swansea", "Tottenham", "West Brom", "West Ham", "Wigan"]


# just assign the league list to the stats for now - 
# eventually each team entry will become the first column of a new sublist

def LeagueAverages(data,column):
    total = 0
    n = 0
    for row in data :
        string = row[column]
        if string.isdigit() == True:
            total = total + int(row[column])
            n += 1
    league_average = float(total) / n
    return league_average


print "League home goals average is:", LeagueAverages(data, 4)
print "League away goals average is:", LeagueAverages(data, 5)

print "finished that loop..."




league_stats = []
test_team = "Arsenal"

# Function iterates through the league teams and calculates the averages
# and places them in one long list. 
for team in league_list:
    league_stats.append(team)
    print CalcAverageHome(team, 4, data)
    # one entry per column, so each team contributes its name plus 8 averages
    for column in (4, 5, 7, 8, 10, 11, 12, 13):
        league_stats.append(CalcAverageHome(team, column, data))

# This function should chunk the 'file', as when we run the above code, 
# we'll end up with one incredibly long list that contains every team on the same line
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))

chunker (league_stats, 9)

final_stats = []
for group in chunker(league_stats, 9):
   print repr(group)
   final_stats.append(repr(group))

#retrieve a particular value from the final stats array
"""
    for row in final_stats:
        if  data[count][2] == team:
            total += int(data[count][column])
            n+=1
        count += 1  
"""

def create_probability_table(hometeam, awayteam, final_stats):
#reads in the home and away sides, calculates their performance adjusted 
#ratings and then calculates the likelihood of each team scoring a particular
#number of goals (from 0-10)
#those likelihoods are then combined to provide an 11x11 matrix of probabilities
    poisson_array = []
    poisson_list_home = []
    poisson_list_away = []
    goals_home = 0
    conceded_home = 0
    goals_away = 0
    conceded_away = 0

    for team in final_stats:
        if team == hometeam:
            goals_home = team[1]
            conceded_home = team [3]
            print "home Goals, Home Conceded"
            print goals_home, conceded_home
        elif team == awayteam:
            goals_away = team[2]
            conceded_away = team[4]
            print "Away Goals, Away Conceded"
            print goals_away, conceded_away, 
        else:           
            pass

    adjusted_goals_home = goals_home * conceded_away
    adjusted_goals_away = goals_away * conceded_home 

    #this section creates the two probability lists for home and away num goals scored      
    for i in range (10):
        # append is a method call; assigning to .append raises an AttributeError
        poisson_list_home.append(100*poisson_probability(i,adjusted_goals_home))
        poisson_list_away.append(100*poisson_probability(i,adjusted_goals_away))

    print poisson_list_home
    print poisson_list_away

    # cross-multiply every home probability with every away probability
    probability_table = []
    for p_home in poisson_list_home:
        for p_away in poisson_list_away:
            probability_table.append(p_home * p_away)
    return probability_table 

create_probability_table("Arsenal", "Chelsea", final_stats)

#and this section cross multiplies them into a new list

#   for i in range (10):

# print data_frame [0:100] prints to console to provide visual check

master_data_file.close()

When I run it, it throws this error:

line 272, in create_probability_table
    adjusted_goals_home = goals_home * conceded_away
UnboundLocalError: local variable 'conceded_away' referenced before assignment

I don't understand why! It is defined and assigned, right at the start of the function. It's not global.

I've looked at these questions, and they don't seem to answer the question:

  • Local (?) variable referenced before assignment
  • Assigning to variable from parent function: "Local variable referenced before assignment"
  • How is this "referenced before assignment"?
  • UnboundLocalError: local variable 'Core_prices' referenced before assignment

  • at the first if in your for, your indentation is off Commented Jun 26, 2013 at 19:08
  • What is the error you're getting? It would help to know which variable is being referenced before assignment. Anyway, you might have a problem with indentation - the elif is aligned with the for instead of with the if, and the line after if team == hometeam is misaligned. Commented Jun 26, 2013 at 19:09
  • what line is the error on, can we see the traceback? Commented Jun 26, 2013 at 19:09
  • 1
    you also wrote con d eded_away = 0 Commented Jun 26, 2013 at 19:10
  • The indentation is fine in the original code, it's been my crappy copy and pasting. Hang on and I'll adjust it. Commented Jun 26, 2013 at 19:11

1 Answer

You misspelled "conceded":

condeded_away = 0
   ^
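
To see why a typo leads to this particular error, here is a minimal sketch. The function name `lookup` and the sample data are hypothetical stand-ins for the question's code, not taken from it. Because `conceded_away` is assigned somewhere in the function body, Python treats it as a local name for the whole function, and the misspelled initializer binds a different name entirely:

```python
def lookup(teams, awayteam):
    condeded_away = 0                 # typo: this binds a *different* name
    for name, conceded in teams:
        if name == awayteam:
            conceded_away = conceded  # the only assignment to the real name
    return conceded_away              # unbound if the loop never matched


# Made-up sample data, only for illustration.
teams = [("Arsenal", 1.2), ("Chelsea", 0.9)]

print(lookup(teams, "Chelsea"))       # the branch ran, so the name is bound

try:
    lookup(teams, "Wigan")            # no match, so the name was never bound
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Spelling the initializer the same way as the name that is read later means the variable always has a value, even when no branch assigns to it.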

Also, you may want to use a different data structure for final_stats, like a dictionary:

teams = {
    'team1': [...],
    'team2': [...],
    ...
}

You can then look up the stats for a team much more quickly:

stats = teams['team2']
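
As a rough sketch of how the flat league_stats list from the question could be turned into such a dictionary: the helper name `build_team_stats` and the numbers below are made up for illustration, and it assumes each team contributes one name followed by eight averages, as in the question's loop.

```python
def build_team_stats(flat, size=9):
    """Group a flat [name, stat, stat, ...] list into {name: [stats]}."""
    teams = {}
    for pos in range(0, len(flat), size):
        chunk = flat[pos:pos + size]
        teams[chunk[0]] = chunk[1:]   # the name is the key, the averages the value
    return teams


# Made-up numbers, only to show the shape of the data.
league_stats = [
    "Arsenal", 2.0, 1.0, 0.9, 0.4, 15.0, 9.0, 8.0, 4.0,
    "Chelsea", 2.1, 0.8, 1.0, 0.3, 16.0, 8.0, 9.0, 3.0,
]

teams = build_team_stats(league_stats)
home_goals = teams["Chelsea"][0]      # direct lookup, no loop over every team
print(home_goals)
```

Keying by name also sidesteps a problem in the question's code: final_stats there holds repr() strings of each group, so comparing an entry with a team name never matches.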
12 Comments

@ChrisCampbell: adjusted_goals_home is local to your function, so it doesn't exist outside of its scope. You should either make it global or return it from create_probability_table and store it in another variable.
That's an unrelated error, right? adjusted_goals_home isn't defined anywhere in this block of code... so it's a local. If you're using it later (outside this function), that would result in the error.
@RobI: It's defined in the second-to-last line.
That's right - I want it to be local, where I'm assigning it is still supposed to be in the function. I go on to run a nested loop using adjusted_goals_home and away before returning something else, but I hadn't included it as it's another 30 lines.
@ChrisCampbell: So where is the traceback coming from, inside or outside of this function?
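
To illustrate the suggestion in the comments of returning a value rather than making it global, here is a minimal sketch. The function name `adjusted_goals` and its arguments are hypothetical stand-ins for the question's code.

```python
def adjusted_goals(goals_home, conceded_home, goals_away, conceded_away):
    """Return both adjusted rates instead of leaving them as locals."""
    adjusted_home = goals_home * conceded_away
    adjusted_away = goals_away * conceded_home
    return adjusted_home, adjusted_away


# The caller receives the values; nothing needs to be global.
adj_home, adj_away = adjusted_goals(2.0, 0.5, 1.5, 1.0)
print(adj_home, adj_away)
```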