I have a folder of csv files where each file name starts with a string identifying the game and ends with a tag identifying which table from that game it holds. Example:

20020905_nyg_scoring.csv
20020905_nyg_team_stats.csv
20020908_buf_scoring.csv
20020908_buf_team_stats.csv

I've written a script that groups the csv files by the first part of the file name into a dictionary and then turns that dictionary's values into a list of pairs. I want to read each pair of files in and perform dataframe shaping on the pair together. Ultimately, I will concat the data from the paired files into a single dataframe (the concat is not my issue here).

import numpy as np
import pandas as pd
import os

game_list = {}
path = r'C:\Users\jobon\Documents\New NFL Stats\Experimental\2002 Game Logs'
for file in os.listdir(path):
    game_pairing = game_list.get(file[:12],[])
    game_pairing.append(file)
    game_list[file[:12]] = game_pairing

game_pairs = []
for game, stats in game_list.items():
    game_pairs.append(stats)

for scoring, team_stats in game_pairs:
    for file in os.listdir(path):
        df1 = pd.read_csv(scoring, header = 0, index_col = 0)
        df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
        ...more shaping...

I expect to end with a final set of data frames generated from each pair of game files that I can concat.

Instead I get

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-37-fb1d4aa9f003> in <module>
     18 for scoring, team_stats in game_pairs:
     19     for file in os.listdir(path):
---> 20         df1 = pd.read_csv(scoring, header = 0, index_col = 0)
     21         #df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
     22         print(df1)

FileNotFoundError: [Errno 2] File b'20020905_nyg_scoring.csv' does not exist: b'20020905_nyg_scoring.csv'

The files are in the folder, and it worked for building the list, but I don't know why it suddenly can't find the files now.

1
  • Attempted: df1 = pd.read_csv(scoring.decode("utf-8"), header = 0, index_col = 0) and got: AttributeError: 'str' object has no attribute 'decode'. So I tried: df1 = pd.read_csv(codecs.decode(scoring, encoding='ASCII', errors='strict'), header = 0, index_col = 0) and got: TypeError: decoding with 'utf-8' codec failed (TypeError: a bytes-like object is required, not 'str'). Commented Jul 11, 2019 at 15:47

2 Answers

1

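I just ran your code. I think the problem is that your .csv files are in the folder path, but os.listdir(path) returns only the bare file names. If you pass the file name scoring to pd.read_csv without the directory name path, pandas looks for the file relative to the current working directory and cannot find it. To fix this, you need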

scoring = os.path.join(path, scoring)

in your loop.
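As a rough sketch of how that fix could fit into the whole workflow, the snippet below joins the folder onto each name while the pairs are being built, so every later read already has a full path. The helper names pair_game_files and load_game_pairs are made up for illustration, and it assumes each game has exactly two files whose names sort as scoring before team_stats.

```python
import os
from collections import defaultdict

import pandas as pd


def pair_game_files(folder):
    """Group the CSV paths in `folder` by the 12-character game prefix."""
    games = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            # os.listdir returns bare names, so prepend the folder here.
            games[name[:12]].append(os.path.join(folder, name))
    return games


def load_game_pairs(folder):
    """Yield (game_key, scoring_df, team_stats_df) for each complete pair."""
    for key, paths in pair_game_files(folder).items():
        if len(paths) != 2:
            continue  # assumption: skip games missing one of the two tables
        # Sorted order puts "..._scoring.csv" before "..._team_stats.csv".
        scoring_path, team_stats_path = paths
        scoring = pd.read_csv(scoring_path, header=0, index_col=0)
        team_stats = pd.read_csv(team_stats_path, header=0, index_col=0)
        yield key, scoring, team_stats
```

You would then loop with something like for key, scoring, team_stats in load_game_pairs(path): and do the shaping on each pair. Note that this also drops the inner for file in os.listdir(path) loop from the question, which would otherwise re-read the same pair once per file in the folder.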


0

Seems like the first argument that you are passing to the read_csv method is not a string literal but a bytes literal. That is why the error mentions a file b'20020905_nyg_scoring.csv' and not '20020905_nyg_scoring.csv'. The b at the beginning denotes a bytes literal.

Changing

df1 = pd.read_csv(scoring, header = 0, index_col = 0)

to

df1 = pd.read_csv(scoring.decode("utf-8"), header = 0, index_col = 0)

should fix your issue.

1 Comment

Tried and got: AttributeError: 'str' object has no attribute 'decode'
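That AttributeError is consistent with the names already being ordinary str objects: in Python 3, os.listdir returns str names when it is given a str path. The b'...' in the message therefore seems to come from how the error was formatted rather than from the variable's type (that part is an inference, not something confirmed in this thread). A small check, using a temporary folder and a made-up helper name listed_name_types:

```python
import os
import tempfile


def listed_name_types(folder):
    """Return the set of Python types of the names os.listdir yields."""
    return {type(name) for name in os.listdir(folder)}


with tempfile.TemporaryDirectory() as folder:
    # Create an empty file named like one of the game logs.
    open(os.path.join(folder, "20020905_nyg_scoring.csv"), "w").close()
    print(listed_name_types(folder))  # {<class 'str'>}
```

So there is nothing to decode; the missing piece is the directory in front of the file name.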
