I'm reading a set of .csv files and adding them to one giant data frame called 'df', but I keep getting this error for some of my files: Error tokenizing data. C error: Expected 1 fields in line 88, saw 2
I figured out what is likely causing this: the CSVs I'm reading are the result of a newsletter sign-up form, and there is a column where people specified what town they live in. However, some users put a comma in that field (e.g. "El Paso, Texas" instead of just "El Paso").
Is there a way to handle this within the read_csv call that doesn't involve just skipping the line? This error came up about a dozen times across the roughly 35 spreadsheets I'm stitching together, so I could theoretically fix the spreadsheets by hand, but I'd like to know how to handle this automatically for future spreadsheets.
For reference, here is my code to add the spreadsheets to the dataframe.
for spreadsheet in os.listdir(path):
    file_name = path + '/' + spreadsheet
    if file_name[-3:] == "csv":
        try:
            temp = pd.read_csv(file_name, encoding='utf-16')
            df = pd.concat([df, temp])
        except pd.errors.ParserError as e:
            # Report which file failed and why, then move on to the next one
            print(f"Something went wrong with {file_name}: error: {e}")
    else:
        continue
pandas read_csv? Something like this?
import csv
with open('some.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
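Building on the csv-module idea, here is a minimal sketch of one way to repair the over-long rows instead of skipping them. It assumes the file really is comma-delimited and that the town column is the only one containing stray commas. The helper name `merge_extra_fields`, its `town_index` parameter, and the sample data are all made up for illustration; none of them come from the original post.

```python
import csv
import io


def merge_extra_fields(rows, town_index):
    """Yield rows with surplus fields folded back into the town column.

    `town_index` is a hypothetical parameter: the zero-based position of the
    free-text town column, which the original post does not specify.
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return  # empty input, nothing to repair
    expected = len(header)
    yield header
    for row in rows:
        extra = len(row) - expected
        if extra > 0:
            # Rejoin the pieces that an unquoted comma split apart.
            merged = ",".join(row[town_index:town_index + extra + 1])
            row = row[:town_index] + [merged] + row[town_index + extra + 1:]
        yield row


# Made-up sample standing in for one sign-up export.
sample = (
    "name,town,email\n"
    "Ana,El Paso, Texas,ana@example.com\n"
    "Bo,Austin,bo@example.com\n"
)
repaired = list(merge_extra_fields(csv.reader(io.StringIO(sample)), town_index=1))
for row in repaired:
    print(row)
```

For a real file you would open it with `open(file_name, encoding='utf-16', newline='')` and build the frame with `pd.DataFrame(repaired[1:], columns=repaired[0])`. If you would rather stay inside `read_csv`, pandas 1.4 and later accept a callable for `on_bad_lines` when `engine='python'`, and a function like the merge step above could serve there. One thing worth checking first: "Expected 1 fields" suggests every earlier line parsed as a single column, so the files may not be comma-delimited at all (UTF-16 exports are often tab-separated). If that is the case, passing `sep='\t'` might make the error go away without any repair step.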