I am using Python's csv module to iterate over the values in one column of a CSV file.
What I need to do is:
- Get the first row for column "title"
- Remove any Spanish characters (accents, Ñ)
- Remove single quotes
- Finally, replace spaces with dashes and convert everything to lowercase.
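For reference, the steps above can be sketched on a single plain string. This is only a minimal sketch: the helper name `clean_title` and the exact set of quote characters it removes are assumptions for illustration, not part of the original script.

```python
import unicodedata


def strip_accents(s):
    # Decompose characters (NFD), then drop the combining marks (category "Mn").
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def clean_title(title):
    # Hypothetical helper: the name and the quote characters below are assumptions.
    title = strip_accents(title)
    for quote in ("'", "´", "’"):
        title = title.replace(quote, "")
    # Spaces become dashes, then everything is lowercased.
    return title.replace(" ", "-").lower()


print(clean_title("El Niño y la Canción d'Amor"))  # el-nino-y-la-cancion-damor
```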
I got this to work with a simple test file, not a CSV. I also managed to print each title on its own line.
But now I'm using this code to go over the CSV file (sorry for the VERY ugly code, I'm a newbie programmer):
import csv
import unicodedata
import ast
def strip_accents(s):
    return ''.join((c for c in unicodedata.normalize('NFD', s) if unicodedata.category(c) != 'Mn'))
dic_read = csv.DictReader(open("output.csv", encoding = "utf8"))
for line in dic_read:
    #print(line) #I get each line of the csv file as a dictionary.
    #print(line["title"]) # I get only the "title" column on each line
    line = line.replace(' ', '-').lower()
    line = line.replace("´", "")
    line = strip_accents(line)
    fp=open("cleantitles.txt", "a")
    fp.write(line)
    fp.close()
I get the following error:
Traceback (most recent call last):
  File "C:/csvreader3.py", line 15, in <module>
    line = strip_accents(line)
  File "C:/csvreader3.py", line 7, in strip_accents
    return ''.join((c for c in unicodedata.normalize('NFD', s) if unicodedata.category(c) != 'Mn'))
TypeError: must be str, not dict
I also get a similar error when I try to do a .replace only. I understand now that these methods only apply to strings.
How can I get this to work? I searched around for a way to convert a dict to a string object but that didn't work.
Also, any suggestions for optimizing my code and making it more readable are welcome!
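To make the dict-versus-string problem concrete, here is a small self-contained sketch. It assumes an in-memory stand-in for output.csv (the `id` column and both sample rows are made up). It suggests that each row from `csv.DictReader` is a dict, while the value stored under the "title" key is an ordinary string, so string methods such as `.replace` and `.lower` can be called on that value rather than on the row itself.

```python
import csv
import io

# In-memory stand-in for output.csv; the columns and rows here are made up.
sample = io.StringIO("id,title\n1,El Niño\n2,La Canción\n")

rows = list(csv.DictReader(sample))
first = rows[0]

# Each row maps column names to string values.
print(isinstance(first, dict))          # True
print(isinstance(first["title"], str))  # True

# String methods work on the column value, not on the row dict.
titles = [row["title"].replace(" ", "-").lower() for row in rows]
print(titles)  # ['el-niño', 'la-canción']
```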