I have a DataFrame with several columns of data that looks like this:
import pandas as pd

df = pd.DataFrame({'Ex1': ['apple', 'apple1', 'Peear', 'peAr', 'b$nana', 'Bananas'],
                   'Ex2': ['Applet', 'banan', 'apples', 'PAIR', 'banana', 'apple'],
                   'Ex3': ['Pears', 'Banaa', 'Apple', 'apple1', 'pear', 'abanana']})
df
And then I have three lists, each named after the canonical fruit and holding the misspellings of that fruit:
apple = ['apple1','Applet','apples','Apple']
pear = ['Peear','peAr','PAIR','Pears','p3ar']
banana = ['b$nana','Bananas','banan','Banaa','abanana']
How can I iterate over each of the columns and change the misspelled fruits into the correct ones? That is, the final data frame should look like this:
      Ex1     Ex2     Ex3
0   apple   apple    pear
1   apple  banana  banana
2    pear   apple   apple
3    pear    pear   apple
4  banana  banana    pear
5  banana   apple  banana
I know I could achieve this outcome with the following code:
replacements = {
    'apple1': 'apple',  # keys are the misspellings, values the canonical name
    'Applet': 'apple',
    # ...
}
df['Ex1'] = df['Ex1'].replace(replacements)
But I have 1000+ rows and I don't want to go through and write out each entry in replacements by hand, because that would take a lot of time.
Any suggestions for doing this in a way that I can use my apple, pear, and banana variables as-is?
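For reference, here is a minimal sketch of the kind of thing I am after, assuming the three lists can be grouped under their canonical names and inverted into a single misspelling-to-canonical dict. The names canonical, replacements, and cleaned are just placeholders I made up for illustration, and this is one possible approach rather than a definitive one:

```python
import pandas as pd

df = pd.DataFrame({
    'Ex1': ['apple', 'apple1', 'Peear', 'peAr', 'b$nana', 'Bananas'],
    'Ex2': ['Applet', 'banan', 'apples', 'PAIR', 'banana', 'apple'],
    'Ex3': ['Pears', 'Banaa', 'Apple', 'apple1', 'pear', 'abanana'],
})

apple = ['apple1', 'Applet', 'apples', 'Apple']
pear = ['Peear', 'peAr', 'PAIR', 'Pears', 'p3ar']
banana = ['b$nana', 'Bananas', 'banan', 'Banaa', 'abanana']

# Placeholder name: group the lists under their canonical spelling.
canonical = {'apple': apple, 'pear': pear, 'banana': banana}

# Invert to {misspelling: canonical}, so each misspelling is a key.
replacements = {
    wrong: right
    for right, variants in canonical.items()
    for wrong in variants
}

# Replace whole-cell matches across every column; returns a new DataFrame.
cleaned = df.replace(replacements)
print(cleaned)
```

Because the lookup matches whole cell values rather than regular expressions by default, entries such as 'b$nana' need no escaping, and values that are already correct (for example 'banana') are left untouched.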