I have the following problem: my pandas DataFrame has two columns. The first contains names (strings); the second contains an integer code for each name. The code unifies spelling variants. The problem is that not all names are coded. I would like to create a third column that contains the plain name when the second column is NaN, and the code (as a string) when there is a code.
Here is an example of the DataFrame:
import math
import pandas as pd
df = pd.DataFrame([['Meyer', 2], ['Mueller', 4], ['Radisch', math.nan], ['Meyer', 2], ['Pavlenko', math.nan]], columns=['name', 'ref'])
And here is how I would like it to look:
df = pd.DataFrame([['Meyer', 2, '2'], ['Mueller', 4, '4'], ['Radisch', math.nan, 'Radisch'], ['Meyer', 2, '2'], ['Pavlenko', math.nan, 'Pavlenko']], columns=['name', 'ref', 'name2'])
Any suggestions on how I can do that? I tried a for loop, but it does not work:
for d in range(0, len(df)):
    if not (math.isnan(df['ref'][d])):
        df.ix[d]['name2'] = df.ix[d]['ref']
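To make the intended result concrete, here is a minimal loop-free sketch of it. It assumes the columns are named `name`, `ref` and `name2`; `ref` and `name2` come from the loop above, while `name` is only an assumed label for the first column. Note that a column containing NaN is stored as float, so the codes are cast back to int before being turned into strings; otherwise they would come out as '2.0' rather than '2'.

```python
import math
import pandas as pd

# Sample data from the question; the column name 'name' is an assumption
df = pd.DataFrame(
    [['Meyer', 2], ['Mueller', 4], ['Radisch', math.nan],
     ['Meyer', 2], ['Pavlenko', math.nan]],
    columns=['name', 'ref'],
)

# NaN forces 'ref' to float64, so cast the existing codes to int
# before converting them to strings ('2' instead of '2.0')
codes = df['ref'].dropna().astype(int).astype(str)

# reindex restores the full index (rows without a code become NaN);
# fillna then aligns on the index and takes the value from 'name'
df['name2'] = codes.reindex(df.index).fillna(df['name'])

print(df)
```

This is only one possible way to get there; it avoids row-by-row assignment entirely, which also sidesteps the chained indexing used in the loop.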