I have a pandas DataFrame in which I'm trying to clean a column of string values. The column includes some missing data, which is being interpreted as float('nan'). The data is equivalent to:
import pandas as pd
df = pd.DataFrame({'otherData': [1, 2, 3, 4], 'stringColumn': [float('nan'), 'Random string one... ', 'another string.. ', 'a third string ']})
DataFrame contents:
otherData  stringColumn
1          nan
2          'Random string one... '
3          'another string.. '
4          'a third string '
I want to clean the stringColumn data of the various trailing ellipses and whitespace, and impute empty strings, i.e. '', for nan values.
To do this, I'm using code equivalent to:
df['stringColumn'] = df['stringColumn'].fillna('')
df['stringColumn'] = df['stringColumn'].str.strip()
df['stringColumn'] = df['stringColumn'].str.strip('...')
df['stringColumn'] = df['stringColumn'].str.strip('..')
The problem is that when I run this code in the script I've written, it doesn't work: there are still nan values in my 'stringColumn' column, and some, but not all, of the ellipses remain. There are no warning messages. However, when I run the exact same code in the Python shell, it works, imputing '' for nan and cleaning up the strings as desired. I've tried running it in IDLE 3.5.0 and Spyder 3.2.4, with the same result.
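For reference, the cleaning steps above can be condensed into a self-contained sketch. It relies on the fact that str.strip treats its argument as a set of characters rather than a literal substring, so '...' and '..' both just mean "strip periods". Passing '. ' is my assumption about the intended cleanup (periods and plain spaces on both ends, but not tabs or newlines), and the name cleaned is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'otherData': [1, 2, 3, 4],
    'stringColumn': [float('nan'), 'Random string one... ',
                     'another string.. ', 'a third string '],
})

# Replace missing values first so every entry is a str.
cleaned = df['stringColumn'].fillna('')

# str.strip(chars) removes any leading/trailing characters found in chars,
# so '. ' covers spaces, '..' and '...' in a single pass.
# (Assumption: only plain spaces need removing, not tabs or newlines.)
cleaned = cleaned.str.strip('. ')

# fillna and the .str methods return new Series; nothing changes
# in the DataFrame until the result is assigned back.
df['stringColumn'] = cleaned

print(df['stringColumn'].tolist())
# ['', 'Random string one', 'another string', 'a third string']
print(df['stringColumn'].isna().any())
# False
```

Running the two print lines at the end of the script, right after the assignment, would show whether the column has actually been cleaned at that point.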