Remove rows where column value type is string Pandas

Question

I have a pandas dataframe. One of my columns should only be floats. When I try to convert that column to floats, I'm alerted that there are strings in there. I'd like to delete all rows where values in this column are strings...

EdChum · Accepted Answer · 2017-11-06 20:01:16Z

28

Use convert_objects with the parameter convert_numeric=True; this will coerce any non-numeric values to NaN:

In [24]:

df = pd.DataFrame({'a': [0.1,0.5,'jasdh', 9.0]})
df
Out[24]:
       a
0    0.1
1    0.5
2  jasdh
3      9
In [27]:

df.convert_objects(convert_numeric=True)
Out[27]:
     a
0  0.1
1  0.5
2  NaN
3  9.0
You can then drop them:

In [29]:

df.convert_objects(convert_numeric=True).dropna()
Out[29]:
     a
0  0.1
1  0.5
3  9.0

UPDATE

Since version 0.17.0 this method is deprecated and you need to use to_numeric instead. Unfortunately, it operates on a Series rather than a whole df, so the equivalent code is now:

df.apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna()
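Applying to_numeric to every column would also turn genuinely textual columns into NaN, so with mixed data it may be safer to coerce only the column that should be numeric. A minimal sketch, assuming a hypothetical frame with 'name' and 'age' columns (the names come from the comments on this answer; the data is made up):

```python
import pandas as pd

# Hypothetical data: 'name' must stay as strings, 'age' should be numeric.
df = pd.DataFrame({
    'name': ['ann', 'bob', 'cat', 'dan'],
    'age': [31, 'unknown', 45.5, '27'],
})

# Coerce only the 'age' column; anything that cannot be parsed becomes NaN.
df['age'] = pd.to_numeric(df['age'], errors='coerce')

# Drop rows where 'age' could not be converted; other columns are untouched.
clean = df.dropna(subset=['age'])
print(clean)
```

Note that to_numeric also parses numeric-looking strings such as '27', so those rows are kept and converted rather than dropped.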

edited Nov 6, 2017 at 20:01

answered Nov 6, 2014 at 9:14

EdChum


6 Comments

porteclefs Over a year ago

Thanks for this! My dataframe has multiple columns. Some columns need to have strings. For instance, I have a column 'name' and a column 'age'. The column 'age' needs to be numeric. I tried: df.age.convert_objects(convert_numeric=True) and got 'Series' object has no attribute 'convert_objects'.

EdChum Over a year ago

You need to do df[['age']].convert_objects(convert_numeric=True) in that case

porteclefs Over a year ago

Oh I see, so [['age']] picks out the column in df. Very helpful. However, I'm getting a TypeError: convert_objects() got an unexpected keyword argument 'convert_numeric'. I just checked the documentation and 'convert_numeric=True' is the correct argument. Thoughts?

porteclefs Over a year ago

Okay, I think that my pandas is out of date. Updating now.

magicsword Over a year ago

Hi. I get a 'convert_objects deprecated' FutureWarning when trying to use this. Any suggestions?

jpp · Answer · 2018-09-07 22:07:50Z

6

"One of my columns should only be floats. I'd like to delete all rows where values in this column are strings"

You can convert your series to numeric via pd.to_numeric and then use pd.Series.notnull to build a Boolean mask. Conversion to float is required as a separate step, because the filtered column otherwise keeps its object dtype.

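import pandas as pd

# Data from @EdChum
df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})

# Keep only rows whose value parses as a number; .copy() avoids a
# SettingWithCopyWarning on the assignment below
res = df[pd.to_numeric(df['a'], errors='coerce').notnull()].copy()
res['a'] = res['a'].astype(float)

print(res)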

     a
0  0.1
1  0.5
3  9.0
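If the goal is literally to drop values whose Python type is str (including numeric-looking strings such as '3.5', which to_numeric would keep), a type check is an alternative. This is a sketch of a different technique from the one above, using the same sample data plus one extra made-up value:

```python
import pandas as pd

# Sample data from above, plus a numeric-looking string for illustration.
df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0, '3.5']})

# True wherever the stored Python object is a str.
is_str = df['a'].map(lambda v: isinstance(v, str))

# Keep the non-string rows, then convert the remaining values to float.
res = df[~is_str].copy()
res['a'] = res['a'].astype(float)

print(res)
```

Which behaviour is preferable depends on whether strings like '3.5' should count as valid numbers in your data.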

answered Sep 7, 2018 at 22:07

jpp

geomars · Answer · 2018-06-28 13:36:34Z

1

Assume your data frame is df and you want to ensure that all data in one of its columns (at position n) is numeric with a specific pandas dtype, e.g. float. Note that this keeps every row and replaces non-numeric values with 0 instead of deleting them:

df[df.columns[n]] = pd.to_numeric(df[df.columns[n]], errors='coerce').fillna(0).astype(float)

edited Jun 28, 2018 at 13:36

answered Jun 28, 2018 at 13:09

geomars

Karthik V · Answer · 2014-11-06 06:08:19Z

0

You can find the data type of a column from the dtype.kind attribute. Something like df[col].dtype.kind. See the numpy docs for more details. Transpose the dataframe to go from indices to columns.
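As a rough illustration of the dtype.kind check, the sketch below inspects each column of a small made-up frame. The kind code 'O' means object dtype and 'f' means floating point; the column names are hypothetical:

```python
import pandas as pd

# Made-up frame: 'a' mixes floats and a string, 'b' is purely float.
df = pd.DataFrame({
    'a': [0.1, 0.5, 'jasdh', 9.0],
    'b': [1.0, 2.0, 3.0, 4.0],
})

# Map each column name to the one-character kind code of its dtype.
kinds = {col: df[col].dtype.kind for col in df.columns}
print(kinds)
```

This only reveals that a column as a whole is stored as object; it does not say which rows hold strings, so a per-value check or to_numeric is still needed to remove them.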

answered Nov 6, 2014 at 6:08

Karthik V
