I have a DataFrame and want to drop the rows whose value in the Score column is not numeric.

import pandas as pd

df=pd.DataFrame({
'Score': [4.0,6,'3 1/3',7,'43a'],
'Foo': ['Nis','and stimpy','d','cab','abba'],
'Faggio':[0,1,0,1,0]
})

The result I want should look like:

   Faggio         Foo  Score
0       0         Nis      4
1       1  and stimpy      6
3       1         cab      7

I have tried:

ds=df[df['Score'].apply(lambda x: str(x).isnumeric())]

print(ds)

ds2=df[df['Score'].apply(lambda x: str(x).isdigit())]

print(ds2)

But both of them also dropped the row containing the float.

1 Answer

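I think you need to add isnull to check for NaN values, because the .str methods return NaN for values that are not strings, which here are the actual numbers. Instead of apply, you can call the text methods str.isnumeric() and str.isdigit() directly and use the result for boolean indexing, which is usually simpler and faster: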

print(df['Score'].str.isnumeric())
0      NaN
1      NaN
2    False
3      NaN
4    False
Name: Score, dtype: object

print(df['Score'].str.isnumeric().isnull())
0     True
1     True
2    False
3     True
4    False
Name: Score, dtype: bool

print(df[df['Score'].str.isnumeric().isnull()])
   Faggio         Foo Score
0       0         Nis     4
1       1  and stimpy     6
3       1         cab     7

print(df[df['Score'].str.isdigit().isnull()])
   Faggio         Foo Score
0       0         Nis     4
1       1  and stimpy     6
3       1         cab     7
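
A minimal sketch of why the isnull mask works, and where it can mislead. It assumes the column holds a mix of real numbers and strings; the extra value '5' is hypothetical and not part of the question's data. Because the mask keeps every element that is not a string, a numeric string such as '5' is dropped along with the malformed ones:

```python
import pandas as pd

# '5' is a hypothetical extra value added for illustration only
scores = pd.Series([4.0, 6, '3 1/3', 7, '43a', '5'])

# .str methods return NaN for elements that are not strings,
# so isnull() is True exactly for the non-string elements
mask = scores.str.isnumeric().isnull()

kept = scores[mask]
print(kept)
```

So this approach filters by Python type rather than by whether the value can be read as a number, which may or may not be what you want on real data.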

Similar solution with to_numeric and notnull:

print(df[pd.to_numeric(df['Score'], errors='coerce').notnull()])
   Faggio         Foo Score
0       0         Nis     4
1       1  and stimpy     6
3       1         cab     7
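
As a self-contained sketch of the to_numeric approach, the snippet below also stores the converted values back so that Score ends up with a numeric dtype. The names numeric and clean are only illustrative, and writing the converted values back is an optional extra step that the question did not ask for:

```python
import pandas as pd

df = pd.DataFrame({
    'Score': [4.0, 6, '3 1/3', 7, '43a'],
    'Foo': ['Nis', 'and stimpy', 'd', 'cab', 'abba'],
    'Faggio': [0, 1, 0, 1, 0],
})

# Values that cannot be parsed as numbers become NaN
numeric = pd.to_numeric(df['Score'], errors='coerce')

# Keep only the rows that parsed, working on a copy of the slice
clean = df[numeric.notna()].copy()

# Optional: replace the mixed-type column with the parsed floats
clean['Score'] = numeric[numeric.notna()]

print(clean)
print(clean.dtypes)
```

Note that errors='coerce' also turns values that were already missing into NaN, so those rows are dropped as well.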

5 Comments

This does not happen with the example, but on my real data I get this error: "Can only use .str accessor with string values, which use np.object_ dtype in pandas"
No problem, you can add a cast to string: df['Score'].astype(str).str.isnumeric()
Thanks, I had forgotten that you can coerce the data type.
OK, I will try to add a similar solution. Thank you for accepting. If you want, you can upvote too. Thanks.
Funny thing: when errors are not coerced, some CSV sheets come out all blank. So I think df[pd.to_numeric(df['Score'], errors='coerce').notnull()] is the best solution.