4

In a dataframe, I am trying to find columns that hold numeric data but have dtype "object". I want to do this in an automated way rather than by inspecting the actual data in the dataframe.

I tried this, but it didn't work:

for obj_feature in df.select_dtypes(include="object").columns:
    if df[obj_feature].str.isalpha == False:
        print("Numeric data columns", obj_feature)

Code to generate the DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                  'A': ['Month', 'Year', 'Quater'],
                  'B' : ['29.85', '85.43', '33.87'],
                  'C' : [45, 22, 33.4]})

Sorry, I forgot to add this. Expected output: pick column B, since it holds numeric values but has 'object' dtype.

Thanks!

3 Answers

3

You can use pandas.api.types.is_numeric_dtype:

from pandas.api.types import is_numeric_dtype
{c: is_numeric_dtype(df[c]) for c in df}

output:

{'id': True, 'A': False, 'B': False, 'C': True}

Selecting the numeric columns:

Here, use select_dtypes:

df.select_dtypes('number')

output:

   id     C
0   1  45.0
1   2  22.0
2   3  33.4

4 Comments

Thanks @mozway for your answer, but I am expecting different output. Edited my question, Sorry I should have done that earlier.
not sure why I got downvoted, the question was clarified after I answered
I have not done that.
@Anku I didn't say it was you ;)
2

Not straightforward, but the following is a catch-all that should work in every case.

First, select the columns with dtype 'object'. Second, attempt to coerce them to numeric with errors='coerce': any value that cannot be parsed as a number comes out as NaN, so dropna(axis=1) then drops those columns and leaves only the object columns that hold numeric data.

Code below

 df.select_dtypes('object').apply(lambda x: pd.to_numeric(x,errors='coerce')).dropna(axis=1)

Outcome

    B
0  29.85
1  85.43
2  33.87
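If you only need the column names rather than the converted values, the same coercion idea can be sketched as below. This is a minimal sketch, not the answer's exact code: the names `candidates`, `converted` and `numeric_like` are mine, and I pick candidates as "columns that are not already numeric" instead of calling select_dtypes('object'), on the assumption that this also covers setups where strings are stored under a dedicated string dtype.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'id': [1, 2, 3],
                   'A': ['Month', 'Year', 'Quater'],
                   'B': ['29.85', '85.43', '33.87'],
                   'C': [45, 22, 33.4]})

# Candidates: columns that pandas has not already typed as numeric.
candidates = [c for c in df.columns if not is_numeric_dtype(df[c])]

# Coerce each candidate column; values that cannot be parsed become NaN.
converted = df[candidates].apply(pd.to_numeric, errors='coerce')

# A column qualifies only if none of its values turned into NaN.
numeric_like = [c for c in converted.columns if converted[c].notna().all()]
print(numeric_like)  # ['B']
```

Note that this, like dropna(axis=1), rejects a column as soon as one value fails to parse, so a column that already contains missing values will not be reported.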

5 Comments

Please see my edits.
Thanks @wwnde. This is the output I need.
What if column B already contains NaN? I suppose dropna(axis=1) will not work in that case. Am I right?
I figured it out, I will use this code: df.select_dtypes('object').apply(lambda x: pd.to_numeric(x, errors = 'coerce')).dropna(axis=1, how='all')
Dropna (how='all', axis=1)?
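On the NaN question in these comments: one thing to watch with how='all' is that it keeps a column as soon as a single value parses, so a mostly-text column with one numeric string would also be kept. A possible alternative, sketched below under my own assumptions, is to ignore the values that are already missing and require every remaining value to parse. The helper name `numeric_like_columns`, the frame `df_missing` and its column D are hypothetical and only there for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def numeric_like_columns(frame):
    """Names of non-numeric columns whose non-missing values all parse as numbers."""
    found = []
    for name in frame.columns:
        col = frame[name]
        if is_numeric_dtype(col):
            continue  # already numeric, not what we are looking for
        present = col.dropna()  # ignore values that were missing to begin with
        if present.empty:
            continue  # nothing left to judge the column by
        parsed = pd.to_numeric(present, errors='coerce')
        if parsed.notna().all():
            found.append(name)
    return found


# Hypothetical data: B has a missing value, D mixes numbers and text.
df_missing = pd.DataFrame({'A': ['Month', None, 'Quater'],
                           'B': ['29.85', None, '33.87'],
                           'D': ['1', 'x', '3']})
print(numeric_like_columns(df_missing))  # ['B']
```

Here B is still reported despite its missing value, while D is rejected because 'x' does not parse.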
0

You might use pandas.api.types.is_numeric_dtype. Consider the following example:

import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                  'A': ['Month', 'Year', 'Quater'],
                  'B' : ['29.85', '85.43', '33.87'],
                  'C' : [45, 22, 33.4]})
for colname in df.columns:
    print(colname,pd.api.types.is_numeric_dtype(df[colname]))

output

id True
A False
B False
C True

1 Comment

Thanks @Daweo for your answer, but I am expecting different output. Edited my question, Sorry I should have done that earlier.
