I am working on a project that requires data cleaning, and I want to drop the columns that contain strings.

My plan was to define a function for this and then call it. I wrote the function, but it is not working.
Here's the function:

def removeStringColumns(df):
    for i in (df.columns):
        if type(df[i][0]) == "str":
            df = df.drop(df[i], axis=1)
    return df

And here's how I call it:

data = pd.read_csv("./data.csv")
data.dropna()
data = data.replace(np.nan, 0)
data = removeStringColumns(data)

1 Answer

Try select_dtypes and exclude 'object':

filtered_df = df.select_dtypes(exclude='object')

Or, to select only numeric columns, include 'number':

filtered_df = df.select_dtypes(include='number')

Sample df:

import numpy as np
import pandas as pd

df = pd.DataFrame({'v1': np.arange(0, 10),
                   'v2': ['dog'] * 10,
                   'v3': ['cat'] * 10,
                   'v4': np.arange(10, 20)})

df:

   v1   v2   v3  v4
0   0  dog  cat  10
1   1  dog  cat  11
2   2  dog  cat  12
3   3  dog  cat  13
4   4  dog  cat  14
5   5  dog  cat  15
6   6  dog  cat  16
7   7  dog  cat  17
8   8  dog  cat  18
9   9  dog  cat  19

filtered_df:

   v1  v4
0   0  10
1   1  11
2   2  12
3   3  13
4   4  14
5   5  15
6   6  16
7   7  17
8   8  18
9   9  19
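
As for why the function in the question does nothing: type(df[i][0]) == "str" compares a type object with the string "str", so the condition is always False (the usual check is isinstance(value, str)). In addition, df.drop(df[i], axis=1) passes the column's values where drop expects column labels. If a loop-style helper is still preferred over select_dtypes, a minimal sketch might look like the following. The name remove_string_columns is only illustrative, the sample df is the one from this answer, and the helper checks every value in a column rather than only the first row, which is an assumption about the desired behavior:

```python
import numpy as np
import pandas as pd


def remove_string_columns(df):
    """Return a copy of df without the columns that contain any str value."""
    # Collect the labels of columns in which at least one value is a Python str.
    string_cols = [
        col for col in df.columns
        if df[col].map(lambda value: isinstance(value, str)).any()
    ]
    # drop() expects column labels, not the column data itself.
    return df.drop(columns=string_cols)


# Same sample frame as in the answer above.
df = pd.DataFrame({'v1': np.arange(0, 10),
                   'v2': ['dog'] * 10,
                   'v3': ['cat'] * 10,
                   'v4': np.arange(10, 20)})

filtered_df = remove_string_columns(df)
print(list(filtered_df.columns))  # ['v1', 'v4']
```

Note also that data.dropna() in the question returns a new DataFrame; since the result is not assigned to anything, that line has no effect on data.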