
I am trying to remove all non-numeric values from a specific column using pandas:

[Image: screenshot of the data]

(a) I want to convert all values in the last column to float. The column contains some non-zero numeric values that I want to preserve as floats.

(b) If non-numeric values exist, I want to replace them all with 0.0 (float). For example, in line 247 the last column contains "a", which I would like to change to 0.0.

I have tried:

def isnumber(x):
    try:
        float(x)
        return True
    except:
        return False

df = pd.read_csv(filename)
df = df[clean_up.applymap(isnumber)]

This however changes every column:

[Image: screenshot of the result]

2 Answers


First of all, you should read your CSV file with header=None, skipinitialspace=True and quoting=1 as arguments of read_csv, because your header looks like a normal row, there is a space after each field separator, and some values are enclosed in double quotes.

To convert your last column, use pd.to_numeric:

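import pandas as pd

df = pd.read_csv('tubedata.csv', header=None, quoting=1, skipinitialspace=True)
# Non-numeric values become NaN, then NaN is replaced by 0 (the column ends up float)
df[5] = pd.to_numeric(df[5], errors='coerce').fillna(0)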

Output:

# Before conversion
>>> df[5].unique()
array(['0', '3', '2', '4', '6', 'a'], dtype=object)

# After conversion
>>> df[5].unique()
array([0., 3., 2., 4., 6.])
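
For anyone who wants to try this without the original file, here is a minimal sketch that reads a few made-up rows from memory. The sample rows are hypothetical (the real tubedata.csv is not shown in the question); only the read_csv arguments and the to_numeric call come from the answer above.

```python
import io
import pandas as pd

# Hypothetical rows standing in for tubedata.csv: no header line,
# a space after each comma, and double-quoted values.
sample = io.StringIO(
    '"Harrow", "Wealdstone", "Bakerloo", 3, "5", "0"\n'
    '"Kenton", "Harrow", "Bakerloo", 3, "4", "a"\n'
    '"Wembley", "Kenton", "Bakerloo", 2, "4", "3"\n'
)

# quoting=1 is the value of csv.QUOTE_ALL
df = pd.read_csv(sample, header=None, quoting=1, skipinitialspace=True)

# "a" cannot be parsed, so it becomes NaN and is then filled with 0
df[5] = pd.to_numeric(df[5], errors='coerce').fillna(0)

print(df[5].tolist())
print(df[5].dtype)
```

Because header=None is used, the columns are labelled 0 to 5, which is why the last column is addressed as df[5].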

5 Comments

Please share your CSV file (or a sample); it will be simpler.
because it needs iloc[:, -1] not iloc[-1]
@MugheesAsif. Thanks for accepting my answer. Next time, share your data sooner; it will be easier for everyone.
Is it possible to do this to the entire dataframe?
@HirakSarkar. df.apply(lambda x: pd.to_numeric(x, errors='coerce'))

You can apply pd.to_numeric to every column by mapping it over the DataFrame. Using errors='coerce' turns every value that cannot be converted into NaN, so any column containing such values can easily be dropped in the following step:

df.apply(pd.to_numeric, errors='coerce').dropna(axis=1)
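
A small sketch of what that line does, using a made-up DataFrame (the column names and values are hypothetical, chosen only for illustration). Note that dropna(axis=1) removes a column as soon as it contains a single NaN, so a mostly numeric column with one bad value is dropped too:

```python
import pandas as pd

# Hypothetical frame: "name" is entirely text, "count" has one bad value,
# "score" is fully numeric (stored as strings).
df = pd.DataFrame({
    "name": ["x", "y", "z"],
    "count": ["1", "a", "3"],
    "score": ["0.5", "1.5", "2.5"],
})

# Every value that cannot be parsed becomes NaN
converted = df.apply(pd.to_numeric, errors="coerce")

# Drop every column that contains at least one NaN
numeric_only = converted.dropna(axis=1)

print(list(numeric_only.columns))
```

Here only "score" survives: "name" is all NaN after coercion, and "count" is dropped because of the single "a".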

To allow for bad values (such as "a") within columns that are otherwise numeric, use fillna instead:

pd.to_numeric(df['some column'], errors='coerce').fillna(0.0)
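
To touch only the last column and leave the rest of the DataFrame as it is, the result can be assigned back to that column. A minimal sketch, assuming a made-up DataFrame and selecting the last column by position with df.columns[-1]:

```python
import pandas as pd

# Hypothetical data; only the last column needs cleaning.
df = pd.DataFrame({
    "station": ["A", "B", "C"],
    "zone": [1, 2, 3],
    "value": ["4", "a", "6"],
})

last = df.columns[-1]  # label of the last column, whatever it is called

# Coerce bad values to NaN, replace them with 0.0, and write the column back
df[last] = pd.to_numeric(df[last], errors="coerce").fillna(0.0)

print(df)
print(df.dtypes)
```

No columns are dropped here; the other columns keep their original values and dtypes.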

2 Comments

This changes the full dataframe to one column? I only want to change the last column
I do not want to drop any columns, I am only interested in changing the last column to (a) all float, and (b) all non-numeric values to 0.0.
