Using Python 3.8, pandas 1.1.4.

MRE:

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6], "b": ["01", "02", 1, "//N", None, np.nan]})

I want to convert column b to integers so that "01" becomes 1 (the string "01" and the integer 1 should be treated as the same value), "02" becomes 2, and so on, while replacing every value that cannot be converted to an integer with 0.

Current method:

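# Collect the distinct values that int() cannot convert.
c = []
for val in df["b"].unique():
    try:
        int(val)
    except (ValueError, TypeError):
        c.append(val)

# Replace the unconvertible values with 0, then cast the column.
df["b"] = df["b"].replace(c, 0)
df["b"] = df["b"].astype(int)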

Output (this is the desired result):

    a   b
0   1   1
1   2   2
2   3   1
3   4   0
4   5   0
5   6   0

Even though this works, I'm looking for a more efficient and readable way.

2 Answers

3

We can use pd.to_numeric + fillna to replace NaN with 0 and downcast from float to int:

df['b'] = pd.to_numeric(df['b'], errors='coerce').fillna(0, downcast='int')

Or convert to int with astype:

df['b'] = pd.to_numeric(df['b'], errors='coerce').fillna(0).astype(int)

df:

   a  b
0  1  1
1  2  2
2  3  1
3  4  0
4  5  0
5  6  0
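
As a self-contained sketch of the second variant above, the conversion can be wrapped in a small helper. The name to_int_or_zero is hypothetical and not part of pandas. This version sticks to fillna(0).astype(int) because newer pandas 2.x releases deprecate the downcast keyword of fillna, so the first variant may emit a warning there.

```python
import numpy as np
import pandas as pd


def to_int_or_zero(series: pd.Series) -> pd.Series:
    """Parse values as numbers; anything that cannot be parsed becomes 0."""
    # errors="coerce" turns "//N", None and NaN into NaN (float64 result).
    numeric = pd.to_numeric(series, errors="coerce")
    # Fill the NaN values first, then the cast to int is safe.
    return numeric.fillna(0).astype(int)


df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6],
                   "b": ["01", "02", 1, "//N", None, np.nan]})
df["b"] = to_int_or_zero(df["b"])
print(df["b"].tolist())  # [1, 2, 1, 0, 0, 0]
```

Note that astype(int) truncates any fractional part, so a string such as "3.45" would end up as 3 with this helper.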

2 Comments

The documentation says that when the downcast parameter is set to 'integer', the result is converted to the smallest signed int dtype. So I tried df['b'] = pd.to_numeric(df['b'], errors='coerce', downcast='integer').fillna(0), but this returns a float dtype. Do you know why?
It means the smallest type possible. If there is a string "3.45" in b, the smallest possible type is float; downcast will not make a dtype change that would result in data loss. Additionally, NaN is a float, so downcast will not work with to_numeric when errors are coerced to NaN. From the documentation: "As this behaviour is separate from the core conversion to numeric values, any errors raised during the downcasting will be surfaced regardless of the value of the ‘errors’ input." This is why I added the astype(int) option, in case you need to force the reduction from float to int.
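
The dtype behaviour discussed in these comments can be checked directly. The following is a minimal sketch using the column from the question; the variable names coerced and filled are only illustrative.

```python
import numpy as np
import pandas as pd

b = pd.Series(["01", "02", 1, "//N", None, np.nan])

# Coercion leaves NaN in the result, so downcast="integer" cannot be
# applied without losing data and the values stay floating point.
coerced = pd.to_numeric(b, errors="coerce", downcast="integer")
print(coerced.dtype)  # float64

# Once the NaN values are filled, a second to_numeric call can downcast.
filled = pd.to_numeric(coerced.fillna(0), downcast="integer")
print(filled.dtype)  # int8
```

The order matters: downcasting only succeeds after the missing values have been replaced.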
0

You could also try np.where:

# Raw string avoids an invalid-escape warning; expand=False returns a Series.
x = df['b'].astype(str).str.extract(r'(\d+)', expand=False)
df['b'] = np.where(x.notna(), x.astype(float), 0)

Printing the result:

print(df)

Output (note that b is float here, not int):

   a    b
0  1  1.0
1  2  2.0
2  3  1.0
3  4  0.0
4  5  0.0
5  6  0.0
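
If integers are wanted from this approach, one possible variant is sketched below. Two things here are assumptions that go beyond the answer: the anchored pattern ^(\d+)$, which accepts only values made up entirely of digits (the unanchored (\d+) would also pull digits out of mixed strings such as "a12"), and the final astype(int).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6],
                   "b": ["01", "02", 1, "//N", None, np.nan]})

# expand=False returns a Series; rows that do not match become NaN.
digits = df["b"].astype(str).str.extract(r"^(\d+)$", expand=False)

# np.where swaps the non-matching rows for 0 before the cast to int,
# so no NaN ever reaches astype(int).
df["b"] = np.where(digits.notna(), digits.astype(float), 0).astype(int)
print(df["b"].tolist())  # [1, 2, 1, 0, 0, 0]
```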
