2

I am a beginner in pandas and Python and would appreciate a little help. Here is my dataset. The k_symbol column contains either UVER or SIPO, and I want to replace UVER with the int 0 and SIPO with the int 1.

dataset

I tried dff.replace(to_replace=['k_symbol'], value=[1, 0]), but it does not seem right. I would appreciate any suggestions.

6 Answers

1

apply() with a Python lambda is notoriously slow compared with vectorized operations, so if you care about speed, consider one of these solutions:

1) map()

df["k_symbol"] = df["k_symbol"].map({"UVER":0, "SIPO":1})

2) boolean to int conversion

df["k_symbol"] = (df["k_symbol"] == "SIPO").astype(int)
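The two options above differ when the column holds a label other than UVER or SIPO. Here is a minimal sketch of that difference, assuming a made-up third label "OTHER" that is not in the original dataset:

```python
import pandas as pd

# Hypothetical sample; "OTHER" stands in for any unexpected label.
s = pd.Series(["UVER", "SIPO", "OTHER"], name="k_symbol")

# map(): labels missing from the dict become NaN.
mapped = s.map({"UVER": 0, "SIPO": 1})

# Boolean comparison: everything that is not "SIPO" becomes 0.
flagged = (s == "SIPO").astype(int)

print(mapped.isna().tolist())
print(flagged.tolist())
```

If the column is guaranteed to contain only the two labels, both approaches give the same integers.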

Timings

%%timeit
df["k_symbol"] = (df["k_symbol"] == "SIPO").astype(int)
10 loops, best of 3: 83.3 ms per loop

%%timeit
df['k_symbol'].apply(lambda x : 0 if x == 'UVER' else 1 )
1 loop, best of 3: 550 ms per loop

%%timeit
df["k_symbol"].map({"UVER":0,"SIPO":1})
10 loops, best of 3: 83.6 ms per loop
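For completeness, the replace() call from the question can also be made to work. In the original attempt, to_replace=['k_symbol'] is treated as a value to search for, not as a column name. Here is a sketch of one way to write it, assuming a small stand-in frame named dff with an object-dtype column (the real dataset is not shown in the question):

```python
import pandas as pd

# Stand-in for the asker's data; dtype=object is an assumption about how
# the string column is stored.
dff = pd.DataFrame({"k_symbol": ["UVER", "SIPO", "SIPO", "UVER"]}, dtype=object)

# Call replace() on the column itself and pass old and new values
# as two lists of equal length.
codes = dff["k_symbol"].replace(["UVER", "SIPO"], [0, 1])

# Make the integer dtype explicit instead of relying on replace() to infer it.
dff["k_symbol"] = codes.astype(int)

print(dff["k_symbol"].tolist())
```

This was not part of the timings above, so no claim is made here about its speed relative to map().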

0

Use this single line to get the desired result:

df.k_symbol = df.k_symbol.apply(lambda o : 1 if o == 'SIPO' else 0 if o == 'UVER' else o)

You can simplify it as below if every value other than SIPO should become 0:

df.k_symbol = df.k_symbol.apply(lambda o : 1 if o == 'SIPO' else 0)

0

import pandas as pd

df = pd.DataFrame(["SIPO","UVER"] * 3, columns=["k_symbol"])

df["k_symbol"].map({"UVER":0,"SIPO":1})

Output: df

  k_symbol
0     SIPO
1     UVER
2     SIPO
3     UVER
4     SIPO
5     UVER

mapped:

0    1
1    0
2    1
3    0
4    1
5    0

0

Use .loc:

import pandas as pd

df = pd.DataFrame(
    [[1, "SIPO"], [0, "UVER"], [0, "UVER"], [0, "UVER"], [1, "UVER"],],
    columns=["gender", "k_symbol"],
)

df.loc[df["k_symbol"] == "SIPO", "k_symbol"] = 1
df.loc[df["k_symbol"] == "UVER", "k_symbol"] = 0

print(df)

Returning:

   gender k_symbol
0       1        1
1       0        0
2       0        0
3       0        0
4       1        0
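One caveat with the .loc approach: assigning integers into a column of strings typically leaves the column with object dtype, even though every value is now 0 or 1. Here is a small sketch, assuming an object-dtype sample column, that checks the dtype and converts it explicitly:

```python
import pandas as pd

# Sample data; dtype=object is an assumption about how the strings are stored.
df = pd.DataFrame({"k_symbol": ["SIPO", "UVER", "UVER"]}, dtype=object)

df.loc[df["k_symbol"] == "SIPO", "k_symbol"] = 1
df.loc[df["k_symbol"] == "UVER", "k_symbol"] = 0

# The values are integers, but the column dtype has not changed.
dtype_before = df["k_symbol"].dtype

# Convert explicitly so the column is a real integer column.
df["k_symbol"] = df["k_symbol"].astype(int)

print(dtype_before, df["k_symbol"].dtype)
```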

0

You can pass an anonymous function (a lambda) that checks a condition to apply():

df['k_symbol'] = df['k_symbol'].apply(lambda x : 0 if x == 'UVER' else 1 )

0

I believe a better (faster) way than apply() is using .eq(), the method form of the == comparison:

df['k_symbol'] = df['k_symbol'].eq('SIPO').astype(int)
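As a quick check, .eq() and the == operator should give the same result. Here is a minimal sketch on made-up sample data:

```python
import pandas as pd

# Made-up sample column with the two labels from the question.
s = pd.Series(["SIPO", "UVER", "SIPO", "UVER"], name="k_symbol")

via_eq = s.eq("SIPO").astype(int)    # method form
via_op = (s == "SIPO").astype(int)   # operator form

print(via_eq.tolist())
print(via_eq.equals(via_op))
```

The method form is mainly convenient for chaining, since it avoids the extra parentheses around the comparison.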
