0
import numpy as np
import pandas as pd

I have a DataFrame that looks like this:

   X
0  100A
1  100B
2  100B
3  500A
4  500B
5  400B
6  700A
7  200B
8  400B
9  900A
10  800B

My goal is to convert these strings to integers, dividing the number by 10 if the string contains 'A':

   X
0  10
1  100
2  100
3  50
4  500
5  400
6  70
7  200
8  400
9  90
10  800

I have tried using a for loop over the whole column:

for i in df.X:
    if 'A' in i:
        y = i.replace('A','') 
        y = int(y)/10
        print(y)
    else:
        k = i.replace('B','')
        k = int(k)
        print(k)

But this only prints the values; I don't know how to replace them directly and store them back in the column. This method also seems slow. Is there a better way to do this in pandas? Thanks!

2 Answers

1

Try this:

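# Index labels of the rows whose value ends with 'A'
a_index = df[df['X'].str.endswith('A')].index
# Drop the trailing letter and convert the remainder to int
df['X'] = df['X'].str.slice(stop=-1).astype(int)
# Floor division keeps the column as integers
df.loc[a_index, 'X'] = df.loc[a_index, 'X'] // 10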

First, save the index of every row whose value ends with 'A'. Then strip the trailing letter and convert everything to integers. Finally, divide the previously indexed rows (a_index) by 10.
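
The three steps described above can also be sketched as a self-contained script. This is only an illustration under some assumptions: the sample data is copied from the question, the names ends_with_a and numbers are made up for this sketch, and a boolean mask with Series.where stands in for the saved index. Floor division is used on the assumption that every 'A' value is a multiple of 10, as in the question.

```python
import pandas as pd

# Sample data copied from the question; the column name "X" follows the question.
df = pd.DataFrame({"X": ["100A", "100B", "100B", "500A", "500B", "400B",
                         "700A", "200B", "400B", "900A", "800B"]})

# Step 1: remember which rows end with "A" (a boolean mask, hypothetical name).
ends_with_a = df["X"].str.endswith("A")

# Step 2: drop the trailing letter and convert the remainder to integers.
numbers = df["X"].str.slice(stop=-1).astype(int)

# Step 3: keep the non-"A" rows as they are and replace the "A" rows
# with the value divided by 10 (floor division keeps an integer dtype).
df["X"] = numbers.where(~ends_with_a, numbers // 10)

print(df)
```

Because nothing is done row by row in Python, this should also address the speed concern raised in the question.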

0

You can solve this using a regular expression:

import re
import pandas as pd

df = pd.DataFrame({'A':['100A','100B','200A']})

for row in range(len(df)):
    # Strip every non-digit character, then convert what is left to an integer
    number = int(re.sub(r'\D', '', df.iloc[row].values[0]))
    if df.iloc[row].str.contains('A').any():
        df.iloc[row] = number / 10
    else:
        df.iloc[row] = number

OUTPUT

     A
0  10.0
1   100
2  20.0
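
The same regular-expression idea can be applied to the whole column at once instead of looping over rows. The following is a rough sketch, not the answer's own method: it assumes every value is a run of digits followed by exactly one uppercase letter, and the group names digits and letter are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["100A", "100B", "200A"]})

# Split each value into its digits and its trailing letter in one pass.
# Named groups become the column names of the resulting DataFrame.
parts = df["A"].str.extract(r"^(?P<digits>\d+)(?P<letter>[A-Z])$")

# Convert to float up front so the column has one consistent dtype.
values = parts["digits"].astype(float)

# Keep rows whose letter is not "A"; divide the "A" rows by 10.
df["A"] = values.where(parts["letter"] != "A", values / 10)

print(df)
```

Unlike the loop, this leaves the column with a single float dtype rather than a mix of floats and integers.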
