I have a dataframe and would like to replace some of the entries based on a regular expression match. Here is a toy example:

import numpy as np
import pandas as pd
dfl = pd.DataFrame(np.random.randn(5,4), columns=list('ABCD'))

          A         B         C         D
0  0.995647 -0.507860  0.246656  0.400589
1 -0.149536 -0.485617 -0.132031  0.214816
2 -0.730974 -0.932630  0.625197  1.887758
3  2.812800  0.329197  0.233513  0.140899
4 -1.897268  0.072307  0.790148  0.096455

Now let's convert all the entries to be strings.

dfl = dfl.astype(str)

Next, I would like to replace every entry that contains 40 with the word boat.

I tried:

dfl = dfl.replace(r'.*40.*', "boat") 

but this doesn't modify dfl at all.

What am I doing wrong?

1 Answer

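Pass regex=True. Without it, replace treats '.*40.*' as a literal string and only replaces cells whose entire value equals that string, so nothing in dfl changes.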

dfl = dfl.replace('.*40.*', 'boat', regex=True)
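To see the two behaviours side by side on reproducible data, here is a minimal sketch. The frame `df` and its values are hypothetical, chosen only for illustration; one cell deliberately holds the pattern text itself so the literal comparison has something to match.

```python
import pandas as pd

# Hypothetical fixed values (not from the question), so the result is reproducible.
df = pd.DataFrame({"A": ["0.4012", "1.2345"], "B": ["-0.9403", ".*40.*"]})

# Default (regex=False): the pattern is compared literally against whole cells,
# so only the cell that is exactly the string '.*40.*' is replaced.
literal = df.replace(r".*40.*", "boat")

# regex=True: the pattern is applied as a regular expression to every string cell.
as_regex = df.replace(r".*40.*", "boat", regex=True)

print(literal)
print(as_regex)
```

In `literal` only the cell that equals the pattern text becomes boat, while in `as_regex` every cell containing 40 does.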

Details

In [278]: dfl
Out[278]:
                 A                 B                C                D
0  -0.389710060851    0.864059364935   0.499405126285   0.457617711403
1   0.136417007517  -0.0650312534859  0.0745132664561    2.02466341236
2   0.842889708053   -0.370605269504  -0.626932398518  0.0440612725966
3  -0.403271275281    -1.37477622923  -0.499721883883   -1.55997893498
4    3.39420415568    0.152915014005   0.205876128883  -0.644183954321

In [279]: dfl = dfl.replace('.*40.*', 'boat', regex=True)

In [280]: dfl
Out[280]:
                 A                 B                C                D
0  -0.389710060851              boat             boat             boat
1   0.136417007517  -0.0650312534859  0.0745132664561    2.02466341236
2   0.842889708053   -0.370605269504  -0.626932398518             boat
3             boat    -1.37477622923  -0.499721883883   -1.55997893498
4    3.39420415568              boat   0.205876128883  -0.644183954321
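
The leading and trailing .* in the pattern matter: with regex=True, replace substitutes only the matched text, so a bare '40' would rewrite just those two characters. The sketch below illustrates this on hypothetical values and also shows one possible alternative (not part of the answer above) that builds a boolean mask with str.contains and overwrites the flagged cells with mask.

```python
import pandas as pd

# Hypothetical fixed values (not from the question), so the result is reproducible.
df = pd.DataFrame({"A": ["0.4012", "1.2345"], "B": ["-0.9403", "2.7180"]})

# A bare '40' pattern replaces only the matched substring inside each cell.
partial = df.replace("40", "boat", regex=True)

# Alternative sketch: flag cells that contain '40' as plain text,
# then overwrite the flagged cells as a whole.
contains_40 = df.apply(lambda col: col.str.contains("40", regex=False))
whole = df.mask(contains_40, "boat")

print(partial)
print(whole)
```

The mask version assumes every column holds strings, as it does after astype(str).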