Replacing specific strings in dataframe

Question

I have a dataframe like this:

     Basic Stats    Min       Max      Mean     Stdev    Num  Eigenvalue
0      Band 1    0.428944  0.843916  0.689923  0.052534   1    0.229509
1      Band 10  -0.000000  0.689320  0.513170  0.048885   2    0.119217

I want to replace Band 1 with LG68 and Band 10 with LG69.

I have tried:

df=df.replace({'Band 1': 'LG68', 'Band 10': 'LG69'}, regex=True)

but this returns:

     Basic Stats    Min       Max      Mean     Stdev  Num  Eigenvalue
0      LG68     0.428944  0.843916  0.689923  0.052534  1    0.229509
1      LG680   -0.000000  0.689320  0.513170  0.048885  2    0.119217

because Band 10 also contains Band 1 within it.

I have also tried:

df=df.T
df=df.rename(columns={'Band 1':'LG68', 'Band10': 'LG69'})

but this fails silently (no names change at all), possibly because Band 1 and Band 10 are not column names but values in actual rows?

Joseph Stover · Accepted Answer · 2015-08-25 18:50:27Z

1

You can fix the regex by adding a $ to the end of Band 1, making the statement look like

df=df.replace({'Band 1$': 'LG68', 'Band 10': 'LG69'}, regex=True)

The $ anchors the match at the end of the string, so Band 1$ only matches when Band 1 is followed by the end of the string (or by a single newline at the very end of the string). You could also use \Z, which matches only at the very end of the string.
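As a minimal sketch of the anchoring behaviour described here, the following uses only Python's re module on a few sample strings (the list values is made up for illustration, and the third entry has a trailing newline to show where $ and \Z differ):

```python
import re

# Sample cell values; the last one ends with a newline on purpose.
values = ["Band 1", "Band 10", "Band 1\n"]

# Unanchored: the pattern also matches the prefix of "Band 10".
unanchored = [re.sub(r"Band 1", "LG68", v) for v in values]

# "$" matches at the end of the string, or just before a trailing newline.
dollar = [bool(re.search(r"Band 1$", v)) for v in values]

# "\Z" matches only at the very end of the string.
strict = [bool(re.search(r"Band 1\Z", v)) for v in values]

print(unanchored)  # the second entry becomes "LG680"
print(dollar)
print(strict)
```

For cell values without trailing newlines, the two anchors should behave the same.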

Jianxun Li · Accepted Answer · 2015-08-25 18:27:00Z

1

Try using the map function with a dict that describes the mapping.

df['Basic Stats'] = df['Basic Stats'].map({'Band 1': 'LG68', 'Band 10': 'LG69'})
df

  Basic Stats     Min     Max    Mean   Stdev  Num  Eigenvalue
0        LG68  0.4289  0.8439  0.6899  0.0525    1      0.2295
1        LG69 -0.0000  0.6893  0.5132  0.0489    2      0.1192

6 Comments

Stefano Potter Over a year ago

That returns all values within Basic Stats as NaN.

Jianxun Li Over a year ago

@StefanoPotter Can you show the result of df['Basic Stats'].values[:5] for your original dataframe? NaN results typically come from a mismatch between the keys in the dict and the values in the Basic Stats column.

Stefano Potter Over a year ago

['     Band 1' '     Band 2' '     Band 3' '     Band 4' '     Band 5']

I simplified the original post; there are really 80 bands, numbered 1-80, that I am doing this for.

Jianxun Li Over a year ago

@StefanoPotter So there is a whitespace ahead of each band. Try .map({' Band 1': 'LG68', ' Band 10': 'LG69'}). Alternatively, try removing the leading or trailing whitespace before mapping.

Stefano Potter Over a year ago

There actually shouldn't be whitespace... but even with the added space it still returns NaN.
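One possible way to make map robust to the padding shown in the array output above is to strip the values before looking them up. This is only a sketch: the sample dataframe is reconstructed from that output, and falling back to the stripped original for unmapped bands is an assumption about what is wanted, not something stated in the thread.

```python
import pandas as pd

# Values with five leading spaces, as in the array printed above.
df = pd.DataFrame({"Basic Stats": ["     Band 1", "     Band 10", "     Band 2"]})

mapping = {"Band 1": "LG68", "Band 10": "LG69"}

# Strip surrounding whitespace so the values match the dict keys exactly.
stripped = df["Basic Stats"].str.strip()

# map() yields NaN for values missing from the dict, so fall back to the
# stripped original for those rows (an assumed preference).
df["Basic Stats"] = stripped.map(mapping).fillna(stripped)

print(df["Basic Stats"].tolist())
```

A key with a single leading space would not match a value with five, which may be why the suggested workaround still returned NaN.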

Ben · Accepted Answer · 2015-08-25 18:46:47Z

1

You are setting regex to true, so you should be able to just use a regex. Add $ to match the end of the string.

df=df.replace({'Band 1$': 'LG68', 'Band 10$': 'LG69'}, regex=True)
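For the 80-band case mentioned in the comments, writing every anchored pattern by hand gets tedious. A sketch of generating them from a plain name mapping follows; new_names is hypothetical (only the first two pairs come from the question, the third is invented for illustration), and the \s* padding is an assumption added to tolerate stray whitespace.

```python
import re

import pandas as pd

df = pd.DataFrame({"Basic Stats": ["Band 1", "Band 10", "Band 11"]})

# Hypothetical mapping; "Band 11" -> "LG70" is made up for this example.
new_names = {"Band 1": "LG68", "Band 10": "LG69", "Band 11": "LG70"}

# Anchor both ends so each pattern must match a whole cell;
# \s* tolerates leading or trailing whitespace around the name.
patterns = {rf"^\s*{re.escape(old)}\s*$": new for old, new in new_names.items()}

df = df.replace(patterns, regex=True)

print(df["Basic Stats"].tolist())
```

Because every pattern is anchored at both ends, the order of the dict entries should not matter.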
