4

Suppose a pandas DataFrame looks like this:

Id    Name        Gene
1    ARR_R         C
1    AR2           C
1    A3412d_R      C
1    Asfsvv        C
1    A_RUUYR_R     C

I need to delete a substring, for example _R, but only if it occurs within, say, the last 5 characters.

I tried this way:

df['Name']=(df.Name.replace({'_R':''}, regex=True))

But this code changes A_RUUYR_R to 'AUUYR', when it should be A_RUUYR. Is it possible to tell the replace function to start from the end?
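For reference, a minimal reproduction of the problem, assuming the frame holds exactly the five rows listed above (the variable name `replaced` is only for illustration):

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Id": [1, 1, 1, 1, 1],
    "Name": ["ARR_R", "AR2", "A3412d_R", "Asfsvv", "A_RUUYR_R"],
    "Gene": ["C"] * 5,
})

# A plain regex replace removes every "_R", including the one
# in the middle of "A_RUUYR_R".
replaced = df["Name"].replace({"_R": ""}, regex=True)
print(replaced.tolist())
# ['ARR', 'AR2', 'A3412d', 'Asfsvv', 'AUUYR']
```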

post your expected output (Commented Jul 31, 2019 at 7:37)

3 Answers

2

IIUC, you can use slicing and concatenation like:

df.Name.str[:-5] + df.Name.str[-5:].replace({'_R':''}, regex=True)

[out]

0        ARR
1        AR2
2     A3412d
3     Asfsvv
4    A_RUUYR
Name: Name, dtype: object
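The same result can probably be had with a single end-anchored regex instead of slicing. This is only a sketch: it assumes "in the last 5 chars" means the whole token must lie inside the final `window` characters, and `strip_in_tail` is a hypothetical helper name, not a pandas function.

```python
import re
import pandas as pd

def strip_in_tail(names: pd.Series, token: str = "_R", window: int = 5) -> pd.Series:
    """Remove `token` only where it lies fully within the last `window` chars.

    Hypothetical helper for illustration; not part of pandas.
    """
    if window < len(token):
        # The token cannot fit inside the window, so nothing is removed.
        return names.copy()
    slack = window - len(token)
    # The token may be followed by at most `slack` characters before the end.
    pattern = re.escape(token) + r"(?=.{0," + str(slack) + r"}$)"
    return names.str.replace(pattern, "", regex=True)

names = pd.Series(["ARR_R", "AR2", "A3412d_R", "Asfsvv", "A_RUUYR_R"], name="Name")
result = strip_in_tail(names)
print(result.tolist())
# ['ARR', 'AR2', 'A3412d', 'Asfsvv', 'A_RUUYR']
```

Like the slicing version, this treats a short string such as AB_R as lying entirely within its last 5 characters, so its _R is removed too.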

2 Comments

Does the OP want _R removed from element 0? The question says: "I need to delete for example _R but only if it occurs in for example 5 last chars."
This solution works well if all the strings are at least 5 characters long; a string like AB_R would still be changed, though.
1

IIUC

import re

df.Name.apply(lambda x: re.sub(r'(?<=\w{5})_R', '', x) if re.search(r'\w{5}_R', x) else x)

Output

0      ARR_R
1        AR2
2     A3412d
3     Asfsvv
4    A_RUUYR
Name: Name, dtype: object


1

If you want to remove _R if and only if it occurs after the first 5 characters, use:

df['Name'].str.replace('(?<=.{5})(_R)','', regex=True)

Output:

0      ARR_R
1        AR2
2     A3412d
3     Asfsvv
4    A_RUUYR
Name: Name, dtype: object
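Note that "after the first 5 characters" and "within the last 5 characters" are different conditions, even though they agree on the sample data. A small sketch with plain `re` shows where they diverge; the strings AB_R and ABCDEF_RXXXXXXXX are made up for illustration, and the end-anchored pattern is one assumed reading of the question, not something the asker confirmed.

```python
import re

# Lookbehind from the answer: "_R" preceded by at least 5 characters.
AFTER_FIRST_5 = re.compile(r"(?<=.{5})_R")
# Assumed end-anchored variant: "_R" followed by at most 3 characters,
# i.e. lying entirely within the last 5 characters.
WITHIN_LAST_5 = re.compile(r"_R(?=.{0,3}$)")

samples = ["AB_R", "ABCDEF_RXXXXXXXX", "A_RUUYR_R"]
results = {
    s: (AFTER_FIRST_5.sub("", s), WITHIN_LAST_5.sub("", s))
    for s in samples
}
for s, (by_start, by_end) in results.items():
    print(f"{s!r}: after-first-5 -> {by_start!r}, within-last-5 -> {by_end!r}")
```

The two patterns disagree on the first two strings and agree on A_RUUYR_R, so which one fits depends on how short names and long suffixes should be treated.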

