3

I have a string column in a dataframe and I'd like to insert a # to the begging of my pattern.

For example: My pattern is the letters 'pr' followed by any amount of numbers. If in my column there is a value 'problem in pr123', I would change it to 'problem in #pr123'.

I'm trying a bunch of code snippets but nothing is working for me.

Tried to change the solution to replace for 'pr#123' but this didn't work either.

df['desc_clean'] = df['desc_clean'].str.replace(r'([p][r])(\d+)', r'\1#\2', regex=True)

What's the best way I can replace all values in this column when I find this pattern?

2
  • You need df['desc_clean'] = df['desc_clean'].str.replace(r'(pr)(\d+)', r'\1#\2') if you need pr#123. To get #pr123, you need df['desc_clean'].str.replace(r'pr\d+', r'#\g<0>'). Add a word boundary, \b, in front of pr to match pr as a whole word. Commented Aug 25, 2020 at 19:46
  • This worked great! Thank you so much for your help @WiktorStribiżew Commented Aug 25, 2020 at 19:54

1 Answer 1

2

If you need pr#123 you can use

df['desc_clean'] = df['desc_clean'].str.replace(r'(pr)(\d+)', r'\1#\2')

To get #pr123, you can use

df['desc_clean'].str.replace(r'pr\d+', r'#\g<0>')

To match pr as a whole word, you can add a word boundary, \b, in front of pr:

df['desc_clean'].str.replace(r'\bpr\d+', r'#\g<0>')

See the regex demo.

Sign up to request clarification or add additional context in comments.

1 Comment

This works great and solved my problem. Thanks for the help Wiktor!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.