
Basically I have a dataframe that could look like this:

ID NAME                      PAINT
0  some_name:target          blue
1  some_other_name           pink
2  other_name: other_target  yellow
3  other_name                black

I only want to replace the values that match a certain regex, by applying a function to them:

def f(x):
  name, target = x.split(":")
  return "[" + target + "]" + " " + name

The desired result would be:

ID NAME                        PAINT
0  [target] some_name          blue
1  some_other_name             pink
2  [other_target] other_name   yellow
3  other_name                  black

I imagine it would look something like this, but anything that works is fine:

df.replace(to_replace=strings_found_by_regex, value=f(strings_found_by_regex))

This could probably be done by iterating over the rows, checking whether each cell matches the regex, and then applying f(x), but that looks rather ugly and I wondered whether there is a better way.
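For reference, the row-by-row idea can be written without an explicit loop by building a boolean mask and applying the function only to the matching rows. This is a minimal sketch, not a definitive solution: the pattern `^[^:]+:[^:]+$` is an assumed stand-in for "a certain regex", and `f` here is a variant of the function above with `.strip()` added so that the output matches the desired table.

```python
import pandas as pd

df = pd.DataFrame({
    "NAME": ["some_name:target", "some_other_name",
             "other_name: other_target", "other_name"],
    "PAINT": ["blue", "pink", "yellow", "black"],
})

def f(x):
    # Variant of the question's f; .strip() is an added assumption so that
    # "other_name: other_target" does not keep the space after the colon.
    name, target = x.split(":")
    return "[" + target.strip() + "] " + name.strip()

# Assumed pattern: exactly one colon with text on both sides.
mask = df["NAME"].str.contains(r"^[^:]+:[^:]+$", regex=True)

# Apply f only where the mask is True; the other rows are left untouched.
df.loc[mask, "NAME"] = df.loc[mask, "NAME"].map(f)

print(df)
```

The mask keeps `f` from ever seeing a value without a colon, so the tuple unpacking inside `f` cannot fail on the non-matching rows.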

1 Answer

Try this, using Series.str.replace with a regex and backreferences.

A step-by-step explanation of the regex is available at regex101.com.

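# Raw strings keep \s, \1 and \2 intact for the regex engine.
# regex=True is required on pandas >= 2.0, where the default is regex=False.
df.NAME.str.replace(r"(.+)\s*:\s*(.+)", r"[\2] \1", regex=True)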

0           [target] some_name
1              some_other_name
2    [other_target] other_name
3                   other_name
Name: NAME, dtype: object
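Since the question asks about applying a function, it may be worth noting that Series.str.replace also accepts a callable as the replacement when regex=True; the callable receives the match object. The sketch below is one possible way to use that. The helper name `swap` and the whitespace-tolerant pattern are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

names = pd.Series(
    ["some_name:target", "some_other_name",
     "other_name: other_target", "other_name"],
    name="NAME",
)

def swap(match):
    # match is a regular re.Match object: group(1) is the name,
    # group(2) is the target, both captured without surrounding spaces.
    return "[" + match.group(2) + "] " + match.group(1)

# Hypothetical pattern: lazy groups plus \s* around them so that
# whitespace next to the colon is not captured.
pattern = r"^\s*([^:]+?)\s*:\s*(.+?)\s*$"

result = names.str.replace(pattern, swap, regex=True)
print(result)
```

Values that do not match the pattern are returned unchanged, so no separate filtering step is needed.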

4 Comments

Shouldn't \2 be stripped? The desired value for NAME at row index 2 is [other_target] other_name, while this answer produces [ other_target] other_name (i.e. an extra space in [ other_target]).
Care to explain the raw string part? Thanks!
@MatejNovosad, I updated the regex link to include the substitution, and here is a link to a thread that discusses backreferences.
@MatejNovosad Here is another interesting link that explains what "\1" does. Hope it clarifies your queries.
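On the raw string question, a small standard-library sketch may help. In a normal Python string literal, "\1" is an octal escape for the character with code 1, so the regex engine never sees a backslash; the r prefix keeps the backslash, and the engine then reads \1 as "the text captured by group 1". The sample string below is taken from the question; everything else is illustrative.

```python
import re

# Without the r prefix, "\1" is a single control character (code point 1).
assert "\1" == "\x01"
assert len("\1") == 1

# With the r prefix, the backslash survives: two characters, '\' and '1'.
assert r"\1" == "\\1"
assert len(r"\1") == 2

# In a replacement string, \1 and \2 refer to the captured groups.
swapped = re.sub(r"(\w+):(\w+)", r"[\2] \1", "some_name:target")
print(swapped)  # [target] some_name
```

The same reasoning applies to the pattern itself: writing it as a raw string avoids relying on Python passing unknown escapes such as \s through unchanged, which newer Python versions warn about.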
