Python pandas remove part from string after substring

Question

I'm trying to remove the unwanted text that comes after the "Model" value in the "Name" column.

import pandas as pd
df = pd.DataFrame([['ABC-12(s)', 'Some text ABC-12(s) wrong text'], ['ABC-45', 'Other text ABC-45 garbage text'], ['XYZ-LL', 'Another text XYZ-LL unneeded text']], columns=['Model', 'Name'])

index	Model	Name
0	ABC-12(s)	Some text ABC-12(s) wrong text
1	ABC-45	Other text ABC-45 garbage text
2	XYZ-LL	Another text XYZ-LL unneeded text

Expected result:

index	Model	Name
0	ABC-12(s)	Some text ABC-12(s)
1	ABC-45	Other text ABC-45
2	XYZ-LL	Another text XYZ-LL

Have tried:

df["name"] = df["name"].str.partition(df["model"].to_string(), expand=False)

But that returns the original string unchanged and without an error, as if it could not find the delimiter within the "Name" cell. It seems I'm missing something very simple.

Just out of curiosity, does anyone know why the "Have tried" attempt did not work?
– Lauris Kārkliņš, Commented Aug 10, 2021 at 7:39
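
A likely explanation, sketched under the assumption that the columns are spelled "Model" and "Name" as in the DataFrame above: Series.to_string() renders the whole column, index labels included, as one multi-line string. str.partition then uses that single string as the separator for every row, and since it occurs in none of the names, nothing is split off. The variable names sep, found_anywhere and parts are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["ABC-12(s)", "Some text ABC-12(s) wrong text"],
        ["ABC-45", "Other text ABC-45 garbage text"],
        ["XYZ-LL", "Another text XYZ-LL unneeded text"],
    ],
    columns=["Model", "Name"],
)

# to_string() turns the entire column (with its index labels) into ONE string.
sep = df["Model"].to_string()
print(repr(sep))

# That one string is the separator for every row; it is a literal search,
# so check literally whether any name contains it.
found_anywhere = df["Name"].str.contains(sep, regex=False).any()
print(found_anywhere)

# With no separator found, each row comes back as (original name, "", "").
parts = df["Name"].str.partition(sep, expand=False)
print(parts.iloc[0])
```

In other words, the separator has to be chosen per row, which is what both answers below do.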

Andrej Kesely · Accepted Answer · 2021-08-09 20:04:43Z

3

Another solution, using re:

import re

df["Name"] = df.apply(
    lambda x: re.split(r"(?<=" + re.escape(x["Model"]) + r")\s*", x["Name"])[0],
    axis=1,
)
print(df)

Prints:

       Model                 Name
0  ABC-12(s)  Some text ABC-12(s)
1     ABC-45    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL
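
A small sketch of why re.escape matters here, using the first row's values. It uses re.search instead of the re.split call above purely to keep the illustration short; the behaviour of the unescaped pattern is the same in both.

```python
import re

model = "ABC-12(s)"
name = "Some text ABC-12(s) wrong text"

# Unescaped, "(s)" is read as a capture group around a bare "s", so the
# pattern looks for the text "ABC-12s", which the name does not contain.
unescaped = re.search(model, name)
print(unescaped)

# Escaped, the parentheses are matched as literal characters.
escaped = re.search(re.escape(model), name)
trimmed = name[: escaped.end()]
print(trimmed)
```

Any model value containing characters such as ( ) . + * would need the same treatment.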

1 Comment

Lauris Kārkliņš Over a year ago

Thank you, this works! I preferred the solution without an additional module, as regex is well known for not being beginner-friendly.

ALollz · Accepted Answer · 2021-08-09 20:04:07Z

2

You can partition in a list comprehension and then join the first two parts back.

df['name_mod'] = [''.join(name.partition(model)[:-1])
                  for name, model in zip(df['Name'], df['Model'])]

       Model                               Name             name_mod
0  ABC-12(s)     Some text ABC-12(s) wrong text  Some text ABC-12(s)
1     ABC-45     Other text ABC-45 garbage text    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL unneeded text  Another text XYZ-LL
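
For readers new to str.partition, here is a minimal sketch of what it returns for one row, and what happens when the model is not found. The sample string "No model here" is made up for illustration.

```python
name = "Some text ABC-12(s) wrong text"

# partition splits at the FIRST occurrence of the separator and returns
# a 3-tuple: (text before, the separator itself, text after).
head, sep, tail = name.partition("ABC-12(s)")

# Joining everything except the last element keeps the text up to and
# including the model, which is what [:-1] does in the comprehension.
trimmed = head + sep
print(trimmed)

# If the separator is absent, partition returns (whole string, "", ""),
# so the name passes through unchanged rather than raising an error.
missing = "".join("No model here".partition("ABC-45")[:-1])
print(missing)
```

Because the match is a plain substring search, characters like the parentheses in "ABC-12(s)" need no escaping.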

3 Comments

user15512272 Over a year ago

I wasn't familiar with partition. Is this the right one? docs.python.org/3/library/stdtypes.html#str.partition

ALollz Over a year ago

@merced yes that’s it

Lauris Kārkliņš Over a year ago

Thank you, this works. Note, however, that the XXX in df['XXX'] is case-sensitive, so if it doesn't work right away, check the case.
