Pandas: Replace Function syntax

Question

I have a huge DataFrame with a column which has a list of names. The names have numbers and brackets attached to them. I'm trying to strip them off from the names. I found that the method which would work for this is:

df.Name = df.Name.str.replace(r'[\(\)\d]+', '', regex=True)

Could someone please help me to understand the syntax inside the replace function?

(r'[\(\)\d]+', '')

It is a regex that replaces every (, ), and digit (0 to 9) with the empty string, so it removes these characters.
– willeM_ Van Onsem, Commented Apr 1, 2017 at 21:09

Graham · Accepted Answer · 2017-10-07 15:50:58Z

5

Could someone please help me to understand the syntax inside the replace function?

What you see is a regular expression. Regular expressions have a special syntax to specify patterns.

In this regex, [...] is a character group (more commonly called a character class), which matches any one of the characters listed inside it. Here the group contains \( (the opening bracket), \) (the closing bracket), and \d (any digit).

The + at the end means "one or more", so the pattern matches any sequence of one or more characters from the character group. A string like '142(2' therefore matches the regex.
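As a rough illustration of the character group and the +, the pattern can be tried with Python's built-in re module. This is only a sketch: the sample string 'John (12) Smith3' is made up for the example and does not come from the question.

```python
import re

# The pattern from the question: a character group followed by "+".
pattern = re.compile(r'[\(\)\d]+')

# Every character of '142(2' is a digit or a bracket, so the whole string matches.
assert pattern.fullmatch('142(2') is not None

# Letters are not in the group, so a plain name contains no match at all.
assert pattern.search('John') is None

# In a longer string, each unbroken run of group characters is one match.
sample = 'John (12) Smith3'  # hypothetical sample value
runs = pattern.findall(sample)
print(runs)  # ['(12)', '3']
```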

Every substring that matches the pattern is replaced by the empty string, which removes it.
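The removal can be sketched on a small DataFrame. The names below are invented sample data, since the real column is not shown in the question. The regex=True argument is passed explicitly because, since pandas 2.0, str.replace treats the pattern as a literal string by default. The final strip() is an extra step, not part of the original one-liner, added here because removing "(1)" can leave a stray space behind.

```python
import pandas as pd

# Hypothetical sample data; the real column is not shown in the question.
df = pd.DataFrame({'Name': ['Alice (1)', 'Bob(22)', '3Carol']})

# regex=True makes pandas interpret the first argument as a regular expression.
# With regex=False the pattern would be searched for literally and nothing
# in this column would change.
df['Name'] = df['Name'].str.replace(r'[\(\)\d]+', '', regex=True)

# Removing "(1)" from "Alice (1)" leaves a trailing space, so strip whitespace.
df['Name'] = df['Name'].str.strip()

print(df['Name'].tolist())  # ['Alice', 'Bob', 'Carol']
```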

A useful tool for building, testing, and fixing regexes is regex101. If you follow the link, you can enter a regex and see which strings match the pattern. A panel on the right-hand side explains in natural language what the pattern does.

Furthermore there is this regex visualizer that shows the structure of the regex:

A substring "matches" if you can follow the railroad tracks until you reach your destination. Here, we can keep cycling through the gray box as long as the next character is an opening bracket, a closing bracket, or a digit, until we decide to head for the finish.
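The "cycling" can be made visible by counting substitutions with re.subn, which returns both the new string and the number of replacements made. In this small sketch (the sample string is made up), the result is the same with or without the +, but with the + a whole run of brackets and digits is consumed as one match.

```python
import re

text = 'Ann(12)'  # hypothetical sample value

# With "+", the whole run "(12)" is consumed as a single match.
with_plus, n_with = re.subn(r'[\(\)\d]+', '', text)

# Without "+", each of the four characters is matched and removed separately.
without_plus, n_without = re.subn(r'[\(\)\d]', '', text)

print(with_plus, n_with)        # Ann 1
print(without_plus, n_without)  # Ann 4
```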

edited Oct 7, 2017 at 15:50 by Graham

answered Apr 1, 2017 at 21:11 by willeM_ Van Onsem

1 Comment

Kishaan92 Over a year ago

Thanks a lot for such a detailed explanation! I'll go through the links you provided! :)
