I have a huge DataFrame with a column which has a list of names. The names have numbers and brackets attached to them. I'm trying to strip them off from the names. I found that the method which would work for this is:

df.Name = df.Name.str.replace(r'[\(\)\d]+', '')

Could someone please help me to understand the syntax inside the replace function?

(r'[\(\)\d]+', '')
It is a regex that replaces every (, ) and digit (0 to 9) with the empty string, so it removes these characters. Commented Apr 1, 2017 at 21:09

1 Answer

Could someone please help me to understand the syntax inside the replace function?

What you see is a regular expression. Regular expressions have a special syntax to specify patterns.

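In this regex, [...] denotes a character class (also called a character set): it matches any single character listed inside the brackets. The class here contains \( (an opening parenthesis), \) (a closing parenthesis) and \d (any digit). The backslashes before the parentheses are optional inside a character class, since ( and ) have no special meaning there.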

The + at the end means "one or more", so the pattern matches any sequence of one or more characters drawn from the character class. A string like '142(2' therefore matches the regex in full.
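To see the character class and the + quantifier at work outside pandas, here is a minimal sketch using Python's standard re module. The sample strings 'Bob' and 'Alice (12) Bob3' are made up for illustration and do not come from the question.

```python
import re

# Same pattern as in the question: one or more of "(", ")" or a digit.
pattern = re.compile(r'[\(\)\d]+')

# fullmatch succeeds only if the entire string is made of those characters.
print(pattern.fullmatch('142(2') is not None)  # True
print(pattern.fullmatch('Bob') is not None)    # False

# findall returns every maximal run of matching characters in a longer string.
runs = pattern.findall('Alice (12) Bob3')
print(runs)  # ['(12)', '3']
```

Because of the +, the run '(12)' is found as a single match rather than as four separate one-character matches.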

Every substring that matches this pattern is replaced by the empty string, which removes it from the name.
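The replacement step can be sketched with re.sub on a plain Python list, which performs the same substitution that the pandas call applies to each element of the column. The names below are hypothetical sample data.

```python
import re

# Hypothetical sample names with digits and parentheses attached.
names = ['Alice (12)', 'Bob3', '(7)Carol']

# Replace every run of parentheses/digits with the empty string.
cleaned = [re.sub(r'[\(\)\d]+', '', name) for name in names]
print(cleaned)  # ['Alice ', 'Bob', 'Carol']
```

Note that the space before '(12)' is not part of the character class, so 'Alice ' keeps a trailing space; a follow-up .strip() would remove it. Also, in pandas 2.0 and later, Series.str.replace treats the pattern as a literal string by default, so the call from the question needs regex=True there: df.Name.str.replace(r'[\(\)\d]+', '', regex=True).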

A useful tool to build, test, and fix regexes is regex101. If you follow the link, you can enter a regex and see which strings match the pattern. On the right-hand side there is a panel that explains in natural language what the pattern does.

Furthermore, there is this regex visualizer, which shows the structure of the regex as a railroad diagram:

visualization of the regex

A substring "matches" if you can follow the railroad tracks until you reach the destination. Here we can keep cycling through the gray box as long as the next character is an opening parenthesis, a closing parenthesis or a digit, until we decide to head for the finish.

1 Comment

Thanks a lot for such a detailed explanation! I'll go through the links you provided! :)
