5

I'm trying to extract the number before the character "M" in a series of strings. The strings may look like:

"107S33M15H"
"33M100S"
"12M100H33M"

So basically there are sets of numbers separated by different characters, and "M" may show up more than once. For the examples here, I would like my code to return:

33
33
12,33 # it doesn't matter which delimiter is used here

One way I could think of is to split the string by "M", and find items that are pure numbers, but I suspect there are better ways to do it. Thanks a lot for the help.

2 Answers

19

You can use the simple regex (\d+)M with re.findall. It matches one or more digits followed by M and captures only the digits in a capture group, so re.findall returns just the numbers.

See IDEONE demo:

import re
s = "107S33M15H\n33M100S\n12M100H33M"
print(re.findall(r"(\d+)M", s))  # ['33', '33', '12', '33']

And here is a regex demo
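
The demo above joins the three strings with newlines, so the matches come back as one flat list. If one result per string is wanted, as in the question, a minimal sketch could look like the following. The helper name numbers_before_m and the list name strings are hypothetical, chosen only for illustration.

```python
import re

def numbers_before_m(s):
    # Hypothetical helper: return every run of digits that is
    # immediately followed by "M", in order of appearance.
    return re.findall(r"(\d+)M", s)

strings = ["107S33M15H", "33M100S", "12M100H33M"]

for s in strings:
    # Join the matches for one string with commas, e.g. "12,33"
    print(",".join(numbers_before_m(s)))
```

Strings that contain no digits before an "M" simply produce an empty list, which prints as an empty line.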

2

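You can use rpartition for this. It splits the string at the last "M" and returns a 3-tuple (before, separator, after), so index 0 is everything before that last "M", not just the digits.

s = '107S33M15H'
prefix = s.rpartition('M')[0]  # '107S33': everything before the last 'M'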
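
To get from that prefix to the number itself, the trailing digits still have to be isolated. The following is a rough sketch of one way to do that, not part of the original answer. The function name number_before_last_m is hypothetical, and it only handles the last "M" in the string, so it will not return both 12 and 33 for "12M100H33M".

```python
def number_before_last_m(s):
    # rpartition returns (before, sep, after); sep is '' when "M" is absent
    before, sep, _ = s.rpartition("M")
    if not sep:
        return ""
    # Drop the trailing digits to find where they start,
    # then slice them back out of the prefix.
    stripped = before.rstrip("0123456789")
    return before[len(stripped):]

print(number_before_last_m("107S33M15H"))
print(number_before_last_m("12M100H33M"))
```

For strings where every number before an "M" is needed, the re.findall approach is the simpler fit.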

1 Comment

I used this to add a new column to my data frame: df['new_col'] = df.old_col.str.rpartition('b')[2], where 'b' is the letter to be removed and 2 is the position in the rpartition result of the characters you want in the new column. Thanks for the code, it was very useful.
