Regular expression code is not working (Python)

Question

Assume I have a word AB1234XZY or even 1AB1234XYZ.

I want to extract ONLY 'AB1234' or '1AB1234' (i.e. everything up to the letters at the end).

I have used the following code to extract that but it's not working:

base = re.match(r"^(\D+)(\d+)", word).group(0)

When I print base, it's not working for the second case. Any ideas why?

Do you want to match till 123 in both cases? What if you have different numbers, e.g. AB123452A?
– Rohit Jain, Commented Oct 17, 2012 at 15:45
I want to extract AB1234, so basically everything before the letters at the end. I'm pretty sure the code I have there worked before...
– user1328021, Commented Oct 17, 2012 at 15:49
@user1328021 Why don't you post the input string to be searched so we can better understand and help? Also, if any of these answers has helped, you can mark it as accepted, or, if you have solved the problem yourself, you can post your solution here as an answer so others can learn.
– Inbar Rose, Commented Oct 18, 2012 at 15:45
My input string to be searched is what I wrote, 1AB1234XYZ, and I want to extract 1AB1234 ... everything before the suffix of letters at the end. I'm trying the solutions listed below and will mark the one that works as the answer. Thanks!
– user1328021, Commented Oct 18, 2012 at 15:47

Justin Morgan · Accepted Answer · 2012-10-18 16:11:46Z

1

Your regex doesn't work for the second case because that string starts with a digit; the \D+ at the beginning of your pattern matches only characters that are NOT digits, so the match fails at the very first character.
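To make the failure concrete, here is a minimal sketch that runs the question's pattern against both sample words. The variable names are only for illustration.

```python
import re

# Pattern from the question: one or more non-digits, then one or more digits.
pattern = re.compile(r"^(\D+)(\d+)")

first = pattern.match("AB1234XZY")    # starts with letters, so \D+ can match
second = pattern.match("1AB1234XYZ")  # starts with a digit, so \D+ fails at once

print(first.group(0))  # AB1234
print(second)          # None

# Calling .group(0) on a failed match raises AttributeError,
# so check for None before using the result.
base = second.group(0) if second else None
```

The second call returns None rather than a match object, which is why chaining .group(0) directly onto re.match breaks for that input.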

You should be able to use something quite simple for this--simpler, in fact, than anything else I see here.

'.*\d'

That's it! Because .* is greedy, this should match everything up to and including the last digit in your string, and ignore everything after that.

Here's the pattern working online, so you can see for yourself.
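If you would rather check it locally, something like the following should do. It is only a sketch: the helper name prefix_through_last_digit is made up here, and it assumes the word contains no newlines.

```python
import re

def prefix_through_last_digit(word):
    """Return everything up to and including the last digit, or None."""
    # Greedy .* grabs as much as possible, then backs off to the last digit.
    m = re.match(r".*\d", word)
    return m.group(0) if m else None

print(prefix_through_last_digit("AB1234XZY"))   # AB1234
print(prefix_through_last_digit("1AB1234XYZ"))  # 1AB1234
print(prefix_through_last_digit("XYZ"))         # None (no digit at all)
```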

edited Oct 18, 2012 at 16:11

answered Oct 18, 2012 at 15:59

1 Comment

user1328021 Over a year ago

Thank you!!!! I knew there had to be an easier way. And thanks for introducing me to RegexPlanet. That site is brilliant.

pogo · Answer · 2012-10-17 15:51:48Z

1

(.+?\d+)\w+ would give you what you want, in the first capturing group.

Or even something like this:

^(.+?)[a-zA-Z]+$
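A quick way to try both patterns on the two words from the question (a sketch only; the list names are arbitrary):

```python
import re

words = ["AB1234XZY", "1AB1234XYZ"]

# First suggestion: lazy prefix ending in digits, followed by word characters.
by_digits = [re.match(r"(.+?\d+)\w+", w).group(1) for w in words]

# Second suggestion: lazy prefix, then a run of letters anchored at the end.
by_suffix = [re.match(r"^(.+?)[a-zA-Z]+$", w).group(1) for w in words]

print(by_digits)  # ['AB1234', '1AB1234']
print(by_suffix)  # ['AB1234', '1AB1234']
```

Note that in both cases the wanted text is in group(1), not group(0), since group(0) also includes the trailing letters.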

answered Oct 17, 2012 at 15:51

1 Comment

Justin Morgan Over a year ago

I would make the initial .+ greedy if I were you, since the lazy version will not work for 12AB1234XYZ (two or more digits at the beginning). However, it should work for his samples.
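The difference is easy to see on that input. A small sketch, with lazy and greedy as illustrative names:

```python
import re

word = "12AB1234XYZ"

# Lazy .+? stops as early as it can: "1", then \d+ takes "2".
lazy = re.match(r"(.+?\d+)\w+", word).group(1)

# Greedy .+ takes as much as it can and backs off only far enough
# for \d+ and \w+ to still match.
greedy = re.match(r"(.+\d+)\w+", word).group(1)

print(lazy)    # 12
print(greedy)  # 12AB1234
```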

Inbar Rose · Answer · 2012-10-17 16:13:27Z

0

re.match only matches at the beginning of the string, while re.search looks for a match anywhere in the string. Both return the first match. .group(0) is everything included in the match; if you have capturing groups, then .group(1) is the first group, and so on. Unlike the usual convention where 0 is the first index, here 0 is a special case meaning the whole match.

In your case, depending on what you really need to capture, re.search may be the better choice. And instead of using two groups, you can use (\D+\d+). Keep in mind that it will capture the first run of non-digits followed by digits. That might be sufficient for you, but you might want to be more specific.
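Here is a small sketch of those points on the second sample word; the variable names are just for illustration.

```python
import re

word = "1AB1234XYZ"

# re.match only tries position 0, where "1" is not a non-digit.
print(re.match(r"(\D+\d+)", word))  # None

# re.search skips ahead and finds the first non-digits + digits run.
found = re.search(r"(\D+\d+)", word)
print(found.group(0))  # AB1234
print(found.group(1))  # AB1234

# With two groups, group(0) is still the whole match.
parts = re.search(r"(\D+)(\d+)", word)
print(parts.group(0), parts.group(1), parts.group(2))  # AB1234 AB 1234
```

Notice that re.search drops the leading 1 here, which is the "you might want to be more specific" part.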

After reading your comment, "everything before the letters at the end", this regex is what you need:

regex = re.compile(r'(.+?)[A-Za-z]+$')
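How much the letter class at the end consumes matters here. As a rough sketch (the names one_letter and letter_run are made up for illustration), compare a greedy group followed by a single letter with a lazy group followed by a run of letters anchored at the end of the string:

```python
import re

# Greedy group, then exactly one letter: only the last letter is cut off.
one_letter = re.compile(r"(.+)[A-Za-z]")

# Lazy group, then one or more letters up to the end of the string.
letter_run = re.compile(r"(.+?)[A-Za-z]+$")

word = "1AB1234XYZ"
print(one_letter.match(word).group(1))  # 1AB1234XY
print(letter_run.match(word).group(1))  # 1AB1234
```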

answered Oct 17, 2012 at 16:13

1 Comment

Justin Morgan Over a year ago

re.match vs re.search shouldn't matter, since he's using the ^ anchor. That forces the match to start at the beginning of the string.
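A short check of that claim, assuming the re.MULTILINE flag is not set (with it, ^ can also match after a newline, so the two could differ on multi-line input):

```python
import re

pattern = r"^(\D+)(\d+)"  # pattern from the question

results = []
for word in ("AB1234XZY", "1AB1234XYZ"):
    m = re.match(pattern, word)
    s = re.search(pattern, word)
    # Record what each function found, or None when there was no match.
    results.append((word, m.group(0) if m else None, s.group(0) if s else None))

for row in results:
    print(row)
# ('AB1234XZY', 'AB1234', 'AB1234')
# ('1AB1234XYZ', None, None)
```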
