Given a text, I need to check, for each character, whether it has exactly 3 capital letters on both sides; if it does, I add it to a string of such characters, which is returned.

I wrote the following: m = re.match("[A-Z]{3}.[A-Z]{3}", text) (let's say text="AAAbAAAcAAA")

I expected to get two groups in the match object: "AAAbAAA" and "AAAcAAA"

Now, when I invoke m.group(0) I get "AAAbAAA", which is right. Yet when I invoke m.group(1), I find that there is no such group, meaning "AAAcAAA" wasn't a match. Why?

Also, when invoking m.groups(), I get an empty tuple although I should get a tuple of the matches, meaning that in my case I should have gotten a tuple with "AAAbAAA". Why doesn't that work?

  • Are you doing the python challenge? :-) Commented May 25, 2012 at 17:24
  • Nope, which challenge? can you put a link? Commented May 25, 2012 at 17:30
  • It's fun ... but it can get frustrating ... :) pythonchallenge.com Commented May 25, 2012 at 18:05
  • Nice, I'm on level 3, I guess my teacher copied the questions, lol! Can you answer why '[A-Z]{3}([a-z])(?=[A-Z]{3})' is the regex I need? How do I decide where to use the (?=...) type of parentheses? Commented May 25, 2012 at 18:35

2 Answers

You don't have any groups in your pattern. To capture something in a group, you have to surround it with parentheses:

([A-Z]{3}).[A-Z]{3}

The exception is m.group(0), which will always contain the entire match.
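To make the difference concrete, here is a small sketch using the text from the question. The variable names are only illustrative; the point is that a pattern without parentheses yields an empty groups() tuple, while group(0) is always available.

```python
import re

text = "AAAbAAAcAAA"

# No parentheses in the pattern, so there are no capture groups.
m = re.match(r"[A-Z]{3}.[A-Z]{3}", text)
whole = m.group(0)       # the entire match
no_groups = m.groups()   # empty tuple: nothing was captured

# Parentheses around the first triplet create group 1.
m2 = re.match(r"([A-Z]{3}).[A-Z]{3}", text)
first_group = m2.group(1)
all_groups = m2.groups()

print(whole, no_groups, first_group, all_groups)
```

Note that re.match only ever returns the first match anchored at the start of the string, so "AAAcAAA" can never show up as a second group here.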

Looking over your question, it sounds like you aren't actually looking for capture groups, but rather overlapping matches. In regex, a group means a smaller part of the match that is set aside for later use. For example, if you're trying to match phone numbers with something like

([0-9]{3})-([0-9]{3}-[0-9]{4})

then the area code would be in group(1), the local part in group(2), and the entire thing would be in group(0).
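As a quick sketch of that phone number pattern in Python (the number itself is made up for illustration):

```python
import re

# Hypothetical phone number, used only to illustrate the groups.
phone = "555-867-5309"

m = re.match(r"([0-9]{3})-([0-9]{3}-[0-9]{4})", phone)
area_code = m.group(1)    # first pair of parentheses
local_part = m.group(2)   # second pair of parentheses
entire = m.group(0)       # the whole match

print(area_code, local_part, entire)
```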

What you want is to find overlapping matches. Here's a Stack Overflow answer that explains how to do overlapping matches in Python regex, and here's my favorite reference for capture groups and regex in general.
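One common way to get overlapping matches with Python's re module is to wrap the whole pattern in a lookahead that contains a capture group. This is a minimal sketch of that idea applied to the question's text, not necessarily the exact approach from the linked answer:

```python
import re

text = "AAAbAAAcAAA"

# The lookahead (?=...) matches at a position without consuming any
# characters, so the next search can start just one character later.
# The inner parentheses capture what the lookahead saw.
overlapping = re.findall(r"(?=([A-Z]{3}.[A-Z]{3}))", text)

# For comparison: without the lookahead, the first match consumes
# "AAAbAAA", leaving only "cAAA", which is too short to match again.
non_overlapping = re.findall(r"[A-Z]{3}.[A-Z]{3}", text)

print(overlapping)
print(non_overlapping)
```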

3 Comments

Ohh, I just love this site, thanks. So each pair of parentheses defines a group? And what about parentheses such as (?=...), meaning with a question mark? And I still don't know why my regex doesn't work.
(?=) is a positive lookahead. It means that the engine will look forward in the string to determine a match without consuming the characters it inspects.
They suggest using finditer; its documentation says: "Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string". It doesn't help me, I even tried it.

You are using match when it looks like you want findall. It won't grab the enclosing capital triplets, but re.findall('[A-Z]{3}([a-z])(?=[A-Z]{3})', search_string) will get you all single lowercase characters surrounded on both sides by 3 caps.
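Here is a short sketch of that findall call on the sample text from the question. The variable names and the comparison pattern without a lookahead are added here only for illustration:

```python
import re

search_string = "AAAbAAAcAAA"

# The trailing triplet is inside a lookahead, so it is checked but not
# consumed. The middle "AAA" can therefore also serve as the leading
# triplet of the next match.
with_lookahead = re.findall(r"[A-Z]{3}([a-z])(?=[A-Z]{3})", search_string)

# Without the lookahead the trailing triplet is consumed, and the
# second lowercase letter no longer has 3 capitals available before it.
without_lookahead = re.findall(r"[A-Z]{3}([a-z])[A-Z]{3}", search_string)

# Build the string of matching characters the question asks for.
result = "".join(with_lookahead)

print(with_lookahead, without_lookahead, result)
```

Because the pattern has exactly one capture group, findall returns just the captured lowercase letters rather than the full matches.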

8 Comments

Thanks, I see it works. Why isn't the left expression [A-Z]{3} surrounded with parentheses? When I surround it with parentheses I get no matches, why?
Not sure why you get no matches when you put it in parens... but it's not in parens because it's not a match group or a look ahead or look behind.
so why is the last one in parens? can you explain all the parens in this regex? it's really important for me to understand.
There are parens surrounding the arguments to findall, then the parens in ([a-z]) are defining that as a capture group, then the ones in (?=[A-Z]{3}) are defining the bounds of the lookahead term.
My problem is with understanding the lookahead. I read this on regular-expressions.info: given (?=regex), the explanation is: Zero-width positive lookahead. Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three, both two and three have to match at the position where the match of one ends. I really don't understand this explanation, or any other explanation on that subject in general. Also, what do "consume" and "expand" mean in that context?