0

Is there a way of retrieving a pattern inside a regular expression in Python? E.g.:

strA = '\ta -- b'

I would like to retrieve 'a' and 'b' into different variables.

4
  • without using the split() method, because the pattern of the string can vary Commented Aug 27, 2014 at 21:35
  • (?<=\\t)(\w+\b).*?(\b\w+\b) (example: regex101.com/r/fW1fV7/1) Commented Aug 27, 2014 at 21:37
  • Please describe how the pattern varies (otherwise a regex may not match all possibilities) Commented Aug 27, 2014 at 21:38
  • What separates patterns in the regex? Will your first variable always follow \t? Will the two always be separated by ` -- `? Will there be multiple patterns within the string, or will there only ever be 2? Commented Aug 27, 2014 at 21:38

2 Answers

1

Sounds like you are talking about saving/capturing groups:

>>> import re
>>> pattern = r"\t(\w+) -- (\w+)"
>>> s = '\ttest1 -- test2'
>>> a, b = re.search(pattern, s).groups()
>>> a
'test1'
>>> b
'test2'
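
Note that re.search returns None when the pattern is not found, so calling .groups() directly would raise an AttributeError on non-matching input. A minimal sketch of a guarded version follows; the helper name extract_pair and the sample strings are only illustrative, not part of the question:

```python
import re

# Same pattern as above: a tab, a word, ' -- ', and another word.
PATTERN = r"\t(\w+) -- (\w+)"

def extract_pair(text):
    """Return the two captured words as a tuple, or None if there is no match."""
    match = re.search(PATTERN, text)
    if match is None:
        return None
    return match.groups()

result = extract_pair("\ttest1 -- test2")   # input starts with a real tab
no_match = extract_pair("test1 - test2")    # no tab and a single dash
```

Here result holds the two captured strings and no_match is None, so the caller can decide what to do when the string does not have the expected shape.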

1 Comment

Should \t be \\t ?
0

You can't retrieve the pattern itself; you can only match text against it and retrieve the content captured by a group.

Following your example:

\ta -- b

If you want to retrieve the content, you can use capturing groups by adding parentheses, like:

\t(a) -- (b)


The regex explanation for this is:

\t                       '\t' (tab)
(                        group and capture to \1:
  a                        'a'
)                        end of \1
 --                      ' -- '
(                        group and capture to \2:
  b                        'b'
)                        end of \2

Then you can grab each group's content by accessing it through its index, like \1 (or $1) for the first group and \2 (or $2) for the second. But you are going to grab the matched content, not the pattern itself.
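
In Python specifically, the indexed groups are read from the match object with group(). This is a small sketch using the literal pattern from the example; the variable names are made up for illustration:

```python
import re

# The sample string from the question: a tab, then 'a -- b'.
match = re.search(r"\t(a) -- (b)", "\ta -- b")

first = match.group(1)   # content of the first capturing group
second = match.group(2)  # content of the second capturing group
whole = match.group(0)   # group 0 is the entire matched text
```

Group 0 always refers to the whole match, while the numbered groups follow the order of the opening parentheses.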

So, if you have:

\t(\w) -- (\d)


You would grab content that matches that pattern:

\t                       '\t' (tab)
(                        group and capture to \1:
  \w                       word characters (a-z, A-Z, 0-9, _)
)                        end of \1
 --                      ' -- '
(                        group and capture to \2:
  \d                       digits (0-9)
)                        end of \2

But you can't retrieve \w or \d itself.

If you want to get the pieces of the pattern itself for this:

\t\w -- \d

You should split the above string on -- and you would get these strings:

"\t\w "
" \d"
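
As a rough Python sketch of that split, the pattern is treated as an ordinary string rather than as a regex; the names pattern_text and parts are only illustrative:

```python
# Raw string so the backslashes stay as literal characters of the pattern text.
pattern_text = r"\t\w -- \d"

# Plain string split on the separator; no regex matching is involved here.
parts = pattern_text.split("--")

left, right = parts
```

The surrounding spaces are kept in each piece, so call strip() on them if only \t\w and \d are wanted.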
