Extract two patterns at once using regex

Question

I have a list of strings, each with the following pattern (a set of words followed by parentheses enclosing comma separated words):

"vw xy zz (X, Y, Z)"

My desired output is:

["vw xy zz", "X", "Y", "Z"]

I know how to extract the text before the parentheses:

import re
pattern = r"(^[^\(]+)"
text = "vw xy zz (X, Y, Z)"
re.findall(pattern, text)
# ['vw xy zz ']

I also know how to extract the text between the parentheses:

pattern = r"\(.*\)"
text = "vw xy zz (X, Y, Z)"
re.findall(pattern, text)
# ['(X, Y, Z)']

But I'm wondering if there is a way to combine the patterns to get the desired output all at once.

re.findall(r'[^(),\s](?:[^(),]*[^(),\s])?', s) returns all items at once, with no need to trim them. It allows any chars but parentheses and commas.
– Wiktor Stribiżew, Commented Feb 18, 2019 at 17:34

Wiktor Stribiżew · Accepted Answer · 2019-02-18 17:48:04Z

3

If the values are not alphanumeric only, and may contain any chars but parentheses and commas, I suggest using a "generic" regex based on negated character classes:

re.findall(r'[^(),\s](?:[^(),]*[^(),\s])?', s)

See the regex demo.

There is no need to strip() the items: every match re.findall returns already starts and ends with a non-whitespace character.

Details

[^(),\s] - a negated character class matching any char but (, ), , and whitespace
(?:[^(),]*[^(),\s])? - 1 or 0 occurrences of:
- [^(),]* - any chars but (, ) and ,
- [^(),\s] - any char but (, ), , and whitespace
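As a rough illustration of the breakdown above, the pattern can be wrapped in a small helper. The name extract_parts and the second sample string are made up for this sketch; only the regex itself comes from the answer.

```python
import re

# Regex from the answer: each item starts and ends with a char that is not
# "(", ")", "," or whitespace; whitespace may only appear in the middle.
ITEM = re.compile(r'[^(),\s](?:[^(),]*[^(),\s])?')

def extract_parts(text):
    """Return every item found before and inside the parentheses."""
    return ITEM.findall(text)

print(extract_parts("vw xy zz (X, Y, Z)"))      # ['vw xy zz', 'X', 'Y', 'Z']
print(extract_parts("a-b c.d (x+1, y_2, z!)"))  # ['a-b c.d', 'x+1', 'y_2', 'z!']
```

Because the first and last character of each match must be non-whitespace, the items come back already trimmed.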

answered Feb 18, 2019 at 17:48

Wiktor Stribiżew


2 Comments

Aziz.G Over a year ago

great answer :)

Aziz.G Over a year ago

best one this will match what you want exactly

Ajax1234 · Accepted Answer · 2019-02-18 17:33:31Z

1

You can use re.findall:

import re
s = "vw xy zz (X, Y, Z)"
result = [i.strip() for i in re.findall(r'[\w\s]+', s)]

Output:

['vw xy zz', 'X', 'Y', 'Z']
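One caveat that may be worth checking, sketched here with a made-up second input: [\w\s]+ only matches word characters and whitespace, so a value containing anything else (a hyphen, a plus sign) is split at that character. The helper name split_words is hypothetical.

```python
import re

def split_words(text):
    # Hypothetical helper wrapping the findall-and-strip approach above.
    return [i.strip() for i in re.findall(r'[\w\s]+', text)]

print(split_words("vw xy zz (X, Y, Z)"))  # ['vw xy zz', 'X', 'Y', 'Z']
print(split_words("a-b (x+1, y)"))        # ['a', 'b', 'x', '1', 'y']
```

For purely alphanumeric values, as in the question, this is not a concern.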

answered Feb 18, 2019 at 17:33

Ajax1234


Aziz.G · Accepted Answer · 2019-02-18 17:45:35Z

1

const regex = /([a-zA-Z]{1,2}\s?){3}|[A-Z]/g

const text = "vw xy zz (X, Y, Z)"
const res = text.match(regex);
console.log(res)

This regex matches ["vw xy zz ", "X", "Y", "Z"]. Note the trailing space in the first item, which still needs trimming.

The pattern:

([a-zA-Z]{1,2}\s?){3}|[A-Z]
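Since the question is about Python, here is a hedged sketch of how the JavaScript pattern above might be ported. It is not part of the original answer. The main thing to watch is that Python's re.findall returns the contents of a capturing group rather than the whole match, so the group likely needs to be made non-capturing with (?:...).

```python
import re

text = "vw xy zz (X, Y, Z)"

# Capturing group: findall returns the group's last capture for each match
# (an empty string where the group did not take part in the match).
captured = re.findall(r'([a-zA-Z]{1,2}\s?){3}|[A-Z]', text)

# Non-capturing group: findall returns each whole match, as String.match does.
whole = re.findall(r'(?:[a-zA-Z]{1,2}\s?){3}|[A-Z]', text)

print(captured)  # ['zz ', '', '', '']
print(whole)     # ['vw xy zz ', 'X', 'Y', 'Z']
```

The pattern is tied to this exact input shape (three words of one or two letters, then single uppercase letters), so it would need adjusting for other data.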

edited Feb 18, 2019 at 17:45

answered Feb 18, 2019 at 17:33

Aziz.G
