Using regex to get the value between two characters (Python 3)

Question

import re

value = "world_wide='test1/one/two', " \
        "stage_test='ALPHA', world_wide='test2/one/two', " \
        "stage_test='GAMMA', world_wide='test3/one/two', " \
        "stage_test='GAMMA', world_wide='test4/one/two', " \
        "stage_test='ALPHA', world_wide='test5/one/two', " \
        "stage_test='GAMMA', world_wide='test6/one/two', " \
        "stage_test='GAMMA"

pattern = r"(world_wide=\'.*\')"

for match in re.findall(pattern, str(value)):
    print ("\n", match)

I'm trying to extract specific values from a string. The code above prints the following:

 world_wide='test1/one/two', stage_test='ALPHA', world_wide='test2/one/two', stage_test='GAMMA', world_wide='test3/one/two', stage_test='GAMMA', world_wide='test4/one/two', stage_test='ALPHA', world_wide='test5/one/two', stage_test='GAMMA', world_wide='test6/one/two', stage_test='

What I want instead: wherever 'world_wide=' is found, return the value between the two single quotes that follow it, excluding the trailing '/one/two'.

Desired output:

>>>test1
test2
test3
test4
.........

Hello, welcome to Stack Overflow! Could you reformat your question so that the input string value is more readable? It seems you have a quote-closing problem in your post.
– syltruong, Commented Jun 4, 2018 at 4:25

Jan · Accepted Answer · 2018-06-04 04:50:11Z

1

You could use the following expression:

world_wide='([^/]+)
# world_wide='
# capture anything not a / into group 1

In Python this is:

import re

value = "world_wide='test1/one/two', " \
        "stage_test='ALPHA', world_wide='test2/one/two', " \
        "stage_test='GAMMA', world_wide='test3/one/two', " \
        "stage_test='GAMMA', world_wide='test4/one/two', " \
        "stage_test='ALPHA', world_wide='test5/one/two', " \
        "stage_test='GAMMA', world_wide='test6/one/two', " \
        "stage_test='GAMMA"

rx = re.compile(r'''world_wide='([^/]+)''')
parts = rx.findall(value)
print(parts)

This yields a list containing

['test1', 'test2', 'test3', 'test4', 'test5', 'test6']

See a demo on regex101.com.
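
For comparison, here is a minimal sketch of why the pattern in the question returned the whole string: `.*` is greedy and runs to the last single quote, while `[^/]+` stops at the first slash. The shortened `sample` string is an assumption made for illustration, not the asker's full input.

```python
import re

# Shortened sample input (an assumption for illustration).
sample = "world_wide='test1/one/two', stage_test='ALPHA', world_wide='test2/one/two'"

# Greedy: .* extends to the LAST single quote, so there is one big match.
greedy = re.findall(r"world_wide='.*'", sample)

# Negated class: [^/]+ stops at the first slash after each world_wide='.
names = re.findall(r"world_wide='([^/]+)", sample)

print(len(greedy))  # 1
print(names)        # ['test1', 'test2']
```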


jchung · Accepted Answer · 2018-06-04 04:40:41Z

0

The regex you're looking for is probably as simple as pattern = r"world_wide='(.*)\/one". Here's a demo: https://regexr.com/3qffn

>>> import re
>>> value = ("world_wide='test1/one/two',stage_test='ALPHA')>,")
>>> pattern = r"world_wide='(.*)\/one"
>>> re.findall(pattern, value)
['test1']

What's making your question particularly hard to answer is that I think you have some typos in your example. On line 6 where you have stage_test='GAMMA')>]", I think you actually mean just stage_test='GAMMA')>,. Is that right?
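
One caveat, sketched here under the assumption that the input contains several world_wide entries as in the question: `.*` is greedy, so on a longer string the pattern captures everything up to the last `/one`. A lazy quantifier (`.*?`) is one possible way to keep each capture short. The `sample` string is a shortened stand-in, not the asker's exact input.

```python
import re

# Shortened stand-in for the multi-entry input (an assumption).
sample = "world_wide='test1/one/two', stage_test='ALPHA', world_wide='test2/one/two'"

# Greedy .* backtracks only as far as the LAST "/one", giving one long capture.
greedy_hits = re.findall(r"world_wide='(.*)/one", sample)

# Lazy .*? stops at the FIRST "/one" after each world_wide='.
lazy_hits = re.findall(r"world_wide='(.*?)/one", sample)

print(len(greedy_hits))  # 1
print(lazy_hits)         # ['test1', 'test2']
```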


Giorgi Jambazishvili · Accepted Answer · 2018-06-04 04:52:11Z

0

Whenever I have to deal with regex, I use https://regex101.com, which is nice and easy for fast prototyping and for testing multiple cases. For your regex, please see the following link: https://regex101.com/r/TLXDXl/1. My suggested pattern is:

r"world_wide='(?P<name>test\d+)/"

It first matches the prefix 'world_wide=' and the opening quote, then captures a group named 'name', which starts with 'test' followed by one or more digits. A more general approach would be to use '[\w\d]+' for the name, which matches any letter, digit, or underscore (this is equivalent to '\w+', since '\w' already includes digits).
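
A minimal sketch of how the named group could be read back in Python, assuming the group is called `name` as described above. The shortened `sample` string is an assumption for illustration.

```python
import re

# Shortened sample input (an assumption for illustration).
sample = "world_wide='test1/one/two', stage_test='ALPHA', world_wide='test2/one/two'"

# (?P<name>...) defines a named group; m.group("name") retrieves it.
rx = re.compile(r"world_wide='(?P<name>test\d+)/")
found = [m.group("name") for m in rx.finditer(sample)]

print(found)  # ['test1', 'test2']
```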

Hope it helps


1 Comment

jessie Over a year ago

Thanks. Found regex101.com super useful.

Ilya · Accepted Answer · 2018-06-04 04:57:16Z

0

Why not just use split() instead of re?

for item in value.split(','):
    if 'world_wide' in item:
        world_parts = item.split('\'')
        test_parts = world_parts[1].split('/')

        if 'test' in test_parts[0]:
            print(test_parts[0])


1 Comment

jessie Over a year ago

Considering using this option. Thanks!
