0

Using Python 2.6.6, I'm trying to format each element in a list using regex.

Example of elements in an array:

test1;apple;-fgnsldfgsbfdgb
test2;watermelon;-iwerunvfgkjsfg
test3;orange;wervxddgjbdhnf

I'd like to extract just the text between the semicolons (;):

apple
watermelon
orange

The regex to capture that is the following:

(?<=\;)(.*?)(?=\;)

I tried different variations of the following code:

for member in fruits:
    parseFruit = re.compile(member)
    member = member.sub( (\.),((?<=\;)(.*?)(?=\;)) )
    print("Fruit: ", member)

Nothing seems to work...

3 Answers

2
import re

s = """test1;apple;-fgnsldfgsbfdgb
test2;watermelon;-iwerunvfgkjsfg
test3;orange;wervxddgjbdhnf"""

fruits = re.findall(r';(.*?);', s)

for fruit in fruits:
    print('Fruit: %s' % fruit)

#fruits is ['apple', 'watermelon', 'orange']

Output:

Fruit: apple
Fruit: watermelon
Fruit: orange

2 Comments

Sorry, I initially wrote lists in the title but it should have been array. I'm not sure if this changes anything, but when I tried it, it prints fine but the change doesn't persist.
The variable fruits now holds ['apple', 'watermelon', 'orange'].
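The answer above runs findall over one multi-line string, while the question starts from a list of strings. Here is a rough sketch of the same pattern applied to each element, with the results collected in a new list. The names items, between and names, and the extra no-semicolon entry, are made up for illustration and are not from the question.

```python
import re

# Sample data from the question, plus one hypothetical entry with no semicolons
items = ['test1;apple;-fgnsldfgsbfdgb',
         'test2;watermelon;-iwerunvfgkjsfg',
         'test3;orange;wervxddgjbdhnf',
         'no-semicolons-here']

# Compile once; the group captures the text between the first pair of semicolons
between = re.compile(r';(.*?);')

names = []
for item in items:
    match = between.search(item)
    if match:  # search returns None when nothing matches, so skip those entries
        names.append(match.group(1))

print(names)
```

The results persist because they are appended to names; the loop variable itself is never reassigned.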
1

For your example data, instead of sub you can use search without a capturing group to get the first match.

(?<=;).*?(?=;)

import re

fruits = ['test1;apple;-fgnsldfgsbfdgb',
'test2;watermelon;-iwerunvfgkjsfg',
'test3;orange;wervxddgjbdhnf']

for member in fruits:
    # group(0) is the whole match: the text between the first two semicolons
    print("Fruit: " + re.search(r"(?<=;).*?(?=;)", member).group(0))

If you want to use sub, you can match from the beginning of the string up to and including the first occurrence of ;, or from the last occurrence of ; to the end of the string, and replace each match with an empty string.

^[^;]+;|;[^;]+$

for member in fruits:
    print("Fruit: " + re.sub(r'^[^;]+;|;[^;]+$', '', member))

5 Comments

Sorry, I initially wrote lists in the title but it should have been array. I'm not sure if this changes anything, but when I tried it, it prints fine but the change doesn't persist.
This works great! I had to add a "member =" instead of the "print" to keep the change, thank you!
Actually, when I do a print(fruits) it prints it the original way... is there a write or save command I'm missing?
@Awsmike In the example the result of each operation is printed, but fruits itself is not modified. If you want to change fruits, you can use map or store the results in a new list.
Thank you, I ended up doing just that: creating a new list and using newlist.append(member), for anyone who is interested...
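To illustrate the comments above: assigning to the loop variable only rebinds a local name, so the list keeps its original strings. Below is a minimal sketch of two ways to keep the result, using the sub pattern from this answer. The names pattern, cleaned and unchanged are illustrative only.

```python
import re

fruits = ['test1;apple;-fgnsldfgsbfdgb',
          'test2;watermelon;-iwerunvfgkjsfg',
          'test3;orange;wervxddgjbdhnf']

pattern = re.compile(r'^[^;]+;|;[^;]+$')

# Rebinding the loop variable does not touch the list
for member in fruits:
    member = pattern.sub('', member)
unchanged = list(fruits)  # still the original three strings

# Option 1: build a new list with map
cleaned = list(map(lambda s: pattern.sub('', s), fruits))

# Option 2: overwrite each element in place by index
for i, member in enumerate(fruits):
    fruits[i] = pattern.sub('', member)

print(cleaned)
print(fruits)
```

Option 1 leaves the original list intact, which helps if the raw strings are still needed. Option 2 changes fruits itself.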
0

Alternatively, instead of using regex, you can use the string split method:

FruitList = ['test1;apple;-fgnsldfgsbfdgb', 'test2;watermelon;-iwerunvfgkjsfg', 'test3;orange;wervxddgjbdhnf']
Fruits = [i.split(';')[1::2] for i in FruitList]

[['apple'], ['watermelon'], ['orange']]
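Note that the [1::2] slice returns a list for each element, which is why the result above is nested. If a flat list of strings is wanted, a small variation that indexes the second field directly might look like this (fruit_list and names are illustrative names, and every element is assumed to contain at least one semicolon):

```python
fruit_list = ['test1;apple;-fgnsldfgsbfdgb',
              'test2;watermelon;-iwerunvfgkjsfg',
              'test3;orange;wervxddgjbdhnf']

# split(';') yields e.g. ['test1', 'apple', '-fgnsldfgsbfdgb']; index 1 is the middle field.
# An element without any semicolon would raise IndexError here.
names = [item.split(';')[1] for item in fruit_list]

print(names)
```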
