3

I have a file similar to the one shown below. Is it possible to use a regular expression

text1  769,230,123
text2  70
text3  213,445
text4  24,356
text5  1,2,4

to give output as shown here?

['769','230','123']
['70']
['213','445']

My current code is as follows:

with open(filename,'r') as output:
    for line in output:
        a = line
        a = a.strip()
        #regex.compile here
        print regex.findall(a)

Any help or direction would be greatly useful to me. Thank you

5 Answers

1

The following regex extracts the comma-separated numbers from the line; we can then apply split(',') to get the individual numbers:

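import re
line = "text1  769,230,123"
# Capture the run of digits and commas that follows a space.
# Inside a character class '+' is a literal, so it is left out.
mat = re.match(r'.*? ([\d,]+).*', line)
nums = mat.group(1).split(',')
for num in nums:
    print(num)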

OUTPUT

769
230
123
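
As a rough sketch, the same match-then-split idea can be applied to every line of the sample data to build one list per line. The helper name `extract_numbers` and the in-memory `sample_lines` list are assumptions made for illustration; they stand in for reading the real file.

```python
import re

# Hypothetical helper (not part of the original answer): returns the
# comma-separated numbers on a line as a list of strings.
PATTERN = re.compile(r'.*? ([\d,]+)')

def extract_numbers(line):
    mat = PATTERN.match(line)
    if mat is None:          # no space followed by digits/commas
        return []
    return mat.group(1).split(',')

# Stand-in for the lines of the file from the question.
sample_lines = ["text1  769,230,123", "text2  70", "text3  213,445"]

for line in sample_lines:
    print(extract_numbers(line))
```

Returning an empty list when nothing matches avoids calling `group` on `None` for lines that have no values.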

1

It looks like you could just findall number sequences:

regex = re.compile("[ ,]([0-9]+)")
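
A minimal sketch of how this pattern might be used with `findall`, assuming a single sample line from the question. Because the pattern has one capturing group, `findall` returns only the captured digits, and digits are captured only when a space or comma comes directly before them.

```python
import re

regex = re.compile("[ ,]([0-9]+)")

# Sample line taken from the question.
line = "text1  769,230,123"

# With one capturing group, findall returns the group contents only.
numbers = regex.findall(line)
print(numbers)
```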

1 Comment

-1 Using this regex to search the line will also return the 1 from text1.

1

The following should work for you.

>>> import re
>>> regex = re.compile(r'\b\d+\b')  # whole runs of digits only
>>> with open(filename, 'r') as output:
...     for line in output:
...         matches = regex.findall(line)
...         for m in matches:
...             print(m)

Output

769
230
123
70
213
445
24
356
1
2
4
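
If one list per line is wanted, as in the question, the result of `findall` can be kept as-is instead of printing each match. This is only a sketch: `io.StringIO` and the `sample` string are assumptions standing in for the real file.

```python
import io
import re

regex = re.compile(r'\b\d+\b')

# In-memory stand-in for the file in the question (assumption for illustration).
sample = io.StringIO("text1  769,230,123\ntext2  70\ntext3  213,445\n")

# findall already returns a list of strings, so collect one list per line.
rows = [regex.findall(line) for line in sample]
for row in rows:
    print(row)
```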

0

You don't need regular expressions for this. Just line.split(',').
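
A sketch of a split-only approach, assuming the label and the values are separated by whitespace and the values themselves contain no spaces. Calling `split()` with no argument splits on any run of whitespace, so the exact number of spaces does not matter. The function name `split_values` is made up for this example.

```python
def split_values(line):
    # split() with no arguments splits on runs of whitespace,
    # so "text1  769,230,123" becomes ["text1", "769,230,123"].
    fields = line.split()
    if len(fields) < 2:      # no values after the label
        return []
    return fields[1].split(',')

print(split_values("text1  769,230,123"))
print(split_values("text2  70"))
```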

4 Comments

-1 If we take your suggestion, the first line will return "text1 769" as the first value of the split.
@alfasin Could just split twice? x.split(',') for x in line.split(' '). I find it easier to comprehend.
@VivekRai Counting on the number of spaces and on the position of each element from the split in the resulting list seems very unsafe IMHO.
Sure. Just in case the OP wished to try out other alternatives. Thanks!

0

This assumes there are always two spaces between text# and the comma-separated values. Here's a simple way to extract the values into lists:

values = []
with open(filename, 'r') as output:
    for line in output:
        # Split the label from the values on the double space.
        fields = line.strip().split('  ')
        values.append(fields[1].split(','))

This will produce a nested list:

print(values[0])  # ['769', '230', '123']
print(values[1])  # ['70']
print(values[2])  # ['213', '445']
