Python - Getting user data using regex

Question

So, I'm still a newbie with regex and python. I've been searching for some time but don't know how to ask what I'm looking for.

I need to get data from a formatted string into a list of lists, or dictionary.

-------------------------------------------------------------------
Frank         114      0         0         0          0         114       
Joe           49       1         0         0          0         50        
Bob           37       0         0         0          0         37        
Sally         34       2         0         0          0         36

This is the output of a script. Currently I have:

import re

match_list = []
match = re.search(r'\n(\w+)\s+(\d*)\s+(\d*)', output)
if match:
    match_list.append([match.group(1),
                       match.group(2),
                       match.group(3)])
>>> print(match_list)
[['Frank', '114', '0']]

This is perfect, except that I need to have match_list return:

[['Frank', '114', '0'],
 ['Joe', '49', '1'],
 ['Bob', '37', '0'],
 ['Sally', '34', '2']]

My initial thought was to use a for loop and check whether match.group(1) was already listed, and if so move on to the next, but then I realized I didn't know how to do that. I'm having a hard time figuring this out. Any help would be fantastic! :)

Oh, also: the list size changes. Sometimes there may be only one user, other times there may be 20 users. So I can't just set up a giant static regex (that I know of...).

Is there a reason that you have to use regex (like an assignment requirement), or can you use anything that works?
– DSM, Commented Sep 11, 2012 at 20:23
No, it's not an assignment. I'm just tracking data. I was hoping to keep it in regex, as I've been told it's very useful and I would like to be more familiar with it. If there's a much simpler way, though, I'd be fine with that.
– jtsmith1287, Commented Sep 11, 2012 at 20:27

thikonom · Accepted Answer · 2012-09-11 20:41:54Z

4

You can use re.findall:

import re

match_list = []
match = re.findall(r'\n(\w+)\s+(\d*)\s+(\d*)', output)
for k in match:
    # k will be a tuple like this: ('Frank', '114', '0')
    match_list.append(list(k))

Or the same solution as a one-liner (the outer list() is needed on Python 3, where map returns an iterator):

match_list = list(map(list, re.findall(r'\n(\w+)\s+(\d*)\s+(\d*)', output)))
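
Since the question also mentions a dictionary, here is a minimal sketch of that variant, not part of the original answer. It assumes the sample table from the question is stored in `output`; the names `row` and `users` are made up for illustration. It also swaps the leading `\n` for `^` with `re.MULTILINE`, so a first row with no newline before it would still match, and uses `\d+` so an empty column cannot match.

```python
import re

# Sample data copied from the question; `output` stands in for the script's output.
output = """\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114
Joe           49       1         0         0          0         50
Bob           37       0         0         0          0         37
Sally         34       2         0         0          0         36"""

# re.MULTILINE lets ^ match at the start of every line, so no leading \n is needed.
# The dashed separator line is skipped because '-' is not a \w character.
row = re.compile(r'^(\w+)\s+(\d+)\s+(\d+)', re.MULTILINE)

# Map each user name to its first two numeric columns, converted to int.
users = {name: [int(a), int(b)] for name, a, b in row.findall(output)}

print(users)
```

A dictionary keyed by name only makes sense if user names are unique in the output; if they can repeat, the list of lists above is the safer shape.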

edited Sep 11, 2012 at 20:41

answered Sep 11, 2012 at 20:28


1 Comment

jtsmith1287 Over a year ago

This is perfect. I needed to loop through the matches anyway, so this will cut out a step for me. It also lets me add to my regex and pull from the other columns in the future without getting bloated lists.

the wolf · Accepted Answer · 2012-09-11 21:47:59Z

3

You don't need a regex:

table="""\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114       
Joe           49       1         0         0          0         50        
Bob           37       0         0         0          0         37        
Sally         34       2         0         0          0         36"""

print([line.split() for line in table.splitlines()[1:]])

Or, if you want a regex:

print([list(t) for t in re.findall(r'^(\w+)' + r'\s+(\d+)' * 6, table, re.MULTILINE)])

Either way, this prints:

[['Frank', '114', '0', '0', '0', '0', '114'], 
 ['Joe', '49', '1', '0', '0', '0', '50'], 
 ['Bob', '37', '0', '0', '0', '0', '37'], 
 ['Sally', '34', '2', '0', '0', '0', '36']]
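
The regex version builds its pattern by string repetition, which may be easier to follow once expanded. The short sketch below just prints the assembled pattern and counts its capture groups; `pattern` and `group_count` are names chosen here for illustration.

```python
import re

# One name group followed by six repetitions of "whitespace, then digits".
pattern = r'^(\w+)' + r'\s+(\d+)' * 6
print(pattern)

# Seven capture groups in total: the name plus six numeric columns.
group_count = re.compile(pattern).groups
print(group_count)
```

If the table ever gains or loses a column, the multiplier 6 would need to change with it.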

edited Sep 11, 2012 at 21:47

answered Sep 11, 2012 at 20:31


1 Comment

DSM Over a year ago

This is similar to what I would have done, except that I'd've used .splitlines(). This makes assumptions about how the data looks that the regex doesn't, but I'd still start from this.
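
To make the point about assumptions concrete, here is a rough sketch comparing the two approaches on input that does not follow the expected layout. The table below is a hypothetical variation, not data from the question: it adds a blank line and a two-word user name. The names `by_split` and `by_regex` are illustrative.

```python
import re

# Hypothetical variation of the question's table: a blank line and a
# two-word user name, neither of which appears in the original data.
table = """\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114

Mary Ann      12       0         0         0          0         12"""

# split() keeps every line after the separator, whatever it contains.
by_split = [line.split() for line in table.splitlines()[1:]]

# The regex keeps only rows shaped like one word followed by six numbers.
by_regex = [list(t) for t in re.findall(r'^(\w+)' + r'\s+(\d+)' * 6, table, re.MULTILINE)]

print(by_split)
print(by_regex)
```

Here the split version returns an empty list for the blank line and an eight-item row for "Mary Ann", while the regex version drops both and keeps only Frank's row. Neither is right for every case, so it is worth checking which failure mode suits the real data.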
