Create a list from a list based on a string pattern

Question

I have a list like the example data below. Every entry follows the pattern 'source/number_something/'. I would like to create a new list like the output below, where each entry is just the "something". I was thinking of using a for loop and splitting each string on _, but some of the "something" parts also contain _. This seems like something regex could handle, but I'm not that good at regex. Any tips are greatly appreciated.

example data:

['source/108_cash_total/',
 'source/108_customer/',
 'source/108_daily_units_total/',
 'source/108_discounts/',
 'source/108_employee/',
 'source/56_cash_total/',
 'source/56_customer/',
 'source/56_daily_units_total/',
 'source/56_discounts/',
 'source/56_employee/']

output:

['cash_total',
 'customer',
 'daily_units_total',
 'discounts',
 'employee',
 'cash_total',
 'customer',
 'daily_units_total',
 'discounts',
 'employee']

Jan · Accepted Answer · 2020-04-21 00:05:58Z

6

You can use a regular expression:

\d+_([^/]+)

See a demo on regex101.com.

In Python:

import re

lst = ['source/108_cash_total/',
       'source/108_customer/',
       'source/108_daily_units_total/',
       'source/108_discounts/',
       'source/108_employee/',
       'source/56_cash_total/',
       'source/56_customer/',
       'source/56_daily_units_total/',
       'source/56_discounts/',
       'source/56_employee/']

rx = re.compile(r'\d+_([^/]+)')

output = [match.group(1) 
          for item in lst 
          for match in [rx.search(item)] 
          if match]
print(output)

Which yields

['cash_total', 'customer', 'daily_units_total', 
 'discounts', 'employee', 'cash_total', 'customer',
 'daily_units_total', 'discounts', 'employee']
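For readers less familiar with regex, the pattern above can be read piece by piece. The following is only an illustrative sketch: it rewrites the same pattern with `re.VERBOSE` so each part carries a comment, and the names `sample` and `name` are made up for this example.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE
# so that each part can be commented.
rx = re.compile(r"""
    \d+        # one or more digits (the number that follows 'source/')
    _          # the first underscore after those digits
    ([^/]+)    # group 1: one or more characters that are not a slash
""", re.VERBOSE)

sample = 'source/108_daily_units_total/'   # hypothetical single entry
match = rx.search(sample)
name = match.group(1) if match else None
print(name)  # daily_units_total
```

Because the capture group stops at the next slash rather than at the next underscore, underscores inside the name are kept.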

edited Apr 21, 2020 at 0:05

answered Apr 20, 2020 at 22:06

Jan


2 Comments

Karl Knechtel Over a year ago

This creation of a dummy list to filter out the None search results is an interesting technique. I think I like it.

Tupteq Over a year ago

IMO this use of a list comprehension isn't much shorter than the classical approach (an ordinary for loop), but it is very cryptic. In Python 3.8+ you can write it much more elegantly: output = [m.group(1) for item in lst if (m := rx.search(item))].
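The two alternatives mentioned in this comment can be put side by side. This is a minimal sketch, assuming the `rx` pattern from the answer; the shortened `lst` (including one entry that deliberately does not match) and the names `output_loop` and `output_walrus` are made up for illustration.

```python
import re

rx = re.compile(r'\d+_([^/]+)')
# Shortened, hypothetical input; the last entry has no digits and will not match.
lst = ['source/108_cash_total/', 'source/56_customer/', 'no_match_here']

# Ordinary for loop: search, then keep only the successful matches.
output_loop = []
for item in lst:
    m = rx.search(item)
    if m:
        output_loop.append(m.group(1))

# Python 3.8+: the assignment expression binds the match inside the condition.
output_walrus = [m.group(1) for item in lst if (m := rx.search(item))]

print(output_loop)    # ['cash_total', 'customer']
print(output_walrus)  # ['cash_total', 'customer']
```

Both forms skip entries where `rx.search` returns None, which is the same filtering the dummy single-element list performs in the answer.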

Tupteq · Answer · 2020-04-20 22:11:31Z

0

You can easily do this without regex, using only an offset and split() with the maxsplit parameter set:

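offset = len("source/")  # length of the fixed prefix to skip
result = []
for item in lst:
    # split only on the first "_" so underscores inside the name are kept
    _num, data = item[offset:].split("_", 1)
    result.append(data[:-1])  # drop the trailing "/"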

Of course, it's not very flexible, but as long as your data follows the schema, that doesn't matter.
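If the data might not always follow the schema, the same idea can be wrapped in a small helper that returns None for entries it cannot handle. This is only a sketch under assumptions: `extract_name` is a hypothetical helper, and the extra entries in `lst` are invented to show the fallback.

```python
def extract_name(item, prefix="source/"):
    """Return the text after '<prefix><number>_' without trailing slashes, or None."""
    if not item.startswith(prefix):
        return None
    # partition() never raises: sep is '' when there is no underscore at all.
    num, sep, data = item[len(prefix):].partition("_")
    if not sep or not num.isdigit():
        return None
    return data.rstrip("/")

# Hypothetical input: the last two entries break the schema on purpose.
lst = ['source/108_cash_total/', 'source/56_customer/', 'other/12_x/', 'source/abc/']
result = [name for name in map(extract_name, lst) if name is not None]
print(result)  # ['cash_total', 'customer']
```

Note that `rstrip("/")` removes every trailing slash, whereas `data[:-1]` always removes exactly one character, so the two differ on entries without a trailing slash.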

answered Apr 20, 2020 at 22:11

Tupteq


sahasrara62 · Answer · 2020-04-20 22:15:57Z

0

This is probably not as clean as the regex approach, but it works using only a list comprehension and the split function:

lst = ['source/108_cash_total/',
       'source/108_customer/',
       'source/108_daily_units_total/',
       'source/108_discounts/',
       'source/108_employee/',
       'source/56_cash_total/',
       'source/56_customer/',
       'source/56_daily_units_total/',
       'source/56_discounts/',
       'source/56_employee/']

res = ['_'.join(i.split('_')[1:]).split('/')[:-1][0] for i in lst]

print(res)

# output ['cash_total', 'customer', 'daily_units_total', 'discounts', 'employee', 'cash_total', 'customer', 'daily_units_total', 'discounts', 'employee']
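The one-liner above packs several steps into a single expression. As a rough illustration, the sketch below unrolls it for one entry; the intermediate names (`parts`, `rejoined`, `name`) and the `shorter` variant are not part of the original answer and are shown only for comparison.

```python
item = 'source/108_daily_units_total/'

parts = item.split('_')              # ['source/108', 'daily', 'units', 'total/']
rejoined = '_'.join(parts[1:])       # 'daily_units_total/'
name = rejoined.split('/')[:-1][0]   # 'daily_units_total'

# A shorter variant (an alternative, not the answer's code): split once on the
# first underscore, then strip the trailing slash.
shorter = item.split('_', 1)[1].rstrip('/')

print(name)     # daily_units_total
print(shorter)  # daily_units_total
```

Splitting on every underscore and rejoining is what lets the original expression keep underscores inside the name; limiting the split to one achieves the same without the join.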

answered Apr 20, 2020 at 22:15

sahasrara62
