
I want to split a list into sublists, using a second list of separator elements taken from it. Each separator should start a new sublist, which continues until the next separator comes up in the list.

It might be better to see it in an example:

lst = ['a','b','c','d','1','11','111','x','y','z']
sep = ['b','11','y']

This is my desired output:

[['b','c','d','1'],['11','111','x'],['y','z']]

So far, I have the following:

import itertools

[list(x[1]) for x in itertools.groupby(lst, lambda x: x in sep)]

But this produces [['a'], ['b'], ['c', 'd', '1'], ['11'], ['111', 'x'], ['y'], ['z']], which is not what I want.

  • Is there an 'a' in your desired output? Commented Jul 7, 2020 at 19:53
  • Are all separators present in lst? Commented Jul 7, 2020 at 19:54
  • Have you tried doing it manually? out = []; for x in lst: if x in sep: out.append(something) ... Commented Jul 7, 2020 at 19:54
  • Can a separator appear more than once? Commented Jul 7, 2020 at 20:30
  • @wjandrea Thanks, have done - with assumption stated. Commented Jul 7, 2020 at 22:19

3 Answers

Solution with an explicit loop.

This assumes that the separators are unordered, and that it does not matter whether a separator is used multiple times or not at all: whenever an element in lst matches any of the separators, it starts a new list.

lst = ['a','b','c','d','1','11','111','x','y','z']
sep = ['b','11','y']

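out = []
for el in lst:
    # A separator starts a new sublist
    if el in sep:
        out.append([])
    # Elements before the first separator are skipped, since out is still empty
    if out:
        out[-1].append(el)

print(out)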

Depending on the number of elements, it might be worth starting by creating a set from sep for more efficient inclusion testing.
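
A minimal sketch of that set-based variant, assuming the elements are hashable (the name sep_set is introduced here for illustration only):

```python
lst = ['a', 'b', 'c', 'd', '1', '11', '111', 'x', 'y', 'z']
sep = ['b', '11', 'y']

# Build the set once so each membership test is O(1) on average
sep_set = set(sep)

out = []
for el in lst:
    if el in sep_set:
        out.append([])      # a separator starts a new sublist
    if out:                 # elements before the first separator are skipped
        out[-1].append(el)

print(out)
```

For a three-element sep the difference is negligible; it should only start to matter when sep is large.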


1 Comment

The separators are indeed ordered, but each separator appears at most once. So your solution still works. Thanks a lot!

Here is how you can use slices:

lst = ['a','b','c','d','1','11','111','x','y','z']
sep = ['b','11','y']

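# Index of each separator in lst, plus len(lst) as the final boundary
l = [lst.index(i) for i in sep]+[len(lst)]
# Slice between consecutive boundaries; for i == 0, l[i-1] wraps around to len(lst),
# giving an empty slice that the trailing [1:] discards
l = [lst[l[i-1]:v] for i,v in enumerate(l)][1:]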

print(l)

Output:

[['b', 'c', 'd', '1'], ['11', '111', 'x'], ['y', 'z']]
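
Note that lst.index raises ValueError when an element is not in the list. A minimal sketch of one way to guard against that, assuming absent separators should simply be ignored (the 'q' separator and the names bounds and result are hypothetical, added for illustration):

```python
lst = ['a', 'b', 'c', 'd', '1', '11', '111', 'x', 'y', 'z']
sep = ['b', 'q', '11', 'y']   # 'q' is a hypothetical separator that is absent from lst

# Keep only separators that occur in lst, then add len(lst) as the final boundary
bounds = [lst.index(s) for s in sep if s in lst] + [len(lst)]

# Pair each boundary with the next one and slice between them
result = [lst[start:end] for start, end in zip(bounds, bounds[1:])]

print(result)
```

This still assumes the separators appear in lst in the same order as in sep, and that each occurs at most once.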

2 Comments

I think this will fail if a separator is repeated. For example: lst = ['a','b','c','d','1','11','111','x','y','z','a','b','c'] and sep = ['b','11','y','b']
@WilfRosenbaum If a separator is repeated, there could be multiple interpretations of what the output should be. The OP didn't specify.

Assumptions:

  • Separators are not repeated.
  • Separators occur in the same order in lst and sep.

Identify all potential start and end positions in the right order (the start of the next segment is the end of the previous one). Append None as the end of the last segment:

starts = sorted(lst.index(s) for s in sep)
ends = starts[1:] + [None]

Take a slice of the original list from each start position to the matching end position:

[lst[slice(start, end)] for start, end in zip(starts, ends)]
#[['b', 'c', 'd', '1'], ['11', '111', 'x'], ['y', 'z']]

The latter can be written using argument unpacking (the asterisk notation):

[lst[slice(*bounds)] for bounds in zip(starts, ends)]
#[['b', 'c', 'd', '1'], ['11', '111', 'x'], ['y', 'z']]
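
Putting the steps together, here is a self-contained sketch using the question's data. The function name split_at is made up for illustration, and it assumes every separator occurs in lst exactly once:

```python
def split_at(lst, sep):
    """Split lst into sublists, each starting at one of the elements of sep."""
    # Start positions, sorted so they follow the order of lst
    starts = sorted(lst.index(s) for s in sep)
    # Each segment ends where the next begins; None means "to the end of the list"
    ends = starts[1:] + [None]
    return [lst[start:end] for start, end in zip(starts, ends)]


lst = ['a', 'b', 'c', 'd', '1', '11', '111', 'x', 'y', 'z']
sep = ['b', '11', 'y']

print(split_at(lst, sep))
```

Because the start positions are sorted, passing the separators in a different order gives the same result in this sketch. Elements before the first separator ('a' here) are dropped.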

3 Comments

Nice. I wonder if conceptually it is easier to grasp if the + [None] is omitted from starts but then you have ends = starts[1:] + [None].
Actually I'm going to edit it to do that - you can roll back if you disagree, obviously.
Sorry, I deleted my comment because I changed my mind again -- or at least I wasn't sure -- but obviously too late... I'm going to ask the OP for clarification.
