Complicated list comprehension with many loops in Python

Question

I am working with list comprehensions and ran into a problem when increasing the number of loops. My code so far is as follows:

selected_sheet_names = []
selected_sheet_names.append([x for x in sheet_names if x.endswith("b1")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b2")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b3")])

The sheet_names list contains strings, all of which end with b1, b2, or b3. If you want to test with it in your code:

sheet_names = ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1', 
'0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', 
'1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2', '1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', 
'1C_6_b3', '1C_7_b3', '1C_8_b3']

When I print(selected_sheet_names), the result is as follows:

[
    ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

This is exactly what I expected. However, if I need more x.endswith(some_string) filters like those in the first code block, the code becomes bloated. I would therefore like to replace the repeated selected_sheet_names.append([x for x in sheet_names if x.endswith(some_string)]) calls with a single, more complex list comprehension that iterates over some_list and produces the same result.

some_list = ["b1", "b2", "b3" ... ]

Could someone please suggest an approach?

EDIT 1: I know that I can implement this with a for loop, but in this example I am specifically interested in a list comprehension implementation, if possible. The for loop could look as follows:

selected_sheet_names = []
for ending in some_list:
    selected_sheet_names.append([x for x in sheet_names if x.endswith(ending)])

EDIT 2 (Thanks to Pedro Maia):

If the data is contiguous (which is not my case), you can use:

from itertools import groupby

selected_sheet_names = [list(group) for _, group in groupby(sheet_names, key=lambda x: x[-2:])]

My mistake: the list I showed happened to be contiguous. If your data is not contiguous, the output may look something like this:

[
    ['0.11C_1_b2'], 
    ['0.5C_1_b1'], 
    ['0.57C_1_b2'], 
    ['0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

However, if your data IS contiguous, this method seems better.

Thank you all for the replies!

ShadowRanger · Accepted Answer · 2021-11-25 02:39:51Z

6

A simple nested listcomp matching your suggested form would loop over an anonymous tuple of the strings to check for:

selected_sheet_names = [[x for x in sheet_names if x.endswith(some_string)]
                        for some_string in ("b1", "b2", "b3")]

If you get some_list from somewhere else, or it gets too long to comfortably define inline, you can replace the anonymous tuple with some_list if it's already defined.
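
As a minimal sketch of that substitution, assuming some_list holds the endings to match (the shortened, deliberately shuffled sheet_names below is illustrative, not the full list from the question):

```python
# Illustrative subset of the question's data, deliberately not contiguous.
sheet_names = ['0.5C_1_b1', '0.11C_1_b2', '1C_1_b3', '1C_1_b1', '1.14C_1_b2']

# Endings to group by; assumed to be defined elsewhere in real code.
some_list = ["b1", "b2", "b3"]

# The outer loop picks an ending; the inner loop filters sheet_names by it.
selected_sheet_names = [[x for x in sheet_names if x.endswith(ending)]
                        for ending in some_list]

print(selected_sheet_names)
# [['0.5C_1_b1', '1C_1_b1'], ['0.11C_1_b2', '1.14C_1_b2'], ['1C_1_b3']]
```

Because each inner comprehension scans the whole list, the result does not depend on the order of sheet_names, and an ending with no matches simply yields an empty list.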

edited Nov 25, 2021 at 2:39

answered Nov 25, 2021 at 2:33

ShadowRanger

5 Comments

Tarik Over a year ago

In addition to this answer, I would like to mention that you could have used a plain loop.

ShadowRanger Over a year ago

@Tarik: Sure. Every listcomp can be transformed to an equivalent for loop, but in this case, may as well just build it all at once. And apparently the OP just clarified they want the listcomp, so all's well that ends well.

loamoza Over a year ago

@ShadowRanger That worked perfectly for me! Thank you very much!

loamoza Over a year ago

@Tarik Thanks for your comment! That's true; I added a possible loop implementation to the question in my last edit!

Tarik Over a year ago

Sure, ShadowRanger, your answer is obviously more elegant than a loop. I just wanted the OP to be aware of it.

Pedro Maia · Accepted Answer · 2021-11-25 02:47:35Z

3

Alternatively, you can use groupby from the standard-library itertools module:

from itertools import groupby

selected_sheet_names = [list(group) for _, group in groupby(sheet_names, key=lambda x: x[-2:])]

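This gives cleaner and faster code, since sheet_names is traversed only once instead of once per ending. Note that groupby only merges adjacent items with the same key, so it relies on the names already being grouped by their ending.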
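
If the names are not already grouped by ending, one common workaround is to sort by the same key before calling groupby. The following is only a sketch under that assumption; the suffix helper and the shortened sheet_names are illustrative and not part of the original answer:

```python
from itertools import groupby

# Illustrative, non-contiguous subset of the question's data.
sheet_names = ['0.11C_1_b2', '0.5C_1_b1', '0.57C_1_b2', '1C_1_b3', '1C_1_b1']


def suffix(name):
    """Return the last two characters, e.g. 'b1' (hypothetical helper)."""
    return name[-2:]


# groupby only merges adjacent items with equal keys, so sort by the key first.
# sorted() is stable, so names keep their original relative order in each group.
ordered = sorted(sheet_names, key=suffix)
selected_sheet_names = [list(group) for _, group in groupby(ordered, key=suffix)]

print(selected_sheet_names)
# [['0.5C_1_b1', '1C_1_b1'], ['0.11C_1_b2', '0.57C_1_b2'], ['1C_1_b3']]
```

The sort costs O(n log n), so this trades the single pass for robustness against arbitrary input order.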

answered Nov 25, 2021 at 2:47

Pedro Maia

2 Comments

ShadowRanger Over a year ago

This assumes the entries to group are, and always will be, contiguous, and you want all of them (or can use the key to easily identify the ones to discard). That said, yes, this is the better solution if the number of endings to handle gets large enough (current implementation is O(n * m) where n is len(sheet_names) and m is len(some_list); your implementation, if it required a pre-sort, is O(n log n), or O(n) if it's assumed sorted). Up-voted regardless.

loamoza Over a year ago

Thanks for your reply! As mentioned by @ShadowRanger, it is a good way to go if the data is contiguous. Actually, it is my mistake that the sheet_names I shared were structured contiguously, whereas in reality the entries are not contiguous. I am adding an edit with your code! Thank you very much again! It seems like a good approach as well.
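
For the non-contiguous case discussed in these comments, a dictionary-based single pass is another option. This is a different technique from both answers, sketched here under the assumption that every ending is exactly two characters long; the shortened sheet_names is illustrative:

```python
from collections import defaultdict

# Illustrative, non-contiguous subset of the question's data.
sheet_names = ['0.11C_1_b2', '0.5C_1_b1', '0.57C_1_b2', '1C_1_b3', '1C_1_b1']
some_list = ["b1", "b2", "b3"]

# One pass over sheet_names: bucket each name by its last two characters.
# Assumption: all endings in some_list have length 2.
groups = defaultdict(list)
for name in sheet_names:
    groups[name[-2:]].append(name)

# Emit the buckets in the order given by some_list; endings with no
# matches produce an empty list, as in the nested listcomp.
selected_sheet_names = [groups.get(ending, []) for ending in some_list]

print(selected_sheet_names)
# [['0.5C_1_b1', '1C_1_b1'], ['0.11C_1_b2', '0.57C_1_b2'], ['1C_1_b3']]
```

This runs in roughly O(n + m) time without requiring sorted input, at the cost of no longer being a single list comprehension.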
