
I've got a list of strings returned by readlines(), and I want to find the index of the first line that contains a substring, or the index of the last line if no line contains it.

This works, but seems clunky:

fooIndex=listOfStrings.index((next((x for x in listOfStrings if "foo" in x),listOfStrings[-1])))

There has to be a better way than searching twice, but I can't find it.

  • Are you sure you need the index? It seems to me the most efficient way would be to loop over the file contents directly (do not call readlines) and stop/return when you see the line with the text you want. Commented Jul 13, 2016 at 14:24
  • You could use enumerate when iterating to get the index of each line. Commented Jul 13, 2016 at 14:25
  • I'm slicing up blocks of data, so I'd think it's more efficient to pass an index-sliced sublist to the function containing numpy.genfromtxt, especially since I can't get that to stop raising ValueError without a clean data block (another question I guess I should ask). Commented Jul 14, 2016 at 6:04

2 Answers


I don't think there is a good (i.e. readable) one-line solution for this. As an alternative to @eugene's loop, you could also use try/except.

def get_index(list_of_strings, substring):
    try:
        return next(i for i, e in enumerate(list_of_strings) if substring in e)
    except StopIteration:
        return len(list_of_strings) - 1

The code is a little longer, but IMHO the intent is very clear: Try to get the next index that contains the substring, or the length of the list minus one.


Update: In fact, there is a good (well, somewhat okay-ish) one-liner, and you almost had it. Use the default parameter of next, but instead of passing the last element as the default and then calling index, pass the last index itself and combine it with enumerate:

next((i for i, e in enumerate(list_of_strings) if substring in e), 
     len(list_of_strings) - 1)
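
As a quick check of how this behaves, here is a minimal sketch. The function name first_index_or_last and the sample lines are made up for illustration and are not from the question:

```python
def first_index_or_last(list_of_strings, substring):
    """Index of the first string containing substring, else the last index."""
    return next(
        (i for i, e in enumerate(list_of_strings) if substring in e),
        len(list_of_strings) - 1,  # default used when nothing matches
    )


# Hypothetical sample data standing in for the output of readlines()
lines = ["header\n", "foo 1 2\n", "bar 3 4\n"]

print(first_index_or_last(lines, "foo"))  # 1
print(first_index_or_last(lines, "zzz"))  # 2 (no match, so the last index)
```

Two things to keep in mind: for an empty list the default evaluates to -1, and if the result i is used as a slice end, lines[:i] stops before index i, so lines[:i + 1] is needed to include the matching line.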

5 Comments

That second one is more or less what I was looking for, although since I'm going to use the indexes to slice the list, I don't even need the len(list_of_strings), I can just set the default to -1: next((i for i, e in enumerate(list_of_strings) if substring in e), -1)
@DanielForsman Yes, for slicing (and for pretty much every other purpose) the two are interchangeable. Be aware, though, that slicing to [...:-1] will exclude the last element. Not sure if you want that.
No I don't! Thanks.
@DanielForsman So, you want to use that for slicing, and the slice should go up to and including the first element with the substring, or the entire list?
Yes, although I'm doing two slices usually. One looking for the header of a specific data block, and one looking for the end of that block.

Using enumerate() in a function would be both more readable and more efficient:

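def get_index(strings, substr):
    idx = -1  # returned only if strings is empty (avoids UnboundLocalError)
    for idx, string in enumerate(strings):
        if substr in string:
            break
    return idx  # index of the first match, otherwise the last index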

Note that you don't need to call .readlines() on a file object to iterate over the lines; just use it as an iterable.
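
A minimal sketch of that idea, assuming a hypothetical helper name (index_of_first_match) and using io.StringIO to stand in for an open file so the example runs without a real file on disk:

```python
import io


def index_of_first_match(lines, substr):
    """Return the index of the first line containing substr.

    Falls back to the index of the last line, or -1 if there are no lines.
    """
    idx = -1
    for idx, line in enumerate(lines):
        if substr in line:
            break  # stop reading as soon as the line is found
    return idx


# io.StringIO behaves like a text file opened for reading
fake_file = io.StringIO("header\nfoo 1 2\nbar 3 4\n")
print(index_of_first_match(fake_file, "foo"))  # 1
```

With a real file the call would sit inside a with open(...) as f: block, passing f directly. The loop then reads only as far as the matching line, although you would still need the lines themselves in a list if you plan to slice them afterwards.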
