8

I am new to Python. I searched but could not find what I am looking for. I apologise in advance if this question has already been asked and I missed it because I do not know the name of what I am trying to achieve. I will gladly read any document you might suggest.

I have a list of lists, where each inner list has the form [int, 'str'], e.g.:

t = [[234, 'str_1'],
     [1267, 'str_2'],
     [2445, 'str_3']]

I want to find out whether a str exists at index 1 of one of the inner lists of t. I can do this with a function containing two for or while loops, but I would like to achieve it, if possible, with a single iteration. I want to learn the shortest function.

For the input 'str_3', I want to get 2 (the index of the inner list that holds 'str_3' at its own index 1). For 'str_1', I want to get 0.

For 'str_1234', I want to get False, as it is not in any of the lists within t.

As a newbie, I would normally do:

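def find_index(t):  # wrapped in a function so that return is valid
    for n in range(len(t)):
        if t[n][1] == 'str_1':
            return n
    return False  # reached only after the whole list has been checked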

What I am looking for is, if possible, a better and shorter way of achieving this in one line of code, or simply to learn whether there is a smarter or more Pythonic way that those of you with more experience would recommend.

Thank you

1
  • 1
    It is much more useful if your example data are valid Python which can be used to test answers. Commented Aug 18, 2012 at 15:44

4 Answers

11
[n for n, (i, s) in enumerate(t) if s == 'str_3']

Explanation:

>>> t = [[100, 'str_1'], [200, 'str_2'], [300, 'str_3']]

# Use enumerate to get each list item along with its index.
>>> list(enumerate(t))
[(0, [100, 'str_1']), (1, [200, 'str_2']), (2, [300, 'str_3'])]

# Use list comprehension syntax to iterate over the enumeration.
>>> [n for n, (i, s) in enumerate(t)]
[0, 1, 2]

# An if condition can be added right inside the list comprehension.
>>> [n for n, (i, s) in enumerate(t) if s == 'str_3']
[2]
>>> [n for n, (i, s) in enumerate(t) if s == 'str_1234']
[]

This will return all of the matching indices, since there could be more than one.

If you know there will only be one index, may I suggest using a dict instead of a nested list? With a dict you can look up elements in constant time using very straightforward syntax, rather than having to iterate.

>>> t = {'str_1': 100, 'str_2': 200, 'str_3': 300}

>>> 'str_3' in t
True
>>> t['str_3']
300

>>> 'str_1234' in t
False
>>> t['str_1234']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'str_1234'
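
If the nested list is what you already have, one way to get dict-style lookups is to build an index from it once. This is only a minimal sketch, assuming the strings are unique; index_by_name is a name made up for this example.

```python
t = [[100, 'str_1'], [200, 'str_2'], [300, 'str_3']]

# Map each string to the position of its sublist in t.
# Assumption: the strings are unique. A repeated string would keep only
# its last position.
index_by_name = {s: n for n, (i, s) in enumerate(t)}

print(index_by_name.get('str_3'))     # 2
print(index_by_name.get('str_1234'))  # None, because the key is missing
```

Using .get avoids the KeyError shown above when the string is absent.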

2 Comments

Thank you very much Mr. Kugelman. I learnt a lot from this post of yours. I thought about using a dictionary but I am trying to create a representation of a site map. Each node must have int_16(id), str_32(name) and int_8(type) elements. If a node (section of the website) has children, I am thinking of replacing the type with another list or tuple. I need this to use with a function which matches a url to the sitemap to give 404 or to fetch from the DB. What would you advise in this case? Thank you very much.
@Phil That's a bit much to respond to in a comment! Some kind of nested structure does seem enticing. Perhaps you can post another question.
7

Use the any function with a generator expression:

any(el[1] == 'str_1' for el in t)

What this does is loop over t, and for each el in t we test if the second value in it is equal to str_1, just like your loop.

But it'll only do this until it finds an element for which this is true, then stop, returning True. If it doesn't find such an element, it'll return False instead.

If, instead, you don't just want to test for the presence of str_1 but also want to know where it is located, the most efficient method is to use a loop like you did. You can use enumerate to give you both the index and the value in a sequence:

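def find_index(t):  # wrapped in a function so that return is valid
    for n, el in enumerate(t):
        if el[1] == 'str_1':
            return n
    return False  # reached only after the whole list has been checked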
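
One caveat with returning False: in Python, 0 == False is true, so a match at index 0 is easy to confuse with no match. A minimal sketch that returns None instead, which the caller can check with "is None"; the name find_index and its parameters are made up for this example.

```python
def find_index(rows, target):
    """Return the index of the first row whose second item equals target, else None."""
    for n, el in enumerate(rows):
        if el[1] == target:
            return n
    return None  # None cannot be mistaken for index 0


t = [[100, 'str_1'], [200, 'str_2'], [300, 'str_3']]

print(find_index(t, 'str_1'))           # 0
print(find_index(t, 'str_1234'))        # None
print(find_index(t, 'str_1') == False)  # True: index 0 compares equal to False
```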

3 Comments

I might use next instead of the loop: next((n for n, el in enumerate(t) if el[1] == 'str_1'), False) or something. [Untested, but some variant should work.] But the loop is equally clear -- maybe even more so -- so it doesn't matter much.
Thank you Mr. Pieters. I learned a great deal from your post as well. Unfortunately the any function won't do the trick alone, but your second function is awesome.
@DSM: Yeah, that feels like using generator expressions for the sake of generator expressions.
3

The for loop you give as an example is not the usual way of writing one in Python, because a Python for loop is not a numerical construct (as it is in many languages), but a way of applying a loop body to each element yielded by iterating over the given collection.

The normal way to find the index with a loop (inside a function) would be:

for (n, (integer, string)) in enumerate(t):
    if 'str_1' == string:
        return n

Note that in this case, sequence unpacking is used to assign the elements of each two-item list to separate variables. Try out the enumerate function yourself in the Python shell to find out what it does.

If your list is sorted by the value you are searching for, you may instead wish to use bisect: http://docs.python.org/library/bisect.html
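
As a rough sketch of the bisect approach: it only helps when the data is sorted by the value being searched for, here the string. The names rows, keys and find_sorted are made up for this example.

```python
import bisect

# Rows sorted by their string field; bisect needs the search keys in sorted order.
rows = [[300, 'str_a'], [100, 'str_b'], [200, 'str_c']]
keys = [s for _, s in rows]  # ['str_a', 'str_b', 'str_c']


def find_sorted(keys, target):
    """Binary-search keys for target; return its index, or None if it is absent."""
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return pos
    return None


print(find_sorted(keys, 'str_b'))  # 1
print(find_sorted(keys, 'str_z'))  # None
```

Building keys takes one full pass over the list, so this probably only pays off when the same sorted data is searched many times.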

If you merely want to find out if there is a pair which meets your condition, and you do not care about the index, use any, which stops iteration once it finds a match:

any(string == 'str_1' for integer, string in t)

The expression inside any is a generator expression, which yields True or False for each element in t.

Finally, consider whether this is really the most appropriate data structure to use.

2 Comments

Hello Marcin. Thank you for this excellent answer. I aim to create a datastructure representing a complex sitemap. I thought about lists (or tuples) with 3 values: id, name and type_of. If the node has children, it will replace the type_of as another list (or tuple). I am trying to create a function to match a given URL to this data structure representing the sitemap. First I convert url/a/b/c to a list and try to match each element to the data structure accordingly. What do you think would be the best approach for this?
@Phil I think the best approach is to either use an existing URL routing system, or look at how they do it. However, if you are using trees, consider either making an explicit tree structure, or perhaps using nested dicts, if that suits your needs.
1

First, let's do an actual data structure similar to what you are describing:

>>> LoL=[[i,'str_{}'.format(i)] for i in list(range(10))+list(range(10,0,-1))]
# this is Py 3, so that is why I need 'list' around range
>>> LoL
[[0, 'str_0'], [1, 'str_1'], [2, 'str_2'], [3, 'str_3'], [4, 'str_4'], [5, 'str_5'], 
 [6, 'str_6'], [7, 'str_7'], [8, 'str_8'], [9, 'str_9'], [10, 'str_10'], [9, 'str_9'], 
 [8, 'str_8'], [7, 'str_7'], [6, 'str_6'], [5, 'str_5'], [4, 'str_4'], [3, 'str_3'], 
 [2, 'str_2'], [1, 'str_1']]

Now make a list of tuples, one for every element whose string equals 'str_5':

>>> [(i,li,ls) for (i,(li,ls)) in enumerate(LoL) if ls == 'str_5']
[(5, 5, 'str_5'), (15, 5, 'str_5')]

Now, test with a string that is not there:

>>> [(i,li,ls) for (i,(li,ls)) in enumerate(LoL) if ls == 'str_123']
[]

It is then easy to test for presence, count the occurrences and extract the items needed:

>>> target = 'str_7'
>>> for t in [(i,li,ls) for (i,(li,ls)) in enumerate(LoL) if ls == target]:
...    print('index: {}, int: {}, str: "{}"'.format(*t))
... 
index: 7, int: 7, str: "str_7"
index: 13, int: 7, str: "str_7"

As others have said, you could rethink your data structure. Are you keeping int_x as an index? Don't do that. Use a list of the strings and enumerate. If it is just a mapping between str_x and int_x, use a dictionary.
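
To illustrate the list-of-strings suggestion: if the integer is only the position, list.index already does the search, and it raises ValueError when the string is missing. A minimal sketch; names and position_of are made up for this example.

```python
names = ['str_0', 'str_1', 'str_2']


def position_of(names, target):
    """Return the position of target in names, or None if it is not present."""
    try:
        return names.index(target)
    except ValueError:  # list.index raises ValueError when the item is absent
        return None


print(position_of(names, 'str_2'))     # 2
print(position_of(names, 'str_1234'))  # None
```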

1 Comment

Thank you carrot-top! Excellent explanation.
