2

Using the standard syntax that Python supplies to check if an element is in a list:

if someElement in someList:

What is actually being executed here? Is Python looping through every index and checking for equality, or is something more sophisticated being implemented?

The program I am writing is running extremely slowly. No math is being performed, but it relies heavily on checking to see if items exist in long lists. Is there a more rapid solution?

SOLVED: Checking if an element is in a list is the same as looping through every item and checking for equality. However, checking for an item in a set is significantly faster since the items are hashed.

Even if the items in your list are unhashable (in my case, other lists), it is still worth it to convert to a string, store in a set, and convert back when needed. At first, I thought this was bulky and would decrease performance. However, it literally allowed my program to complete in a matter of minutes when it would have taken days previously.

Don't underestimate the speed of checking items in a set.

1

3 Answers 3

6

Yes, it is looping through every index and checking for equality.

This:

someElement in someList

Is equivalent to:

any( x == someElement for x in someList )

To speed up, you can probably use a set instead of list, but that really depends on the type of the elements in your collections.

If the list is big, the lookup can be slow.

Sign up to request clarification or add additional context in comments.

1 Comment

Switching to a set worked perfectly. The change in speed was dramatic!
5
nc=set(someList)
if someElement in nc: #this will now be O(1) rather than O(n)

You can make a set out of your list and improve your performance.

Comments

0

Yes, Python is looping through every index. The in operator calls the __contains__() special method ( source ).

I think for a list - assuming CPython 2 - it ends up at this list_contains code in listobject.c, a straightforward for loop over the list items:

list_contains(PyListObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                           Py_EQ);
    return cmp;
}

Is there a more rapid solution?

Use a container with faster lookups - @vks suggests a set, dictionaries are also common. Both depend on your data being hashable with hash(item) or they can't work.

But the study of datastructures and their different performance characteristics is way too big for an answer, especially with no detail on what your task is, and with no given code and no given performance. Tree structures can also have faster lookups, but off-loading the work to an existing library written in C is a good Python tactic if you can find one.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.