0

Edit: I know to iterate over a copy of my list when I want to modify the original. However, the only explanation I've ever received on what's wrong with modifying a list while iterating over it is that "it can lead to unexpected results."

Consider the following:

lst = ['a', 'b', 'c', 'd', 'e']
for x in lst:
    lst.remove(x)
print(lst)

Here is my attempt at explaining what actually happens when one modifies a list while iterating over it. Note that line2 is equivalent to for i in range(len(lst)):, and that len(lst) decreases by 1 with every iteration.

len(lst) begins as 5.

When i = 0, we have lst[i] = 'a' being removed, so lst = ['b', 'c', 'd', 'e']. len(lst) decreases to 4.

When i = 1, we have lst[i] = 'c' being removed, so lst = ['b', 'd', 'e']. len(lst) decreases to 3.

When i = 2, we have lst[i] = 'e' being removed, so lst = ['b', 'd']. len(lst) decreases to 2.

This is where I thought an IndexError would be raised, since i = 2 is not in range(2). However, the program simply outputs ['b', 'd']. Is it because i has "caught up" with len(lst)? Also, is my reasoning sound so far?

9
  • Possible duplicate of How to modify list entries during for loop? Commented May 2, 2018 at 5:27
  • Copy your list, and use your indexing on that copy. Commented May 2, 2018 at 5:28
  • 2
    @jozzas she is asking how the iteration works. I don't see that answered in the question you referenced. Commented May 2, 2018 at 5:30
  • @BcK my intention is not to clear the list; I just want to understand what happens in the background. Commented May 2, 2018 at 5:31
  • 1
    Possible duplicate of Removing from a list while iterating over it This duplicate's top answer discusses (with a good visualization) what is happening here. It is slightly different since not all elements are removed, but I think close enough to be a dupe. Commented May 2, 2018 at 5:50

3 Answers

2

The C implementation is in the listiter_next function in listobject.c and the pertinent lines are

if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}

it->it_seq = NULL;
Py_DECREF(seq);
return NULL;

The iterator returns an object if the index is still in range (it->it_index < PyList_GET_SIZE(seq)) and returns NULL otherwise, which signals that the iterator is exhausted and ends the for loop. It doesn't matter whether you are off by 1 or by a million; it's not an error.

The general reason for doing things this way is that iterators and iterables can be consumed in multiple places (consider a file object that is read inside a for loop). An outer loop shouldn't crash with an IndexError just because it has run out of things to do. It's not illegal or inherently "stupid" to change an object you are iterating over; you just need to know the consequences of your actions.
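The C logic above can be modeled roughly in Python. This is only a sketch: the class name ListIterSketch is made up for illustration, and the real list iterator is implemented in C, not like this. It shows why the loop from the question ends quietly with ['b', 'd'] left over.

```python
class ListIterSketch:
    """Hypothetical pure-Python model of the list iterator (illustration only)."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Mirrors: if (it->it_index < PyList_GET_SIZE(seq)) { ... }
        # The length is re-read on every call, so it tracks removals.
        if self.seq is not None and self.index < len(self.seq):
            item = self.seq[self.index]
            self.index += 1
            return item
        # Mirrors: it->it_seq = NULL; return NULL;
        # At the Python level, exhaustion surfaces as StopIteration.
        self.seq = None
        raise StopIteration


lst = ['a', 'b', 'c', 'd', 'e']
seen = []
for x in ListIterSketch(lst):
    seen.append(x)
    lst.remove(x)

print(seen)  # ['a', 'c', 'e']
print(lst)   # ['b', 'd']
```

After the third removal the index is 3 and the length is 2, so the range check simply fails and the loop ends; nothing ever indexes past the end of the list.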


0

"Note that line2 is equivalent to for i in range(len(lst))"

I don't think it is
The for loop in Python iterates over a list by calling the built-in next function on an iterator for that list. When the iterator is exhausted, next raises a StopIteration exception, and the for loop catches it automatically and ends.
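As a rough illustration, the for loop from the question can be written out by hand with iter and next. This is a sketch of the behaviour, not how the interpreter literally implements it; the variable names it and visited are chosen here for the example.

```python
lst = ['a', 'b', 'c', 'd', 'e']
it = iter(lst)          # a list iterator that remembers a position in lst
visited = []

while True:
    try:
        x = next(it)    # fetch the item at the current position, then advance
    except StopIteration:
        break           # position is no longer inside the list: loop ends
    visited.append(x)
    lst.remove(x)       # shrinks the list while the iterator keeps its position

print(visited)  # ['a', 'c', 'e']
print(lst)      # ['b', 'd']
```

The iterator checks its position against the current length of the list on each call to next, so a shrinking list ends the loop early instead of raising IndexError.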

1 Comment

Could you explain why the equivalence doesn't hold? This is what I gathered from your second paragraph: when i = 2, a StopIteration exception is raised because i cannot increase any further (because i cannot go beyond range(len(lst))). StopIteration is what causes the program to exit the for loop, and therefore to terminate. Is that right?
0

You can see what happens if you print x as the loop runs:

lst = [1, 2, 3, 4, 5]
for x in lst:
    print(x)
    lst.remove(x)

# 1
# 3
# 5

What happens is that you first remove 1 from the list. Every remaining element then shifts one position to the left, so 2 now sits at index 0, but the iterator has already moved on to index 1, which now holds 3. So instead of proceeding to 2, you proceed to 3 and remove it. The same thing happens again: instead of proceeding to 4, you proceed to 5 and remove it. At that point the iterator's position is past the end of the list, so the iteration is complete.

By the way, for x in lst is not the same as for x in range(len(lst)), this might be the point where you are confused.

In the first case, Python creates an iterator from your list and calls next on it at every iteration. When the iterator's position reaches the end of the list, StopIteration is raised, which ends the loop. In the second case, Python iterates over range(len(lst)), which is computed once before the loop starts and never looks at the list again. You have to keep track of where you are yourself, and indexing past the end of a shrunken list raises IndexError.
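To see the difference between the two loops, here is a small sketch of the index-based version applied to the list from the question. The try/except and the names removed and error are added here only to capture the outcome for display.

```python
lst = ['a', 'b', 'c', 'd', 'e']
removed = []
error = None

try:
    # range(len(lst)) is evaluated once, so this is range(5) no matter
    # how much the list shrinks inside the loop.
    for i in range(len(lst)):
        removed.append(lst[i])  # raises IndexError once i >= len(lst)
        del lst[i]
except IndexError as exc:
    error = exc

print(removed)      # ['a', 'c', 'e']
print(lst)          # ['b', 'd']
print(repr(error))  # an IndexError, raised when i == 3
```

The same elements are removed as with for x in lst, but this loop fails at i = 3, whereas the list iterator re-checks the length on every step and just stops.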


I suggest reading this article to learn the difference between an iterable and an iterator and how they work:

Iterator vs Iterable

2 Comments

"Because you removed one, instead of proceeding to 2, you proceed to 3". I don't see why this happens, if my explanation weren't correct. Could you also explain why for x in lst and for i in range(len(lst)) are not the same?
@jessica Clarified the answer a little bit more.
