\$\begingroup\$

I need a function to iterate through a Python iterable in chunks. That is, it takes an iterable and a size n and yields generators, each iterating through one chunk of size n. After some experimentation, I wrote this stupid hack, because there seems to be no easy way to check in advance whether an iterator has been exhausted. How can I improve this code?

def iterchunks(it, n):
    def next_n(it_, n_, return_first = None):
        if return_first is not None:
            n_ -= 1
            yield return_first
        for _ in range(n_):
            yield next(it_)

    # check if the iterator is exhausted by advancing the iterator, 
    # if not return the value returned by advancing the iterator along with the boolean result
    def exhausted(it_):
        res = next(it_, None)
        return res is None, res

    while True:
        exhsted, rf = exhausted(it)
        if exhsted:
            return
        else:
            # if the iterator is not exhausted, yield the returned value along with the next chunk
            yield next_n(it, n, rf)
\$\endgroup\$
  • Hmm, not an exact solution, but do you know about the StopIteration exception? try: while True: yield [next(it) for _ in range(n)] except StopIteration: pass Commented Apr 26, 2017 at 22:20
  • @enedil Yes, if I just pack the generators into lists or tuples, there would be no problem, since the StopIteration exception would be raised as soon as an exhausted iterator is advanced. That would sacrifice some flexibility and laziness though. Commented Apr 26, 2017 at 22:39

1 Answer

\$\begingroup\$

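Your code has a significant bug. Because exhausted treats None as the end-of-iteration marker, a None that falls at the start of a chunk (0-based index c * n, i.e. the (c * n + 1)-th item) stops the iteration, and the rest of the list is silently dropped: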

xs = list(range(75, 90))
xs[5] = None
print([list(c) for c in iterchunks(iter(xs), 5)])
# Outputs [[75, 76, 77, 78, 79]]
# Expected [[75, 76, 77, 78, 79], [None, 81, 82, 83, 84], [85, 86, 87, 88, 89]]

To resolve this, use the standard Python practice of trying the operation and asking for forgiveness afterwards: call next and catch StopIteration rather than testing for a sentinel value. I would suggest building each chunk up in a list, as in the function below. Still, this feels like reinventing the wheel; it is unfortunate that Python doesn't have this built in to the itertools library. The itertools docs do define a grouper recipe, which is close to what you want, except that it pads the last chunk with a fill value.

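def chunk(it, n):
    it = iter(it)  # Accept any iterable, not only iterators
    try:
        while True:
            xs = []  # The buffer to hold the next n items
            for _ in range(n):
                xs.append(next(it))
            yield xs
    except StopIteration:
        if xs:  # Yield the final partial chunk, but never an empty one
            yield xs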
The following is the grouper recipe from the itertools docs, with one amendment: it uses yield from instead of returning the iterable it creates.

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    yield from zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)
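
To make the padding behaviour concrete, here is a small usage sketch. The helper chunks_without_padding and the _PAD sentinel are hypothetical names added for illustration; they show one possible way to drop the fill values again if a short final chunk is preferred.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    yield from zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)


# The last group is padded with the fill value.
print(list(grouper("ABCDEFG", 3, "x")))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]

# Hypothetical helper: pad with a unique sentinel, then strip it again.
_PAD = object()


def chunks_without_padding(iterable, n):
    for group in grouper(iterable, n, fillvalue=_PAD):
        yield tuple(x for x in group if x is not _PAD)


print(list(chunks_without_padding(range(7), 3)))
# [(0, 1, 2), (3, 4, 5), (6,)]
```

Using a fresh object() as the fill value avoids the same pitfall as in the question: a legitimate None (or 'x') in the data is never confused with padding.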
\$\endgroup\$
