Implement `__iter__()` and `__next__()` in different

How you make iterators and iterables

There are two ways to do this:

  1. Implement __iter__ to return self and nothing else, implement __next__ on the same class. You’ve written an iterator.
  2. Implement __iter__ to return some other object that follows the rules of #1 (a cheap way to do this is to write it as a generator function so you don’t have to hand-implement the other class). Don’t implement __next__. You’ve written an iterable that is not an iterator.

For correctly implemented versions of each protocol, the way you tell them apart is the __iter__ method. If the body is just return self (maybe with a logging statement or something, but no other side-effects), then either it’s an iterator, or it was written incorrectly. If the body is anything else, then either it’s a non-iterator iterable, or it was written incorrectly. Anything else is violating the requirements for the protocols.

In case #2, the other object is necessarily of a different class. A class either has an idempotent __iter__ plus __next__ (an iterator), or it has only __iter__, without __next__, and produces a new iterator on every call (a non-iterator iterable). It can't be both.
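As a rough illustration of case #2, here is a minimal sketch using a made-up Countdown class (not from the question) whose __iter__ is a generator function. The object it hands back is a generator, which is a different class from Countdown and is itself an iterator:

```python
import types


class Countdown:
    """Hypothetical non-iterator iterable; the name is only for illustration."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Generator function: each call builds a brand-new generator object
        n = self.start
        while n > 0:
            yield n
            n -= 1


c = Countdown(3)
it = iter(c)

print(type(it) is not type(c))              # True: the iterator is of another class
print(isinstance(it, types.GeneratorType))  # True: it is a generator
print(hasattr(c, "__next__"))               # False: the iterable has no __next__
print(iter(it) is it)                       # True: the generator's __iter__ returns itself
print(list(it))                             # [3, 2, 1]
```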


Why the protocol is designed this way

The reason you need __iter__ even on iterators is to support patterns like:

 iterable = MyIterable(...)
 iterator = iter(iterable)  # Invokes MyIterable.__iter__
 next(iterator, None)  # Throw away first item
 for x in iterator:    # for implicitly calls iterator's __iter__; dies if you don't provide __iter__
     ...               # Process the remaining items
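To see the failure concretely, here is a small sketch with two hypothetical classes (NextOnly and Proper are invented names): one defines only __next__, the other adds the trivial __iter__. Calling next() directly works on both, but anything that loops over the object breaks on the first:

```python
class NextOnly:
    """Hypothetical broken 'iterator': has __next__ but no __iter__."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        value = self._items[self._pos]
        self._pos += 1
        return value


class Proper(NextOnly):
    """Same thing, plus the trivial __iter__ that makes it a real iterator."""

    def __iter__(self):
        return self


def skip_first(iterator):
    next(iterator, None)           # Works on both classes: only __next__ is needed
    return [x for x in iterator]   # Looping calls iter(iterator), which needs __iter__


try:
    skip_first(NextOnly("abc"))
    error = None
except TypeError as exc:           # 'NextOnly' object is not iterable
    error = exc

fixed_result = skip_first(Proper("abc"))
print(type(error).__name__)  # TypeError
print(fixed_result)          # ['b', 'c']
```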

The reason iterables always return a new iterator, rather than being iterators themselves and resetting their state when __iter__ is invoked, is twofold. First, it handles the case above: if MyIterable just returned itself and reset iteration, the for loop’s implicit call to __iter__ would reset it again and undo the intended skip of the first element. Second, it supports patterns like this:

 for x in iterable:
     for y in iterable:  # Operating over product of all elements in iterable
         ...             # Use the pair (x, y)

If __iter__ reset itself to the beginning and only had a single state, this would:

  1. Get the first item and put it in x
  2. Reset, then iterate through the whole of iterable putting each value in y
  3. Try to continue outer loop, discover it’s already exhausted, never give any other value to x
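The steps above can be reproduced with a deliberately wrong class. This is only a sketch of the anti-pattern; ResettingRange is a made-up name, and it shows both the broken nested loop and the undone skip of the first element:

```python
class ResettingRange:
    """Hypothetical anti-pattern: the iterable is its own iterator and rewinds in __iter__."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.current = start

    def __iter__(self):
        self.current = self.start  # Side effect: every iter() call rewinds the single shared state
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


r = ResettingRange(0, 3)

# Nested loops: the inner loop rewinds and then exhausts the only state there is
pairs = [(x, y) for x in r for y in r]
print(pairs)    # [(0, 0), (0, 1), (0, 2)] instead of all 9 pairs

# Skipping the first item: the implicit iter() inside list() rewinds again
it = iter(r)
next(it, None)
skipped = list(it)
print(skipped)  # [0, 1, 2]; the skip was undone
```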

Returning a fresh iterator is also needed because Python assumes that the test iter(x) is x is a safe, side-effect free way to check whether an iterable is an iterator. If your __iter__ modifies your own state, it’s not side-effect free. At worst, for iterables, the test should waste a little time making an iterator that is immediately thrown away. For iterators, it should be effectively free (since __iter__ just returns self).
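A minimal sketch of that test, using a built-in list and its iterator; the helper name is_iterator is made up for this example. collections.abc.Iterator offers a related isinstance check that looks for both methods:

```python
from collections.abc import Iterator


def is_iterator(obj):
    """Hypothetical helper: True if obj is its own iterator.

    Raises TypeError if obj is not iterable at all.
    """
    return iter(obj) is obj


numbers = [1, 2, 3]
numbers_iter = iter(numbers)

print(is_iterator(numbers))                # False: a list builds a new iterator each time
print(is_iterator(numbers_iter))           # True: an iterator returns itself
print(isinstance(numbers, Iterator))       # False
print(isinstance(numbers_iter, Iterator))  # True
print(list(numbers_iter))                  # [1, 2, 3]: the checks consumed nothing
```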


To answer your questions directly:

Does this mean you can put __iter__() and __next__() in two different objects?

For iterators, you can’t (it must have both methods, though __iter__ is trivial). For non-iterator iterables, you must (it must only have __iter__, and return some other iterator object). There is no “can”.

Can it be done for objects belonging to different classes?

Yes.

Can it only be done for objects belonging to different classes?

Yes.


Examples

Example of iterable:

class MyRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return MyRangeIterator(self)  # Returns new iterator, as this is a non-iterator iterable

    # Likely to have other methods (because iterables are often collections of
    # some sort and support many other behaviors)
    # Does *not* have __next__, as this is not an iterator

Example of iterator:

class MyRangeIterator:  # Class is often non-public and/or defined inside the iterable as
                        # a nested class; it exists solely to store state for the iterator
    def __init__(self, rangeobj):  # Constructed from iterable; could pass raw values if you preferred
        self.current = rangeobj.start
        self.stop = rangeobj.stop
    def __iter__(self):
        return self             # Returns self, because this is an iterator
    def __next__(self):         # Has __next__ because this is an iterator
        retval = self.current   # Must cache current because we need to modify it before we return
        if retval >= self.stop:
            raise StopIteration # Indicates iterator exhausted
        self.current += 1       # Ensure state updated for next call
        return retval           # Return cached value

    # Unlikely to have other methods; iterators are generally iterated and that's it

Example of “easy iterable” where you don’t implement your own iterator class, by making __iter__ a generator function:

class MyEasyRange:
    def __init__(self, start, stop): ... # Same as for MyRange

    def __iter__(self):  # Generator function is simpler (and faster)
                         # than writing your own iterator class
        current = self.start  # Use a local, don't mutate attributes: multiple iterators might rely on this one iterable
        while current < self.stop:
            yield current     # Produces value and freezes generator until iteration resumes
            current += 1
        # Reaching the end of the function acts as an implicit StopIteration for a generator
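As a quick check, the sketch below repeats MyEasyRange with its __init__ filled in the same way as MyRange, then runs the two patterns discussed earlier. Each call to __iter__ makes an independent generator, so both behave as expected:

```python
class MyEasyRange:
    def __init__(self, start, stop):  # Same as for MyRange
        self.start = start
        self.stop = stop

    def __iter__(self):
        current = self.start          # Local state, so each generator is independent
        while current < self.stop:
            yield current
            current += 1


r = MyEasyRange(0, 3)

# Nested loops see the full product, because each loop gets its own generator
pairs = [(x, y) for x in r for y in r]
print(len(pairs))  # 9

# Skipping the first item sticks, because a generator's __iter__ returns itself
it = iter(r)
next(it, None)
rest = list(it)
print(rest)        # [1, 2]

# The iterable itself is untouched and can be iterated again
print(list(r))     # [0, 1, 2]
```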
