How you make iterators and iterables
There are two ways to do this:
1. Implement __iter__ to return self and nothing else, and implement __next__ on the same class. You’ve written an iterator.
2. Implement __iter__ to return some other object that follows the rules of #1 (a cheap way to do this is to write it as a generator function so you don’t have to hand-implement the other class). Don’t implement __next__. You’ve written an iterable that is not an iterator.
For correctly implemented versions of each protocol, the way you tell them apart is the __iter__ method. If the body is just return self (maybe with a logging statement or something, but no other side effects), then either it’s an iterator, or it was written incorrectly. If the body is anything else, then either it’s a non-iterator iterable, or it was written incorrectly. Anything else violates the requirements of the protocols.
In case #2, the other object would be of another class by definition (because you either have an idempotent __iter__ and implement __next__, or you only have __iter__, without __next__, which produces a new iterator).
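As a rough illustration of the two cases, the standard library's abstract base classes check for exactly these methods. This is only a sketch using a built-in list as the iterable; the variable names are placeholders, not part of the original question.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is an iterable, but not an iterator
numbers_iter = iter(numbers)  # list.__iter__ returns a separate iterator object

# Iterable only requires __iter__; Iterator requires both __iter__ and __next__.
assert isinstance(numbers, Iterable)
assert not isinstance(numbers, Iterator)
assert isinstance(numbers_iter, Iterable)
assert isinstance(numbers_iter, Iterator)

# The iterator is an instance of a different class from the iterable.
assert type(numbers_iter) is not type(numbers)
print(type(numbers).__name__, type(numbers_iter).__name__)  # list list_iterator
```

Note that these checks only look at which methods exist; they cannot tell whether __iter__ is implemented correctly.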
Why the protocol is designed this way
The reason you need __iter__ even on iterators is to support patterns like:
```python
iterable = MyIterable(...)

iterator = iter(iterable)  # Invokes MyIterable.__iter__

next(iterator, None)  # Throw away first item

for x in iterator:  # for implicitly calls iterator's __iter__; dies if you don't provide __iter__
    ...
```
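Here is a runnable version of the same pattern, as a sketch only: a plain list stands in for MyIterable(...), which is an assumption made so the snippet is self-contained.

```python
letters = ["a", "b", "c"]  # stand-in for MyIterable(...); any iterable works

iterator = iter(letters)   # calls list.__iter__, producing a fresh iterator
next(iterator, None)       # throw away "a"

# The for loop (or a comprehension) calls iter(iterator), which returns the
# same object, so iteration resumes right after the skipped item.
assert iter(iterator) is iterator
remaining = [x for x in iterator]
print(remaining)  # ['b', 'c']
```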
The reason you always return a new iterator for iterables, rather than just making them iterators and resetting the state when __iter__ is invoked, is to handle the above case (if MyIterable just returned itself and reset iteration, the for loop’s implicit call to __iter__ would reset it again and undo the intended skip of the first element) and to support patterns like this:
```python
for x in iterable:
    for y in iterable:  # Operating over product of all elements in iterable
        ...
```
If __iter__ reset itself to the beginning and only had a single state, this would:
- Get the first item and put it in x
- Reset, then iterate through the whole of iterable, putting each value in y
- Try to continue the outer loop, discover it’s already exhausted, and never give any other value to x
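The failure described in the steps above can be reproduced with a deliberately broken class. ResettingRange is a hypothetical name invented for this sketch; it is an example of what not to write.

```python
class ResettingRange:
    """Deliberately broken: __iter__ rewinds the object's single shared cursor."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.current = start

    def __iter__(self):
        self.current = self.start  # side effect: resets the only state there is
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


broken = ResettingRange(0, 3)

# Nested loops: the inner loop's reset and exhaustion starve the outer loop.
pairs = [(x, y) for x in broken for y in broken]
print(pairs)  # [(0, 0), (0, 1), (0, 2)]

# Skipping the first item: list() calls iter(it), which rewinds and undoes the skip.
it = iter(broken)
next(it, None)
skipped = list(it)
print(skipped)  # [0, 1, 2]
```

A correctly written iterable would produce all nine pairs in the first case and [1, 2] in the second.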
It’s also needed because Python assumes that iter(x) is x is a safe, side-effect free way to test if an iterable is an iterator. If your __iter__ modifies your own state, it’s not side-effect free. At worst, for iterables, it should waste a little time making an iterator that is immediately thrown away. For iterators, it should be effectively free (since it just returns itself).
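A minimal sketch of that test follows. The helper is_iterator and the countdown generator are hypothetical names introduced only for this example.

```python
def is_iterator(obj):
    """Return True if obj is its own iterator (hypothetical helper)."""
    return iter(obj) is obj


data = (10, 20, 30)
assert not is_iterator(data)    # tuple.__iter__ builds a new iterator each time
assert is_iterator(iter(data))  # the tuple iterator returns itself


def countdown(n):
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
assert is_iterator(gen)         # generator objects are iterators
assert list(gen) == [3, 2, 1]   # the check itself consumed nothing
```

The last assertion only holds because a well-behaved __iter__ has no side effects; with a self-resetting __iter__, the check could silently change what the caller sees next.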
To answer your questions directly:
Does this mean you can put __iter__() and __next__() in two different objects?
For iterators, you can’t (it must have both methods, though __iter__ is trivial). For non-iterator iterables, you must (it must only have __iter__, and return some other iterator object). There is no “can”.
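Can it be done for objects belonging to different classes?
Yes.
Can it only be done for objects belonging to different classes?
Yes. As noted for case #2, the iterator that a non-iterator iterable returns is of another class by definition.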
Example of iterable:
```python
class MyRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return MyRangeIterator(self)  # Returns new iterator, as this is a non-iterator iterable

    # Likely to have other methods (because iterables are often collections of
    # some sort and support many other behaviors)

    # Does *not* have __next__, as this is not an iterator
```
Example of iterator:
```python
class MyRangeIterator:  # Class is often non-public and/or defined inside the iterable as
                        # nested class; it exists solely to store state for iterator
    def __init__(self, rangeobj):  # Constructed from iterable; could pass raw values if you preferred
        self.current = rangeobj.start
        self.stop = rangeobj.stop

    def __iter__(self):
        return self  # Returns self, because this is an iterator

    def __next__(self):  # Has __next__ because this is an iterator
        retval = self.current  # Must cache current because we need to modify it before we return
        if retval >= self.stop:
            raise StopIteration  # Indicates iterator exhausted
        self.current += 1  # Ensure state updated for next call
        return retval  # Return cached value

    # Unlikely to have other methods; iterators are generally iterated and that's it
```
Example of “easy iterable” where you don’t implement your own iterator class, by making __iter__ a generator function:
```python
class MyEasyRange:
    def __init__(self, start, stop):
        ...  # Same as for MyRange

    def __iter__(self):  # Generator function is simpler (and faster)
                         # than writing your own iterator class
        current = self.start  # Can't mutate attributes, because multiple iterators might rely on this one iterable
        while current < self.stop:
            yield current  # Produces value and freezes generator until iteration resumes
            current += 1
        # reaching the end of the function acts as implicit StopIteration for a generator
```
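To see that the generator-based version behaves like a proper non-iterator iterable, here is a small self-contained check. It is a sketch that repeats the class with __init__ filled in the same way as MyRange (an assumption based on the "Same as for MyRange" comment).

```python
import types


class MyEasyRange:
    def __init__(self, start, stop):
        self.start = start  # assumed to match MyRange.__init__
        self.stop = stop

    def __iter__(self):
        current = self.start  # local state, so each iterator is independent
        while current < self.stop:
            yield current
            current += 1


r = MyEasyRange(0, 3)

first = iter(r)
second = iter(r)
assert isinstance(first, types.GeneratorType)  # calling __iter__ built a generator
assert first is not second                     # every call produces a new iterator
assert iter(first) is first                    # the generator itself is an iterator

next(first)
assert next(second) == 0                       # unaffected by advancing `first`

pairs = [(x, y) for x in r for y in r]
assert len(pairs) == 9                         # the nested-loop pattern works
assert list(r) == [0, 1, 2]                    # and r can be iterated again
```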