I realised I still don't quite understand how to implement an iterator class. So a class where one can "for-loop" over some of its contents.
I have looked at these answers (I still don't get it):
What exactly are Python's iterator, iterable, and iteration protocols?
Build a Basic Python Iterator
As far as I understand, an iterable is one that implements __iter__ and returns an iterator which is something that has __next__ implemented.
From this I understood that if I want my class to be both an iterator and an iterable, I must define __iter__ to return self and have __next__ defined. Am I wrong so far?
Here is a scaffold for my class:
class Wrapper:
    def __init__(self, low, high):
        self.store = OrderedDict()  # it is imported (omitted)
    def __getitem__(self, key):
        return self.store[key]
    def __iter__(self):
        return self
        # Option 3
        return self.store
    def __next__(self):
        # Option 1
        for key in self.store:
            return key
        # Option 2
        for key in self.store.keys():
            return key
Tried the above options, none worked. :(
What happens is that I have some py.test tests prepared to check whether the iteration works correctly, just a simple for loop, nothing fancy. The tests just run forever (or longer than my patience < 5 min), but the mock class has 5 items, so it should not take long.
What am I doing wrong?
You've got your concepts mixed up. You say
I realised I still don't quite understand how to implement an iterator class. So a class where one can "for-loop" over some of its contents.
but that's not how things work. If you just want to be able to perform for loops over instances of your class, you almost certainly shouldn't be making your instances iterators directly. You should write an __iter__ method that returns an iterator, either manually or by yielding, and you should not write a __next__:
# Returning an iterator manually:
class Wrapper:
    ...
    def __iter__(self):
        return iter(self.store)
# or with yield:
class Wrapper:
    ...
    def __iter__(self):
        for thing in self.store:
            yield thing
Iterators are technically iterable, but only once. To write an iterator, you would have __iter__ return self and have __next__ return the next item on each call, or raise StopIteration when the items are exhausted (and continue to raise StopIteration on any further calls). I'll write an iterator over the half-open interval [0, 10), since writing one for an OrderedDict would just delegate to the OrderedDict's iterator for everything too directly to be instructive:
class Iterator:
    def __init__(self):
        self.state = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.state == 10:
            raise StopIteration
        self.state += 1
        return self.state - 1
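A short sketch of the "only once" point, assuming Python 3. The names `UpToTenIterator` and `UpToTen` are made up for this illustration: a single iterator is used up after one pass, whereas an iterable hands out a fresh iterator on every loop.

```python
class UpToTenIterator:
    """Iterator over [0, 10), same logic as the Iterator class above."""
    def __init__(self):
        self.state = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == 10:
            raise StopIteration
        self.state += 1
        return self.state - 1


class UpToTen:
    """Iterable (hypothetical): each __iter__ call builds a new iterator."""
    def __iter__(self):
        return UpToTenIterator()


it = UpToTenIterator()
first_pass = list(it)     # consumes the iterator
second_pass = list(it)    # already exhausted, so nothing is left

numbers = UpToTen()
pass_one = list(numbers)  # fresh iterator
pass_two = list(numbers)  # another fresh iterator, same items again
```

Here `second_pass` ends up empty, while `pass_one` and `pass_two` both contain all ten numbers.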
Writing iterators manually is unusual even when you do want an iterator, though; it's much more common to use yield:
def iterator():
    for i in range(10):
        yield i
# or managing the index manually, to show what it would be like without range:
def iterator():
    i = 0
    while i < 10:
        yield i
        i += 1
You can make the following changes:
class Wrapper:
    def __init__(self, low, high):
        self.store = OrderedDict()
        self.__iter = None  # maintain state of self as iterator
    def __iter__(self):
        if self.__iter is None:
            self.__iter = iter(self.store)
        return self
    def __next__(self):
        try:
            return next(self.__iter)
        except StopIteration:  # support repeated iteration
            self.__iter = None
            raise
__next__ should return the next element of an iterator that is being iterated. And your __iter__ method should return self to properly implement the protocol. This ensures for instance, that a call to next during a loop drives forward the same iterator as the loop:
for x in wrapper:
    # do stuff with x
    next(wrapper)  # skip one
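As a rough, self-contained illustration of that skipping behaviour (assuming Python 3; the four sample keys and the `_iter` attribute name are invented for this sketch, and the constructor arguments are dropped):

```python
from collections import OrderedDict


class Wrapper:
    def __init__(self):
        # Sample data, assumed purely for the demonstration.
        self.store = OrderedDict(a=1, b=2, c=3, d=4)
        self._iter = None  # state of self as an iterator

    def __iter__(self):
        if self._iter is None:
            self._iter = iter(self.store)
        return self

    def __next__(self):
        try:
            return next(self._iter)
        except StopIteration:
            self._iter = None  # allow a later loop to start over
            raise


wrapper = Wrapper()
seen = []
for key in wrapper:
    seen.append(key)
    next(wrapper)  # skip one: advances the same iterator the loop uses

again = list(wrapper)  # state was reset, so a second pass sees everything
```

With an even number of items every second key is skipped. If the skipped item were the last one, the inner `next()` would raise StopIteration inside the loop body, which the for loop does not catch.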
For most purposes, implementing __iter__ should suffice:
class Wrapper:
    def __init__(self, low, high):
        self.store = OrderedDict()
    def __iter__(self):
        return iter(self.store)
I understand the difference between iterables and iterators and how to implement both. But what is the rationale behind the existence of iterables...
Someone says the iterator is only capable of keeping its current state and iterables are used to provide fresh iterators. But the fresh iterators can also be done by re-initializing the iterator instance. So what is the point of introducing the extra iterables and iter interface...?
Thanks in advance.
# the correct way
class OneToFourIterator:
    def __init__(self):
        self.state = 0
    def next(self):
        self.state += 1
        if self.state > 4:
            raise StopIteration
        return self.state
class OneToFourIterable:
    def __init__(self):
        pass
    def __iter__(self):
        return OneToFourIterator()
# below is the crazy way, and assume that in the fake python world,
# anything with a next method can be used in for-in loop
# and what the for-in loop does is to call the __init__ method acquiring
# a fresh iterator. Then call next method in each loop action. In this way,
# each for-in loop can get an independent iterator and each iterator has its own state.
class CrazyIterator:
    def __init__(self):
        self.state = 0
        return self  # if we allow __init__ method to return something
    def next(self):
        self.state += 1
        if self.state > 4:
            raise StopIteration
        return self.state
# so why do we need iterables and __iter__...I am confused. I am sorry this is not a good question.
I am trying to understand how iterators work. Below is my test, but I get an error at Iter(self): TypeError: this constructor takes no arguments. Can anyone help me? Thanks very much.
class TestIter:
    def __init__(self, value):
        self.value = value
    def __iter__(self):
        return Iter(self)
class Iter:
    def __int__(self, source):
        self.source = source
    def next(self):
        if self.source.value >= 10:
            raise StopIteration
        else:
            self.source.value += 1
            return self.source.value
test = TestIter(5)
for i in test:
    print(i)
exception TypeError
Raised when an operation or function is applied to
an object of inappropriate type. The associated value is a string
giving details about the type mismatch.
In your case it was __int__, which should have been __init__. And just as a suggestion, instead of using such a complicated way to build an iterator, simply use one class and call it directly.
Example:
class Count:
    """Iterator that counts upward forever."""
    def __init__(self, start=0):
        self.num = start
    def __iter__(self):
        return self
    def __next__(self):  # This will go to infinity, but you can apply your own logic
        num = self.num
        self.num += 1
        return num
Calling can be done by either this:
>>> c = Count()
>>> next(c)
0
Or this:
>>> for n in Count():
... print(n)
...
0
1
2
(this goes on forever)
You have a typo in your code. Check __int__ vs __init__ in your Iter.
Because of the typo, you do not define the __init__ and therefore you use the default, which indeed takes no arguments.
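For reference, a sketch of the question's two classes with the typo fixed, assuming Python 3 (where the method is spelled `__next__`; Python 2 uses `next`). The `__iter__` on `Iter` and the `result` variable are additions for this illustration.

```python
class TestIter:
    def __init__(self, value):
        self.value = value

    def __iter__(self):
        return Iter(self)


class Iter:
    def __init__(self, source):  # was misspelled __int__ in the question
        self.source = source

    def __iter__(self):
        # Added so the iterator is itself iterable, as the protocol expects.
        return self

    def __next__(self):  # Python 3 spelling; Python 2 calls this next
        if self.source.value >= 10:
            raise StopIteration
        self.source.value += 1
        return self.source.value


result = list(TestIter(5))
```

Note that, as in the question, iterating mutates the `TestIter` instance, so a second loop over the same object yields nothing.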
I'm trying to port a custom class from Python 2 to Python 3. I can't find the right syntax to port the iterator for the class. Here is a MVCE of the real class and my attempts to solve this so far:
Working Python 2 code:
class Temp:
    def __init__(self):
        self.d = dict()
    def __iter__(self):
        return self.d.iteritems()
temp = Temp()
for thing in temp:
    print(thing)
In the above code iteritems() breaks in Python 3. According to this highly voted answer, "dict.items now does the thing dict.iteritems did in python 2". So I tried that next:
class Temp:
    def __init__(self):
        self.d = dict()
    def __iter__(self):
        return self.d.items()
The above code yields "TypeError: iter() returned non-iterator of type 'dict_items'"
According to this answer, Python 3 requires iterable objects to provide a next() method in addition to the iter method. Well, a dictionary is also iterable, so in my use case I should be able to just pass dictionary's next and iter methods, right?
class Temp:
    def __init__(self):
        self.d = dict()
    def __iter__(self):
        return self.d.__iter__
    def next(self):
        return self.d.next
This time it's giving me "TypeError: iter() returned non-iterator of type 'method-wrapper'".
What am I missing here?
As the error message suggests, your __iter__ function does not return an iterator, which you can easily fix using the built-in iter function:
class Temp:
    def __init__(self):
        self.d = {}
    def __iter__(self):
        return iter(self.d.items())
This will make your class iterable.
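A small sketch contrasting the two versions under Python 3. The class name `BrokenTemp` and the sample entry `{"a": 1}` are made up for the illustration.

```python
class BrokenTemp:
    def __init__(self):
        self.d = {"a": 1}  # sample data, assumed

    def __iter__(self):
        # dict_items is a view: iterable, but it has no __next__,
        # so it is not an iterator.
        return self.d.items()


class Temp:
    def __init__(self):
        self.d = {"a": 1}  # sample data, assumed

    def __iter__(self):
        return iter(self.d.items())  # a real iterator over the view


try:
    iter(BrokenTemp())
    error = None
except TypeError as exc:
    error = str(exc)  # the "non-iterator" message from the question

pairs = list(Temp())
```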
Alternatively, you may write a generator yourself, like so:
    def __iter__(self):
        for key, item in self.d.items():
            yield key, item
If you want to be able to iterate over keys and items separately, i.e. in the form that the usual python3 dictionary can, you can provide additional functions, for example
class Temp:
    def __init__(self, dic):
        self.d = dic
    def __iter__(self):
        return iter(self.d)
    def keys(self):
        return self.d.keys()
    def items(self):
        return self.d.items()
    def values(self):
        return self.d.values()
I'm guessing from the way you phrased it that you don't actually want the next() method to be implemented if not needed. If you would, you would have to somehow turn your whole class into an iterator and somehow keep track of where you are momentarily in this iterator, because dictionaries themselves are not iterators. See also this answer.
I don't know what works in Python 2. But on Python 3 iterators can be most easily created using something called a generator. I am providing the name and the link so that you can research further.
class Temp:
    def __init__(self):
        self.d = {}
    def __iter__(self):
        for thing in self.d.items():
            yield thing
I have written a custom container object.
According to this page, I need to implement this method on my object:
__iter__(self)
However, upon following up the link to Iterator Types in the Python reference manual, there are no examples given of how to implement your own.
Can someone post a snippet (or link to a resource), that shows how to do this?
The container I am writing, is a map (i.e. stores values by unique keys).
dicts can be iterated like this:
for k, v in mydict.items():
In this case I need to be able to return two elements (a tuple?) in the iterator.
It is still not clear how to implement such an iterator (despite the several answers that have been kindly provided). Could someone please shed some more light on how to implement an iterator for a map-like container object? (i.e. a custom class that acts like a dict)?
I normally would use a generator function. Each time you use a yield statement, it will add an item to the sequence.
The following will create an iterator that yields five, and then every item in some_list.
def __iter__(self):
    yield 5
    yield from some_list
Pre-3.3, yield from didn't exist, so you would have to do:
def __iter__(self):
    yield 5
    for x in some_list:
        yield x
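Applied to the map-like container from the question, the generator approach might look like the following sketch (Python 3.3+). `KeyValueStore` and its internal `_data` dict are hypothetical stand-ins for the asker's own class and storage.

```python
class KeyValueStore:
    """Hypothetical map-like container backed by a plain dict."""
    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        # Like dict, iterating the container yields its keys.
        yield from self._data

    def items(self):
        # Yield (key, value) tuples so callers can unpack them.
        for key in self:
            yield key, self[key]


store = KeyValueStore()
store["x"] = 1
store["y"] = 2
keys = list(store)
pairs = [(k, v) for k, v in store.items()]
```

Each `yield key, self[key]` produces a two-element tuple, which is what makes `for k, v in store.items()` work.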
Another option is to inherit from the appropriate abstract base class from the collections module (collections.abc in Python 3) as documented here.
In case the container is its own iterator, you can inherit from collections.Iterator. You only need to implement the next method then.
An example is:
>>> from collections import Iterator
>>> class MyContainer(Iterator):
...     def __init__(self, *data):
...         self.data = list(data)
...     def next(self):
...         if not self.data:
...             raise StopIteration
...         return self.data.pop()
...
>>> c = MyContainer(1, "two", 3, 4.0)
>>> for i in c:
...     print i
...
4.0
3
two
1
While you are looking at the collections module, consider inheriting from Sequence, Mapping or another abstract base class if that is more appropriate. Here is an example for a Sequence subclass:
>>> from collections import Sequence
>>> class MyContainer(Sequence):
...     def __init__(self, *data):
...         self.data = list(data)
...     def __getitem__(self, index):
...         return self.data[index]
...     def __len__(self):
...         return len(self.data)
...
>>> c = MyContainer(1, "two", 3, 4.0)
>>> for i in c:
...     print i
...
1
two
3
4.0
NB: Thanks to Glenn Maynard for drawing my attention to the need to clarify the difference between iterators on the one hand and containers that are iterables rather than iterators on the other.
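The sessions above are Python 2. A Python 3 rendering of the same two ideas could look like this sketch, where the base classes live in `collections.abc` and the iterator method is spelled `__next__`. The name `PoppingIterator` is invented here to keep the two classes apart.

```python
from collections.abc import Iterator, Sequence


class PoppingIterator(Iterator):
    """Its own iterator: the Iterator ABC supplies __iter__ returning self."""
    def __init__(self, *data):
        self.data = list(data)

    def __next__(self):
        if not self.data:
            raise StopIteration
        return self.data.pop()


class MyContainer(Sequence):
    """An iterable container: Sequence derives __iter__ from __getitem__."""
    def __init__(self, *data):
        self.data = list(data)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


popped = list(PoppingIterator(1, "two", 3, 4.0))
in_order = list(MyContainer(1, "two", 3, 4.0))
```

As in the original sessions, the iterator version returns items in reverse (it pops from the end) and can be consumed only once, while the Sequence version can be looped over repeatedly.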
Usually __iter__() just returns self if you have already defined the next() method (so the object is its own iterator).
Here is a dummy example of such an iterator:
class Test(object):
    def __init__(self, data):
        self.data = data
    def next(self):
        if not self.data:
            raise StopIteration
        return self.data.pop()
    def __iter__(self):
        return self
but __iter__() can also be used like this:
http://mail.python.org/pipermail/tutor/2006-January/044455.html
The "iterable interface" in python consists of two methods __next__() and __iter__(). The __next__ function is the most important, as it defines the iterator behavior - that is, the function determines what value should be returned next. The __iter__() method is used to reset the starting point of the iteration. Often, you will find that __iter__() can just return self when __init__() is used to set the starting point.
See the following code for defining a Class Reverse which implements the "iterable interface" and defines an iterator over any instance from any sequence class. The __next__() method starts at the end of the sequence and returns values in reverse order of the sequence. Note that instances from a class implementing the "sequence interface" must define a __len__() and a __getitem__() method.
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, seq):
        self.data = seq
        self.index = len(seq)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
>>> rev = Reverse('spam')
>>> next(rev) # note no need to call iter()
'm'
>>> nums = Reverse(range(1,10))
>>> next(nums)
9
If your object contains a set of data you want to bind your object's iter to, you can cheat and do this:
>>> class foo:
        def __init__(self, *params):
            self.data = params
        def __iter__(self):
            if hasattr(self.data[0], "__iter__"):
                return self.data[0].__iter__()
            return self.data.__iter__()
>>> d = foo(6, 7, 3, 8, "ads", 6)
>>> for i in d:
        print i
6
7
3
8
ads
6
To answer the question about mappings: your provided __iter__ should iterate over the keys of the mapping. The following is a simple example that creates a mapping x -> x * x and works on Python3 extending the ABC mapping.
import collections.abc
class MyMap(collections.abc.Mapping):
    def __init__(self, n):
        self.n = n
    def __getitem__(self, key):  # given a key, return its value
        if 0 <= key < self.n:
            return key * key
        else:
            raise KeyError('Invalid key')
    def __iter__(self):  # iterate over all keys
        for x in range(self.n):
            yield x
    def __len__(self):
        return self.n
m = MyMap(5)
for k, v in m.items():
    print(k, '->', v)
# 0 -> 0
# 1 -> 1
# 2 -> 4
# 3 -> 9
# 4 -> 16
In case you don't want to inherit from dict as others have suggested, here is a direct answer to the question of how to implement __iter__, using a crude example of a custom dict:
import collections.abc
class Attribute:
    def __init__(self, key, value):
        self.key = key
        self.value = value
class Node(collections.abc.Mapping):  # collections.Mapping before Python 3.3
    def __init__(self):
        self.type = ""
        self.attrs = []  # List of Attributes
    def __iter__(self):
        for attr in self.attrs:
            yield attr.key
That uses a generator, which is well described here.
Since we're inheriting from Mapping, you need to also implement __getitem__ and __len__:
    def __getitem__(self, key):
        for attr in self.attrs:
            if key == attr.key:
                return attr.value
        raise KeyError
    def __len__(self):
        return len(self.attrs)
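Putting the pieces together, a self-contained sketch for Python 3 might read as follows. The sample attributes `id` and `name` are invented for the demonstration, and `KeyError(key)` carries the key only for a friendlier message.

```python
import collections.abc


class Attribute:
    def __init__(self, key, value):
        self.key = key
        self.value = value


class Node(collections.abc.Mapping):
    def __init__(self):
        self.type = ""
        self.attrs = []  # list of Attribute objects

    def __iter__(self):
        for attr in self.attrs:
            yield attr.key

    def __getitem__(self, key):
        for attr in self.attrs:
            if key == attr.key:
                return attr.value
        raise KeyError(key)

    def __len__(self):
        return len(self.attrs)


node = Node()
node.attrs.append(Attribute("id", 7))          # sample data, assumed
node.attrs.append(Attribute("name", "root"))   # sample data, assumed

keys = list(node)            # uses __iter__
pairs = list(node.items())   # items() is supplied by the Mapping ABC
has_id = "id" in node        # __contains__ is supplied by the Mapping ABC
```

Only `__getitem__`, `__iter__` and `__len__` are written by hand; `keys()`, `values()`, `items()`, `get()` and `in` come from the Mapping mixin.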
One option that might work for some cases is to make your custom class inherit from dict. This seems like a logical choice if it acts like a dict; maybe it should be a dict. This way, you get dict-like iteration for free.
class MyDict(dict):
    def __init__(self, custom_attribute):
        self.bar = custom_attribute
mydict = MyDict('Some name')
mydict['a'] = 1
mydict['b'] = 2
print mydict.bar
for k, v in mydict.items():
    print k, '=>', v
Output:
Some name
a => 1
b => 2
Here is an example of inheriting from dict and modifying its __iter__, for instance to skip key 2 in a for loop:
# method 1
class Dict(dict):
    def __iter__(self):
        keys = self.keys()
        for i in keys:
            if i == 2:
                continue
            yield i
# method 2
class Dict(dict):
    def __iter__(self):
        for i in super(Dict, self).__iter__():
            if i == 2:
                continue
            yield i
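A quick usage sketch of method 2 under Python 3 (CPython), with made-up sample data. It also shows one thing worth knowing: only iteration that goes through `__iter__` is affected, while the key view returned by `keys()` iterates the underlying dict directly.

```python
class Dict(dict):
    def __iter__(self):
        for key in super().__iter__():
            if key == 2:
                continue  # hide key 2 from plain iteration
            yield key


d = Dict({1: "a", 2: "b", 3: "c"})  # sample data, assumed

looped = [key for key in d]   # goes through the overridden __iter__
via_keys = list(d.keys())     # the dict_keys view does not call __iter__
still_there = 2 in d          # membership tests are unaffected
```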
As part of some WSGI middleware I want to write a python class that wraps an iterator to implement a close method on the iterator.
This works fine when I try it with an old-style class, but throws a TypeError when I try it with a new-style class. What do I need to do to get this working with a new-style class?
Example:
class IteratorWrapper1:
    def __init__(self, otheriter):
        self._iterator = otheriter
        self.next = otheriter.next
    def __iter__(self):
        return self
    def close(self):
        if getattr(self._iterator, 'close', None) is not None:
            self._iterator.close()
        # other arbitrary resource cleanup code here
class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self._iterator = otheriter
        self.next = otheriter.next
    def __iter__(self):
        return self
    def close(self):
        if getattr(self._iterator, 'close', None) is not None:
            self._iterator.close()
        # other arbitrary resource cleanup code here
if __name__ == "__main__":
    for i in IteratorWrapper1(iter([1, 2, 3])):
        print i
    for j in IteratorWrapper2(iter([1, 2, 3])):
        print j
Gives the following output:
1
2
3
Traceback (most recent call last):
...
TypeError: iter() returned non-iterator of type 'IteratorWrapper2'
What you're trying to do makes sense, but there's something evil going on inside Python here.
class foo(object):
    c = 0
    def __init__(self):
        self.next = self.next2
    def __iter__(self):
        return self
    def next(self):
        if self.c == 5: raise StopIteration
        self.c += 1
        return 1
    def next2(self):
        if self.c == 5: raise StopIteration
        self.c += 1
        return 2
it = iter(foo())
# Outputs: <bound method foo.next2 of <__main__.foo object at 0xb7d5030c>>
print it.next
# 2
print it.next()
# 1?!
for x in it:
    print x
foo() is an iterator which modifies its next method on the fly--perfectly legal anywhere else in Python. The iterator we create, it, has the method we expect: it.next is next2. When we use the iterator directly, by calling next(), we get 2. Yet, when we use it in a for loop, we get the original next, which we've clearly overwritten.
I'm not familiar with Python internals, but it seems like an object's "next" method is being cached in tp_iternext (http://docs.python.org/c-api/typeobj.html#tp_iternext), and then it's not updated when the class is changed.
This is definitely a Python bug. Maybe this is described in the generator PEPs, but it's not in the core Python documentation, and it's completely inconsistent with normal Python behavior.
You could work around this by keeping the original next function, and wrapping it explicitly:
class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self.wrapped_iter_next = otheriter.next
    def __iter__(self):
        return self
    def next(self):
        return self.wrapped_iter_next()
for j in IteratorWrapper2(iter([1, 2, 3])):
    print j
... but that's obviously less efficient, and you should not have to do that.
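The code in this thread is Python 2. For readers on Python 3, the same workaround could be sketched as below: the iterator method is spelled `__next__` and is defined on the class, so `for` loops and `next()` can find it. The class name `ClosingIteratorWrapper` and the `closed` flag are assumptions added only so the behaviour can be observed.

```python
class ClosingIteratorWrapper:
    def __init__(self, otheriter):
        self._iterator = otheriter
        self.closed = False  # hypothetical flag, for the demonstration only

    def __iter__(self):
        return self

    def __next__(self):
        # Defined on the class and delegating explicitly to the wrapped iterator.
        return next(self._iterator)

    def close(self):
        close = getattr(self._iterator, "close", None)
        if close is not None:
            close()
        # other arbitrary resource cleanup code would go here
        self.closed = True


wrapper = ClosingIteratorWrapper(iter([1, 2, 3]))
items = list(wrapper)
wrapper.close()
```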
There are a bunch of places where CPython take surprising shortcuts based on class properties instead of instance properties. This is one of those places.
Here is a simple example that demonstrates the issue:
class DynamicNext(object):
    def __init__(self):
        self.next = lambda: 42
And here's what happens:
>>> instance = DynamicNext()
>>> next(instance)
…
TypeError: DynamicNext object is not an iterator
>>>
Now, digging into the CPython source code (from 2.7.2), here's the implementation of the next() builtin:
static PyObject *
builtin_next(PyObject *self, PyObject *args)
{
    …
    if (!PyIter_Check(it)) {
        PyErr_Format(PyExc_TypeError,
            "%.200s object is not an iterator",
            it->ob_type->tp_name);
        return NULL;
    }
    …
}
And here's the implementation of PyIter_Check:
#define PyIter_Check(obj) \
    (PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
     (obj)->ob_type->tp_iternext != NULL && \
     (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)
The first line, PyType_HasFeature(…), is, after expanding all the constants and macros and stuff, equivalent to DynamicNext.__class__.__flags__ & 1L<<17 != 0:
>>> instance.__class__.__flags__ & 1L<<17 != 0
True
So that check obviously isn't failing… Which must mean that the next check — (obj)->ob_type->tp_iternext != NULL — is failing.
In Python, this line is roughly (roughly!) equivalent to hasattr(type(instance), "next"):
>>> type(instance)
__main__.DynamicNext
>>> hasattr(type(instance), "next")
False
Which obviously fails because the DynamicNext type doesn't have a next method — only instances of that type do.
Now, my CPython foo is weak, so I'm going to have to start making some educated guesses here… But I believe they are accurate.
When a CPython type is created (that is, when the interpreter first evaluates the class block and the class' metaclass' __new__ method is called), the values on the type's PyTypeObject struct are initialized… So if, when the DynamicNext type is created, no next method exists, the tp_iternext field will be set to NULL, causing PyIter_Check to return false.
Now, as Glenn points out, this is almost certainly a bug in CPython… Especially given that correcting it would only impact performance when either the object being tested isn't iterable or dynamically assigns a next method (very approximately):
#define PyIter_Check(obj) \
    (((PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
       (obj)->ob_type->tp_iternext != NULL && \
       (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)) || \
     (PyObject_HasAttrString((obj), "next") && \
      PyCallable_Check(PyObject_GetAttrString((obj), "next"))))
Edit: after a little bit of digging, the fix would not be this simple, because at least some portions of the code assume that, if PyIter_Check(it) returns true, then *it->ob_type->tp_iternext will exist… Which isn't necessarily the case (i.e., because the next function exists on the instance, not the type).
SO! That's why surprising things happen when you try to iterate over a new-style instance with a dynamically assigned next method.
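The same type-versus-instance lookup can be observed in Python 3, where the method is named `__next__`. The following is a small sketch rather than a statement about CPython internals: an attribute set on the instance is ignored by `next()`, while one set on the class is picked up.

```python
class DynamicNext:
    def __iter__(self):
        return self


instance = DynamicNext()
instance.__next__ = lambda: 42  # assigned on the instance only

try:
    next(instance)
    instance_result = "worked"
except TypeError:
    # The type itself has no __next__, so this is not an iterator.
    instance_result = "TypeError"

# Assigning on the class makes the method visible to next() and for loops.
DynamicNext.__next__ = lambda self: 42
class_result = next(instance)
```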
It looks like the built-in iter doesn't check for a callable next on the instance but on the class, and IteratorWrapper2 doesn't have any next. Below is a simpler version of your problem:
class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self.next = otheriter.next
    def __iter__(self):
        return self
it = iter([1, 2, 3])
myit = IteratorWrapper2(it)
IteratorWrapper2.next  # fails that is why iter(myit) fails
iter(myit)  # fails
so the solution would be to return otheriter in __iter__
class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self.otheriter = otheriter
    def __iter__(self):
        return self.otheriter
or write your own next, wrapping inner iterator
class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self.otheriter = otheriter
    def next(self):
        return self.otheriter.next()
    def __iter__(self):
        return self
Though I do not understand why iter doesn't just use the instance's self.next.
Just return the iterator. That's what __iter__ is for. It makes no sense to try to monkey-patch the object into being an iterator and return it when you already have an iterator.
EDIT: Now with two methods: first, monkey-patching the wrapped iterator; second, kitty-wrapping the iterator.
class IteratorWrapperMonkey(object):
    def __init__(self, otheriter):
        self.otheriter = otheriter
        self.otheriter.close = self.close
    def close(self):
        print "Closed!"
    def __iter__(self):
        return self.otheriter
class IteratorWrapperKitten(object):
    def __init__(self, otheriter):
        self.otheriter = otheriter
    def __iter__(self):
        return self
    def next(self):
        return self.otheriter.next()
    def close(self):
        print "Closed!"
class PatchableIterator(object):
    def __init__(self, inp):
        self.iter = iter(inp)
    def next(self):
        return self.iter.next()
    def __iter__(self):
        return self
if __name__ == "__main__":
    monkey = IteratorWrapperMonkey(PatchableIterator([1, 2, 3]))
    for i in monkey:
        print i
    monkey.close()
    kitten = IteratorWrapperKitten(iter([1, 2, 3]))
    for i in kitten:
        print i
    kitten.close()
Both methods work with both new-style and old-style classes.