Why does `__iter__` not work when defined as an instance variable?

If I define the __iter__ method as follows, it won't work:
class A:
    def __init__(self):
        self.__iter__ = lambda: iter('text')

for i in A().__iter__():
    print(i)
iter(A())
Result:
t
e
x
t
Traceback (most recent call last):
File "...\mytest.py", line 10, in <module>
iter(A())
TypeError: 'A' object is not iterable
As you can see, calling A().__iter__() works, but A() is not iterable.
However if I define __iter__ for the class, then it will work:
class A:
    def __init__(self):
        self.__class__.__iter__ = staticmethod(lambda: iter('text'))
        # or:
        # self.__class__.__iter__ = lambda s: iter('text')

for i in A():
    print(i)
iter(A())
# will print:
# t
# e
# x
# t
Does anyone know why python has been designed like this? i.e. why __iter__ as instance variable does not work? Don't you find it unintuitive?

It is done by design. You can find the thorough description here: https://docs.python.org/3/reference/datamodel.html#special-method-lookup
Short answer: the special method must be set on the class object itself in order to be consistently invoked by the interpreter.
Long answer: the idea behind this is to speed up well-known constructions. In your example:
class A:
    def __init__(self):
        self.__iter__ = lambda: iter('text')
How often are you going to write code like this in real life? So Python skips the instance dictionary lookup for implicit special-method calls: iter(A()) simply does not "see" self.__iter__, which in this case lives in self.__dict__['__iter__'].
Implicit special-method lookup also bypasses __getattribute__, both on the instance and on the metaclass, which leaves significant scope for speed optimizations in the interpreter.
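To make the difference concrete, here is a minimal sketch (Python 3). The names InstanceOnly, OnClass and is_iterable are made up for this illustration; the point is only that explicit attribute access finds the instance attribute while iter() looks at the type.

```python
class InstanceOnly:
    def __init__(self):
        # Stored in the instance __dict__; implicit lookup ignores it.
        self.__iter__ = lambda: iter("text")


class OnClass:
    # Defined on the type, so iter() and for-loops can find it.
    def __iter__(self):
        return iter("text")


def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


explicit = list(InstanceOnly().__iter__())  # ordinary attribute access works
instance_ok = is_iterable(InstanceOnly())   # implicit lookup skips the instance
class_ok = is_iterable(OnClass())           # implicit lookup uses the type
print(explicit, instance_ok, class_ok)
```

Only the class-level definition makes the object iterable; the instance attribute is reachable only when you call it by name.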


`setattr` fails with `AttributeError` in CPython?

For some reason this fails in Python 3.8:
setattr(iter(()), '_hackkk', 'bad idea')
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-c046f8521130> in <module>
----> 1 setattr(iter(()), '_hackkk', 'bad idea')
AttributeError: 'tuple_iterator' object has no attribute '_hackkk'
How do I attach random data where I shouldn't, i.e., on an iterator or a generator?
You can attach data only to objects that have a __dict__ member. Not all objects have one: for example, instances of built-in classes such as int, float and list do not. This is an optimization, because otherwise instances of those classes would need too much memory; a dictionary has a fairly large memory footprint.
For normal classes one can also use __slots__, which removes the __dict__ member and prohibits dynamically adding attributes to objects of that class. E.g.
class A:
    pass
setattr(A(),'b', 2)
works, but
class B:
    __slots__ = 'b'
setattr(B(),'c', 2)
doesn't work, as class B has no slot with name c and no __dict__.
Thus, the answer to your question is: for some classes (as the tuple_iterator) you cannot.
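If you want to check this for a given object before trying, a rough sketch along these lines may help. Plain, Slotted and can_take_new_attribute are names invented for the example, and the attribute name "_extra" is arbitrary.

```python
class Plain:
    pass


class Slotted:
    __slots__ = ("b",)


def can_take_new_attribute(obj, name="_extra"):
    """Try to set an arbitrary attribute; report whether it worked."""
    try:
        setattr(obj, name, 1)
    except (AttributeError, TypeError):
        return False
    return True


# Which objects accept a brand-new attribute?
results = {
    "Plain": can_take_new_attribute(Plain()),
    "Slotted": can_take_new_attribute(Slotted()),
    "tuple_iterator": can_take_new_attribute(iter(())),
}

# Which objects carry a per-instance __dict__?
has_dict = {
    "Plain": hasattr(Plain(), "__dict__"),
    "Slotted": hasattr(Slotted(), "__dict__"),
    "tuple_iterator": hasattr(iter(()), "__dict__"),
}
print(results)
print(has_dict)
```

In this sketch the two dictionaries agree: only the object with a __dict__ accepts the new attribute.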
If you really need to, you can wrap tuple_iterator in a class with __dict__ and append the new attribute to the wrapper-object:
class IterWrapper:
    def __init__(self, it):
        self.it=it
    def __next__(self):
        return next(self.it)
    def __iter__(self): # for testing
        return self
and now:
iw=IterWrapper(iter((1,2,3)))
setattr(iw, "a", 2)
print(iw.a) # prints 2
print(list(iw)) # prints [1,2,3]
has the desired behavior.

What really makes an object callable in python [duplicate]

I would like to do the following:
class A(object): pass
a = A()
a.__int__ = lambda self: 3
i = int(a)
Unfortunately, this throws:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string or a number, not 'A'
This only seems to work if I assign the "special" method to the class A instead of an instance of it. Is there any recourse?
One way I thought of was:
def __int__(self):
    # No infinite loop
    if type(self).__int__.im_func != self.__int__.im_func:
        return self.__int__()
    raise NotImplementedError()
But that looks rather ugly.
Thanks.
Python always looks up special methods on the class, not the instance (except in the old, aka "legacy", kind of classes -- they're deprecated and have gone away in Python 3, because of the quirky semantics that mostly comes from looking up special methods on the instance, so you really don't want to use them, believe me!-).
To make a special class whose instances can have special methods independent from each other, you need to give each instance its own class -- then you can assign special methods on the instance's (individual) class without affecting other instances, and live happily ever after. If you want to make it look like you're assigning to an attribute of the instance, while actually assigning to an attribute of the individualized per-instance class, you can get that with a special __setattr__ implementation, of course.
Here's the simple case, with explicit "assign to class" syntax:
>>> class Individualist(object):
...     def __init__(self):
...         self.__class__ = type('GottaBeMe', (self.__class__, object), {})
...
>>> a = Individualist()
>>> b = Individualist()
>>> a.__class__.__int__ = lambda self: 23
>>> b.__class__.__int__ = lambda self: 42
>>> int(a)
23
>>> int(b)
42
>>>
and here's the fancy version, where you "make it look like" you're assigning the special method as an instance attribute (while behind the scene it still goes to the class of course):
>>> class Sophisticated(Individualist):
...     def __setattr__(self, n, v):
...         if n[:2]=='__' and n[-2:]=='__' and n!='__class__':
...             setattr(self.__class__, n, v)
...         else:
...             object.__setattr__(self, n, v)
...
>>> c = Sophisticated()
>>> d = Sophisticated()
>>> c.__int__ = lambda self: 54
>>> d.__int__ = lambda self: 88
>>> int(c)
54
>>> int(d)
88
The only recourse that works for new-style classes is to have a method on the class that calls the attribute on the instance (if it exists):
class A(object):
    def __int__(self):
        if '__int__' in self.__dict__:
            return self.__int__()
        raise ValueError
a = A()
a.__int__ = lambda: 3
int(a)
Note that a.__int__ will not be a method (only functions that are attributes of the class will become methods) so self is not passed implicitly.
I have nothing to add about the specifics of overriding __int__. But I noticed one thing about your sample that bears discussing.
When you manually assign new methods to an object, "self" is not automatically passed in. I've modified your sample code to make my point clearer:
class A(object): pass
a = A()
a.foo = lambda self: 3
a.foo()
If you run this code, it throws an exception because you passed in 0 arguments to "foo" and 1 is required. If you remove the "self" it works fine.
Python only automatically prepends "self" to the arguments if it had to look up the method in the class of the object and the function it found is a "normal" function. (Examples of "abnormal" functions: class methods, callable objects, bound method objects.) If you stick callables into the object itself, they won't automatically get "self".
If you want self there, use a closure.
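As a rough illustration of the closure idea (Python 3), the sketch below captures the instance explicitly. The names make_getter, get_value and doubled are invented for the example. The second option, types.MethodType from the standard library, is not mentioned above; it is shown only as an alternative way to get a bound "self".

```python
import types


class A:
    pass


a = A()
a.value = 3


# Option 1: a closure that captures the instance explicitly.
def make_getter(obj):
    def getter():
        return obj.value
    return getter


a.get_value = make_getter(a)


# Option 2: bind a plain function to the instance with types.MethodType.
def doubled(self):
    return self.value * 2


a.doubled = types.MethodType(doubled, a)

closure_result = a.get_value()  # no arguments needed; obj is captured
bound_result = a.doubled()      # self is supplied by the bound method
print(closure_result, bound_result)
```

Note that neither option helps with special methods such as __int__; those are still looked up on the class.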

Best way to implement/call a class that returns an immutable value?

I would like to do something like this in Python:
result = SomeClass(some_argument)
Here is the catch though. I don't want the result to be an instance but an immutable object (int, for example). Basically the whole role of the class is to return a value calculated from the argument. I am using a class and not a function for DRY purposes.
Since the above code won't work because it will always return an instance of SomeClass what would be the best alternative?
My only idea is to have a static method, but I don't like it:
result = SomeClass.static_method(some_argument)
You can override __new__. This is rarely a good idea and/or necessary though ...
>>> class Foo(object):
...     def __new__(cls):
...         return 1
...
>>> Foo()
1
>>> type(Foo())
<type 'int'>
If you don't return an instance of cls, __init__ will never be called.
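A small sketch of that last point, written for Python 3 (the session above is Python 2). ReturnsInt, ReturnsInstance and the calls list are names made up for the illustration; the list just records which __init__ methods actually ran.

```python
calls = []


class ReturnsInt:
    def __new__(cls, x):
        # Returns a plain int, not an instance of cls.
        return x * 2

    def __init__(self, x):
        calls.append("ReturnsInt.__init__")


class ReturnsInstance:
    def __new__(cls, x):
        # Returns a real instance of cls, so __init__ runs afterwards.
        return super().__new__(cls)

    def __init__(self, x):
        calls.append("ReturnsInstance.__init__")


r1 = ReturnsInt(21)
r2 = ReturnsInstance(21)
print(r1, type(r1).__name__, calls)
```

Only ReturnsInstance.__init__ is recorded, because ReturnsInt.__new__ handed back an int rather than an instance of its own class.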
Basically class methods are the way to go if you have a factory method.
About the result - it really depends on what kind of immutability you seek, but basically namedtuple does a great job for encapsulating things and is also immutable (like normal tuples):
from collections import namedtuple
class FactoryClass(object):
    _result_type = namedtuple('ProductClass', ['prod', 'sum'])
    @classmethod
    def make_object(cls, arg1, arg2):
        return cls._result_type(prod=arg1 * arg2, sum=arg1 + arg2)
>>> FactoryClass.make_object(2,3)
ProductClass(prod=6, sum=5)
>>> x = _
>>> x.prod = 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

Subclassing and overriding a generator function in python

I need to override a method of a parent class, which is a generator, and am wondering the correct way to do this. Is there anything wrong with the following, or a more efficient way?
class A:
    def gen(self):
        yield 1
        yield 2
class B(A):
    def gen(self):
        yield 3
        for n in super().gen():
            yield n
For Python 3.3 and up, the best, most general way to do this is:
class A:
    def gen(self):
        yield 1
        yield 2
class B(A):
    def gen(self):
        yield 3
        yield from super().gen()
This uses the yield from syntax (new in Python 3.3) for delegating to a subgenerator. It's better than the other solutions because it actually hands control to the generator it delegates to: if that generator supports .send and .throw for passing values and exceptions in, delegation means it actually receives them. Explicitly looping and yielding one item at a time receives those values in the gen wrapper, not in the generator that actually produces the values, and the same problem applies to other solutions such as itertools.chain.
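The difference with .send can be seen in a small sketch. The names inner, delegating, looping and drive are invented for this illustration: inner records whatever is sent into it, and the two wrappers differ only in how they forward to it.

```python
def inner(log):
    """Subgenerator that records every value sent into it."""
    while True:
        value = yield
        log.append(value)


def delegating(log):
    # send() is forwarded to inner by yield from.
    yield from inner(log)


def looping(log):
    # send() lands on this wrapper's own yield; inner only sees next().
    for item in inner(log):
        yield item


def drive(wrapper):
    """Send two values through the wrapper and return what inner saw."""
    log = []
    gen = wrapper(log)
    next(gen)       # advance to the first yield
    gen.send("a")
    gen.send("b")
    return log


with_yield_from = drive(delegating)
with_for_loop = drive(looping)
print(with_yield_from, with_for_loop)
```

With yield from, inner receives "a" and "b"; with the explicit loop, the sent values are swallowed by the wrapper and inner only sees None.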
What you have looks fine, but is not the only approach. What's important about a generator function is that it returns an iterable object. Your subclass could thus instead directly create an iterable, for example:
import itertools
class B(A):
    def gen(self):
        return itertools.chain([3], super().gen())
The better approach is going to depend on exactly what you're doing; the above looks needlessly complex, but I wouldn't want to generalize from such a simple example.
To call a method from a subclass you need the keyword super.
New Source Code:
class B(A):
    def gen(self):
        yield 3
        for n in super().gen():
            yield n
This:
b = B()
for i in b.gen():
    print(i)
produces the output:
3
1
2
In the first Iteration your generator stops at '3', for the following iterations it just goes on as the superclass normally would.
This Question provides a really good and lengthy explanation of generators, iterators and the yield- keyword:
What does the "yield" keyword do in Python?
Your code is correct.
Or rather, I don't see problem in it and it apparently runs correctly.
The only thing I can think of is the following one.
.
Post-scriptum
For new-style classes, see other answers that use super()
But super() only works for new-style classes
Anyway, this answer could be useful at least, but only, for classic-style classes.
.
When the interpreter arrives on the instruction for n in A.gen(self):, it must find the function A.gen.
The notation A.gen doesn't mean that the object A.gen is INSIDE the object A.
The object A.gen is SOMEWHERE in the memory and the interpreter will know where to find it by obtaining the needed information (an address) from A.__dict__['gen'] , in which A.__dict__ is the namespace of A.
So, finding the function object A.gen in the memory requires a lookup in A.__dict__
But to perform this lookup, the interpreter must first find the object A itself.
So, when it arrives on the instruction for n in A.gen(self): , it first searches if the identifier A is among the local identifiers, that is to say it searches for the string 'A' in the local namespace of the function (of which I don't know the name).
Since it is not, the interpreter goes outside the function and searches for this identifier at the module level, in the global namespace (which is globals() )
At this point, it may be that the global namespace would have hundreds or thousands of attributes names among which to perform the lookup for 'A'.
However, A has very few attributes: the keys of its __dict__ are only '__module__', 'gen' and '__doc__' (to see that, run print A.__dict__)
So it would be a pity if the small search for the string 'gen' in A.__dict__ had to come after a search among hundreds of items in globals(), the dictionary namespace of the module level.
.
That's why I suggest another way to make the interpreter able to find the function A.gen
class A:
    def gen(self):
        yield 1
        yield 2
class BB(A):
    def gen(self):
        yield 3
        for n in self.__class__.__bases__[0].gen(self):
            yield n
bb = BB()
print list(bb.gen()) # prints [3, 1, 2]
self.__class__ is the class from which the instance was instantiated, that is to say BB
self.__class__.__bases__ is a tuple containing the base classes of BB
Presently there is only one element in this tuple, so self.__class__.__bases__[0] is A
__class__ and __bases__ are names of special attributes that aren't listed in __dict__;
In fact __class__, __bases__ and __dict__ are special attributes of a similar nature; they are Python-provided attributes, see:
http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html
.
Well, what I mean, in the end, is that there are few elements in self.__class__ and in self.__class__.__bases__, so it is reasonable to think that the successive lookups in these objects to finally reach A.gen will be faster than the lookup for 'A' in the global namespace when that namespace contains hundreds of elements.
Maybe that's trying to do too much optimization, maybe not.
This answer is mainly to give information on the underlying implied mechanisms, that I personally find interesting to know.
.
Edit
You can obtain the same as your code with a more concise instruction
class A:
    def gen(self):
        yield 1
        yield 2
class Bu(A):
    def gen(self):
        yield 3
        for n in A.gen(self):
            yield n
b = Bu()
print 'list(b.gen()) ==',list(b.gen())
from itertools import chain
w = chain(iter((3,)),xrange(1,3))
print 'list(w) ==',list(w)
produces
list(b.gen()) == [3, 1, 2]
list(w) == [3, 1, 2]
If A.gen() may also contain a return statement, then you also need to make sure your override returns that value. This is easiest done as follows:
class A:
    def gen(self):
        yield 1
        return 2
class B(A):
    def gen(self):
        yield 3
        ret = yield from super().gen()
        return ret
This gives:
>>> i = A().gen()
>>> next(i)
1
>>> next(i)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration: 2
>>> i = B().gen()
>>> next(i)
3
>>> next(i)
1
>>> next(i)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration: 2
Without an explicit return statement, the last line is StopIteration instead of StopIteration: 2.

The best way to invoke methods in Python class declarations?

Say I am declaring a class C and a few of the declarations are very similar. I'd like to use a function f to reduce code repetition for these declarations. It's possible to just declare and use f as usual:
>>> class C(object):
...     def f(num):
...         return '<' + str(num) + '>'
...     v = f(9)
...     w = f(42)
...
>>> C.v
'<9>'
>>> C.w
'<42>'
>>> C.f(4)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unbound method f() must be called with C instance as first argument (got int instance instead)
Oops! I've inadvertently exposed f to the outside world, but it doesn't take a self argument (and can't for obvious reasons). One possibility would be to del the function after I use it:
>>> class C(object):
...     def f(num):
...         return '<' + str(num) + '>'
...     v = f(9)
...     del f
...
>>> C.v
'<9>'
>>> C.f
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'f'
But what if I want to use f again later, after the declaration? It won't do to delete the function. I could make it "private" (i.e., prefix its name with __) and give it the @staticmethod treatment, but invoking staticmethod objects through abnormal channels gets very funky:
>>> class C(object):
...     @staticmethod
...     def __f(num):
...         return '<' + str(num) + '>'
...     v = __f.__get__(1)(9) # argument to __get__ is ignored...
...
>>> C.v
'<9>'
I have to use the above craziness because staticmethod objects, which are descriptors, are not themselves callable. I need to recover the function wrapped by the staticmethod object before I can call it.
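For what it's worth, a slightly less funky way to recover the wrapped function is the staticmethod object's __func__ attribute. The sketch below is Python 3 and uses a single-underscore name _f instead of __f to keep name mangling out of the picture; that renaming is my choice for the example, not part of the question. (On Python 3.10 and later, staticmethod objects can also be called directly, so the restriction described above applies to older versions.)

```python
class C:
    @staticmethod
    def _f(num):
        return '<' + str(num) + '>'

    # Inside the class body, _f is still the raw staticmethod object;
    # __func__ is the plain function it wraps.
    v = _f.__func__(9)
    w = _f.__func__(42)

    def method_using_f(self, x):
        # After the class is built, normal attribute access works.
        return self._f(x * 2)


later = C._f(4)
from_method = C().method_using_f(2)
print(C.v, C.w, later, from_method)
```

This keeps f usable both during the class declaration and afterwards from within the class.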
There has got to be a better way to do this. How can I cleanly declare a function in a class, use it during its declaration, and also use it later from within the class? Should I even be doing this?
Quite simply, the solution is that f does not need to be a member of the class. I am assuming that your thought-process has gone through a Javaish language filter causing the mental block. It goes a little something like this:
def f(num):
    return '<' + str(num) + '>'
class C(object):
    v = f(9)
    w = f(42)
Then when you want to use f again, just use it
>>> f(4)
'<4>'
I think the moral of the tale is "In Python, you don't have to force everything into a class".
Extending Ali A's answer,
if you really want to avoid f in the module namespace (and using a non-exported name like _f, or setting __all__ isn't sufficient), then
you could achieve this by creating the class within a closure.
def create_C():
    def f(num):
        return '<' + str(num) + '>'
    class C(object):
        v = f(9)
        def method_using_f(self, x): return f(x*2)
    return C
C=create_C()
del create_C
This way C has access to f within its definition and methods, but nothing else does (barring fairly involved introspection of its methods: C.method_using_f.im_func.func_closure).
This is probably overkill for most purposes though; documenting that f is internal by using the "_" prefix naming convention should generally be sufficient.
[Edit] One other option is to hold a reference to the pre-wrapped function object in the methods you wish to use it in. For example, by setting it as a default argument:
class C(object):
    def f(num):
        return '<' + str(num) + '>'
    v = f(9)
    def method_using_f(self, x, f=f): return f(x*2)
    del f
(Though I think the closure approach is probably better)
I believe you are trying to do this:
>>> class C():
...     class F():
...         def __call__(self,num):
...             return "<"+str(num)+">"
...     f=F()
...     v=f(9)
>>> C.v
'<9>'
>>> C.f(25)
'<25>'
>>>
Maybe there is better or more pythonic solution...
"declare a function in a class, use it during its declaration, and also use it later from within the class"
Sorry. Can't be done.
"Can't be done" doesn't seem to get along with Python
This is one possibility:
class _C:
    # Do most of the function definitions in here
    @classmethod
    def f(cls):
        return 'boo'
class C(_C):
    # Do the subsequent decoration in here
    v = _C.f()
One option: write a better staticmethod:
class staticfunc(object):
    def __init__(self, func):
        self.func = func
    def __call__(self, *args, **kw):
        return self.func(*args, **kw)
    def __repr__(self):
        return 'staticfunc(%r)' % self.func
Let's begin from the beginning.
"declare a function in a class, use it during its declaration, and also use it later from within the class"
Sorry. Can't be done. "In a class" contradicts "used during declaration".
In a class means created as part of the declaration.
Used during declaration means it exists outside the class. Often as a meta class. However, there are other ways.
It's not clear what C.w and C.v are supposed to be. Are they just strings? If so, an external function f is the best solution. The "not clutter the namespace" is a bit specious. After all, you want to use it again.
It's in the same module as C. That's why Python has modules. It binds the function and class together.
import myCmod
myCmod.C.w
myCmod.C.v
myCmod.f(42)
If w and v aren't simple strings, there's a really good solution that gives a lot of flexibility.
Generally, for class-level ("static") variables like this, we can use other classes. It's not possible to completely achieve the desired API, but this is close.
>>> class F(object):
        def __init__( self, num ):
            self.value= num
            self.format= "<%d>" % ( num, )
>>> class C(object):
        w= F(42)
        v= F(9)
>>> C.w
<__main__.F object at 0x00C58C30>
>>> C.w.format
'<42>'
>>> C.v.format
'<9>'
The advantage of this is that F is a proper, first-class thing that can be extended. Not a "hidden" thing that we're trying to avoid exposing. It's a fact of life, so we might as well follow the Open/Closed principle and make it open to extension.
