Is it possible to modify the behavior of len()? - python

I'm aware of creating a custom __repr__ or __add__ method (and so on), to modify the behavior of operators and functions. Is there a method override for len?
For example:
class Foo:
    def __repr__(self):
        return "A wild Foo Class in its natural habitat."
foo = Foo()
print(foo) # A wild Foo Class in its natural habitat.
print(repr(foo)) # A wild Foo Class in its natural habitat.
Could this be done for len, with a list? Normally, it would look like this:
foo = []
print(len(foo)) # 0
foo = [1, 2, 3]
print(len(foo)) # 3
What if I want to leave certain types out of the count? Like this:
class Bar(list):
    pass
foo = [Bar(), 1, '']
print(len(foo)) # 3
count = 0
for item in foo:
    if not isinstance(item, Bar):
        count += 1
print(count) # 2
Is there a way to do this from within a list subclass?

Yes, implement the __len__ method:
def __len__(self):
    return 42
Demo:
>>> class Foo(object):
...     def __len__(self):
...         return 42
...
>>> len(Foo())
42
From the documentation:
Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn’t define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
For your specific case:
>>> class Bar(list):
...     def __len__(self):
...         return sum(1 for ob in self if not isinstance(ob, Bar))
...
>>> len(Bar([1, 2, 3]))
3
>>> len(Bar([1, 2, 3, Bar()]))
3
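One consequence of the documentation quoted above is worth seeing in action: because truth testing falls back to __len__ when no __bool__ is defined, a Bar that contains only Bar instances becomes falsy even though it still holds items. The following is a minimal sketch that reuses the Bar class from this answer; the variable names are only illustrative.

```python
class Bar(list):
    def __len__(self):
        # Count only the items that are not themselves Bar instances.
        return sum(1 for ob in self if not isinstance(ob, Bar))


only_bars = Bar([Bar(), Bar()])

reported_length = len(only_bars)            # uses the overridden __len__
truthiness = bool(only_bars)                # no __bool__, so falls back to __len__
items_iterated = sum(1 for _ in only_bars)  # iteration does not consult __len__

print(reported_length, truthiness, items_iterated)
```

So the override changes what len() and truth tests report, but not what the list actually contains. Code that writes "if foo:" to test for emptiness may be surprised by this.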

Yes. Just as you have already discovered that you can override the behaviour of a repr() call by implementing the __repr__ magic method, you can specify the behaviour of a len() call by implementing (surprise, surprise) the __len__ magic method:
>>> class Thing:
...     def __len__(self):
...         return 123
...
>>> len(Thing())
123
A pedant might mention that you are not modifying the behaviour of len(), you are modifying the behaviour of your class. len just does the same thing it always does, which includes checking for a __len__ attribute on the argument.
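To illustrate what "the same thing it always does" means, here is a small sketch. It assumes CPython's usual behaviour: len() looks up __len__ on the type rather than on the instance, and it validates the value that comes back. The class names are made up for the example.

```python
class NegativeLength:
    def __len__(self):
        return -1  # not a valid length


class InstanceOnly:
    pass


patched = InstanceOnly()
patched.__len__ = lambda: 5  # stored in the instance dict; len() does not look there

results = []
for candidate in (NegativeLength(), patched):
    try:
        results.append(len(candidate))
    except (TypeError, ValueError) as exc:
        # Record which kind of error len() raised for this object.
        results.append(type(exc).__name__)

print(results)
```

The first object fails because len() rejects negative values, the second because the class itself has no __len__, even though calling patched.__len__() directly works.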

Remember: Python is a dynamically typed, duck-typed language.
If it acts like something that has a length:
class MyCollection(object):
    def __len__(self):
        return 1234
Example:
>>> obj = MyCollection()
>>> len(obj)
1234
If it doesn't act like it has a length: KABOOM!
class Foo(object):
    def __repr__(self):
        return "<Foo>"
Example:
>>> try:
...     obj = Foo()
...     len(obj)
... except:
...     raise
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: object of type 'Foo' has no len()
From Typing:
Python uses duck typing and has typed objects but untyped variable
names. Type constraints are not checked at compile time; rather,
operations on an object may fail, signifying that the given object is
not of a suitable type. Despite being dynamically typed, Python is
strongly typed, forbidding operations that are not well-defined (for
example, adding a number to a string) rather than silently attempting
to make sense of them.
Example:
>>> x = 1234
>>> s = "1234"
>>> x + s
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
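If you would rather ask "does this object have a length?" before the KABOOM, two common approaches are sketched below: a structural check with collections.abc.Sized, and simply trying len() and catching the TypeError. The helper name length_or_default is hypothetical and not part of any library.

```python
from collections.abc import Sized


class MyCollection:
    def __len__(self):
        return 1234


class Foo:
    pass


def length_or_default(obj, default=None):
    """Hypothetical helper: return len(obj), or default if obj has no length."""
    try:
        return len(obj)
    except TypeError:
        return default


# Sized recognises any class that defines __len__, without explicit inheritance.
print(isinstance(MyCollection(), Sized))
print(isinstance(Foo(), Sized))
print(length_or_default(MyCollection()))
print(length_or_default(Foo(), default=0))
```

Which one fits better depends on the caller: the isinstance check is declarative, while the try/except version follows the "easier to ask forgiveness" style that duck typing encourages.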

You can just add a __len__ method to your class.
class Test:
    def __len__(self):
        return 2
a=Test()
len(a) # --> 2

Related

re-defining an existing function as method of a class

I am learning to use classes.
I want to create a class with several methods and attributes.
In my case, some of the methods might be functions that I have written elsewhere in the code.
Is it a good practice to re-define the functions as methods of the class like in the following example?
def f(x,b):
    return b*x
class Class1:
    def __init__(self,b):
        self.b=b  # store the argument (the original hard-coded 8. here)
    def f(self,x):
        return f(x,self.b)  # delegates to the module-level f
instance1=Class1(5)
print(instance1.f(7.))
The code returns what it should, but is this the right way to do it or perhaps it is redundant or it might lead to troubles for larger codes?
What is the right way to define methods using function written elsewhere?
Functions...
Consider the following group of functions:
def stack_push(stack, value):
    stack.append(value)
def stack_is_empty(stack):
    return len(stack) == 0
def stack_pop(stack):
    return stack.pop()
Together, they implement a stack in terms of a built-in list. As long as you don't interact with the list directly, the only thing they allow is adding values to one end, removing values from the same end, and testing if the stack is empty:
>>> s = []
>>> stack_push(s, 3)
>>> stack_push(s, 5)
>>> stack_push(s, 10)
>>> if not stack_is_empty(s): stack_pop(s)
10
>>> if not stack_is_empty(s): stack_pop(s)
5
>>> if not stack_is_empty(s): stack_pop(s)
3
>>> if not stack_is_empty(s): stack_pop(s)
>>>
... vs. Methods
Notice that each function takes the same argument: a list being treated as a stack. This is an indication that we can instead write a class that represents a stack, so that we don't need to maintain a list that could potentially be (mis)used outside of these three functions. It also guarantees that we start with an empty list for our new stack.
class Stack:
    def __init__(self):
        self.data = []
    def push(self, value):
        self.data.append(value)
    def is_empty(self):
        return len(self.data) == 0
    def pop(self):
        return self.data.pop()
Now, we don't work with a list that supports all sorts of non-stack operations like indexing, iteration, and mutation at the beginning or middle of the list: we can only push, pop, and test for emptiness.
>>> s = Stack()
Things like s[3], s.insert(2, 9), etc. are not allowed.
(Note that we aren't strictly prevented from using s.data directly, but it's considered bad practice to do so unless the class says it is OK to do so in its documentation. In this case, we do not allow that.)
We use these methods much like we used the stack_* functions.
>>> s.push(3)
>>> s.push(5)
>>> s.push(10)
>>> if not s.is_empty(): s.pop()
10
>>> if not s.is_empty(): s.pop()
5
>>> if not s.is_empty(): s.pop()
3
>>> if not s.is_empty(): s.pop()
>>>
The difference is, we cannot "accidentally" use other list methods, because Stack does not expose them.
>>> s.insert(3, 9)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Stack' object has no attribute 'insert'
Finally, note that we don't write our original stack_* functions and use them in the definition of the Stack class: there is no need; we just define the methods explicitly inside the class statement.
# No.
class Stack:
    def push(self, value):
        stack_push(self.data, value)
We also don't continue to use the stack_* functions on an instance of Stack.
# No no no!
>>> stack_is_empty(s.data)
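As a possible refinement, and tying back to the len() question at the top of this page, the class can expose its size through __len__ so that callers never need to reach into the underlying list. This is only a sketch: the _data name (with a leading underscore to signal "internal") is an assumption of this example, not something the answer above prescribes.

```python
class Stack:
    def __init__(self):
        self._data = []  # leading underscore: internal by convention

    def push(self, value):
        self._data.append(value)

    def pop(self):
        return self._data.pop()

    def is_empty(self):
        return len(self) == 0

    def __len__(self):
        # Lets callers write len(s) and "if s:" without touching the list.
        return len(self._data)


s = Stack()
s.push(3)
s.push(5)
print(len(s), s.is_empty(), bool(s))
```

The stack still does not support indexing or insertion; it only reports how many items it holds.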

Why does this method not return a value?

I have the following code:
class Thing:
    def __init__(self):
        self.a = 30
        self.b = 10
    def sumit(self):
        return self.a + self.b
giventhing = Thing
print(giventhing.sumit/2)
I get this error:
TypeError: unsupported operand type(s) for /: 'function' and 'int'
There are two issues here:
sumit is an instance method, so you need to call it on an instance, not a class (or type).
To execute a callable, such as a method, you need to use the proper call syntax, which is method(); note the () at the end.
Doing giventhing = Thing won't give you an instance, it will give you a reference to the class/type itself, which is only useful if you want to operate with class members, which is not your use case.
Doing giventhing.sumit / 2 won't divide the result of sumit by 2. In fact, giventhing.sumit yields a reference to the function itself, not its result. You need to call the function to get its return value, i.e. sumit().
Fixed code:
giventhing = Thing() # You need an instance of Thing
print(giventhing.sumit() / 2) # You need to actually call sumit on the instance
sumit is a function, so you need to call it with parentheses: print(giventhing.sumit()/2)
A function object has type function; you have to call it first to get its return value.
So you need:
giventhing = Thing()
print(giventhing.sumit()/2)
The parentheses are required in both places.
Here's an example:
>>> class A:
        def __init__(self,a):
            self.a=a
        def out(self):
            return self.a
>>> A
<class '__main__.A'>
>>> a=A(1)
>>> a.out
<bound method A.out of <__main__.A object at 0x0000005F9D177EB8>>
>>> a.out()
1
>>>

Monkey patching operator overloads behaves differently in Python2 vs Python3 [duplicate]

Consider the following code:
class Foo:
    def __mul__(self,other):
        return other/0
x = Foo()
x.__mul__ = lambda other:other*0.5
print(x.__mul__(5))
print(x*5)
In Python 2 (with from __future__ import print_function), this outputs
2.5
2.5
In Python3, this outputs
2.5
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-1-36322c94fe3a> in <module>()
5 x.__mul__ = lambda other:other*0.5
6 print(x.__mul__(5))
----> 7 print(x*5)
<ipython-input-1-36322c94fe3a> in __mul__(self, other)
1 class Foo:
2 def __mul__(self,other):
----> 3 return other/0
4 x = Foo()
5 x.__mul__ = lambda other:other*0.5
ZeroDivisionError: division by zero
I ran into this situation when I was trying to implement a type that supported a subset of algebraic operations. For one instance, I needed to modify the multiplication function for laziness: some computation must be deferred until the instance is multiplied with another variable. The monkey patch worked in Python 2, but I noticed it failed in 3.
Why does this happen?
Is there any way to get more flexible operator overloading in Python3?
That is not a monkeypatch.
This would have been a monkeypatch:
class Foo:
    def __mul__(self, other):
        return other / 0
Foo.__mul__ = lambda self,other: other * 0.5
x = Foo()
x*9 # evaluates to 4.5
What was done with x.__mul__ = lambda other:other*0.5 was creating a __mul__ attribute on the x instance.
Then, it was expected that x*5 would call x.__mul__(5). And it did, in Python 2.
In Python 3, it called Foo.__mul__(x, 5), so the attribute was not used.
Python 2 would have done the same as Python 3, but it did not because Foo was created as an old-style class.
This code would be equivalent for Python 2 and Python 3:
class Foo(object):
    def __mul__(self,other):
        return other/0
x = Foo()
x.__mul__ = lambda other:other*0.5
print(x.__mul__(5))
print(x*5)
That will raise an exception. Note the (object).
You can't override special methods at the instance level. According to Python's documentation:
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary.
The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:
>>> 1 .__hash__() == hash(1)
True
>>> int.__hash__() == hash(int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' of 'int' object needs an argument
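The lookup rule quoted above can be checked directly. In the sketch below (the class and the returned strings are made up for illustration), the explicit attribute access finds the instance attribute, while the * operator goes through the type, which matches what type(x).__mul__(x, 5) does:

```python
class Foo(object):
    def __mul__(self, other):
        return "class-level"


x = Foo()
x.__mul__ = lambda other: "instance-level"  # only an instance attribute

explicit = x.__mul__(5)           # ordinary attribute lookup: instance dict first
implicit = x * 5                  # operator: looked up on type(x)
via_type = type(x).__mul__(x, 5)  # what the operator effectively calls

print(explicit, implicit, via_type)
```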
So one simple way is to define a regular method for your monkey-patching purposes and assign your new function to it on the instance:
In [45]: class Foo:
             def __init__(self, arg):
                 self.arg = arg
             def __mul__(self,other):
                 return other * self.arg
             def _mul(self, other):
                 return other/0
Demo:
In [47]: x = Foo(10)
In [48]: x * 3
Out[48]: 30
In [49]: my_func = lambda x: x * 0.5
In [50]: x._mul = my_func
In [51]: x._mul(4)
Out[51]: 2.0
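If the goal is for x * y itself to honour a per-instance override, one possible approach is to have the class-level __mul__ delegate to a regular method, which can then be replaced on an instance. This is a sketch of that idea rather than the code from the answer above; the _mul hook name is reused from it, but the delegation is an assumption of this example.

```python
class Foo:
    def __init__(self, arg):
        self.arg = arg

    def _mul(self, other):
        # Default behaviour, shared by all instances.
        return other * self.arg

    def __mul__(self, other):
        # self._mul uses normal attribute lookup, so an instance
        # attribute named _mul takes precedence over the method.
        return self._mul(other)


x = Foo(10)
y = Foo(10)
x._mul = lambda other: other * 0.5  # override for this instance only

print(x * 4)
print(y * 4)
```

Only x is affected; y keeps the class-level behaviour. This behaves the same way in Python 2 and Python 3 because the special method itself is still found on the class.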

How to implement "__iadd__()" for an immutable type?

I would like to subclass an immutable type or implement one of my own which behaves like an int does as shown in the following console session:
>>> i=42
>>> id(i)
10021708
>>> i.__iadd__(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__iadd__'
>>> i += 1
>>> i
43
>>> id(i)
10021696
Not surprisingly, int objects have no __iadd__() method, yet applying += to one doesn't result in an error. Instead, it apparently creates a new int and somehow reassigns it to the name given in the augmented assignment statement.
Is it possible to create a user-defined class or subclass of a built-in immutable one that does this, and if so, how?
Simply don't implement __iadd__, but only __add__:
>>> class X(object):
...     def __add__(self, o):
...         return "added"
...
>>> x = X()
>>> x += 2
>>> x
'added'
If there's no x.__iadd__, Python simply calculates x += y as x = x + y (see the documentation).
The return value of __iadd__() is used. You don't need to return the object that's being added to; you can create a new one and return that instead. In fact, if the object is immutable, you have to.
import os.path
class Path(str):
    def __iadd__(self, other):
        # str is immutable, so build and return a new Path
        return Path(os.path.join(str(self), str(other)))
path = Path("C:\\")
path += "windows"
print(path)
When it sees i += 1, Python will try to call __iadd__. If that method is missing (or returns NotImplemented), it falls back to __add__.
In both cases, the result of the call is bound to the name, i.e. it attempts i = i.__iadd__(1) first and, failing that, i = i.__add__(1).
class aug_int:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, other):
        self.value += other
        return self
>>> i = aug_int(34)
>>> i
<__main__.aug_int instance at 0x02368E68>
>>> i.value
34
>>> i += 55
>>> i
<__main__.aug_int instance at 0x02368E68>
>>> i.value
89
>>>
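The two behaviours described in these answers can be compared side by side. The sketch below uses two hypothetical classes: one defines only __add__ (the immutable style, like int), the other defines __iadd__ and mutates in place (like aug_int above). Keeping a second reference to each object shows whether += rebound the name or changed the object.

```python
class ImmutableCounter:
    """Hypothetical immutable-style type: only __add__, so += rebinds the name."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return ImmutableCounter(self.value + other)


class MutableCounter:
    """Hypothetical mutable-style type: __iadd__ updates the object in place."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        self.value += other
        return self


a = ImmutableCounter(1)
a_alias = a
a += 2  # falls back to a = a + 2, creating a new object

b = MutableCounter(1)
b_alias = b
b += 2  # calls b.__iadd__(2), same object

print(a is a_alias, a.value, a_alias.value)
print(b is b_alias, b.value, b_alias.value)
```

With the immutable style the alias still sees the old value, which is the behaviour the question asks for.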

What makes a user-defined class unhashable?

The docs say that a class is hashable as long as it defines __hash__ method and __eq__ method. However:
class X(list):
    # read-only interface of `tuple` and `list` should be the same, so reuse tuple.__hash__
    __hash__ = tuple.__hash__
x1 = X()
s = {x1} # TypeError: unhashable type: 'X'
What makes X unhashable?
Note that I must have identical lists (in terms of regular equality) to be hashed to the same value; otherwise, I will violate this requirement on hash functions:
The only required property is that objects which compare equal have
the same hash value
The docs do warn that a hashable object shouldn't be modified during its lifetime, and of course I don't modify instances of X after creation. Of course, the interpreter won't check that anyway.
Simply setting the __hash__ method to that of the tuple class is not enough. You haven't actually told it how to hash any differently. Tuples are hashable because they are immutable. If you really wanted to make your specific example work, it might look like this:
class X2(list):
    def __hash__(self):
        return hash(tuple(self))
In this case you are actually defining how to hash your custom list subclass. You just have to define exactly how it can generate a hash. You can hash on whatever you want, as opposed to using the tuple's hashing method:
def __hash__(self):
    return hash("foobar"*len(self))
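To see both why this works and why the docs warn against mutating hashable objects, here is a small sketch built on the X2 class above. The variable names are illustrative. Equal lists hash equally, which satisfies the requirement quoted in the question, but changing an instance after it has been placed in a set makes it unfindable there.

```python
class X2(list):
    def __hash__(self):
        # Hash by content, via an immutable snapshot of the items.
        return hash(tuple(self))


a = X2([1, 2])
b = X2([1, 2])

same_hash = hash(a) == hash(b)   # equal objects, equal hashes
deduplicated = len({a, b})       # the set treats them as one element

s = {a}
found_before = a in s
a.append(3)                      # mutation changes the hash
found_after = a in s             # lookup now uses the new hash

print(same_hash, deduplicated, found_before, found_after)
```

The interpreter does not stop you from mutating the list, so it remains your responsibility not to.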
From the Python3 docs:
If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
Ref: object.__hash__(self)
Sample code:
class Hashable:
    pass
class Unhashable:
    # Defining __eq__ without __hash__ sets __hash__ to None in Python 3
    def __eq__(self, other):
        return self is other  # identity; "self == other" would recurse forever
class HashableAgain:
    def __eq__(self, other):
        return self is other
    def __hash__(self):
        return id(self)
def main():
    # OK
    print(hash(Hashable()))
    # Raises TypeError: unhashable type: 'Unhashable'
    try:
        print(hash(Unhashable()))
    except TypeError as exc:
        print(exc)
    # OK
    print(hash(HashableAgain()))
main()
What you could and should do, based on your other question, is this:
don't subclass anything, just encapsulate a tuple. It's perfectly fine to do so in __init__.
class X(object):
    def __init__(self, *args):
        self.tpl = args
    def __hash__(self):
        return hash(self.tpl)
    def __eq__(self, other):
        # compare the wrapped tuples, not a tuple against an X instance
        return isinstance(other, X) and self.tpl == other.tpl
    def __repr__(self):
        return repr(self.tpl)
x1 = X()
s = {x1}
which yields:
>>> s
set([()])
>>> x1
()
An addition to the above answers: for the specific case of a dataclass in Python 3.7+, you can make the dataclass hashable by using
@dataclass(frozen=True)
class YourClass:
    pass
as the decorator instead of
@dataclass
class YourClass:
    pass
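A short sketch of the difference, using two hypothetical point classes. With the default settings, dataclass generates __eq__ and therefore sets __hash__ to None; with frozen=True it also generates a __hash__ based on the fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


@dataclass
class PlainPoint:
    x: int
    y: int


# Equal frozen instances hash equally and collapse into one set element.
frozen_set_size = len({FrozenPoint(1, 2), FrozenPoint(1, 2)})

try:
    hash(PlainPoint(1, 2))
    plain_is_hashable = True
except TypeError:
    # The default dataclass defines __eq__ and leaves __hash__ as None.
    plain_is_hashable = False

print(frozen_set_size, plain_is_hashable)
```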
If you don't modify instances of X after creation, why aren't you subclassing tuple?
But I'll point out that this actually doesn't throw an error, at least in Python 2.6.
>>> class X(list):
...     __hash__ = tuple.__hash__
...     __eq__ = tuple.__eq__
...
>>> x = X()
>>> s = set((x,))
>>> s
set([[]])
I hesitate to say "works" because this doesn't do what you think it does.
>>> a = X()
>>> b = X((5,))
>>> hash(a)
4299954584
>>> hash(b)
4299954672
>>> id(a)
4299954584
>>> id(b)
4299954672
It's just using the object id as a hash. When you actually call __hash__ you still get an error; likewise for __eq__.
>>> a.__hash__()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' for 'tuple' objects doesn't apply to 'X' object
>>> X().__eq__(X())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__eq__' for 'tuple' objects doesn't apply to 'X' object
I gather that the python internals, for some reason, are detecting that X has a __hash__ and an __eq__ method, but aren't calling them.
The moral of all this is: just write a real hash function. Since this is a sequence object, converting it to a tuple and hashing that is the most obvious approach.
def __hash__(self):
    return hash(tuple(self))
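For comparison, the alternative raised at the start of this answer (subclassing tuple instead of list) might look like the sketch below. It assumes instances really are never modified after creation, as the question states; the class body is deliberately empty because tuple already provides a content-based __hash__ and __eq__.

```python
class X(tuple):
    """Hypothetical immutable variant of the question's X."""


x1 = X([1, 2])
x2 = X([1, 2])
s = {x1}

found = x2 in s                        # equal content, equal hash
hashes_match = hash(x1) == hash(x2)
can_append = hasattr(x1, "append")     # tuples expose no mutating methods

print(found, hashes_match, can_append)
```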
