Can some operators in Python not be overloaded properly?

I am studying Scott Meyers' More Effective C++. Item 7 advises to never overload && and ||, because their short-circuit behavior cannot be replicated when the operators are turned into function calls (or is this no longer the case?).
As operators can also be overloaded in Python, I am curious whether this situation exists there as well. Is there any operator in Python (2.x, 3.x) that, when overridden, cannot be given its original meaning?
Here is an example of what I mean by 'original meaning':
class MyInt {
public:
    MyInt(int v) : val(v) {}
    MyInt operator+(const MyInt &m) const {
        return MyInt(this->val + m.val);
    }
    int val;
};

Exactly the same rationale applies to Python. You shouldn't (and can't) overload and and or, because their short-circuiting behavior cannot be expressed in terms of function calls. not can't be overloaded directly either, although you can influence its result by defining __bool__ (__nonzero__ in Python 2), since not x simply negates the truth value of x.
As pointed out in the comments, the proposal to allow the overloading of logical and and or (PEP 335) was officially rejected.
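To illustrate why: and and or only ever consult an object's truth value, and they may skip evaluating the right-hand operand entirely. A minimal sketch (the Flag class is just for illustration):

class Flag:
    def __init__(self, value):
        self.value = value

    def __bool__(self):              # __nonzero__ in Python 2
        print("truth test on", self.value)
        return bool(self.value)

result = Flag(False) and print("never evaluated")
print(result.value)   # False: `and` returned the left operand untouched,
                      # and the right-hand side was never executed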

The assignment operator can also not be overloaded.
class Thing: ...
thing = Thing()
thing = 'something else'
There is nothing you can override in Thing to change the behavior of the = operator.
(You can overload property assignment though.)
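For example, attribute assignment on an instance can be intercepted with __setattr__ (the label attribute below is illustrative):

class Thing:
    def __setattr__(self, name, value):
        print("assigning", name, "=", value)
        super().__setattr__(name, value)

thing = Thing()
thing.label = 'something else'   # prints "assigning label = something else"
thing = 'something else'         # plain rebinding of the name; nothing to hook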

In Python, all object methods that represent operators are treated as "equal": their precedences are described in the language reference, and there is no conflict in overriding any of them.
But C++'s "&&" and "||" - "and" and "or" in Python - are not available as object methods to start with. They only check the objects' truthiness, which is defined by __bool__. If __bool__ is not implemented, Python checks for a __len__ method and treats a length of zero as False; in all other cases the truth value is True. That sidesteps the semantic problems that would arise from combining overriding with short-circuiting behavior.
Note that one can override & and | by implementing __and__ and __or__, with no problems.
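A quick sketch of that (the BitSet class is just an illustration):

class BitSet:
    def __init__(self, bits):
        self.bits = bits

    def __and__(self, other):        # x & y
        return BitSet(self.bits & other.bits)

    def __or__(self, other):         # x | y
        return BitSet(self.bits | other.bits)

a, b = BitSet(0b1100), BitSet(0b1010)
print(bin((a & b).bits))   # 0b1000
print(bin((a | b).bits))   # 0b1110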
As for the other operators: although not directly related, one should take care with __getattribute__, the method called when retrieving any attribute from an object (we don't normally think of it as an operator), including attribute lookups made from within itself. __getattr__ also exists, and is only invoked at the end of the attribute search chain, when an attribute is not found.
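A sketch of the difference, including the recursion trap that "calls from within itself" refers to (the Proxy class is hypothetical):

class Proxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached when normal lookup fails; delegate to the wrapped object.
        return getattr(self._target, name)

    def __getattribute__(self, name):
        # Called for *every* attribute access. Touching self._target here
        # directly would recurse; go through the base implementation instead.
        print("looking up", name)
        return object.__getattribute__(self, name)

p = Proxy([1, 2, 3])
print(p.append)   # found via __getattr__ on the wrapped list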

Related

Efficient arithmetic special methods in Cython

According to the Cython documentation regarding arithmetic special methods (operator overloads), the way they're implemented, I can't rely on self being the object whose special method is being called.
Evidently, this has two consequences:
I can't specify a static type in the method declaration. For example, if I have a class Foo which can only be multiplied by, say, an int, then I can't have def __mul__(self, int op) without seeing TypeErrors (sometimes).
In order to decide what to do, I have to check the types of the operands, presumably using isinstance() to handle subclasses, which seems farcically expensive in an operator.
Is there any good way to handle this while retaining the convenience of operator syntax? My whole reason for switching my classes to Cython extension types is to improve efficiency, but as they rely heavily on the arithmetic methods, based on the above it seems like I'm actually going to make them worse.
If I understand the docs and my test results correctly, you actually can have a fast __mul__(self, int op) on a Foo, but you can only use it as Foo() * 4, not 4 * Foo(). The latter would require an __rmul__, which is not supported, so it always raises TypeError.
The fact that the second argument is typed int means that Cython does the typecheck for you, so you can be sure that the left argument is really self.
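For contrast, this is how the reflected case works in ordinary Python classes, which is exactly the dispatch the extension-type limitation above takes away (Foo here mirrors the hypothetical class from the question):

class Foo:
    def __init__(self, value):
        self.value = value

    def __mul__(self, op):       # handles Foo() * 4
        if isinstance(op, int):
            return Foo(self.value * op)
        return NotImplemented

    def __rmul__(self, op):      # handles 4 * Foo()
        return self.__mul__(op)

print((Foo(3) * 4).value)   # 12 via __mul__
print((4 * Foo(3)).value)   # 12 via __rmul__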

Python: Difference between add and __add__

In Python, what is the difference between add and __add__ methods?
A method called add is just that - a method with that name. It has no special meaning whatsoever to the language or the interpreter. The only other thing that could be said about it is that sets have a method with the same name. That's it, nothing special about it.
The method __add__ is called internally by the + operator, so it gets special attention in the language spec and from the interpreter, and you override it to define addition for objects of a class. You don't normally call it directly (you can - special methods are still ordinary methods that just get invoked implicitly in certain circumstances and carry a few extra restrictions - but there's rarely a good reason to). See the docs on "special" methods for details and a complete list of the other "special" methods.
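A small sketch of the difference (the Money class is just an illustration):

class Money:
    def __init__(self, amount):
        self.amount = amount

    def add(self, other):        # ordinary method: only runs when called by name
        return Money(self.amount + other.amount)

    def __add__(self, other):    # special method: invoked by the + operator
        return Money(self.amount + other.amount)

a, b = Money(2), Money(3)
print(a.add(b).amount)      # 5, explicit call
print((a + b).amount)       # 5, + dispatches to __add__
print(a.__add__(b).amount)  # 5, calling the special method directly also works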
If you just went through this doc https://docs.python.org/3/library/operator.html and were curious about the differences between e.g.
operator.add(a, b)
operator.__add__(a, b)
Check the source code https://github.com/python/cpython/blob/3.10/Lib/operator.py :
def add(a, b):
"Same as a + b."
return a + b
...
# All of these "__func__ = func" assignments have to happen after importing
# from _operator to make sure they're set to the right function
...
__add__ = add
So
print(3 + 3)                  # the + operator itself dispatches to int.__add__
import operator
print(operator.add(3, 3))     # calls `operator.add` directly; `operator.__add__` is the same function
To add to the earlier posts: dunder names (__*__) are generally discouraged as identifiers in your own classes unless you are deliberately hooking into core Python functionality, such as overloading a standard operator. Such names are associated with "magic" behavior, so it is wise to avoid inventing new ones in your own namespaces unless the magical nature of a method is intended.
See this post for an elaborate argument

What languages other than Python have an explicit self?

It seems a bit weird that Python requires you to explicitly pass self as the first argument to all class functions. Are there other languages that require something similar?
By explicit, do you mean "explicitly passed as an argument to each class function"?
If so, then Python is the only one I know off-hand.
Most OO languages support this or self in some form, but most of them let you define class functions without always defining self as the first argument.
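For reference, this is what "explicit" means in Python's case (Greeter is just an illustrative name):

class Greeter:
    def greet(self, name):            # self must be declared explicitly
        return "hello " + name

g = Greeter()
print(g.greet("world"))               # self is supplied implicitly by the instance call
print(Greeter.greet(g, "world"))      # ...but can equally be passed by hand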
Depending on your point of view, Lua. To quote the reference: "A call v:name(args) is syntactic sugar for v.name(v,args), except that v is evaluated only once." You can also define methods using either notation. So you could say that Lua has an optional explicit self.
The programming language Oberon-2 has an explicitly named but not explicitly passed 'this' or 'self' argument for member functions of classes (known as type-bound procedures in Oberon terminology).
The following example is an Insert method on a type Text, where the identifier 't' is specified to bind to the explicit 'this' or 'self' reference.
PROCEDURE (t: Text) Insert (string: ARRAY OF CHAR; pos: LONGINT);
BEGIN ...
END Insert;
More details on Object Orientation in Oberon are here.
F# (presumably from its OCaml heritage) requires an explicit name for all self-references, though the name can be any arbitrary identifier, e.g.
override x.BeforeAnalysis() =
base.BeforeAnalysis()
DoWithLock x.AddReference
Here we're defining an overriding member function BeforeAnalysis which calls another member function AddReference. The identifier x here is arbitrary, but is required in both the declaration and any reference to members of the "this"/"self" instance.
Modula-3 does. Which is not too surprising since Python's class mechanism is a mixture of the ones found in Modula-3 and C++.
Any object-oriented language has a notion of this or self within member functions.
Clojure isn't an OOP language but does use explicit self parameters in some circumstances: most notably when you implement a protocol, and the "self" argument (you can name it anything you like) is the first parameter to a protocol method. This argument is then used for polymorphic dispatch to determine the right function implementation, e.g.:
(defprotocol MyProtocol
  (foo [this that]))

(extend-protocol MyProtocol String
  (foo [this that]
    (str this " and " that)))

(extend-protocol MyProtocol Long
  (foo [this that]
    (* this that)))
(foo "Cat" "Dog")
=> "Cat and Dog"
(foo 10 20)
=> 200
Also, the first parameter to a function is often used by convention to mean the object that is being acted upon, e.g. the following code to append to a vector:
(conj [1 2 3] 4)
=> [1 2 3 4]
Many object-oriented languages have something like it, if not all of them.
For example, C++ has "this" instead of "self",
but you don't have to pass it; it is passed implicitly.
Hope that helps ;)

Why does Python allow comparison of a callable and a number?

I used python to write an assignment last week, here is a code snippet
def departTime():
    '''
    Calculate the time to depart a packet.
    '''
    if(random.random < 0.8):
        t = random.expovariate(1.0 / 2.5)
    else:
        t = random.expovariate(1.0 / 10.5)
    return t
Can you see the problem? I compare random.random with 0.8, which
should be random.random().
Of course this is because of my carelessness, but I don't get it. In my
opinion, this kind of comparison should trigger at least a warning in
any programming language.
So why does python just ignore it and return False?
This isn't always a mistake
Firstly, just to make things clear, this isn't always a mistake.
In this particular case, it's pretty clear the comparison is an error.
However, because of the dynamic nature of Python, consider the following (perfectly valid, if terrible) code:
import random
random.random = 9 # Very weird but legal assignment.
random.random < 10 # True
random.random > 10 # False
What actually happens when comparing objects?
As for your actual case, comparing a function object to a number, have a look at the Python 2 documentation: Expressions, section 5.9, titled "Comparisons", which states:
The operators <, >, ==, >=, <=, and != compare the values of two objects. The objects need not have the same type. If both are numbers, they are converted to a common type. Otherwise, objects of different types always compare unequal, and are ordered consistently but arbitrarily. You can control comparison behavior of objects of non-built-in types by defining a __cmp__ method or rich comparison methods like __gt__, described in section Special method names.
(This unusual definition of comparison was used to simplify the definition of operations like sorting and the in and not in operators. In the future, the comparison rules for objects of different types are likely to change.)
That should explain both what happens and the reasoning for it.
BTW, I'm not sure what happens in newer versions of Python.
Edit: If you're wondering, Debilski's answer gives info about Python 3.
This is ‘fixed’ in Python 3 http://docs.python.org/3.1/whatsnew/3.0.html#ordering-comparisons.
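In Python 3 the same comparison fails loudly instead of quietly evaluating to a boolean:

import random

try:
    random.random < 0.8
except TypeError as exc:
    # On CPython 3 this prints something like:
    # '<' not supported between instances of 'builtin_function_or_method' and 'float'
    print(exc)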
Because in Python that is a perfectly valid comparison. Python can't know if you really want to make that comparison or if you've just made a mistake. It's your job to supply Python with the right objects to compare.
Because of the dynamic nature of Python you can compare and sort almost everything with almost everything (this is a feature). You've compared a function to a float in this case.
An example:
list = ["b","a",0,1, random.random, random.random()]
print sorted(list)
This will give the following output:
[0, 0.89329568818188976, 1, <built-in method random of Random object at 0x8c6d66c>, 'a', 'b']
I think Python allows this because the random.random object could be overriding the comparison operators by providing a __gt__ method that accepts, or even expects, a number. So Python assumes you know what you are doing... and does not report it.
If you check for it, you can see that __gt__ exists for random.random...
>>> random.random.__gt__
<method-wrapper '__gt__' of builtin_function_or_method object at 0xb765c06c>
But, that might not be something you want to do.

Rules of thumb for when to use operator overloading in python

From what I remember from my C++ class, the professor said that operator overloading is cool, but since it takes a relatively large amount of thought and code to cover all the edge cases (e.g. when overloading + you probably also want to overload ++ and +=, and you also need to handle cases like adding an object to itself), you should only consider it where the feature will have a major impact on your code, such as overloading the operators of a matrix class in a math application.
Does the same apply to python? Would you recommend overriding operator behavior in python? And what rules of thumb can you give me?
Operator overloading is mostly useful when you're making a new class that falls into an existing "Abstract Base Class" (ABC) -- indeed, many of the ABCs in the standard library module collections rely on the presence of certain special methods (and special methods, ones with names starting and ending with double underscores, AKA "dunders", are exactly how you perform operator overloading in Python). This provides good starting guidance.
For example, a Container class must override special method __contains__, i.e., the membership check operator item in container (as in, if item in container: -- don't confuse with the for statement, for item in container:, which relies on __iter__!-).
Similarly, a Hashable must override __hash__, a Sized must override __len__, a Sequence or a Mapping must override __getitem__, and so forth. (Moreover, the ABCs can provide your class with mixin functionality -- e.g., both Sequence and Mapping can provide __contains__ on the basis of your supplied __getitem__ override, and thereby automatically make your class a Container).
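A minimal sketch of that mixin behaviour (the Deck class is just an illustration; in Python 3 the ABCs live in collections.abc):

from collections.abc import Sequence

class Deck(Sequence):
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):                 # required by Sized/Sequence
        return len(self._cards)

    def __getitem__(self, position):   # required by Sequence
        return self._cards[position]

deck = Deck(["A", "K", "Q"])
print(len(deck))              # 3, via __len__
print("K" in deck)            # True, __contains__ supplied by the ABC
print(list(reversed(deck)))   # ['Q', 'K', 'A'], __reversed__ supplied by the ABC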
Beyond the collections, you'll want to override special methods (i.e. provide for operator overloading) mostly if your new class "is a number". Other special cases exist, but resist the temptation to overload operators "just for coolness", with no semantic connection to their "normal" meanings, the way C++'s streams do for << and >> and Python strings do for % -- when such operators no longer mean "bit-shifting" or "division remainder", you're just engendering confusion. A language's standard library can get away with it (though it shouldn't;-), but unless your library gets as widespread as the language's standard one, the confusion will hurt!-)
I've written software with significant amounts of overloading, and lately I regret that policy. I would say this:
Only overload operators if it's the natural, expected thing to do and doesn't have any side effects.
So if you make a new RomanNumeral class, it makes sense to overload addition and subtraction etc. But don't overload it unless it's natural: it makes no sense to define addition and subtraction for a Car or a Vehicle object.
Another rule of thumb: don't overload ==. It makes it very hard (though not impossible) to actually test if two objects are the same. I made this mistake and paid for it for a long time.
As for when to overload +=, ++ etc, I'd actually say: only overload additional operators if you have a lot of demand for that functionality. It's easier to have one way to do something than five. Sure, it means sometimes you'll have to write x = x + 1 instead of x += 1, but more code is ok if it's clearer.
In general, like with many 'fancy' features, it's easy to think that you want something when you don't really, implement a bunch of stuff, not notice the side effects, and then figure it out later. Err on the conservative side.
EDIT: I wanted to add an explanatory note about overloading ==, because it seems various commenters misunderstand this, and it's caught me out. Yes, is exists, but it's a different operation. Say I have an object x, which is either from my custom class, or is an integer. I want to see if x is the number 500. But if you set x = 500, then later test x is 500, you will get False, due to the way Python caches numbers. With 50, it would return True. But you can't use is, because you might want x == 500 to return True if x is an instance of your class. Confusing? Definitely. But this is the kind of detail you need to understand to successfully overload operators.
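A sketch of the distinction being described (BoxedInt is a hypothetical class):

class BoxedInt:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Equal to plain ints as well as to other BoxedInts.
        if isinstance(other, BoxedInt):
            return self.value == other.value
        return self.value == other

x = BoxedInt(500)
print(x == 500)             # True: __eq__ gives == a meaning for this class
print(x is BoxedInt(500))   # False: `is` tests identity and ignores __eq__ entirely
n = 500
print(n == 500)             # True
# `n is 500` is not a reliable test: it depends on CPython's int caching,
# which is exactly the trap described above.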
Here is an example that uses the bitwise or operation to simulate a unix pipeline. This is intended as a counter example to most of the rules of thumb.
I just found Lumberjack which uses this syntax in real code
class pipely(object):
    def __init__(self, *args, **kw):
        self._args = args
        self.__dict__.update(kw)

    def __ror__(self, other):
        # `other | self` falls back to this method when the left operand
        # doesn't implement __or__, which is what chains the pipeline.
        return (self.map(x) for x in other if self.filter(x))

    def map(self, x):
        return x

    def filter(self, x):
        return True

class sieve(pipely):
    def filter(self, x):
        n = self._args[0]
        return x == n or x % n

class strify(pipely):
    def map(self, x):
        return str(x)

class startswith(pipely):
    def filter(self, x):
        n = str(self._args[0])
        if x.startswith(n):
            return x

print "*" * 80
for i in xrange(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | strify() | startswith(5):
    print i

print "*" * 80
for i in xrange(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | startswith(5):
    print i

print "*" * 80
for i in xrange(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | pipely(filter=lambda x: x.startswith('5')):
    print i
Python's overloading is "safer" in general than C++'s -- for example, the assignment operator can't be overloaded, and += has a sensible default implementation.
In some ways, though, overloading in Python is still as "broken" as in C++. Programmers should restrain the desire to "re-use" an operator for unrelated purposes, such as C++ re-using the bitshifts to perform string formatting and parsing. Don't overload an operator with different semantics from your implementation just to get prettier syntax.
Modern Python style strongly discourages "rogue" overloading, but many aspects of the language and standard library retain operators that do double duty, for backwards compatibility. For example:
%: modulus and string formatting
+: addition and sequence concatenation
*: multiplication and sequence repetition
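Each of those picks a different special method depending on the operand types, for instance:

print(7 % 3)                    # 1 - int.__mod__: modulus
print("%s-%s" % ("a", "b"))     # 'a-b' - str.__mod__: formatting
print([1, 2] + [3])             # [1, 2, 3] - list.__add__: concatenation
print("ab" * 3)                 # 'ababab' - str.__mul__: repetition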
So, rule of thumb? If your operator implementation will surprise people, don't do it.
