How to implement @dataclass to define arithmetic operations in Python? - python

I'm learning Python on my own and I found a task that requires using the @dataclass decorator to create a class with basic arithmetic operations.
from dataclasses import dataclass
from numbers import Number

@dataclass
class MyClass:
    x: float
    y: float

    def __add__(self, other):
        match other:
            case Number():
                return MyClass(float(other) + self.x, self.y)
            case MyClass(ot_x, ot_y):
                return MyClass(self.x + ot_x, self.y + ot_y)
            case _:
                # Let Python try the other operand instead of returning None
                return NotImplemented

    __radd__ = __add__
I have implemented the addition operation. I also need subtraction (__sub__), multiplication (__mul__), division (__truediv__), negation (__neg__), and also __mod__ and __pow__, but I couldn't work out how to implement them. The main thing for me is to use the match/case construct. Maybe there are simpler ways to do it.
I would be glad of any help.

If you're trying to make a complete numeric type, I strongly suggest checking out the implementation of the fractions.Fraction type in the fractions source code. The class was intentionally designed as a model for how you'd overload all the pairs of operators needed to implement a numeric type at the Python layer (it's explicitly pointed out in the numbers module's guide to type implementers).
The critical part for minimizing boilerplate is the _operator_fallbacks utility function defined within the class. It takes a single implementation of an operation plus the matching operator module function, and generates the associated __op__ and __rop__ methods. The forward method is type-strict and the reverse method is relaxed, matching the intended behavior of each based on whether it is the first chance or the last chance to implement the operation.
It's far too much code to include here, but to show how you'd implement addition using it, I'll adapt your code to call it (you'd likely use a slightly different implementation of _operator_fallbacks, but the idea is the same):
import operator
from dataclasses import dataclass

# Optional, but if you want to act like built-in numeric types, you
# should be immutable, and using slots (if you can rely on Python 3.10+)
# dramatically reduces per-instance memory overhead.
# Pre-3.10, since x and y don't have defaults, you could define __slots__ manually.
@dataclass(frozen=True, slots=True)
class MyClass:
    x: float
    y: float

    # _operator_fallbacks defined here
    # When it received a non-MyClass, it would construct a MyClass from it, e.g.
    # to match your original code it would construct it as MyClass(float(val), 0)
    # and then invoke the monomorphic_operator, e.g. the _add passed to it below
    # or, if the type did not make sense to convert to MyClass, but it made sense
    # to convert the MyClass instance to the other type, it would do so, then use the
    # provided fallback operator to perform the computation
    # For completely incompatible types, it just returns NotImplemented

    def _add(a, b):
        """a + b"""
        # _operator_fallbacks has already coerced the types appropriately
        return MyClass(a.x + b.x, a.y + b.y)

    __add__, __radd__ = _operator_fallbacks(_add, operator.add)
By putting the ugliness of type-checking and coercion in common code found in _operator_fallbacks, and putting only the real work of addition in _add, it avoids a lot of every-operator-overload boilerplate. As you can see here, _operator_fallbacks will be a page of code to make the forward and reverse functions and return them, but each new operator is only a few lines: define the monomorphic operator and call _operator_fallbacks to generate the __op__/__rop__ pair.

Related

Can you create an abstract data type that is *not* in a class?

This is just a question that is a curiosity as I was reviewing OOP. Can you have an ADT that is not in a class? So it'd all be separate functions. The language (it shouldn't matter, but in case it does) that I'm thinking in is Python 3.
No. A data type (in Python at least) is by definition a class. In C, you have to simulate object-orientedness by having individual functions, but there still has to be a struct to hold the data. Otherwise, there's no "data type".
Within the Python language, a data type is a class (with certain properties), so the trivial answer is no. In particular, one major characteristic that differentiates class (or data type) functionality from simple function calls, is that the defined data operations work seamlessly on the data type, or with trivial syntax, rather than having to specify every operation and operand in an explicit call.
Consider the statements:
# Fully functional, implicit data type operation
z = x + y
# Explicit data type operation, still within the class
z = x.add(y)
# Function call
z = add(x, y)
In the third instance, you have none of the built-in protections or encapsulations that come with a class. You can have a set of functions that just happen to coordinate to give you the desired results, but this is not an abstract data type.
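As a rough illustration of the three call styles above, here is a small sketch. The Money class and the free function add are hypothetical names made up for this example.

```python
class Money:
    """A tiny data type: the class owns both the data and its operations."""

    def __init__(self, cents):
        self.cents = cents

    def add(self, other):
        return Money(self.cents + other.cents)

    # Defining __add__ is what makes `x + y` work on this type.
    __add__ = add


def add(x, y):
    """Free function: accepts anything with a .cents attribute, with no
    guarantee that x and y are really Money objects."""
    return Money(x.cents + y.cents)


x, y = Money(150), Money(250)
z1 = x + y       # implicit data type operation
z2 = x.add(y)    # explicit method call, still within the class
z3 = add(x, y)   # plain function call
```

All three produce the same value here; the difference is in where the operation lives and how much the type itself enforces.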

Is this an example of python function overload?

I know python does not allow us to overload functions. However, does it have inbuilt overloaded methods?
Consider this:
setattr(object_name,'variable', 'value')
setattr(class_name,'method','function')
The first statement dynamically adds variables to objects during run time, but the second one attaches outside functions to classes at run time.
The same function does different things based on its arguments. Is this function overload?
The function setattr(foo, 'bar', baz) is always the same as foo.bar = baz, regardless of the type of foo. There is no overloading here.
In Python 3, limited overloading is possible with functools.singledispatch, but setattr is not implemented with that.
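The equivalence between setattr and plain attribute assignment can be checked directly. This is a small sketch; Robot and greet are made-up names for the illustration.

```python
class Robot:
    pass


def greet(self):
    return "hello from " + self.name


r = Robot()
setattr(r, "name", "R2")         # same as: r.name = "R2"
setattr(Robot, "greet", greet)   # same as: Robot.greet = greet

print(r.greet())
```

In both calls setattr does one thing: it binds a name on an object. A function stored on a class becomes a method only because of how attribute lookup on instances works, not because setattr treats classes specially.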
A far more interesting example, in my opinion, is type(). type() does two entirely different things depending on how you call it:
If called with a single argument, it returns the type of that argument.
If called with three arguments (of the correct types), it dynamically creates a new class.
Nevertheless, type() is not overloaded. Why not? Because it is implemented as one function that counts how many arguments it got and then decides what to do. In pure Python, this is done with the variadic *args syntax, but type() is implemented in C, so it looks rather different. It's doing the same thing, though.
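A short sketch of both call forms, followed by a hypothetical pure-Python stand-in (my_type is an invented name) that branches on the argument count the way the paragraph describes:

```python
# One-argument form: ask for the type of an object.
assert type(42) is int

# Three-argument form: build a class from a name, bases, and a namespace.
Point = type("Point", (object,), {"x": 0, "y": 0})


def my_type(*args):
    """Hypothetical stand-in: one function that decides by argument count."""
    if len(args) == 1:
        return args[0].__class__
    if len(args) == 3:
        name, bases, namespace = args
        return type(name, bases, namespace)
    raise TypeError("my_type() takes 1 or 3 arguments")
```

There is still only one function object here, which is why this counts as argument inspection and not overloading.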
Python, in some sense, doesn't need a function overloading capability when other languages do. Consider the following example in C:
int add(int x, int y) {
    return x + y;
}
If you wish to extend the notion to include things that are not integers, you would need to write another function:
float add(float x, float y) {
    return x + y;
}
In Python, all you need is:
def add(x, y):
    return x + y
It works fine for both, and it isn't considered function overloading. You can also handle different argument types using functions such as isinstance. The major issue, as pointed out by this question, is the number of types. But in your case you pass the same number of types, and even so, there are ways around this without function overloading.
Overloading methods is tricky in Python. However, you can get a similar effect by giving every parameter a default and passing a dict, a list, or a primitive value as needed.
I tried this for my own use case; it may help others understand how to approximate overloaded methods.
Take this example: a function that is called from different classes with different arguments.
def add_bullet(sprite=None, start=None, headto=None, speed=None, acceleration=None):
    ...
Pass the arguments from the calling class:
add_bullet(sprite='test', start=True, headto={'lat': 10.6666, 'long': 10.6666}, acceleration=10.6)
or
add_bullet(sprite='test', start=True, headto={'lat': 10.6666, 'long': 10.6666}, speed=['10', '20', '30'])
So a single function handles lists, dictionaries, and primitive values, which is what method overloading would otherwise be used for.
Try it out in your own code.

How do you set a conditional in python based on datatypes?

This question seems mind-bogglingly simple, yet I can't figure it out. I know you can check datatypes in Python, but how can you set a conditional based on the datatype? For instance, if I have to write code that goes through a dictionary/list and adds up all the integers, how do I restrict the search to integers only?
I guess a quick example would look something like this:
y = []
for x in somelist:
    if type(x) == <type 'int'>:  ### <--- pseudo-code line
        y.append(x)
print sum(int(z) for z in y)
So for line 3, how would I set such a conditional?
How about,
if isinstance(x, int):
but a cleaner way would simply be
sum(z for z in y if isinstance(z, int))
TLDR:
Use if isinstance(x, int): unless you have a reason not to.
Use if type(x) is int: if you need exact type equality and nothing else.
Use try: ix = int(x) if you are fine with converting to the target type.
There is a really big "it depends" to type-checking in Python. There are many ways to deal with types, and all have their pros and cons. With Python3, several more have emerged.
Explicit type equality
Types are first-class objects, and you can treat them like any other value.
So if you want the type of something to be equal to int, just test for it:
if type(x) is int:
This is the most restrictive type of testing: it requires exact type equality. Often, this is not what you want:
It rules out substitute types: a float would not be valid, even though it behaves like an int for many purposes.
It rules out subclasses and abstract types: a pretty-printing int subclass or enum would be rejected, even though they are logically Integers.
This severely limits portability: Python2 Strings can be either str or unicode, and Integers can be either int or long.
Note that explicit type equality has its uses for low-level operations:
Some types cannot be subclassed, such as slice. An explicit check is, well, more explicit here.
Some low-level operations, such as serialisation or C-APIs, require specific types.
Variants
A comparison can also be performed against the __class__ attribute:
if x.__class__ is int:
Note if a class defines a __class__ property, this is not the same as type(x).
When there are several classes to check for, using a dict to dispatch actions is more extensible and can be faster (≥5-10 types) than explicit checks.
This is especially useful for conversions and serialisation:
dispatch_dict = {float: round, str: int, int: lambda x: x}

def convert(x):
    converter = dispatch_dict[type(x)]  # look up callable based on exact type
    return converter(x)
Instance check on explicit types
The idiomatic type test uses the isinstance builtin:
if isinstance(x, int):
This check is both exact and performant. This is most often what people want for checking types:
It handles subtypes properly. A pretty-printing int subclass would still pass this test.
It allows checking multiple types at once. In Python2, doing isinstance(x, (int, long)) gets you all builtin integers.
Most importantly, the downsides are negligible most of the time:
It still accepts funky subclasses that behave in weird ways. Since anything can be made to behave in weird ways, this is futile to guard against.
It can easily be too restrictive: many people check for isinstance(x, list) when any sequence (e.g. tuple) or even iterable (e.g. a generator) would do as well. This is more of a concern for general purpose libraries than scripts or applications.
Variant
If you already have a type, issubclass behaves the same:
if issubclass(x_type, int):
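To make the difference between exact type equality and instance checks concrete, here is a small sketch. Celsius is a hypothetical int subclass used only for this illustration.

```python
class Celsius(int):
    """Hypothetical int subclass used only for this illustration."""


values = [3, Celsius(21), True, 2.0, "7"]

# Exact equality: only plain int objects pass.
exact = [v for v in values if type(v) is int]

# Instance check: subclasses pass too, and bool is a subclass of int.
by_instance = [v for v in values if isinstance(v, int)]

# If booleans should not count as integers, exclude them explicitly.
ints_no_bool = [v for v in values
                if isinstance(v, int) and not isinstance(v, bool)]
```

The bool case is the one that most often surprises people when summing "all the integers" in mixed data.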
Instance check on abstract type
Python has a concept of abstract base classes. Loosely speaking, these express the meaning of types, not their hierarchy:
if isinstance(x, numbers.Real): # accept anything you can sum up like a number
In other words, type(x) does not necessarily inherit from numbers.Real but must behave like it.
Still, this is a very complex and difficult concept:
It is often overkill if you are looking for basic types. An Integer is simply an int most of the time.
People coming from other languages often confuse its concepts.
Distinguishing it from e.g. C++, the emphasis is abstract base class as opposed to abstract base class.
ABCs can be used like Java interfaces, but may still have concrete functionality.
However, it is incredibly useful for generic libraries and abstractions.
Many functions/algorithms do not need explicit types, just their behaviour.
If you just need to look up things by key, dict restricts you to a specific in-memory type. By contrast, collections.abc.Mapping also includes database wrappers, large disk-backed dictionaries, lazy containers, ... - and dict.
It allows expressing partial type constraints.
There is no strict base type implementing iteration. But if you check objects against collections.abc.Iterable, they all work in a for loop.
It allows creating separate, optimised implementations that appear as the same abstract type.
While it is usually not needed for throwaway scripts, I would highly recommend using this for anything that lives beyond a few python releases.
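A minimal sketch of checks against abstract types, using only standard-library ABCs. The helper name sum_reals and the sample data are made up for the example.

```python
import numbers
from collections.abc import Iterable, Mapping
from fractions import Fraction


def sum_reals(items):
    """Add up everything that behaves like a real number."""
    return sum(x for x in items if isinstance(x, numbers.Real))


mixed = [1, 2.5, Fraction(1, 2), "3", None, 1 + 2j]
total = sum_reals(mixed)  # int, float and Fraction count; str, None, complex do not

# Behaviour-based checks for containers work the same way.
is_mapping = isinstance({"a": 1}, Mapping)
gen_is_iterable = isinstance((n for n in range(3)), Iterable)
int_is_iterable = isinstance(5, Iterable)
```

Note that Fraction is accepted without being named anywhere in sum_reals, which is the point of checking against the abstract type.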
Tentative conversion
The idiomatic way of handling types is not to test them, but to assume they are compatible. If you already expect some wrong types in your input, simply skip everything that is not compatible:
try:
    ix = int(x)
except (ValueError, TypeError):
    continue  # not compatible with int, try the next one
else:
    a.append(ix)
This is not actually a type check, but usually serves the same intention.
It guarantees you have the expected type in your output.
It has some limited leeway in converting wrong types, e.g. specialising float to int.
It works without you knowing which types conform to int.
The major downside is that it is an explicit transformation.
You can silently accept "wrong" values, e.g. converting a str containing a literal.
It needlessly converts even types that would be good enough, e.g. float to int when you just need numbers.
Conversion is an effective tool for some specific use cases. It works best if you know roughly what your input is, and must make guarantees about your output.
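Wrapped in a loop, tentative conversion might look like the following sketch. The function name to_ints is invented for the example.

```python
def to_ints(values):
    """Keep every value that int() accepts, converted to int."""
    result = []
    for x in values:
        try:
            ix = int(x)
        except (ValueError, TypeError):
            continue  # not compatible with int, try the next one
        else:
            result.append(ix)
    return result


converted = to_ints([1, "2", 3.7, "x", None, [4]])
```

This shows both downsides mentioned above: the string "2" is silently accepted, and 3.7 is truncated to 3.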
Function dispatch
Sometimes the goal of type checking is just to select an appropriate function. In this case, function dispatch such as functools.singledispatch allows specialising function implementations for specific types:
from functools import singledispatch

@singledispatch
def append_int(value, sequence):
    return

@append_int.register
def _(value: int, sequence):
    sequence.append(value)
This is a combination of isinstance and dict dispatch. It is most useful for larger applications:
It keeps the site of usage small, regardless of the number of dispatched types.
It allows registering specialisations for additional types later, even in other modules.
Still, it doesn't come without its downsides:
Originating in functional and strongly typed languages, many Python programmers are not familiar with single- or even multiple-dispatch.
Dispatches require separate functions, and are therefore not suitable to be defined at the site of usage.
Creating the functions and "warming up" the dispatch cache takes notable runtime overhead. Dispatch functions should be defined once and re-used often.
Even a warmed up dispatch table is slower than a hand-written if/else or dict lookup.
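As a usage sketch, the dispatch function can be defined once and then applied to mixed input. The sample list and the collected variable are assumptions for the example; registering via a type annotation requires Python 3.7 or later.

```python
from functools import singledispatch


@singledispatch
def append_int(value, sequence):
    # Fallback for every unregistered type: ignore the value.
    return


@append_int.register
def _(value: int, sequence):
    sequence.append(value)


collected = []
for item in [1, "2", 3.0, 4, None]:
    append_int(item, collected)

print(sum(collected))
```

Dispatch follows the class hierarchy much like isinstance does, so a bool would also be routed to the int implementation.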
Controlling the input
The best course of action is to ensure you never have to check for type in the first place. This is a bit of a meta-topic, as it depends strongly on the use case.
Here, the source of somelist should never have put non-numbers into it.
You can simply use type and the equality operator, like this:
if type(x) == int:
Let me declare a variable x of type int:
x = 2
if type(x) == type(1) or isinstance(x, int):
    pass  # do something
Both work fine.
Easy - use types (Python 2 only; types.IntType no longer exists in Python 3).
import types
k = 5
if type(k) == types.IntType:
    print "int"
Here's a quick dir(types):
['BooleanType', 'BufferType', 'BuiltinFunctionType', 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', 'GetSetDescriptorType', 'InstanceType', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MemberDescriptorType', 'MethodType', 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRangeType', '__builtins__', '__doc__', '__file__', '__name__', '__package__']
You can use the type function on both sides of the operator. Like this:
if type(x) == type(1):

Efficient arithmetic special methods in Cython

According to the Cython documentation regarding arithmetic special methods (operator overloads), the way they're implemented, I can't rely on self being the object whose special method is being called.
Evidently, this has two consequences:
I can't specify a static type in the method declaration. For example, if I have a class Foo which can only be multiplied by, say, an int, then I can't have def __mul__(self, int op) without seeing TypeErrors (sometimes).
In order to decide what to do, I have to check the types of the operands, presumably using isinstance() to handle subclasses, which seems farcically expensive in an operator.
Is there any good way to handle this while retaining the convenience of operator syntax? My whole reason for switching my classes to Cython extension types is to improve efficiency, but as they rely heavily on the arithmetic methods, based on the above it seems like I'm actually going to make them worse.
If I understand the docs and my test results correctly, you actually can have a fast __mul__(self, int op) on a Foo, but you can only use it as Foo() * 4, not 4 * Foo(). The latter would require an __rmul__, which is not supported, so it always raises TypeError.
The fact that the second argument is typed int means that Cython does the typecheck for you, so you can be sure that the left argument is really self.

Rules of thumb for when to use operator overloading in python

From what I remember from my C++ class, the professor said that operator overloading is cool, but since it takes relatively a lot of thought and code to cover all end-cases (e.g. when overloading + you probably also want to overload ++ and +=, and also make sure to handle end cases like adding an object to itself etc.), you should only consider it in those cases where this feature will have a major impact on your code, like overloading the operators for the matrix class in a math application.
Does the same apply to python? Would you recommend overriding operator behavior in python? And what rules of thumb can you give me?
Operator overloading is mostly useful when you're making a new class that falls into an existing "Abstract Base Class" (ABC) -- indeed, many of the ABCs in standard library module collections rely on the presence of certain special methods (and special methods, ones with names starting and ending with double underscores AKA "dunders", are exactly the way you perform operator overloading in Python). This provides good starting guidance.
For example, a Container class must override special method __contains__, i.e., the membership check operator item in container (as in, if item in container: -- don't confuse with the for statement, for item in container:, which relies on __iter__!-).
Similarly, a Hashable must override __hash__, a Sized must override __len__, a Sequence or a Mapping must override __getitem__, and so forth. (Moreover, the ABCs can provide your class with mixin functionality -- e.g., both Sequence and Mapping can provide __contains__ on the basis of your supplied __getitem__ override, and thereby automatically make your class a Container).
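The mixin behaviour can be seen in a small sketch. Countdown is a made-up class; only __len__ and __getitem__ are written by hand, and it deliberately supports non-negative integer indices only.

```python
from collections.abc import Container, Sequence


class Countdown(Sequence):
    """Read-only sequence start, start-1, ..., 1 (illustration only)."""

    def __init__(self, start):
        self._start = start

    def __len__(self):
        return self._start

    def __getitem__(self, index):
        if not 0 <= index < self._start:
            raise IndexError(index)
        return self._start - index


c = Countdown(3)
print(2 in c)              # __contains__ is supplied by Sequence
print(list(c))             # so is __iter__
print(list(reversed(c)))   # and __reversed__
```

Because Sequence supplies __contains__, the class is also recognised as a Container without any extra work.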
Beyond the collections, you'll want to override special methods (i.e. provide for operator overloading) mostly if your new class "is a number". Other special cases exist, but resist the temptation of overloading operators "just for coolness", with no semantic connection to the "normal" meanings, as C++'s streams do for << and >> and Python strings (in Python 2.*, fortunately not in 3.* any more;-) do for % -- when such operators do not any more mean "bit-shifting" or "division remainder", you're just engendering confusion. A language's standard library can get away with it (though it shouldn't;-), but unless your library gets as widespread as the language's standard one, the confusion will hurt!-)
I've written software with significant amounts of overloading, and lately I regret that policy. I would say this:
Only overload operators if it's the natural, expected thing to do and doesn't have any side effects.
So if you make a new RomanNumeral class, it makes sense to overload addition and subtraction etc. But don't overload it unless it's natural: it makes no sense to define addition and subtraction for a Car or a Vehicle object.
Another rule of thumb: don't overload ==. It makes it very hard (though not impossible) to actually test if two objects are the same. I made this mistake and paid for it for a long time.
As for when to overload +=, ++ etc, I'd actually say: only overload additional operators if you have a lot of demand for that functionality. It's easier to have one way to do something than five. Sure, it means sometimes you'll have to write x = x + 1 instead of x += 1, but more code is ok if it's clearer.
In general, like with many 'fancy' features, it's easy to think that you want something when you don't really, implement a bunch of stuff, not notice the side effects, and then figure it out later. Err on the conservative side.
EDIT: I wanted to add an explanatory note about overloading ==, because it seems various commenters misunderstand this, and it's caught me out. Yes, is exists, but it's a different operation. Say I have an object x, which is either from my custom class, or is an integer. I want to see if x is the number 500. But if you set x = 500, then later test x is 500, you may get False, because CPython only caches small integers (and recent Python versions emit a SyntaxWarning for is with a literal). With 50, it would return True. But you can't use is, because you might want x == 500 to return True if x is an instance of your class. Confusing? Definitely. But this is the kind of detail you need to understand to successfully overload operators.
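A minimal sketch of the situation described: a class whose == compares by value against both its own instances and plain integers. Level is a hypothetical class name.

```python
class Level:
    """Hypothetical value wrapper with an overloaded ==."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Level):
            return self.value == other.value
        if isinstance(other, int):
            return self.value == other
        return NotImplemented

    # Defining __eq__ removes the default __hash__, so restore one
    # that is consistent with equality.
    def __hash__(self):
        return hash(self.value)


a = Level(500)
b = Level(500)

same_value = a == b           # value comparison through __eq__
same_object = a is b          # identity: two separate objects
equals_int = a == 500         # works with a plain int on the right
reflected = 500 == a          # int gives up, Python tries Level.__eq__
```

Once == means "same value", the only remaining way to ask "same object" is the is operator, which is the trade-off the answer warns about.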
Here is an example that uses the bitwise or operation to simulate a unix pipeline. This is intended as a counter example to most of the rules of thumb.
I just found Lumberjack which uses this syntax in real code
# Ported to Python 3 syntax (range and print() instead of xrange and print)
class pipely(object):
    def __init__(self, *args, **kw):
        self._args = args
        self.__dict__.update(kw)  # lets map= / filter= keywords override the methods

    def __ror__(self, other):
        # Called for `iterable | pipely_instance`
        return (self.map(x) for x in other if self.filter(x))

    def map(self, x):
        return x

    def filter(self, x):
        return True

class sieve(pipely):
    def filter(self, x):
        n = self._args[0]
        return x == n or x % n

class strify(pipely):
    def map(self, x):
        return str(x)

class startswith(pipely):
    def filter(self, x):
        n = str(self._args[0])
        if x.startswith(n):
            return x

print("*" * 80)
for i in range(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | strify() | startswith(5):
    print(i)
print("*" * 80)
for i in range(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | startswith(5):
    print(i)
print("*" * 80)
for i in range(2, 100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | pipely(filter=lambda x: x.startswith('5')):
    print(i)
Python's overloading is "safer" in general than C++'s -- for example, the assignment operator can't be overloaded, and += has a sensible default implementation.
In some ways, though, overloading in Python is still as "broken" as in C++. Programmers should restrain the desire to "re-use" an operator for unrelated purposes, such as C++ re-using the bitshifts to perform string formatting and parsing. Don't overload an operator with different semantics from your implementation just to get prettier syntax.
Modern Python style strongly discourages "rogue" overloading, but many aspects of the language and standard library retain poorly-named operators for backwards compatibility. For example:
%: modulus and string formatting
+: addition and sequence concatenation
*: multiplication and sequence repetition
So, rule of thumb? If your operator implementation will surprise people, don't do it.
