Forcing immutability on an object - python

Before I start, I'm already aware that object immutability in Python is often a bad idea, however, I believe that in my case it would be appropriate.
Let's say I'm working with a coordinate system in my code, such that each coordinate uses a struct of X, Y, Z. I've already overloaded subtraction, addition, etc. methods to do what I want. My current problem is the assignment operator, which I've read cannot be overloaded. Problem is when I have the following, I do not want A to point to the same point as B, I want the two to be independent, in case I need to overwrite a coordinate of one but not the other later:
B = Point(1,2,3)
A = B
I'm aware that I can use deepcopy, but that seems like a hack, especially since I could have a list of points that I might need to take a slice of (in which case it would again have a slice of point references, not points). I've also considered using tuples, but my points have member methods I need, and a very large portion of my code already uses the structs.
My idea was to modify Point to be immutable, since it's really only 3 floats of data, and from doing some research _new _() seems like the right function to overwrite for this. I'm not sure how to achieve this though, would it be something like this or am I way off?
def __new__(self):
return Point(self.x, self.y, self.z)
EDIT:
My bad, I realized after reading katrielalex's post that I can't modify a parameter of immutable object once it has been defined, in which case it's not a problem that both A and B point to the same data since a reassignment would require creation of a new point. I'd say that katrielalex's and vonPetrushev's posts achieve what I want, I think I'll go with vonPetrushev's solution since I don't need to rewrite all my current code to use tuples (the extra set of parentheses and not being able to reference coordinates as point.x)

In conjunction with katrielalex's suggestion, making the Point a named tuple would be good as well. Here I've just replaced the tuple parent with namedtuple('Point', 'x y z') - and that's enough for it to work.
>>> from collections import namedtuple
>>> class Point(namedtuple('Point', 'x y z')):
... def __add__(self, other):
... return Point((i + j for i, j in zip(self, other)))
...
... def __mul__(self, other):
... return sum(i * j for i, j in zip(self, other))
...
... def __sub__(self, other):
... return Point((i - j for i, j in zip(self, other)))
...
... #property
... def mod(self):
... from math import sqrt
... return sqrt(sum(i*i for i in self))
...
Then you can have:
>>> Point(1, 2, 3)
Point(x=1, y=2, z=3)
>>> Point(x=1, y=2, z=3).mod
3.7416573867739413
>>> Point(x=1, y=2, z=3) * Point(0, 0, 1)
3
>>> Point._make((1, 2, 3))
Point(x=1, y=2, z=3)
(Thanks to katrielalex for suggesting to extend the namedtuple rather than copying the code produced.)

You can make Point a subclass of tuple -- remember, the built-in types (at least in recent Pythons) are just more classes. This will give you the desired immutability.
However, I'm slightly confused about your suggested use case:
in case I need to overwrite a coordinate of one but not the other later:
That doesn't make sense if Points are immutable...
>>> class Point(tuple):
... def __add__(self, other):
... return Point((i + j for i, j in zip(self, other)))
...
... def __mul__(self, other):
... return sum(i * j for i, j in zip(self, other))
...
... def __sub__(self, other):
... return Point((i - j for i, j in zip(self, other)))
...
... #property
... def mod(self):
... from math import sqrt
... return sqrt(sum(i*i for i in self))
...
>>> a = Point((1,2,3))
>>> b = Point((4,5,6))
>>> a + b
(5, 7, 9)
>>> b - a
(3, 3, 3)
>>> a * b
32
>>> a.mod
3.7416573867739413
>>> a[0] = 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Point' object does not support item assignment

try this:
class Point(object):
def __init__(self, x, y, z):
self._x=x
self._y=y
self._z=z
def __getattr__(self, key):
try:
key={'x':'_x','y':'_y','z':'_z'}[key]
except KeyError:
raise AttributeError
else:
return self.__dict__[key]
def __setattr__(self, key, value):
if key in ['_x','_y','_z']:
object.__setattr__(self, key, value)
else:
raise TypeError("'Point' object does not support item assignment")
So, you can construct a Point object, but not change its attributes.

Related

Python 2.7 - clean syntax for lvalue modification

It is very common to have struct-like types that are not expected to be modified by distant copyholders.
A string is a basic example, but that's an easy case because it's excusably immutable -- Python is unusual in even allowing things like method calls on literal strings.
The problem is that (in most languages) we frequently have things like, say an (x,y) Point class. We occasionally want to change x and y independently. I.e., from a usage perspective, a Point LVALUE should be mutable (even though copies will not see the mutation).
But Python 2.7 doesn't seem to provide any options to enable automatic copy-on-assignment. So that means we actually MUST make our Point class IMMUTABLE because inadvertent references are going to get created all over the place (typically because somebody forgot to clone the object before passing it to somebody else).
And no, I'm not interested in the countless hacks that allow the object to be mutated only "while it's being created", as that is a weak concept that does not scale.
The logical conclusion of these circumstances is that we need our mutation methods to actually modify the LVALUE. For example %= supports that. The problem is that it would be much better to have a more reasonable syntax, like using __setattr__ and/or defining set_x and set_y methods, as shown below.
class Point(object):
# Python doesn't have copy-on-assignment, so we must use an immutable
# object to avoid unintended changes by distant copyholders.
def __init__(self, x, y, others=None):
object.__setattr__(self, 'x', x)
object.__setattr__(self, 'y', y)
def __setattr__(self, name, value):
self %= (name, value)
return self # SHOULD modify lvalue (didn't work)
def __repr__(self):
return "(%d %d)" % (self.x, self.y)
def copy(self, x=None, y=None):
if x == None: x = self.x
if y == None: y = self.y
return Point(x, y)
def __eq__ (a,b): return a.x == b.x and a.y == b.y
def __ne__ (a,b): return a.x != b.x or a.y != b.y
def __add__(a,b): return Point(a.x+b.x, a.y+b.y)
def __sub__(a,b): return Point(a.x-b.x, a.y-b.y)
def set_x(a,b): return a.copy(x=b) # SHOULD modify lvalue (didn't work)
def set_y(a,b): return a.copy(y=b) # SHOULD modify lvalue (didn't work)
# This works in Python 2.7. But the syntax is awful.
def __imod__(a,b):
if b[0] == 'x': return a.copy(x=b[1])
elif b[0] == 'y': return a.copy(y=b[1])
else: raise AttributeError, \
"Point has no member '%s'" % b[0]
my_very_long_and_complicated_lvalue_expression = [Point(10,10)] * 4
# modify element 0 via "+=" -- OK
my_very_long_and_complicated_lvalue_expression[0] += Point(1,-1)
# modify element 1 via normal "__set_attr__" -- NOT OK
my_very_long_and_complicated_lvalue_expression[1].x = 9999
# modify element 2 via normal "set_x" -- NOT OK
my_very_long_and_complicated_lvalue_expression[2].set_x(99)
# modify element 3 via goofy "set_x" -- OK
my_very_long_and_complicated_lvalue_expression[3] %='x', 999
print my_very_long_and_complicated_lvalue_expression
The result is:
[(11 9), (10 10), (10 10), (999 10)]
As you can see, += and %= work just fine, but just about anything else doesn't seem to work. Surely the language inventors have created a basic syntax for LVALUE modification that is not limited to goofy-looking operators. I just can't seem to find it. Please help.
In Python, the typical pattern is to copy before modification rather than copying on assignment. You could implement some kind of data store with the semantics you want, but it seems like a lot of work.
I feel like we've given the search for pre-existing solutions its due diligence. Given that "<=" is assignment in some languages (e.g., Verilog) we can quite intuitively introduce:
value_struct_instance<<='field', value
as the Pythonic form of
value_struct_instance.field = value
Here is an updated example for instructive purposes:
# Python doesn't support copy-on-assignment, so we must use an
# immutable object to avoid unintended changes by distant copyholders.
# As a consequence, the lvalue must be changed on a field update.
#
# Currently the only known syntax for updating a field on such an
# object is:
#
# value_struct_instance<<='field', value
#
# https://stackoverflow.com/questions/45788271/
class Point(object):
def __init__(self, x, y, others=None):
object.__setattr__(self, 'x', x)
object.__setattr__(self, 'y', y)
def __setattr__(self, name, value):
raise AttributeError, \
"Use \"point<<='%s', ...\" instead of \"point.%s = ...\"" \
% (name, name)
def __repr__(self):
return "(%d %d)" % (self.x, self.y)
def copy(self, x=None, y=None):
if x == None: x = self.x
if y == None: y = self.y
return Point(x, y)
def __ilshift__(a,b):
if b[0] == 'x': return a.copy(x=b[1])
elif b[0] == 'y': return a.copy(y=b[1])
else: raise AttributeError, \
"Point has no member '%s'" % b[0]
def __eq__ (a,b): return a.x == b.x and a.y == b.y
def __ne__ (a,b): return a.x != b.x or a.y != b.y
def __add__(a,b): return Point(a.x+b.x, a.y+b.y)
def __sub__(a,b): return Point(a.x-b.x, a.y-b.y)
my_very_long_and_complicated_lvalue_expression = [Point(10,10)] * 3
# modify element 0 via "+="
my_very_long_and_complicated_lvalue_expression[0] += Point(1,-1)
# modify element 1 via "<<='field'," (NEW IDIOM)
my_very_long_and_complicated_lvalue_expression[1]<<='x', 15
print my_very_long_and_complicated_lvalue_expression
# result:
# [(11 9), (15 10), (10 10)]
my_very_long_and_complicated_lvalue_expression[1]<<='y', 25
print my_very_long_and_complicated_lvalue_expression
# result:
# [(11 9), (15 25), (10 10)]
# Attempt to modify element 2 via ".field="
my_very_long_and_complicated_lvalue_expression[2].y = 25
# result:
# AttributeError: Use "point<<='y', ..." instead of "point.y = ..."

Existence of mutable named tuple in Python?

Can anyone amend namedtuple or provide an alternative class so that it works for mutable objects?
Primarily for readability, I would like something similar to namedtuple that does this:
from Camelot import namedgroup
Point = namedgroup('Point', ['x', 'y'])
p = Point(0, 0)
p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
Point(x=100, y=0)
It must be possible to pickle the resulting object. And per the characteristics of named tuple, the ordering of the output when represented must match the order of the parameter list when constructing the object.
There is a mutable alternative to collections.namedtuple – recordclass.
It can be installed from PyPI:
pip3 install recordclass
It has the same API and memory footprint as namedtuple and it supports assignments (It should be faster as well). For example:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
recordclass (since 0.5) support typehints:
from recordclass import recordclass, RecordClass
class Point(RecordClass):
x: int
y: int
>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
There is a more complete example (it also includes performance comparisons).
Recordclass library now provides another variant -- recordclass.make_dataclass factory function. It support dataclasses-like API (there are module level functions update, make, replace instead of self._update, self._replace, self._asdict, cls._make methods).
from recordclass import dataobject, make_dataclass
Point = make_dataclass('Point', [('x', int), ('y',int)])
Point = make_dataclass('Point', {'x':int, 'y':int})
class Point(dataobject):
x: int
y: int
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> p.x = 10; p.y += 3; print(p)
Point(x=10, y=5)
recordclass and make_dataclass can produce classes, whose instances occupy less memory than __slots__-based instances. This can be important for the instances with attribute values, which has not intended to have reference cycles. It may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.
types.SimpleNamespace was introduced in Python 3.3 and supports the requested requirements.
from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)
namespace(foo='bar', ham='spam')
print(t.foo)
'bar'
import pickle
with open('/tmp/pickle', 'wb') as f:
pickle.dump(t, f)
As a Pythonic alternative for this task, since Python-3.7, you can use
dataclasses module that not only behaves like a mutable NamedTuple, because they use normal class definitions, they also support other class features.
From PEP-0557:
Although they use a very different mechanism, Data Classes can be thought of as "mutable namedtuples with defaults". Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.
A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, "Syntax for Variable Annotations". In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there's really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.
This feature is introduced in PEP-0557 that you can read about it in more details on provided documentation link.
Example:
In [20]: from dataclasses import dataclass
In [21]: #dataclass
...: class InventoryItem:
...: '''Class for keeping track of an item in inventory.'''
...: name: str
...: unit_price: float
...: quantity_on_hand: int = 0
...:
...: def total_cost(self) -> float:
...: return self.unit_price * self.quantity_on_hand
...:
Demo:
In [23]: II = InventoryItem('bisc', 2000)
In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)
In [25]: II.name = 'choco'
In [26]: II.name
Out[26]: 'choco'
In [27]:
In [27]: II.unit_price *= 3
In [28]: II.unit_price
Out[28]: 6000
In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
The latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation whereas the recordclass is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.
Your tests (but also see the note below):
from __future__ import print_function
import pickle
import sys
from namedlist import namedlist
Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)
print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))
print('2. String')
print('p: {}\n'.format(p))
print('3. Representation')
print(repr(p), '\n')
print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')
print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))
print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))
print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))
print('8. Iteration')
print('p: {}\n'.format([v for v in p]))
print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))
print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))
print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')
print('12. Fields\n')
print('p: {}\n'.format(p._fields))
print('13. Slots')
print('p: {}\n'.format(p.__slots__))
Output on Python 2.7
1. Mutation of field values
p: 10, 12
2. String
p: Point(x=10, y=12)
3. Representation
Point(x=10, y=12)
4. Sizeof
size of p: 64
5. Access by name of field
p: 10, 12
6. Access by index
p: 10, 12
7. Iterative unpacking
p: 10, 12
8. Iteration
p: [10, 12]
9. Ordered Dict
p: OrderedDict([('x', 10), ('y', 12)])
10. Inplace replacement (update?)
p: Point(x=100, y=200)
11. Pickle and Unpickle
Pickled successfully
12. Fields
p: ('x', 'y')
13. Slots
p: ('x', 'y')
The only difference with Python 3.5 is that the namedlist has become smaller, the size is 56 (Python 2.7 reports 64).
Note that I have changed your test 10 for in-place replacement. The namedlist has a _replace() method which does a shallow copy, and that makes perfect sense to me because the namedtuple in the standard library behaves the same way. Changing the semantics of the _replace() method would be confusing. In my opinion the _update() method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?
It seems like the answer to this question is no.
Below is pretty close, but it's not technically mutable. This is creating a new namedtuple() instance with an updated x value:
Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)
On the other hand, you can create a simple class using __slots__ that should work well for frequently updating class instance attributes:
class Point:
__slots__ = ['x', 'y']
def __init__(self, x, y):
self.x = x
self.y = y
To add to this answer, I think __slots__ is good use here because it's memory efficient when you create lots of class instances. The only downside is that you can't create new class attributes.
Here's one relevant thread that illustrates the memory efficiency - Dictionary vs Object - which is more efficient and why?
The quoted content in the answer of this thread is a very succinct explanation why __slots__ is more memory efficient - Python slots
The following is a good solution for Python 3: A minimal class using __slots__ and Sequence abstract base class; does not do fancy error detection or such, but it works, and behaves mostly like a mutable tuple (except for typecheck).
from collections import Sequence
class NamedMutableSequence(Sequence):
__slots__ = ()
def __init__(self, *a, **kw):
slots = self.__slots__
for k in slots:
setattr(self, k, kw.get(k))
if a:
for k, v in zip(slots, a):
setattr(self, k, v)
def __str__(self):
clsname = self.__class__.__name__
values = ', '.join('%s=%r' % (k, getattr(self, k))
for k in self.__slots__)
return '%s(%s)' % (clsname, values)
__repr__ = __str__
def __getitem__(self, item):
return getattr(self, self.__slots__[item])
def __setitem__(self, item, value):
return setattr(self, self.__slots__[item], value)
def __len__(self):
return len(self.__slots__)
class Point(NamedMutableSequence):
__slots__ = ('x', 'y')
Example:
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
If you want, you can have a method to create the class too (though using an explicit class is more transparent):
def namedgroup(name, members):
if isinstance(members, str):
members = members.split()
members = tuple(members)
return type(name, (NamedMutableSequence,), {'__slots__': members})
Example:
>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
In Python 2 you need to adjust it slightly - if you inherit from Sequence, the class will have a __dict__ and the __slots__ will stop from working.
The solution in Python 2 is to not inherit from Sequence, but object. If isinstance(Point, Sequence) == True is desired, you need to register the NamedMutableSequence as a base class to Sequence:
Sequence.register(NamedMutableSequence)
Tuples are by definition immutable.
You can however make a dictionary subclass where you can access the attributes with dot-notation;
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
: def __getattr__(self, name):
: return self[name]
:
: def __setattr__(self, name, value):
: self[name] = value
:--
In [2]: test = AttrDict()
In [3]: test.a = 1
In [4]: test.b = True
In [5]: test
Out[5]: {'a': 1, 'b': True}
If you want similar behavior as namedtuples but mutable try namedlist
Note that in order to be mutable it cannot be a tuple.
Let's implement this with dynamic type creation:
import copy
def namedgroup(typename, fieldnames):
def init(self, **kwargs):
attrs = {k: None for k in self._attrs_}
for k in kwargs:
if k in self._attrs_:
attrs[k] = kwargs[k]
else:
raise AttributeError('Invalid Field')
self.__dict__.update(attrs)
def getattribute(self, attr):
if attr.startswith("_") or attr in self._attrs_:
return object.__getattribute__(self, attr)
else:
raise AttributeError('Invalid Field')
def setattr(self, attr, value):
if attr in self._attrs_:
object.__setattr__(self, attr, value)
else:
raise AttributeError('Invalid Field')
def rep(self):
d = ["{}={}".format(v,self.__dict__[v]) for v in self._attrs_]
return self._typename_ + '(' + ', '.join(d) + ')'
def iterate(self):
for x in self._attrs_:
yield self.__dict__[x]
raise StopIteration()
def setitem(self, *args, **kwargs):
return self.__dict__.__setitem__(*args, **kwargs)
def getitem(self, *args, **kwargs):
return self.__dict__.__getitem__(*args, **kwargs)
attrs = {"__init__": init,
"__setattr__": setattr,
"__getattribute__": getattribute,
"_attrs_": copy.deepcopy(fieldnames),
"_typename_": str(typename),
"__str__": rep,
"__repr__": rep,
"__len__": lambda self: len(fieldnames),
"__iter__": iterate,
"__setitem__": setitem,
"__getitem__": getitem,
}
return type(typename, (object,), attrs)
This checks the attributes to see if they are valid before allowing the operation to continue.
So is this pickleable? Yes if (and only if) you do the following:
>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True
The definition has to be in your namespace, and must exist long enough for pickle to find it. So if you define this to be in your package, it should work.
Point = namedgroup("Point", ["x", "y"])
Pickle will fail if you do the following, or make the definition temporary (goes out of scope when the function ends, say):
some_point = namedgroup("Point", ["x", "y"])
And yes, it does preserve the order of the fields listed in the type creation.
I can't believe nobody's said this before, but it seems to me Python just wants you to write your own simple, mutable class instead of using a namedtuple whenever you need the "namedtuple" to be mutable.
Quick summary
Just jump straight down to Approach 5 below. It's short and to-the-point, and by far the best of these options.
Various, detailed approaches:
Approach 1 (good): simple, callable class with __call__()
Here is an example of a simple Point object for (x, y) points:
class Point():
def __init__(self, x, y):
self.x = x
self.y = y
def __call__(self):
"""
Make `Point` objects callable. Print their contents when they
are called.
"""
print("Point(x={}, y={})".format(self.x, self.y))
Now use it:
p1 = Point(1,2)
p1()
p1.x = 7
p1()
p1.y = 8
p1()
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
This is pretty similar to a namedtuple, except it is fully mutable, unlike a namedtuple. Also, a namedtuple isn't callable, so to see its contents, just type the object instance name withOUT parenthesis after it (as p2 in the example below, instead of as p2()). See this example and output here:
>>> from collections import namedtuple
>>> Point2 = namedtuple("Point2", ["x", "y"])
>>> p2 = Point2(1, 2)
>>> p2
Point2(x=1, y=2)
>>> p2()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Point2' object is not callable
>>> p2.x = 7
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
Approach 2 (better): use __repr__() in place of __call__()
I just learned you can use __repr__() in place of __call__(), to get more namedtuple-like behavior. Defining the __repr__() method allows you to define "the 'official' string representation of an object" (see the official documentation here). Now, just calling p1 is the equivalent of calling the __repr__() method, and you get identical behavior to the namedtuple. Here is the new class:
class Point():
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
"""
Obtain the string representation of `Point`, so that just typing
the instance name of an object of this type will call this method
and obtain this string, just like `namedtuple` already does!
"""
return "Point(x={}, y={})".format(self.x, self.y)
Now use it:
p1 = Point(1,2)
p1
p1.x = 7
p1
p1.y = 8
p1
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
Approach 3 (better still, but a little awkward to use): make it a callable which returns an (x, y) tuple
The original poster (OP) would also like something like this to work (see his comment below my answer):
x, y = Point(x=1, y=2)
Well, for simplicity, let's just make this work instead:
x, y = Point(x=1, y=2)()
# OR
p1 = Point(x=1, y=2)
x, y = p1()
While we are at it, let's also condense this:
self.x = x
self.y = y
...into this (source where I first saw this):
self.x, self.y = x, y
Here is the class definition for all of the above:
class Point():
def __init__(self, x, y):
self.x, self.y = x, y
def __repr__(self):
"""
Obtain the string representation of `Point`, so that just typing
the instance name of an object of this type will call this method
and obtain this string, just like `namedtuple` already does!
"""
return "Point(x={}, y={})".format(self.x, self.y)
def __call__(self):
"""
Make the object callable. Return a tuple of the x and y components
of the Point.
"""
return self.x, self.y
Here are some test calls:
p1 = Point(1,2)
p1
p1.x = 7
x, y = p1()
x2, y2 = Point(10, 12)()
x
y
x2
y2
I won't show pasting the class definition into the interpreter this time, but here are those calls with their output:
>>> p1 = Point(1,2)
>>> p1
Point(x=1, y=2)
>>> p1.x = 7
>>> x, y = p1()
>>> x2, y2 = Point(10, 12)()
>>> x
7
>>> y
2
>>> x2
10
>>> y2
12
Approach 4 (best so far, but a lot more code to write): make the class also an iterator
By making this into an iterator class, we can get this behavior:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Let's get rid of the __call__() method, but to make this class an iterator we will add the __iter__() and __next__() methods. Read more about these things here:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Here is the solution:
class Point():
def __init__(self, x, y):
self.x, self.y = x, y
self._iterator_index = 0
self._num_items = 2 # counting self.x and self.y
def __repr__(self):
"""
Obtain the string representation of `Point`, so that just typing
the instance name of an object of this type will call this method
and obtain this string, just like `namedtuple` already does!
"""
return "Point(x={}, y={})".format(self.x, self.y)
def __iter__(self):
return self
def __next__(self):
self._iterator_index += 1
if self._iterator_index == 1:
return self.x
elif self._iterator_index == 2:
return self.y
else:
raise StopIteration
And here are some test calls and their output:
>>> x, y = Point(x=1, y=2)
>>> x
1
>>> y
2
>>> x, y = Point(3, 4)
>>> x
3
>>> y
4
>>> p1 = Point(5, 6)
>>> x, y = p1
>>> x
5
>>> y
6
>>> p1
Point(x=5, y=6)
Approach 5 (USE THIS ONE) (Perfect!--best and cleanest/shortest approach): make the class an iterable, with the yield generator keyword
Study these references:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
What does the "yield" keyword do?
Here is the solution. It relies on a fancy "iterable-generator" (AKA: just "generator") keyword/Python mechanism, called yield.
Basically, the first time an iterable calls for the next item, it calls the __iter__() method, and stops and returns the contents of the first yield call (self.x in the code below). The next time an iterable calls for the next item, it picks up where it last left off (just after the first yield in this case), and looks for the next yield, stopping and returning the contents of that yield call (self.y in the code below). Each "return" from a yield actually returns a "generator" object, which is an iterable itself, so you can iterate on it. Each new iterable call for the next item continues this process, starting up where it last left off, just after the most-recently-called yield, until no more yield calls exist, at which point the iterations are ended and the iterable has been fully iterated. Therefore, once this iterable has called for two objects, both yield calls have been used up, so the iterator ends. The end result is that calls like this work perfectly, just as they did in Approach 4, but with far less code to write!:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Here is the solution (a part of this solution can also be found in the treyhunner.com reference just above). Notice how short and clean this solution is!
Just the class definition code; no docstrings, so you can truly see how short and simple this is:
class Point():
def __init__(self, x, y):
self.x, self.y = x, y
def __repr__(self):
return "Point(x={}, y={})".format(self.x, self.y)
def __iter__(self):
yield self.x
yield self.y
With descriptive docstrings:
class Point():
def __init__(self, x, y):
self.x, self.y = x, y
def __repr__(self):
"""
Obtain the string representation of `Point`, so that just typing
the instance name of an object of this type will call this method
and obtain this string, just like `namedtuple` already does!
"""
return "Point(x={}, y={})".format(self.x, self.y)
def __iter__(self):
"""
Make this `Point` class an iterable. When used as an iterable, it will
now return `self.x` and `self.y` as the two elements of a list-like,
iterable object, "generated" by the usages of the `yield` "generator"
keyword.
"""
yield self.x
yield self.y
Copy and paste the exact same test code as used in the previous approach (Approach 4) just above, and you will get the exact same output as above as well!
References:
https://docs.python.org/3/library/collections.html#collections.namedtuple
Approach 1:
What is the difference between __init__ and __call__?
Approach 2:
https://www.tutorialspoint.com/What-does-the-repr-function-do-in-Python-Object-Oriented-Programming
Purpose of __repr__ method?
https://docs.python.org/3/reference/datamodel.html#object.__repr__
Approach 4:
*****[EXCELLENT!] https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Approach 5:
See links from Approach 4, plus:
*****[EXCELLENT!] What does the "yield" keyword do?
What is the meaning of single and double underscore before an object name?
Provided performance is of little importance, one could use a silly hack like:
from collection import namedtuple
Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
If you want to be able to create classes "on-site", I find the following very convenient:
class Struct:
def __init__(self, **kw):
self.__dict__.update(**kw)
That allows me to write:
p = Struct(x=0, y=0)
P.x = 10
stats = Struct(count=0, total=0.0)
stats.count += 1
The most elegant way I can think of doesn't require a 3rd party library and lets you create a quick mock class constructor with default member variables without dataclasses cumbersome type specification. So it's better for roughing out some code:
# copy-paste 3 lines:
from inspect import getargvalues, stack
from types import SimpleNamespace
def DefaultableNS(): return SimpleNamespace(**getargvalues(stack()[1].frame)[3])
# then you can make classes with default fields on the fly in one line, eg:
def Node(value,left=None,right=None): return DefaultableNS()
node=Node(123)
print(node)
#[stdout] namespace(value=123, left=None, right=None)
print(node.value,node.left,node.right) # all fields exist
A plain SimpleNamespace is clumsier, it breaks DRY:
def Node(value,left=None,right=None):
return SimpleNamespace(value=value,left=left,right=right)
# breaks DRY as you need to repeat the argument names twice
I will share my solution to this question. I needed a way to save attributes in the case that my program crashed or was stopped for some reason so that it would know where where in a list of inputs to resume from. Based on #GabrielStaples's answer:
import pickle, json
class ScanSession:
def __init__(self, input_file: str = None, output_file: str = None,
total_viable_wallets: int = 0, total: float = 0,
report_dict: dict = {}, wallet_addresses: list = [],
token_map: list = [], token_map_file: str = 'data/token.maps.json',
current_batch: int = 0):
self.initialized = time.time()
self.input_file = input_file
self.output_file = output_file
self.total_viable_wallets = total_viable_wallets
self.total = total
self.report_dict = report_dict
self.wallet_addresses = wallet_addresses
self.token_map = token_map
self.token_map_file = token_map_file
self.current_batch = current_batch
#property
def __dict__(self):
"""
Obtain the string representation of `Point`, so that just typing
the instance name of an object of this type will call this method
and obtain this string, just like `namedtuple` already does!
"""
return {'initialized': self.initialized, 'input_file': self.input_file,
'output_file': self.output_file, 'total_viable_wallets': self.total_viable_wallets,
'total': self.total, 'report_dict': self.report_dict,
'wallet_addresses': self.wallet_addresses, 'token_map': self.token_map,
'token_map_file':self.token_map_file, 'current_batch': self.current_batch
}
def load_session(self, session_file):
with open(session_file, 'r') as f:
_session = json.loads(json.dumps(f.read()))
_session = dict(_session)
for key, value in _session.items():
setattr(self, key, value)
def dump_session(self, session_file):
with open(session_file, 'w') as f:
json.dump(self.__dict__, fp=f)
Using it:
session = ScanSession()
session.total += 1
session.__dict__
{'initialized': 1670801774.8050613, 'input_file': None, 'output_file': None, 'total_viable_wallets': 0, 'total': 10, 'report_dict': {}, 'wallet_addresses': [], 'token_map': [], 'token_map_file': 'data/token.maps.json', 'current_batch': 0}
pickle.dumps(session)
b'\x80\x04\x95\xe8\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0bScanSession\x94\x93\x94)\x81\x94}\x94(\x8c\x0binitialized\x94GA\xd8\xe5\x9a[\xb3\x86 \x8c\ninput_file\x94N\x8c\x0boutput_file\x94N\x8c\x14total_viable_wallets\x94K\x00\x8c\x05total\x94K\n\x8c\x0breport_dict\x94}\x94\x8c\x10wallet_addresses\x94]\x94\x8c\ttoken_map\x94]\x94\x8c\x0etoken_map_file\x94\x8c\x14data/token.maps.json\x94\x8c\rcurrent_batch\x94K\x00ub.'

Using Python tuples as vectors

I need to represent immutable vectors in Python ("vectors" as in linear algebra, not as in programming). The tuple seems like an obvious choice.
The trouble is when I need to implement things like addition and scalar multiplication. If a and b are vectors, and c is a number, the best I can think of is this:
tuple(map(lambda x,y: x + y, a, b)) # add vectors 'a' and 'b'
tuple(map(lambda x: x * c, a)) # multiply vector 'a' by scalar 'c'
which seems inelegant; there should be a clearer, simpler way to get this done -- not to mention avoiding the call to tuple, since map returns a list.
Is there a better option?
NumPy supports various algebraic operations with its arrays.
Immutable types are pretty rare in Python and third-party extensions thereof; the OP rightly claims "there are enough uses for linear algebra that it doesn't seem likely I have to roll my own" -- but all the existing types I know that do linear algebra are mutable! So, as the OP is adamant on immutability, there is nothing for it but the roll-your-own route.
Not that there's all that much rolling involved, e.g. if you specifically need 2-d vectors:
import math
class ImmutableVector(object):
__slots__ = ('_d',)
def __init__(self, x, y):
object.__setattr__(self, _d, (x, y))
def __setattr__(self, n, v):
raise ValueError("Can't alter instance of %s" % type(self))
#property
def x(self):
return self._d[0]
#property
def y(self):
return self._d[1]
def __eq__(self, other):
return self._d == other._d
def __ne__(self, other):
return self._d != other._d
def __hash__(self):
return hash(self._d)
def __add__(self, other):
return type(self)(self.x+other.x, self.y+other.y)
def __mul__(self, scalar):
return type(self)(self.x*scalar, self.y*scalar)
def __repr__(self):
return '%s(%s, %s)' % (type(self).__name__, self.x, self.y)
def __abs__(self):
return math.hypot(self.x, self.y)
I "threw in for free" a few extras such as .x and .y R/O properties, nice string representation, usability in sets or as keys in dicts (why else would one want immutability?-), low memory footprint, abs(v) to give v's vector-length -- I'm sure you can think of other "wouldn't-it-be-cool-if" methods and operators, depending on your application field, and they'll be just as easy. If you need other dimensionalities it won't be much harder, though a tad less readable since the .x, .y notation doesn't apply any more;-) (but I'd use genexps, not map).
By inheriting from tuple, you can make a nice Vector class pretty easily. Here's enough code to provide addition of vectors, and multiplication of a vector by a scalar. It gives you arbitrary length vectors, and can work with complex numbers, ints, or floats.
class Vector(tuple):
def __add__(self, a):
# TODO: check lengths are compatable.
return Vector(x + y for x, y in zip(self, a))
def __mul__(self, c):
return Vector(x * c for x in self)
def __rmul__(self, c):
return Vector(c * x for x in self)
a = Vector((1, 2, 3))
b = Vector((2, 3, 4))
print a + b
print 3 * a
print a * 3
Although using a library like NumPy seems to be the resolution for the OP, I think there is still some value in a simple solution which does not require additional libraries and which you can stay immutable, with iterables.
Using the itertools and operators modules:
imap(add, a, b) # returns iterable to sum of a and b vectors
This implementation is simple. It does not use lambda neither any list-tuple conversion as it is iterator based.
from itertools import imap
from operator import add
vec1 = (1, 2, 3)
vec2 = (10, 20, 30)
result = imap(add, vec1, vec2)
print(tuple(result))
Yields:
(11, 22, 33)
Why not create your own class, making use of 2 Cartesian point member variables? (sorry if the syntax is a little off, my python is rusty)
class point:
def __init__(self,x,y):
self.x=x
self.y=y
#etc
def add(self,p):
return point(self.x + p.x, self.y + p.y)
class vector:
def __init__(self,a,b):
self.pointA=a
self.pointB=b
#etc
def add(self,v):
return vector(self.pointA + v.pointA, self.pointB + v.pointB)
For occasional use, a Python 3 solution without repeating lambdas is possible via using the standard operator package:
from operator import add, mul
a = (1, 2, 3)
b = (4, 5, 6)
print(tuple(map(add, a , b)))
print(tuple(map(mul, a , b)))
which prints:
(5, 7, 9)
(4, 10, 18)
For serious linear algebra computations using numpy vectors is the canonical solution:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a+b)
print(a*b)
which prints:
[5 7 9]
[ 4 10 18]
Since pretty much all of the sequence manipulation functions return lists, that's pretty much what you're going to have to do.

Possible to use more than one argument on __getitem__?

I am trying to use
__getitem__(self, x, y):
on my Matrix class, but it seems to me it doesn't work (I still don't know very well to use python).
I'm calling it like this:
print matrix[0,0]
Is it possible at all to use more than one argument? Thanks. Maybe I can use only one argument but pass it as a tuple?
__getitem__ only accepts one argument (other than self), so you get passed a tuple.
You can do this:
class matrix:
def __getitem__(self, pos):
x,y = pos
return "fetching %s, %s" % (x, y)
m = matrix()
print m[1,2]
outputs
fetching 1, 2
See the documentation for object.__getitem__ for more information.
Indeed, when you execute bla[x,y], you're calling type(bla).__getitem__(bla, (x, y)) -- Python automatically forms the tuple for you and passes it on to __getitem__ as the second argument (the first one being its self). There's no good way[1] to express that __getitem__ wants more arguments, but also no need to.
[1] In Python 2.* you can actually give __getitem__ an auto-unpacking signature which will raise ValueError or TypeError when you're indexing with too many or too few indices...:
>>> class X(object):
... def __getitem__(self, (x, y)): return x, y
...
>>> x = X()
>>> x[23, 45]
(23, 45)
Whether that's "a good way" is moot... it's been deprecated in Python 3 so you can infer that Guido didn't consider it good upon long reflection;-). Doing your own unpacking (of a single argument in the signature) is no big deal and lets you provide clearer errors (and uniform ones, rather than ones of different types for the very similar error of indexing such an instance with 1 vs, say, 3 indices;-).
No, __getitem__ just takes one argument (in addition to self). In the case of matrix[0, 0], the argument is the tuple (0, 0).
You can directly call __getitem__ instead of using brackets.
Example:
class Foo():
def __init__(self):
self.a = [5, 7, 9]
def __getitem__(self, i, plus_one=False):
if plus_one:
i += 1
return self.a[I]
foo = Foo()
foo[0] # 5
foo.__getitem__(0) # 5
foo.__getitem__(0, True) # 7
I learned today that you can pass double index to your object that implements getitem, as the following snippet illustrates:
class MyClass:
def __init__(self):
self.data = [[1]]
def __getitem__(self, index):
return self.data[index]
c = MyClass()
print(c[0][0])

Overriding "+=" in Python? (__iadd__() method)

Is it possible to override += in Python?
Yes, override the __iadd__ method. Example:
def __iadd__(self, other):
self.number += other.number
return self
In addition to what's correctly given in answers above, it is worth explicitly clarifying that when __iadd__ is overriden, the x += y operation does NOT end with the end of __iadd__ method.
Instead, it ends with x = x.__iadd__(y). In other words, Python assigns the return value of your __iadd__ implementation to the object you're "adding to", AFTER the implementation completes.
This means it is possible to mutate the left side of the x += y operation so that the final implicit step fails. Consider what can happen when you are adding to something that's within a list:
>>> x[1] += y # x has two items
Now, if your __iadd__ implementation (a method of an object at x[1]) erroneously or on purpose removes the first item (x[0]) from the beginning of the list, Python will then run your __iadd__ method) & try to assign its return value to x[1]. Which will no longer exist (it will be at x[0]), resulting in an ÌndexError.
Or, if your __iadd__ inserts something to beginning of x of the above example, your object will be at x[2], not x[1], and whatever was earlier at x[0] will now be at x[1]and be assigned the return value of the __iadd__ invocation.
Unless one understands what's happening, resulting bugs can be a nightmare to fix.
In addition to overloading __iadd__ (remember to return self!), you can also fallback on __add__, as x += y will work like x = x + y. (This is one of the pitfalls of the += operator.)
>>> class A(object):
... def __init__(self, x):
... self.x = x
... def __add__(self, other):
... return A(self.x + other.x)
>>> a = A(42)
>>> b = A(3)
>>> print a.x, b.x
42 3
>>> old_id = id(a)
>>> a += b
>>> print a.x
45
>>> print old_id == id(a)
False
It even trips up experts:
class Resource(object):
class_counter = 0
def __init__(self):
self.id = self.class_counter
self.class_counter += 1
x = Resource()
y = Resource()
What values do you expect x.id, y.id, and Resource.class_counter to have?
http://docs.python.org/reference/datamodel.html#emulating-numeric-types
For instance, to execute the statement
x += y, where x is an instance of a
class that has an __iadd__() method,
x.__iadd__(y) is called.

Categories