Easiest way to copy all fields from one dataclass instance to another? - python

Let's assume you have defined a Python dataclass:
from dataclasses import dataclass
@dataclass
class Marker:
    a: float
    b: float = 1.0
What's the easiest way to copy the values from an instance marker_a to another instance marker_b?
Here's an example of what I try to achieve:
marker_a = Marker(1.0, 2.0)
marker_b = Marker(11.0, 12.0)
# now some magic happens which you hopefully can fill in
print(marker_b)
# result: Marker(a=1.0, b=2.0)
As a boundary condition, I do not want to create and assign a new instance to marker_b.
OK, I could loop through all defined fields and copy the values one by one, but there has to be a simpler way, I guess.

The dataclasses.replace function returns a new copy of the object.
Without passing in any changes, it will return a copy with no modification:
>>> import dataclasses
>>> @dataclasses.dataclass
... class Dummy:
...     foo: int
...     bar: int
...
>>> dummy = Dummy(1, 2)
>>> dummy_copy = dataclasses.replace(dummy)
>>> dummy_copy.foo = 5
>>> dummy
Dummy(foo=1, bar=2)
>>> dummy_copy
Dummy(foo=5, bar=2)
Note that this is a shallow copy.
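To make the shallow-copy remark concrete, here is a minimal sketch. The `Basket` class and its fields are hypothetical and exist only for illustration; the point is that `dataclasses.replace` copies references, so a mutable field is shared between the original and the copy, while `copy.deepcopy` is not affected.

```python
import copy
import dataclasses
from dataclasses import dataclass, field


@dataclass
class Basket:  # hypothetical example class
    owner: str
    items: list = field(default_factory=list)


original = Basket("ann", ["apple"])

# replace() builds a new Basket, but the list object itself is reused.
shallow = dataclasses.replace(original)
shallow.items.append("pear")

# deepcopy() duplicates the list as well.
deep = copy.deepcopy(original)
deep.items.append("plum")

print(original.items)  # ['apple', 'pear']
print(deep.items)      # ['apple', 'pear', 'plum']
```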
Edit to address comments:
If a copy is undesirable, I would probably go with the following:
for key, value in dataclasses.asdict(dummy).items():
    setattr(some_obj, key, value)

I think that looping over the fields probably is the easiest way. All the other options I can think of involve creating a new object.
from dataclasses import fields
marker_a = Marker(5)
marker_b = Marker(0, 99)
for field in fields(Marker):
    setattr(marker_b, field.name, getattr(marker_a, field.name))
print(marker_b) # Marker(a=5, b=1.0)
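If the loop is needed in more than one place, it could be wrapped in a small helper. The sketch below is one possible shape, not an established API: `copy_fields` is a made-up name, and the strict type check is an assumption about what "copy" should mean here. It would fail on frozen dataclasses, which reject `setattr`.

```python
from dataclasses import dataclass, fields, is_dataclass


def copy_fields(src, dst):
    """Copy every dataclass field from src onto the existing object dst."""
    if not is_dataclass(src) or type(src) is not type(dst):
        raise TypeError("src and dst must be instances of the same dataclass")
    for f in fields(src):
        setattr(dst, f.name, getattr(src, f.name))
    return dst


@dataclass
class Marker:
    a: float
    b: float = 1.0


marker_a = Marker(1.0, 2.0)
marker_b = Marker(11.0, 12.0)
alias = marker_b  # a second reference to the same object

copy_fields(marker_a, marker_b)
print(marker_b)           # Marker(a=1.0, b=2.0)
print(alias is marker_b)  # True
```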

@dataclass
class Marker:
    a: float
    b: float = 1.0
marker_a = Marker(0.5)
marker_b = Marker(**marker_a.__dict__)
marker_b
# Marker(a=0.5, b=1.0)
If you didn't want to create a new instance, try this:
marker_a = Marker(1.0, 2.0)
marker_b = Marker(11.0, 12.0)
marker_b.__dict__ = marker_a.__dict__.copy()
# result: Marker(a=1.0, b=2.0)
Not sure whether that's considered a bad hack though...

Another option which may be more elegant:
import dataclasses
marker_a = Marker(1.0, 2.0)
marker_b = Marker(**dataclasses.asdict(marker_a))

Here's a version that also lets you choose the result dataclass type and override attributes:
dataclassWith(Y(x=2, z=5), y=3)      # > Y(x=2, z=5, y=3)
dataclassWith(Y(x=2, z=5), X, x=99)  # > X(x=99, z=5)  # X has no y
import dataclasses
import unittest
MISSING = object()
def dataclassWith(other, clz=None, **kw):
    if clz is None:
        clz = other.__class__
    values = other.__dict__.copy()
    values.update(kw)
    # keep only names that clz defines as class attributes
    # (dataclass fields with a plain default value)
    return clz(**{k: v for k, v in values.items()
                  if getattr(clz, k, MISSING) is not MISSING})
class TestDataclassUtil(unittest.TestCase):
    def test_dataclassWith(self):
        @dataclasses.dataclass
        class X():
            x: int = 1
            z: int = 99
        @dataclasses.dataclass
        class Y(X):
            y: int = 2
        r = dataclassWith(Y(x=2), y=3)
        self.assertTrue(isinstance(r, Y))
        self.assertTrue(r.x == 2)
        self.assertTrue(r.y == 3)
        self.assertTrue(r.z == 99)
        r = dataclassWith(Y(x=2), X, z=100)
        self.assertTrue(isinstance(r, X))
        self.assertTrue(r.x == 2)
        self.assertTrue(r.z == 100)
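One limitation of filtering with `getattr(clz, k, ...)` is that it only recognises fields that have a default value, because only those remain as class attributes. A possible variant, sketched below under the assumption that both objects are dataclasses, uses `dataclasses.fields` instead. The name `dataclass_with` and the sample classes are illustrative only.

```python
import dataclasses


def dataclass_with(other, cls=None, **overrides):
    """Build a cls instance from other's field values plus overrides."""
    if cls is None:
        cls = type(other)
    values = {f.name: getattr(other, f.name) for f in dataclasses.fields(other)}
    values.update(overrides)
    # Only pass names that the target class accepts in __init__.
    accepted = {f.name for f in dataclasses.fields(cls) if f.init}
    return cls(**{k: v for k, v in values.items() if k in accepted})


@dataclasses.dataclass
class X:
    x: int        # no default, so it is not a class attribute
    z: int = 99


@dataclasses.dataclass
class Y(X):
    y: int = 2


print(dataclass_with(Y(x=2, z=5), y=3))      # Y(x=2, z=5, y=3)
print(dataclass_with(Y(x=2, z=5), X, x=99))  # X(x=99, z=5)
```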

Related

Python dynamic enum

I would like to create dynamic enums in Python, loaded from a SQL table. The output of the SQL query will be a list of tuples, with which I want to fill the attributes of the enum.
Let's say I receive this list:
lst = [('PROCESS_0', 0, "value", 123, False), ('PROCESS_1',1,"anothervalue", 456, True)]
I now want to fill the values in the enum below:
class Jobs(IntEnum):
    def __new__(cls, value: int, label: str, heartbeat: int = 60, heartbeat_required: bool = False):
        obj = int.__new__(cls, value)
        obj._value_ = value
        obj.label = label
        obj.heartbeat = heartbeat
        obj.heartbeat_required = heartbeat_required
        return obj
The first variable in the tuple should be the variable name of the enum, I have solved this with:
locals()['Test'] = (0, '', 789, False)
But this only works for single values; it seems that I cannot run a for loop within an enum. When using a for loop like this:
    for i in lst:
        locals()[i[0]] = (i[1], i[2], i[3])
Python raises TypeError: Attempted to reuse key: 'i', which probably comes from enums only having constants.
Is there any (possibly elegant) solution for this?
Many thanks in advance!
You need to use _ignore_ = "i". Something like:
class Jobs(IntEnum):
    _ignore_ = "i"
    def __new__(cls, value, label, heartbeat=60, heartbeat_required=False):
        obj = int.__new__(cls, value)
        obj._value_ = value
        obj.label = label
        obj.heartbeat = heartbeat
        obj.heartbeat_required = heartbeat_required
        return obj
    for i in lst:
        locals()[i[0]] = i[1:]
Check the example at https://docs.python.org/3/howto/enum.html#timeperiod
Note that _ignore_ can be avoided in favor of a dict comprehension:
from datetime import timedelta
from enum import Enum
class Period(timedelta, Enum):
    "different lengths of time"
    vars().update({f"day_{i}": i for i in range(367)})
Then you can access all possible enum values via Period.__members__
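Another route that avoids writing to `locals()` or `vars()` inside the class body is the functional API of `enum`. The sketch below assumes a member-less base class (called `JobBase` here, a made-up name) that carries the custom `__new__`, and uses a hard-coded `rows` list as a stand-in for the SQL result.

```python
from enum import IntEnum


class JobBase(IntEnum):
    """Member-less base class that only defines how members are built."""

    def __new__(cls, value, label, heartbeat=60, heartbeat_required=False):
        obj = int.__new__(cls, value)
        obj._value_ = value
        obj.label = label
        obj.heartbeat = heartbeat
        obj.heartbeat_required = heartbeat_required
        return obj


# Stand-in for the rows returned by the SQL query.
rows = [
    ("PROCESS_0", 0, "value", 123, False),
    ("PROCESS_1", 1, "anothervalue", 456, True),
]

# Functional API: pass a list of (member_name, member_value) pairs.
# Each tuple value is unpacked into the arguments of __new__.
Jobs = JobBase("Jobs", [(name, tuple(rest)) for name, *rest in rows])

print(Jobs.PROCESS_1.label)      # anothervalue
print(Jobs.PROCESS_0.heartbeat)  # 123
print(list(Jobs.__members__))    # ['PROCESS_0', 'PROCESS_1']
```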

How to convert an object back into the code used to create it?

For example if I have a custom Python object like this;
#!/usr/bin/env python3
import os
base_dir = os.path.abspath(".")
class MyFile(dict):
    def __init__(self, name, size = None, dir = base_dir):
        self.name = name
        self.path = os.path.join(dir, name)
        self.bytes = size
and somewhere in my program, I initialize my object class;
a = MyFile(name = "foo", size = 10)
I want to be able to return the code used to create the object in the first place. For example;
print(a)
# <__main__.MyFile object at 0x102b84470>
# should instead print:
# MyFile(name = "foo", size = 10)
But since my object has some default attribute values, I only want those to show up in the output if they were explicitly included when the object was initialized;
b = MyFile(name = "bar", dir = "/home")
print(b)
# <__main__.MyFile object at 0x102b845c0>
# should instead print:
# MyFile(name = "bar", dir = "/home")
And to be clear, I am not trying to pull this from the source code, because a lot of my objects will be created dynamically, and I want to be able to return the same thing for them as well;
l = [ ("baz", 4), ("buzz", 12) ]
f = [ MyFile(name = n, size = s) for n, s in l ]
print(f)
# [<__main__.MyFile object at 0x1023844a8>, <__main__.MyFile object at 0x102384828>]
# should instead print:
# [ MyFile(name = "baz", size = 4), MyFile(name = "buzz", size = 12) ]
I saw the inspect library (https://docs.python.org/3/library/inspect.html) but it does not seem to have anything that does this. What am I missing? This functionality would be pretty analogous to R's dput function.
At a very basic level you can do this:
class MyClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def __repr__(self):
        return f'{self.__class__.__name__}({self.a}, {self.b})'
class MyOtherClass(MyClass):
    def method(self):
        pass
c = MyClass(1, 2)
oc = MyOtherClass(3, 4)
print(c, oc)
Result:
MyClass(1, 2) MyOtherClass(3, 4)
This does what you ask, as well as taking subclassing into account to provide the correct class name. But of course things can get complicated for several reasons:
class MyClass:
    def __init__(self, a, b):
        self.a = a + 1
        self.b = b if b < 10 else a
        self.c = 0
    def inc_c(self):
        self.c += 1
    def __repr__(self):
        return f'{self.__class__.__name__}({self.a - 1}, {self.b})'
The value of c isn't covered by the constructor, so the proposed call would set it to 0. And although you could compensate for the + 1 on a, the value of b is more complicated, even more so once you realise someone could have changed the value later.
And then you need to consider that subclasses can override behaviour, etc. So, doing something like this only makes sense in very limited use cases.
As simple as replacing your code snippet with the following:
import os
base_dir = os.path.abspath(".")
class MyFile(object):
    def __init__(self, name, size = None, dir = base_dir):
        self.name = name
        self.path = os.path.join(dir, name)
        self.bytes = size
        self.remember(name, size, dir)
    def remember(self, name, size, dir):
        # build the constructor call once, leaving out arguments that equal their defaults
        self.s = '{}(name = \'{}\'{}{})'.format(self.__class__.__name__, name, ", size=" + str(size) if size is not None else "", ', dir="' + dir + '"' if dir != base_dir else "")
    def __repr__(self):
        return self.s
a) for a it returns:
MyFile(name = 'foo', size=10)
b) for b it returns:
MyFile(name = 'bar', dir="/home")
c) for f it returns:
[MyFile(name = 'baz', size=4), MyFile(name = 'buzz', size=12)]
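A more generic take on the same idea is to record the arguments before `__init__` runs, so the class itself does not need to know which values count as defaults. The sketch below is only one way to do it: `RecordsInitArgs` is a hypothetical mixin name, and it assumes the arguments have a `repr` that can be evaluated back.

```python
import os


class RecordsInitArgs:
    """Hypothetical mixin: remembers the arguments an instance was created with."""

    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls)
        obj._init_args = args
        obj._init_kwargs = kwargs
        return obj

    def __repr__(self):
        parts = [repr(a) for a in self._init_args]
        parts += [f"{k}={v!r}" for k, v in self._init_kwargs.items()]
        return f"{type(self).__name__}({', '.join(parts)})"


base_dir = os.path.abspath(".")


class MyFile(RecordsInitArgs):
    def __init__(self, name, size=None, dir=base_dir):
        self.name = name
        self.path = os.path.join(dir, name)
        self.bytes = size


a = MyFile(name="foo", size=10)
b = MyFile("bar", dir="/home")
print(a)  # MyFile(name='foo', size=10)
print(b)  # MyFile('bar', dir='/home')
```

Only the arguments that were actually passed appear in the output, and positional arguments stay positional. Attributes changed after construction are not reflected, which is the same limitation discussed above.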
Thanks to everyone who commented and answered. Ultimately, I incorporated their ideas and feedback into the following method, which allowed me to preserve the object's native __repr__ while still getting the behaviors I wanted.
#!/usr/bin/env python3
import os
base_dir = os.path.abspath(".")
class MyFile(dict):
    """
    A custom dict class that auto-populates some keys based on simple input args
    compatible with unittest.TestCase.assertDictEqual
    """
    def __init__(self, name, size = None, dir = base_dir):
        """
        standard init methods
        """
        self.name = name
        self.path = os.path.join(dir, name)
        self.bytes = size
        # auto-populate this key
        self['somekey'] = self.path + ' ' + str(self.bytes)
        # more logic for more complex keys goes here...
        # use these later with `init` and `repr`
        self.args = None
        self.kwargs = None
    @classmethod
    def init(cls, *args, **kwargs):
        """
        alternative method to initialize the object while retaining the args passed
        """
        obj = cls(*args, **kwargs)
        obj.args = args
        obj.kwargs = kwargs
        return(obj)
    def repr(self):
        """
        returns a text representation of the object that can be used to
        create a new copy of an identical object, displaying only the
        args that were originally used to create the current object instance
        (do not show args that were not passed e.g. default value args)
        """
        n = 'MyFile('
        if self.args:
            for i, arg in enumerate(self.args):
                n += arg.__repr__()
                if i < len(self.args) - 1 or self.kwargs:
                    n += ', '
        if self.kwargs:
            for i, (k, v) in enumerate(self.kwargs.items()):
                n += str(k) + '=' + v.__repr__()
                if i < len(self.kwargs.items()) - 1:
                    n += ', '
        n += ')'
        return(n)
Usage:
# normal object initialization
obj1 = MyFile('foo', size=10)
print(obj1) # {'somekey': '/Users/me/test/foo 10'}
# initialize with classmethod instead to preserve args
obj2 = MyFile.init("foo", size = 10)
print(obj2) # {'somekey': '/Users/me/test/foo 10'}
# view the text representation
repr = obj2.repr()
print(repr) # MyFile('foo', size=10)
# re-load a copy of the object from the text representation
obj3 = eval(repr)
print(obj3) # {'somekey': '/Users/me/test/foo 10'}
The use case is representing large, simple data structures (dicts) in my Python integration tests, where the data values are dynamically generated from a smaller set of variables. With many hundreds of such data structures to include in a test case, writing out MyFile(...) by hand hundreds of times becomes infeasible. This method lets me use a script to ingest the data and print the compact Python code needed to recreate it with my custom object class, which I can then copy and paste into my test cases.

A python function that return a list of function with a for loop

I am trying to implement a function (make_q) that returns a list of functions (Q) generated from the argument that make_q receives (P). Q depends on n (= len(P)), and the Q functions are all built in a similar way, so they can be created in a for loop. Here is the catch: if I name the function inside the loop, they all have the same address, so I only get the last Q. Is there a way to bypass this?
Here is my code,
def make_q(self):
    Temp_P=[p for p in self.P]
    Q=()
    for i in range(self.n-1):
        p=min(Temp_P)
        q=max(Temp_P)
        index_p=Temp_P.index(p)
        index_q=Temp_P.index(q)
        def tempQ():
            condition=random.random()
            if condition<=(p*self.n):
                return index_p
            else:
                return index_q
        Temp_Q=list(Q)
        Temp_Q.append(tempQ)
        Q=tuple(Temp_Q)
        q-=(1-p*self.n)/self.n
        Temp_P[index_q]=q
        Temp_P.pop(index_p)
    return Q
test.Q
(<function __main__.Test.make_q.<locals>.tempQ()>,
<function __main__.Test.make_q.<locals>.tempQ()>,
<function __main__.Test.make_q.<locals>.tempQ()>,
<function __main__.Test.make_q.<locals>.tempQ()>,
<function __main__.Test.make_q.<locals>.tempQ()>)
I also tried to make them a tuple so they have different addresses, but it didn't work.
Is there a way to name the functions (tempQ) dynamically, like tempQi?
jasonharper's observation and solution in the comments is correct (and should be the accepted answer). But since you asked about metaclasses, I am posting this anyway.
In Python, each class is a type, with a "name", "bases" (base classes), and "attrs" (all members of the class). Essentially, a metaclass defines the behaviour of a class; you can read more about it at https://www.python-course.eu/python3_metaclasses.php and in various other online tutorials.
The __new__ method runs when a class is set up. Note the usage of attrs, where your class member self.n is accessed as attrs['n'] (attrs is a dict of all class members). I am defining the functions tempQ_0, tempQ_1, ... dynamically. As you can see, we can also add docstrings to these dynamically defined class members.
import random
class MyMetaClass(type):
    def __new__(cls, name, bases, attrs):
        Temp_P = [p for p in attrs['P']]
        for i in range(attrs['n'] - 1):
            p = min(Temp_P)
            q = max(Temp_P)
            index_p = Temp_P.index(p)
            index_q = Temp_P.index(q)
            # bind the current p, index_p and index_q as defaults so each getter keeps its own values
            def fget(self, p=p, index_p=index_p, index_q=index_q):  # this is an unbound method
                condition = random.random()
                return index_p if condition <= (p * self.n) else index_q
            attrs['tempQ_{}'.format(i)] = property(fget, doc="""
            This function returns {} or {} randomly""".format(index_p, index_q))
            q -= (1 - p * attrs['n']) / attrs['n']
            Temp_P[index_q] = q
            Temp_P.pop(index_p)
        return super(MyMetaClass, cls).__new__(cls, name, bases, attrs)
# PY2
# class MyClass(object):
#     __metaclass__ = MyMetaClass
#     n = 3
#     P = [3, 6, 8]
# PY3
class MyClass(metaclass=MyMetaClass):
    n = 3
    P = [3, 6, 8]
# or use with_metaclass from future.utils for both Py2 and Py3
# print(dir(MyClass))
print(MyClass.tempQ_0, MyClass.tempQ_1)
output
<property object at 0x10e5fbd18> <property object at 0x10eaad0e8>
So your list of functions is [MyClass.tempQ_0, MyClass.tempQ_1]
Please try formatted strings, e.g. "function_{}".format(name). Also, what do you want your output to look like?
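The symptom in the question (every function behaving like the last one) is typical of late-binding closures: the inner function looks up the loop variables when it is called, not when it is defined. The comment-based solution is not quoted in this thread, so the following is only a sketch of one common fix, binding the current values as default arguments. `make_choosers`, `pairs` and `threshold` are illustrative names, not taken from the original code.

```python
import random


def make_choosers(pairs, threshold=0.5):
    """Return one function per (low, high) pair."""
    choosers = []
    for low, high in pairs:
        # Default arguments are evaluated now, so each function keeps
        # the values of this iteration instead of the last one.
        def choose(low=low, high=high):
            return low if random.random() <= threshold else high
        choosers.append(choose)
    return choosers


pairs = [(0, 1), (2, 3), (4, 5)]
choosers = make_choosers(pairs)
for chooser, pair in zip(choosers, pairs):
    print(chooser.__defaults__ == pair)  # True for every function
```

All functions still share the name `choose`, which is harmless: they are distinct objects, and the list index identifies them.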

JSON serialize a class and change property casing with Python

I'd like to create a JSON representation of a class and change the property names automatically from snake_case to lowerCamelCase, as I'd like to comply with PEP8 in Python and also the JavaScript naming conventions (and maybe even more importantly, the backend I'm communicating to uses lowerCamelCase).
I prefer to use the standard json module, but I have nothing against using another, open source library (e.g. jsonpickle might solve my issue?).
>>> class HardwareProfile:
...     def __init__(self, vm_size):
...         self.vm_size = vm_size
>>> hp = HardwareProfile('Large')
>>> hp.vm_size
'Large'
### ### What I want ### ###
>>> magicjson.dumps(hp)
'{"vmSize": "Large"}'
### ### What I have so far... ### ###
>>> json.dumps(hp, default=lambda o: o.__dict__)
'{"vm_size": "Large"}'
You just need to create a function to transform the snake_case keys to camelCase. You can easily do that using .split, .lower, and .title.
import json
class HardwareProfile:
    def __init__(self, vm_size):
        self.vm_size = vm_size
        self.some_other_thing = 42
        self.a = 'a'
def snake_to_camel(s):
    a = s.split('_')
    a[0] = a[0].lower()
    if len(a) > 1:
        a[1:] = [u.title() for u in a[1:]]
    return ''.join(a)
def serialise(obj):
    return {snake_to_camel(k): v for k, v in obj.__dict__.items()}
hp = HardwareProfile('Large')
print(json.dumps(serialise(hp), indent=4, default=serialise))
output
{
"vmSize": "Large",
"someOtherThing": 42,
"a": "a"
}
You could put serialise in a lambda, but I think it's more readable to write it as a proper def function.
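The same conversion can also be packaged as a `json.JSONEncoder` subclass, so callers only pass `cls=` and nested objects are handled by the encoder itself. This is a sketch under the assumption that every custom object stores its data in `__dict__`; `CamelCaseEncoder` and the `Disk` class are hypothetical names added for the example.

```python
import json


def snake_to_camel(name):
    first, *rest = name.split("_")
    return first + "".join(part.title() for part in rest)


class CamelCaseEncoder(json.JSONEncoder):
    def default(self, o):
        # vars() raises TypeError for objects without __dict__, which is
        # the error JSONEncoder.default is expected to raise.
        return {snake_to_camel(k): v for k, v in vars(o).items()}


class Disk:  # hypothetical nested object
    def __init__(self, size_gb):
        self.size_gb = size_gb


class HardwareProfile:
    def __init__(self, vm_size, os_disk):
        self.vm_size = vm_size
        self.os_disk = os_disk


hp = HardwareProfile("Large", Disk(128))
encoded = json.dumps(hp, cls=CamelCaseEncoder)
print(encoded)  # {"vmSize": "Large", "osDisk": {"sizeGb": 128}}
```

Keys of plain dicts nested inside the objects are not converted, because the encoder only calls `default` for types it cannot serialise on its own.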

Printing an object python class

I wrote the following program:
def split_and_add(invoer):
    rij = invoer.split('=')
    rows = []
    for line in rij:
        rows.append(process_row(line))
    return rows
def process_row(line):
    temp_coordinate_row = CoordinatRow()
    rij = line.split()
    for coordinate in rij:
        coor = process_coordinate(coordinate)
        temp_coordinate_row.add_coordinaterow(coor)
    return temp_coordinate_row
def process_coordinate(coordinate):
    cords = coordinate.split(',')
    return Coordinate(int(cords[0]),int(cords[1]))
bestand = file_input()
rows = split_and_add(bestand)
for row in range(0,len(rows)-1):
    rij = rows[row].weave(rows[row+1])
    print rij
With this class:
class CoordinatRow(object):
    def __init__(self):
        self.coordinaterow = []
    def add_coordinaterow(self, coordinate):
        self.coordinaterow.append(coordinate)
    def weave(self,other):
        lijst = []
        for i in range(len(self.coordinaterow)):
            lijst.append(self.coordinaterow[i])
            try:
                lijst.append(other.coordinaterow[i])
            except IndexError:
                pass
        self.coordinaterow = lijst
        return self.coordinaterow
However there is an error in
for row in range(0,len(rows)-1):
    rij = rows[row].weave(rows[row+1])
    print rij
The outcome of the print statement is as follows:
[<Coordinates.Coordinate object at 0x021F5630>, <Coordinates.Coordinate object at 0x021F56D0>]
It seems as if the program doesn't access the actual object and print it. What am I doing wrong here?
This isn't an error. This is exactly what it means for Python to "access the actual object and print it". This is what the default string representation for a class looks like.
If you want to customize the string representation of your class, you do that by defining a __repr__ method. The typical way to do it is to write a method that returns something that looks like a constructor call for your class.
Since you haven't shown us the definition of Coordinate, I'll make some assumptions here:
class Coordinate(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    # your other existing methods
    def __repr__(self):
        return '{}({}, {})'.format(type(self).__name__, self.x, self.y)
If you don't define this yourself, you end up inheriting __repr__ from object, which looks something like:
return '<{} object at {:#010x}>'.format(type(self).__qualname__, id(self))
Sometimes you also want a more human-readable version of your objects. In that case, you also want to define a __str__ method:
    def __str__(self):
        return '<{}, {}>'.format(self.x, self.y)
Now:
>>> c = Coordinate(1, 2)
>>> c
Coordinate(1, 2)
>>> print(c)
<1, 2>
But notice that the __str__ of a list calls __repr__ on all of its members:
>>> cs = [c]
>>> print(cs)
[Coordinate(1, 2)]
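If the human-readable form is wanted for a whole list, the elements can be converted with `str` explicitly before joining. A small sketch, reusing the assumed `Coordinate` definition from above and written for Python 3:

```python
class Coordinate:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "{}({}, {})".format(type(self).__name__, self.x, self.y)

    def __str__(self):
        return "<{}, {}>".format(self.x, self.y)


cs = [Coordinate(1, 2), Coordinate(3, 4)]

as_repr = str(cs)  # the list formats its members with __repr__
as_str = "[" + ", ".join(str(c) for c in cs) + "]"  # force __str__ per member

print(as_repr)  # [Coordinate(1, 2), Coordinate(3, 4)]
print(as_str)   # [<1, 2>, <3, 4>]
```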
