I have a base class A with some heavy attributes (actually large numpy arrays) that are derived from data given to A's __init__() method.
First, I would like to subclass A into a new class B that modifies these attributes with some B-specific methods. Because the attributes are expensive to compute, I don't want to instantiate B the same way as A; instead I would rather use an existing A instance to initialize a B object. This amounts to a kind of cast from A to B, and I think I should use the __new__() method to return a B object.
Second, before every computation on B's attributes, I must make sure that B's state has been reset to the current state of the A instance that was used to create it, without building a new B object every time; a kind of dynamic linkage...
Here is an example code I wrote:
from copy import deepcopy
import numpy as np
class A(object):
    def __init__(self, data):
        self.data = data

    def generate_derived_attributes(self):
        print "generating derived attributes..."
        self.derived_attributes = self.data.copy()

class B(A):
    def __new__(cls, obj_a):
        assert isinstance(obj_a, A)
        obj = deepcopy(obj_a)
        obj.__class__ = B
        obj._super_cache = obj_a  # This is not a copy... no additional memory required
        return obj

    def __init__(self, obj_a):
        # Prevent the inherited A.__init__ from running again and
        # overwriting the state copied in __new__
        pass

    def compute(self):
        # First reset the state (may use a decorator?)
        self.reset()
        print "Doing some computations..."

    def reset(self):
        print "\nResetting object to its initial state"
        _super_cache = self._super_cache  # keep a reference so it survives the __dict__ swap
        self.__dict__ = deepcopy(self._super_cache.__dict__)
        self._super_cache = _super_cache
if __name__ == '__main__':
    a = A(np.zeros(100000000, dtype=np.float))
    a.generate_derived_attributes()
    print a
    b = B(a)
    print b
    b.compute()
    b.compute()
Is this implementation a reasonable way to reach my objective in Python, or are there more Pythonic ways? Could I be more generic? (I know that using __dict__ will not be a good choice in every case, especially with __slots__.) Do you think that using a decorator around B.compute() would give me more flexibility for using this along with other classes?
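For the decorator idea, a minimal sketch (assuming the decorated class exposes a reset() method, as B does above) could look like this:
from functools import wraps

def resets_first(method):
    """Restore the cached initial state before running the decorated method."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        self.reset()
        return method(self, *args, **kwargs)
    return wrapper
B.compute() could then drop its explicit self.reset() call and simply be decorated with @resets_first, and the same decorator would work for any other class that provides a reset() method.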
Related
I have a class that does some complex calculation and generates some result MyClass.myresults.
MyClass.myresults is actually a class itself with different attributes (e.g. MyClass.myresults.mydf1, MyClass.myresults.mydf2).
Now, I need to run MyClass iteratively over a list of scenarios (scenarios = [1, 2, [2, 4], 5]).
This happens with a simple loop:
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    results.StoreScenario(myclass.myresults, iter)
and at the end of each iteration store MyClass.myresults.
I would like to create a separate class (Results) that at each iteration creates a subclass scenario_1, scenario_2, scenario_2_4 and stores within it MyClass.myresults.
class Results:
    # no initialization, is an empty container to which I would like to add attributes iteratively
    class StoreScenario:
        def __init__(self, myresults, iter):
            self.'scenario_'.join(str(iter)) = myresults  # just a guess, I am assuming this is wrong
Suggestions on different approaches are more than welcome; I am quite new to classes and I am not sure if this is an acceptable approach or if I am doing something awful (clunky, memory inefficient, or otherwise).
There are two problems with this approach. The first is that the Results class (a separate class) only stores modified values of your MyClass instances; I mean, they should be the same class. The second is memory efficiency: at each iteration you create the same object twice, once for the actual values and once for the modified values.
The suggested approach is to use a hash map, i.e. a dictionary in Python. With a dictionary you can store the modified objects very efficiently, and there is no need to create another class.
class MyClass:
    def __init__(self):
        # some attributes ...
        self.scenarios_result = {}

superObject = MyClass()
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    # results.StoreScenario(myclass.myresults, iter)
    superObject.scenarios_result[tuple(iter)] = myclass  # lists are unhashable, so use a tuple key
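A stored run can then be looked up later by its scenario key (assuming the tuple keys used above), e.g.:
result_2_4 = superObject.scenarios_result[(2, 4)]
print(result_2_4.myresults)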
So I solved it using setattr:
class Results:
    def __init__(self):
        self.scenario_results = type('ScenarioResults', (), {})  # create an empty object

    def store_scenario(self, data, scenarios):
        scenario_key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, scenario_key,
                subclass_store_scenario(data))

class subclass_store_scenario:
    def __init__(self, data):
        self.some_stuff = data.result1.__dict__
        self.other_stuff = data.result2.__dict__
This allows me to call things like:
results.scenario_results.scenario_1.some_stuff.something
results.scenario_results.scenario_1.some_stuff.something_else
This is necessary for me as I need to compute other measures, summary or scenario-specific, which I can then assign iteratively, again using setattr:
def construct_measures(self, some_data, configuration):
    for scenario in self.scenario_results:
        # scenario is a reference to the self.scenario_results class.
        # we can simply add attributes to it
        setattr(scenario, 'some_measure',
                self.computeSomething(
                    some_data.input1, some_data.input2))
I have 2 classes that are not linked by inheritance. Instance of Class A needs a method from instance of Class B to update itself. And the method of Class B needs the latest state of instance A in order to know what to compute.
In the minimum code below:
Instance of A has a variable x.
In order to update x, A's instance asks B's instance to update variable x
B's instance gets the latest copy of A's instance and processes the variable x
The problem is that I am literally having to pass the complete instance of A via the b.processX(self) method call.
Is inheritance the only way to achieve this? Is there no other way? Is there a more pythonic way than passing a full instance of A into B via (self)?
class A:
    def __init__(self, value):
        self.x = value

    def updateX(self, b):
        self.x = b.processX(self)  # <-- IS THERE A BETTER WAY?
        return self.x

class B:
    def __init__(self):
        pass

    def processX(self, a):
        a.x += 5
        return a.x

if __name__ == '__main__':
    a = A(10)
    b = B()
    ret = a.updateX(b)
    print(ret)
EDIT: the reason I am asking this is that in my actual code, A's instance can become very big, with thousands of data points, and I am concerned about performance if every call to B results in massive data structures being passed from one instance to the other.
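Note that passing self here does not copy those data points: Python passes references, so b.processX(self) hands B a reference to the one existing A instance, and the cost does not grow with the amount of data A holds. Reusing the classes above, a quick check (not part of the original question):
a = A(10)
b = B()
b.processX(a)    # mutates the very object the caller holds
print(a.x)       # 15: the caller sees the change, so nothing was copied
If B only needs a few attributes, you could pass just those (e.g. b.processX(self.x)) and assign the return value, which keeps B decoupled from A; but either way, performance is not the issue.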
In my code I have a class, where one method is responsible for filtering some data. To allow customization for descendants I would like to define filtering function as a class attribute as per below:
def my_filter_func(x):
    return x % 2 == 0

class FilterClass(object):
    filter_func = my_filter_func

    def filter_data(self, data):
        return filter(self.filter_func, data)

class FilterClassDescendant(FilterClass):
    filter_func = my_filter_func2
However, such code leads to a TypeError, as filter_func receives "self" as its first argument.
What is a Pythonic way to handle such use cases? Perhaps I should define my "filter_func" as a regular class method?
You could just add it as a plain old attribute?
def my_filter_func(x):
    return x % 2 == 0

class FilterClass(object):
    def __init__(self):
        self.filter_func = my_filter_func

    def filter_data(self, data):
        return filter(self.filter_func, data)
Alternatively, force it to be a staticmethod:
def my_filter_func(x):
    return x % 2 == 0

class FilterClass(object):
    filter_func = staticmethod(my_filter_func)

    def filter_data(self, data):
        return filter(self.filter_func, data)
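With the staticmethod variant, a descendant can then swap in its own filter by overriding the class attribute, for example (my_filter_func2 here is just a hypothetical replacement filter):
def my_filter_func2(x):
    return x % 3 == 0

class FilterClassDescendant(FilterClass):
    filter_func = staticmethod(my_filter_func2)

print(list(FilterClassDescendant().filter_data(range(10))))   # [0, 3, 6, 9]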
Python has a lot of magic within. One of those magics is the transformation of functions into UnboundMethod objects when they are assigned to a class (rather than to a class's instance).
When you assign a function to a class attribute (I'm not sure whether this applies to any callable or just to functions), Python converts it to an UnboundMethod object, i.e. an object which can be called using an instance or not.
Under normal conditions, you can call your UnboundMethod as normal:
def myfunction(a, b):
    return a + b

class A(object):
    a = myfunction

A.a(1, 2)
# returns 3
This will not fail (at least in Python 3, where A.a looked up on the class is just the plain function; Python 2's unbound method would additionally insist that the first argument be an A instance). However, there's a distinct case when you try to call it from an instance:
A().a(1, 2)
This will fail, because when an instance looks up (say, via internal getattr) an attribute that is such a method, it gets back a bound copy with the im_self member populated (im_self and im_func are members of the method object). The function you intended to call is in the im_func member, and when you call the bound method you are actually calling im_func with the value of im_self prepended to your arguments. So the function needs an additional parameter (the first one, which stands for self).
To avoid this magic, Python has two decorators:
If you want to pass the function as-is, use @staticmethod. In this case the function is not converted to an UnboundMethod, but you will not be able to access the calling class except through a global reference.
If you want the same thing but with access to the current class (regardless of whether it is called from an instance or from the class), then your function should take a first argument cls (instead of self), which is a reference to the class, and the decorator to use is @classmethod.
Examples:
class A(object):
    a = staticmethod(lambda a, b: a + b)

A.a(1, 2)
A().a(1, 2)
Both will work.
Another example:
def add_print(cls, a, b):
    print cls.__name__
    return a + b

class A(object):
    ap = classmethod(add_print)

class B(A):
    pass

A.ap(1, 2)
B.ap(1, 2)
A().ap(1, 2)
B().ap(1, 2)
Check this for yourself and enjoy the magic.
Edit: There was some confusion, but I want to ask a general question about object oriented design in Python.
Consider a class that lets you map data values to counts or frequencies:
class DataMap(dict):
    pass
Now consider a subclass that allows you to construct a histogram from a list of data:
class Histogram(DataMap):
    def __init__(self, list_of_values):
        # 1. Put appropriate super(...) call here if necessary
        # 2. Build the map of values to counts in self
        pass
Now consider a class that lets you make a smoothed probability mass table rather than a Histogram.
class ProbabilityMass(DataMap):
    pass
What is the best way to allow a ProbabilityMass to be constructed from either a Histogram or a list of values?
I "grew up" programming in C++, and in this case I would use an overloaded constructor. In Python I've thought of doing this with:
The constructor takes multiple arguments (all but one of these should == None)
I define from_Histogram and from_list methods
In the second case (which I believe is better), what is the best way to allow the from_list method to use the shared code from the Histogram constructor? A ProbabilityMass table is nearly identical to a Histogram table, but it is scaled so that the sum of all values is 1.0.
If you have come across a similar problem, please share your expertise!
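For concreteness, one possible sketch in terms of the classes above (it assumes Histogram.__init__ fills the map with value-to-count pairs, and it leaves out smoothing) is a pair of classmethod constructors on ProbabilityMass, where from_list simply reuses the Histogram constructor:
class ProbabilityMass(DataMap):
    @classmethod
    def from_histogram(cls, histogram):
        pmf = cls()
        total = float(sum(histogram.values()))
        for value, count in histogram.items():
            pmf[value] = count / total   # scale so the values sum to 1.0
        return pmf

    @classmethod
    def from_list(cls, list_of_values):
        # the counting logic lives only in Histogram
        return cls.from_histogram(Histogram(list_of_values))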
To start with, if you think you want @staticmethod, you almost always don't. Either the function is not part of the class, in which case it should just be a free function, or it is part of the class but not tied to an instance, in which case it should be a @classmethod. Your named constructor is a good candidate for a @classmethod.
Also note that you should invoke A.__init__ from B via super(), otherwise multiple inheritance can bite you badly.
class A(object):
    def __init__(self, data):
        self.values_to_counts = {}
        for val in data:
            if val in self.values_to_counts:
                self.values_to_counts[val] += 1
            else:
                self.values_to_counts[val] = 1

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = cls([])
        self.values_to_counts = values_to_counts
        return self

class B(A):
    def __init__(self, data, parameter):
        super(B, self).__init__(data)
        self.parameter = parameter

    def print_parameter(self):
        print self.parameter
In this case, you don't need a B.from_values_to_counts, it inherits from A, and it will return an instance of B, since that's how it was called.
If you need to do more complex initialization in B, you can, using super(), which looks very similar to the way it does with instance methods. After all, a classmethod really isn't anything more complex than an instance method whose im_self attribute is bound to the class itself.
class A(object):
    def __init__(self, data):
        self.values_to_counts = {}
        for val in data:
            if val in self.values_to_counts:
                self.values_to_counts[val] += 1
            else:
                self.values_to_counts[val] = 1

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = cls([])
        self.values_to_counts = values_to_counts
        return self

class B(A):
    def __init__(self, data, parameter):
        super(B, self).__init__(data)
        self.parameter = parameter

    def print_parameter(self):
        print self.parameter

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = super(B, cls).from_values_to_counts(values_to_counts)
        do_more_initialization(self)  # placeholder for whatever extra setup B needs
        return self
I need to extend the NetworkX Python package and add a few methods to the Graph class for my particular needs.
The way I thought about doing this is simply deriving a new class, say NewGraph, and adding the required methods.
However there are several other functions in networkx which create and return Graph objects (e.g. generate a random graph). I now need to turn these Graph objects into NewGraph objects so that I can use my new methods.
What is the best way of doing this? Or should I be tackling the problem in a completely different manner?
If you are just adding behavior, and not depending on additional instance values, you can assign to the object's __class__:
from math import pi

class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return pi * self.radius**2

class CirclePlus(Circle):
    def diameter(self):
        return self.radius*2

    def circumference(self):
        return self.radius*2*pi

c = Circle(10)
print c.radius
print c.area()
print repr(c)
c.__class__ = CirclePlus
print c.diameter()
print c.circumference()
print repr(c)
Prints:
10
314.159265359
<__main__.Circle object at 0x00A0E270>
20
62.8318530718
<__main__.CirclePlus object at 0x00A0E270>
This is as close to a "cast" as you can get in Python, and like casting in C, it is not to be done without giving the matter some thought. I've posted a fairly limited example, but if you can stay within the constraints (just add behavior, no new instance vars), then this might help address your problem.
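Applied to the networkx question, the same idea would look roughly like this (a sketch; it only works as long as your added methods need no new instance variables, and density_report is just an example of added behaviour):
import networkx as nx

class NewGraph(nx.Graph):
    def density_report(self):
        # behaviour added on top of nx.Graph
        return "%d nodes, %d edges" % (self.number_of_nodes(), self.number_of_edges())

g = nx.gnp_random_graph(10, 0.4)   # any networkx function that returns a Graph
g.__class__ = NewGraph             # the "cast"
print(g.density_report())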
Here's how to "magically" replace a class in a module with a custom-made subclass without touching the module. It's only a few extra lines from a normal subclassing procedure, and therefore gives you (almost) all the power and flexibility of subclassing as a bonus. For instance this allows you to add new attributes, if you wish.
import networkx as nx

class NewGraph(nx.Graph):
    def __getattribute__(self, attr):
        "This is just to show off, not needed"
        print "getattribute %s" % (attr,)
        return nx.Graph.__getattribute__(self, attr)

    def __setattr__(self, attr, value):
        "More showing off."
        print " setattr %s = %r" % (attr, value)
        return nx.Graph.__setattr__(self, attr, value)

    def plot(self):
        "A convenience method"
        import matplotlib.pyplot as plt
        nx.draw(self)
        plt.show()
So far this is exactly like normal subclassing. Now we need to hook this subclass into the networkx module so that all instantiation of nx.Graph results in a NewGraph object instead. Here's what normally happens when you instantiate an nx.Graph object with nx.Graph():
1. nx.Graph.__new__(nx.Graph) is called
2. If the returned object is a subclass of nx.Graph, __init__ is called on the object
3. The object is returned as the instance
We will replace nx.Graph.__new__ and make it return NewGraph instead. In it, we call the __new__ method of object instead of the __new__ method of NewGraph, because the latter is just another way of calling the method we're replacing, and would therefore result in endless recursion.
def __new__(cls):
    if cls == nx.Graph:
        return object.__new__(NewGraph)
    return object.__new__(cls)

# We substitute the __new__ method of the nx.Graph class
# with our own.
nx.Graph.__new__ = staticmethod(__new__)

# Test if it works
graph = nx.generators.random_graphs.fast_gnp_random_graph(7, 0.6)
graph.plot()
In most cases this is all you need to know, but there is one gotcha. Our overriding of the __new__ method only affects nx.Graph, not its subclasses. For example, if you call nx.gn_graph, which returns an instance of nx.DiGraph, it will have none of our fancy extensions. You need to subclass each of the subclasses of nx.Graph that you wish to work with and add your required methods and attributes. Using mix-ins may make it easier to consistently extend the subclasses while obeying the DRY principle.
Though this example may seem straightforward enough, this method of hooking into a module is hard to generalize in a way that covers all the little problems that may crop up. I believe it's easier to just tailor it to the problem at hand. For instance, if the class you're hooking into defines its own custom __new__ method, you need to store it before replacing it, and call this method instead of object.__new__.
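A sketch of that variation (assuming, as above, that the constructor takes no extra arguments): save a reference to the original __new__ and delegate to it instead of hard-coding object.__new__.
_original_new = nx.Graph.__new__   # whatever was there before, possibly object.__new__

def __new__(cls):
    if cls is nx.Graph:
        cls = NewGraph
    return _original_new(cls)

nx.Graph.__new__ = staticmethod(__new__)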
I expanded what PaulMcG did and made it a factory pattern.
class A:
    def __init__(self, variable):
        self.a = 10
        self.a_variable = variable

    def do_something(self):
        print("do something A")

class B(A):
    def __init__(self, variable=None):
        super().__init__(variable)
        self.b = 15

    @classmethod
    def from_A(cls, a: A):
        # Create a new B object
        b_obj = cls()
        # Copy all of A's attribute values onto B; this is safe
        # because they share a common attribute layout
        for key, value in a.__dict__.items():
            b_obj.__dict__[key] = value
        return b_obj

if __name__ == "__main__":
    a = A(variable="something")
    b = B.from_A(a=a)
    print(a.__dict__)
    print(b.__dict__)
    b.do_something()
    print(type(b))
Result:
{'a': 10, 'a_variable': 'something'}
{'a': 10, 'a_variable': 'something', 'b': 15}
do something A
<class '__main__.B'>
You could simply create a new NewGraph derived from Graph object and have the __init__ function include something like self.__dict__.update(vars(incoming_graph)) as the first line, before you define your own properties. In this way you basically copy all the properties from the Graph you have onto a new object, derived from Graph, but with your special sauce.
class NewGraph(Graph):
    def __init__(self, incoming_graph):
        self.__dict__.update(vars(incoming_graph))
        # rest of my __init__ code, including properties and such
Usage:
graph = function_that_returns_graph()
new_graph = NewGraph(graph)
cool_result = function_that_takes_new_graph(new_graph)
I encountered the same question when contributing to networkx, because I need many new methods for Graph. The answer by @Aric is the simplest solution, but it does not use inheritance. Here a native networkx feature is used instead, and it should be more efficient.
There is a section in the networkx tutorial, "Using the graph constructors", showing how to initialize a Graph object from existing graph data, in particular from another graph object. This is the example shown there: you can initialize a new DiGraph object, H, from an existing Graph object, G:
>>> G = Graph()
>>> G.add_edge(1, 2)
>>> H = nx.DiGraph(G) # create a DiGraph using the connections from G
>>> list(H.edges())
[(1, 2), (2, 1)]
Note the mathematical meaning of converting an existing graph to a directed graph. You could probably implement this feature yourself with a function or constructor, but I see it as an important built-in feature of networkx. I haven't checked their implementation, but I'd guess it is more efficient.
To preserve this feature in NewGraph class, you should make it able to take an existing object as argument in __init__, for example:
from typing import Optional
import networkx as nx

class NewGraph(nx.Graph):
    def __init__(self, g: Optional[nx.Graph] = None):
        """Init an empty graph or build one from an existing graph.

        Args:
            g: an existing graph.
        """
        if not g:
            super().__init__()
        else:
            super().__init__(g)
Then, whenever you have a Graph object, you can init (NOT turn it directly into) a NewGraph object by:
>>> G = nx.some_function()
...
>>> NG = NewGraph(G)
or you can init an empty NewGraph object:
>>> NG_2 = NewGraph()
For the same reason, you can init another Graph object out of NG:
>>> G_2 = nx.Graph(NG)
Most likely, there are many operations after super().__init__() when initiating a NewGraph object, so the answer by @PaulMcG, as he/she mentioned, is not a good idea in such circumstances.
If a function is creating Graph objects, you can't turn them into NewGraph objects.
Another option is for NewGraph to have a Graph rather than be a Graph. You delegate the Graph methods to the Graph object you have, and you can wrap any Graph object into a new NewGraph object:
class NewGraph:
    def __init__(self, graph):
        self.graph = graph

    def some_graph_method(self, *args, **kwargs):
        return self.graph.some_graph_method(*args, **kwargs)
    # ... do this for the other Graph methods you need

    def my_newgraph_method(self):
        ...
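If writing a forwarding method for every Graph method you need becomes tedious, one common variation (a sketch, not part of the original answer) is to forward unknown attribute lookups to the wrapped graph with __getattr__:
class NewGraph:
    def __init__(self, graph):
        self.graph = graph

    def __getattr__(self, name):
        # called only when normal lookup fails, so NewGraph's own
        # methods and attributes still take precedence
        return getattr(self.graph, name)

    def my_newgraph_method(self):
        ...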
For your simple case you could also write your subclass __init__ like this and assign the pointers from the Graph data structures to your subclass data.
from networkx import Graph

class MyGraph(Graph):
    def __init__(self, graph=None, **attr):
        if graph is not None:
            self.graph = graph.graph  # graph attributes
            self.node = graph.node    # node attributes
            self.adj = graph.adj      # adjacency dict
        else:
            self.graph = {}  # empty graph attr dict
            self.node = {}   # empty node attr dict
            self.adj = {}    # empty adjacency dict
        self.edge = self.adj  # alias
        self.graph.update(attr)  # update with any keyword attributes passed in

if __name__ == '__main__':
    import networkx as nx
    R = nx.gnp_random_graph(10, 0.4)
    G = MyGraph(R)
You could also use copy() or deepcopy() in the assignments but if you are doing that you might as well use
G=MyGraph()
G.add_nodes_from(R)
G.add_edges_from(R.edges())
to load your graph data.
The __class__ assignment approach actually changes the class of the object. If you only want to call a method from the super class, you can use super(). For example:
class A:
    def __init__(self):
        pass

    def f(self):
        print("A")

class B(A):
    def __init__(self):
        super().__init__()

    def f(self):
        print("B")

b = B()
b.f()
super(type(b), b).f()
which prints:
B
A
Have you guys tried
[Python] cast base class to derived class
I have tested it, and it seems to work. I also think this method is a bit better than the one below, since the one below does not execute the __init__ function of the derived class:
c.__class__ = CirclePlus