I have a class that does some complex calculation and generates some result MyClass.myresults.
MyClass.myresults is itself an instance of a class with several attributes (e.g. MyClass.myresults.mydf1, MyClass.myresults.mydf2).
Now I need to run MyClass iteratively over a list of scenarios (scenarios = [1, 2, [2, 4], 5]).
This happens with a simple loop:
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    results.StoreScenario(myclass.myresults, iter)
and at the end of each iteration store MyClass.myresults.
I would like to create a separate class (Results) that at each iteration creates a subclass scenario_1, scenario_2, scenario_2_4 and stores within it MyClass.myresults.
class Results:
    # no initialization, is an empty container to which I would like to add attributes iteratively
    class StoreScenario:
        def __init__(self, myresults, iter):
            self.'scenario_'.join(str(iter)) = myresults  # just a guess, I am assuming this is wrong
Suggestions on different approaches are more than welcome, I am quite new to classes and I am not sure if this is an acceptable approach or if I am doing something awful (clunky, memory inefficient, or else).
There are two problems with this approach. First, the separate Results class only stores values produced by MyClass, so the two could just as well be the same class.
Second, memory efficiency: at each iteration you create the same object twice, once to hold the actual values and once to hold the stored copy.
The suggested approach is to use a hash map, i.e. a dictionary in Python. With a dictionary you can store each scenario's object efficiently, and there is no need to create another class.
class MyClass:
    def __init__(self):
        # some attributes ...
        self.scenarios_result = {}

superObject = MyClass()
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    # results.StoreScenario(myclass.myresults, iter)
    superObject.scenarios_result[tuple(iter)] = myclass  # a list is unhashable, so use a tuple as the key
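As a minimal, self-contained sketch of the dictionary idea: the MyResults and MyClass bodies below are hypothetical stand-ins (the real calculation is not shown in the question), and scenario_key is a helper name introduced only for illustration. The point is that each scenario is normalised to a tuple so it can serve as a dictionary key.

```python
class MyResults:
    """Hypothetical stand-in for MyClass.myresults."""
    def __init__(self, label):
        self.label = label


class MyClass:
    """Hypothetical stand-in for the real calculation class."""
    def DoStuff(self, scenario):
        self.myresults = MyResults("ran " + str(scenario))


def scenario_key(scenario):
    # A bare int becomes a one-element tuple; a list becomes a tuple so it is hashable.
    return (scenario,) if isinstance(scenario, int) else tuple(scenario)


scenarios = [1, 2, [2, 4], 5]
results = {}
for scenario in scenarios:
    key = scenario_key(scenario)
    myclass = MyClass()
    myclass.DoStuff(key)
    results[key] = myclass.myresults  # store only the results object

print(results[(2, 4)].label)
```

Storing only myresults rather than the whole MyClass instance keeps the dictionary from holding on to any intermediate state the calculation may carry.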
So I solved it using setattr:
class Results:
    def __init__(self):
        self.scenario_results = type('ScenarioResults', (), {})  # create an empty container class

    def store_scenario(self, data, scenarios):
        scenario_key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, scenario_key,
                subclass_store_scenario(data))

class subclass_store_scenario:
    def __init__(self, data):
        self.some_stuff = data.result1.__dict__
        self.other_stuff = data.result2.__dict__
This allows me to call things like:
results.scenario_results.scenario_1.some_stuff.something
results.scenario_results.scenario_1.some_stuff.something_else
This is necessary for me as I need to compute other measures, summary or scenario-specific, which I can then iteratively assign using again setattr:
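def construct_measures(self, some_data, configuration):
    # scenario_results is a plain container class and cannot be iterated directly,
    # so walk its attribute dictionary and keep only the stored scenarios
    for name, scenario in vars(self.scenario_results).items():
        if name.startswith('scenario_'):
            # scenario is a reference to the stored object, so we can simply add attributes to it
            setattr(scenario, 'some_measure',
                    self.computeSomething(
                        some_data.input1, some_data.input2))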
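A compact way to try the setattr pattern end to end is sketched below. It swaps the type(...) container for types.SimpleNamespace from the standard library, which is an assumption of this sketch rather than what the code above does; the data lists and the total measure are made up for illustration.

```python
from types import SimpleNamespace


class Results:
    def __init__(self):
        # An empty attribute container; vars() on it lists only what we add.
        self.scenario_results = SimpleNamespace()

    def store_scenario(self, data, scenarios):
        key = "scenario_" + "_".join(str(x) for x in scenarios)
        setattr(self.scenario_results, key, SimpleNamespace(data=data))

    def construct_measures(self):
        # Attach a derived measure to every stored scenario.
        for scenario in vars(self.scenario_results).values():
            scenario.total = sum(scenario.data)


results = Results()
results.store_scenario([1, 2, 3], [1])
results.store_scenario([10, 20], [2, 4])
results.construct_measures()
print(results.scenario_results.scenario_2_4.total)
```

Because a SimpleNamespace instance holds only the attributes assigned to it, iterating over vars(...) needs no filtering by name.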
I came across a method in Python that returns a class, but can be destructured as if it's a tuple.
How can you define a result of a function to be both an instance of a class AND use destructure assignment as if it's a tuple?
An example where you see this behavior:
import scipy.stats as stats
res = stats.ttest_ind(data1, data2)
print(type(res)) # <class 'scipy.stats.stats.Ttest_indResult'>
# One way to assign values is by directly accessing the instance's properties.
p = res.pvalue
t = res.statistic
# A second way is to treat the result as a tuple, and assign to variables directly. But how is this working?
# We saw above that the type of the result is NOT a tuple but a class. How would Python know the order of the properties here? (It's not like we're destructuring based on named properties)
t, p = stats.ttest_ind(data1, data2)
It's a named tuple, which is essentially a subclass of Python's tuple type whose fields can also be accessed by name.
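To illustrate the named-tuple behaviour with the standard library alone, here is a small sketch. The TestResult name and the numbers are hypothetical and are not taken from SciPy.

```python
from collections import namedtuple

# Hypothetical result type with the same two field names as in the question.
TestResult = namedtuple("TestResult", ["statistic", "pvalue"])

res = TestResult(statistic=2.5, pvalue=0.03)

# Access by name ...
print(res.statistic, res.pvalue)

# ... or unpack positionally, in the order the fields were declared.
t, p = res
print(t, p, isinstance(res, tuple))
```

The field order given to namedtuple is what fixes the unpacking order.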
To unpack a data type with a, b = some_object, the object on the right side needs to be iterable. A list or tuple works, obviously, but you can make your own class iterable by implementing an __iter__ method.
For example, the following class would behave consistently with the interface you've shown the Ttest_indResult class to have (though it's probably implemented very differently):
class MyClass:
    def __init__(self, statistic, pvalue):
        self.statistic = statistic  # these attributes are accessible by name
        self.pvalue = pvalue

    def __iter__(self):  # but you can also iterate to get the same values
        yield self.statistic
        yield self.pvalue
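A short usage sketch of the __iter__ approach follows; PairResult is a hypothetical name and the values are made up. It shows that unpacking works even though the object is not a tuple.

```python
class PairResult:
    """Hypothetical class: attribute access plus iteration support."""
    def __init__(self, statistic, pvalue):
        self.statistic = statistic
        self.pvalue = pvalue

    def __iter__(self):
        # The yield order determines the unpacking order.
        yield self.statistic
        yield self.pvalue


res = PairResult(1.7, 0.09)
t, p = res  # unpacking calls iter(res) behind the scenes
print(t, p, isinstance(res, tuple))
```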
I have a question on the usage of the setattr method in python.
I have a python class with around 20 attributes, which can be initialized in the below manner:
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a similar manner
        if 'column_A' in pd_df_row.columns:
            self.attribute_A = pd_df_row['column_A']
        else:
            self.attribute_A = np.nan
        ....
        if 'column_Z' in pd_df_row.columns:
            self.attribute_Z = pd_df_row['column_Z']
        else:
            self.attribute_Z = np.nan
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
Is it advisable to write the class in the below way instead, making use of the setattr method and for loop in python:
class SomeClass():
    # create a static list to store the mapping between attribute names and column names that can be initialized using a similar logic.
    # However, the mapping would not cover all columns in the input pd_df_row or cover all attributes of the class, because not all columns are read and stored in the same way
    # (this mapping will be hardcoded. Its initialization cannot be further simplified using a loop, because the attribute name and the corresponding column name do not actually follow any particular patterns)
    ATTR_LIST = [('attribute_A', 'column_A'), ('attribute_B', 'column_B'), ...,('attribute_Z', 'column_Z')]

    def __init__(self, pd_df_row):  # where pd_df_row is a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a loop
        for attr_name, col_name in SomeClass.ATTR_LIST:
            if col_name in pd_df_row.columns:
                setattr(self, attr_name, pd_df_row[col_name])
            else:
                setattr(self, attr_name, np.nan)
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
The second way of writing this class seems to shorten the code. However, it also seems to make the structure of the class a bit confusing, because it creates a static list mapping attribute names to column names (which is used to initialize only some, not all, of the attributes). Also, I noticed that code auto-completion does not work for the second piece of code, as the code editor cannot know which attributes are created until run time. Therefore my question is: is it advisable to use setattr() in this way? In what cases should I write my code this way, and in what cases should I avoid doing so?
In addition, does creating the static mapping in the class violate object-oriented programming principles? Should I create and store this mapping somewhere else instead?
Thank you.
You could, but I would consider having a dict of attributes rather than separate similarly named attributes.
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        self.attributes = {}
        for x in ['A', ..., 'Z']:
            column = f'column_{x}'
            if column in pd_df_row:
                self.attributes[x] = pd_df_row[column]
            else:
                self.attributes[x] = np.nan
        # initialize some other attributes
        self.other_attribute = some_other_values

    # some other methods
    def compute_something(self):
        return self.attributes['A'] + self.attributes['B']  # the dict is named attributes
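For comparison, the setattr variant from the question can be reduced to a few lines when the row offers a get method with a default. The sketch below uses a plain dict as the row and math.nan as the fallback so that it runs without pandas; a pandas Series also provides get(key, default), but treating the real input that way is an assumption here. ATTR_TO_COLUMN is a hypothetical two-entry mapping.

```python
import math


class SomeClass:
    # Hypothetical, hardcoded mapping from attribute name to column name.
    ATTR_TO_COLUMN = {
        "attribute_A": "column_A",
        "attribute_B": "column_B",
    }

    def __init__(self, row):
        # row is assumed to be any mapping with a .get(key, default) method.
        for attr_name, col_name in self.ATTR_TO_COLUMN.items():
            setattr(self, attr_name, row.get(col_name, math.nan))

    def compute_something(self):
        return self.attribute_A + self.attribute_B


full = SomeClass({"column_A": 1.0, "column_B": 2.0})
partial = SomeClass({"column_A": 1.0})  # column_B is missing
print(full.compute_something(), partial.attribute_B)
```

The trade-off raised in the question still applies: the attributes exist only at run time, so editors cannot auto-complete them, whereas the dictionary version above keeps them visibly grouped in one place.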
How do I return a list of objects here, rather than a list of strings?
I want to return a list of test objects, not a list of str.
class test:
    val = ""
    def __init__(self,v):
        self.val = v
    def tolower(self,k):
        k = k.val.lower()
        return k

def test_run():
    tests_lst = []
    tests_lst.append(test("TEST-0"))
    tests_lst.append(test("TEST-1"))
    tests_lst.append(test("TEST-2"))
    i_want_object_of_test = map(lambda x:x.val.lower(),tests_lst)

if __name__ == '__main__':
    test_run()
OUTPUT:
['test-0', 'test-1', 'test-2']
I want a list of test objects where each object's val has been changed to lower case.
The question is unclear. I'll answer by what I understand.
What I understand is that you are trying to create a new list of test objects, with the values as lower case.
You can do this either by changing the state of each of the objects in a for loop (changing state is usually not recommended):
for test_obj in tests_lst:
    test_obj.val = test_obj.val.lower()
A way to do it through a list comprehension is to create new test instances:
i_want_object_of_test = [test(test_obj.val.lower()) for test_obj in tests_lst]
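The list-comprehension route can be checked with a small self-contained sketch. The Record class below is a hypothetical stand-in for the question's test class, and lowered is an illustrative method name.

```python
class Record:
    """Hypothetical stand-in for the question's test class."""
    def __init__(self, val):
        self.val = val

    def lowered(self):
        # Return a new object instead of mutating this one.
        return Record(self.val.lower())


records = [Record("TEST-0"), Record("TEST-1"), Record("TEST-2")]
lowered = [r.lowered() for r in records]

print([r.val for r in lowered])  # values of the new objects
print(records[0].val)            # the originals are unchanged
```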
Besides, there are a few problems with your test class:
In Python 2 it is an old-style class; there you should always inherit from object in your classes: class test(object):
You define a class variable by putting val = "" in your class definition, and you then override it in each instance.
Your tolower method takes another test instance (k) and returns its value in lower case. I assume you want to either return a new test object or change the current one in place. Either way the method should only use self.
I have a base class A with some heavy attributes (actually large numpy arrays) that are derived from data given to A's __init__() method.
First, I would like to subclass A into a new class B that modifies these attributes with some B-specific methods. As these attributes are quite expensive to obtain, I don't want to instantiate B the same way as A, but rather use an A instance to initialize a B object. This is a kind of type cast between A and B, and I think I should use the __new__() method to return a B object.
Second, before every computation on B's attributes, I must be sure that the state of B has been restored to the current state of the A instance that was used to instantiate B, without creating a new B object every time: a kind of dynamic linkage.
Here is an example code I wrote:
from copy import deepcopy
import numpy as np

class A(object):
    def __init__(self, data):
        self.data = data

    def generate_derived_attributes(self):
        print("generating derived attributes...")
        self.derived_attributes = self.data.copy()  # use the instance's data, not an undefined name

class B(A):
    def __new__(cls, obj_a):
        assert isinstance(obj_a, A)
        cls = deepcopy(obj_a)
        cls.__class__ = B
        cls._super_cache = obj_a  # This is not a copy... no additional memory required
        return cls

    def compute(self):
        # First reset the state (may use a decorator ?)
        self.reset()
        print("Doing some computations...")

    def reset(self):
        print("\nResetting object to its initial state")
        _super_cache = self._super_cache  # For not being destroyed...
        self.__dict__ = deepcopy(self._super_cache.__dict__)
        self._super_cache = _super_cache

if __name__ == '__main__':
    a = A(np.zeros(100000000, dtype=float))
    a.generate_derived_attributes()
    print(a)
    b = B(a)
    print(b)
    b.compute()
    b.compute()
Is this implementation a reasonable way to reach my objective in Python, or are there more Pythonic ways? Could I be more generic? (I know that relying on __dict__ will not be a good choice in every case, especially when using __slots__.) Do you think that using a decorator around B.compute() would give me more flexibility for using this along with other classes?
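On the decorator idea, one possible shape is sketched below. It is only an illustration under stated assumptions: it uses composition (the working object holds a reference to its source) instead of the __new__ and __class__ approach above, and the names resets_first, Source and Working are hypothetical. Small lists stand in for the large arrays.

```python
import copy
import functools


def resets_first(method):
    """Decorator: call self.reset() before running the wrapped method."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.reset()
        return method(self, *args, **kwargs)
    return wrapper


class Source:
    """Hypothetical stand-in for A: owns the expensive data."""
    def __init__(self, values):
        self.values = list(values)


class Working:
    """Hypothetical stand-in for B: works on a resettable copy."""
    def __init__(self, source):
        self._source = source  # a reference, not a copy
        self.reset()

    def reset(self):
        # Re-read the current state of the source every time.
        self.values = copy.deepcopy(self._source.values)

    @resets_first
    def compute(self):
        self.values[0] += 1
        return self.values[0]


src = Source([0, 0])
w = Working(src)
print(w.compute(), w.compute())  # each call starts from the source state
src.values[0] = 10
print(w.compute())               # picks up the change made to the source
```

Any class that defines reset() can reuse the same decorator on other methods, which is the flexibility the question asks about.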
I am trying to write code in which I can set a variable, say n, to create n instances of a particular class. The instances have to be named 'Node_1', 'Node_2', ..., 'Node_n'. I've tried to do this in several ways using a for loop, but I always get the error: 'Can't assign to operator.'
My latest effort is as follows:
class C():
    pass

for count in range(1,3):
    "node"+str(count)=locals()["C"]()
    print(node)
I understand that the "node" + str(count) is not possible, but I don't see how I can solve this issue.
Any help on the matter will be greatly appreciated.
You could do what you're trying to do, but it's a really bad idea. You should either use a list or a dict; since you seem to want the names to be nodeX, and starting from 1, you should use a dict.
nodes = {'node{}'.format(x): C() for x in range(1, 3)}
Depending on what you're doing, you could also use a defaultdict.
from collections import defaultdict
nodes = defaultdict(C)
print(nodes['node1'])
nodes['node2'].method()
print(nodes['anything-can-go-here'])
Once you're doing that though, there's no need for the 'node' prefix.
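Both dictionary variants can be tried in a few lines. The sketch below assumes an empty class C, as in the question, and the variable names are illustrative.

```python
from collections import defaultdict


class C:
    pass


# Eager: build exactly the keys you need up front.
nodes = {"node{}".format(x): C() for x in range(1, 3)}
print(sorted(nodes))

# Lazy: defaultdict calls C() the first time a missing key is looked up.
lazy_nodes = defaultdict(C)
first = lazy_nodes["node1"]
print(lazy_nodes["node1"] is first, len(lazy_nodes))
```

Note that with a defaultdict a mistyped key silently creates a new instance, so the eager form is safer when the set of names is known in advance.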
The best pattern for creating several similar objects is a list comprehension:
class C():
    pass

nodes = [C() for i in range(3)]
This leaves you with three objects of class C, stored in a list called nodes. Access each object in the normal way, with indexing (e.g. nodes[0]).
You're trying to assign a value to a string. You can write Node_1 = C(), but "Node_1" = C() is meaningless, as "Node_1" is a string literal, not an identifier.
It's a little sketchy, but you can use the locals() dictionary to access the identifiers by name:
for count in range(1, 3):
    locals()["node" + str(count)] = C()
...and, having done that at module level (where locals() is the same dictionary as globals()), you can then use node1 and node2 as if they were defined explicitly in your code. Inside a function this does not work, because writes to the dictionary returned by locals() do not create local variables.
Typically, however, it's preferable not to access your locals this way. Instead, you should probably use a separate dictionary of your own that stands on its own and contains the values:
nodes = {}
for count in range(1, 3):
    nodes[count] = C()
... and the values can then be accessed like so: nodes[1], nodes[2], etc.
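The difference between the two approaches shows up as soon as the code moves into a function. The sketch below uses the hypothetical function names via_locals and via_dict; it relies on the fact that, in CPython, writing to the dictionary returned by locals() inside a function does not create a local variable.

```python
class C:
    pass


def via_locals():
    # Writing into locals() inside a function does not define a variable.
    locals()["node1"] = C()
    try:
        return node1  # looked up as a global name, which does not exist
    except NameError:
        return None


def via_dict():
    # An explicit dictionary behaves the same everywhere.
    return {count: C() for count in range(1, 3)}


print(via_locals())
print(sorted(via_dict()))
```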
What I like to do is keep a registry of all the instances of a class:
from collections import OrderedDict

class C(object):
    instances = OrderedDict()  # matches the OrderedDict shown in the output below

    def __new__(cls, *args, **kwargs):
        instance = super(C, cls).__new__(cls)  # object.__new__ accepts no extra arguments
        instance_name = 'node_{}'.format(len(cls.instances))
        cls.instances[instance_name] = instance
        return instance

if __name__ == '__main__':
    for _ in range(3):
        C()
    print(C.instances)
OrderedDict([('node_0', <__main__.C object at 0x10c3fe8d0>), ('node_1', <__main__.C object at 0x10c4cb610>), ('node_2', <__main__.C object at 0x10c4e04d0>)])