I have a Python object with many attributes and functions (dummy example below):
class molecule:
    def __init__(self, atoms, coords):
        self.atoms=np.copy(atoms)
        self.coords=np.copy(coords)
    def shift(self,r):
        self.coords=self.coords+r
I would like to generate, preferably, a numpy array (or a list) of these objects and to obtain their attributes without always looping over the array. At the moment I create a list of molecule objects (mols) in a loop and read its attributes with loops, e.g.:
atomList=[mol.atoms for mol in mols]
but I would prefer to obtain it as:
atomList=mols.atoms
Is there an automatic way to obtain such an array/list class without manually defining a molList class and manually adding its attributes, functions, etc.?
You can use a class variable. The difference between class variables and instance variables is explained here:
https://medium.com/python-features/class-vs-instance-variables-8d452e9abcbd#:~:text=Class%20variables%20are%20shared%20across,surprising%20behaviour%20in%20our%20code.
For your example, something like this ought to work:
class molecule:
    atomList = [] # class variable
    def __init__(self, atoms, coords):
        self.atoms=np.copy(atoms) # instance variable
        self.coords=np.copy(coords)
        molecule.atomList.append(atoms) # update the class variable with each new instance of the class
    def shift(self,r):
        self.coords=self.coords+r
Then in your code, you can just do atomlist = molecule.atomList
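Note that the class variable collects the atoms of every molecule ever created, not those of one particular list. If a per-list view is wanted, one possible alternative (not part of the answer above) is a small list subclass that forwards attribute access to its elements. This is a minimal sketch: AttrList is a hypothetical helper name, and Molecule is a simplified stand-in that uses plain lists instead of numpy arrays.

```python
class AttrList(list):
    """A list that forwards unknown attribute lookups to each element (hypothetical helper)."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup on the list fails,
        # so list methods such as append() keep working as usual.
        if name.startswith('__'):
            raise AttributeError(name)
        values = [getattr(item, name) for item in self]
        if values and all(callable(v) for v in values):
            # Forward method calls: mols.shift(r) calls shift(r) on every element.
            def call_all(*args, **kwargs):
                return AttrList(v(*args, **kwargs) for v in values)
            return call_all
        return AttrList(values)


class Molecule:
    """Simplified stand-in for the molecule class in the question."""

    def __init__(self, atoms, coords):
        self.atoms = list(atoms)
        self.coords = list(coords)

    def shift(self, r):
        self.coords = [c + r for c in self.coords]


mols = AttrList([Molecule(['H', 'H'], [0.0, 0.5]), Molecule(['O'], [1.0])])
atom_list = mols.atoms   # one entry per molecule, no explicit loop at the call site
mols.shift(1.0)          # calls shift(1.0) on every molecule
```

The loop still runs inside __getattr__; the wrapper only hides it from the calling code.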
Related
I have a question on the usage of the setattr method in python.
I have a python class with around 20 attributes, which can be initialized in the below manner:
class SomeClass():
    def __init__(self, pd_df_row): # pd_df_row is one row from a dataframe
        #initialize some attributes (attribute_A to attribute_Z) in a similar manner
        if 'column_A' in pd_df_row.columns:
            self.attribute_A = pd_df_row['column_A']
        else:
            self.attribute_A = np.nan
        ....
        if 'column_Z' in pd_df_row.columns:
            self.attribute_Z = pd_df_row['column_Z']
        else:
            self.attribute_Z = np.nan
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])
    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
Is it advisable to write the class in the below way instead, making use of the setattr method and for loop in python:
class SomeClass():
    # create a static list to store the mapping between attribute names and column names that can be initialized using a similar logic.
    # However, the mapping would not cover all columns in the input pd_df_row or cover all attributes of the class, because not all columns are read and stored in the same way
    # (this mapping will be hardcoded. Its initialization cannot be further simplified using a loop, because the attribute name and the corresponding column name do not actually follow any particular patterns)
    ATTR_LIST = [('attribute_A', 'column_A'), ('attribute_B', 'column_B'), ...,('attribute_Z', 'column_Z')]
    def __init__(self, pd_df_row): #where pd_df_row is a dataframe
        #initialize some attributes (attribute_A to attribute_Z) in a loop
        for attr_name, col_name in SomeClass.ATTR_LIST:
            if col_name in pd_df_row.columns:
                setattr(self, attr_name, pd_df_row[col_name])
            else:
                setattr(self, attr_name, np.nan)
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])
    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
The second way of writing this class seems to shorten the code. However, it also seems to make the structure of the class a bit confusing, by creating the static list of attribute and column name mappings (which will be used to initialize only some but not all of the attributes). Also, I noticed that code auto-completion will not work for the second piece of code, as the code editor won't be able to know which attributes are created until run time. Therefore my question is: is it advisable to use setattr() in this way? In what cases should I write my code in this way, and in what cases should I avoid doing so?
In addition, does creating the static mapping in the class violate object-oriented programming principles? Should I create and store this mapping in some other place instead?
Thank you.
You could, but I would consider having a dict of attributes rather than separate similarly named attributes.
class SomeClass():
    def __init__(self, pd_df_row): # pd_df_row is one row from a dataframe
        self.attributes = {}
        for x in ['A', ..., 'Z']:
            column = f'column_{x}'
            if column in pd_df_row:
                self.attributes[x] = pd_df_row[column]
            else:
                self.attributes[x] = np.nan
        # initialize some other attributes
        self.other_attribute = some_other_values
    # some other methods
    def compute_something(self):
        return self.attributes['A'] + self.attributes['B']
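Since the question states that attribute and column names follow no pattern, the dict idea can also be combined with an explicit mapping. The following is a minimal sketch under stated assumptions: ATTR_TO_COLUMN and its entries are hypothetical, and the row is only assumed to offer .get(key, default), which both a plain dict and a pandas Series provide. A plain dict is used here so the example runs without pandas.

```python
import math


class SomeClass:
    # Hypothetical mapping of attribute names to column names.
    ATTR_TO_COLUMN = {
        'attribute_A': 'column_A',
        'attribute_B': 'column_B',
    }

    def __init__(self, row):
        # row is assumed to support .get(key, default);
        # missing columns fall back to NaN.
        self.attributes = {
            attr: row.get(col, math.nan)
            for attr, col in self.ATTR_TO_COLUMN.items()
        }

    def compute_something(self):
        return self.attributes['attribute_A'] + self.attributes['attribute_B']


full = SomeClass({'column_A': 1.0, 'column_B': 2.0})
partial = SomeClass({'column_A': 1.0})  # column_B is missing, so its value is NaN
```

Keeping the values in one dict avoids setattr entirely, at the cost of writing obj.attributes['attribute_A'] instead of obj.attribute_A.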
I have a class which has some class variables, methods, etc. Let's call it Cell.
class Cell:
    def __init__(self):
        self.status = 0
        ...
I have a list of different instances of this class.
grid = [Cell.Cell() for i in range(x_size*y_size)]
Is it possible to get the status variable shown above from each of the instances stored in grid in a vectorized manner, without looping through the elements of the list?
Not in vanilla Python.
statuses = [x.status for x in grid]
If you are looking for something that abstracts away the explicit iteration, or even just the for keyword, perhaps you'd prefer
from operator import attrgetter
statuses = list(map(attrgetter('status'), grid))
?
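Both variants can be checked side by side. This is a small self-contained sketch; the Cell class here takes a status argument purely for illustration, which the class in the question does not.

```python
from operator import attrgetter


class Cell:
    def __init__(self, status=0):
        self.status = status


grid = [Cell(i % 2) for i in range(6)]

# Explicit iteration with a list comprehension.
statuses_loop = [cell.status for cell in grid]

# Same result with the iteration hidden inside map().
get_status = attrgetter('status')
statuses_map = list(map(get_status, grid))
```

Either way the elements are still visited one at a time; map only moves the loop out of sight.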
I have a class that does some complex calculation and generates some result MyClass.myresults.
MyClass.myresults is actually a class itself with different attributes (e.g. MyClass.myresults.mydf1, MyClass.myresults.mydf2).
Now, I need to run MyClass iteratively following a list of scenarios (scenarios=[1,2,[2,4], 5]).
This happens with a simple loop:
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass() #Initialize MyClass
    myclass.DoStuff(someInput) #Do stuff and get results
    results.StoreScenario(myclass.myresults, iter)
and at the end of each iteration store MyClass.myresults.
I would like to create a separate class (Results) that at each iteration creates a subclass scenario_1, scenario_2, scenario_2_4 and stores within it MyClass.myresults.
class Results:
    # no initialization, is an empty container to which I would like to add attributes iteratively
    class StoreScenario:
        def __init__(self, myresults, iter):
            self.'scenario_'.join(str(iter)) = myresults #just a guess, I am assuming this is wrong
Suggestions on different approaches are more than welcome, I am quite new to classes and I am not sure if this is an acceptable approach or if I am doing something awful (clunky, memory inefficient, or else).
There are two problems with this approach. First, the separate Results class only stores modified values of your class MyClass, so the two should really be the same class.
The second problem is memory efficiency: at each iteration you create the same object twice, once for the actual values and once for the modified values.
The suggested approach is to use a hash map, i.e. a dictionary in Python. With a dictionary you can store copies of the modified object efficiently, and there is no need to create another class.
class MyClass:
    def __init__(self):
        # some attributes ...
        self.scenarios_result = {}
superObject = MyClass()
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass() #Initialize MyClass
    myclass.DoStuff(someInput) #Do stuff and get results
    # results.StoreScenario(myclass.myresults, iter)
    # a list is not hashable, so convert it to a tuple before using it as a key
    superObject.scenarios_result[tuple(iter)] = myclass
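One detail matters when a dictionary is used this way: keys must be hashable, and a list such as [2, 4] is not. A minimal sketch of normalising each scenario into a tuple key follows; scenario_key is a hypothetical helper, and a string stands in for the real myclass.myresults object.

```python
def scenario_key(scenario):
    """Turn an int or a list of ints into a hashable tuple (hypothetical helper)."""
    if isinstance(scenario, int):
        return (scenario,)
    return tuple(scenario)


scenarios = [1, 2, [2, 4], 5]
results = {}
for scenario in scenarios:
    key = scenario_key(scenario)
    # Placeholder for myclass.myresults; a label is stored instead.
    results[key] = 'result for ' + '_'.join(str(x) for x in key)
```

A stored scenario is then looked up with results[(2, 4)], and iterating over results.items() visits all scenarios in insertion order.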
So I solved it using setattr:
class Results:
    def __init__(self):
        self.scenario_results = type('ScenarioResults', (), {}) # empty class used as an attribute container
    def store_scenario(self, data, scenarios):
        scenario_key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, scenario_key,
                subclass_store_scenario(data))
class subclass_store_scenario:
    def __init__(self, data):
        self.some_stuff = data.result1.__dict__
        self.other_stuff = data.result2.__dict__
This allows me to call things like (some_stuff is a plain dict taken from __dict__, so its entries are accessed by key):
results.scenario_results.scenario_1.some_stuff['something']
results.scenario_results.scenario_1.some_stuff['something_else']
This is necessary for me as I need to compute other measures, summary or scenario-specific, which I can then iteratively assign, again using setattr:
def construct_measures(self, some_data, configuration):
    # the stored scenarios are the attributes of self.scenario_results
    for name, scenario in vars(self.scenario_results).items():
        if not name.startswith('scenario_'):
            continue # skip built-in entries such as __module__
        # scenario is a reference to the stored scenario object,
        # so we can simply add attributes to it
        setattr(scenario, 'some_measure',
                self.computeSomething(
                    some_data.input1, some_data.input2))
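For comparison, the same attribute-style access can be had from the standard library's types.SimpleNamespace, which is an ordinary object whose attributes can be listed with vars(). This is a minimal sketch and not the code above: add_measure and the data field are hypothetical names, and plain lists stand in for the real result objects.

```python
from types import SimpleNamespace


class Results:
    def __init__(self):
        # An empty namespace object; attributes are added per scenario.
        self.scenario_results = SimpleNamespace()

    def store_scenario(self, data, scenarios):
        key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, key, SimpleNamespace(data=data))

    def add_measure(self, name, func):
        # vars() exposes the namespace attributes as a dict we can iterate over.
        for scenario in vars(self.scenario_results).values():
            setattr(scenario, name, func(scenario.data))


results = Results()
results.store_scenario([1, 2, 3], [1])
results.store_scenario([10, 20], [2, 4])
results.add_measure('total', sum)
```

After this, results.scenario_results.scenario_2_4.total holds the measure for the combined scenario.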
I have created a class distance_neighbor in which one of the attributes is a list of objects of class Crime. The values for all attributes come from a database query result.
At first, I set the data_Crime list as the value for the **Crime attribute of class distance_neighbor, and I used del to clear the data_Crime list after use, so that it can be reused in the next loop iteration.
This is my code:
conn = psycopg2.connect("dbname='Chicago_crime' user='postgres' host='localhost' password='1234'")
cur= conn.cursor()
minDistance=float(input("Minimum distance value : "))
cur.execute("""SELECT id_objek1, objek1, id_objek2, objek2, distance from tb_distance1 where distance<'%f'""" %(minDistance))
class Crime:
    def __init__(self, id_jenis, jenis):
        self.id_jenis=id_jenis
        self.jenis=jenis
class distance_neighbor (Crime):
    def __init__(self, distance, **Crime):
        self.distance = distance
        self.Crime = Crime
data_Crime =[]
data_distance = []
for id_objek1, objek1, id_objek2, objek2, distance in cur.fetchall():
    data_Crime.append(Crime(id_objek1,objek1))
    data_Crime.append(Crime(id_objek2,objek2))
    data_distance.append(distance_neighbor(distance, data_Crime))
    del data_Crime[:]
Error message:
    data_distance.append(distance_neighbor(distance, data_Crime))
TypeError: __init__() takes exactly 2 arguments (3 given)
I have fixed my code using the answers below. Thank you.
This should be closer to what you want:
class Crime(object):
    def __init__(self, id_jenis, jenis):
        self.id_jenis=id_jenis
        self.jenis=jenis
class DistanceNeighbor(object):
    def __init__(self, distance, crimes):
        self.distance = distance
        self.crimes = crimes
data_distance = []
for id_objek1, objek1, id_objek2, objek2, distance in cur.fetchall():
    crimes = [Crime(id_objek1,objek1), Crime(id_objek2,objek2)]
    data_distance.append(DistanceNeighbor(distance, crimes))
Classes in Python 2 should always inherit from object. By convention, class names are in CamelCase.
The inheritance of DistanceNeighbor from Crime seems unnecessary. I changed this.
Instance attributes should be lowercase, so I used crimes instead of the very confusing reuse of the class name Crime.
This line:
def __init__(self, distance, **Crime):
collects any extra keyword arguments into a dictionary named Crime. It does not accept extra positional arguments.
In your case the call passes three positional arguments to __init__:
self, distance, data_Crime
but only two (self and distance) are accepted positionally, which causes this error message:
TypeError: __init__() takes exactly 2 arguments (3 given)
The instantiation of Crime is pretty short. So, instead of the two appends you can create the list of the two Crime instances in one line:
crimes = [Crime(id_objek1,objek1), Crime(id_objek2,objek2)]
Since this creates a new list in each loop, there is no need to delete the list content in each loop, as you did with del data_Crime[:].
You've defined your __init__ method in distance_neighbor as taking arguments (self, distance, **Crime). The ** before Crime tells Python to pack up any keyword arguments you're passed into a dictionary named Crime. That's not what you're doing though. Your call is distance_neighbor(distance, data_Crime) where data_Crime is a list. You should just accept that as a normal argument in the __init__ method:
class distance_neighbor (Crime):
    def __init__(self, distance, crime):
        self.distance = distance
        self.crime = crime
This will mostly work, but you'll still have an issue. The problem is that the loop that's creating the distance_neighbor objects is reusing the same list for all of them (and using del data_Crime[:] to clear the values in between). If you are keeping a reference to the same list in the objects, they'll all end up with references to that same list (which will be empty) at the end of the loop.
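The shared-list effect can be reproduced in isolation. This is a minimal sketch in which Holder is a hypothetical stand-in for distance_neighbor:

```python
class Holder:
    def __init__(self, items):
        self.items = items  # stores a reference to the list, not a copy


shared = []
holders = []
for value in (1, 2):
    shared.append(value)
    holders.append(Holder(shared))
    del shared[:]  # empties the one list that every Holder points to

# Every Holder refers to the same, now empty, list object.
same_object = holders[0].items is holders[1].items
contents = [h.items for h in holders]
```

Because del shared[:] mutates the list in place, the values are lost for all objects that hold a reference to it.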
Instead, you should create a new list for each iteration of your loop:
for id_objek1, objek1, id_objek2, objek2, distance in cur.fetchall():
    data_Crime = [Crime(id_objek1,objek1), Crime(id_objek2,objek2)]
    data_distance.append(distance_neighbor(distance, data_Crime))
This will work, but there are still more things that you probably want to improve in your code. To start with, distance_neighbor is defined as inheriting from Crime, but that doesn't seem appropriate since it contains instances of Crime, rather than being one itself. It should probably inherit from object (or nothing if you're in Python 3, where object is the default base). You may also want to change your class and variable names to match Python convention: CamelCase for class names and lower_case_with_underscores for functions, variables and attributes.
def __init__(self, distance, **Crime):
**Crime collects keyword arguments, so it expects named arguments. You don't need that; remove the asterisks.
Also, rename the argument. It's very confusing that it has the same name as the class:
class distance_neighbor(Crime):
    def __init__(self, distance, c):
        self.distance = distance
        self.Crime = c
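To see what ** does in a signature, here is a minimal sketch with a hypothetical Demo class. It is written for Python 3, where the wording of the TypeError differs slightly from the Python 2 message in the question.

```python
class Demo:
    def __init__(self, distance, **extras):
        self.distance = distance
        self.extras = extras  # dict built from the keyword arguments


# Keyword arguments are collected into the extras dict.
ok = Demo(1.5, first='a', second='b')

# A list passed positionally is a third positional argument, which is rejected.
try:
    Demo(1.5, ['a', 'b'])
except TypeError as exc:
    error = str(exc)
```

Here ok.extras is a dictionary with the keys first and second, while the second call fails before __init__ runs at all.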
I have 2 classes, a parent and a class which inherits from it. In a list I have an arbitrary number of objects of the parent class, and I need to convert them all to the child class.
A really simplified version of the code would look like this:
class parent(object):
    def __init__(self):
        self.a = 1
class child(parent):
    def __init__(self):
        self.b = 2
list_of_objects = []
for x in range(0, 10):
    a = parent()
    list_of_objects.append(a)
I'm pretty sure I could convert the objects one by one in a loop using the following line.
a.__dict__ = b.__dict__
But is there a way to convert the whole list at once?
You shouldn't use a.__dict__ = b.__dict__: it makes both objects share the same attribute dictionary, and it leaves the class of a unchanged. Also, __dict__ only holds attributes stored on the instance, and instances of classes that define __slots__ may not have one at all. If you're sure the classes are pure Python and the internal object properties are named similarly, you could assign a.__class__ = b.__class__ instead.
If you're able to create instances of child, a somewhat cleaner way may be to define a function that creates a child instance from a parent instance. You can avoid the loop by using map or list comprehensions:
def parent_to_child(parent):
    newchild= child()
    newchild.property= parent.property
    #...
    return newchild
list_of_children= list(map(parent_to_child, list_of_parents))
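A runnable variant of this idea is sketched below. It assumes ordinary pure-Python classes without __slots__; from_parent is a hypothetical classmethod name, and the child calls super().__init__() so that it also gets the parent's attributes, which the simplified code in the question does not do.

```python
class Parent:
    def __init__(self):
        self.a = 1


class Child(Parent):
    def __init__(self):
        super().__init__()
        self.b = 2

    @classmethod
    def from_parent(cls, parent):
        """Build a new Child and copy over the parent's instance attributes."""
        child = cls()
        child.__dict__.update(vars(parent))
        return child


parents = [Parent() for _ in range(10)]
parents[0].a = 42  # a value that must survive the conversion

# One expression converts the whole list; the originals are left untouched.
children = [Child.from_parent(p) for p in parents]
```

This creates new objects rather than changing the existing ones in place, so any other references to the original parent instances keep pointing at parents, not children.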