I have a question on the usage of the setattr function in Python.
I have a Python class with around 20 attributes, which can be initialized in the following manner:
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a similar manner
        if 'column_A' in pd_df_row.columns:
            self.attribute_A = pd_df_row['column_A']
        else:
            self.attribute_A = np.nan
        ....
        if 'column_Z' in pd_df_row.columns:
            self.attribute_Z = pd_df_row['column_Z']
        else:
            self.attribute_Z = np.nan
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
Is it advisable to write the class in the way below instead, making use of the setattr function and a for loop in Python:
class SomeClass():
    # Create a static list to store the mapping between attribute names and column names that can be initialized using similar logic.
    # However, the mapping would not cover all columns in the input pd_df_row or all attributes of the class, because not all columns are read and stored in the same way.
    # (This mapping will be hardcoded. Its initialization cannot be further simplified using a loop, because the attribute names and the corresponding column names do not actually follow any particular pattern.)
    ATTR_LIST = [('attribute_A', 'column_A'), ('attribute_B', 'column_B'), ..., ('attribute_Z', 'column_Z')]

    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a loop
        for attr_name, col_name in SomeClass.ATTR_LIST:
            if col_name in pd_df_row.columns:
                setattr(self, attr_name, pd_df_row[col_name])
            else:
                setattr(self, attr_name, np.nan)
        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
The second way of writing this class seems to shorten the code. However, it also seems to make the structure of the class a bit confusing, by creating the static list mapping attribute names to column names (which is used to initialize only some, but not all, of the attributes). Also, I noticed that code auto-completion will not work for the second piece of code, as the code editor won't be able to know which attributes are created until run time. Therefore my question is: is it advisable to use setattr() in this way? In which cases should I write my code this way, and in which cases should I avoid doing so?
In addition, does creating the static mapping in the class violate object-oriented programming principles? Should I create and store this mapping somewhere else instead?
Thank you.
You could, but I would consider having a dict of attributes rather than separate similarly named attributes.
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        self.attributes = {}
        for x in ['A', ..., 'Z']:
            column = f'column_{x}'
            if column in pd_df_row:
                self.attributes[x] = pd_df_row[column]
            else:
                self.attributes[x] = np.nan
        # initialize some other attributes
        self.other_attribute = some_other_values

    # some other methods
    def compute_something(self):
        return self.attributes['A'] + self.attributes['B']
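For illustration, here is a small usage sketch of the dict-based version (my addition; it assumes the elided pieces above, the ... letter list and some_other_values, are filled in, and that pd_df_row is a pandas Series or one-row slice):

import numpy as np
import pandas as pd

row = pd.Series({'column_A': 1.0, 'column_B': 2.5})
obj = SomeClass(row)
print(obj.attributes['A'])      # 1.0
print(obj.attributes['Z'])      # nan, because column_Z is missing from the row
print(obj.compute_something())  # 3.5

Missing columns simply end up as np.nan in the dict, so downstream methods need to cope with NaN values.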
Related
I have a problem with naming two similar methods. One is a static method and the other is a method that does the same thing but works on the instance. Is this a proper way to do it, or should I use only a static method?
class GameBoard():
    def __init__(self, blank_board):
        self.board = blank_board

    @staticmethod
    def get_empty_cells(board):
        """Returns a list of empty cell coordinates (x, y)"""
        empty = []
        for row_no, row in enumerate(board):
            for cell_no, cell in enumerate(row):
                if cell == ' ':
                    empty.append((row_no, cell_no))
        return empty

    def board_empty_cells(self):
        return self.get_empty_cells(self.board)

board1 = GameBoard(blank_board)
board2 = [.....]
empty_board1 = board1.board_empty_cells()
empty_board2 = GameBoard.get_empty_cells(board2)
The reason for that is that I would like to be able to evaluate other boards with the static method, but if I want to get the instance's own empty cells I would also like to be able to call board_empty_cells().
Is that a clean-code approach, or should I get the empty cells like this:
board1 = GameBoard(blank_board)
empty_board1 = board1.get_empty_cells(board1.board)
What would be proper names for those two functions, so that they are descriptive and make it unambiguous that one of them is a static method and the other operates on the instance? Is there any convention to follow to distinguish static methods from instance methods?
@staticmethod
def get_empty_cells(board):
    pass

def board_empty_cells(self):
    pass
A lot of times I run into a 'problem' with proper naming of methods and functions.
Is there any guide/convention on how to properly name methods (like get_board, is_finished, etc.)? I don't mean PEP 8, which I'm familiar with; I mean something that would help me choose names that would actually make my code more readable.
I have a class that does some complex calculation and generates some result, MyClass.myresults.
MyClass.myresults is actually a class itself with different attributes (e.g. MyClass.myresults.mydf1, MyClass.myresults.mydf2).
Now, I need to run MyClass iteratively following a list of scenarios (scenarios = [1, 2, [2, 4], 5]).
This happens with a simple loop:
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    results.StoreScenario(myclass.myresults, iter)
and at the end of each iteration store MyClass.myresults.
I would like to create a separate class (Results) that at each iteration creates a subclass scenario_1, scenario_2, scenario_2_4 and stores within it MyClass.myresults.
class Results:
    # no initialization, is an empty container to which I would like to add attributes iteratively
    class StoreScenario:
        def __init__(self, myresults, iter):
            self.'scenario_'.join(str(iter)) = myresults  # just a guess, I am assuming this is wrong
Suggestions on different approaches are more than welcome; I am quite new to classes and I am not sure if this is an acceptable approach or if I am doing something awful (clunky, memory-inefficient, or otherwise).
There are two problems with this approach. The first one is that the Results class (a separate class) only stores modified values of your class MyClass; in other words, they should really be the same class.
The second problem is memory efficiency: you create the same object twice at each iteration, once for the actual values and once for the modified values.
The suggested approach is to use a hashmap, i.e. a dictionary in Python. Using a dictionary you can store copies of the modified object very efficiently, and there is no need to create another class.
class MyClass:
    def __init__(self):
        # some attributes ...
        self.scenarios_result = {}

superObject = MyClass()
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    # results.StoreScenario(myclass.myresults, iter)
    superObject.scenarios_result[tuple(iter)] = myclass  # dict keys must be hashable, so use a tuple
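A stored scenario can then be retrieved by its key later on, e.g. (a hypothetical lookup, using the tuple keys from the sketch above):

superObject.scenarios_result[(2, 4)].myresults.mydf1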
So I solved it using setattr:
class Results:
    def __init__(self):
        self.scenario_results = type('ScenarioResults', (), {})  # create an empty class to use as a namespace

    def store_scenario(self, data, scenarios):
        scenario_key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, scenario_key,
                subclass_store_scenario(data))

class subclass_store_scenario:
    def __init__(self, data):
        self.some_stuff = data.result1.__dict__
        self.other_stuff = data.result2.__dict__
This allows me to call things like:
results.scenario_results.scenario_1.some_stuff['something']
results.scenario_results.scenario_1.some_stuff['something_else']
This is necessary for me as I need to compute other measures, summary or scenario-specific, which I can then iteratively assign, again using setattr:
def construct_measures(self, some_data, configuration):
    # vars() gives the attributes of the ScenarioResults namespace;
    # keep only the scenario_* entries added by store_scenario
    for key, scenario in vars(self.scenario_results).items():
        if not key.startswith('scenario_'):
            continue
        # scenario is one of the subclass_store_scenario objects stored above,
        # so we can simply add attributes to it
        setattr(scenario, 'some_measure',
                self.computeSomething(
                    some_data.input1, some_data.input2))
I do understand how setattr() works in Python, but my question is: when I try to dynamically set an attribute and give it an unbound function as its value, so that the attribute is callable, the attribute ends up taking the name of the unbound function when I call attr.__name__, instead of the name of the attribute.
Here's an example:
I have a Filter class:
class Filter:
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def condition(self, name):
        # i want to be able to get the name of the dynamically set
        # function and check `self.accessor_column` for a value, but when
        # i do `setattr(self, 'accessor', self.condition)`, the function
        # name is always set to `condition` rather than `accessor`
        return name

    def set_conditions(self):
        mapping = list(zip(self.column, self.access))
        for i in mapping:
            poi_column = i[0]
            accessor = i[1]
            setattr(self, accessor, self.condition)
In the class above, the set_conditions method dynamically sets attributes (con and don) on the Filter instance and assigns them a callable, but the callable retains the original name of the function.
When I run this:
>>> f = Filter()
>>> print(f.con('linux'))
>>> print(f.con.__name__)
Expected:
linux
con (which should be the name of the dynamically set attribute)
I get:
linux
condition (name of the value (unbound self.condition) of the attribute)
But I expect f.con.__name__ to return the name of the attribute (con) and not the name of the unbound function (condition) assigned to it.
Can someone please explain why this behaviour occurs and how I can get around it?
Thanks.
function.__name__ is the name under which the function was initially defined; it has nothing to do with the name under which it is accessed. Actually, the whole point of function.__name__ is to correctly identify the function whatever name is used to access it. You definitely want to read this for more on what Python's "names" are.
One of the possible solutions here is to replace the static definition of condition with a closure:
class Filter(object):
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def set_conditions(self):
        mapping = list(zip(self.column, self.access))
        for column_name, accessor_name in mapping:
            # bind the current loop values as default arguments, otherwise every
            # accessor would close over the *last* values of the loop variables
            def accessor(name, accessor_name=accessor_name, column_name=column_name):
                print("in {}.accessor '{}' for column '{}'".format(self, accessor_name, column_name))
                return name
            # this is now technically useless but helps with inspection
            accessor.__name__ = accessor_name
            setattr(self, accessor_name, accessor)
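A quick interactive check of the intended behaviour (my addition, not part of the original answer; the exact object repr in the printed line will differ):

>>> f = Filter()
>>> f.con.__name__
'con'
>>> f.don.__name__
'don'
>>> f.con('linux')
in <__main__.Filter object at 0x...>.accessor 'con' for column 'poi_id'
'linux'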
As a side note (totally unrelated, but I thought you may want to know this), using mutable objects as function argument defaults is one of the most infamous Python gotchas and may yield totally unexpected results, e.g.:
>>> f1 = Filter()
>>> f2 = Filter()
>>> f1.column
['poi_id', 'tp.event']
>>> f2.column
['poi_id', 'tp.event']
>>> f2.column.append("WTF")
>>> f1.column
['poi_id', 'tp.event', 'WTF']
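The usual fix (a minimal sketch, my addition rather than part of the answer) is to default to None and build fresh lists inside __init__:

class Filter(object):
    def __init__(self, column=None, access=None):
        # each instance now gets its own lists instead of sharing the default objects
        self.column = column if column is not None else ['poi_id', 'tp.event']
        self.access = access if access is not None else ['con', 'don']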
EDIT:
thank you for your answer, but it doesn't touch my issue here. My problem is not how functions are named or defined; my problem is that when I use setattr() to set an attribute and give it a function as its value, I can access the value and do what the value does, but since it's a function, why doesn't it return its name as the function name?
Because, as I already explained above, the function's __name__ attribute and the name of the Filter instance attribute(s) referring to this function are totally unrelated, and the function knows absolutely nothing about the names of variables or attributes that reference it, as explained in the reference article I linked to.
Actually, the fact that the object you're passing to setattr is a function is totally irrelevant; from the object's POV it's just a name and an object, period. And the fact that you're binding this object (a function or any other object) to an instance attribute (whether directly or using setattr(), it works just the same) instead of a plain variable is also totally irrelevant - none of those operations will have any impact on the object that is bound (except for increasing its ref counter, but that's a CPython implementation detail - other implementations may implement garbage collection differently).
May I suggest this:
from types import SimpleNamespace

class Filter:
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def set_conditions(self):
        for i in self.access:
            setattr(self, i, SimpleNamespace(name=i, func=lambda name: name))
>>> f = Filter()
>>> print(f.con.func('linux'))
linux
>>> print(f.con.name)
con
[edited after bruno desthuilliers's comment.]
I have multiple classes with methods like the one below:
@property
def max_ill(self):
    maxval = max(self.illarr)
    maxpts = [idx for idx, val in enumerate(self.illarr) if val == maxval]
    maxpts = [self.roomgrid.ptsdict[pt].ptid for pt in maxpts]
    return {'data': maxval, 'points': maxpts}
What I'd like to do is to split this property into two, such that I can access max_ill['data'] and max_ill['points'] as individual properties like .max_ill_data and .max_ill_points. This will aid auto-code-completion and also free me from having to remember what each property returns. However, as you can see above, calling each property individually will result in some of the calculations being repeated.
So, is there an elegant (non-hacky) way that I can run the calculation just once and assign both values? I know that I could call a function within the __init__ constructor and set these values. But I don't foresee myself needing these values every time I instantiate the class (hence the use of @property).
Is this a place where a setter might be useful?
One way of doing it is something known as a lazy property, but that also has its drawbacks in case your self.illarr could change over time. In short, it would be something like this:
def max_ill(self):
    # Helper method that creates the needed values. Not to be used directly.
    maxval = max(self.illarr)
    maxpts = [idx for idx, val in enumerate(self.illarr) if val == maxval]
    maxpts = [self.roomgrid.ptsdict[pt].ptid for pt in maxpts]
    # Save the calculated values as attributes
    self._max_ill_data = maxval
    self._max_ill_points = maxpts

@property
def max_ill_data(self):
    try:
        # Get the saved value (raises an AttributeError when it does not exist yet)
        return self._max_ill_data
    except AttributeError:
        # The attribute does not exist yet, so call the method that creates it, then return it
        self.max_ill()
        return self._max_ill_data

@property
def max_ill_points(self):
    try:
        return self._max_ill_points
    except AttributeError:
        self.max_ill()
        return self._max_ill_points
So max_ill is responsible for calculating the values, and the properties only return the saved value or, if there is no such attribute yet, call the method that creates them.
There are also some libraries that implement lazy properties, even with tied parameters, so this is just a simplified illustration of how they (could) work, or an option if you don't want to add dependencies.
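For comparison, here is a minimal sketch of the same idea using functools.cached_property (Python 3.8+); the Room class and its illarr list are illustrative placeholders, not the asker's actual class:

from functools import cached_property

class Room:
    def __init__(self, illarr):
        self.illarr = illarr

    @cached_property
    def _max_ill(self):
        # computed once per instance on first access, then cached
        maxval = max(self.illarr)
        maxpts = [idx for idx, val in enumerate(self.illarr) if val == maxval]
        return maxval, maxpts

    @property
    def max_ill_data(self):
        return self._max_ill[0]

    @property
    def max_ill_points(self):
        return self._max_ill[1]

As with the try/except version, the result is cached, so this is only appropriate if self.illarr does not change after the first access.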
I'm trying to create a way to apply a prefix to an item which would modify the item's existing stats. For example in the code below I am trying to apply the 'huge' prefix to the 'jar' item. I'd like to make the code reusable so that I could have different prefixes ('fast', 'healthy') that would modify different item stats.
Is it possible to hold the name of a class member in a variable?
If so, is there any reason I shouldn't?
If not, what alternatives are there?
class Prefix(object):
    def __init__(self, word, stat, valu):
        self.word = word
        self.stat = stat
        self.valu = valu

class Item(object):
    def __init__(self, name, size):
        self.name = name
        self.size = size

    def apply_prefix(self, prefix):
        self.prefix.stat += prefix.valu  # <-- Here is my issue
        self.name = prefix.word + ' ' + self.name
        # My hope is to make the code reusable for any stat

    def print_stats(self):
        print(self.name, self.size)

def main():
    jar = Item('jar', 10)
    huge_prefix = Prefix('huge', 'size', 5)
    jar.apply_prefix(huge_prefix)
    jar.print_stats()
You're trying to dynamically refer to some attribute. You do that by using getattr. And if you want to set the attribute, well... that's setattr :)
def apply_prefix(self, prefix):
    target_attr = getattr(self, prefix.stat)  # dynamically gets attr
    setattr(self, prefix.stat, target_attr + prefix.valu)
As to whether this is the best coding style: it depends. There are some instances where code is made clearer by the use of getattr. Right now you only have two stats, so it seems excessive to need this kind of dynamic attribute referencing, especially since I could easily do:
bogus_prefix = Prefix('huge','bogus',3)
Which is a valid Prefix, but throws an AttributeError when I try to apply it. That's not the most straightforward thing to debug.
However, there are bonuses to the getattr approach: if you add more stats, you don't have to change a bit (haha) of code in Prefix.
Other alternatives? There are always options in Python. :-)
The way I'd do it is to make Prefix just a dict of word:value pairs. Then apply_prefix would loop over the word keys, updating as many values as I wanted in one shot. It's a similarly dynamic approach, but a bit more scalable.
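A minimal sketch of that dict-based idea (my reading of it; here the dict maps attribute names to the amount to add, which is an assumption on my part):

class Prefix(object):
    def __init__(self, word, stats):
        self.word = word
        self.stats = stats  # e.g. {'size': 5}: attribute name -> amount to add

class Item(object):
    def __init__(self, name, size):
        self.name = name
        self.size = size

    def apply_prefix(self, prefix):
        # update every stat the prefix knows about in one shot
        for stat, value in prefix.stats.items():
            setattr(self, stat, getattr(self, stat) + value)
        self.name = prefix.word + ' ' + self.name

jar = Item('jar', 10)
jar.apply_prefix(Prefix('huge', {'size': 5}))
print(jar.name, jar.size)  # huge jar 15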