Should I extract values from Python dictionaries into object attributes?


I have a Python class that is initialized with a dictionary of settings, like this:
    def __init__(self, settings):
        self._settings = settings
The settings dictionary contains 50-100 different parameters that are used quite a lot in other methods:
    def MakeTea(self):
        tea = Tea()
        if self._settings['use_sugar']:
            tea.sugar_spoons = self._settings['spoons_of_sugar']
        return tea
What I want to know is whether it makes sense to preload all the params into instance attributes like this:
    def __init__(self, settings):
        self._use_sugar = settings['use_sugar']
        self._spoons_of_sugar = settings['spoons_of_sugar']
and use these attributes instead of looking up dictionary values every time I need them:
    def MakeTea(self):
        tea = Tea()
        if self._use_sugar:
            tea.sugar_spoons = self._spoons_of_sugar
        return tea
Now, I am fairly new to Python and have worked mostly with compiled languages, where it really is a no-brainer: access to instance fields is much faster than looking up values in any kind of hashtable-based structure. However, with Python being interpreted and all, I'm not sure that I'll get any significant performance gain, because at the moment I have almost no knowledge of how the Python interpreter works. For all I know, using an attribute name in code may involve some internal dictionaries of identifiers in the interpreted environment, so I would gain nothing.
So, the question: are there any significant performance benefits in extracting values from dictionary and putting them in instance attributes? Are there any other benefits or downsides of doing it? What's the good practice?
I strongly believe that this is an engineering decision rather than premature optimization. Also, I'm just curious and trying to write decent Python code, so the question seems valid to me whether I actually need those milliseconds or not.

You're comparing attribute access (self.setting) with attribute access (self.settings) plus a dictionary lookup (settings['setting']). Instance attributes are themselves stored in a dictionary (the instance's __dict__), so the problem reduces to two dictionary lookups vs. one. One lookup will be faster.
A simpler and faster way to copy an initialization dict than the one in the other answer is:
    class Foobar(object):
        def __init__(self, init_dict):
            self.__dict__.update(init_dict)
However, I wouldn't do this for optimization purposes. It's both premature optimization (you don't know that you have a speed problem, or what your bottleneck is) and a micro-optimization (making an O(n^2) algorithm O(n) will make more of a difference than removing an O(1) dictionary lookup from the original algorithm).
If somewhere, you're accessing one of these settings many, many times, just create a local reference to it, rather than polluting the namespace of Foobar instances with tons of settings.
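The "local reference" idea can be sketched as follows. This is a minimal illustration, assuming a hypothetical TeaMaker class and a made-up total_sugar method; only the settings keys come from the question.

```python
class TeaMaker(object):
    def __init__(self, settings):
        self._settings = settings

    def total_sugar(self, cups):
        # Look the setting up once and keep it in a local variable.
        spoons = self._settings['spoons_of_sugar']
        total = 0
        for _ in range(cups):
            total += spoons  # plain local name, no dict access in the loop
        return total


maker = TeaMaker({'use_sugar': True, 'spoons_of_sugar': 2})
print(maker.total_sugar(1000))  # 2000
```

The instance keeps a single _settings attribute, and only the hot loop pays attention to lookup cost.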
These are two reasonable designs to consider, but you shouldn't choose one or the other for performance reasons. Instead of either one, I would probably create another object:
    class Settings(object):
        def __init__(self, init_dict):
            self.__dict__.update(init_dict)
    class Foobar(object):
        def __init__(self, init_dict):
            self.settings = Settings(init_dict)
just because I think self.settings.setting is nicer than self.settings['setting'] and it still keeps things organized.
This is a good use for a collections.namedtuple, if you know in advance what all the setting names are.
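A minimal sketch of the namedtuple variant, assuming the setting names are known up front. The two field names are borrowed from the question; a real settings object would list all of them.

```python
from collections import namedtuple

# Field names taken from the question's example; the full list is assumed.
Settings = namedtuple('Settings', ['use_sugar', 'spoons_of_sugar'])


class Foobar(object):
    def __init__(self, init_dict):
        # Raises TypeError if a key is missing or unexpected.
        self.settings = Settings(**init_dict)


foo = Foobar({'use_sugar': True, 'spoons_of_sugar': 2})
print(foo.settings.spoons_of_sugar)  # 2
```

Compared with updating __dict__, this fails early on a misspelled key and makes the settings immutable, which may or may not be what you want.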

If you put them into the instance attributes then you'll be looking up your instance dictionary... so in the end you're just gonna be doing the same thing. So no real performance gain or loss.
Example:
    >>> class Foobar(object):
    ...     def __init__(self, init_dict):
    ...         for arg in init_dict:
    ...             setattr(self, arg, init_dict[arg])
    ...
    >>> foo = Foobar({'foobar': 'barfoo', 'shroobniz': 'foo'})
    >>> print(foo.__dict__)
    {'foobar': 'barfoo', 'shroobniz': 'foo'}
So whether Python looks up foo.__dict__ or foo._settings doesn't really make a difference.
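If you would rather measure than guess, a rough timing sketch with the standard timeit module could look like this. The class names WithDict and WithAttrs are made up for the comparison, and the numbers depend on the machine and the Python version, so no expected output is shown.

```python
import timeit


class WithDict(object):
    def __init__(self, settings):
        self._settings = settings


class WithAttrs(object):
    def __init__(self, settings):
        self.__dict__.update(settings)


settings = {'use_sugar': True, 'spoons_of_sugar': 2}
d = WithDict(settings)
a = WithAttrs(settings)

# Time 200,000 reads of each form.
t_dict = timeit.timeit(lambda: d._settings['spoons_of_sugar'], number=200000)
t_attr = timeit.timeit(lambda: a.spoons_of_sugar, number=200000)
print('dict lookup: %.4fs, attribute: %.4fs' % (t_dict, t_attr))
```

Whatever gap shows up is per access, so it only matters if the setting is read in a very hot loop.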

Related

Organizing methods in python classes with hierarchical names

Python officially recognizes namespaces as a "honking great idea" that we should "do more of". One nice thing about namespaces is their hierarchical presentation that organizes code into related parts. Is there an elegant way to organize python class methods into related parts, much as hierarchical namespaces are organized — especially for the purposes of tab-completion?
Some of my python classes cannot be split up into smaller classes, but have many methods attached to them (easily over a hundred). I also find (and my code's users tell me) that the easiest way to find useful methods is to use tab-completion. But with so many methods, this becomes unwieldy, as an enormous list of options is presented — and usually organized alphabetically, which means that closely related methods may be located in completely different parts of this massive list.
Typically, there are very distinct groups of closely related methods. For example, I have one class in which almost all of the methods fall into one of four groups:
io
statistics
transformations
symmetries
And the io group might have read and write subgroups, where there are different options for the file type to read or write, and then some additional methods involved in looking at the metadata for example. To a small extent, I can address this problem using underscores in my method names. For example, I might have methods like
myobject.io_read_from_csv
myobject.io_write_to_csv
This helps with the classification, but is ugly and still leads to unwieldy tab-completion lists. I would prefer it if the first tab-completion list just had the four options listed above, then when one of those options is selected, additional options would be presented with the next tab.
For a slightly more concrete example, here's a partial list of the hierarchy that I have in mind for my class:
myobject.io
myobject.io.read
myobject.io.read.csv
myobject.io.read.h5
myobject.io.read.npy
myobject.io.write
myobject.io.write.csv
myobject.io.write.h5
myobject.io.write.npy
myobject.io.parameters
myobject.io.parameters.from_csv_header
myobject.io.parameters.from_h5_attributes
...
...
myobject.statistics
myobject.statistics.max
myobject.statistics.max_time
myobject.statistics.norm
...
myobject.transformations
myobject.transformations.rotation
myobject.transformations.boost
myobject.transformations.spatial_translation
myobject.transformations.time_translation
myobject.transformations.supertranslation
...
myobject.symmetries
myobject.symmetries.parity
myobject.symmetries.parity.conjugate
myobject.symmetries.parity.symmetric_part
myobject.symmetries.parity.antisymmetric_part
myobject.symmetries.parity.violation
myobject.symmetries.parity.violation_normalized
myobject.symmetries.xreflection
myobject.symmetries.xreflection.conjugate
myobject.symmetries.xreflection.symmetric_part
...
...
...
One way I can imagine solving this problem is to create classes like IO, Statistics, etc., within my main MyClass class whose sole purpose is to store a reference to myobject and provide the methods that it needs. The main class would then have @property methods that just return the instances of those lower-level classes, for which tab-completion should then work. Does this make sense? Would it work at all to provide tab-completion in ipython, for example? Would this lead to circular-reference problems? Is there a better way?

It looks like my naive suggestion of defining classes within the class does indeed work with ipython's tab-completion and without any circularity problems.
Here's the proof-of-concept code:
    class A(object):
        class _B(object):
            def __init__(self, a):
                self._owner = a
            def calculate(self, y):
                return y * self._owner.x
        def __init__(self, x):
            self.x = x
            # the nested class must be reached through self (or A)
            self._b = self._B(self)
        @property
        def b(self):
            return self._b
(In fact, it would be even simpler if I used self.b = self._B(self), and I could skip the property, but I like this because it impedes overwriting b from outside the class. Plus this shows that this more complicated case still works.)
So if I create a = A(1.2), for example, I can hit a.<TAB> and get b as the completion, then a.b.<TAB> suggests calculate as the completion. I haven't run into any problems with this structure in my brief tests so far, and the changes to my code aren't very big: just adding ._owner into a lot of the methods' code.
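The same idea extends to a second level, such as myobject.io.read.csv. The sketch below is only an illustration: the _Namespace helper, the toy csv parser and the data attribute are all hypothetical, not part of the original class.

```python
class _Namespace(object):
    """Hypothetical helper: keeps a reference to the owning object."""
    def __init__(self, owner):
        self._owner = owner


class _Read(_Namespace):
    def csv(self, text):
        # Toy parser: store comma-separated numbers on the owner.
        self._owner.data = [float(v) for v in text.split(',')]
        return self._owner


class _IO(_Namespace):
    def __init__(self, owner):
        super(_IO, self).__init__(owner)
        self.read = _Read(owner)  # second level of the hierarchy


class _Statistics(_Namespace):
    def max(self):
        return max(self._owner.data)


class MyClass(object):
    def __init__(self):
        self.data = []
        self.io = _IO(self)
        self.statistics = _Statistics(self)


myobject = MyClass()
myobject.io.read.csv('1.5,4.0,2.5')
print(myobject.statistics.max())  # 4.0
```

The owner and its namespaces reference each other, but CPython's cyclic garbage collector can reclaim such cycles, so this should not leak in ordinary use.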

Class with too many parameters: better design strategy?

I am working with models of neurons. One class I am designing is a cell class which is a topological description of a neuron (several compartments connected together). It has many parameters but they are all relevant, for example:
number of axon segments, apical bifibrications, somatic length, somatic diameter, apical length, branching randomness, branching length and so on and so on... there are about 15 parameters in total!
I can set all these to some default value but my class looks crazy with several lines for parameters. This kind of thing must happen occasionally to other people too, is there some obvious better way to design this or am I doing the right thing?
UPDATE:
As some of you have asked I have attached my code for the class, as you can see this class has a huge number of parameters (>15) but they are all used and are necessary to define the topology of a cell. The problem essentially is that the physical object they create is very complex. I have attached an image representation of objects produced by this class. How would experienced programmers do this differently to avoid so many parameters in the definition?
    class LayerV(__Cell):
        def __init__(self,somatic_dendrites=10,oblique_dendrites=10,
                     somatic_bifibs=3,apical_bifibs=10,oblique_bifibs=3,
                     L_sigma=0.0,apical_branch_prob=1.0,
                     somatic_branch_prob=1.0,oblique_branch_prob=1.0,
                     soma_L=30,soma_d=25,axon_segs=5,myelin_L=100,
                     apical_sec1_L=200,oblique_sec1_L=40,somadend_sec1_L=60,
                     ldecf=0.98):
            import random
            import math
            #make the main regions:
            axon=Axon(n_axon_seg=axon_segs)
            soma=Soma(diam=soma_d,length=soma_L)
            main_apical_dendrite=DendriticTree(bifibs=
                apical_bifibs,first_sec_L=apical_sec1_L,
                L_sigma=L_sigma,L_decrease_factor=ldecf,
                first_sec_d=9,branch_prob=apical_branch_prob)
            #make the somatic dendrites
            somatic_dends=self.dendrite_list(num_dends=somatic_dendrites,
                bifibs=somatic_bifibs,first_sec_L=somadend_sec1_L,
                first_sec_d=1.5,L_sigma=L_sigma,
                branch_prob=somatic_branch_prob,L_decrease_factor=ldecf)
            #make oblique dendrites:
            oblique_dends=self.dendrite_list(num_dends=oblique_dendrites,
                bifibs=oblique_bifibs,first_sec_L=oblique_sec1_L,
                first_sec_d=1.5,L_sigma=L_sigma,
                branch_prob=oblique_branch_prob,L_decrease_factor=ldecf)
            #connect axon to soma:
            axon_section=axon.get_connecting_section()
            self.soma_body=soma.body
            soma.connect(axon_section,region_end=1)
            #connect apical dendrite to soma:
            apical_dendrite_firstsec=main_apical_dendrite.get_connecting_section()
            soma.connect(apical_dendrite_firstsec,region_end=0)
            #connect oblique dendrites to apical first section:
            for dendrite in oblique_dends:
                apical_location=math.exp(-5*random.random()) #for now connecting randomly but need to do this on some linspace
                apsec=dendrite.get_connecting_section()
                apsec.connect(apical_dendrite_firstsec,apical_location,0)
            #connect dendrites to soma:
            for dend in somatic_dends:
                dendsec=dend.get_connecting_section()
                soma.connect(dendsec,region_end=random.random()) #for now connecting randomly but need to do this on some linspace
            #assign public sections
            self.axon_iseg=axon.iseg
            self.axon_hill=axon.hill
            self.axon_nodes=axon.nodes
            self.axon_myelin=axon.myelin
            self.axon_sections=[axon.hill]+[axon.iseg]+axon.nodes+axon.myelin
            self.soma_sections=[soma.body]
            self.apical_dendrites=main_apical_dendrite.all_sections+self.seclist(oblique_dends)
            self.somatic_dendrites=self.seclist(somatic_dends)
            self.dendrites=self.apical_dendrites+self.somatic_dendrites
            self.all_sections=self.axon_sections+[self.soma_sections]+self.dendrites

UPDATE: This approach may be suitable in your specific case, but it definitely has its downsides; see "Is kwargs an antipattern?"
Try this approach:
    class Neuron(object):
        def __init__(self, **kwargs):
            prop_defaults = {
                "num_axon_segments": 0,
                "apical_bifibrications": "fancy default",
                # ...
            }
            # use the passed value if given, otherwise fall back to the default
            for (prop, default) in prop_defaults.items():
                setattr(self, prop, kwargs.get(prop, default))
You can then create a Neuron like this:
    n = Neuron(apical_bifibrications="special value")

I'd say there is nothing wrong with this approach - if you need 15 parameters to model something, you need 15 parameters. And if there's no suitable default value, you have to pass in all 15 parameters when creating an object. Otherwise, you could just set the default and change it later via a setter or directly.
Another approach is to create subclasses for certain common kinds of neurons (in your example) and provide good defaults for certain values, or derive the values from other parameters.
Or you could encapsulate parts of the neuron in separate classes and reuse these parts for the actual neurons you model. I.e., you could write separate classes for modeling a synapse, an axon, the soma, etc.

You could perhaps use a Python "dict" object?
http://docs.python.org/tutorial/datastructures.html#dictionaries

Having so many parameters suggests that the class is probably doing too many things.
I suggest dividing your class into several classes, each of which takes some of your parameters. That way each class is simpler and won't take so many parameters.
Without knowing more about your code, I can't say exactly how you should split it up.

Looks like you could cut down the number of arguments by constructing objects such as Axon, Soma and DendriticTree outside of the LayerV constructor, and passing those objects instead.
Some of the parameters are only used in constructing e.g. DendriticTree, while others are used in other places as well, so the problem is not as clear-cut, but I would definitely try that approach.

Could you supply some example code of what you are working on? It would help to get an idea of what you are doing and get help to you sooner.
If it's just the arguments you are passing to the class that make it long, you don't have to put it all in __init__. You can set the parameters after you create the object, or pass a dictionary/class full of the parameters as an argument.
    class MyClass(object):
        def __init__(self, **kwargs):
            # defaults must be instance attributes for hasattr() to find them
            self.arg1 = None
            self.arg2 = None
            self.arg3 = None
            # only accept keywords that match an existing attribute
            for (key, value) in kwargs.items():
                if hasattr(self, key):
                    setattr(self, key, value)
    if __name__ == "__main__":
        a_class = MyClass()
        a_class.arg1 = "A string"
        a_class.arg2 = 105
        a_class.arg3 = ["List", 100, 50.4]
        b_class = MyClass(arg1 = "Astring", arg2 = 105, arg3 = ["List", 100, 50.4])

After looking over your code and realizing I have no idea how any of those parameters relate to each other (solely because of my lack of knowledge of neuroscience), I would point you to a very good book on object-oriented design. Building Skills in Object Oriented Design by Steven F. Lott is an excellent read, and I think it would help you, and anyone else, in laying out object-oriented programs.
It is released under the Creative Commons License, so is free for you to use, here is a link of it in PDF format http://homepage.mac.com/s_lott/books/oodesign/build-python/latex/BuildingSkillsinOODesign.pdf
I think your problem boils down to the overall design of your classes. Sometimes, though very rarely, you need a whole lot of arguments to initialize, and most of the responses here have detailed other ways of initialization, but in a lot of cases you can break the class up into smaller classes that are easier to handle and less cumbersome.

This is similar to the other solutions that iterate through a default dictionary, but it uses a more compact notation:
    class MyClass(object):
        def __init__(self, **kwargs):
            self.__dict__.update(dict(
                arg1=123,
                arg2=345,
                arg3=678,
            ), **kwargs)

Can you give a more detailed use case? Maybe a prototype pattern will work: if there are some similarities in groups of objects, a prototype pattern might help.
Do you have a lot of cases where one population of neurons is just like another except different in some way? (I.e., rather than having a small number of discrete classes, you have a large number of classes that differ slightly from each other.)
Python is a class-based language, but just as you can simulate class-based programming in a prototype-based language like JavaScript, you can simulate prototypes by giving your class a clone method that creates a new object and populates its instance variables from the parent. Write the clone method so that keyword parameters passed to it override the "inherited" parameters, so you can call it with something like:
new_neuron = old_neuron.clone( branching_length=n1, branching_randomness=r2 )
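A clone method along these lines might be sketched as below. The Neuron class, its two parameters and their default values are placeholders, and rejecting unknown names is an added assumption rather than part of the suggestion above.

```python
import copy


class Neuron(object):
    def __init__(self, branching_length=10.0, branching_randomness=0.1):
        self.branching_length = branching_length
        self.branching_randomness = branching_randomness

    def clone(self, **overrides):
        new = copy.copy(self)  # shallow copy of the prototype
        for name, value in overrides.items():
            # Assumption: refuse names the prototype does not have.
            if not hasattr(new, name):
                raise AttributeError('unknown parameter: %s' % name)
            setattr(new, name, value)
        return new


old_neuron = Neuron()
new_neuron = old_neuron.clone(branching_length=25.0)
print(new_neuron.branching_length, new_neuron.branching_randomness)  # 25.0 0.1
```

copy.copy is shallow, so mutable attributes (lists of sections, for instance) would be shared between prototype and clone unless you use copy.deepcopy instead.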

I have never had to deal with this situation, or this topic. Your description implies to me that you may find, as you develop the design, that there are a number of additional classes that will become relevant - compartment is the most obvious. If these do emerge as classes in their own right, it is probable that some of your parameters become parameters of these additional classes.

You could create a class for your parameters.
Instead of passing a bunch of parameters, you pass one object.
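One way to sketch a parameter class, assuming Python 3.7+ so that the standard dataclasses module is available. DendriteParams and Cell are hypothetical names; the field names loosely follow the question's code, and the default values are illustrative.

```python
from dataclasses import dataclass


@dataclass
class DendriteParams:
    # Hypothetical grouping of related parameters.
    bifibs: int = 3
    first_sec_L: float = 60.0
    branch_prob: float = 1.0


class Cell:
    def __init__(self, somatic=None, oblique=None):
        # Each group falls back to its own defaults when not supplied.
        self.somatic = somatic or DendriteParams()
        self.oblique = oblique or DendriteParams(first_sec_L=40.0)


cell = Cell(somatic=DendriteParams(bifibs=5))
print(cell.somatic.bifibs, cell.oblique.first_sec_L)  # 5 40.0
```

The constructor now takes two arguments instead of six, and each group of values travels together.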

In my opinion, in your case the easy solution is to pass higher-level objects as parameters.
For example, in your __init__ you have a DendriticTree that uses several arguments from your main class LayerV:
    main_apical_dendrite = DendriticTree(
        bifibs=apical_bifibs,
        first_sec_L=apical_sec1_L,
        L_sigma=L_sigma,
        L_decrease_factor=ldecf,
        first_sec_d=9,
        branch_prob=apical_branch_prob
    )
Instead of passing these 6 arguments to your LayerV you would pass the DendriticTree object directly (thus saving 5 arguments).
You probably want to have these values accessible everywhere, so you will have to save this DendriticTree:
    class LayerV(__Cell):
        def __init__(self, main_apical_dendrite, ...):
            self.main_apical_dendrite = main_apical_dendrite
If you want to have a default value too, you can have:
    class LayerV(__Cell):
        def __init__(self, main_apical_dendrite=None, ...):
            self.main_apical_dendrite = main_apical_dendrite or DendriticTree()
This way you delegate what the default DendriticTree should be to the class dedicated to that matter, instead of having this logic in the higher-level class LayerV.
Finally, when you need to access the apical_bifibs you used to pass to LayerV you just access it via self.main_apical_dendrite.bifibs.
In general, even if the class you are creating is not a clear composition of several classes, your goal is to find a logical way to split your parameters, not only to make your code cleaner, but mostly to help people understand what these parameters will be used for. In the extreme cases where you can't split them, I think it's totally OK to have a class with that many parameters. If there is no clear way to split the arguments, then you'll probably end up with something even less clear than a list of 15 arguments.
If you feel like creating a class to group parameters together is overkill, then you can simply use collections.namedtuple which can have default values as shown here.
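A small sketch of a namedtuple with defaults, assuming Python 3.7+ where namedtuple accepts a defaults argument. TreeParams is a hypothetical name; the fields and values are loosely taken from the DendriticTree call above.

```python
from collections import namedtuple

# `defaults` applies to the rightmost fields, so `bifibs` stays required.
TreeParams = namedtuple(
    'TreeParams',
    ['bifibs', 'first_sec_L', 'branch_prob'],
    defaults=[200, 1.0],
)

apical = TreeParams(bifibs=10)
print(apical)  # TreeParams(bifibs=10, first_sec_L=200, branch_prob=1.0)

# Tuples are immutable; _replace returns a modified copy.
sparse = apical._replace(branch_prob=0.5)
```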

I want to reiterate what a number of people have said: there's nothing wrong with that number of parameters, especially when it comes to scientific computing/programming.
Take, for example, sklearn's KMeans++ clustering implementation, which has 11 parameters you can init with. There are numerous examples like that, and nothing is wrong with them.

I would say there is nothing wrong with it, as long as you are sure you need those params. If you really want to make it more readable, I would recommend the following style.
I wouldn't call it a best practice; it just lets others easily see which parameters are required for this object and which are optional.
    class LayerV(__Cell):
        # author: {name, url} who made this info
        def __init__(self, no_default_params, some_necessary_params):
            self.necessary_param = some_necessary_params
            self.no_default_param = no_default_params
            self.something_else = "default"
            self.some_option = "default"
        def b_option(self, value):
            self.some_option = value
            return self
        def b_else(self, value):
            self.something_else = value
            return self
I think the benefits of this style are:
You can easily see from the __init__ method which params are required.
Unlike with a setter, you don't need two lines to construct the object when you need to set an optional value.
The disadvantage is that you create more methods in your class than before.
Sample:
    la = LayerV("no_default", "necessary").b_else("sample_else")
After all, if you have a lot of "necessary" and "no_default" params, always ask yourself whether this class (or method) does too many things.
If the answer is no, just go ahead.

Have well-defined, narrowly-focused classes ... now how do I get anything done in my program?

I'm coding a poker hand evaluator as my first programming project. I've made it through three classes, each of which accomplishes its narrowly-defined task very well:
HandRange = a string-like object (e.g. "AA"). getHands() returns a list of tuples for each specific hand within the string:
[(Ad,Ac),(Ad,Ah),(Ad,As),(Ac,Ah),(Ac,As),(Ah,As)]
Translation = a dictionary that maps the return list from getHands to values that are useful for a given evaluator (yes, this can probably be refactored into another class).
{'As':52, 'Ad':51, ...}
Evaluator = takes a list from HandRange (as translated by Translation), enumerates all possible hand matchups and provides win % for each.
My question: what should my "domain" class for using all these classes look like, given that I may want to connect to it via either a shell UI or a GUI? Right now, it looks like an assembly line process:
    user_input = HandRange()
    x = Translation.translateList(user_input)
    y = Evaluator.getEquities(x)
This smells funny in that it feels like it's procedural when I ought to be using OO.
In a more general way: if I've spent so much time ensuring that my classes are well defined, narrowly focused, orthogonal, whatever ... how do I actually manage work flow in my program when I need to use all of them in a row?
Thanks,
Mike

Don't make a fetish of object orientation -- Python supports multiple paradigms, after all! Think of your user-defined types, AKA classes, as building blocks that gradually give you a "language" that's closer to your domain rather than to general purpose language / library primitives.
At some point you'll want to code "verbs" (actions) that use your building blocks to perform something (under command from whatever interface you'll supply -- command line, RPC, web, GUI, ...) -- and those may be module-level functions as well as methods within some encompassing class. You'll surely want a class if you need multiple instances, and most likely also if the actions involve updating "state" (instance variables of a class being much nicer than globals) or if inheritance and/or polymorphism come into play; but there is no a priori reason to prefer classes to functions otherwise.
If you find yourself writing static methods, yearning for a singleton (or Borg) design pattern, writing a class with no state (just methods) -- these are all "code smells" that should prompt you to check whether you really need a class for that subset of your code, or rather whether you may be overcomplicating things and should use a module with functions for that part of your code. (Sometimes after due consideration you'll unearth some different reason for preferring a class, and that's all right too, but the point is, don't just pick a class over a module with functions "by reflex", without critically thinking about it!)
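As a rough illustration of "verbs" written as module-level functions, here is a sketch of the poker pipeline from the question. All three functions are hypothetical stand-ins for the question's classes, and the table values for 'Ac' and 'Ah' are invented; only 'As' and 'Ad' appear in the question.

```python
import itertools

SUITS = 'dchs'


def get_hands(hand_range):
    """Expand a pair such as 'AA' into all two-card combinations."""
    cards = [hand_range[0] + suit for suit in SUITS]
    return list(itertools.combinations(cards, 2))


def translate(hands, table):
    """Map card names to the integers an evaluator expects."""
    return [tuple(table[card] for card in hand) for hand in hands]


def evaluate_range(hand_range, table):
    """The 'verb': a plain function chaining the building blocks."""
    return translate(get_hands(hand_range), table)


table = {'As': 52, 'Ad': 51, 'Ac': 50, 'Ah': 49}
print(len(evaluate_range('AA', table)))  # 6
```

A shell UI or a GUI can both call evaluate_range directly; nothing here needs instance state, so no wrapping class is required.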

You could create a Poker class that ties these all together and initialize all of that stuff in the __init__() method:
    class Poker(object):
        def __init__(self, user_input=HandRange()):
            self.user_input = user_input
            self.translation = Translation.translateList(user_input)
            # pass the translated list (the original referenced an undefined x)
            self.evaluator = Evaluator.getEquities(self.translation)
            # and so on...
    p = Poker()
    # etc, etc...

Should I use a class in this: Reading a XML file using lxml

This question is in continuation to my previous question, in which I asked about passing around an ElementTree.
I need to read the XML files only and to solve this, I decided to create a global ElementTree and then parse it wherever required.
My question is:
Is this an acceptable practice? I heard global variables are bad. If I don't make it global, it was suggested that I make a class. But do I really need to create a class? What benefits would I have from that approach? Note that I would be handling only one ElementTree instance per run, and the operations are read-only. If I don't use a class, how and where do I declare that ElementTree so that it is available globally? (Note that I would be importing this module.)
Please answer this question in the respect that I am a beginner to development, and at this stage I can't figure out whether to use a class or just go with the functional style programming approach.

There are a few reasons that global variables are bad. First, it gets you in the habit of declaring global variables which is not good practice, though in some cases globals make sense -- PI, for instance. Globals also create problems when you on purpose or accidentally re-use the name locally. Or worse, when you think you're using the name locally but in reality you're assigning a new value to the global variable. This particular problem is language dependent, and python handles it differently in different cases.
    class A:
        def __init__(self):
            self.name = 'hi'
    x = 3
    a = A()
    def foo():
        a.name = 'Bedevere'  # mutates the global object a
        x = 9                # creates a new local x; the global x is untouched
    foo()
    print(x, a.name)  # outputs: 3 Bedevere
The benefit of creating a class and passing your object around is that you get defined, consistent behavior, especially since you should be calling the object's methods, which operate on the object itself.
    class Knights:
        def __init__(self, name='Bedevere'):
            self.name = name
        def knight(self):
            self.name = 'Sir ' + self.name
        def speak(self):
            print(self.name + ":", "Run away!")
    class FerociousRabbit:
        def __init__(self):
            self.death = "awaits you with sharp pointy teeth!"
        def speak(self):
            print("Squeeeeeeee!")
    def cave(thing):
        thing.speak()
        if isinstance(thing, Knights):
            thing.knight()
    def scene():
        k = Knights()
        k2 = Knights('Launcelot')
        b = FerociousRabbit()
        for i in (b, k, k2):
            cave(i)
This example illustrates a few good principles. First, the strength of Python when calling functions: FerociousRabbit and Knights are two different classes, but they have the same method, speak(). In other languages, in order to do something like this, they would at least have to have the same base class. The reason you would want to do this is that it allows you to write a function (cave) that can operate on any class that has a speak() method. You could create any other class and pass an instance of it to the cave function:
    class Tim:
        def speak(self):
            print("Death awaits you with sharp pointy teeth!")
So in your case, when dealing with an ElementTree, say sometime down the road you need to also start parsing an Apache log. Well, if you're writing a purely functional program you're basically hosed. You can modify and extend your current program, but if you wrote your functions well, you could just add a new class to the mix and (technically) everything will be peachy keen.

Pragmatically, is your code expected to grow? Even though people herald OOP as the right way, I found that sometimes it's better to weigh cost:benefit(s) whenever you refactor a piece of code. If you are looking to grow this, then OOP is a better option in that you can extend and customise any future use case, while saving yourself from unnecessary time wasted in code maintenance. Otherwise, if it ain't broken, don't fix it, IMHO.

I generally find myself regretting it when I give in to the temptation to give a module, for example, a load_file() method that sets a global that the module's other functions can then use to find the file they're supposed to be talking about. It makes testing far more difficult, for example, and as soon as I need two XML files there is a problem. Plus, every single function needs to check whether the file's there and give an error if it's not.
If I want to be functional, I simply therefore have every function take the XML file as an argument.
If I want to be object oriented, I'll have a MyXMLFile class whose methods can just look at self.xmlfile or whatever.
The two approaches are more or less equivalent when there's just one single thing, like a file, to be passed around; but when the number of things in the "state" becomes larger than a few, then I find classes simpler because I can stick all of those things in the class.
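The two styles can be put side by side in a short sketch. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra packages; the XML content, the book_titles function and the Library class are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = '<library><book title="A"/><book title="B"/></library>'


# Functional style: the tree is an explicit argument.
def book_titles(tree):
    return [book.get('title') for book in tree.iter('book')]


# Object-oriented style: the tree lives on the instance.
class Library(object):
    def __init__(self, tree):
        self.tree = tree

    def book_titles(self):
        return book_titles(self.tree)


tree = ET.ElementTree(ET.fromstring(XML))
print(book_titles(tree))            # ['A', 'B']
print(Library(tree).book_titles())  # ['A', 'B']
```

Neither version touches a global, so a test can build a small tree and pass it in, and a second XML file is just a second argument or a second instance.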
(Am I answering your question? I'm still a bit vague on what kind of answer you want.)

When and how to use the builtin function property() in python

It appears to me that except for a little syntactic sugar, property() does nothing good.
Sure, it's nice to be able to write a.b=2 instead of a.setB(2), but hiding the fact that a.b=2 isn't a simple assignment looks like a recipe for trouble, either because some unexpected result can happen, such as a.b=2 actually causes a.b to be 1. Or an exception is raised. Or a performance problem. Or just being confusing.
Can you give me a concrete example for a good usage of it? (using it to patch problematic code doesn't count ;-)

In languages that rely on getters and setters, like Java, they're not supposed nor expected to do anything but what they say -- it would be astonishing if x.getB() did anything but return the current value of logical attribute b, or if x.setB(2) did anything but whatever small amount of internal work is needed to make x.getB() return 2.
However, there are no language-imposed guarantees about this expected behavior, i.e., compiler-enforced constraints on the body of methods whose names start with get or set: rather, it's left up to common sense, social convention, "style guides", and testing.
The behavior of x.b accesses, and assignments such as x.b = 2, in languages which do have properties (a set of languages which includes but is not limited to Python) is exactly the same as for getter and setter methods in, e.g., Java: the same expectations, the same lack of language-enforced guarantees.
The first win for properties is syntax and readability. Having to write, e.g.,
x.setB(x.getB() + 1)
instead of the obvious
x.b += 1
cries out for vengeance to the gods. In languages which support properties, there is absolutely no good reason to force users of the class to go through the gyrations of such Byzantine boilerplate, impacting their code's readability with no upside whatsoever.
In Python specifically, there's one more great upside to using properties (or other descriptors) in lieu of getters and setters: if and when you reorganize your class so that the underlying setter and getter are not needed anymore, you can (without breaking the class's published API) simply eliminate those methods and the property that relies on them, making b a normal "stored" attribute of x's class rather than a "logical" one obtained and set computationally.
In Python, doing things directly (when feasible) instead of via methods is an important optimization, and systematically using properties enables you to perform this optimization whenever feasible (always exposing "normal stored attributes" directly, and only ones which do need computation upon access and/or setting via methods and properties).
So, if you use getters and setters instead of properties, beyond impacting the readability of your users' code, you are also gratuitously wasting machine cycles (and the energy that goes to their computer during those cycles;-), again for no good reason whatsoever.
Your only argument against properties is e.g. that "an outside user wouldn't expect any side effects as a result of an assignment, usually"; but you miss the fact that the same user (in a language such as Java where getters and setters are pervasive) wouldn't expect (observable) "side effects" as a result of calling a setter, either (and even less for a getter;-). They're reasonable expectations and it's up to you, as the class author, to try and accommodate them -- whether your setter and getter are used directly or through a property, makes no difference. If you have methods with important observable side effects, do not name them getThis, setThat, and do not use them via properties.
The complaint that properties "hide the implementation" is wholly unjustified: most all of OOP is about implementing information hiding -- making a class responsible for presenting a logical interface to the outside world and implementing it internally as best it can. Getters and setters, exactly like properties, are tools towards this goal. Properties just do a better job at it (in languages that support them;-).

The idea is to allow you to avoid having to write getters and setters until you actually need them.
So, to start off you write:
class MyClass(object):
    def __init__(self):
        self.myval = 4
Obviously you can now write myobj.myval = 5.
But later on, you decide that you do need a setter, because you want to do something clever at the same time. You don't want to change all the code that uses your class, so you turn myval into a property with the @property decorator (and its matching setter), and it all just works.
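A sketch of what that later version might look like. The "something clever" here (rejecting negative values) is a made-up example, and the `_myval` backing attribute is an assumption; the caller's line `myobj.myval = 5` stays exactly as it was.

```python
class MyClass(object):
    def __init__(self):
        self.myval = 4  # this assignment now goes through the setter

    @property
    def myval(self):
        return self._myval

    @myval.setter
    def myval(self, value):
        # Hypothetical "clever" behavior: validate before storing.
        if value < 0:
            raise ValueError("myval must be non-negative")
        self._myval = value


myobj = MyClass()
myobj.myval = 5  # existing caller code is unchanged
print(myobj.myval)
```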

"but hiding the fact that a.b=2 isn't a simple assignment looks like a recipe for trouble"
You're not hiding that fact, though; that fact was never there to begin with. This is Python, a high-level language, not assembly. Few of the "simple" statements in it boil down to single CPU instructions. To read simplicity into an assignment is to read things that aren't there.
When you say x.b = c, probably all you should think is that "whatever just happened, x.b should now be c".

A basic reason is simply that it looks better. It is more Pythonic, especially for libraries: something.getValue() looks less nice than something.value.
In Plone (a pretty big CMS), you used to have document.setTitle(), which does a lot of things, such as storing the value, reindexing it, and so on. Just writing document.title = 'something' is nicer. You know that a lot is happening behind the scenes anyway.

You are correct, it is just syntactic sugar. It may be that there are no good uses of it depending on your definition of problematic code.
Consider that you have a class Foo that is widely used in your application. Now this application has grown quite large and, further, let's say it's a webapp that has become very popular.
You identify that Foo is causing a bottleneck. Perhaps it is possible to add some caching to Foo to speed it up. Using properties will let you do that without changing any code or tests outside of Foo.
Yes of course this is problematic code, but you just saved a lot of $$ fixing it quickly.
What if Foo is in a library that you have hundreds or thousands of users for? Well you saved yourself having to tell them to do an expensive refactor when they upgrade to the newest version of Foo.
The release notes have a line item about Foo instead of a paragraph-long porting guide.
Experienced Python programmers don't expect much from a.b=2 other than a.b==2, but they know even that may not be true. What happens inside the class is its own business.
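The caching scenario above can be sketched roughly as follows. The `total` attribute, the `numbers` field and the `computations` counter are all hypothetical; callers keep writing `foo.total` whether or not the cache exists.

```python
class Foo:
    def __init__(self, numbers):
        self.numbers = list(numbers)
        self._total = None       # cache; None means "not computed yet"
        self.computations = 0    # counts real computations, for illustration

    @property
    def total(self):
        # Compute once, then serve the cached value on later reads.
        # Note: this simple cache is not invalidated if numbers changes.
        if self._total is None:
            self.computations += 1
            self._total = sum(self.numbers)
        return self._total


foo = Foo([1, 2, 3])
print(foo.total, foo.total, foo.computations)
```

Only Foo changed; no calling code or external test needs to know that the second read came from the cache.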

Here's an old example of mine. I wrapped a C library which had functions like "void dt_setcharge(int atom_handle, int new_charge)" and "int dt_getcharge(int atom_handle)". I wanted at the Python level to do "atom.charge = atom.charge + 1".
The "property" decorator makes that easy. Something like:
class Atom(object):
    def __init__(self, handle):
        self.handle = handle
    def _get_charge(self):
        return dt_getcharge(self.handle)
    def _set_charge(self, charge):
        dt_setcharge(self.handle, charge)
    charge = property(_get_charge, _set_charge)
10 years ago, when I wrote this package, I had to use __getattr__ and __setattr__ which made it possible, but the implementation was a lot more error prone.
class Atom:
    def __init__(self, handle):
        self.handle = handle
    def __getattr__(self, name):
        if name == "charge":
            return dt_getcharge(self.handle)
        raise AttributeError(name)
    def __setattr__(self, name, value):
        if name == "charge":
            dt_setcharge(self.handle, value)
        else:
            self.__dict__[name] = value
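To try the property-based version without the C library, here is a sketch with stand-in functions. The dictionary-backed `dt_getcharge` and `dt_setcharge` below are assumptions that only mimic the signatures quoted above; they are not the real library.

```python
# Stand-ins for the wrapped C functions (assumed behavior, not the real library).
_charges = {}


def dt_getcharge(atom_handle):
    return _charges.get(atom_handle, 0)


def dt_setcharge(atom_handle, new_charge):
    _charges[atom_handle] = new_charge


class Atom(object):
    def __init__(self, handle):
        self.handle = handle

    def _get_charge(self):
        return dt_getcharge(self.handle)

    def _set_charge(self, charge):
        dt_setcharge(self.handle, charge)

    charge = property(_get_charge, _set_charge)


atom = Atom(handle=1)
atom.charge = atom.charge + 1  # reads via the getter, writes via the setter
print(atom.charge)
```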

Getters and setters serve many purposes, and they are very useful because they are transparent to the calling code. If an object Something has a property height, you assign a value with Something.height = 10. If height has a getter and setter, then at the moment of assignment the setter can do many things: validate a minimum or maximum value, trigger an event because the height changed, or automatically set other values as a function of the new height. Remember that you don't call them explicitly in your code; they run automatically whenever you read or write the property value. In a way they are like event handlers that fire when property X is written and when it is read.
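A small sketch of those three uses in one setter. The limits, the `on_height_changed` callback list and the rule that `width` is twice the height are all invented for illustration.

```python
class Something:
    MIN_HEIGHT = 0
    MAX_HEIGHT = 100

    def __init__(self, height):
        self.on_height_changed = []  # callbacks taking (old, new)
        self._height = None
        self.height = height         # goes through the setter

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, value):
        # 1. Validate a minimum and maximum value.
        if not self.MIN_HEIGHT <= value <= self.MAX_HEIGHT:
            raise ValueError(f"height must be in [{self.MIN_HEIGHT}, {self.MAX_HEIGHT}]")
        old = self._height
        self._height = value
        # 2. Set another value as a function of the new height (arbitrary rule).
        self.width = value * 2
        # 3. Trigger an event if the height actually changed.
        if old != value:
            for callback in self.on_height_changed:
                callback(old, value)


changes = []
s = Something(10)
s.on_height_changed.append(lambda old, new: changes.append((old, new)))
s.height = 20
print(s.height, s.width, changes)
```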

It is useful when you replace inheritance with delegation during refactoring. The following is a toy example, in which Stack was originally a subclass of Vector.
class Vector:
    def __init__(self, data):
        self.data = data
    @staticmethod
    def get_model_with_dict():
        return Vector([0, 1])
class Stack:
    def __init__(self):
        self.model = Vector.get_model_with_dict()
        self.data = self.model.data
class NewStack:
    def __init__(self):
        self.model = Vector.get_model_with_dict()
    @property
    def data(self):
        return self.model.data
    @data.setter
    def data(self, value):
        self.model.data = value
if __name__ == '__main__':
    c = Stack()
    print(f'init: {c.data}')  # init: [0, 1]
    c.data = [0, 1, 2, 3]
    print(f'data in model: {c.model.data} vs data in controller: {c.data}')
    # data in model: [0, 1] vs data in controller: [0, 1, 2, 3]
    c_n = NewStack()
    c_n.data = [0, 1, 2, 3]
    print(f'data in model: {c_n.model.data} vs data in controller: {c_n.data}')
    # data in model: [0, 1, 2, 3] vs data in controller: [0, 1, 2, 3]
Note that if you use direct attribute access instead of a property, self.model.data no longer equals self.data after an assignment, which is not what we expect.
You can treat the code before the __name__ == '__main__' check as a library.
