is there any way to prevent side effects in python? - python

Is there any way to prevent side effects in python? For example, the following function has a side effect, is there any keyword or any other way to have the python complain about it?
def func_with_side_affect(a):
a.append('foo')

Python is really not set up to enforce prevention of side-effects. As some others have mentioned, you can try to deepcopy the data or use immutable types, but these still have corner cases that are tricky to catch, and it's just a ton more effort than it's worth.
Using a functional style in Python normally involves the programmer simply designing their functions to be functional. In other words, whenever you write a function, you write it in such a way that it doesn't mutate the arguments.
If you're calling someone else's function, then you have to make sure the data you are passing in either cannot be mutated, or you have to keep around a safe, untouched copy of the data yourself, that you keep away from that untrusted function.

No, but with you example, you could use immutable types, and pass tuple as an a argument. Side effects can not affect immutable types, for example you can not append to tuple, you could only create other tuple by extending given.
UPD: But still, your function could change objects which is referenced by your immutable object (as it was pointed out in comments), write to files and do some other IO.

Sorry really late to the party. You can use effect library to isolate side-effects in your python code. As others have said in Python you have to explicitly write functional style code but this library really encourages towards it.

About the only way to enforce that would be to overwrite the function specification to deepcopy any arguments before they are passed to the original function. You could to that with a function decorator.
That way, the function has no way to actually change the originally passed arguments. This however has the "sideeffect" of a considerable slowdown as the deepcopy operation is rather costly in terms of memory (and garbage-collection) usage as well as CPU consumption.
I'd rather recommend you properly test your code to ensure that no accidental changes happen or use a language that uses full copy-by-value semantics (or has only immutable variables).
As another workaround, you could make your passed objects basically immutable by adding this to your classes:
"""An immutable class with a single attribute 'value'."""
def __setattr__(self, *args):
raise TypeError("can't modify immutable instance")
__delattr__ = __setattr__
def __init__(self, value):
# we can no longer use self.value = value to store the instance data
# so we must explicitly call the superclass
super(Immutable, self).__setattr__('value', value)
(Code copied from the Wikipedia article about Immutable object)

Since any Python code can do IO, any Python code could launch intercontinental ballistic missiles (and I'd consider launching ICBMs to be a fairly catastrophic side effect for most purposes).
The only way to avoid side effects is to not use Python code in the first place but rather data - i.e. you end up creating a domain specific language which disallows side effects, and a Python interpreter which executes programs of that language.

You'll have to make a copy of the list first. Something like this:
def func_without_side_affect(a):
b = a[:]
b.append('foo')
return b
This shorter version might work for you too:
def func_without_side_affect(a):
return a[:] + ['foo']
If you have nested lists or other things like that, you'll probably want to look at copy.deepcopy to make the copy instead of the [:] slice operator.

It would be very difficult to do for the general case, but for some practical cases you could do something like this:
def call_function_checking_for_modification(f, *args, **kwargs):
myargs = [deepcopy(x) for x in args]
mykwargs = dict((x, deepcopy(kwargs[x])) for x in kwargs)
retval = f(*args, **kwargs)
for arg, myarg in izip(args, myargs):
if arg != myarg:
raise ValueError, 'Argument was modified during function call!'
for kwkey in kwargs:
if kwargs[kwkey] != mykwargs[kwkey]:
raise ValueError, 'Argument was modified during function call!'
return retval
But, obviously, there are a few issues with this. For trivial things (i.e. all the inputs are simple types), then this isn't very useful anyways - those will likely be immutable, and in any case they are easy (well, relatively) to detect than complex types.
For complex types though, the deepcopy will be expensive, and there's no guarantee that the == operator will actually work correctly. (and simple copy isn't good enough... imagine a list, where one element changes value... a simple copy will just store a reference, and so the original value with change too).
In general, though, this is not that useful, since if you are already worried about side effects with calling this functions, you can just guard against them more intelligently (by storing your own copy if needed, auditing the destination function, etc), and if it's your function you are worried about causing side effects, you will have audited it to make sure.
Something like the above could be wrapped in a decorator though; with the expensive parts gated by a global variable (if _debug == True:, something like that), it could maybe be useful in projects where lots of people are editing the same code, though, i guess...
Edit: This only works for environments where a more 'strict' form of 'side effects' is expected. In many programming languages, you can make the available of side effects much more explicit - in C++ for instance, everything is by value unless explicitly a pointer or reference, and even then you can declare incoming references as const so that it can't be modified. There, 'side effects' can throw errors at compile time. (of course there are way to get some anyways).
The above enforces that any modified values are in the return value/tuple. If you are in python 3 (i'm not yet) I think you could specify decoration in the function declaration itself to specify attributes of function arguments, including whether they would be allowed to be modified, and include that in the above function to allow some arguments explicitly to be mutable.
Note that I think you could probably also do something like this:
class ImmutableObject(object):
def __init__(self, inobj):
self._inited = False
self._inobj = inobj
self._inited = True
def __repr__(self):
return self._inobj.__repr__()
def __str__(self):
return self._inobj.__str__()
def __getitem__(self, key):
return ImmutableObject(self._inobj.__getitem__(key))
def __iter__(self):
return self.__iter__()
def __setitem__(self, key, value):
raise AttributeError, 'Object is read-only'
def __getattr__(self, key):
x = getattr(self._inobj, key)
if callable(x):
return x
else:
return ImmutableObject(x)
def __setattr__(self, attr, value):
if attr not in ['_inobj', '_inited'] and self._inited == True:
raise AttributeError, 'Object is read-only'
object.__setattr__(self, attr, value)
(Probably not a complete implementation, haven't tested much, but a start). Works like this:
a = [1,2,3]
b = [a,3,4,5]
print c
[[1, 2, 3], 3, 4, 5]
c[0][1:] = [7,8]
AttributeError: Object is read-only
It would let you protect a specific object from modification if you didn't trust the downstream function, while still being relatively lightweight. Still requires explicit wrapping of the object though. You could probably build a decorator to do this semi-automatically though for all arguments. Make sure to skip the ones that are callable.

Related

Python - Bad practice to store instance vars in local vars to avoid "self"?

I've been mostly programming in Java and I find Pythons explicit self referencing to class members to be ugly. I really don't like how all the "self."s clutter down my methods, so I find myself wanting to store instance variables in local variables just to get rid of it. For example, I would replace this:
def insert(self, data, priority):
self.list.append(self.Node(data, priority))
index = len(self)-1
while self.list[index].priority < self.list[int(index/2)].priority:
self.list[index], self.list[int(index/2)] = self.list[int(index/2)], self.list[index]
index = int(index/2)
with this:
def insert(self, data, priority):
l = self.list
l.append(self.Node(data, priority))
index = len(self)-1
while l[index].priority < l[int(index/2)].priority:
l[index], l[int(index/2)] = l[int(index/2)], l[index]
index = int(index/2)
Normally I would name the local variable the same as the instance variable, but "list" is reserved so I went with "l". My question is: is this considered bad practice in the Python community?
Easier answer first. In Python, underscore is used to avoid clashes with keywords and builtins:
list_ = self.list
This will be understood by Python programmers as the right way.
As for making local variables for properties, it depends. Grepping codebase of Plone (and even standard library) shows, that x = self.x is used, especially,
context = self.context
As pointed out in comments, it's potentially error-prone, because binding another value to local variable will not affect the property.
On the other hand, if some attribute is read-only in the method, it makes code much more readable. So, it's ok if variable use is local enough, say, like let-clauses in functional programming languages.
Sometimes properties are actually functions, so self.property will be calculated each time. (It's another question how "pythonic" is doing extensive calculations for property getters) (thanks Python #property versus getters and setters for a ready example):
class MyClass(object):
...
#property
def my_attr(self):
...
#my_attr.setter
def my_attr(self, value):
...
In summary, use sparingly, with care, do not make it a rule.
I agree that explicitly adding "self" (or "this" for other languages) isn't very appealing for the eye. But as people said, python follows the philosophy "explicit is better than implicit". Therefore it really wants you to express the scope of the variable you want to access.
Java won't let you use variables you didn't declare, so there are no chances for confusion. But in python if the "self" was optional, for the assignment a = 5 it would not be clear whether to create a member or local variable. So the explicit self is required at some places. Accessing would work the same though. Note that also Java requires an explicit this for name clashes.
I just counted the selfs in some spaghetti code of mine. For 1000 lines of code there's more than 500 appearances of self. Now the code indeed isn't that readable, but the problem isn't the repeated use of self. For your code example above: the 2nd version has a shorter line length, which makes it easier and/or faster to comprehend. I would say your example is an acceptable case.

How to check if a function is pure in Python?

A pure function is a function similar to a Mathematical function, where there is no interaction with the "Real world" nor side-effects. From a more practical point of view, it means that a pure function can not:
Print or otherwise show a message
Be random
Depend on system time
Change global variables
And others
All this limitations make it easier to reason about pure functions than non-pure ones. The majority of the functions should then be pure so that the program can have less bugs.
In languages with a huge type-system like Haskell the reader can know right from the start if a function is or is not pure, making the successive reading easier.
In Python this information may be emulated by a #pure decorator put on top of the function. I would also like that decorator to actually do some validation work. My problem lies in the implementation of such a decorator.
Right now I simply look the source code of the function for buzzwords such as global or random or print and complains if it finds one of them.
import inspect
def pure(function):
source = inspect.getsource(function)
for non_pure_indicator in ('random', 'time', 'input', 'print', 'global'):
if non_pure_indicator in source:
raise ValueError("The function {} is not pure as it uses `{}`".format(
function.__name__, non_pure_indicator))
return function
However it feels like a weird hack, that may or may not work depending on your luck, could you please help me in writing a better decorator?
I kind of see where you are coming from but I don't think this can work. Let's take a simple example:
def add(a,b):
return a + b
So this probably looks "pure" to you. But in Python the + here is an arbitrary function which can do anything, just depending on the bindings in force when it is called. So that a + b can have arbitrary side effects.
But it's even worse than that. Even if this is just doing standard integer + then there's more 'impure' stuff going on.
The + is creating a new object. Now if you are sure that only the caller has a reference to that new object then there is a sense in which you can think of this as a pure function. But you can't be sure that, during the creation process of that object, no reference to it leaked out.
For example:
class RegisteredNumber(int):
numbers = []
def __new__(cls,*args,**kwargs):
self = int.__new__(cls,*args,**kwargs)
self.numbers.append(self)
return self
def __add__(self,other):
return RegisteredNumber(super().__add__(other))
c = RegisteredNumber(1) + 2
print(RegisteredNumber.numbers)
This will show that the supposedly pure add function has actually changed the state of the RegisteredNumber class. This is not a stupidly contrived example: in my production code base we have classes which track each created instance, for example, to allow access via key.
The notion of purity just doesn't make much sense in Python.
(not an answer, but too long for a comment)
So if a function can return different values for the same set of arguments, it is not pure?
Remember that functions in Python are objects, so you want to check the purity of an object...
Take this example:
def foo(x):
ret, foo.x = x*x+foo.x, foo.x+1
return ret
foo.x=0
calling foo(3) repeatedly gives:
>>> foo(3)
9
>>> foo(3)
10
>>> foo(3)
11
...
Moreover, reading globals does not require to use the global statement, or the global() builtin inside your function. Global variables might change somewhere else, affecting the purity of your function.
All the above situation might be difficult to detect at runtime.

How to write __getitem__ cleanly?

In Python, when implementing a sequence type, I often (relatively speaking) find myself writing code like this:
class FooSequence(collections.abc.Sequence):
# Snip other methods
def __getitem__(self, key):
if isinstance(key, int):
# Get a single item
elif isinstance(key, slice):
# Get a whole slice
else:
raise TypeError('Index must be int, not {}'.format(type(key).__name__))
The code checks the type of its argument explicitly with isinstance(). This is regarded as an antipattern within the Python community. How do I avoid it?
I cannot use functools.singledispatch, because that's quite deliberately incompatible with methods (it will attempt to dispatch on self, which is entirely useless since we're already dispatching on self via OOP polymorphism). It works with #staticmethod, but what if I need to get stuff out of self?
Casting to int() and then catching the TypeError, checking for a slice, and possibly re-raising is still ugly, though perhaps slightly less so.
It might be cleaner to convert integers into one-element slices and handle both situations with the same code, but that has its own problems (return 0 or [0]?).
As much as it seems odd, I suspect that the way you have it is the best way to go about things. Patterns generally exist to encompass common use cases, but that doesn't mean that they should be taken as gospel when following them makes life more difficult. The main reason that PEP 443 gives for balking at explicit typechecking is that it is "brittle and closed to extension". However, that mainly applies to custom functions that take a number of different types at any time. From the Python docs on __getitem__:
For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.
The Python documentation explicitly states the two types that should be accepted, and what to do if an item that is not of those two types is provided. Given that the types are provided by the documentation itself, it's unlikely to change (doing so would break far more implementations than just yours), so it's likely not worth the trouble to go out of your way to code against Python itself potentially changing.
If you're set on avoiding explicit typechecking, I would point you toward this SO answer. It contains a concise implementation of a #methdispatch decorator (not my name, but i'll roll with it) that lets #singledispatch work with methods by forcing it to check args[1] (arg) rather than args[0] (self). Using that should allow you to use custom single dispatch with your __getitem__ method.
Whether or not you consider either of these "pythonic" is up to you, but remember that while The Zen of Python notes that "Special cases aren't special enough to break the rules", it then immediately notes that "practicality beats purity". In this case, just checking for the two types that the documentation explicitly states are the only things __getitem__ should support seems like the practical way to me.
The antipattern is for code to do explicit type checking, which means using the type() function. Why? Because then a subclass of the target type will no longer work. For instance, __getitem__ can use an int, but using type() to check for it means an int-subclass, which would work, will fail only because type() does not return int.
When a type-check is necessary, isinstance is the appropriate way to do it as it does not exclude subclasses.
When writing __dunder__ methods, type checking is necessary and expected -- using isinstance().
In other words, your code is perfectly Pythonic, and its only problem is the error message (it doesn't mention slices).
I'm not aware of a way to avoid doing it once. That's just the tradeoff of using a dynamically-typed language in this way. However, that doesn't mean you have to do it over and over again. I would solve it once by creating an abstract class with split out method names, then inherit from that class instead of directly from Sequence, like:
class UnannoyingSequence(collections.abc.Sequence):
def __getitem__(self, key):
if isinstance(key, int):
return self.getitem(key)
elif isinstance(key, slice):
return self.getslice(key)
else:
raise TypeError('Index must be int, not {}'.format(type(key).__name__))
# default implementation in terms of getitem
def getslice(self, key):
# Get a whole slice
class FooSequence(UnannoyingSequence):
def getitem(self, key):
# Get a single item
# optional efficient, type-specific implementation not in terms of getitem
def getslice(self, key):
# Get a whole slice
This cleans up FooSequence enough that I might even do it this way if I only had the one derived class. I'm sort of surprised the standard library doesn't already work that way.
To stay pythonic, you have work with the semantics rather than the type of the objects. So if you have some parameter as accessor to a sequence, just use it like that. Use the abstraction for a parameter as long as possible. If you expect a set of user identifiers, do not expect a set, but rather some data structure with a method add. If you expect some text, do not expect a unicode object, but rather some container for characters featuring encode and decode methods.
I assume in general you want to do something like "Use the behavior of the base implementation unless some special value is provided. If you want to implement __getitem__, you can use a case distinction where something different happens if one special value is provided. I'd use the following pattern:
class FooSequence(collections.abc.Sequence):
# Snip other methods
def __getitem__(self, key):
try:
if key == SPECIAL_VALUE:
return SOMETHING_SPECIAL
else:
return self.our_baseclass_instance[key]
except AttributeError:
raise TypeError('Wrong type: {}'.format(type(key).__name__))
If you want to distinguish between a single value (in perl terminology "scalar") and a sequence (in Java terminology "collection"), then it is pythonically fine to determine whether an iterator is implemented. You can either use a try-catch pattern or hasattr as I do now:
>>> a = 42
>>> b = [1, 3, 5, 7]
>>> c = slice(1, 42)
>>> hasattr(a, "__iter__")
False
>>> hasattr(b, "__iter__")
True
>>> hasattr(c, "__iter__")
False
>>>
Applied to our example:
class FooSequence(collections.abc.Sequence):
# Snip other methods
def __getitem__(self, key):
try:
if hasattr(key, "__iter__"):
return map(lambda x: WHATEVER(x), key)
else:
return self.our_baseclass_instance[key]
except AttributeError:
raise TypeError('Wrong type: {}'.format(type(key).__name__))
Dynamic programming languages like python and ruby use duck typing. And a duck is an animal, that walks like a duck, swims like a duck and quacks like a duck. Not because somebody calls it a "duck".

Python: emulate C-style pass-by-reference for variables

I have a framework with some C-like language. Now I'm re-writing that framework and the language is being replaced with Python.
I need to find appropriate Python replacement for the following code construction:
SomeFunction(&arg1)
What this does is a C-style pass-by-reference so the variable can be changed inside the function call.
My ideas:
just return the value like v = SomeFunction(arg1)
is not so good, because my generic function can have a lot of arguments like SomeFunction(1,2,'qqq','vvv',.... and many more)
and I want to give the user ability to get the value she wants.
Return the collection of all the arguments no matter have they changed or not, like: resulting_list = SomeFunction(1,2,'qqq','vvv',.... and many more) interesting_value = resulting_list[3]
this can be improved by giving names to the values and returning dictionary interesting_value = resulting_list['magic_value1']
It's not good because we have constructions like
DoALotOfStaff( [SomeFunction1(1,2,3,&arg1,'qq',val2),
SomeFunction2(1,&arg2,v1),
AnotherFunction(),
...
], flags1, my_var,... )
And I wouldn't like to load the user with list of list of variables, with names or indexes she(the user) should know. The kind-of-references would be very useful here ...
Final Response
I compiled all the answers with my own ideas and was able to produce the solution. It works.
Usage
SomeFunction(1,12, get.interesting_value)
AnotherFunction(1, get.the_val, 'qq')
Explanation
Anything prepended by get. is kind-of reference, and its value will be filled by the function. There is no need in previous defining of the value.
Limitation - currently I support only numbers and strings, but these are sufficient form my use-case.
Implementation
wrote a Getter class which overrides getattribute and produces any variable on demand
all newly created variables has pointer to their container Getter and support method set(self,value)
when set() is called it checks if the value is int or string and creates object inheriting from int or str accordingly but with addition of the same set() method. With this new object we replace our instance in the Getter container
Thank you everybody. I will mark as "answer" the response which led me on my way, but all of you helped me somehow.
I would say that your best, cleanest, bet would be to construct an object containing the values to be passed and/or modified - this single object can be passed, (and will automatically be passed by reference), in as a single parameter and the members can be modified to return the new values.
This will simplify the code enormously and you can cope with optional parameters, defaults, etc., cleanly.
>>> class C:
... def __init__(self):
... self.a = 1
... self.b = 2
...
>>> c=C
>>> def f(o):
... o.a = 23
...
>>> f(c)
>>> c
<class __main__.C at 0x7f6952c013f8>
>>> c.a
23
>>>
Note
I am sure that you could extend this idea to have a class of parameter that carried immutable and mutable data into your function with fixed member names plus storing the names of the parameters actually passed then on return map the mutable values back into the caller parameter name. This technique could then be wrapped into a decorator.
I have to say that it sounds like a lot of work compared to re-factoring your existing code to a more object oriented design.
This is how Python works already:
def func(arg):
arg += ['bar']
arg = ['foo']
func(arg)
print arg
Here, the change to arg automatically propagates back to the caller.
For this to work, you have to be careful to modify the arguments in place instead of re-binding them to new objects. Consider the following:
def func(arg):
arg = arg + ['bar']
arg = ['foo']
func(arg)
print arg
Here, func rebinds arg to refer to a brand new list and the caller's arg remains unchanged.
Python doesn't come with this sort of thing built in. You could make your own class which provides this behavior, but it will only support a slightly more awkward syntax where the caller would construct an instance of that class (equivalent to a pointer in C) before calling your functions. It's probably not worth it. I'd return a "named tuple" (look it up) instead--I'm not sure any of the other ways are really better, and some of them are more complex.
There is a major inconsistency here. The drawbacks you're describing against the proposed solutions are related to such subtle rules of good design, that your question becomes invalid. The whole problem lies in the fact that your function violates the Single Responsibility Principle and other guidelines related to it (function shouldn't have more than 2-3 arguments, etc.). There is really no smart compromise here:
either you accept one of the proposed solutions (i.e. Steve Barnes's answer concerning your own wrappers or John Zwinck's answer concerning usage of named tuples) and refrain from focusing on good design subtleties (as your whole design is bad anyway at the moment)
or you fix the design. Then your current problem will disappear as you won't have the God Objects/Functions (the name of the function in your example - DoALotOfStuff really speaks for itself) to deal with anymore.

When and how to use the builtin function property() in python

It appears to me that except for a little syntactic sugar, property() does nothing good.
Sure, it's nice to be able to write a.b=2 instead of a.setB(2), but hiding the fact that a.b=2 isn't a simple assignment looks like a recipe for trouble, either because some unexpected result can happen, such as a.b=2 actually causes a.b to be 1. Or an exception is raised. Or a performance problem. Or just being confusing.
Can you give me a concrete example for a good usage of it? (using it to patch problematic code doesn't count ;-)
In languages that rely on getters and setters, like Java, they're not supposed nor expected to do anything but what they say -- it would be astonishing if x.getB() did anything but return the current value of logical attribute b, or if x.setB(2) did anything but whatever small amount of internal work is needed to make x.getB() return 2.
However, there are no language-imposed guarantees about this expected behavior, i.e., compiler-enforced constraints on the body of methods whose names start with get or set: rather, it's left up to common sense, social convention, "style guides", and testing.
The behavior of x.b accesses, and assignments such as x.b = 2, in languages which do have properties (a set of languages which includes but is not limited to Python) is exactly the same as for getter and setter methods in, e.g., Java: the same expectations, the same lack of language-enforced guarantees.
The first win for properties is syntax and readability. Having to write, e.g.,
x.setB(x.getB() + 1)
instead of the obvious
x.b += 1
cries out for vengeance to the gods. In languages which support properties, there is absolutely no good reason to force users of the class to go through the gyrations of such Byzantine boilerplate, impacting their code's readability with no upside whatsoever.
In Python specifically, there's one more great upside to using properties (or other descriptors) in lieu of getters and setters: if and when you reorganize your class so that the underlying setter and getter are not needed anymore, you can (without breaking the class's published API) simply eliminate those methods and the property that relies on them, making b a normal "stored" attribute of x's class rather than a "logical" one obtained and set computationally.
In Python, doing things directly (when feasible) instead of via methods is an important optimization, and systematically using properties enables you to perform this optimization whenever feasible (always exposing "normal stored attributes" directly, and only ones which do need computation upon access and/or setting via methods and properties).
So, if you use getters and setters instead of properties, beyond impacting the readability of your users' code, you are also gratuitously wasting machine cycles (and the energy that goes to their computer during those cycles;-), again for no good reason whatsoever.
Your only argument against properties is e.g. that "an outside user wouldn't expect any side effects as a result of an assignment, usually"; but you miss the fact that the same user (in a language such as Java where getters and setters are pervasive) wouldn't expect (observable) "side effects" as a result of calling a setter, either (and even less for a getter;-). They're reasonable expectations and it's up to you, as the class author, to try and accommodate them -- whether your setter and getter are used directly or through a property, makes no difference. If you have methods with important observable side effects, do not name them getThis, setThat, and do not use them via properties.
The complaint that properties "hide the implementation" is wholly unjustified: most all of OOP is about implementing information hiding -- making a class responsible for presenting a logical interface to the outside world and implementing it internally as best it can. Getters and setters, exactly like properties, are tools towards this goal. Properties just do a better job at it (in languages that support them;-).
The idea is to allow you to avoid having to write getters and setters until you actually need them.
So, to start off you write:
class MyClass(object):
def __init__(self):
self.myval = 4
Obviously you can now write myobj.myval = 5.
But later on, you decide that you do need a setter, as you want to do something clever at the same time. But you don't want to have to change all the code that uses your class - so you wrap the setter in the #property decorator, and it all just works.
but hiding the fact that a.b=2 isn't a
simple assignment looks like a recipe
for trouble
You're not hiding that fact though; that fact was never there to begin with. This is python -- a high-level language; not assembly. Few of the "simple" statements in it boil down to single CPU instructions. To read simplicity into an assignment is to read things that aren't there.
When you say x.b = c, probably all you should think is that "whatever just happened, x.b should now be c".
A basic reason is really simply that it looks better. It is more pythonic. Especially for libraries. something.getValue() looks less nice than something.value
In plone (a pretty big CMS), you used to have document.setTitle() which does a lot of things like storing the value, indexing it again and so. Just doing document.title = 'something' is nicer. You know that a lot is happening behind the scenes anyway.
You are correct, it is just syntactic sugar. It may be that there are no good uses of it depending on your definition of problematic code.
Consider that you have a class Foo that is widely used in your application. Now this application has got quite large and further lets say it's a webapp that has become very popular.
You identify that Foo is causing a bottleneck. Perhaps it is possible to add some caching to Foo to speed it up. Using properties will let you do that without changing any code or tests outside of Foo.
Yes of course this is problematic code, but you just saved a lot of $$ fixing it quickly.
What if Foo is in a library that you have hundreds or thousands of users for? Well you saved yourself having to tell them to do an expensive refactor when they upgrade to the newest version of Foo.
The release notes have a lineitem about Foo instead of a paragraph porting guide.
Experienced Python programmers don't expect much from a.b=2 other than a.b==2, but they know even that may not be true. What happens inside the class is it's own business.
Here's an old example of mine. I wrapped a C library which had functions like "void dt_setcharge(int atom_handle, int new_charge)" and "int dt_getcharge(int atom_handle)". I wanted at the Python level to do "atom.charge = atom.charge + 1".
The "property" decorator makes that easy. Something like:
class Atom(object):
def __init__(self, handle):
self.handle = handle
def _get_charge(self):
return dt_getcharge(self.handle)
def _set_charge(self, charge):
dt_setcharge(self.handle, charge)
charge = property(_get_charge, _set_charge)
10 years ago, when I wrote this package, I had to use __getattr__ and __setattr__ which made it possible, but the implementation was a lot more error prone.
class Atom:
def __init__(self, handle):
self.handle = handle
def __getattr__(self, name):
if name == "charge":
return dt_getcharge(self.handle)
raise AttributeError(name)
def __setattr__(self, name, value):
if name == "charge":
dt_setcharge(self.handle, value)
else:
self.__dict__[name] = value
getters and setters are needed for many purposes, and are very useful because they are transparent to the code. Having object Something the property height, you assign a value as Something.height = 10, but if height has a getter and setter then at the time you do assign that value you can do many things in the procedures, like validating a min or max value, like triggering an event because the height changed, automatically setting other values in function of the new height value, all that may occur at the moment Something.height value was assigned. Remember, you don't need to call them in your code, they are auto executed at the moment you read or write the property value. In some way they are like event procedures, when the property X changes value and when the property X value is read.
It is useful when you try to replace inheritance with delegation in refactoring. The following is a toy example. Stack was a subclass in Vector.
class Vector:
def __init__(self, data):
self.data = data
#staticmethod
def get_model_with_dict():
return Vector([0, 1])
class Stack:
def __init__(self):
self.model = Vector.get_model_with_dict()
self.data = self.model.data
class NewStack:
def __init__(self):
self.model = Vector.get_model_with_dict()
#property
def data(self):
return self.model.data
#data.setter
def data(self, value):
self.model.data = value
if __name__ == '__main__':
c = Stack()
print(f'init: {c.data}') #init: [0, 1]
c.data = [0, 1, 2, 3]
print(f'data in model: {c.model.data} vs data in controller: {c.data}')
#data in model: [0, 1] vs data in controller: [0, 1, 2, 3]
c_n = NewStack()
c_n.data = [0, 1, 2, 3]
print(f'data in model: {c_n.model.data} vs data in controller: {c_n.data}')
#data in model: [0, 1, 2, 3] vs data in controller: [0, 1, 2, 3]
Note if you do use directly access instead of property, the self.model.data does not equal self.data, which is out of our expectation.
You can take codes before __name__=='__main__' as a library.

Categories