I have many different small classes which have a few fields each, e.g. this:
class Article:
    def __init__(self, name, available):
        self.name = name
        self.available = available
What's the easiest and/or most idiomatic way to make the name field read only, so that
a = Article("Pineapple", True)
a.name = "Banana" # <-- should not be possible
is not possible anymore?
Here's what I considered so far:
Use a getter (ugh!).
class Article:
    def __init__(self, name, available):
        self._name = name
        self.available = available

    def name(self):
        return self._name
Ugly, non-pythonic - and a lot of boilerplate code to write (especially if I have multiple fields to make read-only). However, it does the job and it's easy to see why that is.
Use __setattr__:
class Article:
    def __init__(self, name, available):
        self.name = name
        self.available = available

    def __setattr__(self, name, value):
        if name == "name":
            raise Exception("%s property is read-only" % name)
        self.__dict__[name] = value
Looks pretty on the caller side and seems to be the idiomatic way to do the job - but unfortunately I have many classes, each with only a few fields to make read-only, so I'd need to add a __setattr__ implementation to all of them. Or use some sort of mixin, maybe (a rough sketch of that idea is below, after option 3)? In any case, I'd need to make up my mind how to behave when a client attempts to assign a value to a read-only field. Raise some exception, I guess - but which one?
Use a utility function to define properties (and optionally getters) automatically. This is basically the same idea as (1) except that I don't write the getters explicitly but rather do something like
class Article:
    def __init__(self, name, available):
        # This function would somehow give a '_name' field to self
        # and a 'name()' getter to the 'Article' class object (if
        # necessary); the getter simply returns self._name
        defineField(self, "name")
        self.available = available
The downside of this is that I don't even know if this is possible (or how to implement it) since I'm not familiar with runtime code generation in Python. :-)
So far, (2) appears most promising to me, except that I'd need to add __setattr__ definitions to all my classes. I wish there was a way to 'annotate' fields so that this happens automatically. Does anybody have a better idea?
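To make the mixin idea from (2) concrete, something like this is roughly what I have in mind - a sketch only, where the names ReadOnlyMixin and _readonly_fields are just placeholders I made up, and AttributeError is raised because that is what properties raise:

class ReadOnlyMixin(object):
    # ReadOnlyMixin and _readonly_fields are placeholder names for this sketch
    _readonly_fields = ()

    def __setattr__(self, name, value):
        # allow the first assignment (e.g. in __init__), forbid any rebinding
        if name in self._readonly_fields and name in self.__dict__:
            raise AttributeError("%s is read-only" % name)
        object.__setattr__(self, name, value)

class Article(ReadOnlyMixin):
    _readonly_fields = ("name",)

    def __init__(self, name, available):
        self.name = name
        self.available = available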
For what it's worth, I'm using Python 2.6.
UPDATE:
Thanks for all the interesting responses! By now, I have this:
def ro_property(o, name, value):
    setattr(o.__class__, name, property(lambda o: o.__dict__["_" + name]))
    setattr(o, "_" + name, value)

class Article(object):
    def __init__(self, name, available):
        ro_property(self, "name", name)
        self.available = available
This seems to work quite nicely. The only changes needed to the original class are
I need to inherit from object (which is not such a stupid thing anyway, I guess)
I need to change self._name = name to ro_property(self, "name", name).
This looks quite neat to me - can anybody see a downside with it?
I would use property as a decorator to manage your getter for name (see the example for the class Parrot in the documentation). Use, for example, something like:
class Article(object):
    def __init__(self, name, available):
        self._name = name
        self.available = available

    @property
    def name(self):
        return self._name
If you do not define the setter for the name property (using the decorator x.setter around a function) this throws an AttributeError when you try and reset name.
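For instance (the exact wording of the error message can vary between Python versions):

a = Article("Pineapple", True)
print a.name # -> 'Pineapple'
a.name = "Banana" # -> AttributeError: can't set attribute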
Note: You have to use Python's new-style classes (i.e. in Python 2.6 you have to inherit from object) for properties to work correctly. This is not the case according to @SvenMarnach.
As pointed out in other answers, using a property is the way to go for read-only attributes. The solution in Chris' answer is the cleanest one: it uses the property() built-in in a straightforward, simple way. Everyone familiar with Python will recognize this pattern, and there's no domain-specific voodoo happening.
If you don't like that every property needs three lines to define, here's another straightforward way:
from operator import attrgetter

class Article(object):
    def __init__(self, name, available):
        self._name = name
        self.available = available

    name = property(attrgetter("_name"))
Generally, I don't like defining domain-specific functions to do something that can be done easily enough with standard tools. Reading code is so much easier if you don't have to get used to all the project-specific stuff first.
Based on Chris's answer, but arguably more pythonic:
def ro_property(field):
    return property(lambda self: self.__dict__[field])

class Article(object):
    name = ro_property('_name')

    def __init__(self):
        self._name = "banana"
Trying to modify the property will raise an AttributeError:
a = Article()
print a.name # -> 'banana'
a.name = 'apple' # -> AttributeError: can't set attribute
UPDATE: About your updated answer - the (little) problem I see is that you are modifying the definition of the property in the class every time you create an instance, and I don't think that is such a good idea. That's why I put the ro_property call outside of the __init__ function.
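To make that concrete, here is a rough illustration using the Article class from your update; everything still works, but the class-level property object is silently replaced on every instantiation:

a = Article("Pineapple", True)
prop_after_a = Article.name   # the property object currently installed on the class
b = Article("Banana", False)
prop_after_b = Article.name
print prop_after_a is prop_after_b # -> False: creating b replaced the property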
What about?:
def ro_property(name):
    def ro_property_decorator(c):
        setattr(c, name, property(lambda o: o.__dict__["_" + name]))
        return c
    return ro_property_decorator

@ro_property('name')
@ro_property('other')
class Article(object):
    def __init__(self, name):
        self._name = name
        self._other = "foo"

a = Article("banana")
print a.name # -> 'banana'
a.name = 'apple' # -> AttributeError: can't set attribute
Class decorators are fancy!
It should be noted that it's always possible to modify attributes of an object in Python - there are no truly private variables in Python. It's just that some approaches make it a bit harder. But a determined coder can always look up and modify the value of an attribute. For example, I can always modify your __setattr__ if I want to...
For more information, see Section 9.6 of The Python Tutorial. Python applies name mangling to attributes prefixed with __, so the actual name at runtime is different - but you can still work out what that runtime name is (and thus modify the attribute).
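A quick illustration of the mangling (the class and attribute names are made up for the example):

class Secretive(object):   # example class, invented for illustration
    def __init__(self):
        self.__hidden = 42 # actually stored as _Secretive__hidden

s = Secretive()
# s.__hidden raises AttributeError, but the mangled name still works:
print s._Secretive__hidden # -> 42
s._Secretive__hidden = 99  # ...and it can be reassigned just as easily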
I would stick with your option 1, but refine it to use a Python property:
class Article(object):
    def __init__(self, name):
        self.__name = name

    def get_name(self):
        return self.__name

    name = property(get_name)
I like to create helper classes that can be used by other classes and where all methods are static (staticmethod). I need to wrap each method with the decorator @staticmethod, but this solution does not seem very aesthetic to me. I decided to create a metaclass for such tool classes; here is an abstract implementation example:
import types
import math

class StaticClass(type):
    def __new__(mcs, name, bases, attr):
        for name, value in attr.items():
            if type(value) is types.MethodType:
                attr[name] = staticmethod(value)
        return super().__new__(mcs, name, bases, attr)

class Tool(metaclass=StaticClass):
    def get_radius_from_area(area):
        return math.sqrt(area / Tool.get_pi())

    def get_pi():
        return math.pi

class CalcRadius:
    def __init__(self, area):
        self.__area = area

    def __call__(self):
        return Tool.get_radius_from_area(self.__area)

if __name__ == '__main__':
    get_it = CalcRadius(100)
    print(get_it())  # 5.641895835477563
Everything works and gives the correct result, but there are understandable and predictable problems with code inspection in the IDE (I use PyCharm 2019.2):
For def get_radius_from_area(area):
Usually the first parameter of a method is named 'self'.
'area' is highlighted in yellow in the method's return statement.
For get_pi():
Method must have a first parameter, usually called 'self'.
The empty parentheses (no arguments) are underlined in red.
If I add the line "# noinspection PyMethodParameters" above the class, this partially solves the problem, but it looks even worse than dozens of @staticmethod decorators.
I understand why this is happening and why the developers from JetBrains specially adapt parts of the code in their IDE for Django.
But can I somehow beautifully create a purely static class, in which all methods are static?
Maybe metaclasses are not the best option and is there some kind of alternative solution?
Your metaclass isn't actually doing anything, because types.MethodType matches only bound method objects. You aren't getting any of those when you browse through the class namespace in __new__, so you never wrap anything in staticmethod.
You can fix it by changing the check to types.FunctionType (this will probably satisfy automated tools, which will then see the method types correctly):
class StaticClass(type):
    def __new__(mcs, name, bases, attr):
        for name, value in attr.items():
            if type(value) is types.FunctionType:
                attr[name] = staticmethod(value)
        return super().__new__(mcs, name, bases, attr)
But I'd suggest just doing away with the classes and using functions directly. Functions are first class objects in Python, you can pass them around between objects as much as you want. If you want a handy grouping of them, you can put them in lists or dictionaries, or use modules to collect their code in various groupings (and use packages to group modules). I'd also advise you to avoid using leading double-underscore __names to try to get privacy for your attributes by invoking name mangling. It doesn't actually protect your data from anything (outside code can still get at it), and it makes it a whole lot harder to debug. It's included in Python to help you avoid accidental name collisions, not to protect member variables as a matter of course.
Why not encapsulate your code at the module level, instead of class, and offer functions to your users?
Your code could be as simple as that:
import math

def calc_radius(area):
    return math.sqrt(area / math.pi)

print(calc_radius(100))
You can also create a class decorator, which I personally prefer over a metaclass in aesthetics:
import types

def staticclass(cls):
    for name, value in vars(cls).items():
        if isinstance(value, types.FunctionType):
            setattr(cls, name, staticmethod(value))
    return cls
so that:
@staticclass
class Tool:
    def get_radius_from_area(area):
        return math.sqrt(area / Tool.get_pi())

    def get_pi():
        return math.pi

class CalcRadius:
    def __init__(self, area):
        self.__area = area

    def __call__(self):
        return Tool.get_radius_from_area(self.__area)

if __name__ == '__main__':
    get_it = CalcRadius(100)
    print(get_it())
outputs: 5.641895835477563
EDIT: @Blckknght correctly points out that a function is not a bound method until it is actually bound to an instance, which the class object is not. Switching to isinstance(value, types.FunctionType) would allow proper wrapping.
I have a class Step, which I want to derive by many sub-classes. I want every class deriving from Step to be "registered" by a name I choose for it (not the class's name), so I can later call Step.getStepTypeByName().
Something like this, only working :):
class Step(object):
    _STEPS_BY_NAME = {}

    @staticmethod
    def REGISTER(cls, name):
        _STEPS_BY_NAME[name] = cls

class Derive1(Step):
    REGISTER(Derive1, "CustomDerive1Name")
    ...

class Derive2(Step):
    REGISTER(Derive2, "CustomDerive2Name")
    ...
Your solution does not work, for three reasons.
The first one is that _STEPS_BY_NAME only exists as an attribute of the Step class, so Step.REGISTER cannot access _STEPS_BY_NAME without a reference to the Step class. IOW, you have to make it a classmethod (cf. below).
The second one is that you need to explicitly use Step.REGISTER(cls) - the name REGISTER does not exist outside the Step class.
The third reason is that within a class statement's body, the class object has not yet been created nor bound to its name, so you cannot reference the class itself at this point.
IOW, you'd want this instead:
class Step(object):
    _STEPS_BY_NAME = {}
    # NB: by convention, "ALL_UPPER" names denote pseudo-constants

    @classmethod
    def register(cls, stepclass, name):
        # here `cls` is the Step class itself
        cls._STEPS_BY_NAME[name] = stepclass

class Derive1(Step):
    ...

Step.register(Derive1, "CustomDerive1Name")

class Derive2(Step):
    ...

Step.register(Derive2, "CustomDerive2Name")
Now with a minor modification to Step.register you could use it as a class decorator, making things much clearer:
class Step(object):
    _STEPS_BY_NAME = {}

    @classmethod
    def register(cls, name):
        def _register(stepclass):
            cls._STEPS_BY_NAME[name] = stepclass
            return stepclass
        return _register

@Step.register("CustomDerive1Name")
class Derive1(Step):
    ...

@Step.register("CustomDerive2Name")
class Derive2(Step):
    ...
As a last note: unless you have a compelling reason to register your subclasses in the base class itself, it might be better to use module-level variables and functions (a Python module is actually a kind of singleton):
# steps.py
class Step(object):
    # ....

_STEPS_BY_NAME = {}

def register(name):
    def _register(cls):
        _STEPS_BY_NAME[name] = cls
        return cls
    return _register

def get_step_class(name):
    return _STEPS_BY_NAME[name]
And in your other modules
import steps

@steps.register("CustomDerive1Name")
class Derive1(steps.Step):
    # ...
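Once the registering modules have been imported (and the elided class bodies filled in), looking a class back up could be as simple as this - the module name my_steps is purely hypothetical:

import steps
import my_steps  # hypothetical module defining Derive1 etc., so its decorator has run

cls = steps.get_step_class("CustomDerive1Name")
step = cls()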
The point here is to avoid giving too many responsibilities to your Step class. I don't know your concrete use case, so I can't tell which design best fits your needs, but I've been using this last one on quite a few projects and it has always worked fine so far.
You are close. Use this
class Step(object):
    pass

class Derive1(Step):
    pass

class Derive2(Step):
    pass

_STEPS_BY_NAME = {
    'foo': Step,
    'bar': Derive1,
    'baz': Derive2,
}

def get_step_by_name(name):
    return _STEPS_BY_NAME[name]
Warning: there might be better approaches depending on what you are trying to achieve. Such a mapping from strings to classes is a maintenance nightmare: if you want to rename a class, you have to remember to change it in multiple places, and you won't get any autocomplete help from your IDE either.
I am reading up on how we ensure data encapsulation in Python. One of the blogs says:
"Data Encapsulation means, that we should only be able to access private attributes via getters and setters"
Consider the following snippets from the blog:
class Robot:
    def __init__(self, name=None, build_year=None):
        self.name = name
        self.build_year = build_year
Now, if I create an object of the class as below:
obj1 = Robot()
obj1.name = "Robo1"
obj1.build_year = "1978"
Currently, I can access the attributes directly because I have defined them as public (without the __ notation).
Now, to ensure data encapsulation, I need to define the attributes as private using the __ notation and access them via getters and setters.
So the new class definition is as follows:
class Robot:
    def __init__(self, name=None, build_year=2000):
        self.__name = name
        self.__build_year = build_year

    def set_name(self, name):
        self.__name = name

    def get_name(self):
        return self.__name

    def set_build_year(self, by):
        self.__build_year = by

    def get_build_year(self):
        return self.__build_year
Now I instantiate the class as below:
x = Robot("Marvin", 1979)
x.set_build_year(1993)
This way, I achieve data encapsulation: the private data members are no longer accessed directly and can only be accessed via the class methods.
Q1: Why are we doing this? Who are we protecting the code from? Who is the 'outside world'? Anyone who has the source code can tweak it as per their requirements, so why do we add extra methods (get/set) to modify/tweak the attributes at all?
Q2: Is the above example considered data encapsulation?
Data encapsulation is slightly more general than access protection. name and build_year are encapsulated by the class Robot regardless of how you define the attributes. Python takes the position that getters and setters that do nothing more than access or assign to the underlying attribute are unnecessary.
Even using the double-underscore prefix is just advisory, and is more concerned with preventing name collisions in subclasses. If you really wanted to get to the __build_year attribute directly, you still could with
# Prefix attribute name with _Robot
x._Robot__build_year = 1993
A better design in Python is to use a property, which causes Python to invoke a defined getter and/or setter whenever the attribute is accessed or assigned directly. For example:
class Robot(object):
    def __init__(self, name, by):
        self.name = name
        self.build_year = by

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, newname):
        self._name = newname

    @property
    def build_year(self):
        return self._build_year

    @build_year.setter
    def build_year(self, newby):
        self._build_year = newby
You wouldn't actually define these property functions so simply, but a big benefit is that you can start by allowing direct access to a name attribute, and if you decide later that there should be more logic involved in getting/setting the value and you want to switch to properties, you can do so without affecting existing code. Code like
x = Robot("bob", 1993)
x.build_year = 1993
will work the same whether or not x.build_year = 1993 assigns to build_year directly or if it really triggers a call to the property setter.
About source code: sometimes you supply others with compiled Python files that do not include the source, and you don't want people to mess around with direct attribute assignments.
Now, consider data encapsulation as a safeguard - the last checkpoint before values are assigned or handed out:
You may want to validate or process assignments in the setters, to make sure the assigned value is valid for your needs or arrives in the right format (e.g. you may want to check that __build_year is later than 1800, or that the name is a string). This is especially important in dynamic languages like Python, where a variable is not declared with a specific type.
The same goes for the getters: you might want to return the year as a decimal, say, while using it as an integer inside the class.
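As a rough sketch of those checks (the 1800 threshold and the string check are just the examples mentioned above, nothing required by Python):

class Robot(object):
    def __init__(self, name, build_year):
        self.name = name  # both assignments go through the setters below
        self.build_year = build_year

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # example validation only: require a string
        if not isinstance(value, str):
            raise TypeError("name must be a string")
        self._name = value

    @property
    def build_year(self):
        return self._build_year

    @build_year.setter
    def build_year(self, value):
        # example validation only: require a year later than 1800
        value = int(value)
        if value <= 1800:
            raise ValueError("build_year must be later than 1800")
        self._build_year = value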
Yes, your example is a basic data encapsulation.
Say I have a class that looks like
class MeasurementList:
    def __init__(self, measurement_list):
        self.__measurements = measurement_list

    @property
    def measurements(self):
        return self.__measurements
what is the most pythonic way to retrieve the value of self.measurements from inside the class: directly accessing the variable, or going via the property (the external accessor)? I.e.,
def do_something(self):
    # do something
    return self.measurements
or
def do_something(self):
    # do something
    return self.__measurements
Do any of the alternatives have speed advantages, make refactoring easier, or differ in other ways?
The point of properties is to add additional functionality to the process of getting/setting a field, while keeping the interface of a field.
That means you start out with a simple field, and access it as a field:
class MeasurementList:
    def __init__(self, measurement_list):
        self.measurements = measurement_list

    def foo(self):
        print("there are %d measurements" % len(self.measurements))
Then if you want/have to add additional logic to the setter/getter you convert it into a property, without having changed the interface. Thus no need to refactor accessing code.
class MeasurementList:
    def __init__(self, measurement_list):
        self._count = 0
        self.measurements = measurement_list

    @property
    def measurements(self):
        return self._measurements

    @measurements.setter
    def measurements(self, value):
        self._measurements = value
        self._count = len(value)

    def foo(self):
        print("there are %d measurements" % (self._count))

    def bar(self):
        print(self.measurements)
Alternative reasons for using properties are readonly properties or properties that return computed (not directly stored in fields) values. In the case of read only properties you would access the backing field directly to write (from inside the class).
class MeasurementList:
    def __init__(self, measurement_list):
        self._measurements = measurement_list

    # readonly
    @property
    def measurements(self):
        return self._measurements

    # computed property
    @property
    def count(self):
        return len(self.measurements)

    def foo(self):
        print("there are %d measurements" % (self.count))

    def bar(self):
        print(self.measurements)
Keeping all that in mind you should not forget that there is no such thing as 'private' in python. If anyone really wants to access a private anything he can do so. It is just convention that anything starting with an underscore should be considered private and not be accessed by the caller. That is also the reason why one underscore is enough. Two underscores initiate some name mangling that is primarily used to avoid name conflicts, not prohibit access.
When you use properties in Python, you should almost always avoid accessing the attribute underneath the property unless it's really necessary. Why?
Properties in Python are used to create getter, setter and deleter, but you probably know it.
They are usually used when you process the property's data during those operations. I don't really have a good example for it right now, but consider the following:
import hashlib

class User:
    # _password stores a hash object built from the user's password.
    @property
    def password(self):
        return self._password.hexdigest()  # Returns the hash as a string

    @password.setter
    def password(self, val):
        self._password = hashlib.sha256(val.encode())  # Creates a hash object
Here, using _password and password yields quite different results. In most cases you simply need password, both inside and outside the class definition, unless you want to interact directly with the object it wraps.
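A quick usage sketch of the class above (assuming the hashlib-based setter shown):

u = User()
u.password = "hunter2"  # the setter stores a hash object in u._password
print(u.password)       # the getter returns the hex digest string
print(u._password)      # the raw hash object, rarely what callers want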
If you have the same object returned by the getter and stored in the attribute, you should still follow the same practice: you may wish some day to add checks or extra mechanics to it, and going through the property will save you from refactoring every use of _attribute. Following that convention will also save you from errors when creating more complex descriptors.
Also, from your code, note that using __measurements (leading double underscore) triggers name mangling of the attribute name, so if you ever inherit from MeasurementList, subclasses will have a hard time accessing this attribute.
I presume you have seen code like this in Java. It is, however, deeply unpythonic to use methods where attribute access serves the purpose perfectly well. Your existing code would be much more simply written as
class MeasurementList:
    def __init__(self, measurement_list):
        self.measurements = measurement_list
Then no property is required.
The point is, presumably, to avoid allowing code external to the class to alter the value of the __measurements attribute. How necessary is this in practice?
Use setters and getters both inside and outside your class. It would make your code easier to maintain once you add some additional data processing into setters and getters:
class C(object):
    _p = 1

    @property
    def p(self):
        print 'getter'
        return self._p

    @p.setter
    def p(self, val):
        print 'setter'
        self._p = val

    def any_method(self):
        self.p = 5
        print '----'
        a = self.p

myObject = C()
myObject.any_method()
From the output, you see that setter and getter are invoked:
setter
----
getter
Suppose I have a class NamedObject which has an attribute name. Now if I had to use a setter, I would first have to define a getter (I guess?) like so:
class NamedObject:
    def __init__(self, name):
        self.name = name

    @property
    def name(self):
        return self._name
Now I was wondering: inside the setter, should I use self._name or self.name - the actual attribute or the getter? When setting the name I of course need to use _name, but what about when I'm reading the value inside the setter? For example:
@name.setter
def name(self, value):
    if self._name != str(value):  # Or should I do 'if self.name != value'?
        self.doStuff(self._name)  # Or doStuff(self.name)?
        self.doMoreStuff()
        self._name = str(value)
Does it actually matter which one to use, and why use one over the other?
There's no normal reason to use the external interface when your setter is part of the internal interface. I suppose you might be able to construct a scenario where you might want to, but by default, just use the internal variable.
If your getter has significant logic (like lazy initialization), then you should access through the getter all the time.
class Something(object):
    UNINITIALIZED = object()
    LAZY_ATTRS = ('x', 'y', 'z')

    def __init__(self):
        for attr in self.LAZY_ATTRS:
            setattr(self, '_' + attr, self.UNINITIALIZED)

    def _get_x(self):
        if self._x is self.UNINITIALIZED:
            self._x = self.doExpensiveInitStuff('x')
        return self._x
But if all your getter does is return self._x, just access the internal variable directly.
Using the getter instead of just accessing the internal variable adds another function call to your setting logic, and in Python, function calls are expensive. If you are writing this:
def _get_x(self):
    return self._x

def _set_x(self, value):
    self._x = value

x = property(fget=_get_x, fset=_set_x)
then you are suffering from "Too Much Java" syndrome. Java developers have to write this kind of stuff, because if it later becomes necessary to add behavior to the setting or getting of x, all the accesses to x outside of the class have to be recompiled. But in Python you are far better off keeping things simple: just define x as an instance variable, and convert it to a property only when the need arises to add some kind of setting or getting behavior. See YAGNI.
Paul already answered well.
For the sake of completeness I'd like to add that using getters/setters consistently makes it easier to override a class. There are several implications here.
If you envision that a particular class is very likely to be overridden/extended by yourself or others, then using getters/setters early on might be beneficial in terms of less time spent later on refactoring. Still, I agree with the keep-it-simple viewpoint: use the below only sparingly, because of the runtime cost and the extra reading/coding effort.
If validation is done in the getter too, then either use the instance attribute directly in the setter, or provide two different getters, name() and _name() (or name_already_checked()), so that both can be overridden and the setter can use the simple getter without validation. This allows extending both the fast, no-validation getter and the usual getter provided for consumers.
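A sketch of what that split could look like (the validation rule and the _name_value field are invented for the example; the point is only the separation between a plain getter and a validating one):

class NamedObject(object):
    def __init__(self, name):
        self._name_value = str(name)  # _name_value is a made-up backing field

    def _name(self):
        # cheap getter, no validation - subclasses may override it
        return self._name_value

    @property
    def name(self):
        # public getter with validation - also overridable
        value = self._name()
        if not value:
            raise ValueError("name must not be empty")
        return value

    @name.setter
    def name(self, value):
        # the setter deliberately relies on the cheap getter, not the validating one
        if self._name() != str(value):
            self._name_value = str(value)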
This does violate the YAGNI principle that Paul pointed to. However, if you do release code for a wider audience, "overengineering" is often advisable. Libraries benefit from added flexibility and foresight.