I would like to set up a class hierarchy in Python 3.2 with 'protected' access: Members of the base class would be in scope only for derived classes, but not 'public'.
A double underscore makes a member 'private'; a single underscore indicates a warning, but the member remains 'public'. What (if any...) is the correct syntax for designating a 'protected' member?
Member access allowance in Python works by "negotiation" and "treaties", not by force.
In other words, the user of your class is supposed to keep their hands off things which are not their business, but you cannot enforce that other than by using _xxx identifiers, which make it absolutely clear that access is (normally) not appropriate.
Double underscores don't make a member 'private' in the C++ or Java sense - Python quite explicitly eschews that kind of language-enforced access rules. A single underscore, by convention, marks an attribute or a method as an "implementation detail" - that is, things outside can still get to it, but this isn't a supported part of the class' interface and, therefore, the guarantees that the class might make about invariants or back/forwards compatibility no longer apply. This solves the same conceptual problem as 'private' (separation of interface and implementation) in a different way.
Double underscores invoke name mangling which still isn't 'private' - it is just a slightly stronger formulation of the above, whereby:
- This function is an implementation detail of this class, but
- Subclasses might reasonably expect to have a method of the same name that isn't meant as an overridden version of the original
This takes a little bit of language support, whereby the __name is mangled to include the name of the class - so that subclass versions of it get different names instead of overriding. It is still quite possible for a subclass or outside code to call that method if it really wants to - and the goal of name mangling is explicitly not to prevent that.
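As a minimal sketch of the mangling described above (the class and method names here are made up for illustration), a subclass can define its own __setup without overriding the base class's, and the mangled name stays reachable:

```python
class Base:
    def __init__(self):
        self.__setup()          # compiled as self._Base__setup()

    def __setup(self):          # stored on the class as _Base__setup
        self.origin = "Base"


class Child(Base):
    def __setup(self):          # stored as _Child__setup; does not override Base's
        self.origin = "Child"


child = Child()
print(child.origin)             # Base (Base's own __setup ran, not Child's)

# Mangling does not block access: the mangled name is an ordinary attribute.
child._Child__setup()
print(child.origin)             # Child
```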
But because of all this, 'protected' turns out not to make much sense in Python - if you really have a method that could break invariants unless called by a subclass (and, realistically, you probably don't even if you think you do), the Python Way is just to document that. Put a note in your docstring to the effect of "This is assumed to only be called by subclasses", and run with the assumption that clients will do the right thing - because if they don't, it becomes their own problem.
Background:
The answers to this post helped me understand the design pattern and why it's useful. However, I can't find anything on docs.python.org on the term "mixin" or how the notion of 'non-instantiated multiple-inheritance' is formalized.
Questions:
Where does the term "mixin" come from and how does python know a class is a mixin if the "*Mixin" pattern is not reserved?
Is there any motivation (besides convention) to append "Mixin" to classes that I want to use for shared methods that don't get inherited from a parent class?
Where does the term "mixin" come from and how does python know a
class is a mixin if the "*Mixin" pattern is not reserved?
The term is used in OOP generally, but I believe it just surfaced naturally, as it is the counterpart of a real-world "thing to be mixed into the final result": it means, literally, something that is mixed into your final class and changes its behavior.
As for Python recognizing it: it does not. Python only knows about classes, and you can name your composable superclasses as you want. There is no need to add a "Mixin" suffix to their name - it is not even a strong convention: if it does not make your project easier to understand, feel free to name your classes as you want.
A much stronger convention, which the language does not care about either (except in a few "soft" places), is that a leading "_" should be used to denote private methods or attributes in Python code. Besides that, the only names that Python cares about are the special names that are prefixed and suffixed with two underscores, "__".
Is there any motivation (besides convention) to append "Mixin" to classes that I want to use for shared methods that don't get inherited from a parent class?
As stated above, no. Just name them as you want.
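To illustrate that a mixin is just an ordinary class as far as Python is concerned, here is a small sketch; Describable, Animal and Dog are hypothetical names, and the mixin deliberately has no "Mixin" suffix:

```python
class Describable:
    """Hypothetical mixin: expects the host class to provide `name`."""

    def describe(self):
        return "<{} {}>".format(type(self).__name__, self.name)


class Animal:
    def __init__(self, name):
        self.name = name


class Dog(Describable, Animal):
    pass


print(Dog("Rex").describe())                  # <Dog Rex>
print([cls.__name__ for cls in Dog.__mro__])  # ['Dog', 'Describable', 'Animal', 'object']
```

Nothing marks Describable as special; it simply takes part in the normal method resolution order.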
I was doing some research about the use of encapsulation in object oriented programming using Python and I have stumbled with this topic that has mixed opinions about how encapsulated attributes work and about the usage of them.
I have written this piece of code, which only made matters more confusing to me:
    class Dog:
        def __init__(self, weight):
            self.weight = weight

        __color = ''

        def set_color(self, color):
            self.__color = color

        def get_color(self):
            print(self.__color)

    rex = Dog(59)
    rex.set_color('Black')
    rex.get_color()
    rex.color = 'White'
    rex.__color = rex.color
    print(rex.__color)
    rex.get_color()
The result is:
>Black
>White
>Black
I understand that the reason behind this is because when we do the assignment rex.__color = rex.color, a new attribute is created that does not point to the real __color of the instanced Dog.
My questions here are:
Is this a common scenario to occur?
Are private attributes a thing used really often?
In a language that does not have properties (eg. java) this is so common that it has become a standard, and all frameworks assume that getters/setters already exist.
However, in python you can have properties, which are essentially getters/setters that can be added later without altering the code that uses the variables. So, no reason to do it in python. Use the fields as public, and add properties if something changes later.
Note: use a single instead of a double underscore in your "private" variables. Not only is it the common convention, but the double underscore is also handled differently by the interpreter.
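To make the "handled differently" point concrete, here is a trimmed variant of the Dog class from the question (get_color returns instead of printing, purely to keep the sketch short). Inspecting the instance shows that the assignment made outside the class created a second, unrelated attribute:

```python
class Dog:
    __color = ''                      # mangled: stored as Dog._Dog__color

    def set_color(self, color):
        self.__color = color          # mangled: sets self._Dog__color

    def get_color(self):
        return self.__color           # mangled: reads self._Dog__color


rex = Dog()
rex.set_color('Black')
rex.__color = 'White'                 # outside the class body: no mangling

print(sorted(vars(rex)))              # ['_Dog__color', '__color']
print(rex.get_color())                # Black
```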
Encapsulation is not about data hiding but about keeping state and behaviour together. Data hiding is meant as a way to enforce encapsulation by preventing direct access to internal state, so the client code must use (public) methods instead. The main points here are 1/ to allow the object to maintain a coherent state (check the values, possibly update some other part of the state accordingly, etc.) and 2/ to allow implementation changes (internal state / private methods) without breaking the client code.
Languages like Java have no support for computed attributes, so the only way to maintain encapsulation in such languages is to make all attributes protected or private and to eventually provide accessors. Alas, some people never got the "eventually" part right and insist on providing read/write accessors for all attributes, which is complete nonsense.
Python has strong support for computed attributes through the descriptor protocol (mostly known via the generic property type, but you can of course write your own descriptors instead), so there's no need for explicit getters/setters: if your design dictates that some class should provide a publicly accessible attribute as part of its API, you can always start with just a public attribute, and if at some point you need to change the implementation you can just replace it with a computed attribute.
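A minimal sketch of that migration path, using hypothetical Circle classes: the first version exposes a plain attribute, the second replaces it with a property under the same public name, and the client function works unchanged with both.

```python
class Circle:
    """Version 1: `radius` is a plain public attribute."""

    def __init__(self, radius):
        self.radius = radius


class CircleV2:
    """Version 2: same public name, now a validated property."""

    def __init__(self, radius):
        self.radius = radius          # goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value          # single underscore: implementation detail


def double_radius(circle):
    # Client code: identical for both versions.
    circle.radius *= 2
    return circle.radius


print(double_radius(Circle(2)), double_radius(CircleV2(2)))   # 4 4
```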
This doesn't mean you should make all your attributes public! Most of the time you will have "implementation attributes" (attributes that support the internal state but are in no way part of the class API), and you definitely want to keep those protected (by prefixing them with a single leading underscore).
Note that Python doesn't try to technically enforce privacy; it's only a naming convention, and you can't prevent client code from accessing internal state. Nothing to be worried about here: very few people are stupid enough to bypass the official API without good reasons, and those who do know their code might break something and accept the consequences.
My question is quite general, but for clarity I'd like to give an example that is as concrete as possible: I was lately writing a class, which was derived from a matplotlib artist. A minimal working example would be the following:
    from matplotlib import text

    class TextChild(text.Text):
        def __init__(self):
            self._rotation = self.get_rotation()
The idea behind using an underscore self._rotation was to show the potential user not to access that attribute directly (i.e. to label it private). This turned out to be a bad idea, because text.Text also has an attribute called _rotation and I got very surprising results.
There are, of course, ways to deal with this.
- One is to use a different attribute name, say, self._rotation2, but the base class may change in the future, possibly introducing new attributes, and with a bit of bad luck the names might match again, which would break the derived class.
- Another solution would be to use name mangling, i.e. self.__rotation (the solution I chose). From what I understood, however, name mangling should be used as sparingly as possible, and if I have many private attributes there will be a lot of double underscores in the code.
So here is the question: Is there a preferred way of naming private class attributes when deriving from a class out of my own control that may change in the future?
Is there a preferred way of naming private class attributes when deriving from a class out of my own control that may change in the future?
It's really difficult to tell you how you should choose the names of identifiers in your code; this is up to you. Generally speaking, it's your job as a programmer to avoid name collisions, and some advanced IDEs can aid in this process.
For your question, I believe name mangling will avoid these name collisions, and it won't litter your code with underscores as you might think, provided you use the feature wisely. If you're reusing a lot of the same names, it's better to choose unique names instead. It's generally acceptable to use __name for attributes that you want to ensure belong to their own class. Please remember that private in Python isn't really private, it's pseudo-private: you'll still be able to access those attributes.
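A rough sketch of how mangling sidesteps the collision from the question. Base here is a pure-Python stand-in for a class you don't control (it is not matplotlib's Text), and child_rotation is a hypothetical helper:

```python
class Base:
    """Stand-in for a third-party class that keeps its own _rotation."""

    def __init__(self):
        self._rotation = 0.0

    def get_rotation(self):
        return self._rotation


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__rotation = 90.0        # stored as _Child__rotation

    def child_rotation(self):
        return self.__rotation


c = Child()
print(c.get_rotation(), c.child_rotation())   # 0.0 90.0
print(sorted(vars(c)))                         # ['_Child__rotation', '_rotation']
```

The base class's _rotation is left untouched even though both classes use the word "rotation" for their own state.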
Here's one trick that you can use to avoid name collisions:
>>> "name" in dir(Foo)
True
So if name is already in the namespace of class Foo, this single line will tell you. To get a list of all the attributes of class Foo, just call dir with Foo as its argument: dir(Foo).
Mainly this is a design issue, but if I were in your position I'd check with dir to ensure the uniqueness of my names and avoid overriding other names unintentionally. For example, if you read the code of the Python standard library, the use of the _name convention to denote that a name should not be accessed directly from outside the class is obvious in many places.
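One caveat worth sketching: dir on the class only sees class-level names, so attributes assigned in __init__ show up only on an instance. Base, _cache and the candidate names below are hypothetical:

```python
class Base:
    _cache = None                     # class attribute

    def __init__(self):
        self._rotation = 0.0          # instance attribute, created in __init__


# Names a subclass would like to use for its own state.
wanted = {"_rotation", "_cache", "_label"}

print(sorted(wanted & set(dir(Base))))     # ['_cache']
print(sorted(wanted & set(dir(Base()))))   # ['_cache', '_rotation']
```

Checking against an instance catches more collisions, though it still only reflects the base class as it is today.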
I recently posted a question on stackoverflow and I got a resolution.
Someone made a suggestion about my coding style, and I haven't received further input. I have the following question with reference to the prior query.
How can we declare private variables inside a class in python? I thought that by using a double underscore (__) the variable is treated as private. Please correct me.
As per the suggestion received before, we don't have to use a getter or setter method. Shouldn't we use a getter or setter or both? Please let me know your suggestion on this one.
Everything is public in Python, the __ is a suggestion by convention that you shouldn't use that function as it is an implementation detail.
This is not enforced by the language or runtime in any way, these names are decorated in a semi-obfuscated way, but they are still public and still visible to all code that tries to use them.
Idiomatic Python doesn't use get/set accessors, it is duplication of effort since there is no private scope.
You only use accessors when you want indirect access to a member variable to have code around it, and then you mark the member variable with __ as the start of its name and provide a function with the actual name.
You could go to great lengths with writing reams of code to try and protect the user from themselves using Descriptors and meta programming, but in the end you will end up with more code that is more to test and more to maintain, and still no guarantee that bad things won't happen. Don't worry about it - Python has survived 20 years this way so far, so it can't be that big of a deal.
PEP 8 (http://www.python.org/dev/peps/pep-0008/) has a section "Designing for inheritance" that should address most of these concerns.
To quote:
"We don't use the term "private" here, since no attribute is really
private in Python (without a generally unnecessary amount of work)."
Also:
"If your class is intended to be subclassed, and you have attributes
that you do not want subclasses to use, consider naming them with
double leading underscores and no trailing underscores."
If you've not read the entire section, I would encourage you to do so.
Update:
To answer the question (now that the title has changed): the Pythonic way to use private variables is to not use private variables. Trying to hide something in Python is seldom seen as Pythonic.
You can use Python properties instead of getters and setters. Just use an instance attribute and when you need something more complex, make this attribute a property without changing too much code.
http://adam.gomaa.us/blog/2008/aug/11/the-python-property-builtin/
Private variables:
If you use the double underscore at the beginning of your class members they are considered to be private, though not REALLY enforced by python. They simply get some naming tacked on to the front to prevent them from being easily accessed. Single underscore could be treated as "protected".
Getter/Setter:
You can use these if you want to do more to wrap the process and 'protect' your 'private' attributes. But it's, again, not required. You could also use properties, which provide getter/setter features.
1) http://docs.python.org/tutorial/classes.html#private-variables
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
(continue reading for more details about class-private variables and name mangling)
2) http://docs.python.org/library/functions.html#property
Somebody was nice enough to explain to me that __method() mangles but instead of bothering him further since there are a lot of other people who need help I was wondering if somebody could elaborate the differences further.
For example I don't need mangling but does _ stay private so somebody couldn't do instance._method()? Or does it just keep it from overwriting another variable by making it unique? I don't need my internal methods "hidden" but since they are specific to use I don't want them being used outside of the class.
From PEP 8:
_single_leading_underscore: weak "internal use" indicator. E.g.
from M import *
does not import objects whose name starts with an underscore.
single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g.
Tkinter.Toplevel(master, class_='ClassName')
__double_leading_underscore: when naming a class attribute, invokes name
mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).
__double_leading_and_trailing_underscore__: "magic" objects or
attributes that live in user-controlled namespaces. E.g. __init__,
__import__ or __file__. Never invent such names; only use them
as documented.
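The star-import behaviour mentioned in the first item can be checked with a throwaway in-memory module. This is only a sketch; "demo_mod" and the names inside it are made up:

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" is a made-up name.
demo_mod = types.ModuleType("demo_mod")
exec("public = 1\n_internal = 2", demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# Run the star import inside a fresh namespace so the result can be inspected.
namespace = {}
exec("from demo_mod import *", namespace)

print("public" in namespace)      # True
print("_internal" in namespace)   # False

# The underscore is only a hint: explicit access still works.
print(demo_mod._internal)         # 2
```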
Also, from David Goodger's Code Like a Pythonista:
Attributes: interface, _internal, __private
But try to avoid the __private form. I never use it. Trust me. If you
use it, you WILL regret it later.
Explanation:
People coming from a C++/Java background are especially prone to
overusing/misusing this "feature". But __private names don't work the
same way as in Java or C++. They just trigger a name mangling whose
purpose is to prevent accidental namespace collisions in subclasses:
MyClass.__private just becomes MyClass._MyClass__private. (Note that
even this breaks down for subclasses with the same name as the
superclass, e.g. subclasses in different modules.) It is possible to
access __private names from outside their class, just inconvenient and
fragile (it adds a dependency on the exact name of the superclass).
The problem is that the author of a class may legitimately think "this
attribute/method name should be private, only accessible from within
this class definition" and use the __private convention. But later on,
a user of that class may make a subclass that legitimately needs
access to that name. So either the superclass has to be modified
(which may be difficult or impossible), or the subclass code has to
use manually mangled names (which is ugly and fragile at best).
There's a concept in Python: "we're all consenting adults here". If
you use the __private form, who are you protecting the attribute from?
It's the responsibility of subclasses to use attributes from
superclasses properly, and it's the responsibility of superclasses to
document their attributes properly.
It's better to use the single-leading-underscore convention,
_internal. This isn't name mangled at all; it just indicates to
others to "be careful with this, it's an internal implementation
detail; don't touch it if you don't fully understand it". It's only a
convention though.
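The parenthetical about subclasses sharing the superclass's name can be reproduced in a few lines. In this sketch make_base is a hypothetical helper that stands in for "a class called Widget defined in another module":

```python
def make_base():
    # Stands in for a class named Widget defined in some other module.
    class Widget:
        def __init__(self):
            self.__token = "base"     # stored as _Widget__token

        def base_token(self):
            return self.__token

    return Widget


BaseWidget = make_base()


class Widget(BaseWidget):             # same class name as its superclass
    def __init__(self):
        super().__init__()
        self.__token = "sub"          # also _Widget__token: overwrites the base's


w = Widget()
print(w.base_token())                 # sub
print(sorted(vars(w)))                # ['_Widget__token']
```

Because mangling uses only the bare class name, both classes end up sharing one attribute.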
A single leading underscore is simply a convention that means, "You probably shouldn't use this." It doesn't do anything to stop someone from using the attribute.
A double leading underscore actually changes the name of the attribute so that two classes in an inheritance hierarchy can use the same attribute name, and they will not collide.
There is no access control in Python. You can access all attributes of a class, and that includes mangled names (as _class__variable). Concentrate on your code and API instead of trying to protect developers from themselves.