How to arrange/access private state variables from a Python class? - python

For the sake of the example, let's say I have a basic LED class that can blink different patterns on an LED. The LED would have a few attributes describing it's state at any given point in time such as blink pattern, color, current state. Let's also say there are few other attributes that are used by the class for things like timers, pattern counters/iterators, etc.
I want the consumer of the LED class to be able to ask it for it's state info which would include the first list of attributes but not the second.
My question is could someone explain the best practices for storing and returning this kind of state information?
Some thoughts:
I would think we want all these attributes to be private since we want others to interact with the LED class through a known API
I would want to be able to return all the common state variables in a single "get_state()" call rather than having to access each individually
Is there any benefit to having a logical grouping to these attributes in the class or do I just declare each individual one in the constructor?
I'm coming from a C background and trying to stop thinking in structs and do things the Python way.

You can make the attribute you want private by prefixing it with two underscores. Then, you can provide access to that attribute using the property decorator (basically a getter):
class LEDStuff:
def __init__(self, led_stat):
self.__led_stat = led_stat # behind the scene, name mangling happens
#property
def led_stat():
return self.__led_stat
Note that in python prefixing an attribute with two underscores does not make it private in the sense that an attribute is made private in java. It merely protects the attribute from accidental overriding.
In the python world, it's more common to prefix the attributes we want to protect from accidental overriding with one single leading underscore.

Related

How to use implementation inheritance?

How to use implementation inheritance in Python, that is to say public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class?
In other words, in the derived class:
accessing the public attribute x or protected attribute _x should look up x or _x respectively like usual, except it should skip the implementation inherited base classes;
accessing the private attribute __x should look up __x like usual, except it should look up x and _x instead of __x for the implementation inherited base classes.
In C++, implementation inheritance is achieved by using the private access specifier in the base class declarations of a derived class, while the more common interface inheritance is achieved by using the public access specifier:
class A: public B, private C, private D, public E { /* class body */ };
For instance, implementation inheritance is needed to implement the class Adapter design pattern which relies on class inheritance (not to be confused with the object Adapter design pattern which relies on object composition) and consists in converting the interface of an Adaptee class into the interface of a Target abstract class by using an Adapter class that inherits both the interface of the Target abstract class and the implementation of the Adaptee class (cf. the Design Patterns book by Erich Gamma et al.):
Here is a Python program specifying what is intended, based on the above class diagram:
import abc
class Target(abc.ABC):
#abc.abstractmethod
def request(self):
raise NotImplementedError
class Adaptee:
def __init__(self):
self.state = "foo"
def specific_request(self):
return "bar"
class Adapter(Target, private(Adaptee)):
def request(self):
# Should access self.__state and Adaptee.specific_request(self)
return self.__state + self.__specific_request()
a = Adapter()
# Test 1: the implementation of Adaptee should be inherited
try:
assert a.request() == "foobar"
except AttributeError:
assert False
# Test 2: the interface of Adaptee should NOT be inherited
try:
a.specific_request()
except AttributeError:
pass
else:
assert False
You don't want to do this. Python is not C++, nor is C++ Python. How classes are implemented is completely different and so will lead to different design patterns. You do not need to use the class adapter pattern in Python, nor do you want to.
The only practical way to implement the adapter pattern in Python is either by using composition, or by subclassing the Adaptee without hiding that you did so.
I say practical here because there are ways to sort of make it work, but this path would take a lot of work to implement and is likely to introduce hard to track down bugs, and would make debugging and code maintenance much, much harder. Forget about 'is it possible', you need to worry about 'why would anyone ever want to do this'.
I'll try to explain why.
I'll also tell you how the impractical approaches might work. I'm not actually going to implement these, because that's way too much work for no gain, and I simply don't want to spend any time on that.
But first we have to clear several misconceptions here. There are some very fundamental gaps in your understanding of Python and how it's model differs from the C++ model: how privacy is handled, and compilation and execution philosophies, so lets start with those:
Privacy models
First of all, you can't apply C++'s privacy model to Python, because Python has no encapsulation privacy. At all. You need to let go of this idea, entirely.
Names starting with a single underscore are not actually private, not in the way C++ privacy works. Nor are they 'protected'. Using an underscore is just a convention, Python does not enforce access control. Any code can access any attribute on instances or classes, whatever naming convention was used. Instead, when you see a name that start with an underscore you can assume that the name is not part of the conventions of a public interface, that is, that these names can be changed without notice or consideration for backwards compatibility.
Quoting from the Python tutorial section on the subject:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
It's a good convention, but not even something you can rely on, consistently. E.g. the collections.namedtuple() class generator generates a class with 5 different methods and attributes that all start with an underscore but are all meant to be public, because the alternative would be to place arbitrary restrictions on what attribute names you can give the contained elements, and making it incredibly hard to add additional methods in future Python versions without breaking a lot of code.
Names starting with two underscores (and none at the end), are not private either, not in a class encapsulation sense such as the C++ model. They are class-private names, these names are re-written at compile time to produce a per-class namespace, to avoid collisions.
In other words, they are used to avoid a problem very similar to the namedtuple issue described above: to remove limits on what names a subclass can use. If you ever need to design base classes for use in a framework, where subclasses should have the freedom to name methods and attributes without limit, that's where you use __name class-private names. The Python compiler will rewrite __attribute_name to _ClassName__attribute_name when used inside a class statement as well as in any functions that are being defined inside a class statement.
Note that C++ doesn't use names to indicate privacy. Instead, privacy is a property of each identifier, within a given namespace, as processed by the compiler. The compiler enforces access control; private names are not accessible and will lead to compilation errors.
Without a privacy model, your requirement where "public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class" are not attainable.
Compilation and execution models
C++
C++ compilation produces binary machine code aimed at execution directly by your CPU. If you want to extend a class from another project, you can only do so if you have access to additional information, in the form of header files, to describe what API is available. The compiler combines information in the header files with tables stored with the machine code and your source code to build more machine code; e.g. inheritance across library boundaries is handled through virtualisation tables.
Effectively, there is very little left of the objects used to construct the program with. You generally don't create references to class or method or function objects, the compiler has taken those abstract ideas as inputs but the output produced is machine code that doesn't need most of those concepts to exist any more. Variables (state, local variables in methods, etc.) are stored either on the heap or on the stack, and the machine code accesses these locations directly.
Privacy is used to direct compiler optimisations, because the compiler can, at all times, know exactly what code can change what state. Privacy also makes virtualisation tables and inheritance from 3rd-party libraries practical, as only the public interface needs to be exposed. Privacy is an efficiency measure, primarily.
Python
Python, on the other hand, runs Python code using a dedicated interpreter runtime, itself a piece of machine code compiled from C code, which has a central evaluation loop that takes Python-specific op-codes to execute your code. Python source code is compiled into bytecode roughly at the module and function levels, stored as a nested tree of objects.
These objects are fully introspectable, using a common model of attributes, sequences and mappings. You can subclass classes without having to have access to additional header files.
In this model, a class is an object with references to base classes, as well as a mapping of attributes (which includes any functions which become bound methods through access on instances). Any code to be executed when a method is called on an instance is encapsulated in code objects attached to function objects stored in the class attribute mapping. The code objects are already compiled to bytecode, and interaction with other objects in the Python object model is through runtime lookups of references, with the attribute names used for those lookups stored as constants within the compiled bytecode if the source code used fixed names.
From the point of view of executing Python code, variables (state and local variables) live in dictionaries (the Python kind, ignoring the internal implementation as hash maps) or, for local variables in functions, in an array attached to the stack frame object. The Python interpreter translates access to these to access to values stored on the heap.
This makes Python slow, but also much more flexible when executing. You can not only introspect the object tree, most of the tree is writeable letting you replace objects at will and so change how the program behaves in nearly limitless ways. And again, there are no privacy controls enforced.
Why use class adapters in C++, and not in Python
My understanding is that experienced C++ coders will use a class adapter (using subclassing) over an object adapter (using composition), because they need to pass compiler-enforced type checks (they need to pass the instances to something that requires the Target class or a subclass thereof), and they need to have fine control over object lifetimes and memory footprints. So, rather than have to worry about the lifetime or memory footprint of an encapsulated instance when using composition, subclassing gives you more complete control over the instance lifetime of your adapter.
This is especially helpful when it might not be practical or even possible to alter the implementation of how the adaptee class would control instance lifetime. At the same time, you wouldn't want to deprive the compiler from optimisation opportunities offered by private and protected attribute access. A class that exposes both the Target and Adaptee interfaces offers fewer options for optimisation.
In Python you almost never have to deal with such issues. Python's object lifetime handling is straightforward, predictable and works the same for every object anyway. If lifetime management or memory footprints were to become an issue you'd probably already be moving the implementation to an extension language like C++ or C.
Next, most Python APIs do not require a specific class or subclass. They only care about the right protocols, that is, if the right methods and attributes are implemented. As long as your Adapter has the right methods and attributes, it'll do fine. See Duck Typing; if your adapter walks like a duck, and talks like a duck, it surely must be a duck. It doesn't matter if that same duck can also bark like a dog.
The practical reasons why you don't do this in Python
Let's move to practicalities. We'll need to update your example Adaptee class to make it a bit more realistic:
class Adaptee:
def __init__(self, arg_foo=42):
self.state = "foo"
self._bar = arg_foo % 17 + 2 * arg_foo
def _ham_spam(self):
if self._bar % 2 == 0:
return f"ham: {self._bar:06d}"
return f"spam: {self._bar:06d}"
def specific_request(self):
return self._ham_spam()
This object not only has a state attribute, it also has a _bar attribute and a private method _ham_spam.
Now, from here on out I'm going to ignore the fact that your basic premise is flawed because there is no privacy model in Python, and instead re-interpret your question as a request to rename the attributes.
For the above example that would become:
state -> __state
_bar -> __bar
_ham_spam -> __ham_spam
specific_request -> __specific_request
You now have a problem, because the code in _ham_spam and specific_request has already been compiled. The implementation for these methods expects to find _bar and _ham_spam attributes on the self object passed in when called. Those names are constants in their compiled bytecode:
>>> import dis
>>> dis.dis(Adaptee._ham_spam)
8 0 LOAD_FAST 0 (self)
2 LOAD_ATTR 0 (_bar)
4 LOAD_CONST 1 (2)
6 BINARY_MODULO
# .. etc. remainder elided ..
The LOAD_ATTR opcode in the above Python bytecode disassembly excerpt will only work correctly if the local variable self has an attribute named _bar.
Note that self can be bound to an instance of Adaptee as well as of Adapter, something you'd have to take into account if you wanted to change how this code operates.
So, it is not enough to simply rename method and attribute names.
Overcoming this problem would require one of two approaches:
intercept all attribute access on both the class and instance levels to translate between the two models.
rewriting the implementations of all methods
Neither of these is a good idea. Certainly neither of them are going to be more efficient or practical, compared to creating a composition adapter.
Impractical approach #1: rewrite all attribute access
Python is dynamic, and you could intercept all attribute access on both the class and the instance levels. You need both, because you have a mix of class attributes (_ham_spam and specific_request), and instance attributes (state and _bar).
You can intercept instance-level attribute access by implementing all methods in the Customizing attribute access section (you don't need __getattr__ for this case). You'll have to be very careful, because you'll need access to various attributes of your instances while controlling access to those very attributes. You'll need to handle setting and deleting as well as getting. This lets you control most attribute access on instances of Adapter().
You would do the same at the class level by creating a metaclass for whatever class your private() adapter would return, and implementing the exact same hook methods for attribute access there. You'll have to take into account that your class can have multiple base classes, so you'd need to handle these as layered namespaces, using their MRO ordering. Attribute interactions with the Adapter class (such as Adapter._special_request to introspect the inherited method from Adaptee) will be handled at this level.
Sounds easy enough, right? Except than the Python interpreter has many optimisations to ensure it isn't completely too slow for practical work. If you start intercepting every attribute access on instances, you will kill a lot of these optimisations (such as the method call optimisations introduced in Python 3.7). Worse, Python ignores the attribute access hooks for special method lookups.
And you have now injected a translation layer, implemented in Python, invoked multiple times for every interaction with the object. This will be a performance bottleneck.
Last but not least, to do this in a generic way, where you can expect private(Adaptee) to work in most circumstances is hard. Adaptee could have other reasons to implement the same hooks. Adapter or a sibling class in the hierarchy could also be implementing the same hooks, and implement them in a way that means the private(...) version is simply bypassed.
Invasive all-out attribute interception is fragile and hard to get right.
Impractical approach #2: rewriting the bytecode
This goes down the rabbit hole quite a bit further. If attribute rewriting isn't practical, how about rewriting the code of Adaptee?
Yes, you could, in principle, do this. There are tools available to directly rewrite bytecode, such as codetransformer. Or you could use the inspect.getsource() function to read the on-disk Python source code for a given function, then use the ast module to rewrite all attribute and method access, then compile the resulting updated AST to bytecode. You'd have to do so for all methods in the Adaptee MRO, and produce a replacement class dynamically that'll achieve what you want.
This, again, is not easy. The pytest project does something like this, they rewrite test assertions to provide much more detailed failure information than otherwise possible. This simple feature requires a 1000+ line module to achieve, paired with a 1600-line test suite to ensure that it does this correctly.
And what you've then achieved is bytecode that doesn't match the original source code, so anyone having to debug this code will have to deal with the fact that the source code the debugger sees doesn't match up with what Python is executing.
You'll also lose the dynamic connection with the original base class. Direct inheritance without code rewriting lets you dynamically update the Adaptee class, rewriting the code forces a disconnect.
Other reason these approaches can't work
I've ignored a further issue that neither of the above approaches can solve. Because Python doesn't have a privacy model, there are plenty of projects out there where code interacts with class state directly.
E.g., what if your Adaptee() implementation relies on a utility function that will try to access state or _bar directly? It's part of the same library, the author of that library would be well within their rights to assume that accessing Adaptee()._bar is safe and normal. Neither attribute intercepting nor code rewriting will fix this issue.
I also ignored the fact that isinstance(a, Adaptee) will still return True, but if you have hidden it's public API by renaming, you have broken that contract. For better or worse, Adapter is a subclass of Adaptee.
TLDR
So, in summary:
Python has no privacy model. There is no point in trying to enforce one here.
The practical reasons that necessitate the class adapter pattern in C++, don't exist in Python
Neither dynamic attribute proxying nor code tranformation is going to be practical in this case and introduce more problems than are being solved here.
You should instead use composition, or just accept that your adapter is both a Target and an Adaptee and so use subclassing to implement the methods required by the new interface without hiding the adaptee interface:
class CompositionAdapter(Target):
def __init__(self, adaptee):
self._adaptee = adaptee
def request(self):
return self._adaptee.state + self._adaptee.specific_request()
class SubclassingAdapter(Target, Adaptee):
def request(self):
return self.state + self.specific_request()
Python doesn't have a way of defining private members like you've described (docs).
You could use encapsulation instead of inheritance and call the method directly, as you noted in your comment. This would be my preferred approach, and it feels the most "pythonic".
class Adapter(Target):
def request(self):
return Adaptee.specific_request(self)
In general, Python's approach to classes is much more relaxed than what is found in C++. Python supports duck-typing, so there is no requirement to subclass Adaptee, as long as the interface of Target is satisfied.
If you really want to use inheritance, you could override interfaces you don't want exposed to raise an AttributeError, and use the underscore convention to denote private members.
class Adaptee:
def specific_request(self):
return "foobar"
# make "private" copy
_specific_request = specific_request
class Adapter(Target, Adaptee):
def request(self):
# call "private" implementation
return self._specific_request()
def specific_request(self):
raise AttributeError()
This question has more suggestions if you want alternatives for faking private methods.
If you really wanted true private methods, you could probably implement a metaclass that overrides object.__getattribute__. But I wouldn't recommend it.

Python encapsulated attributes on real world

I was doing some research about the use of encapsulation in object oriented programming using Python and I have stumbled with this topic that has mixed opinions about how encapsulated attributes work and about the usage of them.
I have programmed this piece of code that only made matters more confuse to me:
class Dog:
def __init__(self,weight):
self.weight = weight
__color =''
def set_color(self,color):
self.__color = color
def get_color(self):
print(self.__color)
rex = Dog(59)
rex.set_color('Black')
rex.get_color()
rex.color = 'White'
rex.__color = rex.color
print(rex.__color)
rex.get_color()
The result is:
>Black
>White
>Black
I understand that the reason behind this is because when we do the assignment rex.__color = rex.color, a new attribute is created that does not point to the real __color of the instanced Dog.
My questions here are:
Is this a common scenario to occur?
Are private attributes a thing used really often?
In a language that does not have properties (eg. java) this is so common that it has become a standard, and all frameworks assume that getters/setters already exist.
However, in python you can have properties, which are essentially getters/setters that can be added later without altering the code that uses the variables. So, no reason to do it in python. Use the fields as public, and add properties if something changes later.
Note: use single instead of double underscore in your "private" variables. Not only it's the common convention, but also, double underscore is handled differently by the interpreter.
Encapsulation is not about data hidding but about keeping state and behaviour together. Data hidding is meant as a way to enforce encapsulation by preventing direct access to internal state, so the client code must use (public) methods instead. The main points here are 1/ to allow the object to maintain a coherent state (check the values, eventually update some other part of the state accordingly etc) and 2/ to allow implementation changes (internal state / private methods) without breaking the client code.
Languages like Java have no support for computed attributes, so the only way to maintain encapsulation in such languages is to make all attributes protected or private and to eventally provide accessors. Alas, some people never got the "eventually" part right and insist on providing read/write accessors for all attributes, which is a complete nonsense.
Python has a strong support for computed attributes thru the descriptor protocol (mostly known via the generic property type but you can of course write your own desciptors instead), so there's no need for explicit getters/setters - if your design dictates that some class should provide a publicly accessible attribute as part of it's API, you can always start with just a public attribute and if at some point you need to change implementation you can just replace it with a computed attribute.
This doesn't mean you should make all your attributes public !!! - most of the time, you will have "implementation attributes" (attributes that support the internal state but are in no way part of the class API), and you definitly want to keep those protected (by prefixing them with a single leading underscore).
Note that Python doesn't try to technically enforce privacy, it's only a naming convention and you can't prevent client code to access internal state. Nothing to be worried about here, very few peoples stupid enough to bypass the official API without good reasons, and then they know their code might break something and assume all consequences.

Using getProperty() in a class instead of self.property?

I don't know the proper terminology for this so couldn't find anything online about this.
Take this example code:
def Fruit(object):
def __init__(self, color):
self._color = color
def color(self):
return self._color
Now, say I want to check to see whether a fruit is red:
def isRed(self):
if self._color == "red":
return True
return False
Would work perfectly fine. However, so does
def isRed(self):
if self.color() == "red":
return True
return False
Is there a reason why it is good practice to have a getProperty function? (I'm assuming it is, since an MIT professor, whose course I'm taking, does this with his classes and expects students to do the same on their homework.)
Are either of these two examples different, and why is it against convention to simply refer to the property by self.property?
Edit: Added underscore to make self._color for convention.
TL;DR: Not all general programming best practices aren't Python best practices. Getter and setter methods are a general (OOP) best practice, but not a Python best practice. Instead, use plain Python attributes when you can and switch to Python #propertys as-needed.
It many object-oriented programming languages (e.g. Java and C++), it is regarded as good practice to:
make data members (a.k.a. "attributes") private
provide getter and / or setter methods to access them
Why?
Enable change through "encapsulation", by keep interface stable while keeping implementation flexible (decoupling)
Allow for more granular access levels
Let's look at these in detail:
"encapsulation" in object orientation
One of the core ideas of object orientation is that bundling the definition of small chunks of data together with functionality related to that data makes imperative/"structured"/procedural programs more manageable and evolvable.
These bundles are called "objects". Each "class" is a template of a group objects with the same data structure (though potentially different data) and the same related functionality.
The data definition are the (non-static) data members of a class (the "attributes" of the objects). The related functionality is encoded in function members ("methods").
This can also be seen as a way to build new user-defined types. (Each class is a type, each object is kinda like a value.)
Often, the methods need more guarantees about the attribute values to work properly than the types of the data members already provide. Let's say you have
class Color() {
float red;
float green;
float blue;
float hue() {
return // ... some formula
}
float brightness {
return // ... some formula
}
}
If red, green and blue are in the range [0, 1], the implementation of these methods would probably depend on that fact. Similarly, if they were in the range [0, 256). And whatever the class-internal convention is, it is the task of the methods of that class to uphold it and only assign values to the data members that are acceptable.
Though, usually, objects of different classes have to interact for a meaningful object-oriented program. But you don't want to think about another class' internal conventions, just because you're accessing it, as that would require a lot of lookups to find out what those conventions are. So you shouldn't assign to the data members of objects of a class from code outside that class.
To avoid this happening by mistake or negligence, the widely accepted best practice in these languages is to declare all data members private. But that means that they cannot be read from outside, either! If the value is of interest to the outside, we can work around this by providing a non-private getter method that does nothing but provide the value of the attribute.
Enabling change while limiting ripple effects
Say the outside (e.g. another class) must be able to set the value of some attribute of your class. And say there aren't any restrictions necessary beyond what that attribute's type already imposes. Should you make that attribute public? (Still assuming this isn't in Python!) No! Instead, provide a setter method that does nothing but taking a value as argument and assigning it to the attribute!
Seems kinda dull, so why do that? So that we can change our mind later!
New side effect
Say you want to log to the console/terminal (std-out) each time the red-component of your color object changes. (For whatever reason.)
In a (setter) method, you add one line of code and it does that, without requiring any change in the callers.
But if you need to first switch from assigning to a public attribute to calling a setter method, all the code pieces doing assignments to these attributes (which might be many by that time) have to be changed, too! (Don't forget to make the attribute private, so that none will be forgotten.)
So it's better to have only private attributes from the beginning, and add setter method when code outside the class has to be able to set the value.
Change of internal representation
Say you just noticed that for your application, colors should really be represented internally as hue, value and saturation rather than red, green and blue components.
If you have setter and getter methods, the ones for red, green and blue will become more complicated due to the neccesary conversion calculations. (But the brightness and hue method will become much simpler.) Still, changing them can be much less work than having to change all the code outside the class that uses the class. As the interface stays the same, callers won't have to be changed at all and won't notice a difference.
But if you need to first switch from assigning to a public attribute to calling a setter method ... well, we've been there, haven't we?
decoupling
So accessor methods methods (that what we call getters and setters) help you decouple the public interface of a class from its internal implementation, and thereby the objects from their users. This allows you to change the internal implementation without breaking the public interface, so that code using your class doesn't have to be changed when you do that.
granular access levels
Need an attribute that can only be read from the outside, but not written from the outside? Easy: Provide only a getter method, but no setter method (and have the attribute itself be private).
Less common, but more common than you might think:
Need an attribute that can only be written from the outside, but not read from the outside? Easy: Provide only a setter method, but no getter method (and have the attribute itself be private).
Not sure if your attribute should be accessed (and accessible) from outside your class? Make it private and don't provide any getter and setter for now. You can always add them later. (And then think about what visibility level they should have.)
As you see, there's no reason to ever have a non-private attribute in a mutable object. (Assuming that the runtime overhead doesn't matter for your application (it probably doesn't, indeed) or is optimized away by the compiler (it probably is, at least partially).)
Not a security feature!
Note that "visibility" levels of attributes and methods are not meant for providing application security or privacy (they don't). They're tools to help programmers from making mistakes (by avoiding them to access stuff they shouldn't by accident), but they won't keep adversarial programmers from accessing that stuff anyway. Or, for that matter, honest programmers who think they know what they're doing (whether they do know or not) and willing to take the risk.
Python is different
In Python, everything is public
While Python is also imperative, "structured", procedural and very object-oriented, it takes a much more laid back approach to visibility. There is no real "private" visibility level in Python, nor "protected" or "package" (default in Java) levels.
Essentially, everything in a Python class is public.
This makes sense when Python is used as scripting language for quick-and-dirty ad-hoc solutions that you'll probably code once and then throw away (or keep like that without further development).
If you make more involved applications in Python (and that's certainly possible with Python and also done a lot) you'll probably want to distinguish between a class' public interface and its internal implementation details. Python provides two levels of "hiding" internal members (both, functions and data attributes):
by convention: _ prefix
by name mangling: __ prefix
"hiding" by convention
Beginning a name with _ signals to everyone outside a namespace (whether a class or a module or a package):
You shouldn't access this, unless you know what you're doing. And I (the implementor of stuff in that namespace) may change that at will, so you probably don't know what you will be doing by accessing it. Stuff may break. And if it does, it'll be your (the one accessing it) fault, not mine (the one implementing it). This member isn't a part of this namespace's public interface.
Yes, you can access it. That doesn't mean that you should. We're all adults here. Be responsible.
And you should adhere to that, even if you'd happen to not be an adult, yet.
hiding by name mangling
Beginning a name with __ signals to everyone outside a namespace (whether a class or a module or a package):
The same as with _ applies, only, you know, even stronger!
Additionally, and only if the namespace is a class (and the attribute name ends in no more than one underscore):
To make sure you don't access these things from outside by accident, Python "mangles" the names of these attributes for access from outside the class. The resulting name is perfectly predictable (it's _ + (simple) class name + original attribute name), so you can still access these things, but you most certainly won't simply by mistake.
Also, this can help avoid name collisions between members of base classes and members of their subclasses. (Though, it won't work as intended if the classes share the same class name, as the "simple class name" is used, not including modules and packages.)
In either case, you may have good reasons to access these values anyway (e.g. for debugging) and Python doesn't want to stand in your way when you do (or with name mangling, at most only slightly so.)
Python has method-based "properties" that can be accessed just like data attributes
So, as there is no real private in Python, we can't apply the pattern/style from Java and C++. But we might still need stable interfaces to do serious programming.
Good thing that in Python you can replace a data attribute with methods, without having to change its users. Pils19's answer provides an example:
class Fruit(object):
def __init__(self, color):
self._color = color
#property
def color(self):
return self._color
(Documentation of this decorator here.)
If we also provide a property-setter-method and a property-deleter-method ...
class Fruit(object):
def __init__(self, color):
self._color = color
#property
def color(self):
return self._color
#color.setter
def color(self, c):
self._color = c
#color.deleter
def color(self):
del self._color
Then this will act equivalent to a simple data attribute:
class Fruit(object):
def __init__(self, c):
self.color = c
But now we have all the freedom of methods. We can leave out any of them (most usual is to only have the getter, so you have a read-only attribute), we can give them additional or different behavior, etc.
This is the recommended approach in Python:
use (public) data members if in doubt
prefix with _ for implementation details
if/when you need additional/different behavior or to disable reading, writing or deleting, use properties or replace public data members with properties
Your professor
I'm assuming [that there is a good practice to define non-property getters and setters in Python], since an MIT professor, whose course I'm taking, does this with his classes and expects students to do the same on their homework.
Are you sure this is what your professor did, or did he use Python's properties mechanism?
If he did, is this a class about Python or does it just so happen that Python is used for the examples (and that your professor also used it to demonstrate something actually only applicable to other languages)?
And let's not forget: Even MIT professors might be forced to teach classes where they aren't experts on every aspect of the subject.
Normally it's a good practice to you the #Property decorator. And have the internal properties with an single leading underscore. For you example it would look like:
class Fruit(object):
def __init__(self, color):
self._color = color
#property
def color(self):
return self._color

Attribute naming convention when deriving from class that may change in the future

My question is quite general, but for clarity I'd like to give an example that is as concrete as possible: I was lately writing a class, which was derived from a matplotlib artist. A minimal working example would be the following:
from matplotlib import text
class TextChild(text.Text):
def __init__(self):
self._rotation = self.get_rotation()
The idea behind using an underscore self._rotation was to show the potential user not to access that attribute directly (i.e. to label it private). This turned out to be a bad idea, because text.Text also has an attribute called _rotation and I got very surprising results.
There are, of course, ways to deal with this.
One is to use a different attribute name, say, self._rotation2, but
the base class may be subject to change in the future, possibly
introducing new attributes and with a bit of bad luck names might
again match, which would break the derived class.
Another solution would be to use name mangling, i.e.
self.__rotation (the solution I chose). From what I understood,
however, name mangling should be used as sparsely as possible and if
I have many private attributes there will be a lot of double
underscores in the code.
So here is the question: Is there a preferred way of naming private class attributes when deriving from a class out of my own control that may change in the future?
Is there a preferred way of naming private class attributes when deriving from a class out of my own control that may change in the future?
It's really difficult to tell how you should choose the name of identifiers in your code, this is open to you. Generally speaking, it's your job as a programmer to avoid name collisions, some advanced IDEs can aide in this process.
For you question I believe using name mangling will definitely avoid name collisions somehow, this won't litter your code with underscores as you might think given that you use this feature wisely. If you're using a lot of redundant names, it's better to choose unique names instead. It's generally acceptable to use __name for attributes that you would like to ensure that they belong to their classes and please remember private in Python isn't really private, it's really pseudo-private. You'll still be able to access those attributes.
Here's one trick that you can use to avoid name collisions:
>>> "name" in dir(Foo)
True
So if name is already there in the namespace of class Foo, you would know from this single line and to get a list of all the attributes of class Foo just call dir with Foo as its argument: dir(Foo).
Mainly this is a design issue, but if I were in your position I'd opt to check with dir to ensure the uniqueness of my names to avoid overriding other names unintentionally. For example, if you read the codes of Python standard library, in many places the use of _name naming convention to denote this name should not be directly accessed from outside the class is pretty obvious.

Python classes, how to use them style-wise, and the Single Responsibility Principle [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I've been programming in Python for some time and have covered some knowledge in Python style but still have a problem on how to use classes properly.
When reading object oriented lecture I often find rules like Single Responsibility Principle that state
"The Single Responsibility Principle says that a class should have
one, and only one, reason to change"
Reading this, I might think of breaking one class into two, like:
class ComplicatedOperations(object):
def __init__(self, item):
pass
def do(self):
...
## lots of other functions
class CreateOption(object):
def __init__(self, simple_list):
self.simple_list = simple_list
def to_options(self):
operated_data = self.transform_data(self.simple_list)
return self.default_option() + operated_data
def default_option(self):
return [('', '')]
def transform_data(self, simple_list):
return [self.make_complicated_operations_that_requires_losts_of_manipulation(item)
for item in simple_list]
def make_complicated_operations_that_requires_losts_of_manipulation(self, item):
return ComplicatedOperations(item).do()
This, for me, raises lots of different questions; like:
When should I use class variables or pass arguments in class functions?
Should the ComplicatedOperations class be a class or just a bunch of functions?
Should the __init__ method be used to calculate the final result. Does that makes that class hard to test.
What are the rules for the pythonists?
Edited after answers:
So, reading Augusto theory, I would end up with something like this:
class ComplicatedOperations(object):
def __init__(self):
pass
def do(self, item):
...
## lots of other functions
def default_option():
return [('', '')]
def complicate_data(item):
return ComplicatedOperations().do(item)
def transform_data_to_options(simple_list):
return default_option() + [self.complicate_data(item)
for item in simple_list]
(Also corrected a small bug with default_option.)
When should I use class variables or pass arguments in class functions
In your example I would pass item into the do method. Also, this is related to programming in any language, give a class only the information it needs (Least Authority), and pass everything that is not internal to you algorithm via parameters (Depedency Injection), so, if the ComplicatedOperations does not need item to initialize itself, do not give it as a init parameter, and if it needs item to do it's job, give it as a parameter.
Should the ComplicatedOperations class be a class or just a bunch of functions
I'd say, depends. If you're using various kinds of operations, and they share some sort of interface or contract, absolutely. If the operation reflects some concept and all the methods are related to the class, sure. But if they are loose and unrelated, you might just use functions or think again about the Single Responsability and split the methods up into other classes
Should the init method be used to calculate the final result. Does that makes that class hard to test.
No, the init method is for initialization, you should do its work on a separated method.
As a side note, because of the lack of context, I did not understand what is CreateOption's role. If it is only used as show above, you might as well just remove it ...
I personally think of classes as of concepts. I'd define a Operation class which behaves like an operation, so contains a do() method, and every other method/property that may make it unique.
As mgilson correctly says, if you cannot define and isolate any concept, maybe a simple functional approach would be better.
To answer your questions:
you should use class attributes when a certain property is shared among the instances (in Python class attributes are initialized at compile time, so different object will see the same value. Usually class attributes should be constants). Use instance attributes to have object-specific properties to use in its methods without passing them. This doesn't mean you should put everything in self, but just what you consider characterising for your object. Use passed variables to have values that do not regard your object and may depend from the state of external objects (or on the execution of the program).
As said above, I'd keep one single class Operation and use a list of Operation objects to do your computations.
the init method would just instantiate the object and make all the processing needed for the proper behaviour of the object (in other words make it read to use).
Just think about the ideas you're trying to model.
A class generally represents a type of object. Class instances are specific objects of that type. A classic example is an Animal class. a cat would be an instance of Animal. class variables (I assume you mean those that belong to the instance rather than the class object itself), should be used for attributes of the instance. In this case, for example, colour could be a class attribute, which would be set as cat.colour = "white" or bear.colour = "brown". Arguments should be used where the value could come from some source outside the class. If the Animal class has a sleep method, it might need to know the duration of the sleep and posture that the animal sleeps in. duration would be an argument of the method, since it has no relation on the animal, but posture would be a class variable since it is determined by the animal.
In python, a class is typically used to group together a set of functions and variables which share a state. Continuing with the above example, a specific animal has a state which is shared across its methods and is defined by its attributes. If your class is just a group of functions which don't in any way depend on the state of the class, then they could just as easily be separate functions.
If __init__ is used to calculate the final result (which would have to be stored in an attribute of the class since __init__ cannot return a result), then you might as well use a function. A common pattern, however, is to do a lot of processing in __init__ via several other, sometimes private, methods of the class. The reason for this is that large complicated functions are often easier to test if they are broken down into smaller, distinct tasks, each of which can then be tested individually. However, this is usually only done when a class is needed anyway.
One approach to the whole business is to start out by deciding what functionality you need. When you have a group of functions or variables which all act on or apply to the same object, then it is time to move them into a class. Remember that Object Oriented Programming (OOP) is a design method suited to some tasks, but is not inherently superiour to functional programming (in fact, some programmers would argue the opposite!), so there's no need to use classes unless there is actually a need.
Classes are an organizational structure. So, if you are not using them to organize, you are doing it wrong. :)
There are several different things you can use them for organizing:
Bundle data with methods that use said data, defines one spot that the code will interact with this data
Bundle like functions together, provides understandable api since 'everyone knows' that all math functions are in the math object
Provide defined communications between methods, sets up a 'conveyor belt' of operations with a defined interface. Each operation is a black box, and can change arbitrarily, so long as it keeps to the standard
Abstract a concept. This can include sub classes, data, methods, so on and so forth all around some central idea like database access. This class then becomes a component you can use in other projects with a minimal amount of retooling
If you don't need to do some organizational thing like the above, then you should go for simplicity and program in a procedural/functional style. Python is about having a toolbox, not a hammer.

Categories