How to use implementation inheritance in Python, that is to say public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class?
In other words, in the derived class:
accessing the public attribute x or protected attribute _x should look up x or _x respectively like usual, except it should skip the implementation inherited base classes;
accessing the private attribute __x should look up __x like usual, except it should look up x and _x instead of __x for the implementation inherited base classes.
In C++, implementation inheritance is achieved by using the private access specifier in the base class declarations of a derived class, while the more common interface inheritance is achieved by using the public access specifier:
class A: public B, private C, private D, public E { /* class body */ };
For instance, implementation inheritance is needed to implement the class Adapter design pattern which relies on class inheritance (not to be confused with the object Adapter design pattern which relies on object composition) and consists in converting the interface of an Adaptee class into the interface of a Target abstract class by using an Adapter class that inherits both the interface of the Target abstract class and the implementation of the Adaptee class (cf. the Design Patterns book by Erich Gamma et al.):
Here is a Python program specifying what is intended, based on the above class diagram:
import abc

class Target(abc.ABC):
    @abc.abstractmethod
    def request(self):
        raise NotImplementedError

class Adaptee:
    def __init__(self):
        self.state = "foo"
    def specific_request(self):
        return "bar"

class Adapter(Target, private(Adaptee)):
    def request(self):
        # Should access self.__state and Adaptee.specific_request(self)
        return self.__state + self.__specific_request()

a = Adapter()

# Test 1: the implementation of Adaptee should be inherited
try:
    assert a.request() == "foobar"
except AttributeError:
    assert False

# Test 2: the interface of Adaptee should NOT be inherited
try:
    a.specific_request()
except AttributeError:
    pass
else:
    assert False
You don't want to do this. Python is not C++, nor is C++ Python. How classes are implemented is completely different and so will lead to different design patterns. You do not need to use the class adapter pattern in Python, nor do you want to.
The only practical way to implement the adapter pattern in Python is either by using composition, or by subclassing the Adaptee without hiding that you did so.
I say practical here because there are ways to sort of make it work, but this path would take a lot of work to implement and is likely to introduce hard to track down bugs, and would make debugging and code maintenance much, much harder. Forget about 'is it possible', you need to worry about 'why would anyone ever want to do this'.
I'll try to explain why.
I'll also tell you how the impractical approaches might work. I'm not actually going to implement these, because that's way too much work for no gain, and I simply don't want to spend any time on that.
But first we have to clear up several misconceptions here. There are some very fundamental gaps in your understanding of Python and how its model differs from the C++ model: how privacy is handled, and the compilation and execution philosophies. So let's start with those:
Privacy models
First of all, you can't apply C++'s privacy model to Python, because Python has no encapsulation privacy. At all. You need to let go of this idea, entirely.
Names starting with a single underscore are not actually private, not in the way C++ privacy works. Nor are they 'protected'. Using an underscore is just a convention; Python does not enforce access control. Any code can access any attribute on instances or classes, whatever naming convention was used. Instead, when you see a name that starts with an underscore you can assume that the name is not part of the public interface, that is, that these names can be changed without notice or consideration for backwards compatibility.
Quoting from the Python tutorial section on the subject:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
It's a good convention, but not even something you can rely on, consistently. E.g. the collections.namedtuple() class generator generates a class with 5 different methods and attributes that all start with an underscore but are all meant to be public, because the alternative would be to place arbitrary restrictions on what attribute names you can give the contained elements, and making it incredibly hard to add additional methods in future Python versions without breaking a lot of code.
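As a small illustration of the namedtuple point, here is a sketch; the Point class and its field names are made up for this example. The underscore-prefixed names it uses are documented, public parts of the namedtuple API.

```python
from collections import namedtuple

# Point and its fields are illustrative names, not taken from the question.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# These names start with an underscore yet are documented, public API.
fields = p._fields               # the field names, in order
as_dict = p._asdict()            # a mapping of field name to value
moved = p._replace(x=10)         # a new Point with x replaced
rebuilt = Point._make([3, 4])    # a Point built from an iterable
```

Because the helper names carry an underscore, fields are free to be called make, replace or fields without clashing with them.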
Names starting with two underscores (and ending with at most one underscore) are not private either, not in a class encapsulation sense such as the C++ model. They are class-private names: these names are rewritten at compile time to produce a per-class namespace, to avoid collisions.
In other words, they are used to avoid a problem very similar to the namedtuple issue described above: to remove limits on what names a subclass can use. If you ever need to design base classes for use in a framework, where subclasses should have the freedom to name methods and attributes without limit, that's where you use __name class-private names. The Python compiler will rewrite __attribute_name to _ClassName__attribute_name when used inside a class statement as well as in any functions that are being defined inside a class statement.
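The rewriting described above can be seen in a minimal sketch. Base, Child and the __token attribute are hypothetical names chosen for the illustration.

```python
class Base:
    def __init__(self):
        self.__token = "base"      # stored on the instance as _Base__token

    def base_token(self):
        return self.__token        # compiled as self._Base__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token, so no clash

    def child_token(self):
        return self.__token


c = Child()
names = sorted(vars(c))            # the mangled names actually stored
outside = c._Base__token           # still reachable from outside the class
```

Both values coexist on the same instance, and nothing stops outside code from reading either of them: the mangling avoids collisions, it does not enforce privacy.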
Note that C++ doesn't use names to indicate privacy. Instead, privacy is a property of each identifier, within a given namespace, as processed by the compiler. The compiler enforces access control; private names are not accessible and will lead to compilation errors.
Without a privacy model, your requirement of "public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class" is not attainable.
Compilation and execution models
C++
C++ compilation produces binary machine code aimed at execution directly by your CPU. If you want to extend a class from another project, you can only do so if you have access to additional information, in the form of header files, to describe what API is available. The compiler combines information in the header files with tables stored with the machine code and your source code to build more machine code; e.g. inheritance across library boundaries is handled through virtual method tables (vtables).
Effectively, there is very little left of the objects used to construct the program with. You generally don't create references to class or method or function objects, the compiler has taken those abstract ideas as inputs but the output produced is machine code that doesn't need most of those concepts to exist any more. Variables (state, local variables in methods, etc.) are stored either on the heap or on the stack, and the machine code accesses these locations directly.
Privacy is used to direct compiler optimisations, because the compiler can, at all times, know exactly what code can change what state. Privacy also makes virtual method tables and inheritance from 3rd-party libraries practical, as only the public interface needs to be exposed. Privacy is an efficiency measure, primarily.
Python
Python, on the other hand, runs Python code using a dedicated interpreter runtime, itself a piece of machine code compiled from C code, which has a central evaluation loop that takes Python-specific op-codes to execute your code. Python source code is compiled into bytecode roughly at the module and function levels, stored as a nested tree of objects.
These objects are fully introspectable, using a common model of attributes, sequences and mappings. You can subclass classes without having to have access to additional header files.
In this model, a class is an object with references to base classes, as well as a mapping of attributes (which includes any functions which become bound methods through access on instances). Any code to be executed when a method is called on an instance is encapsulated in code objects attached to function objects stored in the class attribute mapping. The code objects are already compiled to bytecode, and interaction with other objects in the Python object model is through runtime lookups of references, with the attribute names used for those lookups stored as constants within the compiled bytecode if the source code used fixed names.
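A short sketch of that object model, using a made-up Greeter class: the class exposes its base classes and its attribute mapping, and the attribute names a method looks up are stored as constants on its code object.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return self.prefix() + self.name

    def prefix(self):
        return "hello, "


# The class is an object with references to its bases and an attribute mapping.
bases = Greeter.__mro__
stored = Greeter.__dict__["greet"]       # a plain function object

# The names greet() looks up at runtime are constants on its code object.
looked_up = stored.__code__.co_names
```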
From the point of view of executing Python code, variables (state and local variables) live in dictionaries (the Python kind, ignoring the internal implementation as hash maps) or, for local variables in functions, in an array attached to the stack frame object. The Python interpreter translates access to these to access to values stored on the heap.
This makes Python slow, but also much more flexible when executing. You can not only introspect the object tree, most of the tree is writeable letting you replace objects at will and so change how the program behaves in nearly limitless ways. And again, there are no privacy controls enforced.
Why use class adapters in C++, and not in Python
My understanding is that experienced C++ coders will use a class adapter (using subclassing) over an object adapter (using composition), because they need to pass compiler-enforced type checks (they need to pass the instances to something that requires the Target class or a subclass thereof), and they need to have fine control over object lifetimes and memory footprints. So, rather than have to worry about the lifetime or memory footprint of an encapsulated instance when using composition, subclassing gives you more complete control over the instance lifetime of your adapter.
This is especially helpful when it might not be practical or even possible to alter the implementation of how the adaptee class would control instance lifetime. At the same time, you wouldn't want to deprive the compiler of optimisation opportunities offered by private and protected attribute access. A class that exposes both the Target and Adaptee interfaces offers fewer options for optimisation.
In Python you almost never have to deal with such issues. Python's object lifetime handling is straightforward, predictable and works the same for every object anyway. If lifetime management or memory footprints were to become an issue you'd probably already be moving the implementation to an extension language like C++ or C.
Next, most Python APIs do not require a specific class or subclass. They only care about the right protocols, that is, if the right methods and attributes are implemented. As long as your Adapter has the right methods and attributes, it'll do fine. See Duck Typing; if your adapter walks like a duck, and talks like a duck, it surely must be a duck. It doesn't matter if that same duck can also bark like a dog.
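A minimal sketch of that protocol-based style follows. The class and function names are invented for the illustration; only the request() method name comes from the question.

```python
class Duck:
    def request(self):
        return "quack"


class DuckThatAlsoBarks:
    def request(self):
        return "quack"

    def bark(self):          # an extra method the caller never asks about
        return "woof"


def use_target(target):
    # No isinstance() check: anything with a request() method will do.
    return target.request()


results = [use_target(Duck()), use_target(DuckThatAlsoBarks())]
```

Neither class inherits from a Target base class, and the extra bark() method does no harm to code that only calls request().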
The practical reasons why you don't do this in Python
Let's move to practicalities. We'll need to update your example Adaptee class to make it a bit more realistic:
class Adaptee:
    def __init__(self, arg_foo=42):
        self.state = "foo"
        self._bar = arg_foo % 17 + 2 * arg_foo
    def _ham_spam(self):
        if self._bar % 2 == 0:
            return f"ham: {self._bar:06d}"
        return f"spam: {self._bar:06d}"
    def specific_request(self):
        return self._ham_spam()
This object not only has a state attribute, it also has a _bar attribute and a private method _ham_spam.
Now, from here on out I'm going to ignore the fact that your basic premise is flawed because there is no privacy model in Python, and instead re-interpret your question as a request to rename the attributes.
For the above example that would become:
state -> __state
_bar -> __bar
_ham_spam -> __ham_spam
specific_request -> __specific_request
You now have a problem, because the code in _ham_spam and specific_request has already been compiled. The implementation for these methods expects to find _bar and _ham_spam attributes on the self object passed in when called. Those names are constants in their compiled bytecode:
>>> import dis
>>> dis.dis(Adaptee._ham_spam)
  8           0 LOAD_FAST                0 (self)
              2 LOAD_ATTR                0 (_bar)
              4 LOAD_CONST               1 (2)
              6 BINARY_MODULO
  # .. etc. remainder elided ..
The LOAD_ATTR opcode in the above Python bytecode disassembly excerpt will only work correctly if the local variable self has an attribute named _bar.
Note that self can be bound to an instance of Adaptee as well as of Adapter, something you'd have to take into account if you wanted to change how this code operates.
So, it is not enough to simply rename method and attribute names.
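To see the failure mode, here is a small sketch that simulates the rename on a cut-down Adaptee. The target name _Adapter__bar is an assumption about what a renamed attribute might be called; it is not something Python produces by itself here.

```python
class Adaptee:
    def __init__(self, arg_foo=42):
        self._bar = arg_foo % 17 + 2 * arg_foo

    def specific_request(self):
        return f"bar: {self._bar:06d}"


adaptee = Adaptee()
before = adaptee.specific_request()

# Simulate "renaming" the instance attribute to a class-private style name.
# _Adapter__bar is a hypothetical target name chosen for this sketch.
adaptee.__dict__["_Adapter__bar"] = adaptee.__dict__.pop("_bar")

try:
    adaptee.specific_request()
except AttributeError as exc:
    failure = str(exc)       # the compiled method still looks up "_bar"
else:
    failure = None
```

The data is still on the instance, but the already-compiled method cannot find it under the old name.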
Overcoming this problem would require one of two approaches:
intercepting all attribute access on both the class and instance levels to translate between the two models
rewriting the implementations of all methods
Neither of these is a good idea. Certainly neither of them is going to be more efficient or practical than creating a composition adapter.
Impractical approach #1: rewrite all attribute access
Python is dynamic, and you could intercept all attribute access on both the class and the instance levels. You need both, because you have a mix of class attributes (_ham_spam and specific_request), and instance attributes (state and _bar).
You can intercept instance-level attribute access by implementing all methods in the Customizing attribute access section (you don't need __getattr__ for this case). You'll have to be very careful, because you'll need access to various attributes of your instances while controlling access to those very attributes. You'll need to handle setting and deleting as well as getting. This lets you control most attribute access on instances of Adapter().
You would do the same at the class level by creating a metaclass for whatever class your private() adapter would return, and implementing the exact same hook methods for attribute access there. You'll have to take into account that your class can have multiple base classes, so you'd need to handle these as layered namespaces, using their MRO ordering. Attribute interactions with the Adapter class (such as Adapter._special_request to introspect the inherited method from Adaptee) will be handled at this level.
Sounds easy enough, right? Except that the Python interpreter has many optimisations to ensure it isn't too slow for practical work. If you start intercepting every attribute access on instances, you will kill a lot of these optimisations (such as the method call optimisations introduced in Python 3.7). Worse, Python ignores the attribute access hooks for special method lookups.
And you have now injected a translation layer, implemented in Python, invoked multiple times for every interaction with the object. This will be a performance bottleneck.
Last but not least, to do this in a generic way, where you can expect private(Adaptee) to work in most circumstances is hard. Adaptee could have other reasons to implement the same hooks. Adapter or a sibling class in the hierarchy could also be implementing the same hooks, and implement them in a way that means the private(...) version is simply bypassed.
Invasive all-out attribute interception is fragile and hard to get right.
Impractical approach #2: rewriting the bytecode
This goes down the rabbit hole quite a bit further. If attribute rewriting isn't practical, how about rewriting the code of Adaptee?
Yes, you could, in principle, do this. There are tools available to directly rewrite bytecode, such as codetransformer. Or you could use the inspect.getsource() function to read the on-disk Python source code for a given function, then use the ast module to rewrite all attribute and method access, then compile the resulting updated AST to bytecode. You'd have to do so for all methods in the Adaptee MRO, and produce a replacement class dynamically that'll achieve what you want.
This, again, is not easy. The pytest project does something like this: it rewrites test assertions to provide much more detailed failure information than otherwise possible. This simple feature requires a 1000+ line module to achieve, paired with a 1600-line test suite to ensure that it does this correctly.
And what you've then achieved is bytecode that doesn't match the original source code, so anyone having to debug this code will have to deal with the fact that the source code the debugger sees doesn't match up with what Python is executing.
You'll also lose the dynamic connection with the original base class. Direct inheritance without code rewriting lets you dynamically update the Adaptee class, rewriting the code forces a disconnect.
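For a sense of what the AST route involves, here is a deliberately tiny sketch that handles a single function. It assumes the source is already available as a string (standing in for what inspect.getsource() would return), and both the RenameAttributes helper and the _Adapter__bar target name are hypothetical. A real solution would have to cover every method in the MRO, plus closures, decorators, string-based attribute access and more.

```python
import ast

# Stand-in for what inspect.getsource() would return for an Adaptee method.
SOURCE = '''
def ham_spam(self):
    if self._bar % 2 == 0:
        return "ham"
    return "spam"
'''


class RenameAttributes(ast.NodeTransformer):
    """Rewrite attribute names according to a mapping (hypothetical helper)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Attribute(self, node):
        self.generic_visit(node)
        node.attr = self.mapping.get(node.attr, node.attr)
        return node


tree = RenameAttributes({"_bar": "_Adapter__bar"}).visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<rewritten>", "exec"), namespace)
rewritten = namespace["ham_spam"]


class Holder:
    pass


holder = Holder()
holder._Adapter__bar = 7       # only the renamed attribute exists
result = rewritten(holder)
```

The rewritten function now looks up _Adapter__bar, while the source text still says _bar, which is exactly the mismatch a debugger user would run into.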
Other reasons these approaches can't work
I've ignored a further issue that neither of the above approaches can solve. Because Python doesn't have a privacy model, there are plenty of projects out there where code interacts with class state directly.
E.g., what if your Adaptee() implementation relies on a utility function that will try to access state or _bar directly? It's part of the same library, the author of that library would be well within their rights to assume that accessing Adaptee()._bar is safe and normal. Neither attribute intercepting nor code rewriting will fix this issue.
I also ignored the fact that isinstance(a, Adaptee) will still return True, but if you have hidden its public API by renaming, you have broken that contract. For better or worse, Adapter is a subclass of Adaptee.
TLDR
So, in summary:
Python has no privacy model. There is no point in trying to enforce one here.
The practical reasons that necessitate the class adapter pattern in C++ don't exist in Python.
Neither dynamic attribute proxying nor code transformation is going to be practical in this case, and both introduce more problems than they solve.
You should instead use composition, or just accept that your adapter is both a Target and an Adaptee and so use subclassing to implement the methods required by the new interface without hiding the adaptee interface:
class CompositionAdapter(Target):
    def __init__(self, adaptee):
        self._adaptee = adaptee
    def request(self):
        return self._adaptee.state + self._adaptee.specific_request()

class SubclassingAdapter(Target, Adaptee):
    def request(self):
        return self.state + self.specific_request()
Python doesn't have a way of defining private members like you've described (docs).
You could use encapsulation instead of inheritance and call the method directly, as you noted in your comment. This would be my preferred approach, and it feels the most "pythonic".
class Adapter(Target):
    def request(self):
        return Adaptee.specific_request(self)
In general, Python's approach to classes is much more relaxed than what is found in C++. Python supports duck-typing, so there is no requirement to subclass Adaptee, as long as the interface of Target is satisfied.
If you really want to use inheritance, you could override interfaces you don't want exposed to raise an AttributeError, and use the underscore convention to denote private members.
class Adaptee:
    def specific_request(self):
        return "foobar"
    # make "private" copy
    _specific_request = specific_request

class Adapter(Target, Adaptee):
    def request(self):
        # call "private" implementation
        return self._specific_request()
    def specific_request(self):
        raise AttributeError()
This question has more suggestions if you want alternatives for faking private methods.
If you really wanted true private methods, you could probably implement a metaclass that overrides object.__getattribute__. But I wouldn't recommend it.
I don't know the proper terminology for this so couldn't find anything online about this.
Take this example code:
class Fruit(object):
    def __init__(self, color):
        self._color = color
    def color(self):
        return self._color
Now, say I want to check to see whether a fruit is red:
def isRed(self):
    if self._color == "red":
        return True
    return False
Would work perfectly fine. However, so does
def isRed(self):
    if self.color() == "red":
        return True
    return False
Is there a reason why it is good practice to have a getProperty function? (I'm assuming it is, since an MIT professor, whose course I'm taking, does this with his classes and expects students to do the same on their homework.)
Are either of these two examples different, and why is it against convention to simply refer to the property by self.property?
Edit: Added underscore to make self._color for convention.
TL;DR: Not all general programming best practices are Python best practices. Getter and setter methods are a general (OOP) best practice, but not a Python best practice. Instead, use plain Python attributes when you can and switch to Python properties (@property) as needed.
In many object-oriented programming languages (e.g. Java and C++), it is regarded as good practice to:
make data members (a.k.a. "attributes") private
provide getter and / or setter methods to access them
Why?
Enable change through "encapsulation", by keeping the interface stable while keeping the implementation flexible (decoupling)
Allow for more granular access levels
Let's look at these in detail:
"encapsulation" in object orientation
One of the core ideas of object orientation is that bundling the definition of small chunks of data together with functionality related to that data makes imperative/"structured"/procedural programs more manageable and evolvable.
These bundles are called "objects". Each "class" is a template for a group of objects with the same data structure (though potentially different data) and the same related functionality.
The data definition consists of the (non-static) data members of a class (the "attributes" of the objects). The related functionality is encoded in function members ("methods").
This can also be seen as a way to build new user-defined types. (Each class is a type, each object is kinda like a value.)
Often, the methods need more guarantees about the attribute values to work properly than the types of the data members already provide. Let's say you have
class Color {
    float red;
    float green;
    float blue;

    float hue() {
        return // ... some formula
    }

    float brightness() {
        return // ... some formula
    }
}
If red, green and blue are in the range [0, 1], the implementation of these methods would probably depend on that fact. Similarly, if they were in the range [0, 256). And whatever the class-internal convention is, it is the task of the methods of that class to uphold it and only assign values to the data members that are acceptable.
Though, usually, objects of different classes have to interact for a meaningful object-oriented program. But you don't want to think about another class' internal conventions, just because you're accessing it, as that would require a lot of lookups to find out what those conventions are. So you shouldn't assign to the data members of objects of a class from code outside that class.
To avoid this happening by mistake or negligence, the widely accepted best practice in these languages is to declare all data members private. But that means that they cannot be read from outside, either! If the value is of interest to the outside, we can work around this by providing a non-private getter method that does nothing but provide the value of the attribute.
Enabling change while limiting ripple effects
Say the outside (e.g. another class) must be able to set the value of some attribute of your class. And say there aren't any restrictions necessary beyond what that attribute's type already imposes. Should you make that attribute public? (Still assuming this isn't in Python!) No! Instead, provide a setter method that does nothing but take a value as argument and assign it to the attribute!
Seems kinda dull, so why do that? So that we can change our mind later!
New side effect
Say you want to log to the console/terminal (std-out) each time the red-component of your color object changes. (For whatever reason.)
In a (setter) method, you add one line of code and it does that, without requiring any change in the callers.
But if you need to first switch from assigning to a public attribute to calling a setter method, all the code pieces doing assignments to these attributes (which might be many by that time) have to be changed, too! (Don't forget to make the attribute private, so that none will be forgotten.)
So it's better to have only private attributes from the beginning, and add a setter method when code outside the class has to be able to set the value.
Change of internal representation
Say you just noticed that for your application, colors should really be represented internally as hue, value and saturation rather than red, green and blue components.
If you have setter and getter methods, the ones for red, green and blue will become more complicated due to the necessary conversion calculations. (But the brightness and hue methods will become much simpler.) Still, changing them can be much less work than having to change all the code outside the class that uses the class. As the interface stays the same, callers won't have to be changed at all and won't notice a difference.
But if you need to first switch from assigning to a public attribute to calling a setter method ... well, we've been there, haven't we?
decoupling
So accessor methods (that's what we call getters and setters) help you decouple the public interface of a class from its internal implementation, and thereby the objects from their users. This allows you to change the internal implementation without breaking the public interface, so that code using your class doesn't have to be changed when you do that.
granular access levels
Need an attribute that can only be read from the outside, but not written from the outside? Easy: Provide only a getter method, but no setter method (and have the attribute itself be private).
Less common, but more common than you might think:
Need an attribute that can only be written from the outside, but not read from the outside? Easy: Provide only a setter method, but no getter method (and have the attribute itself be private).
Not sure if your attribute should be accessed (and accessible) from outside your class? Make it private and don't provide any getter and setter for now. You can always add them later. (And then think about what visibility level they should have.)
As you see, there's no reason to ever have a non-private attribute in a mutable object. (Assuming that the runtime overhead doesn't matter for your application (it probably doesn't, indeed) or is optimized away by the compiler (it probably is, at least partially).)
Not a security feature!
Note that "visibility" levels of attributes and methods are not meant for providing application security or privacy (they don't). They're tools to keep programmers from making mistakes (by preventing them from accessing stuff they shouldn't by accident), but they won't keep adversarial programmers from accessing that stuff anyway. Or, for that matter, honest programmers who think they know what they're doing (whether they do know or not) and are willing to take the risk.
Python is different
In Python, everything is public
While Python is also imperative, "structured", procedural and very object-oriented, it takes a much more laid back approach to visibility. There is no real "private" visibility level in Python, nor "protected" or "package" (default in Java) levels.
Essentially, everything in a Python class is public.
This makes sense when Python is used as scripting language for quick-and-dirty ad-hoc solutions that you'll probably code once and then throw away (or keep like that without further development).
If you make more involved applications in Python (and that's certainly possible with Python and also done a lot) you'll probably want to distinguish between a class' public interface and its internal implementation details. Python provides two levels of "hiding" internal members (both, functions and data attributes):
by convention: _ prefix
by name mangling: __ prefix
"hiding" by convention
Beginning a name with _ signals to everyone outside a namespace (whether a class or a module or a package):
You shouldn't access this, unless you know what you're doing. And I (the implementor of stuff in that namespace) may change that at will, so you probably don't know what you will be doing by accessing it. Stuff may break. And if it does, it'll be your (the one accessing it) fault, not mine (the one implementing it). This member isn't a part of this namespace's public interface.
Yes, you can access it. That doesn't mean that you should. We're all adults here. Be responsible.
And you should adhere to that, even if you'd happen to not be an adult, yet.
hiding by name mangling
Beginning a name with __ signals to everyone outside a namespace (whether a class or a module or a package):
The same as with _ applies, only, you know, even stronger!
Additionally, and only if the namespace is a class (and the attribute name ends in no more than one underscore):
To make sure you don't access these things from outside by accident, Python "mangles" the names of these attributes for access from outside the class. The resulting name is perfectly predictable (it's _ + (simple) class name + original attribute name), so you can still access these things, but you most certainly won't simply by mistake.
Also, this can help avoid name collisions between members of base classes and members of their subclasses. (Though, it won't work as intended if the classes share the same class name, as the "simple class name" is used, not including modules and packages.)
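The caveat about shared class names can be reproduced in a few lines. Widget and BaseWidget are hypothetical names used only for this sketch.

```python
class Widget:
    def __init__(self):
        self.__id = "base"          # stored as _Widget__id

    def base_id(self):
        return self.__id


BaseWidget = Widget                 # keep a reference to the original class


class Widget(BaseWidget):           # same simple name as its base class
    def __init__(self):
        super().__init__()
        self.__id = "derived"       # also _Widget__id: overwrites the base value


w = Widget()
clobbered = w.base_id()
stored_names = sorted(vars(w))
```

Both classes mangle __id to the same _Widget__id, so the subclass silently overwrites the value the base class relies on.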
In either case, you may have good reasons to access these values anyway (e.g. for debugging) and Python doesn't want to stand in your way when you do (or with name mangling, at most only slightly so.)
Python has method-based "properties" that can be accessed just like data attributes
So, as there is no real private in Python, we can't apply the pattern/style from Java and C++. But we might still need stable interfaces to do serious programming.
Good thing that in Python you can replace a data attribute with methods, without having to change its users. Pils19's answer provides an example:
class Fruit(object):
    def __init__(self, color):
        self._color = color
    @property
    def color(self):
        return self._color
(Documentation of this decorator here.)
If we also provide a property-setter-method and a property-deleter-method ...
class Fruit(object):
    def __init__(self, color):
        self._color = color
    @property
    def color(self):
        return self._color
    @color.setter
    def color(self, c):
        self._color = c
    @color.deleter
    def color(self):
        del self._color
Then this will act equivalent to a simple data attribute:
class Fruit(object):
    def __init__(self, c):
        self.color = c
But now we have all the freedom of methods. We can leave out any of them (most usual is to only have the getter, so you have a read-only attribute), we can give them additional or different behavior, etc.
This is the recommended approach in Python:
use (public) data members if in doubt
prefix with _ for implementation details
if/when you need additional/different behavior or to disable reading, writing or deleting, use properties or replace public data members with properties
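As a sketch of the last step in that list, suppose Fruit started out with a plain public color attribute and later needs validation. The validation rule and the lowercasing below are invented for the example; the point is only that callers keep using attribute syntax.

```python
class Fruit:
    def __init__(self, color):
        self.color = color          # goes through the setter below

    @property
    def color(self):
        return self._color

    @color.setter
    def color(self, value):
        # Hypothetical validation rule added after the fact.
        if not isinstance(value, str) or not value:
            raise ValueError("color must be a non-empty string")
        self._color = value.lower()


apple = Fruit("Red")
apple.color = "GREEN"               # callers still use plain attribute syntax
current = apple.color

try:
    apple.color = ""
except ValueError:
    rejected = True
else:
    rejected = False
```

Leaving out the setter instead would turn color into a read-only attribute, again without any change on the calling side for reads.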
Your professor
I'm assuming [that there is a good practice to define non-property getters and setters in Python], since an MIT professor, whose course I'm taking, does this with his classes and expects students to do the same on their homework.
Are you sure this is what your professor did, or did he use Python's properties mechanism?
If he did, is this a class about Python or does it just so happen that Python is used for the examples (and that your professor also used it to demonstrate something actually only applicable to other languages)?
And let's not forget: Even MIT professors might be forced to teach classes where they aren't experts on every aspect of the subject.
Normally it's good practice to use the @property decorator, and to name the internal attributes with a single leading underscore. For your example it would look like:
class Fruit(object):
    def __init__(self, color):
        self._color = color
    @property
    def color(self):
        return self._color
I had an interview today. I was asked an OOP question: what is the difference between Encapsulation & Abstraction?
I replied to my knowledge that Encapsulation is basically binding data members & member functions into a single unit called Class. Whereas Abstraction is basically to hide implementation complexity & provide ease of access to the users. I thought she would be okay with my answer. But she queried if the purpose of both is to hide information then what the actual difference between these two is? I could not give any answer to her.
Before asking this question, I read other threads on Stack Overflow about the difference between these two OOP concepts. But I still do not find myself in a position to convince the interviewer.
Can anyone please justify it with the simplest example?
Encapsulation hides variables or some implementation that may change often in a class, to prevent outsiders from accessing it directly. They must access it via getter and setter methods.
Abstraction is used to hide something too, but in a higher degree (class, interface). Clients who use an abstract class (or interface) do not care about what it was, they just need to know what it can do.
Encapsulation: Wrapping code and data together into a single unit. Class is an example of encapsulation, because it wraps the method and property.
Abstraction: Hiding internal details and showing functionality only. Abstraction focuses on what the object does instead of how it does it. It provides a generalized view of classes.
int number = 5;
string aStringNumber = number.ToString();
Here, ToString() is abstraction. And the mechanism by which the number variable is converted to a string and assigned to aStringNumber is encapsulation.
Let us take a real world example of calculator. Encapsulation is the internal circuits, battery, etc., that combine to make it a calculator. Abstraction is the different buttons like on-off, clear and other buttons provided to operate it.
Abstraction - is the process (and result of this process) of identifying the common essential characteristics for a set of objects.
One might say that Abstraction is the process of generalization: all objects under consideration are included in a superset of objects, all of which possess given properties (but are different in other respects).
Encapsulation - is the process of enclosing data and functions manipulating this data into a single unit, so that to hide the internal implementation from the outside world.
This is a general answer not related to a specific programming language (as was the question). So the answer is: abstraction and encapsulation have nothing in common. But their implementations might relate to each other (say, in Java: Encapsulation - details are hidden in a class, Abstraction - details are not present at all in a class or interface).
Yes !!!!
If I say Encapsulation is a kind of an advanced specific scope abstraction,
How many of you read/upvote my answer. Let's dig into why I am saying this.
I need to clear two things before my claim.
One is data hiding and, another one is the abstraction
Data hiding
Most of the time, we will not give direct access to our internal data. Our internal data should not go out directly; that is, an outside person can't access our internal data directly. It's all about security, since we need to protect the internal state of a particular object.
Abstraction
Hiding the internal implementation for the sake of simplicity is called abstraction. In abstraction, we focus only on the necessary things. Basically, we talk about "what to do" and not "how to do it" in abstraction.
Security also can be achieved by abstraction since we are not going to highlight "how we are implementing". Maintainability will be increased since we can alter the implementation but it will not affect our end user.
I said, "Encapsulation is a kind of an advanced specific scope abstraction". Why? because we can see encapsulation as data hiding + abstraction
encapsulation = data hiding + abstraction
In encapsulation, we need to hide the data so that an outside person cannot see it, and we need to provide methods that can be used to access the data. These methods may contain validations or other features, and those are also hidden from the outside person. So here we are hiding the implementation of the access methods, and that is called abstraction.
This is why I said above that encapsulation is a kind of abstraction.
So where is the difference?
The difference is that abstraction is the general one: we hide something from the user for simplicity, maintainability and security, whereas
encapsulation is the specific one, related to the security of internal state: we hide the internal state (data hiding) and provide methods to access the data, and the implementation of those methods is also hidden from the outside person (abstraction).
Why we need abstraction
When you do designs, you will not talk about implementations. You say: if you give these parameters to this method, it will give this output.
We hide the internal implementation of the method and talk about what it will do so this is an abstraction.
Example
public int add(int a, int b);
This method definition tells us that if you give it two values, it will add them and return the result.
Here we do not look at the implementation; we say only what this method does, not how it does it.
Method implementations can differ from developer to developer.
1.
public int add(int a, int b){
    return a + b;
}
2.
public int add(int a, int b){
    return b + a;
}
The two methods do the same thing, but their implementations differ.
Basically,
Abstraction is needed to model the system. Encapsulation is needed to enhance system security.
Abstraction:
Is usually done to provide polymorphic access to a set of classes.
An abstract class cannot be instantiated thus another class will have to derive from it to create a more concrete representation.
A common usage example of an abstract class can be an implementation of a template method design pattern where an abstract injection point is introduced so that the concrete class can implement it in its own "concrete" way.
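A rough Python sketch of the template method idea described above, using the standard abc module. The Report and SalesReport classes and their method names are hypothetical, chosen only for illustration:

```python
import abc


class Report(abc.ABC):
    """Hypothetical base class; the names are illustrative only."""

    def render(self):
        # Template method: a fixed skeleton with a variable step inside.
        return "== " + self.title() + " ==\n" + self.body()

    def title(self):
        # Default step that subclasses may keep or override.
        return "Report"

    @abc.abstractmethod
    def body(self):
        """Injection point that each concrete class must implement."""


class SalesReport(Report):
    def body(self):
        # The concrete class fills in the step in its own way.
        return "3 units sold"


print(SalesReport().render())
```

Client code can call render() on any Report subclass without knowing which concrete class it holds, which is the polymorphic access mentioned above.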
see: http://en.wikipedia.org/wiki/Abstraction_(computer_science)
Encapsulation:
It is the process of hiding the implementation complexity of a specific class from the client that is going to use it, keep in mind that the "client" may be a program or even the person who wrote the class.
see: http://en.wikipedia.org/wiki/Encapsulation_(object-oriented_programming)
There is a great article that touches on differences between Abstraction, Encapsulation and Information hiding in depth: http://www.tonymarston.co.uk/php-mysql/abstraction.txt
Here is the conclusion from the article:
Abstraction, information hiding, and encapsulation are very different,
but highly-related, concepts. One could argue that abstraction is a
technique that helps us identify which specific information should be
visible, and which information should be hidden. Encapsulation is then
the technique for packaging the information in such a way as to hide
what should be hidden, and make visible what is intended to be
visible.
A very practical example is.
let's just say I want to encrypt my password.
I don't want to know the details, I just call
encryptionImpl.encrypt(password) and it returns an encrypted
password.
public interface Encryption{ public String encrypt(String password); }
This is called abstraction. It just shows what should be done.
Now let us assume We have Two types of Encryption Md5 and RSA which
implement Encryption from a third-party encryption jar.
Then those Encryption classes have their own way of implementing
encryption which protects their implementation from outsiders
This is called Encapsulation. Hides how it should be done.
Remember:what should be done vs how it should be done.
Hiding complications vs Protecting implementations
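The same idea can be sketched in Python. The two implementations below are deliberately toy stand-ins (they are not MD5 or RSA and offer no security); ReverseEncryption, ShiftEncryption and store are hypothetical names used only to show the split between what and how:

```python
import abc


class Encryption(abc.ABC):
    @abc.abstractmethod
    def encrypt(self, password: str) -> str:
        """What should be done; says nothing about how."""


class ReverseEncryption(Encryption):
    # Toy stand-in, NOT real cryptography.
    def encrypt(self, password: str) -> str:
        return password[::-1]


class ShiftEncryption(Encryption):
    # Toy stand-in, NOT real cryptography.
    def __init__(self, shift: int = 1):
        self._shift = shift  # internal detail, hidden from callers

    def encrypt(self, password: str) -> str:
        return "".join(chr(ord(ch) + self._shift) for ch in password)


def store(password: str, encryption_impl: Encryption) -> str:
    # The caller depends only on the abstract interface.
    return encryption_impl.encrypt(password)


print(store("abc", ReverseEncryption()))
print(store("abc", ShiftEncryption()))
```

The store function never needs to change when a different implementation is passed in, because it only relies on the encrypt method declared by the interface.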
Yes, it is true that Abstraction and Encapsulation are about hiding.
Using only relevant details and hiding unnecessary data at Design Level is called Abstraction. (Like selecting only relevant properties for a class 'Car' to make it more abstract or general.)
Encapsulation is the hiding of data at Implementation Level. Like how to actually hide data from direct/external access. This is done by binding data and methods to a single entity/unit to prevent external access. Thus, encapsulation is also known as data hiding at implementation level.
Encapsulation:
Hiding something, sort of like a medicine capsule. We don't know what is in the capsule, we just take it. It is the same in programming: we hide some special code of a method or property and it only gives output, same as the capsule. In short, encapsulation hides data.
Abstraction:
Abstraction means hiding logic or implementation. For example, we take tablets and see their color, but don't know what their purpose is or how they work with the body.
The difference between the two is just the point of view.
The word encapsulation is used for hiding data when our aim is to prevent the client from seeing the inside view of our logic.
The word abstraction is used for hiding data when our aim is to show our client an outside view.
To see what outside view means, suppose we have
BubbleSort(){
    //code
    swap(x,y);
}
Here we use swap in the bubble sort just to show our client what logic we are applying. If we replaced swap(x,y) with its whole code here, he/she could not understand our logic at a glance.
Let me explain it in with the same example discussed above. Kindly consider the same TV.
Encapsulation: The adjustments we can make with the remote are a good example: Volume UP/DOWN, Color and Contrast. All we can do is adjust them between the min and max values provided, and we cannot do anything beyond what the remote offers. Imagine the getter and setter here: the setter function checks whether the value provided is valid; if yes, it processes the operation, and if not, it won't allow us to make the change (for example, we cannot decrease the volume below zero even if we press the volume down button a hundred times).
Abstraction: We can take the same example here but at a higher degree/context. The volume down button will decrease the volume; this is the information we provide to the user. The user is aware of neither the infrared transmitter inside the remote nor the receiver in the TV, nor of the subsequent parsing of the signal and the microprocessor architecture inside the TV. Simply put, it is not needed in this context: just provide what is necessary. One can easily relate the textbook definition here, i.e., hiding the inner implementation and only providing what it will do rather than how it does it!
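The volume example could be sketched in Python roughly as follows. The Television class, its limits of 0 and 100, and the starting volume of 10 are assumptions made up for this illustration:

```python
class Television:
    MIN_VOLUME = 0
    MAX_VOLUME = 100  # assumed limit, for illustration only

    def __init__(self):
        self._volume = 10  # internal state, not meant to be touched directly

    @property
    def volume(self):
        return self._volume

    @volume.setter
    def volume(self, value):
        # The setter validates the value before accepting it.
        if not self.MIN_VOLUME <= value <= self.MAX_VOLUME:
            raise ValueError("volume out of range")
        self._volume = value

    def volume_down(self):
        # Pressing the button at the minimum changes nothing.
        if self._volume > self.MIN_VOLUME:
            self._volume -= 1


tv = Television()
for _ in range(100):
    tv.volume_down()
print(tv.volume)  # 0
```

The user of the class only sees volume and volume_down; the range check and the stored value stay inside the class.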
Hope it clarifies a bit!
Briefly, Abstraction happens at class level by hiding implementation and implementing an interface to be able to interact with the instance of the class. Whereas, Encapsulation is used to hide information; for instance, making the member variables private to ban the direct access and providing getters and setters for them for indirect access.
Encapsulation is wrapping up of data and methods in a single unit and making the data accessible only through methods(getter/setter) to ensure safety of data.
Abstraction is hiding internal implementation details of how work is done.
Take the example of the following stack class:
class Stack
{
    private int top;
    public void push(int value);
    public int pop();
}
Now encapsulation helps to safeguard internal data as top cannot be accessed directly outside.
And abstraction helps to do push or pop on stack without worrying about what are steps to push or pop
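A possible Python counterpart of that stack, as a sketch only. Python has no private keyword, so the list is marked as internal with a leading underscore; the name _items is an assumption that plays the role of top:

```python
class Stack:
    def __init__(self):
        self._items = []  # internal storage; plays the role of `top`

    def push(self, value):
        self._items.append(value)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


s = Stack()
s.push(1)
s.push(2)
print(s.pop())  # 2
```

Callers use push and pop without knowing that a list is used underneath, so the storage could later be replaced without changing them.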
Abstraction
As the name suggests, abstract means a summary or brief of something. In OOP, abstract classes are the ones which do not contain every piece of information about the real-world object. For example, you want to book a hotel room; if your object is that room, you mainly care about:
its prices, size, beds etc.
but you do not care about
the wiring they have used in the hotel room for electricity.
which cement they have used to build it up
So, you get abstracted information about the room which you care about.
On the other hand, Encapsulation is basically capsulating the related information together, for eg. you booked the hotel room, you go there and switch on a bulb by pressing the switch. Now the switch object has all internal wirings which are required to switch that bulb ON, but you really do not care about those wirings. You care only about bulb is switched ON or not.
Now one can argue that abstraction also applies here:
one can say the internal wiring of the switch is also abstracted to you, so this must be case of abstraction but here are some subtle differences:
Abstraction is more of a contextual thing: the non-abstracted information is simply not there. The wiring information that you do not care about is not present in the context of a website for booking hotel rooms (your Room class has no information about the room's wiring grid, since this room is modelled for online booking only). Encapsulation is more granular: it means hiding and capsulating the granular things that you do not need to care about. To switch the bulb ON, the switch hides the wiring inside the switchboard (like the private attributes/methods of a class).
So the switch class has the information, but it is hidden from you. The room class, on the other hand, does not have the information about the wiring design of a hotel room at all, since it is not even in the context of online booking of the room.
Thus, the abstraction is more related to classes and encapsulation is more related to internal of the class objects, attributes and methods.
Abstraction
is the process of hiding the how, and only showing the what
the purpose is to simplify information and hide unnecessary details from the user
Encapsulation
is the process of wrapping data and functionality into a single unit
the purpose is to protect data, by preventing direct access and only providing a safer and indirect way
In simple terms, Encapsulation is data hiding(information hiding) while Abstraction is detail hiding(implementation hiding)
Abstraction
In Java, abstraction means hiding information from the outside world. It establishes a contract between the parties that states “what we should do to make use of the service”.
For example, in API development, only abstracted information about the service is revealed to the world, rather than the actual implementation. An interface in Java helps achieve this concept very well.
An interface provides a contract between the parties, for example a producer and a consumer. The producer produces the goods without letting the consumer know how the product is made. But through the interface, the producer lets all consumers know which products they can buy. With the help of abstraction, the producer can market the product to its consumers.
Encapsulation:
Encapsulation is one level down from abstraction. Within the same company, each production group tries to shield its information from the others. For example, if a company produces wine and chocolate, encapsulation helps shield the information about how each product is made from the other group.
If I have separate packages, one for wine and another for chocolate, and all the classes in each package are declared with the default access modifier, we get package-level encapsulation for all the classes.
Within a package, if we declare each class field (member field) as private and provide a public method to access it, we get class-level encapsulation for those fields.
If I were the one facing the interview, I would say that from the end user's perspective, abstraction and encapsulation are much the same: both are nothing but information hiding. From a software developer's perspective, abstraction solves problems at the design level and encapsulation solves problems at the implementation level.
Encapsulation is the composition of meaning.
Abstraction is the simplification of meaning.
Just a few more points to make thing clear,
One must not confuse data abstraction and the abstract class. They are different.
Generally we say that an abstract class or method is basically there to hide something. But no, that is wrong. What does the word abstract mean? A Google search says the English word abstract means
"Existing in thought or as an idea but not having a physical or concrete existence."
And that's right in the case of an abstract class too. It is not hiding the content of the method; the method's content is already empty (not having a physical or concrete existence), but it determines how a method should look (existing in thought or as an idea) or that a method should exist in the class.
So when do you actually use abstract methods ?
When a method from base class will differ in each child class that extends it.
And so you want to make sure the child class have this function implemented.
This also ensures that method, to have compulsory signature like, it must have n number of parameters.
So about abstract class!
- An abstract class cannot be instantiated, only extended! But why?
A class with an abstract method must be prevented from creating its own instance, because the abstract methods in it do not have any meaningful implementation.
You can even make a class abstract if, for some reason, you find that it is meaningless to have an instance of that class.
An Abstract class help us avoid creating new instance of it!
An abstract method in a class forces the child class to implement that function for sure with the provided signature!
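For comparison, here is a small sketch of the same rules in Python using the standard abc module. The Shape, Square and Incomplete classes are hypothetical. Note one difference from the description above: Python's abc only checks that the abstract method name is overridden, it does not enforce the signature.

```python
import abc


class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self):
        """Each child class must supply its own implementation."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Incomplete(Shape):
    pass  # does not implement area(), so it stays abstract


for cls in (Shape, Incomplete):
    try:
        cls()
    except TypeError as exc:
        print(cls.__name__, "cannot be instantiated:", exc)

print(Square(3).area())  # 9
```

Both the abstract base class and the child that forgot to implement area are rejected at instantiation time, while the complete child class works normally.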
Abstraction: what are the minimum functions and variables that should be exposed to the outside of our class.
Encapsulation: how to achieve this requirement, meaning how to implement it.
I would like to set up a class hierarchy in Python 3.2 with 'protected' access: Members of the base class would be in scope only for derived classes, but not 'public'.
A double underscore makes a member 'private', a single underscore indicates a warning but the member remains 'public'. What (if any...) is the correct syntax for designating a 'protected' member.
Member access allowance in Python works by "negotiation" and "treaties", not by force.
In other words, the user of your class is supposed to leave their hands off things which are not their business, but you cannot enforce that other than by using _xxx identifiers making absolutely clear that their access is (normally) not suitable.
Double underscores don't make a member 'private' in the C++ or Java sense - Python quite explicitly eschews that kind of language-enforced access rules. A single underscore, by convention, marks an attribute or a method as an "implementation detail" - that is, things outside can still get to it, but this isn't a supported part of the class' interface and, therefore, the guarantees that the class might make about invariants or back/forwards compatibility no longer apply. This solves the same conceptual problem as 'private' (separation of interface and implementation) in a different way.
Double underscores invoke name mangling which still isn't 'private' - it is just a slightly stronger formulation of the above, whereby:
- This function is an implementation detail of this class, but
- Subclasses might reasonably expect to have a method of the same name that isn't meant as an overridden version of the original
This takes a little bit of language support, whereby the __name is mangled to include the name of the class - so that subclass versions of it get different names instead of overriding. It is still quite possible for a subclass or outside code to call that method if it really wants to - and the goal of name mangling is explicitly not to prevent that.
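A short sketch of that behaviour, with hypothetical Base and Child classes that each define their own __setup method. Because of name mangling the two methods get different names and neither overrides the other:

```python
class Base:
    def __init__(self):
        self.__setup()  # compiled as self._Base__setup()

    def __setup(self):
        self.base_ready = True


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__setup()  # compiled as self._Child__setup()

    def __setup(self):  # a separate method, not an override of Base's
        self.child_ready = True


c = Child()
print(c.base_ready, c.child_ready)  # True True
```

Had both methods been named _setup with a single underscore, the call inside Base.__init__ would have dispatched to the Child version and base_ready would never have been set.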
But because of all this, 'protected' turns out not to make much sense in Python - if you really have a method that could break invariants unless called by a subclass (and, realistically, you probably don't even if you think you do), the Python Way is just to document that. Put a note in your docstring to the effect of "This is assumed to only be called by subclasses", and run with the assumption that clients will do the right thing - because if they don't, it becomes their own problem.
I recently posted a question on stackoverflow and I got a resolution.
Some one suggested to me about the coding style and I haven't received further input. I have the following question with reference to the prior query.
How can we declare private variables inside a class in python? I thought that by using a double underscore (__) the variable is treated as private. Please correct me.
As per the suggestion received before, we don't have to use a getter or setter method. Shouldn't we use a getter or setter or both? Please let me know your suggestion on this one.
Everything is public in Python, the __ is a suggestion by convention that you shouldn't use that function as it is an implementation detail.
This is not enforced by the language or runtime in any way, these names are decorated in a semi-obfuscated way, but they are still public and still visible to all code that tries to use them.
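To make that concrete, here is a small sketch with a hypothetical Account class. The double-underscore attribute is stored under a mangled name, and any code that knows the name can still read it:

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # stored as _Account__balance


acct = Account(100)

print(hasattr(acct, "__balance"))  # False: no attribute under that exact name
print(acct._Account__balance)      # 100: the mangled name is fully accessible
print(vars(acct))                  # {'_Account__balance': 100}
```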
Idiomatic Python doesn't use get/set accessors, it is duplication of effort since there is no private scope.
You only use accessors when you want indirect access to a member variable to have code around it, and then you mark the member variable with __ as the start of its name and provide a function with the actual name.
You could go to great lengths with writing reams of code to try and protect the user from themselves using Descriptors and meta programming, but in the end you will end up with more code that is more to test and more to maintain, and still no guarantee that bad things won't happen. Don't worry about it - Python has survived 20 years this way so far, so it can't be that big of a deal.
PEP 8 (http://www.python.org/dev/peps/pep-0008/) has a section "Designing for inheritance" that should address most of these concerns.
To quote:
"We don't use the term "private" here, since no attribute is really
private in Python (without a generally unnecessary amount of work)."
Also:
"If your class is intended to be subclassed, and you have attributes
that you do not want subclasses to use, consider naming them with
double leading underscores and no trailing underscores."
If you've not read the entire section, I would encourage you to do so.
Update:
To answer the question (now that the title has changed). The pythonic way to use private variables, is to not use private variables. Trying to hide something in python is seldom seen as pythonic.
You can use Python properties instead of getters and setters. Just use an instance attribute and when you need something more complex, make this attribute a property without changing too much code.
http://adam.gomaa.us/blog/2008/aug/11/the-python-property-builtin/
Private variables:
If you use a double underscore at the beginning of your class members they are considered to be private, though not REALLY enforced by Python. They simply get the class name tacked on to the front to prevent them from being easily accessed. A single underscore could be treated as "protected".
Getter/Setter:
You can use these if you want to do more to wrap the process and 'protect' your 'private' attributes. But it's, again, not required. You could also use properties, which have getter/setter features.
1) http://docs.python.org/tutorial/classes.html#private-variables
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
(continue reading for more details about class-private variables and name mangling)
2) http://docs.python.org/library/functions.html#property