I am studying Python. I am trying to understand how to design a library that exposes a public API. I want to avoid exposing internal methods that could change in the future. I am looking for a simple and Pythonic way to do it.
I have a library that contains a bunch of classes. Some methods of those classes are used internally among the classes. I don't want to expose those methods to the client code.
Suppose that my library (e.g. mylib) contains a class C with two methods: C.public(), meant to be used from client code, and C.internal(), used to do some work inside the library code.
I want to commit myself to the public API (C.public()), but I expect to change C.internal() in the future, for example by adding or removing parameters.
The following code illustrates my question:
mylib/c.py:
    class C:
        def public(self):
            pass
        def internal(self):
            pass
mylib/f.py:
    from mylib.c import C

    class F:
        def build(self):
            c = C()
            c.internal()
            return c
mylib/__init__.py:
    from mylib.c import C
    from mylib.f import F
client/client.py:
    import mylib
    f = mylib.F()
    c = f.build()
    c.public()
    c.internal()  # I wish to hide this from client code
I have thought of the following solutions:
Document only the public API, warning users in the documentation not to use the private library API. Live in peace hoping that clients will use only the public API. If the next library version breaks client code, it is the client's fault :).
Use some form of naming convention, e.g. prefix each method with "_" (it is reserved for protected methods and raises a warning in the IDE); perhaps I can use other prefixes.
Use object composition to hide internal methods. For example, the library could return to the clients only a PC object that embeds a C object.
mylib/pc.py:
    class PC:
        def __init__(self, c):
            self.__c__ = c
        def public(self):
            self.__c__.public()
But this looks a little contrived.
Any suggestion is appreciated :-)
Update
It was suggested that this question is a duplicate of "Does Python have 'private' variables in classes?"
It is a similar question, but it differs a bit in scope. My scope is a library, not a single class. I am wondering if there is some convention for marking (or enforcing) which methods/classes/functions of a library are public. For example, I use __init__.py to export the public classes or functions. I am wondering if there is some convention for exporting class methods, or if I can rely only on documentation.
I know I can use the "_" prefix for marking protected methods. As far as I know, protected methods are methods that can be used within a class hierarchy.
I have found a question about marking public methods with a decorator (#api, "Sphinx Public API documentation"), but it is about 3 years old. Is there a commonly accepted solution, so that someone reading my code understands which methods are intended to be the library's public API and which are intended to be used internally in the library?
Hope I have clarified my questions.
Thanks all!
You cannot really hide methods and attributes of objects. If you want to be sure that your internal methods are not exposed, wrapping is the way to go:
    class PublicC:
        def __init__(self):
            self._c = C()
        def public(self):
            self._c.public()
Double underscore as a prefix is usually discouraged as far as I know to prevent collision with python internals.
What is discouraged are __myvar__ names with double-underscore prefix+suffix ...this naming style is used by many python internals and should be avoided -- Anentropic
If you prefer subclassing, you could overwrite internal methods and raise Errors:
    class PublicC(C):
        def internal(self):
            raise Exception('This is a private method')
If you want to use some python magic, you can have a look at __getattribute__. Here you can check what your user is trying to retrieve (a function or an attribute) and raise AttributeError if the client wants to go for an internal/blacklisted method.
    class C(object):
        def public(self):
            print("i am a public method")
        def internal(self):
            print("i should not be exposed")

    class PublicC(C):
        blacklist = ['internal']
        def __getattribute__(self, name):
            if name in PublicC.blacklist:
                raise AttributeError("{} is internal".format(name))
            else:
                return super(PublicC, self).__getattribute__(name)

    c = PublicC()
    c.public()
    c.internal()

    # --- output ---
    i am a public method
    Traceback (most recent call last):
      File "covering.py", line 19, in <module>
        c.internal()
      File "covering.py", line 13, in __getattribute__
        raise AttributeError("{} is internal".format(name))
    AttributeError: internal is internal
I assume this causes the least code overhead but also requires some maintenance. You could also reverse the check and whitelist methods.
    ...
    whitelist = ['public']
    def __getattribute__(self, name):
        if name not in PublicC.whitelist:
    ...
This might be better for your case since the whitelist will probably not change as often as the blacklist.
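A minimal sketch of the whitelist variant, to make the elided fragment concrete. The attribute name `_whitelist` and the decision to let dunder names through are assumptions of this sketch, not part of the answer above; without that exemption, explicit lookups such as `c.__class__` would also be rejected.

```python
class C:
    def public(self):
        return "public result"

    def internal(self):
        return "internal result"


class PublicC(C):
    # Hypothetical name: only these attributes are reachable on instances.
    _whitelist = frozenset({"public"})

    def __getattribute__(self, name):
        # Let dunder names such as __class__ and __dict__ through.
        is_dunder = name.startswith("__") and name.endswith("__")
        if is_dunder or name in PublicC._whitelist:
            return super().__getattribute__(name)
        raise AttributeError("{} is internal".format(name))


c = PublicC()
print(c.public())              # public result
print(hasattr(c, "internal"))  # False
print(C.internal(c))           # internal result
```

The last line shows that the check is only a speed bump: calling the method through the base class bypasses it, which fits the point that nothing can really be hidden.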
Eventually, it is up to you. As you said yourself: It's all about documentation.
Another remark:
Maybe you also want to reconsider your class structure. You already have a factory class F for C. Let F have all the internal methods.
    class F:
        def build(self):
            c = C()
            self._internal(c)
            return c
        def _internal(self, c):
            pass  # work with c
In this case you do not have to wrap or subclass anything. If there are no hard design constraints to render this impossible, I would recommend this approach.
I have thought of the following solutions:
Document only the public API, warning users in the documentation not to use the private library API. Live in peace hoping that clients will use only the public API. If the next library version breaks client code, it is the client's fault :).
Use some form of naming convention, e.g. prefix each method with "_" (it is reserved for protected methods and raises a warning in the IDE); perhaps I can use other prefixes.
Use object composition to hide internal methods. For example, the library could return to the clients only a PC object that embeds a C object.
You got it pretty right with the first two points.
The Pythonic way is to name internal methods with a leading single underscore '_'. This way all Python developers know that the method is there but that its use is discouraged, and they won't use it. (Unless they decide to do some monkey-patching, but you shouldn't care about that scenario.) For newbie developers, you might want to state explicitly that methods starting with an underscore should not be used. Also, just don't provide public documentation for your "private" methods; keep it for internal reference only.
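Applied to the classes from the question, the convention might look like the sketch below. Renaming internal() to _internal() and the `verbose` parameter are illustrative assumptions; the point is only that library code keeps calling the helper while clients see a single non-underscore method.

```python
class C:
    def public(self):
        """Part of the supported API."""
        return "stable"

    def _internal(self, verbose=False):
        # Internal helper: its signature may change between releases.
        return "detail"


class F:
    def build(self):
        c = C()
        c._internal()   # fine: library code may call its own internals
        return c


c = F().build()
print(c.public())                                      # stable
print([n for n in dir(C) if not n.startswith("_")])    # ['public']
```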
You might want to take a look at "name mangling", but it's less common.
Hiding internals with object composition or with hooks like __getattribute__ is generally discouraged in Python.
You might want to look at source code of some popular libraries to see how they manage this, e.g. Django, Twisted, etc.
Related
I am coding a small Python module composed of two parts:
some functions defining a public interface,
an implementation class used by the above functions, but which is not meaningful outside the module.
At first, I decided to "hide" this implementation class by defining it inside the function using it, but this hampers readability and cannot be used if multiple functions reuse the same class.
So, in addition to comments and docstrings, is there a mechanism to mark a class as "private" or "internal"? I am aware of the underscore mechanism, but as I understand it, it only applies to variable, function, and method names.
Use a single underscore prefix:
    class _Internal:
        ...
This is the official Python convention for 'internal' symbols; "from module import *" does not import underscore-prefixed objects.
Reference to the single underscore convention.
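The star-import behaviour can be checked without creating files. This sketch builds a throwaway module in memory; the module name "demo_mod" and the class name Public are made up for the illustration.

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" is a made-up name.
demo_mod = types.ModuleType("demo_mod")
exec("class Public:\n    pass\n\nclass _Internal:\n    pass\n", demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# Run a star import into a fresh namespace and inspect what arrived.
namespace = {}
exec("from demo_mod import *", namespace)

print("Public" in namespace)      # True
print("_Internal" in namespace)   # False
print(demo_mod._Internal)         # explicit access still works
```

The underscore only affects `import *`; an explicit `from demo_mod import _Internal` would still succeed.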
In short:
You cannot enforce privacy. There are no private classes/methods/functions in Python. At least, not strict privacy as in other languages, such as Java.
You can only indicate/suggest privacy. This follows a convention. The Python convention for marking a class/function/method as private is to preface it with an _ (underscore). For example, def _myfunc() or class _MyClass:. You can also create pseudo-privacy by prefacing the method with two underscores (for example, __foo). You cannot access the method directly, but you can still call it through a special prefix using the classname (for example, _classname__foo). So the best you can do is indicate/suggest privacy, not enforce it.
Python is like Perl in this respect. To paraphrase a famous line about privacy from the Perl book, the philosophy is that you should stay out of the living room because you weren't invited, not because it is defended with a shotgun.
For more information:
Private variables Python Documentation
Why are Python’s ‘private’ methods not actually private? Stack Overflow question 70528
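A short sketch of the double-underscore "pseudo-privacy" described above. The class and attribute names are hypothetical; the mangled form is an underscore, the class name, then the original name.

```python
class Account:
    def __init__(self):
        self.__balance = 100       # stored as _Account__balance

    def __audit(self):             # stored as _Account__audit
        return "audited"

    def report(self):
        return self.__audit()      # mangled the same way inside the class


acct = Account()
print(acct.report())               # audited
print(hasattr(acct, "__audit"))    # False
print(acct._Account__audit())      # audited
print(acct._Account__balance)      # 100
```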
Define __all__, a list of names that you want to be exported (see documentation).
__all__ = ['public_class'] # don't add here the 'implementation_class'
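As a rough check of what `__all__` does, the sketch below defines a module in memory (the module name "shapes_demo" is an assumption for the example) and star-imports it. Only the listed name arrives, even though the implementation class has no underscore.

```python
import sys
import types

source = '''
__all__ = ["public_class"]

class public_class:
    pass

class implementation_class:
    pass
'''

shapes = types.ModuleType("shapes_demo")
exec(source, shapes.__dict__)
sys.modules["shapes_demo"] = shapes

ns = {}
exec("from shapes_demo import *", ns)

print(sorted(n for n in ns if not n.startswith("__")))   # ['public_class']
print(shapes.implementation_class)                        # still reachable explicitly
```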
A pattern that I sometimes use is this:
Define a class:
    class x(object):
        def doThis(self):
            ...
        def doThat(self):
            ...
Create an instance of the class, overwriting the class name:
    x = x()
Define symbols that expose the functionality:
    doThis = x.doThis
    doThat = x.doThat
Delete the instance itself:
    del x
Now you have a module that only exposes your public functions.
The convention is to prepend "_" to internal classes, functions, and variables.
To address the issue of design conventions, and as chroder said, there's really no such thing as "private" in Python. This may sound twisted for someone coming from C/C++ background (like me a while back), but eventually, you'll probably realize following conventions is plenty enough.
Seeing something having an underscore in front should be a good enough hint not to use it directly. If you're concerned with cluttering help(MyClass) output (which is what everyone looks at when searching on how to use a class), the underscored attributes/classes are not included there, so you'll end up just having your "public" interface described.
Plus, having everything public has its own awesome perks, like for instance, you can unit test pretty much anything from outside (which you can't really do with C/C++ private constructs).
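The claim about help() can be checked with pydoc, which is what help() uses under the hood. This is a sketch with hypothetical class and method names; it renders the documentation as plain text and looks for each method name.

```python
import pydoc


class MyClass:
    def public(self):
        """Do the supported thing."""

    def _helper(self):
        """Internal detail."""


# Render the same text help(MyClass) would show, without terminal bolding.
text = pydoc.render_doc(MyClass, renderer=pydoc.plaintext)

print("public" in text)
print("_helper" in text)
```

On the interpreters I am aware of, the first check is true and the second is false, because pydoc skips names with a single leading underscore.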
Use two underscores to prefix names of "private" identifiers. For classes in a module, use a single leading underscore and they will not be imported using "from module import *".
    class _MyInternalClass:
        def __my_private_method(self):
            pass
(There is no such thing as true "private" in Python. For example, Python just automatically mangles the names of class members with double underscores to be _classname__mymember. So really, if you know the mangled name you can use the "private" entity anyway. See here. And of course you can choose to manually import "internal" classes if you want to.)
In fact you can achieve something similar to private members by taking advantage of scoping. We can create a module-level class that creates new locally-scoped variables during creation of the class, then use those variables elsewhere in that class.
    class Foo:
        def __new__(cls: "type[Foo]", i: int, o: object) -> "Foo":
            _some_private_int: int = i
            _some_private_obj: object = o
            foo = super().__new__(cls)
            def show_vars() -> None:
                print(_some_private_int)
                print(_some_private_obj)
            foo.show_vars = show_vars
            return foo
        def show_vars(self: "Foo") -> None:
            pass
We can then do, e.g.
    foo = Foo(10, {"a": 1})
    foo.show_vars()
    # 10
    # {'a': 1}
Alternatively, here's a poor example that creates a class in a module that has access to variables scoped to the function in which the class is created. Do note that this state is shared between all instances (so be wary of this specific example). I'm sure there's a way to avoid this, but I'll leave that as an exercise for someone else.
    def _foo_create():
        _some_private_int: int
        _some_private_obj: object
        class Foo:
            def __init__(self, i: int, o: object) -> None:
                nonlocal _some_private_int
                nonlocal _some_private_obj
                _some_private_int = i
                _some_private_obj = o
            def show_vars(self):
                print(_some_private_int)
                print(_some_private_obj)
        import sys
        sys.modules[__name__].Foo = Foo
    _foo_create()
As far as I am aware, there is not a way to gain access to these locally-scoped variables, though I'd be interested to know otherwise, if it is possible.
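For what it is worth, closure variables can be reached through the function object itself. The sketch below uses a simplified, hypothetical make_show() factory rather than the Foo class above, but the same attributes (`__code__.co_freevars` and `__closure__`) exist on any nested function that captures variables.

```python
def make_show(value):
    _hidden = value            # local variable captured by the closure

    def show():
        return _hidden

    return show


show = make_show(10)

# Pair each captured variable name with the cell that holds its value.
names = show.__code__.co_freevars
cells = show.__closure__
recovered = {name: cell.cell_contents for name, cell in zip(names, cells)}
print(recovered)               # {'_hidden': 10}

# Cells are writable too (Python 3.7 and later).
cells[0].cell_contents = 99
print(show())                  # 99
```

So scoping tricks raise the effort needed, but they do not give strict privacy either.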
I'm new to Python but as I understand it, Python isn't like Java.
Here's how it happens in Python:
    class Student:
        __schoolName = 'XYZ School'    # private attribute
        def __nameprivamethod(self):   # private function
            print('two underscore')

    class Student:
        _schoolName = 'XYZ School'     # protected attribute
Don't forget to check how to access the private and protected parts.
I have come across a python project that commonly calls external functions from class methods and passes the class instance and some other parameters to the external function.
The method used is shown in method_one below and I have never come across this implementation before. Using locals to get both the local method parameters and the self class instance seems strange to say the least. The code then relies upon the dictionary keys being named correctly i.e. the same as the parameters of the external function (some_function).
To me, the obvious, simpler direct alternative is method_two but even over that I would prefer either
making some_function a method of ExampleClass1 so it has direct access to self, or
passing only the required attributes of the ExampleClass1 instance to some_function.
Example code:
    class ExampleClass1(object):
        def __init__(self, something):
            self.something = something
        def method_one(self, param_1, param_2):
            all_params = locals()
            all_params['example_self'] = all_params.pop('self')
            some_function(**all_params)
        def method_two(self, param_1, param_2):
            some_function(self, param_1, param_2)

    def some_function(example_self, param_1, param_2):
        print(example_self.something, param_1, param_2)

    e = ExampleClass1("do")
    e.method_one(1, "a")
    e.method_two(2, "b")
So,
Is there any reason to be using method_one that I'm not aware of?
How would you offer advice on the best practice for this situation?
Passing self as a parameter to external functions is a totally standard practice. I'm a little unclear why the call to locals() is used and why keys are being shuffled around, but that's a separate matter. In general, I find that if you're using locals(), then 9 times out of 10 the code you're writing can be simpler. Notable exception being metaprogramming, which is another topic.
One example that I use this for is when you want to separate out code into several modules rather than have one large class with a bunch of methods. There's a lot of ways to organize code, but one approach that I use is to segregate functions to other modules based on their domain, and then pass self to those functions for their use.
Concrete example: a server object accepting requests can have the routes handling those requests live elsewhere, and then delegate the actual business logic to the external route functions. If those routes need the server object, though, then you may want to pass self (being the server) to them. You could make an argument they should just be methods then, but that's a matter of code style and depends a lot on exact use case.
In general, passing self around isn't a bad practice when used appropriately.
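The server example could be sketched roughly as follows. All names here (Server, handle_status, the routes dictionary) are hypothetical and only illustrate the shape of the delegation; in a real project the handler would typically live in another module.

```python
def handle_status(server, request):
    # Plain function outside the class; it receives the server explicitly.
    return "{} handled {!r}".format(server.name, request)


class Server:
    def __init__(self, name):
        self.name = name
        self.routes = {"/status": handle_status}

    def dispatch(self, path, request):
        handler = self.routes[path]
        # Pass self directly; no locals() juggling is needed.
        return handler(self, request)


server = Server("demo")
print(server.dispatch("/status", "ping"))   # demo handled 'ping'
```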
I'm relatively new to Python.
When I did C/C++ programming, I used the internal classes quite often. For example, in some_file.cc, we may implement a class in the anonymous namespace to prevent it from being used outside. This is useful as a helper class specific to that file.
So how can we do a similar thing in Python?
    class MyClassOuter:
        def __init__(self, *args):
            class MyClassInner:
                def __init__(self, *args):
                    pass
            self.my_class = MyClassInner(*args)
This would make MyClassInner available only inside the __init__ function of MyClassOuter.
Likewise, you could put it inside a function:
    def my_class_factory(arg1, arg2, *args):
        class MyClass:
            def __init__(self, arg1, arg2, *args):
                print("OK??")
        return MyClass(arg1, arg2, *args)
Python doesn't have any equivalent of an anonymous namespace, or of static linkage for functions. There are a few ways you can get what you're looking for:
Prefix with _. Names beginning with an underscore are understood to be for internal use within that Python file and are not exported by from * imports. It's as simple as class _MyClass.
Use __all__: if a Python file contains a list of strings named __all__, only the functions and classes named within are exported by from *; everything else is understood to be local to that Python file.
Use local classes/functions. This would be done the same way you've done so with C++ classes.
None of these gets exactly what you want, but privacy and restrictions of this kind are just not part of the language (much like how there's no private data member equivalent). Pydoc is also well aware of these conventions and will provide informative documentation for the intended-to-be-public functions and classes.
Suppose we have the following structure:
    class A():
        class __A():
            def __to_be_mocked(self):
                pass  # something here
        def __init__(self):
            with A.lock:
                if not A.instance:
                    A.instance = A.__A()
        def __getattr__(self, name):
            return getattr(self.instance, name)
Now we want to mock the function __to_be_mocked. How can we mock it, given that the target accepted by mock.patch.object is package.module.ClassName? I have tried all methods like
    target = A.__A
    target = A.___A
and many more.
EDIT:
I solved it using
target=A._A__A and attribute as '_A__to_be_mocked'
Now the question is: __to_be_mocked is inside __A, so shouldn't it be ___A__to_be_mocked?
Is it because of setattribute in A or __init__ in A?
I have mocked a lot of things in Python, and after doing it many times I can say:
NEVER mock/patch __something attributes (AKA private attributes)
AVOID mocking/patching _something attributes (AKA protected attributes)
Private
If you mock private things you'll tangle production and test code. When you do this kind of mock there is always a way to obtain the same behavior by patching or mocking public or protected stuff.
To explain better what I mean by tangling production and test code I can use your example: to patch A.__B.__to_be_mocked() (I replaced the __A inner class by __B to make it clearer) you need to write something like
    patch('amodule.A._A__B._B__to_be_mocked')
Now by patching __to_be_mocked you are spreading the names A, B and to_be_mocked across your tests: that is exactly what I mean by tangled code. So if you need to change some name, you have to go through all your tests and change your patches, and no refactoring tool can offer to change the _A__B._B string for you.
Now if you are a good guy and keep your tests clean, you can have just a few points where these names come out, but if it is a singleton I can bet that they will pop up like mushrooms.
I would like to point out that private and protected have nothing to do with security concerns; they are just a way to make your code clearer. That point is crystal clear in Python, where you don't need to be a hacker to change private or protected attributes: these conventions are here just to help you read code, so that you can say "Oh great! I don't need to understand what this is ... it is just the dirty work." IMHO private attributes in Python fail this goal (__ is too long and seeing it really bothers me) and protected ones are just enough.
Side note: little example to understand python's private naming:
    >>> class A():
    ...     class __B():
    ...         def __c(self):
    ...             pass
    ...
    >>> a = A()
    >>> dir(a)
    ['_A__B', '__doc__', '__module__']
    >>> dir(a._A__B)
    ['_B__c', '__doc__', '__module__']
To come back to your case: how does your code use the __to_be_mocked() method? Is it possible to have the same effect by patching/mocking something else in the A class (and not A.__A)?
Finally, if you are mocking a private method to sense something to test, you are in the wrong place: never test the dirty work; it should/may/can change without changing your tests. What you need to test is the code's behavior, not how it is written.
Protected
If you need to test, patch or mock protected stuff, maybe your class is hiding some collaborators: test it, use your tests to refactor your code, and then clean up your tests.
Disclaimer
Indeed: I spread this kind of crap in my tests too, and then I fight to remove it when I understand that I can do it better.
Class & instance members starting with double underscores have their names rewritten to prevent collisions with same-name members in parent classes, making them behave as if "private". So __B here is actually accessible as A._A__B. (Underscore, class name, double underscored member name). Note that if you use the single-underscore convention (_B), no rewriting happens.
That being said, you'll rarely see anyone actually use this form of access and especially not in prod code as things are made "private" for a reason. For mocking, maybe, if there's no better way.
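For completeness, here is a minimal sketch of patching a mangled method with unittest.mock. The classes are stand-ins (a nested __B with a __c method, plus a hypothetical run() wrapper), not the singleton from the question. It also shows why the attribute is _B__c rather than ___B__c: leading underscores of the class name are stripped before the name is mangled.

```python
from unittest import mock


class A:
    class __B:                  # stored on A as _A__B
        def __c(self):          # stored on __B as _B__c
            return "real"

        def run(self):
            return self.__c()   # compiled as self._B__c()


inner = A._A__B()

with mock.patch.object(A._A__B, "_B__c", return_value="mocked"):
    print(inner.run())          # mocked

print(inner.run())              # real
```

As the answers above argue, this couples the test to private names, so it is best kept as a last resort.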