Style question: single leading underscore in a package


The Python PEP 8 style guide gives the following guidance for a single leading underscore in method names:
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose names start with an underscore.
What constitutes "internal use"?
Is this for methods only called within a given class?
    class MyClass:
        def _internal_method(self):
            pass  # do something

        def public_method(self):
            self._internal_method()
What about inherited methods - are they still considered "internal"?
    class BaseClass:
        def _internal_method(self):
            pass  # do something

    class MyClass(BaseClass):
        def public_method(self):
            self._internal_method()  # or super()._internal_method()
What about if the inheritance is from another module within a software package?
    # file1.py
    class BaseClass:
        def _internal_method(self):
            pass  # do something

    # file2.py
    from file1 import BaseClass

    class MyClass(BaseClass):
        def public_method(self):
            self._internal_method()  # or super()._internal_method()
All these examples are fine technically, but are they all acceptable stylistically? At what point do you say the leading underscore is not necessary/helpful?

A single leading underscore is Python's convention for "private" and "protected" variables, which some other languages enforce in the language itself.
The "internal use" language just says that you, as the developer, are reserving that name for your own code to use as you want, and that other users of your module/code can't rely on the thing tied to that name behaving the same way in future versions, or even existing. It is the use case for "protected" attributes, but without enforcement by the language runtime: users are supposed to know that the attribute/function/method can change without any prior warning.
So, yes, as long as the other classes using your _-prefixed methods are in the same code package, even if in another file or folder (a separate sub-package), it is OK to use them.
If you have different Python packages, even closely related ones, it would not be advisable, style-wise, to call directly into the internals of the other package.
As for limits: sometimes there are entire modules and classes that are not supposed to be used by users of your class, and it would be cumbersome to prefix everything in those modules with an _. I'd say it is enough to document which public interfaces users of your package are supposed to call, and to state in the docs that certain parts (modules/classes/functions) are designed for "internal use and may change without notice". There is no need to meddle with their names.
As an illustration, I am currently developing a set of tools/library for text-art on the terminal - I put everything users should call as public names in its __init__.py - the remaining names are meant to be "internal".


when to use "_(.)" kind of things in python.? [duplicate]

How can I make methods and data members private in Python? Or doesn't Python support private members?

9.6. Private Variables
“Private” instance variables that cannot be accessed except from inside an object, don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
Since there is a valid use-case for class-private members (namely to avoid name clashes of names with names defined by subclasses), there is limited support for such a mechanism, called name mangling. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class.
So, for example,
    class Test:
        def __private_symbol(self):
            pass

        def normal_symbol(self):
            pass

    print(dir(Test))
will output (under Python 2; Python 3 lists many more double-underscore names):
    ['_Test__private_symbol', '__doc__', '__module__', 'normal_symbol']
__private_symbol should be considered a private method, but it would still be accessible through _Test__private_symbol.
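The same behaviour can be checked under Python 3. Because dir() returns a much longer list there, this minimal sketch tests for membership instead of printing everything. The class and method names follow the example above; the return value and the variable names are made up for illustration.

```python
class Test:
    def __private_symbol(self):
        return "private"

    def normal_symbol(self):
        # Inside the class body, self.__private_symbol is compiled
        # as self._Test__private_symbol.
        return self.__private_symbol()


t = Test()
mangled_present = "_Test__private_symbol" in dir(Test)
unmangled_present = "__private_symbol" in dir(Test)
via_public = t.normal_symbol()
via_mangled = t._Test__private_symbol()

try:
    t.__private_symbol()  # no mangling happens outside a class body
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True
```

Only the mangled name exists on the class, so the "private" method stays reachable for anyone who spells out _Test__private_symbol.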

The other answers provide the technical details. I'd like to emphasise the difference in philosophy between Python on one hand and languages like C++/Java (which I presume you're familiar with based on your question).
The general attitude in Python (and Perl for that matter) is that the 'privacy' of an attribute is a request to the programmer rather than a barbed wire fence by the compiler/interpreter. The idea is summarised well in this mail and is often referred to as "We're all consenting adults" since it 'assumes' that the programmer is responsible enough to not meddle with the insides. The leading underscores serve as a polite message saying that the attribute is internal.
On the other hand, if you do want to access the internals for some applications (a notable example is documentation generators like pydoc), you're free to do so. The onus is on you as a programmer to know what you're doing and do it properly, rather than on the language to force you to do things its way.

There are no private or any other access protection mechanisms in Python. There is a convention documented in the Python style guide for indicating to the users of your class that they should not be accessing certain attributes.
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.
single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g. Tkinter.Toplevel(master, class_='ClassName')
__double_leading_underscore: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).

If the name of a Python function, class method, or attribute starts with (but doesn't end with) two underscores, it's private; everything else is public. Python has no concept of protected class methods (accessible only in their own class and descendant classes). Class methods are either private (accessible only in their own class) or public (accessible from anywhere).
Dive Into Python
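What "accessible only in their own class" means in practice can be sketched as follows. This is an illustration under assumed names (Base, Child, __secret and _internal are all hypothetical): a subclass cannot reach a double-underscore attribute by its plain name, because the name is mangled per class, yet the mangled name remains accessible from anywhere.

```python
class Base:
    def __init__(self):
        self.__secret = 1   # stored on the instance as _Base__secret
        self._internal = 2  # single underscore: stored as-is


class Child(Base):
    def read_internal(self):
        return self._internal

    def read_secret(self):
        # Compiled as self._Child__secret, which was never set.
        return self.__secret


c = Child()
internal_value = c.read_internal()

try:
    c.read_secret()
    subclass_blocked = False
except AttributeError:
    subclass_blocked = True

# The attribute is still there under its mangled name.
mangled_value = c._Base__secret
```

So the double underscore mainly protects against accidental name clashes in subclasses; it is not an enforced access barrier.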

Python does not support privacy directly. The programmer needs to know when it is safe to modify an attribute from outside, but you can still achieve something like private members with a few tricks.
Now let's see whether a Person can keep anything private.
    class Person(object):
        def __priva(self):
            print("I am Private")

        def publ(self):
            print("I am public")

        def callpriva(self):
            self.__priva()
Now when we execute:
    >>> p = Person()
    >>> p.publ()
    I am public
    >>> p.__priva()
    Traceback (most recent call last):
      File "", line 1, in
        p.__priva()
    AttributeError: 'Person' object has no attribute '__priva'
Explanation: we are not able to call the private method directly.
    >>> p.callpriva()
    I am Private
Explanation: inside the class we can access the private method.
So how can someone access that method from outside?
You can do it like this:
    >>> p._Person__priva()
    I am Private
What actually happens is that any name starting with a double underscore is "translated" by adding a single underscore and the class name to the beginning.
Note: If you do not want this name changing but still want to signal other objects to stay away, you can use a single initial underscore. Names with an initial underscore aren't imported with starred imports (from module import *).
Example:
    # test.py
    def hello():
        print("hello")

    def _hello():
        print("Hello private")

    # ----------------------
    # test2.py
    from test import *
    hello()
    _hello()
Output:
    hello
    Traceback (most recent call last):
      File "", line 1, in
    NameError: name '_hello' is not defined
Now if we import _hello explicitly:
    # test2.py
    from test import _hello, hello
    hello()
    _hello()
Output:
    hello
    Hello private
Finally: Python doesn't really have equivalent privacy support, although single and double initial underscores do, to some extent, give you two levels of privacy.

This might work:
    import sys, functools

    def private(member):
        @functools.wraps(member)
        def wrapper(*function_args):
            myself = member.__name__
            # Name of the function that called the wrapped method.
            caller = sys._getframe(1).f_code.co_name
            if caller not in dir(function_args[0]) and caller != myself:
                raise Exception("%s called by %s is private" % (myself, caller))
            return member(*function_args)
        return wrapper

    class test:
        def public_method(self):
            print('public method called')

        @private
        def private_method(self):
            print('private method called')

    t = test()
    t.public_method()
    t.private_method()  # raises: the caller is not a method of test

This is kinda a l-o-n-g answer but I think it gets to the root of the real problem here -- scope of visibility. Just hang in there while I slog through this!
Simply importing a module need not necessarily give the application developer access to all of its classes or methods; if I can't actually SEE the module source code how will I know what's available? Some one (or some THING) has to tell me what I can do and explain how to use those features I'm allowed to use, otherwise the whole thing is useless to me.
Those developing higher-level abstractions based on fundamental classes and methods via imported modules are presented with a specification DOCUMENT -- NOT the actual source code.
The module spec describes all the features intended to be visible to the client developer. When dealing with large projects and software project teams, the actual implementation of a module should ALWAYS remain hidden from those using it -- it's a blackbox with an interface to the outside world. For OOD purists, I believe the techie terms are "decoupling" and "cohesion". The module user need only know the interface methods without being burdened with the details of implementation.
A module should NEVER be changed without first changing its underlying spec document, which may require review / approval in some organizations prior to changing the code.
As a hobby programmer (retired now), I start a new module with the spec doc actually written out as a giant comment block at the top of the module; this will be the part the user actually sees in the spec library. Since it's just me, I've yet to set up a library, but it would be easy enough to do.
Then I begin coding by writing the various classes and methods but without functional bodies -- just null print statements like "print()" -- just enough to allow the module to compile without syntax errors. When this step is complete I compile the completed null-module -- this is my spec. If I were working on a project team, I would present this spec/interface for review & commentary before proceeding with fleshing out the body.
I flesh out the bodies of each method one at a time and compile accordingly, ensuring syntax errors are fixed immediately on-the-fly. This is also a good time to start writing a temporary "main" execution section at the bottom to test each method as you code it. When the coding/testing are complete, all of the test code is commented out until you need it again should updates become necessary.
In a real-world development team, the spec comment block would also appear in a document control library, but that's another story. The point is: you, as the module client, see only this spec and NOT the source code.
PS: long before the beginning of time, I worked in the defense aerospace community and we did some pretty cool stuff, but things like proprietary algorithms and sensitive systems control logic were tightly vaulted and encrypted in super-duper secure software libraries. We had access to module / package interfaces but NOT the blackbox implementation bodies. There was a document management tool that handled all system-level designs, software specs, source code and test records -- it was all synched together. The government had strict software quality assurance requirements and standards. Anyone remember a language called "Ada"? That's how old I am!

    import inspect

    class Number:
        def __init__(self, value):
            self.my_private = value

        def set_private(self, value):
            self.my_private = value

        def __setattr__(self, name, value):
            # Name of the function that triggered the assignment.
            f = inspect.stack()[1][3]
            if f not in ['__init__', 'set_private']:
                raise Exception("can't access private member-my_private")
            # the default behavior
            self.__dict__[name] = value

    def main():
        n = Number(2)
        print(n.my_private)
        n.set_private(3)
        print(n.my_private)

    if __name__ == '__main__':
        main()

I use Python 2.7 and 3.5. I wrote this code:
    class MyOBject(object):
        def __init__(self):
            self.__private_field = 10

    my_object = MyOBject()
    print(my_object.__private_field)
ran it and got:
    AttributeError: 'MyOBject' object has no attribute '__private_field'
Please see:
https://www.tutorialsteacher.com/python/private-and-protected-access-modifiers-in-python

Using module as a singleton in Python - is that ok?

I've got a really complex singleton object. I've decided to modify it so that it becomes a separate module with module-wide global variables that store the data.
Are there any pitfalls to this approach? It just feels a little bit hacky, and there may be some problems I cannot see now.
Maybe someone has done this or has an opinion :) Thanks in advance for help.
Regards.
Minimal, Complete, and Verifiable example:
    """
    This is __init__.py of the module, that could be used as a singleton:
    I need to set and get value of IMPORTANT_VARIABLE from different places in my code.

    Folder structure:
        singleton_module/
            __init__.py

    Example of usage:
        import singleton_module as my_singleton
        my_singleton.set_important_variable(3)
        print(my_singleton.get_important_variable())
    """
    IMPORTANT_VARIABLE = 0

    def set_important_variable(value):
        global IMPORTANT_VARIABLE
        IMPORTANT_VARIABLE = value

    def get_important_variable():
        return IMPORTANT_VARIABLE

Technically, Python modules ARE singletons, so from this point of view there's no particular issue with your code (except the usual issues with singletons, that is). I'd just spell the variable in all_lower (ALL_UPPER denotes a pseudo-constant) and prefix it with either a single ("protected") or double ("really private") leading underscore to make clear it's not part of the public API (standard Python naming convention).
Now whether singletons are a good idea is another debate but that's not the point here...
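Applied to the module from the question, the renaming suggested above might look like this. It is only a sketch: the function names come from the question, while _important_variable is the assumed new spelling of the state variable.

```python
# singleton_module/__init__.py (sketch)

# Lower-case and underscore-prefixed: module state, not public API.
_important_variable = 0


def set_important_variable(value):
    global _important_variable
    _important_variable = value


def get_important_variable():
    return _important_variable
```

Callers keep using set_important_variable() and get_important_variable(); only the name of the underlying variable signals that it is an implementation detail.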
"e.g. that in one potential situation I may lose data, or that module could be imported in different places of code two times, so it would not be a singleton if imported inside scope of function or something like that."
A module is only instantiated once per process (the first time it's imported); subsequent imports get it directly from sys.modules. The only case where you could have two distinct instances of the same module is when the module is imported by two different paths, which can only happen if you have a somewhat broken sys.path, i.e. something like this:
    src/
        foo/
            __init__.py
            bar/
                __init__.py
                baaz/
                    __init__.py
                    mymodule.py
with both "src" and "foo" in sys.path, then importing mymodule once as from foo.bar.baaz import mymodule and a second time as from bar.baaz import mymodule.
Needless to say, it's a degenerate case, but it can happen and lead to hard-to-diagnose bugs. Note that when you have this case, quite a few other things break too, like identity testing anything from mymodule.
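Both behaviours can be reproduced with a throwaway package tree. The sketch below uses a simplified layout (foo_demo/bar_demo/mymodule.py, all hypothetical names, created in a temporary directory) and deliberately puts both the root and foo_demo on sys.path to mimic the broken setup described above.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as src:
    foo = os.path.join(src, "foo_demo")
    bar = os.path.join(foo, "bar_demo")
    os.makedirs(bar)
    for package_dir in (foo, bar):
        open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(bar, "mymodule.py"), "w") as f:
        f.write("state = []\n")

    # The "broken" setup: both src/ and src/foo_demo/ are on sys.path.
    sys.path[:0] = [src, foo]
    importlib.invalidate_caches()
    try:
        first = importlib.import_module("foo_demo.bar_demo.mymodule")
        again = importlib.import_module("foo_demo.bar_demo.mymodule")
        other = importlib.import_module("bar_demo.mymodule")
    finally:
        sys.path.remove(src)
        sys.path.remove(foo)

cached_on_reimport = first is again     # same name: served from sys.modules
same_module_object = first is other     # different name: a second module object
same_state = first.state is other.state
```

Importing under the same dotted name returns the cached module, while the second dotted name produces a separate module object with its own copy of state, which is exactly how the "singleton" ends up duplicated.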
"Also, I am not sure how would using object instead of module increase security"
It doesn't.
"And I am just asking, if that's not a bad practice, maybe someone did this and found some problems. This is probably not a popular pattern"
Well, quite on the contrary, you'll often find advice to use modules as singletons instead of classes with only staticmethods, classmethods and class attributes (another way of implementing a singleton in Python). This most often concerns stateless classes used as namespaces while your example does have state, but this doesn't make much practical difference.
Now what you won't get are all the nice OO features like computed attributes, inheritance, magic methods etc., but I assume you already understood this.
As far as I'm concerned, depending on the context, I might rather use a plain class but only expose one single instance of the class as the module's API ie:
    # mymodule.py
    __all__ = ["mysingleton"]

    class __MySingletonLike(object):
        def __init__(self):
            self._variable = 42

        @property
        def variable(self):
            return self._variable

        @variable.setter
        def variable(self, value):
            check_value(value)  # imaginary validation
            self._variable = value

    mysingleton = __MySingletonLike()
but that's only when I have special concerns about the class (implementation reuse, proper testability, other special features requiring a class etc).

Python has or not access modifiers?

I found on the internet:
    public = "a"      # is a public variable
    _protected = "b"  # is a protected variable
    __private = "c"   # is a private variable
Example of code:
    class c1:
        def __init__(self, a, b, c):
            self.public = a
            self._protected = b
            self.__private = c

    v = c1("A", "B", "C")
    print(v.public)
    v._protected = "Be"  # !??? I can access a protected variable
    print(v._protected)
    print(v.__private)  # !??? AttributeError: 'c1' object has no attribute '__private'
I can access a protected variable!?

No, Python does not have access modifiers which outright prevent access. But then again, most languages don't. Even languages which sport protected and private keywords usually have some way through introspection or such to get at the value anyway in ways which "should not be allowed."
Access modifiers are, one way or another, just a hint as to how the property is supposed to be used.
Python's philosophy is to assume that everyone contributing code is a responsible adult, and that a hint in the form of one or two underscores is perfectly enough to prevent "unauthorised access" to a property. If it starts with an underscore, you probably shouldn't mess with it.

You could access your private variable as:
    print(v._c1__private)
which prints:
    C

Python doesn't have modifiers like private, protected, public. You can emulate their behavior with __getattr__ and __getattribute__, but it's not a Pythonic way to write programs.
Using single underscore _ is a convention, so when you see an attribute or method starting with one underscore, consider that library developer didn't expect it to be part of public API. Also when executing from module import * Python interpreter doesn't import names starting with _, though there're ways to modify this behavior.
Using a double underscore __ is not just a convention; it leads to "name mangling" by the interpreter, which adds the class name in front of the attribute. For example:
    class Klass():
        def __like_private():
            print("Hey!")
will make _Klass__like_private.
You won't be able to access __like_private directly as defined, though you will still be able to get to it, knowing how names are composed, by using _Klass__like_private() in subclasses for example, or in the module:
Klass.__like_private() will give you an error.
Klass._Klass__like_private() will print Hey!.

Is there a point to setting __all__ and then using leading underscores anyway?

I've been reading through the source for the cpython HTTP package for fun and profit, and noticed that in server.py they have the __all__ variable set but also use a leading underscore for the function _quote_html(html).
Isn't this redundant? Don't both serve to limit what's imported by from HTTP import *?
Why do they do both?

Aside from the "private-by-convention" functions with _leading_underscores, there are:
Quite a few imported names;
Four class names;
Three function names without leading underscores;
Two string "constants"; and
One local variable (nobody).
If __all__ wasn't defined to cover only the classes, all of these would also be added to your namespace by a wildcard from server import *.
Yes, you could just use one method or the other, but I think the leading underscore is a stronger sign than the exclusion from __all__; the latter says "you probably won't need this often", the former says "keep out unless you know what you're doing". They both have their place.
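The effect of __all__ on a wildcard import can be sketched with an in-memory module. Everything here is hypothetical (the module name server_demo, the names Server, helper and _quote, and the star_import helper); it is not the actual content of the standard library's server.py.

```python
import sys
import types


def star_import(source, name="server_demo"):
    """Build a temporary module from source and return the names
    that 'from <name> import *' brings into a fresh namespace."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    try:
        namespace = {}
        exec(f"from {name} import *", namespace)
    finally:
        del sys.modules[name]
    # exec() adds __builtins__ to the namespace; ignore it.
    return sorted(n for n in namespace if n != "__builtins__")


body = (
    "import os\n"
    "class Server: pass\n"
    "def helper(): pass\n"
    "def _quote(): pass\n"
)
without_all = star_import(body)
with_all = star_import("__all__ = ['Server']\n" + body)
```

Without __all__, the wildcard pulls in every name that lacks a leading underscore, including the imported os module; with __all__ defined, only the listed name is imported.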

__all__ indeed serves as a limit when doing from HTTP import *; prefixing _ to the name of a function or method is a convention for informing the user that that item should be considered private and thus used at his/her own risk.

This is mostly a documentation thing, in a similar vein to comments. A leading underscore is a clearer indication to a person reading the code that particular functions or variables aren't part of the public API than having that person check each name against __all__. PEP8 explicitly recommends using both conventions in this way:
To better support introspection, modules should explicitly declare
the names in their public API using the __all__ attribute. Setting
__all__ to an empty list indicates that the module has no public API.
Even with __all__ set appropriately, internal interfaces (packages,
modules, classes, functions, attributes or other names) should still
be prefixed with a single leading underscore.

Python: 'Private' module in a package

I have a package mypack with modules mod_a and mod_b in it. I intend the package itself and mod_a to be imported freely:
import mypack
import mypack.mod_a
However, I'd like to keep mod_b for the exclusive use of mypack. That's because it exists merely to organize the latter's internal code.
My first question is, is it an accepted practice in Python programming to have 'private' modules like this?
If yes, my second question is, what is the best way to convey this intention to the client? Do I prefix the name with an underscore (i.e. _mod_b)? Or would it be a good idea to declare a sub-package private and place all such modules there?

I prefix private modules with an underscore to communicate the intent to the user. In your case, this would be mypack._mod_b
This is in the same spirit (but not completely analogous to) the PEP8 recommendation to name C-extension modules with a leading underscore when it’s wrapped by a Python module; i.e., _socket and socket.

The solution I've settled on is to create a sub-package 'private' and place all the modules I wish to hide in there. This way they stay stowed away, leaving mypack's module list cleaner and easier to parse.
To me, this doesn't look unpythonic either.

While there is no explicit private keyword, there is a convention that private functions start with a single underscore; a double leading underscore on a class attribute additionally triggers name mangling, so others cannot easily call it from outside the class. See the following from PEP 8:
- _single_leading_underscore: weak "internal use" indicator. E.g. "from M
import *" does not import objects whose name starts with an underscore.
- single_trailing_underscore_: used by convention to avoid conflicts with
Python keyword, e.g.
Tkinter.Toplevel(master, class_='ClassName')
- __double_leading_underscore: when naming a class attribute, invokes name
mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).
- __double_leading_and_trailing_underscore__: "magic" objects or
attributes that live in user-controlled namespaces. E.g. __init__,
__import__ or __file__. Never invent such names; only use them
as documented.
To make an entire module private, don't include it in the __init__.py file.

One thing to be aware of in this scenario is indirect imports. If in mypack you write
    from mypack._mod_b import foo
    foo()
then a user can
    from mypack import foo
    foo()
and be none the wiser. I recommend importing as
    from mypack import _mod_b
    _mod_b.foo()
so that a user will immediately see a red flag when they try to
    from mypack import _mod_b
As for actual directory structure, you could even extend Jeremy's answer into a _package_of_this_kind package, where anything inside can have whatever 'access modifiers' you like: users will know there be dragons.
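The indirect-import leak described above can be reproduced with a temporary package. This is a sketch under assumed names: the package is called mypack_demo here to avoid clashing with anything real, and foo's return value is invented.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "mypack_demo")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "_mod_b.py"), "w") as f:
        f.write("def foo():\n    return 'foo from _mod_b'\n")
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        # Importing the name directly re-exports it from the package.
        f.write("from mypack_demo._mod_b import foo\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        mypack_demo = importlib.import_module("mypack_demo")
    finally:
        sys.path.remove(root)

# foo is now an ordinary attribute of the package itself.
leaked = hasattr(mypack_demo, "foo")
leaked_result = mypack_demo.foo()
```

Because foo ends up in the package namespace, a client can import it from mypack_demo without ever seeing the underscore-prefixed module name.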

Python doesn't strictly know or support "private" or "protected" methods or classes. There's a convention that methods prefixed with a single underscore aren't part of an official API, but I wouldn't do this on classes or files - it's ugly.
If someone really needs to subclass or access mod_b, why prevent him/her from doing so? You can always supply a preferred API in your documentation and document in your module that you shouldn't access it directly and should use mypack instead.
