Use cases for __init__.py in Python 3.3+

Now that __init__.py is no longer required to make a directory recognized as a package, is it best practice to avoid them entirely if possible? Or are there still well-accepted use cases for __init__.py in Python 3.3+?
From what I understand, __init__.py files were very commonly used to run code at package import time (for example, to encapsulate the internal file structure of the package or to perform some initialization steps). Are these use cases still relevant with Python 3.3+?

There's a very good discussion of this in this answer, and you should probably be familiar with PEP 420 to clarify the difference between regular packages (which use __init__.py) and namespace packages (which don't).
What I offer by way of answer is a combination of reading, references, and opinion. No claims to being "canonical" or "pythonic" here.
Are [initialization] use cases still relevant with python 3.3+?
Yes. Take your example as a use case, where the package author wants to bring several things into the root package namespace so the user doesn't have to concern themselves with its internal structure.
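For instance, a package's __init__.py can re-export names from internal modules so that users import everything from the package root; a minimal sketch, where the package, module, and class names are all hypothetical:

# mypackage/__init__.py  (hypothetical package)
# Re-export the public API so users can write `from mypackage import Client`
# without knowing about the internal module layout.
from ._client import Client
from ._helpers import configure

__all__ = ['Client', 'configure']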
Another case is creating a hierarchy of modules. That reference (O'Reilly) actually says:
The purpose of the __init__.py files is to include optional initialization code that runs as different levels of a package are encountered.
They do consider namespace packages in that discussion, but continue:
All things being equal, include the __init__.py files if you’re just starting out with the creation of a new package.
So, for your second question,
is it best practice to avoid __init__.py entirely if possible?
No, unless your intent is to create a namespace package rather than a regular package, in which case you must not use __init__.py.
Why might you want that? The O'Reilly reference has the clearest example I've seen about why namespace packages are cool, which is being able to collapse namespaces from separate, independently-maintained packages:
foo-package/
    spam/
        blah.py
bar-package/
    spam/
        grok.py
Which allows
>>> import sys
>>> sys.path.extend(['foo-package', 'bar-package'])
>>> import spam.blah
>>> import spam.grok
>>>
So anyone can extend the namespace with their own code. Cool.

Related

Python modules confusion

I got a bit confused about how Python modules work when I started looking at the PyMySQL repository, see here: https://github.com/PyMySQL/PyMySQL?files=1
1) Why is there no pymysql.py file, given that it is imported like import pymysql? Isn't such a file required?
2) I cannot find the method connect, used like pymysql.connect(...), anywhere. Is it possible to rename exported methods somehow?
There's a directory pymysql there. A directory can also be imported as a module*, with the advantage that it can contain submodules. Classically, there's an __init__.py file in the directory that controls what's in the top-level pymysql.* namespace.
So, the connect method you're missing will either be defined directly in pymysql/__init__.py, or defined in one of its siblings in that directory, and then imported from there by pymysql/__init__.py.
*Strictly speaking, a directory that you import like a module is really called a "package". I like to avoid that term where I can, because it's overloaded: what you install with pip is also called a "package", and a single pip-installable package can actually contain multiple importable packages.
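To illustrate the second question, the re-export in pymysql/__init__.py could look something like the following. This is a sketch of the pattern, not the actual PyMySQL source; the connections module and Connection class are assumptions:

# pymysql/__init__.py  (illustrative sketch, not the real file)
# Pull a name defined in a sibling module up into the top-level namespace,
# so users can call pymysql.connect(...) without knowing the file layout.
from .connections import Connection

def connect(*args, **kwargs):
    return Connection(*args, **kwargs)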
See What is __init__.py for? and the official docs.

Python 2 relative imports: two different packages needs a common class

I have been thinking about structuring a Python 2.7 project this way: it would be composed of two independent parts requiring a common class (module) file in a third package:
SomeRootFolder/Package1Folder/manyPythonModuleFiles.py
SomeRootFolder/Package2Folder/manyPythonModuleFiles.py
SomeRootFolder/SharedPackageFolder/OneCommonClassNeedsToBeShared.py
What I want to do is import the common class from the shared package in both packages. The first two packages do not need to interact with each other, but both need that one class. The Python programs might be run with the console opened from within the two package folders themselves, such as:
cd Package1Folder
python SomeMainFile.py
If it is easier, the Python call could be something like python Package1Folder/SomeMainFile.py, but I need to plan for this.
Could you show how I could do relative imports from package 1 or 2 to a file in the third, shared package? Do I need an __init__.py file in the SomeRootFolder folder? I am always confused by relative imports and by the import standards and syntax that differ between Python 2 and 3. Also, could you confirm that this is an acceptable way to proceed? Any other ideas?
Thanks all!
If you want to use relative imports, you need an __init__.py in the SharedPackageFolder folder (and in Package1Folder, Package2Folder, and SomeRootFolder, so that the whole tree forms one package), and then you can import OneCommonClassNeedsToBeShared.py like this:
from ..SharedPackageFolder import OneCommonClassNeedsToBeShared
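For that relative import to resolve, the importing file also has to be run as a module within the package rather than as a top-level script. A minimal sketch of the layout and invocation, assuming the names from the question:

SomeRootFolder/
    __init__.py
    Package1Folder/
        __init__.py
        SomeMainFile.py
    SharedPackageFolder/
        __init__.py
        OneCommonClassNeedsToBeShared.py

Then, from the directory containing SomeRootFolder:

python -m SomeRootFolder.Package1Folder.SomeMainFile

(If you want to keep running python SomeMainFile.py directly from inside Package1Folder, relative imports won't work; use the absolute-import approach described below.)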
See more details in the Rationale for Relative Imports section of PEP 328:
With the shift to absolute imports, the question arose whether relative imports should be allowed at all. Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
You can also use absolute imports; relative imports are no longer strongly discouraged, but using absolute_import is strongly suggested in some cases.
You need to make sure SomeRootFolder is on your PYTHONPATH (or mark that folder as the sources root). This makes it easier to import packages or scripts in a large project, but you should still be careful with absolute imports:
from SharedPackageFolder import OneCommonClassNeedsToBeShared
Absolute imports. From PEP 8:
Relative imports for intra-package imports are highly discouraged. Always use the absolute package path for all imports. Even now that PEP 328 [7] is fully implemented in Python 2.5, its style of explicit relative imports is actively discouraged; absolute imports are more portable and usually more readable.
(Note that this wording is from an older revision of PEP 8; the current version treats explicit relative imports as acceptable.)
By the way, relative imports in Python 3 might raise SystemError; have a look at the question Relative imports in Python 3. vaultah offers some solutions there, and they might be helpful.
Hope this helps.

Python 3 import from subdir

My project has to be extensible; I have a lot of scripts with the same interface that look things up online. Before, I was using __import__, but that does not let me put my 'plugins' in a dedicated directory:
root/
    main.py
    plugins/
        [...]
So my question is: is there a way to individually import modules from that subdirectory? I'm guessing importlib, but I'm so lost in how the Python module loading process works... What I want to do is something like this:
for pluginname in plugins:
    plugin = somekindofimport("plugins/{name}".format(name=pluginname))
    plugin.unifiedinterface()
Also, as a side question: is the way I'm trying to achieve extensibility a good one?
I'm on Python 3.3.
Stop thinking in terms of pathnames and start thinking in terms of packages. Read Packages in the tutorial, and if you want more detail see The import system.
But the basic idea is this:
Create a file named plugins/__init__.py. It can be empty; that's enough to turn plugins into a package, which means you can import modules from that package with:
import plugins.plugin
So, how do you do this dynamically? That's what importlib is for. (You can also use __import__ here, but it's less flexible, and less readable in non-trivial cases, so unless you need pre-3.3 compatibility, don't.)
plugin = importlib.import_module('plugins.{name}'.format(name=pluginname))
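Putting that together with the loop from the question, a minimal sketch (the plugin names and the unifiedinterface entry point are assumptions carried over from the question):

import importlib

plugin_names = ['lookup_web', 'lookup_cache']  # hypothetical plugin modules

for pluginname in plugin_names:
    # Imports plugins/lookup_web.py as the module plugins.lookup_web, etc.
    plugin = importlib.import_module('plugins.{name}'.format(name=pluginname))
    plugin.unifiedinterface()  # each plugin exposes the same entry point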
It would probably be cleaner to import plugins to get the package, and then use relative imports from within that package, as shown in the examples in the import_module docs.
This also means Python takes care of the .pyc creation and caching, etc.
And it means that you can later expand plugins to be a "namespace package", which can be split across multiple directories like /usr/share/myapp/plugins for stock plugins, /etc/myapp/plugins for site plugins and ~/myapp/plugins for user-specific plugins.
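For instance, once plugins/__init__.py is removed, the pieces merge as long as each parent directory is on sys.path; a sketch, using the three locations named above:

import os
import sys

# Each *parent* directory goes on sys.path; the plugins/ directories under
# them (with no __init__.py) then merge into a single namespace package.
sys.path.extend([
    '/usr/share/myapp',
    '/etc/myapp',
    os.path.expanduser('~/myapp'),
])

import plugins  # sees modules from all three plugins/ directories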
If you really, really want to import from a directory that isn't a package, you can create a module loader and use it, but that's a whole lot of work for no actual benefit. (It's actually not that hard in 3.3 (SourceLoader and friends will do most of the work for you), but you will find almost no examples out there to guide you; instead, you'll find examples of the 2.6-3.2 way, or the 2.0-2.5 way, both of which are hard.) Plus, it means that if someone creates a plugin named, say, gzip, you can end up blocking the stdlib gzip module with the plugin. (That's especially fun if the gzip plugin tries to use the gzip stdlib module, as it likely will…) If the plugin ends up being named plugins.gzip, there's no problem.
Also, as a side question: is the way I'm trying to achieve extensibility a good one?
As long as you only want to support 3.3+, yes, I think this is a great solution.
Before 3.3, using a package for plugins was a lot more problematic. People have come up with a variety of different plugin systems—in one case going so far as to dynamically create module objects and execfile into them. If you need to deal with that, I would suggest looking at existing Python apps with plugins (e.g., MusicBrainz Picard) to get different ideas.

What does `__import__('pkg_resources').declare_namespace(__name__)` do?

In the __init__.py files of some packages I saw this single line:
__import__('pkg_resources').declare_namespace(__name__)
What does it do, and why do people use it? I suppose it's related to dynamic importing and creating namespaces at runtime.
It boils down to two things:
__import__ is a Python function that imports a package using a string as the name of the package. It returns the module object for the imported package, so foo = __import__('bar') imports a package named bar and binds the resulting module object to the local variable foo.
From setuptools' pkg_resources documentation, declare_namespace() "Declare[s] that the dotted package name name is a "namespace package" whose contained packages and modules may be spread across multiple distributions."
So __import__('pkg_resources').declare_namespace(__name__) imports the pkg_resources package and immediately calls the declare_namespace function on it, without binding the module to a name (the __import__ function is likely used rather than the import statement so that there is no extra symbol left over named pkg_resources). If this code were in my_namespace/__init__.py, then __name__ is 'my_namespace' and this module would be included in the my_namespace namespace package.
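In other words, the one-liner is roughly equivalent to the following, except that the plain import statement leaves an extra name bound in the package namespace:

# Roughly equivalent to the one-liner, but note that this version leaves
# pkg_resources bound as a name in the package's namespace.
import pkg_resources
pkg_resources.declare_namespace(__name__)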
See the setuptools documentation for more details.
See this question for discussion on the older mechanism for achieving the same effect.
See PEP 420 for the standardized mechanism that provides similar functionality beginning with Python 3.3.
This is a way to declare the so called "namespace packages" in Python.
What are these and what is the problem:
Imagine you distribute a software product which has a lot of functionality, and not all people want all of it, so you split it into pieces and ship them as optional plugins.
You want people to be able to do
import your_project.plugins.plugin1
import your_project.plugins.plugin2
...
Which is fine if your directory structure is exactly as above, namely
your_project/
    __init__.py
    plugins/
        __init__.py
        plugin1.py
        plugin2.py
But what if you ship those two plugins as separate Python packages, so they are located in two different directories? Then you might want to put __import__('pkg_resources').declare_namespace(__name__) in each package's __init__.py so that Python knows those packages are part of a bigger "namespace package", in our case your_project.plugins.
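The split layout might then look something like this; a sketch, where the dist1 and dist2 directory names are hypothetical:

dist1/
    your_project/
        __init__.py        # contains the declare_namespace line
        plugins/
            __init__.py    # contains the declare_namespace line
            plugin1.py
dist2/
    your_project/
        __init__.py        # same declare_namespace line
        plugins/
            __init__.py    # same declare_namespace line
            plugin2.py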
Please refer to the documentation for more info.

Python module layout

I'm just getting to the point in my Python projects where I need to start using multiple packages, and I'm a little confused about exactly how everything is supposed to work together. What exactly should go into the __init__.py of the package? Some projects just have blank inits, with all of their code in modules in that package. Other projects implement what seems to be the majority of the package's classes and functions inside the __init__.py.
Is there a document or style guide or something that describes what the python authors had in mind for the use of packages and the __init__ file and such?
Edit:
I know the point of having the __init__.py file, in the simplest sense, is that it makes a folder a package. But why would I put a function there instead of in a module in that same folder (package)?
__init__.py can be empty, but what it really does is make sure Python treats your directories correctly. Beyond that, it can provide any initialization you might need when your package is imported (configuring the environment or something along those lines), or define __all__ so that Python knows what to do when someone uses from package import *.
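A small sketch of an __init__.py doing both jobs; the package and module names are hypothetical:

# mypackage/__init__.py  (hypothetical example)
import logging

# Initialization that runs once, when the package is first imported.
logging.getLogger('mypackage').addHandler(logging.NullHandler())

# Controls what `from mypackage import *` pulls in.
__all__ = ['core', 'utils']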
Most of what you need to know is described in the docs on Packages. Dive Into Python also has a piece on packaging.
You already know, I guess, that __init__.py files are required to make Python treat the directories as containing packages.
In the simplest case, __init__.py can remain empty. You can also execute initialization code for the package, or set the __all__ variable.
Edit:
When you do from package import *, the variable __all__ controls which submodules get imported; a plain from package import item does not depend on it.
See: http://docs.python.org/tutorial/modules.html
