The setuptools documentation is very explicit about adding code to __init__.py files from namespaces:
You must NOT include any other code and data in a namespace package's __init__.py. Even though it may appear to work during development, or when projects are installed as .egg files, it will not work when the projects are installed using "system" packaging tools -- in such cases the __init__.py files will not be installed, let alone executed.
Yet, I do not understand what these "system" packaging tools are. What are they? How could I reproduce this situation where the __init__.py files are gone?
#Anzel's comment looked like a good answer, and I'd say PEP-420 confirms that. In its Rationale section, we read:
Namespace packages are designed to support being split across multiple directories (and hence found via multiple sys.path entries). In this configuration, it doesn't matter if multiple portions all provide an __init__.py file, so long as each portion correctly initializes the namespace package. However, Linux distribution vendors (amongst others) prefer to combine the separate portions and install them all into the same file system directory. This creates a potential for conflict, as the portions are now attempting to provide the same file on the target system - something that is not allowed by many package managers. Allowing implicit namespace packages means that the requirement to provide an __init__.py file can be dropped completely, and affected portions can be installed into a common directory or split across multiple directories as distributions see fit.
So yes, we cannot add any more code to our __init__.py files because OS package managers (and others) would prefer to merge them into only one directory tree.
Related
I'm trying to write a plugin for Sublime Text 3.
I have to use several third party packages in my code. I have managed to get the code working by manually copying the packages into /home/user/.config/sublime-text-3/Packages/User/, then I used relative imports to get to the needed code. How would I distribute the plugin to the end users? Telling them to copy the needed dependencies to the appropriate location is certainly not the way to go. How are 3rd party modules supposed to be used properly with Sublime Text plugins? I can't find any documentation online; all I see is the recommendation to put the modules in the folder.
Sublime uses it's own embedded Python interpreter (currently Python 3.3.6 although the next version will also support Python 3.8 as well) and as such it will completely ignore any version of Python that you may or may not have installed on your system, as well as any libraries that are installed for that version.
For that reason, if you want to use external modules (hereafter dependencies) you need to do extra work. There are a variety of ways to accomplish this, each with their own set of pros and cons.
The following lists the various ways that you can achieve this; all of them require a bit of an understanding about how modules work in Python in order to understand what's going on. By and large except for the paths involved there's nothing too "Sublime Text" about the mechanisms at play.
NOTE: The below is accurate as of the time of this answer. However there are plans for Package Control to change how it works with dependencies that are forthcoming that may change some aspect of this.
This is related to the upcoming version of Sublime supporting multiple versions of Python (and the manner in which it supports them) which the current Package Control mechanism does not support.
It's unclear at the moment if the change will bring a new way to specify dependencies or if only the inner workings of how the dependencies are installed will change. The existing mechanism may remain in place regardless just for backwards compatibility, however.
All roads to accessing a Python dependency from a Sublime plugin involve putting the code for it in a place where the Python interpreter is going to look for it. This is similar to how standard Python would do things, except that locations that are checked are contained within the area that Sublime uses to store your configuration (referred to as the Data directory) and instead of a standalone Python interpreter, Python is running in the plugin host.
Populate the library into the Lib folder
Since version 3.0 (build 3143), Sublime will create a folder named Lib in the data directory and inside of it a directory based on the name of the Python version. If you use Preferences > Browse Packages and go up one folder level, you'll see Lib, and inside of it a folder named e.g. python3.3 (or if you're using a newer build, python33 and python38).
Those directories are directly on the Python sys.path by default, so anything placed inside of them will be immediately available to any plugin just as a normal Python library (or any of those built in) would be. You could consider these folders to be something akin to the site-packages folder in standard Python.
So, any method by which you could install a standard Python library can be used so long as the result is files ending up in this folder. You could for example install a library via pip and then manually copy the files to that location from site-packages, manually install from sources, etc.
Lib/python3.3/
|-- librarya
| `-- file1.py
|-- libraryb
| `-- file2.py
`-- singlefile.py
Version restrictions apply here; the dependency that you want to use must support the version of Python that Sublime is using, or it won't work. This is particularly important for Python libraries with a native component (e.g. a .dll, .so or .dylib), which may require hand compiling the code.
This method is not automatic; you would need to do it to use your package locally, and anyone that wants to use your package would also need to do it as well. Since Sublime is currently using an older version of Python, it can be problematic to obtain a correct version of libraries as well.
In the future, Package Control will install dependencies in this location (Will added the folder specifically for this purpose during the run up to version 3.0), but as of the time I'm writing this answer that is not currently the case.
Vendor your dependencies directly inside of your own package
The Packages folder is on the sys.path by default as well; this is how Sublime finds and loads packages. This is true of both the physical Packages folder, as well as the "virtual" packages folder that contains the contents of sublime-package files.
For example, one can access the class that provides the exec command via:
from Default.exec import ExecCommand
This will work even though the exec.py file is actually stored in Default.sublime-package in the Sublime text install folder and not physically present in the Packages folder.
As a result of this, you can vendor any dependencies that you require directly inside of your own package. Here this could be the User package or any other package that you're creating.
It's important to note that Sublime will treat any Python file in the top level of a package as a plugin and try to load it as one. Hence it's important that if you go this route you create a sub-folder in your package and put the library in there.
MyPackage/
|-- alibrary
| `-- code.py
`-- my_plugin.py
With this structure, you can access the module directly:
import MyPackage.alibrary
from MyPackage.alibrary import someSymbol
Not all Python modules lend themselves to this method directly without modification; some code changes in the dependency may be required in order to allow different parts of the library to see other parts of itself, for example if it doesn't use relative import to get at sibling files. License restrictions may also get in the way of this as well, depending on the library that you're using.
On the other hand, this directly locks the version of the library that you're using to exactly the version that you tested with, which ensures that you won't be in for any undue surprises further on down the line.
Using this method, anything you do to distribute your package will automatically also distribute the vendored library that's contained inside. So if you're distributing by Package Control, you don't need to do anything special and it will Just Work™.
Modify the sys.path to point to a custom location
The Python that's embedded into Sublime is still standard Python, so if desired you can manually manipulate the sys.path that describes what folders to look for packages in so that it will look in a place of your choosing in addition to the standard locations that Sublime sets up automatically.
This is generally not a good idea since if done incorrectly things can go pear shaped quickly. It also still requires you to manually install libraries somewhere yourself first, and in that case you're better off using the Lib folder as outlined above, which is already on the sys.path.
I would consider this method an advanced solution and one you might use for testing purposes during development but otherwise not something that would be user facing. If you plan to distribute your package via Package Control, the review of your package would likely kick back a manipulation of the sys.path with a request to use another method.
Use Package Control's Dependency system (and the dependency exists)
Package control contains a dependency mechanism that uses a combination of the two prior methods to provide a way to install a dependency automatically. There is a list of available dependencies as well, though the list may not be complete.
If the dependency that you're interested in using is already available, you're good to go. There are two different ways to go about declaring that you need one or more dependencies on your package.
NOTE: Package Control doesn't currently support dependencies of dependencies; if a dependency requires that another library also be installed, you need to explicitly mention them both yourself.
The first involves adding a dependencies key to the entry for your package in the package control channel file. This is a step that you'd take at the point where you're adding your package to Package Control, which is something that's outside the scope of this answer.
While you're developing your package (or if you decide that you don't want to distribute your package via Package Control when you're done), then you can instead add a dependencies.json file into the root of your package (an example dependencies.json file is available to illustrate this).
Once you do that, you can choose Package Control: Satisfy Dependencies from the command Palette to have Package Control download and install the dependency for you (if needed).
This step is automatic if your package is being distributed and installed by Package Control; otherwise you need to tell your users to take this step once they install the package.
Use Package Control's Dependency system (but the dependency does not exist)
The method that Package Control uses to install dependencies is, as outlined at the top of the question subject to change at some point in the (possibly near) future. This may affect the instructions here. The overall mechanism may remain the same as far as setup is concerned, with only the locations of the installation changing, but that remains to be seen currently.
Package Control installs dependencies via a special combination of vendoring and also manipulation of the sys.path to allow things to be found. In order to do so, it requires that you lay out your dependency in a particular way and provide some extra metadata as well.
The layout for the package that contains the dependency when you're building it would have a structure similar to the following:
Packages/my_dependency/
├── .sublime-dependency
└── prefix
└── my_dependency
└── file.py
Package Control installs a dependency as a Package, and since Sublime treats every Python file in the root of a package as a plugin, the code for the dependency is not kept in the top level of the package. As seen above, the actual content of the dependency is stored inside of the folder labeled as prefix above (more on that in a second).
When the dependency is installed, Package Control adds an entry to it's special 0_package_control_loader package that causes the prefix folder to be added to the sys.path, which makes everything inside of it available to import statements as normal. This is why there's an inherent duplication of the name of the library (my_dependency in this example).
Regarding the prefix folder, this is not actually named that and instead has a special name that determines what combination of Sublime Text version, platform and architecture the dependency is available on (important for libraries that contain binaries, for example).
The name of the prefix folder actually follows the form {st_version}_{os}_{arch}, {st_version}_{os}, {st_version} or all. {st_version} can be st2 or st3, {os} can be windows, linux or osx and {arch} can be x32 or x64.
Thus you could say that your dependency supports only st3, st3_linux, st3_windows_x64 or any combination thereof. For something with native code you may specify several different versions by having multiple folders, though commonly all is used when the dependency contains pure Python code that will work regardless of the Sublime version, OS or architecture.
In this example, if we assume that the prefix folder is named all because my_dependency is pure Python, then the result of installing this dependency would be that Packages/my_dependency/all would be added to the sys.path, meaning that if you import my_dependency you're getting the code from inside of that folder.
During development (or if you don't want to distribute your dependency via Package Control), you create a .sublime-dependency file in the root of the package as shown above. This should be a text file with a single line that contains a 2 digit number (e.g. 01 or 50). This controls in what order each installed dependency will get added to the sys.path. You'd typically pick a lower number if your dependency has no other dependencies and a higher value if it does (so that it gets injected after those).
Once you have the initial dependency laid out in the correct format in the Packages folder, you would use the command Package Control: Install Local Dependency from the Command Palette, and then select the name of your dependency.
This causes Package Control to "install" the dependency (i.e. update the 0_package_control_loader package) to make the dependency active. This step would normally be taken by Package Control automatically when it installs a dependency for the first time, so if you are also manually distributing your dependency you need to provide instructions to take this step.
I have a python package built from source code in /Document/pythonpackage directory
/Document/pythonpackage/> python setup.py install
This creates a folder in site-packages directory of python
import pythonpackage
print(pythonpackage.__file__)
>/anaconda3/lib/python3.7/site-packages/pythonpackage-x86_64.egg/pythonpackage/__init__.py
I am running a script on multiple environments so the only path I know I will have is pythonpackage.__file__. However Document/pythonpackage has some data that is not in site-packages is there a way to automatically find the path to /Document/pythonpackage given that you only have access to the module in python?
working like that is discouraged. it's generally assumed that after installing a package the user can remove the installation directory (as most automated package managers would do). instead you'd make sure your setup.py copied any data files over into the relevant places, and then your code would pick them up from there.
assuming you're using the standard setuptools, you can see the docs on Including Data Files, which says at the bottom:
In summary, the three options allow you to:
include_package_data
Accept all data files and directories matched by MANIFEST.in.
package_data
Specify additional patterns to match files that may or may not be matched by MANIFEST.in or found in source control.
exclude_package_data
Specify patterns for data files and directories that should not be included when a package is installed, even if they would otherwise have been included due to the use of the preceding options.
and then says:
Typically, existing programs manipulate a package’s __file__ attribute in order to find the location of data files. However, this manipulation isn’t compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs. It is strongly recommended that, if you are using data files, you should use the ResourceManager API of pkg_resources to access them
Not sure, but you could create a repository for your module and use pip to install it. The egg folder would then have a file called PKG-INFO which would contain the url to the repository you imported your module from.
In my office we have a quite complex directory structure when it comes to our code.
One of the things we have is a libs module to drop "common" things used by other parts of our big application (or set of applications... that are all living under a common directory).
The code in that libs/ directory requires certain packages installed in order for it to work. In said libs/ directory we have a requirements.txt file that supposedly lists the dependencies required for the things (things being Python code) in it to work. We have been filling that requirements.txt file pretty manually, tracking that "if this .py file uses this module, we should add it to the requirements file" so it's almost certain that by now we have forgotten adding some required modules.
Because of the complex structure we have (some parts use pipenv, some other have their own requirements.txt...) is very hard knowing whether a required module is going to end up installed or not.
So I would like to make sure that this libs/ directory (cough, cough... module ) has all its dependencies listed in its libs/requirements.txt.
Is that possible? Ideally it'd be "run this command passing /libs/ as an argument, it'll scan the directory and tell you what packages are needed by the py(s) found in it"
Thank you in advance.
Unfortunately, python does not know whether its dependencies are satisfied until runtime. requirements.txt is just a helper file for pip and similar tools, and you have to update it manually.
That said, you could
use the os module to recursively get a list of all *.py files in the folder
parse each one of them for lines having the format import aaa.bbb or from aaa import bbb
keep a set of the imports
However, even in that case, the name of the imported module is not the same as the name you need to pass to pip (eg, import yaml requires pyyaml in requirements.txt), but at least it could be a hint of what's missing.
I have two python projects, one includes useful packages for file manipulation and such. They are usefull because they can be reused in any other kind of project. The second is one of these projects, requiring the use of my useful packages. Here is my projects' file structure:
Python-projects/
Useful/
package_parsing/
package_file_manipulation/
package_blabla/
Super-Application/
pkg1/
__init__.py
module1.py
module2.py
pkg2/
setup.py
MANIFEST.in
README.rst
First of, I would like to use the Useful/package_parsing package in my module Super-Application/pkg1/module1.py. Is there a more convenient way to do it other than copying the package_parsing in Super-Application project?
Depending in the first answer, that is if there is a way to link a module from a different project, how could I include such external module in a release package of my Super-Application project? I am not sure that making use of install_requires in the setup.py will do.
My main idea here is not to duplicate the Useful/package_parsing package in all of my other development projects, especially when I would like to do modifications to this useful package. I wouldn't like to update all the outdated copies in each project.
=============
EDIT 1
It appears the first part of my question cna be dealt with appending the path:
import sys
sys.path.insert(0, path/to/Useful/package_parsing)
Moreover I can simply check the available paths using:
for p in sys.path:
print p
Now for the second part, how could I include such external module in a release package, possibly using the setup.py installation file?
I'm working toward adopting Python as part of my team's development tool suite. With the other languages/tools we use, we develop many reusable functions and classes that are specific to the work we do. This standardizes the way we do things and saves a lot of wheel re-inventing.
I can't seem to find any examples of how this is usually handled with Python. Right now I have a development folder on a local drive, with multiple project folders below that, and an additional "common" folder containing packages and modules with re-usable classes and functions. These "common" modules are imported by modules within multiple projects.
Development/
Common/
Package_a/
Package_b/
Project1/
Package1_1/
Package1_2/
Project2/
Package2_1/
Package2_2/
In trying to learn how to distribute a Python application, it seems that there is an assumption that all referenced packages are below the top-level project folder, not collateral to it. The thought also occurred to me that perhaps the correct approach is to develop common/framework modules in a separate project, and once tested, deploy those to each developer's environment by installing to the site-packages folder. However, that also raises questions re distribution.
Can anyone shed light on this, or point me to a resource that discusses this issue?
If you have common code that you want to share across multiple projects, it may be worth thinking about storing this code in a physically separate project, which is then imported as a dependency into your other projects. This is easily achieved if you host your common code project in github or bitbucket, where you can use pip to install it in any other project. This approach not only helps you to easily share common code across multiple projects, but it also helps protect you from inadvertently creating bad dependencies (i.e. those directed from your common code to your non common code).
The link below provides a good introduction to using pip and virtualenv to manage dependencies, definitely worth a read if you and your team are fairly new to working with python as this is a very common toolchain used for just this kind of problem:
http://dabapps.com/blog/introduction-to-pip-and-virtualenv-python/
And the link below shows you how to pull in dependencies from github using pip:
How to use Python Pip install software, to pull packages from Github?
The must-read-first on this kind of stuff is here:
What is the best project structure for a Python application?
in case you haven't seen it (and follow the link in the second answer).
The key is that each major package be importable as if "." was the top level directory, which means that it will also work correctly when installed in a site-packages. What this implies is that major packages should all be flat within the top directory, as in:
myproject-0.1/
myproject/
framework/
packageA/
sub_package_in_A/
module.py
packageB/
...
Then both you (within your other packages) and your users can import as:
import myproject
import packageA.sub_package_in_A.module
etc
Which means you should think hard about #MattAnderson's comment, but if you want it to appear as a separately-distributable package, it needs to be in the top directory.
Note this doesn't stop you (or your users) from doing an:
import packageA.sub_package_in_A as sub_package_in_A
but it does stop you from allowing:
import sub_package_in_A
directly.
...it seems that there is an assumption that all referenced packages
are below the top-level project folder, not collateral to it.
That's mainly because the current working directory is the first entry in sys.path by default, which makes it very convenient to import modules and packages below that directory.
If you remove it, you can't even import stuff from the current working directory...
$ touch foo.py
$ python
>>> import sys
>>> del sys.path[0]
>>> import foo
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named foo
The thought also occurred to me that perhaps the correct approach is
to develop common/framework modules in a separate project, and once
tested, deploy those to each developer's environment by installing to
the site-packages folder.
It's not really a major issue for development. If you're using version control, and all developers check out the source tree in the same structure, you can easily employ relative path hacks to ensure the code works correctly without having to mess around with environment variables or symbolic links.
However, that also raises questions re distribution.
This is where things can get a bit more complicated, but only if you're planning to release libraries independently of the projects which use them, and/or having multiple project installers share the same libraries. It that's the case, take a look at distutils.
If not, you can simply employ the same relative path hacks used in development to ensure you project works "out of the box".
I think that this is the best reference for creating a distributable python package:
link removed as it leads to a hacked site.
also, don't feel that you need to nest everything under a single directory. You can do things like
platform/
core/
coremodule
api/
apimodule
and then do things like from platform.core import coremodule, etc.