My project has to be extensible: I have a lot of scripts with the same interface that look things up online. Until now I was using __import__, but that does not let me put my 'plugins' in a dedicated directory:
root/
    main.py
    plugins/
        [...]
So my question is: is there a way to individually import modules from that subdirectory? I'm guessing importlib, but I'm lost as to how the Python module loading process works. What I want to do is something like this:
for pluginname in plugins:
    plugin = somekindofimport("plugins/{name}".format(name=pluginname))
    plugin.unifiedinterface()
Also, as a side question: is the way I am trying to achieve extensibility a good one?
I'm on Python 3.3.
Stop thinking in terms of pathnames and start thinking in terms of packages. Read Packages in the tutorial, and if you want more detail see The import system.
But the basic idea is this:
Create a file named plugins/__init__.py. It can be empty; that's enough to turn plugins into a package, which means you can import modules from that package with:
import plugins.plugin
So, how do you do this dynamically? That's what importlib is for. (You can also use __import__ here, but it's less flexible, and less readable in non-trivial cases, so unless you need pre-3.3 compatibility, don't.)
plugin = importlib.import_module('plugins.{name}'.format(name=pluginname))
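To make the dynamic import concrete, here is a minimal sketch that imports every module found in a package and returns them by name. The helper name load_plugins is hypothetical, and using pkgutil.iter_modules to enumerate the package is an assumption of this sketch rather than something the question requires; a hard-coded list of plugin names would work just as well.

```python
import importlib
import pkgutil


def load_plugins(package_name):
    """Import every module inside the named package and return them by name.

    `load_plugins` is a hypothetical helper, not part of any library.
    """
    package = importlib.import_module(package_name)
    plugins = {}
    # package.__path__ lists the directories that make up the package.
    for _finder, name, _is_pkg in pkgutil.iter_modules(package.__path__):
        full_name = "{}.{}".format(package_name, name)
        plugins[name] = importlib.import_module(full_name)
    return plugins
```

With the layout from the question, load_plugins("plugins") would return a dict of modules, and each one could then be driven through the shared interface.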
It would probably be cleaner to import plugins to get the package, and then use relative imports from within that package, as shown in the examples in the import_module docs.
This also means Python takes care of the .pyc creation and caching, etc.
And it means that you can later expand plugins to be a "namespace package", which can be split across multiple directories like /usr/share/myapp/plugins for stock plugins, /etc/myapp/plugins for site plugins and ~/myapp/plugins for user-specific plugins.
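The namespace-package idea can be tried out with two throwaway directories. This is only a sketch: the package name ns_demo_plugins, the module names, and the helper make_plugin_dir are all made up for illustration, and temporary directories stand in for locations such as /usr/share/myapp. Note that the portions of a namespace package must not contain an __init__.py, so moving to this layout means dropping that file.

```python
import importlib
import os
import sys
import tempfile


def make_plugin_dir(package_name, module_name, body):
    """Create <tmp>/<package_name>/<module_name>.py with no __init__.py."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, package_name)
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, module_name + ".py"), "w") as f:
        f.write(body)
    return root


# Two separate directories each hold one portion of the same package.
stock_root = make_plugin_dir("ns_demo_plugins", "stock", "KIND = 'stock'\n")
user_root = make_plugin_dir("ns_demo_plugins", "user", "KIND = 'user'\n")
sys.path[:0] = [stock_root, user_root]
importlib.invalidate_caches()

# Both modules are importable under the single package name.
stock = importlib.import_module("ns_demo_plugins.stock")
user = importlib.import_module("ns_demo_plugins.user")
```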
If you really, really want to import from a directory that isn't a package, you can create a module loader and use it, but that's a whole lot of work for no actual benefit. (It's actually not that hard in 3.3 (SourceLoader and friends will do most of the work for you), but you will find almost no examples out there to guide you; instead, you'll find examples of the 2.6-3.2 way, or the 2.0-2.5 way, both of which are hard.) Plus, it means that if someone creates a plugin named, say, gzip, you can end up blocking the stdlib gzip module with the plugin. (That's especially fun if the gzip plugin tries to use the gzip stdlib module, as it likely will…) If the plugin ends up being named plugins.gzip, there's no problem.
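For completeness, loading a file by path can be sketched as follows. This is not the 3.3 SourceLoader route mentioned above: it relies on importlib.util.spec_from_file_location and importlib.util.module_from_spec, which are available in Python 3.5 and later. The helper name load_module_from_path is hypothetical.

```python
import importlib.util


def load_module_from_path(module_name, path):
    """Load the source file at `path` as a module called `module_name`.

    Hypothetical helper. The module is not registered in sys.modules,
    so it cannot shadow a stdlib module of the same file name.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Choosing a module_name that cannot collide with the standard library (for example a prefixed name rather than plain gzip) avoids the shadowing problem described above.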
Also, as a side question: is the way I am trying to achieve extensibility a good one?
As long as you only want to support 3.3+, yes, I think this is a great solution.
Before 3.3, using a package for plugins was a lot more problematic. People have come up with a variety of different plugin systems—in one case going so far as to dynamically create module objects and execfile into them. If you need to deal with that, I would suggest looking at existing Python apps with plugins (e.g., MusicBrainz Picard) to get different ideas.
I am a little confused about the difference between a package and a library. When I install packages from pypi.org, these packages contain several sub-packages, which in turn contain modules. When I googled the difference between a package and a library, I found this.
That being the case, can a package that contains several sub-packages also be called a library? If not, then what is a library? And what is the difference between a library and a package containing sub-packages?
Library
Most often this refers to the standard library, or to another collection built and used in a similar way. The Python Standard Library is the set of standard, popular and widely used modules, which can be thought of as single-file tools or shortcuts that make things possible or faster. It normally comes with a Python installation. Because of that name, the word is often applied to collections with a similar structure and purpose: a bunch of modules, maybe even packages, grouped together, usually in a list from which they can be downloaded. Generally it is just a set of related files with similar interests. That is the easiest way to describe it.
Module
A module refers to a file. The file contains code, and the name of the file is the name of the module; Python files end with .py. All the file contains is code that, run together, makes something happen by using functions, strings, etc. The main modules you probably see most often are popular because they are special modules that can get information from other files or modules. It is confusing because the module name is simply the file name with the .py dropped. Really it's just code written by somebody that you can use as a shortcut to make something easier or possible.
Package
This term is used quite loosely, although context makes a difference. The most common use in my experience is for multiple modules (or files) that are grouped together. They can be grouped together for a few reasons, and that is where context matters. These are the ways I have noticed the term package(s) used: a group of downloaded, created and/or stored modules. All of those can be true, or only one, but really it is just a file that references other files, which need to be in the correct structure or format, and that entire sum is the package itself, whether installed or included in the Python standard library. A package can contain several modules (.py files) because they depend on each other and may not work correctly, or at all, on their own. Every part (module/file) of a package shares a common goal, and the total sum of all of the parts is the package itself.
Most often in Python, packages are modules, because the package name is the name of the module that is used to connect all the pieces. So you can import a package because it is a module, and that also allows it to call upon other modules, which are not packages because they only perform a certain function or task and don't involve other files. Packages have a goal, and the modules work together to achieve that final goal.
Most confusion comes from a simple file name, or a prefix to a file name, being used as the module name and then again as the package name.
Remember that modules and packages can be installed. Library is usually a generic term for listing, or formatting, a group of modules and packages, much like Python's standard library. A strict hierarchy would not work: APIs do not really belong in it, and if you included them they could be anywhere and everywhere among scripts, modules and packages. The word library, being such a general word that is easily applied to many things, also means an API can sit above or below it. Some modules can be based on other code, and that is the only time I think it would relate to a purely Python-related discussion.
I am writing a compiler in Python, using the PLY (Python Lex-Yacc) library to 'compile' the compiler. The compiler has to go through a lot of rules (the number of just the core rules is eventually going to be a little less than a hundred, and they can be extended). So to keep the different types of rules separate, I made many Python modules in a single modules directory.
To include all the rules, I don't have to include the modules in this directory, but I have to include the rules (implemented as Python functions) into the current namespace. Once they simply exist there, the compiler's input will be properly tokenized, parsed, etc.
Here's what I've read about and tried:
using __import__, getattr, and sys.modules (very raw and in general not preferred)
the importlib library (how do I get everything inside the module?)
a lot of fiddling with __init__.py and just trying to from modules import * which will import everything in the modules as well
But none of these seem entirely satisfactory to me. I can't do precisely what I want to do with any of them. So my question is: how can I import some of the attributes of a Python module in a subdirectory into the running namespace of a top-level module?
Thanks for your attention!
You want to use an existing plugin library like stevedore. It will give you the tools to enumerate files that can be imported, and tools to import those modules.
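If adding a third-party dependency is not an option, the "copy some attributes into the current namespace" part of the question can be sketched with the standard library alone. The helper import_rules is hypothetical, and the default prefixes assume the rules follow PLY's naming convention of t_ for lexer rules and p_ for parser rules; adjust them if the project names things differently.

```python
import importlib


def import_rules(module_name, namespace, prefixes=("t_", "p_")):
    """Copy attributes whose names start with one of `prefixes`
    from the named module into `namespace` (hypothetical helper)."""
    module = importlib.import_module(module_name)
    copied = []
    for name, value in vars(module).items():
        if name.startswith(prefixes):
            namespace[name] = value
            copied.append(name)
    return sorted(copied)
```

Called from the top-level module as import_rules("modules.some_rules", globals()), where modules.some_rules is a placeholder name, this would make the rule functions visible in that module's namespace without pulling in everything else the rule module defines.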
We have a growing library of apps depending on a set of common util modules. We'd like to:
share the same utils codebase between all projects
allow utils to be extended (and fixed!) by developers working on any project
have this be reasonably simple to use for devs (i.e. not a big disruption to workflow)
cross-platform (no diffs for devs on Macs/Win/Linux)
We currently do this "manually", with the utils versioned as part of each app. This has its benefits, but is also quite painful to repeatedly fix bugs across a growing number of codebases.
On the plus side, it's very simple to deal with in terms of workflow - util module is part of each app, so on that side there is zero overhead.
We also considered (fleetingly) using filesystem links or some such, but that is not portable between operating systems.
I understand the implications about release testing and breakage, etc. These are less of a problem than the mismatched utils are at the moment.
You can take advantage of Python's module search path (the paths searched when looking for a module to import).
Thus you can create a separate directory for the utils and keep it in a different repository from the projects that use them. Then include the path to that directory in PYTHONPATH.
This way, if you write import mymodule, Python will eventually find mymodule in the directory containing the utils. So, basically, it will work much as it does for standard Python modules.
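The search-path mechanism can be seen in a small sketch. A temporary directory stands in for the shared utils checkout, and the module name shared_utils_demo and its slugify function are invented for the example. Appending to sys.path at run time has a similar effect to listing the directory in PYTHONPATH, which Python reads at startup to extend the same list.

```python
import importlib
import os
import sys
import tempfile

# A throwaway directory standing in for the shared utils checkout.
utils_dir = tempfile.mkdtemp()
with open(os.path.join(utils_dir, "shared_utils_demo.py"), "w") as f:
    f.write(
        "def slugify(text):\n"
        "    return text.strip().lower().replace(' ', '-')\n"
    )

# Make the directory searchable, then import the module by plain name.
sys.path.append(utils_dir)
importlib.invalidate_caches()
import shared_utils_demo
```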
This way you will have one repository for utils (or separate for each util, if you wish), and separate repositories for other projects, regardless of the version control system you use.
What version control system are you using? If you are using git, take a look at submodules. The idea in this case is that you would be able to keep a single, separate repository with the utils, which would be pulled into the various projects automatically.
I have no direct experience with mercurial, but I believe subrepositories are the equivalent feature.
If you are under SVN... wait... I hope not! :)
I have a few Python packages that I would like to tidy up and publish on PyPI. These packages import a couple of Python modules I've written to augment or simplify certain operations (e.g., reading/writing from CSV files with headers by wrapping csv functions), provide handy data structures, etc. Currently these modules are housed in the top-level directory that holds the code for my projects, and I rely on reaching them by having added that directory to my PYTHONPATH environmental variable. (Less than tidy, I know.)
By creating a separate package for these modules and uploading them on PyPI, I could mark such a package as a dependency for the packages I actually want to distribute. These convenience modules are, however, small and of limited use and interest, such that I don't think they warrant distributing as a separate package on PyPI. On the other hand, I am hesitant to copy these convenience modules (i.e., use cp convenience_module.py projectX/.) into each project directory, as this creates multiple copies of the same file both in the VCS repository housing my Python code and in the different source distribution tarballs I would post to PyPI. Is there an elegant solution to this problem?
You don't say why you're hesitant to 'provide copies'. In general, I think a reasonable approach is to think about how you've set things up for yourself to use the convenience modules. Did you install them in site-packages (or equivalent), or did you just depend on them being in the directory you ran the code from? However you use the modules, is that situation ideal, or is there a way that would be nicer for you?
Start with that, and figure out how to automate it through setup.py, which lets you put things wherever you want on the system (though I strongly discourage abusing this capability).
Whether you distribute them as a tarball or with the package that needs them, you still have to maintain all of the files, so the only real question is whether you intend for those convenience modules to develop their own user communities with their own support requests, etc., or whether they're decidedly intended only for use in support of this other module.
If you intend those modules to be used only for the one module, include them in the package, perhaps in a 'utils' package inside the distribution. Otherwise you're just cluttering the index with things people might think are useful, but are really joined at the hip with something else that drives the changes and maintenance of them.
If you intend those modules to be generic, and intend to maintain them as such, and think they have use outside of supporting this module, distribute them separately.
As far as I know, distributing these small packages via PyPI is the only viable option. Yes, it clutters the index with near-useless packages, but that is something that should be solved by the PyPI maintainers, not by package developers. Another alternative is to use data and functions from the stdlib or from other util packages rather than reinventing the wheel.
Just make sure you describe that utils package as such, or extend them in something more useful for others.
This is something that I think would be very useful. Basically, I'd like there to be a way to edit Python source programmatically without requiring human intervention. There are a couple of things I would like to do with this:
Edit the configuration of Python apps that use source modules for configuration.
Set up a "template" so that I can customize a Python source file on the fly. This way, I can set up a "project" system on an open source app I'm working on and allow certain files to be customized.
I could probably write something that can do this myself, but I can see that opening up a lot of "devil's in the details" type issues. Are there any ways to do this currently, or am I just going to have to bite the bullet and implement it myself?
Python's standard library provides pretty good facilities for working with Python source; note the tokenize and parser modules.
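As a minimal sketch of what tokenize makes possible, the function below rewrites the value of a simple top-level assignment in a piece of Python source. The function name set_simple_assignment is hypothetical, and it assumes the value is a single token (a number, a name, or a one-line string); anything more complex would need the ast module or a dedicated library. Note that the parser module was removed in Python 3.10, so on current versions only tokenize (and ast) remain available.

```python
import io
import tokenize


def set_simple_assignment(source, name, new_value_source):
    """Return `source` with the single-token value of `name = value` replaced.

    Hypothetical helper: only handles top-level assignments whose
    right-hand side is exactly one token on one line.
    """
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    lines = source.splitlines(keepends=True)
    for i in range(len(tokens) - 2):
        target, op, value = tokens[i], tokens[i + 1], tokens[i + 2]
        if (target.type == tokenize.NAME and target.string == name
                and target.start[1] == 0
                and op.type == tokenize.OP and op.string == "="
                and value.start[0] == value.end[0]):
            row = value.start[0] - 1
            line = lines[row]
            # Splice the new text over the old token, keeping the rest
            # of the line (spacing, trailing comments) untouched.
            lines[row] = (line[:value.start[1]] + new_value_source
                          + line[value.end[1]:])
            return "".join(lines)
    raise KeyError(name)
```

Because the edit is a splice at the token's recorded position, comments and formatting elsewhere in the file are preserved.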
Most of these kinds of things can be determined programmatically in Python, using modules like sys and os, and the special __file__ identifier, which tells you where you are in the filesystem path.
It's important to keep in mind that when a module is first imported it will execute everything in the file-scope, which is important for developing system-dependent behaviors. For example, the os module basically determines what operating system you're using on import and then adjusts its implementation accordingly (by importing another module corresponding to Linux, OSX, Windows, etc.).
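A configuration module can use the same trick on a small scale. The sketch below picks a settings directory when the module is imported; the application name myapp and the chosen directories are assumptions made purely for illustration.

```python
import os
import sys

# This code runs once, at import time, so every importer of the module
# sees a CONFIG_DIR that already matches the current platform.
if sys.platform.startswith("win"):
    _base = os.environ.get("APPDATA", os.path.expanduser("~"))
    CONFIG_DIR = os.path.join(_base, "myapp")
elif sys.platform == "darwin":
    CONFIG_DIR = os.path.expanduser("~/Library/Application Support/myapp")
else:
    CONFIG_DIR = os.path.expanduser("~/.config/myapp")
```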
There's a lot of power in this feature and something along these lines is probably what you're looking for. :)
[Edit] I've also used socket.gethostname() in some rare, hackish instances. ;)
I had the same issue; I simply opened the file, did some replacements, and then reloaded the file in the Python interpreter. This works fine and is easy to do.
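A rough sketch of that approach, assuming the configuration lives in an importable module: the module name reload_demo_conf and the GREETING setting are invented for the example, a temporary directory stands in for the real project, and importlib.reload requires Python 3.4 or later.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway configuration module and import it.
conf_dir = tempfile.mkdtemp()
conf_path = os.path.join(conf_dir, "reload_demo_conf.py")
with open(conf_path, "w") as f:
    f.write('GREETING = "hello"\n')

sys.path.insert(0, conf_dir)
importlib.invalidate_caches()
import reload_demo_conf

before = reload_demo_conf.GREETING

# Edit the source text with a plain string replacement...
with open(conf_path) as f:
    text = f.read()
with open(conf_path, "w") as f:
    f.write(text.replace('"hello"', '"goodbye"'))

# ...and ask the interpreter to re-execute the module.
importlib.reload(reload_demo_conf)
after = reload_demo_conf.GREETING
```

Plain text replacement is fragile if the same string appears elsewhere in the file, so it suits small, predictable configuration files best.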
Otherwise AFAIK you have to use some conf objects.