Some background: I am revamping the package we use in my company as the code base for many projects. This is an internal package, and is used/developed by a very limited number of people, so we allow ourselves to sometimes make major changes without a proper deprecation protocol.
The problem at hand: I am removing a few modules from the package as part of the code revamp. These contain some old classes and functions, which are no longer used in new projects. Just in case someone will ever need these objects, perhaps running some old notebook, I would like to move these modules to a different "legacy" package. When someone attempts to import a now-deleted module in the main package, instead of the standard ImportError, I would like to be able to give a useful error message instructing them to import from the legacy package. Is there any way to achieve this? Perhaps using the __init__.py file?
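One possible approach, sketched under assumptions rather than a confirmed solution: leave a tiny stub module behind under the old name whose only job is to raise ImportError with a pointer to the new location. The names mainpkg, old_tools and legacy_pkg are hypothetical, and the temporary directory only stands in for the real package on disk.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical names: "mainpkg" is the main package, "old_tools" a removed
# module, and "legacy_pkg" the package that now holds the old code.
STUB = '''\
raise ImportError(
    "mainpkg.old_tools has moved; import legacy_pkg.old_tools instead."
)
'''


def demo():
    """Build a throwaway package containing the stub and try to import it."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "mainpkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "old_tools.py").write_text(STUB)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            importlib.import_module("mainpkg.old_tools")
        except ImportError as exc:
            return str(exc)  # the message a user of the old import would see
        finally:
            sys.path.remove(root)
            sys.modules.pop("mainpkg", None)
    return None


message = demo()
print(message)
```

The stub still raises ImportError, so code that guards the import with try/except keeps working; only the message changes.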
I have a project with several modules. Take these as an example:
Utility: shared functions used by all the other modules.
Read: reads files according to a configuration passed to its init function; it can be called whenever needed.
Process Data 1 (PD1): works with data read by the "Read" module.
Process Data 2 (PD2): works with data read by the "Read" module and also uses Process Data 1.
Manage: registers the process-data modules and handles them.
First, Utility is used by every module, so I can't just copy its folder into each module that needs it. The dependencies between PD1, PD2, and Manage mean that Manage needs to load or access PD1, but PD2 needs PD1 too. The Read module has similar problems.
So where is the problem? Usually, when we add a new module, we put it in its own folder, especially if it has a lot of files, so things don't get mixed up. That doesn't work well here: short of using symbolic links, there is no way to give every module the access it needs just by arranging folders. Well, there are some ways, but here is the point:
Even if we arrange everything correctly now, adding a new module that uses the existing ones can break the current arrangement.
I tried the following: I put all the module folders in one folder and then added the ".." path, so any existing or new module can import any other module it needs. But this has several problems: imports start behaving strangely. The workaround is to import all external modules first and then add the new ones, but that is too messy.
There is a more thorough solution: I can treat all the modules independently and install each of them in the system, so every module is exposed to the others and everything works properly. That is how it should work; conceptually and in Python terms it is right. The major issue is that installing everything this way makes development time-consuming. I can't test the "system" unless everything is installed, and I would have to reinstall for every single test.
Another thing to consider: if I install all the modules in the system and work with many of them, their names can conflict with other modules. So I need a way to "encapsulate" all these modules in one module and expose them only "inside" that module.
In summary, I need a "super module" that is exposed to the system and contains Utility, Read, PD1, PD2, and Manage, where each of these modules can call any of the others, without adding any new path to Python's variables in any way.
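A minimal sketch of what such a "super module" could look like, assuming it is written as one ordinary Python package whose modules reach each other with relative imports. The package name supermodule, the file names and the function bodies are all hypothetical, and Read is left out for brevity. The sys.path line only stands in for installing the package; an editable install (pip install -e .) is one way to avoid reinstalling before every test.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: one installable package, "supermodule", holding
# every module. Siblings reach each other with relative imports.
FILES = {
    "__init__.py": "",
    "utility.py": "def shared():\n    return 'utility'\n",
    "pd1.py": (
        "from .utility import shared\n\n"
        "def run():\n    return 'pd1 uses ' + shared()\n"
    ),
    "pd2.py": (
        "from . import pd1\n\n"
        "def run():\n    return 'pd2 uses ' + pd1.run()\n"
    ),
    "manage.py": (
        "from . import pd1, pd2\n\n"
        "def run_all():\n    return [pd1.run(), pd2.run()]\n"
    ),
}


def demo():
    """Write the package to a temporary directory and call Manage."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "supermodule"
        pkg.mkdir()
        for name, source in FILES.items():
            (pkg / name).write_text(source)
        sys.path.insert(0, root)  # stands in for installing the package
        try:
            importlib.invalidate_caches()
            manage = importlib.import_module("supermodule.manage")
            return manage.run_all()
        finally:
            sys.path.remove(root)
            for key in [k for k in sys.modules if k.split(".")[0] == "supermodule"]:
                del sys.modules[key]


results = demo()
print(results)
```

Only the name supermodule is visible to the rest of the system, so Utility, PD1 and the others cannot clash with unrelated top-level modules.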
I have a package animals with several thousand modules dog_1.py, cat_3.py, etc. Each module contains functions color(), size(), etc., and these functions often depend on other functions in the package outside the module. In other words, the bark() function in dog_2.py might depend on 5 other functions from different modules within the same animals package. At the bottom of each module, I have if __name__ == '__main__': because I need to be able to run each module as a stand-alone script, but I don't want the module to execute if it is imported into another module. I could accomplish what I need by adding a bunch of import statements to each module to satisfy the dependencies, but I thought there must be a better way to do this.
How is this type of problem typically handled?
Essentially, I need to be able to run each module separately as a script, and each module has (possibly) hundreds of dependencies within the same package.
I tried adding to the __init__.py a statement: __all__ = ['dog_1.py', 'cat_3.py', ...]. This seems like a good idea. But the problem I run into is that when I include from animals import * at the top of a given module, the import of that current module causes an error. I thought it was possible to import the module to itself, but for some reason, it is not working. If I remove the current module from the __all__ list, then it seems to work fine.
I also tried installing the package locally by creating an outer directory and adding .cfg and setup.py files. I thought it might fix the problem, but I wasn't able to get anywhere.
I feel like I might be just going about this completely wrongly. It seems like it would be an easy problem to handle: Make a bunch of program files in one directory and have them all share the functions between each other. Any help is appreciated. Thank you.
I am a little bit confused about the difference between a package and a library. When I install packages from pypi.org, these packages contain several sub-packages, which contain modules. When I googled the difference between a package and a library, I found this.
That being the case, can a package that contains several sub-packages also be called a library? If not, then what is a library? And what is the difference between a library and a package containing sub-packages?
Library
Most often this refers to the standard library, or to another collection created with a similar format and purpose. The Python Standard Library is the sum of the 'standard', popular, and widely used modules, which can be thought of as single-file tools or shortcuts that make things possible or faster. The standard library is included with a normal Python installation. Because the name is so well known, "library" is often used for anything with a similar structure and idea, which is simply a bunch of modules, maybe even packages, grouped together, usually in a list. The list is usually there so you can download them. Generally it is just related files with similar interests. That is the easiest way to describe it.
Module
A module refers to a file. The file has script in it, and the name of the file is the name of the module; Python files end with .py. All the file contains is code that, when run together, makes something happen by using functions, strings, etc. The main modules you see most often are popular because they are special modules that can get information from other files/modules. It can be confusing because the module name equals the file name with the .py dropped. Really, it's just code written by somebody that you can use as a shortcut to make something easier or possible.
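To illustrate the file-name-equals-module-name point, here is a small sketch using colorsys, a single-file module from the standard library. It assumes a regular CPython installation where that module is loaded from a .py source file.

```python
import colorsys
from pathlib import Path

# The module object remembers both its name and the file it came from.
module_file = Path(colorsys.__file__)

print(colorsys.__name__)  # the module name
print(module_file.name)   # the file name, which is the module name plus .py
```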
Package
This term is used quite generally, although context makes a difference. The most common use, in my experience, is multiple modules (files) that are grouped together. They can be grouped together for a few reasons, which is where context matters. These are the ways I have noticed the term package(s) used: a group of downloaded, created, and/or stored modules. All of these can be true, or only one. Really it is just a file that references other files, which need to be in the correct structure or format, and that entire sum is the package itself, whether installed separately or included in the Python standard library. A package contains modules (.py files) because they depend on each other and may not work correctly, or at all, on their own. Every part (module/file) of a package shares a common goal, and the total sum of all of the parts is the package itself.
Most often in Python, packages are modules, because the package name is the name of the module that is used to connect all the pieces. So you can import a package because it is a module, which also allows it to call upon other modules that are not packages, because they perform only a certain function or task and don't involve other files. Packages have a goal, and each module works together to achieve that final goal.
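A short sketch of the "packages are modules" point, using two standard-library examples: json (a package, i.e. a directory of modules) and colorsys (a single-file module). The helper name is_package is my own; the check relies on the fact that Python treats any module with a __path__ attribute as a package.

```python
import colorsys
import json
import types


def is_package(module):
    """A package is a module that carries a __path__ attribute."""
    return hasattr(module, "__path__")


# Both are module objects, but only json is also a package.
print(isinstance(json, types.ModuleType), is_package(json))
print(isinstance(colorsys, types.ModuleType), is_package(colorsys))
```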
Most confusion comes from a simple file name, or the prefix of a file name, being used as the module name and then again as the package name.
Remember, modules and packages can be installed. Library is usually a generic term for listing or formatting a group of modules and packages, much like Python's standard library. A hierarchy would not work. APIs do not really belong in one, and if you included them they could be anywhere and everywhere involving scripts, modules, and packages; the word library, being such a general word that is easily applied to many things, also lets API sit above or below it. Some modules can be based off of other code, and that is the only time I think it would relate to a pure Python discussion.
I am writing a compiler in Python, using the PLY (Python Lex-Yacc) library to 'compile' the compiler. The compiler has to go through a lot of rules (the number of just the core rules is eventually going to be a little less than a hundred, and they can be extended). So to keep the different types of rules separate, I made many Python modules in a single modules directory.
To include all the rules, I don't have to include the modules in this directory, but I have to include the rules (implemented as Python functions) into the current namespace. Once they simply exist there, the compiler's input will be properly tokenized, parsed, etc.
Here's what I've read about and tried:
using __import__, getattr, and sys.modules (very raw and in general not preferred)
the importlib library (how do I get everything inside the module?)
a lot of fiddling with __init__.py and just trying to from modules import * which will import everything in the modules as well
But none of these seem entirely satisfactory to me. I can't do precisely what I want to do with any of them. So my question is: how can I import some of the attributes of a Python module in a subdirectory into the running namespace of a top-level module?
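One way this could be approached, as a sketch only: walk the package with pkgutil, import each module with importlib, and copy just the attributes whose names match a prefix into the current namespace. The helper collect_rules, the package name rules and the sample rule are hypothetical; the t_ and p_ prefixes follow PLY's naming convention for token and grammar rules. Whether PLY accepts rules gathered this way is not verified here.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def collect_rules(package_name, prefixes=("t_", "p_")):
    """Return {name: object} for matching attributes of every module in a package."""
    package = importlib.import_module(package_name)
    found = {}
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module("{}.{}".format(package_name, info.name))
        for name, value in vars(module).items():
            if name.startswith(prefixes):
                found[name] = value
    return found


# Demo with a throwaway package; "rules" and its contents are made up.
with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "rules"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "expr.py").write_text(
        "def p_expr(p):\n    'expr : NUMBER'\n    p[0] = p[1]\n\n"
        "def helper():\n    pass\n"
    )
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        rules = collect_rules("rules")
    finally:
        sys.path.remove(root)

globals().update(rules)  # the selected rules now live in this namespace
print(sorted(rules))
```

Filtering by prefix keeps helper functions and imported names out of the top-level namespace, which is the part that a plain from modules import * does not give you.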
Thanks for your attention!
You want to use an existing plugin library like stevedore. It will give you the tools to enumerate files that can be imported, and tools to import those modules.
My project has to be extensible. I have a lot of scripts with the same interface that look things up online. Previously I was using __import__, but that does not let me put my 'plugins' in a dedicated directory:
root/
    main.py
    plugins/
        [...]
So my question is: is there a way to individually import modules from that subdirectory? I'm guessing importlib, but I'm so lost in how the Python module loading process works... What I want to do is something like this:
for pluginname in plugins:
    plugin = somekindofimport("plugins/{name}".format(name=pluginname))
    plugin.unififedinterface()
Also, as a side question: is the way I am trying to achieve extensibility a good one?
I'm on Python 3.3.
Stop thinking in terms of pathnames and start thinking in terms of packages. Read Packages in the tutorial, and if you want more detail see The import system.
But the basic idea is this:
Create a file named plugins/__init__.py. It can be empty; that's enough to turn plugins into a package. Which means you can import modules from that package with:
import plugins.plugin
So, how do you do this dynamically? That's what importlib is for. (You can also use __import__ here, but it's less flexible, and less readable in non-trivial cases, so unless you need pre-3.3 compatibility, don't.)
plugin = importlib.import_module('plugins.{name}'.format(name=pluginname))
It would probably be cleaner to import plugins to get the package, and then use relative imports from within that package, as shown in the examples in the import_module docs.
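Here is a runnable sketch of the relative form, assuming a plugins package like the one described. The plugin names alpha and beta, the run() function and the helper load_plugin are hypothetical stand-ins for the real plugins and their shared interface; the temporary directory stands in for the project root.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_plugin(pluginname, package="plugins"):
    """Import <package>.<pluginname> using a relative name."""
    return importlib.import_module("." + pluginname, package=package)


def demo():
    """Create a throwaway plugins package and call every plugin."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "plugins"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        for name in ("alpha", "beta"):
            # run() is a hypothetical stand-in for the shared plugin interface
            (pkg / (name + ".py")).write_text(
                "def run():\n    return {!r}\n".format(name)
            )
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            return [load_plugin(name).run() for name in ("alpha", "beta")]
        finally:
            sys.path.remove(root)
            for key in [k for k in sys.modules if k.split(".")[0] == "plugins"]:
                del sys.modules[key]


results = demo()
print(results)
```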
This also means Python takes care of the .pyc creation and caching, etc.
And it means that you can later expand plugins to be a "namespace package", which can be split across multiple directories like /usr/share/myapp/plugins for stock plugins, /etc/myapp/plugins for site plugins and ~/myapp/plugins for user-specific plugins.
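A sketch of the namespace-package idea, assuming the native mechanism from PEP 420 (Python 3.3+). Note that this form requires that none of the plugins directories contain an __init__.py. The two temporary directories stand in for locations such as the stock and user plugin directories, and the module names stock and user are made up.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo():
    """Import plugins.* modules that live under two separate roots."""
    with tempfile.TemporaryDirectory() as stock_root, \
            tempfile.TemporaryDirectory() as user_root:
        for root, name in ((stock_root, "stock"), (user_root, "user")):
            portion = Path(root) / "plugins"  # note: no __init__.py here
            portion.mkdir()
            (portion / (name + ".py")).write_text("ORIGIN = {!r}\n".format(name))
        sys.path[:0] = [stock_root, user_root]
        try:
            importlib.invalidate_caches()
            stock = importlib.import_module("plugins.stock")
            user = importlib.import_module("plugins.user")
            return stock.ORIGIN, user.ORIGIN
        finally:
            sys.path.remove(stock_root)
            sys.path.remove(user_root)
            for key in [k for k in sys.modules if k.split(".")[0] == "plugins"]:
                del sys.modules[key]


origins = demo()
print(origins)
```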
If you really, really want to import from a directory that isn't a package, you can create a module loader and use it, but that's a whole lot of work for no actual benefit. (It's actually not that hard in 3.3 (SourceLoader and friends will do most of the work for you), but you will find almost no examples out there to guide you; instead, you'll find examples of the 2.6-3.2 way, or the 2.0-2.5 way, both of which are hard.) Plus, it means that if someone creates a plugin named, say, gzip, you can end up blocking the stdlib gzip module with the plugin. (That's especially fun if the gzip plugin tries to use the gzip stdlib module, as it likely will…) If the plugin ends up being named plugins.gzip, there's no problem.
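For completeness, a sketch of loading a single file from a directory that is not a package. It uses importlib.util.spec_from_file_location and importlib.util.module_from_spec, which need Python 3.5 or later, so this is newer API than the 3.3 discussed here. The helper load_from_path and the prefixed module name are hypothetical; the prefix is there to avoid the gzip clash described above.

```python
import gzip  # the standard-library module
import importlib.util
import tempfile
from pathlib import Path


def load_from_path(module_name, path):
    """Load one source file as a module without touching sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as root:
    plugin_file = Path(root) / "gzip.py"
    plugin_file.write_text("KIND = 'plugin'\n")
    # Load under a prefixed name so the stdlib gzip module is not shadowed.
    plugin = load_from_path("standalone_plugins_gzip", plugin_file)

print(plugin.KIND, hasattr(gzip, "open"))
```

The loaded module is not registered in sys.modules here, so a later import gzip elsewhere still finds the standard-library module.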
Also, as a side question: is the way I am trying to achieve extensibility a good one?
As long as you only want to support 3.3+, yes, I think this is a great solution.
Before 3.3, using a package for plugins was a lot more problematic. People have come up with a variety of different plugin systems—in one case going so far as to dynamically create module objects and execfile into them. If you need to deal with that, I would suggest looking at existing Python apps with plugins (e.g., MusicBrainz Picard) to get different ideas.