I need to programmatically block import of a python package and all child packages.
For example, I need to block loading of the package "foo" and also ensure that all children of foo such as "foo.bar" cannot be imported.
How can this be achieved in python 2.x without restructuring my site packages or PYTHONPATH?
For context, the intent is to programmatically avoid the risk of importing proprietary code into GPL licensed code.
This might not fit your use case exactly, but you can simply mock out the entire module so that it and all of its submodules have no effect:
https://pypi.python.org/pypi/mock
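Alternatively, here is a minimal sketch of a meta-path import hook that refuses to import foo and any foo.* child. It implements the Python 2 finder protocol (find_module/load_module) the question asks about, plus the Python 3 one (find_spec) for completeness; the package name "foo" is of course a placeholder:

```python
import sys

class ImportBlocker(object):
    """Meta-path hook that refuses to import given packages and their children."""

    def __init__(self, *blocked):
        self.blocked = blocked

    def _is_blocked(self, name):
        # Matches "foo" itself and any dotted child such as "foo.bar".
        return any(name == b or name.startswith(b + '.') for b in self.blocked)

    # Python 2 import protocol
    def find_module(self, name, path=None):
        if self._is_blocked(name):
            return self  # claim the module so load_module() can refuse it
        return None

    def load_module(self, name):
        raise ImportError('importing %r is blocked' % name)

    # Python 3 import protocol
    def find_spec(self, name, path=None, target=None):
        if self._is_blocked(name):
            raise ImportError('importing %r is blocked' % name)
        return None

# Install the hook before anything else can import the package.
sys.meta_path.insert(0, ImportBlocker('foo'))
```

Note that already-imported modules are served from sys.modules and bypass the hook, so install the blocker as early as possible.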
I am writing a compiler in Python, using the PLY (Python Lex-Yacc) library to 'compile' the compiler. The compiler has to go through a lot of rules (the
number of just the core rules is eventually going to be a little less than a hundred, and they can be extended). So to keep the different types of rules separate, I made many Python modules in a single modules directory.
To include all the rules, I don't need to bind the modules in this directory themselves; I need to bring the rules (implemented as Python functions) into the current namespace. Once they simply exist there, the compiler's input will be properly tokenized, parsed, etc.
Here's what I've read about and tried:
using __import__, getattr, and sys.modules (very raw and in general not preferred)
the importlib library (how do I get everything inside the module?)
a lot of fiddling with __init__.py and just trying to from modules import * which will import everything in the modules as well
But none of these seem entirely satisfactory to me. I can't do precisely what I want to do with any of them. So my question is: how can I import some of the attributes of a Python module in a subdirectory into the running namespace of a top-level module?
Thanks for your attention!
You want to use an existing plugin library like stevedore. It will give you the tools to enumerate files that can be imported, and tools to import those modules.
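If you'd rather not take on a dependency, a rough sketch of the same idea using only the standard library's pkgutil and importlib (the package name "modules" and the helper name load_rules are placeholders matching the question):

```python
import importlib
import pkgutil

def load_rules(package_name, namespace):
    """Import every module under `package_name` and copy each module's
    public attributes (the rule functions) into `namespace`."""
    package = importlib.import_module(package_name)
    for _, mod_name, _ in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(package_name + '.' + mod_name)
        for attr in dir(module):
            if not attr.startswith('_'):
                namespace[attr] = getattr(module, attr)

# In the top-level module you would call, e.g.:
# load_rules('modules', globals())
```

PLY discovers rules by scanning the calling module's namespace, so copying the functions into globals() before calling lex()/yacc() is enough for them to be picked up.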
Will the Sphinx documentation engine successfully generate documentation on a project that doesn't import well? In particular my project has an exotic dependency. I don't want document generation to depend on this dependency.
Does Sphinx need to import my module and use introspection or does it parse?
If you're using the autodoc extension, then yes, your project must be importable. But sometimes it's possible to mock out dependencies in your conf.py (since, presumably, at the time of import, the dependencies are needed in name only). The Read the Docs documentation has an example of how to do this.
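The Read the Docs recipe boils down to registering mock objects under the dependency's names before autodoc imports your code. A sketch for conf.py ("exotic_dependency" is a placeholder for the real dependency; recent Sphinx versions also accept a plain list in autodoc_mock_imports instead):

```python
# conf.py (sketch): make an unimportable dependency importable in name only.
import sys
from unittest import mock  # on Python 2, install and use the external `mock` package

MOCK_MODULES = ['exotic_dependency', 'exotic_dependency.submodule']
sys.modules.update((name, mock.MagicMock()) for name in MOCK_MODULES)

# From here on, `import exotic_dependency` resolves to the mock, so
# autodoc can import your modules without the real dependency installed.
```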
Core Sphinx doesn't touch your code at all. The autodoc extension does, and it indeed imports it:
For Sphinx (actually, the Python interpreter that executes Sphinx) to find your module, it must be importable.
I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists in trying to add the Library path only once, by doing it in an imported module:
# In module add_Library_path:
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and that the order of the imports matters: this makes the code less legible, and more fragile. Also, this forces my distribution of programs to include an add_Library_path.py program that users will not use.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this breaks for packages in Library, as they might do something like from xlutils.filter import …, which breaks because xlutils is not found in sys.path. So, this method works, but only when including modules in Library, not packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern package template and Distribute contain a vast array of information and were in part set up to solve your problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location or, if you can, to the system's site-packages. In the former case, the local site-packages would then be added to the PYTHONPATH of the system/user. In the latter case, nothing needs to change.
You could use a batch file to set the Python path as well. Or change the Python executable to point to a shell script that sets a modified PYTHONPATH and then executes the Python interpreter. The latter, of course, means that you have to have access to the user's machine, which you do not. However, if your users only run scripts and do not import your own libraries, you could use your own wrapper for scripts:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
PYTHONPATH=/whatever/lib/path:$PYTHONPATH /usr/bin/python "$@"
I think you should have a look at path import hooks, which allow you to modify Python's behaviour when searching for modules.
For example, you could try to do something like KDE's script engine does for Python plugins [1].
It adds a special token to sys.path (like "<plasmaXXXXXX>", with XXXXXX being a random number just to avoid name collisions), and then when Python tries to import modules and can't find them in the other paths, it will call your importer, which can deal with it.
A simpler alternative is to have a main script used as a launcher which simply adds the path to sys.path and executes the target file (so that you can safely avoid putting the sys.path.append(...) line in every file).
Yet another alternative, which works on Python 2.6+, would be to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a linux installation with kde.
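The launcher alternative mentioned above might look like the following sketch (the file name launcher.py, the Library directory, and the helper name run are assumptions; the Library directory is assumed to sit next to the target script):

```python
import os
import sys

def run(script, args=()):
    """Put the Library directory that sits next to `script` first on
    sys.path, then execute `script` as if it were the main program."""
    here = os.path.dirname(os.path.abspath(script))
    sys.path.insert(0, os.path.join(here, 'Library'))
    sys.argv = [script] + list(args)   # make the target see its own argv
    with open(script) as f:
        code = compile(f.read(), script, 'exec')
    exec(code, {'__name__': '__main__', '__file__': script})

# Usage (e.g. at the bottom of launcher.py):
#     run(sys.argv[1], sys.argv[2:])
# so users invoke:  python launcher.py some_program.py arg1 arg2
```

This keeps the path manipulation in exactly one file, at the cost of asking users to start programs through the launcher.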
I'm interested in wrapping pep8 so I can monkey-patch it before use. What is the "right" way to wrap a module?
If my module is named pep8 and lives in my path somewhere before the real pep8, any "import pep8" in my module will just import itself. I don't know in advance where the real pep8 will live, since this needs to be generalized for multiple systems. I can't remove the path where my pep8 wrapper lives from sys.path, because that too will be different depending on the system where it's executed.
I don't want to have to rename my pep8, because I'd like for the pep8 command to work without modification.
My pep8 is a directory containing a __init__.py with the following contents:
from pep8 import *
MAX_LINE_LENGTH = 119
For Python 2.5+, you can make absolute imports the default with from __future__ import absolute_import.
For monkey patching a Python module, you'll want to do relative imports from your project to your overridden module.
For this example, I will assume you are distributing a library. It requires a little finessing for other projects, since the __main__ python file cannot have relative imports.
myproject/__init__.py:
from . import pep8 # optional "as pep8"
# The rest of your code, using this pep8 module.
myproject/pep8/__init__.py:
from __future__ import absolute_import
from pep8 import *
MAX_LINE_LENGTH = 119
I realize this is an old question, but it still comes up in Google searches. For instances where this is actually desired (e.g. wrapping a protected library) I suggest the wrapt package.
I actually use this for instances where I have a model that is part of a core set but can be extended by other applications (such as front-ends like flask apps). The core model is protected but can be extended by other developers.
https://pypi.python.org/pypi/wrapt
In some __init__.py files of modules I saw such single line:
__import__('pkg_resources').declare_namespace(__name__)
What does it do and why people use it? Suppose it's related to dynamic importing and creating namespace at runtime.
It boils down to two things:
__import__ is a Python function that will import a package using a string as the name of the package. It returns a new object that represents the imported package. So foo = __import__('bar') will import a package named bar and store a reference to the module object in a local variable foo.
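For example, using the standard math module just to illustrate the binding behaviour:

```python
# __import__ returns the module object and leaves the binding to you:
foo = __import__('math')   # roughly "import math", but bound to `foo`
print(foo.sqrt(16.0))      # prints 4.0; the module is used via the local name
```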
From setuptools' pkg_resources documentation, declare_namespace() "Declare[s] that the dotted package name name is a "namespace package" whose contained packages and modules may be spread across multiple distributions."
So __import__('pkg_resources').declare_namespace(__name__) imports the pkg_resources package into a temporary variable and calls the declare_namespace function on it (the __import__ function is likely used rather than the import statement so that no extra symbol named pkg_resources is left bound in the module). If this code were in my_namespace/__init__.py, then __name__ is my_namespace and this module will be included in the my_namespace namespace package.
See the setuptools documentation for more details.
See this question for discussion on the older mechanism for achieving the same effect.
See PEP 420 for the standardized mechanism that provides similar functionality beginning with Python 3.3.
This is a way to declare the so called "namespace packages" in Python.
What are these and what is the problem:
Imagine you distribute a software product which has a lot of functionality, and not all people want all of it, so you split it into pieces and ship as optional plugins.
You want people to be able to do
import your_project.plugins.plugin1
import your_project.plugins.plugin2
...
Which is fine if your directory structure is exactly as above, namely
your_project/
    __init__.py
    plugins/
        __init__.py
        plugin1.py
        plugin2.py
But what if you ship those two plugins as separate Python packages, so they are located in two different directories? Then you might want to put __import__('pkg_resources').declare_namespace(__name__) in each package's __init__.py so that Python knows those packages are part of a bigger "namespace package", in our case your_project.plugins.
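For instance, the two separately shipped distributions might be laid out like this (a sketch; the directory names are illustrative):

```
distribution_one/
    your_project/
        __init__.py        # contains the declare_namespace(...) line
        plugins/
            __init__.py    # contains the declare_namespace(...) line
            plugin1.py
distribution_two/
    your_project/
        __init__.py        # same declaration
        plugins/
            __init__.py    # same declaration
            plugin2.py
```

With both directories on sys.path, import your_project.plugins.plugin1 and import your_project.plugins.plugin2 both work, even though the two halves of the package live in different places.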
Please refer to the documentation for more info.