adding a subpackage from a different path - python

I have a python package called zypp. It is generated via swig and the rpm package (called python-zypp) puts it in:
rpm -ql python-zypp
/usr/lib64/python2.6/site-packages/_zypp.so
/usr/lib64/python2.6/site-packages/zypp.py
Now, I have a different project which provides an additional set of APIs, in pure Python, plus some scripts.
The layout is:
bin/script1
python
python/zypp
python/zypp/plugins.py
python/zypp/__init__.py
plugins.py contains a Plugin class. I intended to put this in an rpm, and put it into
/usr/lib64/python2.6/site-packages/zypp/plugins.py
script1 uses this Plugin class. But as I test it running from git, I would like it to find the module from git too if it is not installed. So it has something like:
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), '../python'))
from zypp.plugins import Plugin
However, it seems that if python-zypp is installed on /usr/lib64/python2.6/site-packages/zypp.py, then script1 won't find the plugins submodule anymore. If I uninstall python-zypp, it does.
So my question is whether it is possible to extend a module by adding submodules when those submodules are located in a different load path. Or will they always clash?
An analogy would be, I have a module foo. And I provide foo.extras in a different load path (which may use foo indeed). The script won't find foo.extras if foo is found first in the system load path. If I only use the custom load path, the script may not find foo module if foo.extras uses it.
I have more experience with ruby, but in ruby I could have installed:
/usr/lib64/ruby/gems/1.8/gems/foo-1.0/lib/foo/*
And I could have in my script:
bin/script
lib/foo/extras/*
I could do in script:
$: << File.join(File.dirname(__FILE__), "../lib")
and then my script could
require 'foo'
require 'foo/extras'
No matter whether foo/extras is installed on the system or found in the custom load path, they don't clash.
The other way around, I found out that with PYTHONPATH the local zypp.plugins is found first. But then the installed zypp module is not found:
import zypp # works, but seems to import the local one
from zypp.plugins import Plugin # works, PYTHONPATH finds it first
repoinfo = zypp.RepoInfo() # does not work

If I understand your question correctly, you want to use the development version of that module instead of the installed module. Therefore, you can use
PYTHONPATH
From the Module Search Path documentation:
When a module named spam is imported, the interpreter searches for a file named spam.py in the current directory, and then in the list of directories specified by the environment variable PYTHONPATH. This has the same syntax as the shell variable PATH, that is, a list of directory names. When PYTHONPATH is not set, or when the file is not found there, the search continues in an installation-dependent default path; on Unix, this is usually .:/usr/local/lib/python.
So, if the GIT tree of the module directory was "/home/username/some/path", you would change the PYTHONPATH to "/home/username/some/path". Or if the PYTHONPATH variable is already in use, you would append ":/home/username/some/path" to it (note the colon separator). In order to make this permanent, add the line "PYTHONPATH=value" to the file "/etc/environment".
sys.path.insert
If you have a start script for your program, you could override the module search path using sys.path.insert(0, "somepath"). This is similar to the sys.path.append call you described but inserts the path into the beginning of the list.
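To make the difference between sys.path.append and sys.path.insert(0, ...) concrete, here is a minimal sketch. The module name order_demo_mod and both temporary directories are made up for the illustration; they only stand in for an installed copy and a git checkout of the same module.

```python
import importlib
import os
import sys
import tempfile


def write_module(directory, name, body):
    """Create <directory>/<name>.py with the given source text."""
    with open(os.path.join(directory, name + ".py"), "w") as handle:
        handle.write(body)


# Two hypothetical locations that both provide a module of the same name.
installed_dir = tempfile.mkdtemp()   # stands in for site-packages
checkout_dir = tempfile.mkdtemp()    # stands in for the git checkout
write_module(installed_dir, "order_demo_mod", "ORIGIN = 'installed'\n")
write_module(checkout_dir, "order_demo_mod", "ORIGIN = 'checkout'\n")

# append: the checkout is searched last, so the installed copy wins.
sys.path.append(installed_dir)
sys.path.append(checkout_dir)
importlib.invalidate_caches()
origin_with_append = importlib.import_module("order_demo_mod").ORIGIN

# insert(0, ...): the checkout is searched first, so it wins.
del sys.modules["order_demo_mod"]    # forget the cached module object
sys.path.insert(0, checkout_dir)
importlib.invalidate_caches()
origin_with_insert = importlib.import_module("order_demo_mod").ORIGIN

print(origin_with_append, origin_with_insert)
```

Note that this only decides which copy of a same-named module wins. It does not merge an installed zypp.py with a local zypp/ directory; whichever is found first is the only zypp that gets imported.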

Related

How does Python handle subpackages?

Say Ansible was installed by means of "pip install ansible". Right after the install the following import statement succeeds:
from ansible.module_utils.basic import AnsibleModule
Now, a local package named "ansible.module_utils.custom" is created. The directory structure:
ansible/
    __init__.py
    module_utils/
        __init__.py
        custom/
            __init__.py
            utils.py
As soon as this is put in place the aforementioned import statement fails. Claiming "basic" is undefined. The local package does indeed not declare a "basic" subpackage. Only the installed Ansible library does. It seems Python limited its search to the local package only.
I was under the impression Python would consider the complete system path before giving up on finding code. That it would backtrack out of the local package and finally hit the installed Ansible library.
Is this an incorrect assumption ? If so, is it possible at all to make the local package to coexist with the installed package ?
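On the coexistence part: a regular package (a directory with an __init__.py) is resolved to the first matching directory on sys.path, and Python does not backtrack into a second directory of the same name. One way for two directories to contribute to a single package is for the first-found __init__.py to opt in via pkgutil.extend_path from the standard library. The sketch below assumes a made-up package name (coexist_demo_pkg) and temporary directories; it is not how Ansible itself is laid out, and every nested package level would need the same opt-in.

```python
import importlib
import os
import sys
import tempfile


def write(path, text=""):
    """Create a file (and its parent directories) with the given text."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(text)


installed_dir = tempfile.mkdtemp()   # stands in for site-packages
local_dir = tempfile.mkdtemp()       # stands in for the local project

# The "installed" portion of the package provides basic.py.
write(os.path.join(installed_dir, "coexist_demo_pkg", "__init__.py"))
write(os.path.join(installed_dir, "coexist_demo_pkg", "basic.py"),
      "NAME = 'installed basic'\n")

# The local portion is found first; its __init__.py asks pkgutil to add
# every other coexist_demo_pkg directory on sys.path to the package path.
write(os.path.join(local_dir, "coexist_demo_pkg", "__init__.py"),
      "from pkgutil import extend_path\n"
      "__path__ = extend_path(__path__, __name__)\n")
write(os.path.join(local_dir, "coexist_demo_pkg", "custom.py"),
      "NAME = 'local custom'\n")

sys.path[:0] = [local_dir, installed_dir]
importlib.invalidate_caches()

import coexist_demo_pkg
from coexist_demo_pkg import basic, custom

portions = len(coexist_demo_pkg.__path__)
print(basic.NAME, "|", custom.NAME, "|", portions)
```

Without the two extend_path lines, the import of basic would fail in the same way the question describes, because only the local directory would be searched for submodules.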
How Import works
import abc
The first thing Python will do is look up the name abc in sys.modules. This is a cache of all modules that have been previously imported.
If the name isn’t found in the module cache, Python will proceed to search through a list of built-in modules. These are modules that come pre-installed with Python and can be found in the Python Standard Library. If the name still isn’t found in the built-in modules, Python then searches for it in a list of directories defined by sys.path. This list usually includes the current directory, which is searched first.
When Python finds the module, it binds it to a name in the local scope. This means that abc is now defined and can be used in the current file without throwing a NameError.
If the name is never found, you'll get a ModuleNotFoundError. You can find out more about imports in the Python documentation.
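The lookup steps above can be observed from within Python. This is a small sketch rather than a full trace of the import machinery; the missing module name is an arbitrary string chosen so that nothing on the system matches it.

```python
import importlib
import sys

# Step 1: the module cache. A second import returns the cached object.
import json
cached = sys.modules["json"]
same_object = importlib.import_module("json") is cached

# Step 2: modules compiled into the interpreter are listed here.
sys_is_builtin = "sys" in sys.builtin_module_names

# Step 3: if nothing on sys.path matches either, the import fails.
try:
    importlib.import_module("no_such_module_demo_xyz")
    missing_name = None
except ModuleNotFoundError as exc:
    missing_name = exc.name

print(same_object, sys_is_builtin, missing_name)
```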

How does jupyter notebook import modules which are not in the current working directory?

I am trying to understand how import works in jupyter notebook.
My present working directory is "/home/username". I have three python modules.
The path names of these modules are as given below.
"/home/username/module1.py"
"/home/username/folder/module2.py"
"/home/username/anaconda3/lib/python3.7/os.py" (which is an inbuilt python module)
Jupyter Notebook:
cell 1:
import module1
Works just fine
cell 2:
import module2 gives
ModuleNotFoundError: No module named 'module2'
cell 3:
import os
Works just fine
It seems like modules in the working directory can be imported without any problem. So, module1.py can be imported. Modules in other directories that are not packages cannot be imported directly. So, module2.py throws an error. But if this is the case how can os.py, which is not the working directory or in another package in the same directory, be imported directly?
This is really more about how python itself works.
You should be able to import module2 with from folder import module2. To do so, declare /home/username/folder as a package by creating a blank init file, /home/username/folder/__init__.py. I recommend naming the package something more unique, like potrus_folder, so that you don't get naming conflicts down the line.
To explain: Python keeps track of which modules are available through its search path, which is usually set through your environment variables. To see which folders it searches for modules, do import sys and then print(sys.path). By default your working directory (/home/username/) is included with highest priority (so it should be either first or last in sys.path, I don't remember). You can add your own folder with sys.path.append('/some/folder'), although that is frowned upon; you should really add it to your system path, or just keep it as a package in your working directory.
Packages are really just subfolders of paths which have already been added. You access them, as I explained earlier, by using the from X import Y syntax, or if you want to go deeper from X.Z import Y. Remember the __init__.py file.
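A minimal sketch of the package approach described above. The package name potrus_folder_demo and the temporary base directory are hypothetical; the base directory plays the role of /home/username.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

# Hypothetical layout: <base>/potrus_folder_demo/{__init__.py, module2.py}
base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "potrus_folder_demo")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()   # blank init file
with open(os.path.join(pkg_dir, "module2.py"), "w") as handle:
    handle.write("VALUE = 2\n")

# Until <base> is on sys.path, the package cannot be located.
found_before = importlib.util.find_spec("potrus_folder_demo") is not None

sys.path.append(base)
importlib.invalidate_caches()

# Now the "from X import Y" form works.
from potrus_folder_demo import module2

print(found_before, module2.VALUE)
```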
The path of the os library is set in the environment.
Whenever you import something, Python searches all the directories that are registered in your environment plus the present working directory, so you can just add your directory to the environment and the import will work.
/home/username/anaconda3/lib/python3.7/ is added by default at installation time, since that is where most of the modules live, but you can add yours too.

Why does "import module" and then "from package import module" load the module again?

I have a package in my PYTHONPATH that looks something like this:
package/
    __init__.py
    module.py
        print 'Loading module'
If I'm running Python from the package/ directory (or writing another module in this directory) and type
import module
it loads module.py and prints out "Loading module" as expected. However, if I then type
from package import module
it loads module.py and prints "Loading module" again, which I don't expect. What's the rationale for this?
Note: I think I understand technically why Python is doing this, because the sys.modules key for import module is just "module", but for from package import module it's "package.module". So I guess what I want to know is why the key is different here -- why isn't the file's path name used as the key so that Python does what one expects here?
Effectively, by running code from the package directory, you've misconfigured Python. You shouldn't have put that directory on sys.path, since it's inside a package.
Python doesn't use the filename as the key because it's not importing a file, it's importing a module. Allowing people to do 'import c:\jim\my files\projects\code\stuff' would encourage all kinds of nastiness.
Consider this case instead: what if you were in ~/foo/package/ and ~/bar were on PYTHONPATH - but ~/bar is just a symlink to ~/foo? Do you expect Python to resolve, then deduplicate the symbolic link for you? What if you put a relative directory on PYTHONPATH, then change directories? What if 'foo.py' is a symlink to 'bar.py'? Do you expect both of those to be de-duplicated too? What if they're not symlinks, but just exact copies? Adding complex rules to try to do something convenient in ambiguous circumstances means it does something highly inconvenient for other people. (Python zen 12: in the face of ambiguity, refuse the temptation to guess.)
Python does something simple here, and it's your responsibility to make sure that the environment is set up correctly. Now, you could argue that it's not a very good idea to put the current directory on PYTHONPATH by default - I might even agree with you - but given that it is there, it should follow the same consistent set of rules that other path entries do. If it's intended to be run from an arbitrary directory, your application can always remove the current directory from sys.path by starting off with sys.path.remove('').
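The double load can be reproduced in a few lines. The sketch below uses made-up names (dup_pkg_demo, dup_mod_demo) and deliberately puts the package directory itself on sys.path, which is the misconfiguration described above.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "dup_pkg_demo")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "dup_mod_demo.py"), "w") as handle:
    # Each execution of the file records the name it was loaded under.
    handle.write("LOADED_AS = __name__\n")

sys.path.insert(0, root)      # correct: the directory containing the package
sys.path.insert(0, pkg_dir)   # the misconfiguration: the package directory itself
importlib.invalidate_caches()

import dup_mod_demo
from dup_pkg_demo import dup_mod_demo as packaged

same_file = os.path.samefile(dup_mod_demo.__file__, packaged.__file__)
distinct_objects = dup_mod_demo is not packaged
keys = sorted(k for k in sys.modules if k.endswith("dup_mod_demo"))

print(same_file, distinct_objects, keys)
```

One file on disk ends up as two entries in sys.modules, because the cache is keyed by module name, not by file path.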
It is a minor defect of the current module system.
When importing module, you do it from the current namespace, which has no name. The values inside this namespace are the same as those in package, but the interpreter cannot know it.
When importing package.module, you import module from the package namespace.
This is the reason that main.py should be outside the package folder.
Many modules have this organisation :
package /
    main.py
    package /
        sub_package1/
        sub_package2/
        sub_package3/
        module1.py
        module2.py
Calling only main.py makes sure the namespaces are set up correctly, i.e. the current namespace is main.py's. It makes it impossible to call import module1 in module2.py; you'd need to call import package.module1. That makes things simpler and more homogeneous.
And yes, importing the current folder as a nameless namespace was a bad idea.
It is a PITA if you go beyond a few scripts. But as Python started there, it was not completely senseless.

How to import module from current non-default directory

I'm using Python 2.7. I'm rather new to the Python language. I have two Python modules - "Trailcrest.py" and "Glyph.py", both in the same folder, but not in the Python27 folder.
I need to import "Trailcrest.py" into "Glyph.py", but I am getting the message that "no such module exists".
Additionally, whatever means I use to import the module must not depend on a hard-coded path. This program is cross-platform, and the path can be changed depending on the user's preferences. However, these two modules will always be in the same folder together.
How do I do this?
If you have Trailcrest.py and Glyph.py in the same folder, importing one into the other is as simple as:
import Trailcrest
import Glyph
If this does not work, there seems to be something wrong with your Python setup. You might want to check what's in sys.path.
import sys
print sys.path
To elaborate a bit on Ferdinand Beyer's answer, sys.path is a list of file locations that the default module importer checks. Some, though not all installations of python will add the current directory or the directory of the __main__ module to the path. To make sure that the paths relative to a given module are importable in that module, do something like this:
import os.path, sys
sys.path.append(os.path.dirname(__file__))
But something like that shouldn't ever make it into a "production" product. Instead, use something like distutils to install the module's package into the python site-packages directory.
This can also be achieved using the environment variable PYTHONPATH which also influences Python's search path. This can be done in a shell script so that the Python files do not need to be altered. If you want it to import from the current working directory use the . notation in bash:
export PYTHONPATH=.
python python_prog.py
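To check that PYTHONPATH really ends up in the search path without touching any Python files, one can start a child interpreter with the variable set and inspect its sys.path. This is a sketch; the extra directory is a throwaway temporary directory, not a path from the question.

```python
import os
import subprocess
import sys
import tempfile

extra_dir = tempfile.mkdtemp()   # hypothetical directory holding your modules

# Equivalent of "export PYTHONPATH=<extra_dir>" for one child process only.
child_env = dict(os.environ, PYTHONPATH=extra_dir)
result = subprocess.run(
    [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
child_path = result.stdout.splitlines()

# Compare resolved paths, since temporary directories may be symlinked.
extra_on_path = any(
    entry and os.path.realpath(entry) == os.path.realpath(extra_dir)
    for entry in child_path
)
print(extra_on_path)
```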

What sets up sys.path with Python, and when?

When I run
import sys
print sys.path
on my Mac (Mac OS X 10.6.5, Python 2.6.1), I get the following results.
/Library/Python/2.6/site-packages/ply-3.3-py2.6.egg
...
/Library/Python/2.6/site-packages/ipython-0.10.1-py2.6.egg
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python26.zip
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-darwin
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac/lib-scriptpackages
/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-tk
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-old
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload
/Library/Python/2.6/site-packages
/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/PyObjC
/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/wx-2.8-mac-unicode
They are grouped into 5 categories.
/Library/Python/2.6/site-packages/*.egg
/Library/Python/2.6/site-packages
Frameworks/Python.framework/Versions/2.6/lib/python2.6
Frameworks/Python.framework/Versions/2.6/Extras/lib/python
PATH from PYTHONPATH environment variable.
And I can add more paths using the code
sys.path.insert(0, MORE_PATH)
Which routines set up those paths, and when?
Are some of the paths built into the Python source code?
Is it possible that paths inserted with 'sys.path.insert' are ignored? I'm curious about this because with mod_wsgi I found that paths added with 'sys.path.insert' were not picked up. I asked about that in a separate post.
ADDED
Based on Michael's answer, I looked into site.py, and I got the following code.
def addsitepackages(known_paths):
    """Add site-packages (and possibly site-python) to sys.path"""
    sitedirs = []
    seen = []
    for prefix in PREFIXES:
        if not prefix or prefix in seen:
            continue
        seen.append(prefix)
        if sys.platform in ('os2emx', 'riscos'):
            sitedirs.append(os.path.join(prefix, "Lib", "site-packages"))
        elif sys.platform == 'darwin' and prefix == sys.prefix:
            sitedirs.append(os.path.join("/Library/Python", sys.version[:3], "site-packages"))
I also think that the directory name that has site.py (/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6 for my Mac) should be built into Python source code.
Most of the stuff is set up in Python's site.py which is automatically imported when starting the interpreter (unless you start it with the -S option). Few paths are set up in the interpreter itself during initialization (you can find out which by starting python with -S).
Additionally, some frameworks (like Django I think) modify sys.path upon startup to meet their requirements.
The site module has a pretty good documentation, a commented source code and prints out some information if you run it via python -m site.
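A rough way to see what the site module contributes is to compare sys.path in a child interpreter started normally with one started with -S. This is only a sketch; the exact entries differ per platform and installation, so no particular output is assumed.

```python
import json
import subprocess
import sys

CHILD_CODE = (
    "import json, sys; "
    "print(json.dumps({'path': sys.path, 'site_loaded': 'site' in sys.modules}))"
)


def inspect_child(*flags):
    """Run a child interpreter with the given flags and report its sys.path."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


with_site = inspect_child()         # normal startup: site is imported
without_site = inspect_child("-S")  # -S: the site module is skipped

# Entries present only when site ran (typically the site-packages directories).
added_by_site = [p for p in with_site["path"] if p not in without_site["path"]]

print("set up by the interpreter itself:", without_site["path"])
print("added by site:", added_by_site)
```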
From Learning Python:
sys.path is the module search path. Python configures it at program startup, automatically merging the home directory of the top-level file (or an empty string to designate the current working directory), any PYTHONPATH directories, the contents of any .pth file paths you've created, and the standard library directories. The result is a list of directory name strings that Python searches on each import of a new file.
site.py is indeed the answer. I wanted to remove any dependencies on the old Python that is installed by default on my Mac. This works pretty well, as 'site.py' is run each time the Python interpreter is started.
For Mac, I manually added the following line at the end of main() in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/site.py:
sys.path = filter (lambda a: not a.startswith('/System'), sys.path)
Path has these parts:
OS paths that have your system libraries
current directory python started from
environmental variable $PYTHONPATH
you can add paths at runtime.
Paths are not ignored. But, they may not be found and that will not raise an error.
sys.path should only be added to, not subtracted from. Django would not remove paths.
Adding to the accepted answer, and addressing the comments that say a module shouldn't remove entries from sys.path:
This is broadly true but there are circumstances where you might want to modify sys.path by removing entries. For instance - and this is Mac-specific; *nix/Windows corollaries may exist - if you create a customised Python.framework for inclusion in your own project you may want to ignore the default sys.path entries that point at the system Python.framework.
You have a couple of options:
Hack the site.py, as #damirv indicates, or
Add your own sitecustomize module (or package) to the custom framework that achieves the same end result. As indicated in the site.py comments (for 2.7.6, anyway):
After these path manipulations, an attempt is made to import a module
named sitecustomize, which can perform arbitrary additional
site-specific customizations. If this import fails with an
ImportError exception, it is silently ignored.
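As a sketch of the sitecustomize option: the snippet below writes a throwaway sitecustomize.py into a temporary directory and exposes it to a child interpreter through PYTHONPATH (in a custom framework it would live inside the framework's own library directory instead). The marker attribute and the filtered prefix are made up for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

custom_dir = tempfile.mkdtemp()   # hypothetical location for sitecustomize.py

with open(os.path.join(custom_dir, "sitecustomize.py"), "w") as handle:
    handle.write(
        "import sys\n"
        "# Drop entries under an unwanted prefix (placeholder value here).\n"
        "sys.path = [p for p in sys.path\n"
        "            if not p.startswith('/nonexistent/demo/prefix')]\n"
        "# Marker so the demo can tell that this file was executed.\n"
        "sys.sitecustomize_demo_ran = True\n"
    )

child_env = dict(os.environ, PYTHONPATH=custom_dir)
result = subprocess.run(
    [sys.executable, "-c",
     "import sys; print(getattr(sys, 'sitecustomize_demo_ran', False))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
customize_ran = result.stdout.strip() == "True"
print(customize_ran)
```

Because site imports sitecustomize on every normal startup, this avoids editing site.py itself.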
Also note: if the PYTHONHOME env var is set, standard libraries will be loaded from this path instead of the default, as documented.
This is not a direct answer to the question, but it is something I just discovered that was causing the wrong standard libraries to be loaded, and my searches led me here along the way.
You are using system python /usr/bin/python.
sys.path is set from system files at python startup.
Do not touch those files, in particular site.py, because this may perturb the system.
However, you can change sys.path within python, in particular, at startup :
in ~/.bashrc or ~/.zshrc:
export PYTHONSTARTUP=~/.pythonrc
in ~/.pythonrc:
write your changes to sys.path.
Those changes will be only for you in interactive shells.
To hack around with little risk to the system, install your own, more recent Python version.
