Imports in distributable Python scripts

I have a Python project pypypy with two files: __main__.py and foo.py.
In __main__.py I simply do import foo. It all works fine.
Now I want to distribute it with PyPI. After installing my module I execute it with python -m pypypy. When I do that, the import statement doesn't work anymore; however, import pypypy.foo does the job.
Should I change all my imports before distribution, or is there a better way?

Using absolute imports is strongly suggested as they work consistently across different Python versions. Check this answer. In your case you should prefer import pypypy.foo.
The reason it works in your dev environment is probably PYTHONPATH manipulation. For example, PyCharm enables 'Add content roots and source roots to PYTHONPATH' by default. Also, when you run a script directly with python, the script's directory (or the current working directory for an interactive session) is automatically prepended to sys.path, so sibling modules like foo are importable.
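As a sketch (using the names from the question), __main__.py could use either of these forms; note that the relative form only works when the file is executed as part of the package, e.g. via python -m pypypy:
# pypypy/__main__.py
# Option 1: absolute import, works wherever pypypy itself is importable
import pypypy.foo

# Option 2: explicit relative import, works when run as `python -m pypypy`
# but not when the file is run as a standalone script
from . import foo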

Related

How to add a folder to Python path?

Basically, I can only reference my other files as modules when they are in a very specific location:
C:\Users\Dave\Desktop\Programming\Python.
If I want to create a new folder for a large project with multiple modules, say
C:\Users\Dave\Desktop\Programming\Python\Project1,
I can no longer import any modules and keep getting a ModuleNotFoundError. I've looked into it and it seems I need to add that folder to the Python Path somehow, but I couldn't find any answers on how to do it. My computer runs on Windows 10 if that matters.
I think the immediate solution to your problem, and the answer to your question, is to use sys.path.append() at the top of your script.
import sys

# Make the folder that contains your custom modules importable
sys.path.append("<ABSOLUTE/PATH/TO/YOUR/CUSTOM/MODULES/FOLDER>")

import custom_module
This isn't an ideal solution for any kind of prod use, however. Based on what you wrote in your question, this may not be what you're looking for. More info might help to craft a more stable solution:
Are you using virtual environments when running Python? Do you just run python script.py, or is there a specific version of Python you're using that's installed somewhere other than the usual C:\Program Files\Python?
When you say you work on a project with multiple modules, does that mean there are custom modules that you or someone else wrote, or just that the project uses modules outside the standard library, i.e. ones you had to pip install?
Can we get an example of the code you're running and the folder structure of your project/where the modules are that you need?

How to work with absolute imports when developing a package

I have been struggling with imports in packages for a while. When I develop a package, I read everywhere that it is preferable to use absolute imports in the submodules of that package. I understand that and I like it better as well. But then I also read that you shouldn't use sys.path.append('/path/to/package') to make your package importable during development, and I don't like that approach either...
So my question is: how do you develop such a package from scratch, using absolute imports directly? At the moment I develop the package using relative imports, since that lets me test the code I am writing before packaging and installing, and then I change the imports once I have a release and build the package.
What is the correct way of doing this? In PyCharm, for example, you would mark the folder as 'source root' and be able to work as if the package folder were on the path. Still, I read that this is not the proper way... What am I missing? How do you develop a package while testing its code?
Your mileage may vary but this is what I usually do:
Within a package (foo), absolute (import foo.bar) or relative (from . import bar) doesn't matter to me as long as it works. Sometimes I prefer relative imports, especially when the project is large and one day I might decide to move a number of source files into a subdirectory.
How do I test? My $PYTHONPATH usually has . in it, and my directory hierarchy is like this:
/path/to/foo_project
    /setup.py
    /foo
        /__init__.py
        /bar.py
    /test
        /test1.py
        /test2.py
Then the script foo_project/test/test1.py imports the package exactly the way a normal user of the package would, using import foo.bar. When I test my code, I change into the directory foo_project and run python test/test1.py. Since I have . in my $PYTHONPATH, Python finds the directory foo and uses it as a package.
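A minimal sketch of such a test script, assuming foo/bar.py defines a function called do_something (a made-up name, just for illustration):
# foo_project/test/test1.py
# Run from the foo_project directory with: python test/test1.py
import foo.bar  # found because "." is on PYTHONPATH and foo/ contains __init__.py

print(foo.bar.do_something())  # do_something is hypothetical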

Local collection of Python packages: best way to import them?

I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists in trying to add the Library path only once, by doing it in an imported module:
# In module add_Library_path:
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and the order of the imports matters: this makes the code less legible and more fragile. It also forces my distribution of programs to include an add_Library_path.py file that users will not use directly.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this breaks for packages in Library: a package might internally do something like from xlutils.filter import …, which fails because xlutils itself is not found in sys.path. So this method works, but only for plain modules in Library, not for packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern-package-template and Distribute contain a vast array of information and were in part set up to solve this kind of problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location, or, if you can, to the system's site-packages. In the former case, the local site-packages directory would then be added to the PYTHONPATH of the system/user. In the latter case, nothing needs to change.
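A minimal setup.py along those lines might look like the sketch below; it assumes the third-party packages really live under Library/, and the program and package names are placeholders loosely taken from the question:
# setup.py -- minimal sketch, names are placeholders
from distutils.core import setup

setup(
    name='my_programs',
    version='1.0',
    package_dir={'': 'Library'},         # the bundled packages live in Library/
    packages=['package_from_Library'],   # list every package shipped in Library/
    scripts=['my_program.py'],           # the programs the users actually run
)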
You could use a batch file to set the Python path as well. Or point your scripts at a shell wrapper that sets a modified PYTHONPATH and then executes the real Python interpreter. The latter, of course, means that you have to have access to the user's machine, which you do not. However, if your users only run scripts and do not import your libraries themselves, you could use your own wrapper as the scripts' interpreter:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
# Prepend the bundled library path, then run the real interpreter with all arguments
PYTHONPATH=/whatever/lib/path:$PYTHONPATH /usr/bin/python "$@"
I think you should have a look at path import hooks, which allow you to modify the behaviour of Python when it searches for modules.
For example, you could do something like KDE's script engine does for Python plugins [1].
It adds a special token to sys.path (like "<plasmaXXXXXX>", with XXXXXX being a random number just to avoid name collisions), and then, when Python tries to import modules and can't find them in the other paths, it calls your importer, which can deal with them.
A simpler alternative is to have a main script used as a launcher which simply adds the path to sys.path and executes the target file (so that you can safely avoid putting the sys.path.append(...) line in every file); a minimal sketch follows.
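Roughly, such a launcher could look like this (the module name real_program is a placeholder, not something from the question):
# launcher.py -- adds the bundled Library/ to sys.path once, then runs the target
import os
import runpy
import sys

sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), 'Library'))

# 'real_program' is a placeholder for the module the user actually wants to run
runpy.run_module('real_program', run_name='__main__')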
Yet another alternative, which works on Python 2.6+, would be to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a Linux installation with KDE.

How to deal with relative imports in a Python package

I'm working on a Python project with approximately the following layout
project/
    foo/
        __init__.py
        useful.py
        test/
            __init__.py
            test_useful.py
test_useful.py tries to import project.foo.useful so it can test it, but it doesn't work when I say "python project/foo/test/test_useful.py", but it does work if I copy it into my current directory and run "python test_useful.py".
What is the correct way to handle these imports while developing? It seems like this won't be an issue once installed, because it will be in PYTHONPATH. Should I use distutils to make a build/ folder and add it to my PYTHONPATH?
First of all you need to set up your PYTHONPATH to either include "project" or the parent of "project". This is important while you're developing too :-)
Then you should be able to use an absolute import:
from project.foo import useful
Secondly, I would suggest that instead of running tests by executing the module, you install py.test (pip install pytest). Then you'll be able to use relative imports, as long as your py.test invocation is generic enough (i.e. "py.test foo" will work, but "py.test foo/test/test_useful.py" will not). I would still recommend that you not use relative imports in tests.
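For example, with the parent of project on PYTHONPATH (and project itself importable as a package), test_useful.py could look like this sketch; the tested function name is made up for illustration:
# project/foo/test/test_useful.py
from project.foo import useful

def test_something_useful():
    # 'do_something' is a hypothetical function in useful.py
    assert useful.do_something() is not None
A generic invocation such as py.test run from the directory containing project/ would then collect and run it.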
Please consider using distutils/setuptools to make your project installable in a Python standard way. (Hint: you'll need to create a setup.py file parallel to the 'foo' directory, also known as a package.)
Doing so will also allow you to use a number of common Python testing frameworks (nose, py.test, etc.) to collect and run tests; most such frameworks automatically ensure 'foo' is an importable package before running the tests. Your test_useful.py tests can then import 'foo.useful' without a problem.
Also worth noting from your example directory structure: it is generally recommended that your tests directory NOT be a Python package, i.e. delete the test/__init__.py file. The framework will ensure the tests are runnable, and not having it as a package helps ensure it only gets distributed in source distributions and not binary ones (where it likely isn't wanted).

Automatically fetching latest version of a file on import

I have a module that I want to keep up to date, and I'm wondering if this is a bad idea:
Have a module (mod1.py) in the site-packages directory that copies a different module from some other location into the site-packages directory, and then imports * from that module.
import shutil
from distutils.sysconfig import get_python_lib
p_source = r'\\SourceSafeServer\mod1_current.py'
p_local = get_python_lib() + r'\mod1_current.py'
shutil.copyfile(p_source, p_local)
from mod1_current import *
Now I can do this in any module, and it will always be the latest version:
from mod1 import function1
This works.... but is there a better way of doing this?
Update
Here is the current process... there is a project under source control that has a single module, mod1.py. There is also a setup.py. Running setup.py copies mod1.py to the site-packages directory.
Developers that use the module must run setup.py to update the module. Sometimes, they don't and not having the latest version causes problems.
I want to be able to just check in a new version, and have any code that imports that module automatically grab the latest version every time, without anyone having to run setup.py.
Do you really want to do this? This means you could very easily roll code to a production app simply by committing to source control. I would consider this a nasty side-effect for someone who isn't aware of your setup.
That being said, this seems like a pretty good solution; you may want to add some exception handling around the network file calls, as those are prone to failure.
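For instance, the copy step in mod1.py from the question could be wrapped roughly like this, falling back to whatever local copy already exists if the network share is unreachable:
import shutil
from distutils.sysconfig import get_python_lib

p_source = r'\\SourceSafeServer\mod1_current.py'
p_local = get_python_lib() + r'\mod1_current.py'

try:
    shutil.copyfile(p_source, p_local)
except (IOError, OSError):
    # Network share unavailable: keep using the previously copied local version
    pass

from mod1_current import *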
In some cases, we put .pth files in the Python site-packages directory. The .pth files name our various SVN checkout directories.
No install. No copy.
.pth files are described here.
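A .pth file is just a plain text file dropped into site-packages; every line that names an existing directory is appended to sys.path at interpreter startup. For example (the checkout path is purely illustrative):
# my_checkouts.pth, placed in site-packages; lines starting with # are ignored
C:\svn\checkouts\mod1_project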
The original strategy of having other developers copy mod1.py into their site-packages in order to use the module sounds like the real problem. Why aren't they just using the same source control as you are?
This auto-copying will make it hard to do rollbacks, especially if other developers copy your strategy. Imagine this same system used for dozens and dozens of files. And then imagine you actually do want to use a version of mod1.py that is not the latest for something.
