How to set up hidden imports in pyinstaller - python

I have a large project with multiple packages. These packages use a set of modules in a common package. I am trying to create an exe on Windows using pyinstaller, but it cannot find the common package.
This cut down project exhibits the same issue. My package is organised as shown in this tree:
When I use
python -m my_package
in the top my_package directory it works perfectly.
The module main.py in my_package imports Bar (which is located in foo) from common. The __init__.py file in common includes:
from common.source.foo import Bar
When I build an exe file and run it in a terminal, it fails with 'No module named common'.
My PyInstaller spec includes:
hiddenimports=['../', '../common/', '../common/common/']
Should I try something different?

The hiddenimports are used to specify imports that can't be detected by pyinstaller, not the paths to those imports.
Try adding the necessary paths to the pathex list in the spec file instead (these are paths that will be available in sys.path during analysis).
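To make the distinction concrete, here is a minimal sketch that only builds a PyInstaller argument list and does not run a build. The `--paths` and `--hidden-import` options are the command-line counterparts of `pathex` and `hiddenimports`. The layout is an assumption, since the tree is not shown in the question: I am guessing that `../common` is a folder that contains the importable package `common`, and `my_package/main.py` is a hypothetical entry script.

```python
from pathlib import Path


def build_pyinstaller_args(entry_script, search_dirs, hidden_modules):
    """Build a PyInstaller argument list without running it.

    search_dirs    -> directories searched for imports (pathex / --paths)
    hidden_modules -> dotted module names (hiddenimports / --hidden-import)
    """
    args = [str(entry_script)]
    for directory in search_dirs:
        args += ["--paths", str(Path(directory))]
    for module in hidden_modules:
        # A hidden import is a module name, never a filesystem path.
        if "/" in module or "\\" in module:
            raise ValueError(f"{module!r} looks like a path, not a module name")
        args += ["--hidden-import", module]
    return args


# Assumed layout: ../common/ is a folder that contains the package common/.
args = build_pyinstaller_args(
    "my_package/main.py",          # hypothetical entry script
    search_dirs=["../common"],
    hidden_modules=["common", "common.source.foo"],
)
print(args)
```

A list like this could be passed to `PyInstaller.__main__.run(args)` or typed after `pyinstaller` on the command line. Passing one of the original values such as `'../common/'` as a hidden module raises the `ValueError`, which mirrors the mistake in the spec above.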

Related

Struggling with python's import mechanism

I am an experienced Java enterprise developer but very new to a Python enterprise development shop. I am currently struggling to understand why some imports work while others don't.
Some background: Our dev team recently upgraded Python from 3.6 to 3.10.5, and the following is our package structure:
src/
    bunch of files (dockerfile, Pipfile, requirements.txt, shell scripts, etc)
    package/
        __init__.py
        moduleA.py
        subpackage1/
            __init__.py
            moduleX.py
            moduleY.py
        subpackage2/
            __init__.py
            moduleZ.py
    tests/
        __init__.py
        test1.py
Now, inside the moduleA.py, I am trying to import subpackage2/moduleZ.py like so
from .subpackage2 import moduleZ
But, I get the error saying
ImportError: attempted relative import with no known parent package
The funny thing is that if I move moduleA.py out of package/ and into src/, it is able to find everything. I am not sure why this is the case.
I run moduleA.py by executing python package/moduleA.py.
Now, I read that there may be a problem because you have to give a -m parameter when running a module as a script (or something along those lines). But if I do that, I get the following error:
ModuleNotFoundError: No module named 'package/moduleA.py'
I even tried running package/moduleA with the .py removed, but that does not work either. I can understand why, as I technically never installed it?
All of this happened because, apparently, the tests broke, and to make them work relative imports were added. The import was changed from "from subpackage2 import moduleZ" to "from .subpackage2 import moduleZ"; the tests started working, but the app started failing.
Any understanding I can get would be much appreciated.
The -m parameter is used with the import name, not the path. So you'd use python3 -m package.moduleA (with . instead of /, and no .py), not python3 -m package/moduleA.py.
That said, it only works if package.moduleA is locatable from one of the roots in sys.path. Shy of installing the package, the simplest way to make it work is to ensure your working directory is src (so package exists in the working directory):
$ cd path/to/src
$ python3 -m package.moduleA
and, with your existing setup, if moduleA.py includes a from .subpackage2 import moduleZ, the import should work; Python knows package.moduleA is a module within package, so it can use a relative import to look for a sibling package to moduleA named subpackage2, and then inside it it can find moduleZ.
Obviously, this is brittle: it only works if you cd to the src root directory before running Python, or hack the path to src into PYTHONPATH, which is a terrible hack if the code ever has to be run by anyone else. Ideally you make this an installable package and install it (in global site-packages, user site-packages, or within a virtual environment created with the built-in venv module or the third-party virtualenv module); then your working directory no longer matters, since site-packages will be part of your sys.path automatically. For simple testing, as long as the working directory is correct (not sure what it was for you) and you use -m correctly (you were using it incorrectly), relative imports will work, but it's not the long-term solution.
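The difference between the two invocations can be reproduced in isolation. The sketch below is only an illustration: it recreates a stripped-down copy of the layout in a temporary directory (the file contents are made up) and runs moduleA both by path and by import name, with src as the working directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run the same module as a script and with -m, from a throwaway src/ dir."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        sub = src / "package" / "subpackage2"
        sub.mkdir(parents=True)
        (src / "package" / "__init__.py").write_text("")
        (sub / "__init__.py").write_text("")
        (sub / "moduleZ.py").write_text("VALUE = 'moduleZ loaded'\n")
        (src / "package" / "moduleA.py").write_text(
            "from .subpackage2 import moduleZ\nprint(moduleZ.VALUE)\n"
        )

        # 1) By path: moduleA runs as __main__ with no parent package.
        as_script = subprocess.run(
            [sys.executable, "package/moduleA.py"],
            cwd=src, capture_output=True, text=True,
        )
        # 2) By import name: Python knows moduleA belongs to package.
        as_module = subprocess.run(
            [sys.executable, "-m", "package.moduleA"],
            cwd=src, capture_output=True, text=True,
        )
        return as_script, as_module


as_script, as_module = run_both_ways()
print(as_script.returncode, as_script.stderr.strip().splitlines()[-1])
print(as_module.returncode, as_module.stdout.strip())
```

The first run should fail with the "attempted relative import with no known parent package" error from the question, while the second should succeed.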
So first of all, the root directory for imports is the directory that contains the main script you run (or the current working directory when you use python -m).
By default, this directory is the root for all absolute imports in all of your modules.
So if your main script lives in the src directory, you can do imports like these:
from package.moduleA import *
from package.subpackage1.moduleX import *
But now, in the files moduleA and moduleX, you need to write imports relative to that root folder. If you want to import something from moduleY inside moduleX, you need to do:
# this is inside moduleX
from package.subpackage1.moduleY import *
This is because Python looks for modules in specific locations.
The first location is your root directory, the directory that contains your main script.
Another location is the site-packages directory, where pip installs modules.
You can check all of the directories using the following:
import sys
for p in sys.path:
    print(p)
Now, to solve your problem there are a couple of solutions.
The fast one, but IMHO not the best one, is to add all paths with submodules to sys.path, the list of directories where Python looks for modules.
import sys
new_path = "/path/to/application/app/folder/src/package/subpackage1"
if new_path not in sys.path:
    sys.path.append(new_path)
Another solution is to use the full path for imports in all package modules:
from package.subpackage1.moduleX import *
I think in your case this will be the correct solution.
You can also combine the two solutions.
First add the folders with subpackages to sys.path, then use the subpackage folders as root folders for imports. But that is a good solution only if you have a complex submodule structure, and it is not the best solution if you will later need to deploy your package as a wheel or share it between multiple projects.

Python module by SWIG in conda: Where do I have to place which file?

I am trying to generate Python bindings for a C++ shared library with SWIG and distribute the project with conda. The build process seems to work as I can execute
import mymodule as m
ret = m.myfunctioninmymodule()
in my build directory. Now I want to install files that are created (namely, mymodule.py and _mymodule.pyd) in my conda environment on Windows so that I can access them from everywhere. But where do I have to place the files?
What I have tried so far is to put both files in a package together with a __init__.py (which is empty, however) and write a setup.py as suggested here. The package has the form
- mypackage
|- __init__.py
|- mymodule.py
|- _mymodule.pyd
and is installed under C:\mypathtoconda\conda\envs\myenvironmentname\Lib\site-packages\mypackage-1.0-py3.6-win-amd64.egg. However, the python import (see above) fails with
ImportError: cannot import name '_mymodule'
It should be noted that under Linux this approach with the package works perfectly fine.
Edit: The __init__.py is empty because this is sufficient to build a package. I am not sure, however, what belongs in there. Do I have to give a path to certain components?

How do implicit relative imports work in Python?

Assume I have the following files,
pkg/
pkg/__init__.py
pkg/main.py # import string
pkg/string.py # print("Package's string module imported")
Now, if I run main.py, it says "Package's string module imported".
This makes sense and it works as per this statement in this link:
"it will first look in the package's directory"
Assume I modified the file structure slightly (added a core directory):
pkg/
pkg/__init__.py
pkg/core/__init__.py
pkg/core/main.py # import string
pkg/string.py # print("Package's string module imported")
Now, if I run python core/main.py, it loads the built-in string module.
In the second case too, if it has to comply with the statement "it will first look in the package's directory" shouldn't it load the local string.py because pkg is the "package directory"?
My sense of the term "package directory" is specifically the root folder of a collection of folders with __init__.py. So in this case, pkg is the "package directory". It is applicable to main.py and also to files in subdirectories like core/main.py, because they are part of this "package".
Is this technically correct?
PS: What follows after # in the code snippet is the actual content of the file (with no leading spaces).
Packages are directories with a __init__.py file, yes, and are loaded as a module when found on the module search path. So pkg is only a package that you can import and treat as a package if the parent directory is on the module search path.
But by running the pkg/core/main.py file as a script, Python added the pkg/core directory to the module search path, not the parent directory of pkg. You do have a __init__.py file on your module search path now, but that's not what defines a package. You merely have a __main__ module, there is no package relationship to anything else, and you can't rely on implicit relative imports.
You have three options:
Do not run files inside packages as scripts. Put a script file outside of your package, and have that import your package as needed. You could put it next to the pkg directory, or make sure the pkg directory is first installed into a directory already on the module search path, or by having your script calculate the right path to add to sys.path.
Use the -m command line switch to run a module as if it is a script. If you use python -m pkg.core Python will look for a __main__.py file and run that as a script. The -m switch will add the current working directory to your module search path, so you can use that command when you are in the right working directory and everything will work. Or have your package installed in a directory already on the module search path.
Have your script add the right directory to the module search path (based on os.path.abspath(__file__) to get a path to the current file). Take into account that your script is always named __main__, and importing pkg.core.main would add a second, independent module object; you'd have two separate namespaces.
I also strongly advise against using implicit relative imports. You can easily mask top-level modules and packages by adding a nested package or module with the same name. pkg/time.py would be found before the standard-library time module if you tried to use import time inside the pkg package. Instead, use the Python 3 model of explicit relative module references; add from __future__ import absolute_import to all your files, and then use from . import <name> to be explicit as to where your module is being imported from.
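For the third option, a minimal sketch of computing the directory to add might look like the following. The helper name and the "project" folder are hypothetical; the only real pieces are os.path.abspath, os.path.dirname and sys.path.

```python
import os.path
import sys


def add_package_parent_to_path(script_file, levels_up):
    """Put the directory that contains the top-level package on sys.path.

    levels_up is how many directories to climb from the script's own
    directory; for pkg/core/main.py the parent of pkg/ is two levels up.
    """
    root = os.path.dirname(os.path.abspath(script_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Inside pkg/core/main.py this would be called with __file__; a literal
# (hypothetical) path is used here so the snippet runs on its own.
root = add_package_parent_to_path(
    os.path.join("project", "pkg", "core", "main.py"), 2
)
print(root)
```

After this call, absolute imports such as import pkg.string resolve against the parent of pkg, regardless of the working directory. The caveat above about the script itself still being named __main__ continues to apply.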

Expanding sys.path via __init__.py

There are a lot of threads on importing modules from sibling directories, and the majority recommend either simply adding __init__.py to the source tree or modifying sys.path from inside those __init__.py files.
Suppose I have the following project structure:
project_root/
    __init__.py
    wrappers/
        __init__.py
        wrapper1.py
        wrapper2.py
    samples/
        __init__.py
        sample1.py
        sample2.py
All __init__.py files contain code which inserts the absolute path of the project_root/ directory into sys.path. I get "No module named x" no matter how I try to import the wrapperX modules into sampleX. And when I print sys.path from sampleX, it does not contain the path to project_root.
So how do I use __init__.py correctly to set up project environment variables?
Do not run sampleX.py directly, execute as module instead:
# (in project root directory)
python -m samples.sample1
This way you do not need to fiddle with sys.path at all (which is generally discouraged). It also makes it much easier to use the samples/ package as a library later on.
Oh, and __init__.py is not run because it only gets run/imported (which is more or less the same thing) if you import the samples package, not if you run an individual file as a script.
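This is easy to check with a throwaway copy of the samples package. The sketch below is an illustration only (the printed marker strings are made up): it writes an __init__.py that prints a marker and reports whether that marker shows up for each way of launching sample1.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def init_runs(command):
    """Return True if samples/__init__.py executed for the given command."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        samples = root / "samples"
        samples.mkdir()
        (samples / "__init__.py").write_text("print('samples/__init__.py ran')\n")
        (samples / "sample1.py").write_text("print('sample1 ran')\n")
        result = subprocess.run(
            [sys.executable, *command], cwd=root, capture_output=True, text=True
        )
        return "samples/__init__.py ran" in result.stdout


print(init_runs(["samples/sample1.py"]))       # run the file directly
print(init_runs(["-m", "samples.sample1"]))    # run it as a module
```

The direct run should report False and the -m run True, which is why any sys.path manipulation placed in __init__.py never takes effect when a sample is started as a plain script.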

Managing resources in a Python project

I have a Python project in which I am using many non-code files. Currently these are all images, but I might use other kinds of files in the future. What would be a good scheme for storing and referencing these files?
I considered just making a folder "resources" in the main directory, but there is a problem: some images are used from within sub-packages of my project. Storing these images that way would lead to coupling, which is a disadvantage.
Also, I need a way to access these files that is independent of what my current directory is.
You may want to use pkg_resources library that comes with setuptools.
For example, I've made up a quick little package "proj" to illustrate the resource organization scheme I'd use:
proj/setup.py
proj/proj/__init__.py
proj/proj/code.py
proj/proj/resources/__init__.py
proj/proj/resources/images/__init__.py
proj/proj/resources/images/pic1.png
proj/proj/resources/images/pic2.png
Notice how I keep all resources in a separate subpackage.
"code.py" shows how pkg_resources is used to refer to the resource objects:
import base64
from pkg_resources import resource_string, resource_listdir
# Itemize data files under proj/resources/images:
print(resource_listdir('proj.resources.images', ''))
# Get the data file bytes, base64-encoded for display:
print(base64.b64encode(resource_string('proj.resources.images', 'pic2.png')).decode('ascii'))
If you run it, you get output along these lines (the exact directory listing depends on the Python version):
['__init__.py', '__init__.pyc', 'pic1.png', 'pic2.png']
iVBORw0KGgoAAAANSUhE ...
If you need to treat a resource as a fileobject, use resource_stream().
The code accessing the resources may be anywhere within the subpackage structure of your project, it just needs to refer to subpackage containing the images by full name: proj.resources.images, in this case.
Here's "setup.py":
#!/usr/bin/env python
from setuptools import setup, find_packages
setup(name='proj',
      packages=find_packages(),
      package_data={'': ['*.png']})
Caveat:
To test things "locally", that is, without installing the package first, you'll have to invoke your test scripts from the directory that has setup.py. If you're in the same directory as code.py, Python won't know about the proj package, so things like proj.resources won't resolve.
The newer way of doing this is with importlib.resources. Its files() function is in the standard library from Python 3.9 onwards; for older versions you can add a dependency on the importlib_resources backport and do something like
from importlib_resources import files
def get_resource(module: str, name: str) -> str:
    """Load a textual resource file."""
    return files(module).joinpath(name).read_text(encoding="utf-8")
If your resources live inside the foo/resources sub-module, you would then use get_resource like so
resource_text = get_resource('foo.resources', 'myresource')
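On interpreters where files() is already in the standard library (Python 3.9 and later), the same helper can be tried without any extra dependency. The sketch below is self-contained: it builds a throwaway foo.resources package in a temporary directory and puts that directory on sys.path, which stands in for installing the package. The resource content is made up for the illustration.

```python
import sys
import tempfile
from importlib.resources import files  # standard library, Python 3.9+
from pathlib import Path


def get_resource(package: str, name: str) -> str:
    """Load a textual resource file that ships inside a package."""
    return files(package).joinpath(name).read_text(encoding="utf-8")


# Build a throwaway package so the example runs on its own.
tmp = tempfile.mkdtemp()
resources = Path(tmp) / "foo" / "resources"
resources.mkdir(parents=True)
(Path(tmp) / "foo" / "__init__.py").write_text("")
(resources / "__init__.py").write_text("")
(resources / "myresource").write_text("hello from a resource\n", encoding="utf-8")

sys.path.insert(0, tmp)  # stands in for installing the package
print(get_resource("foo.resources", "myresource"))
```

Because the lookup goes through the import system, the calling code does not need to know where foo ended up on disk.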
You can always have a separate "resources" folder in each subpackage which needs it, and use os.path functions to get to these from the __file__ values of your subpackages. To illustrate what I mean, I created the following __init__.py file in three locations:
c:\temp\topp (top-level package)
c:\temp\topp\sub1 (subpackage 1)
c:\temp\topp\sub2 (subpackage 2)
Here's the __init__.py file:
import os.path
resource_path = os.path.join(os.path.split(__file__)[0], "resources")
print(resource_path)
In c:\temp\work, I create an app, topapp.py, as follows:
import topp
import topp.sub1
import topp.sub2
This represents the application using the topp package and subpackages. Then I run it:
C:\temp\work>topapp
Traceback (most recent call last):
  File "C:\temp\work\topapp.py", line 1, in <module>
    import topp
ImportError: No module named topp
That's as expected. We set the PYTHONPATH to simulate having our package on the path:
C:\temp\work>set PYTHONPATH=c:\temp
C:\temp\work>topapp
c:\temp\topp\resources
c:\temp\topp\sub1\resources
c:\temp\topp\sub2\resources
As you can see, the resource paths resolved correctly to the location of the actual (sub)packages on the path.
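The same idea can be written with pathlib. This is only a sketch under the layout described above: the helper name is made up, and "pic1.png" is a hypothetical file used for illustration.

```python
from pathlib import Path


def resource_path(module_file, name=None):
    """Return the resources/ folder next to a module, or a file inside it."""
    base = Path(module_file).resolve().parent / "resources"
    return base / name if name else base


# In topp/sub1/__init__.py this would be resource_path(__file__, "pic1.png");
# a literal path is used here so the snippet runs on its own.
print(resource_path("topp/sub1/__init__.py", "pic1.png"))
```

Each subpackage calls the helper with its own __file__, so the lookup stays independent of the current working directory.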
Update: Here's the relevant py2exe documentation.
At PyCon 2009, there was a presentation on distutils and setuptools. You can find all of the videos here
Eggs and Buildout Deployment in Python - Part 1
Eggs and Buildout Deployment in Python - Part 2
Eggs and Buildout Deployment in Python - Part 3
In these videos, they describe how to include static resources in your package. I believe it's in part 2.
With setuptools, you can define dependencies; this would allow you to have two packages that use resources from a third package.
Setuptools also gives you a standard way of accessing these resources and allows you to use relative paths inside of your packages, which eliminates the need to worry about where your packages are installed.
