Get Path to Python File Before Installation

I have a project that includes C++ binaries and Python scripts, set up so that it should be installed using setuptools. One of the Python files is intended both to be used as a script,
python3 script_name.py params
and to have its primary function used in other Python projects via from script_name import function.
The primary function calls a binary which is in a known relative location before the installation (the user is expected to call pip install project_folder). So in order to call the binary I want to get this file's location (pre-installation).
To get this I used something like
Path(__file__).resolve().parent
however, since the installation moves the file to another folder like ~/.local/... this doesn't work when imported after the installation.
Is there a way to get the original file path, or to make the installation save that path somewhere?
EDIT:
After @sinoroc's suggestion I tried including the binary as a resource by putting an __init__.py in the build folder and putting
from importlib.resources import files
import build
binary = files(build).joinpath("binary")
in the main __init__.py. After that, package.binary still gives me a path into my .local/lib and binary.is_file() still returns False.
from importlib_resources import files
GenerateHistograms = files("build").joinpath("GenerateHistograms")
gave the same result

Since you are installing your package, you also need to include your C++ binary in the installation. You cannot have a mixed setup. I suggest something like this.
In your setup.py:
from setuptools import setup, find_packages

setup(
    name="mypkg",
    packages=find_packages(exclude=["tests"]),
    package_data={
        "mypkg": [
            "binary",  # relative path to your package directory
        ]
    },
    include_package_data=True,
)
Then in your module use pkg_resources:
from pathlib import Path
from pkg_resources import resource_filename
# "binary" is whatever relative path you put in package_data
path_to_binary = Path(resource_filename("mypkg", "binary"))
pkg_resources should be pulled in by setuptools.
EDIT: the recipe above might be a bit out of date; as @sinoroc suggests, using importlib.resources instead of pkg_resources is probably the modern equivalent.
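For reference, a rough importlib.resources equivalent of the recipe above (a sketch, assuming Python 3.9+ or the importlib_resources backport; as_file guarantees a real filesystem path even if the package is not installed unpacked):

from importlib.resources import files, as_file

# "binary" is whatever relative path you put in package_data
with as_file(files("mypkg").joinpath("binary")) as path_to_binary:
    print(path_to_binary)  # a concrete pathlib.Path, valid inside this block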

So I managed to solve it with @sinoroc's approach in the end.
In setup.py
package_data={'package':['build/*']}
include_package_data=True
and in the primary __init__.py:
from importlib.resources import files
binary = files("package.build").joinpath("binary")
So I could then do from package import binary to get the path.
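To actually invoke it, something like the following sketch should work (the argument is made up for illustration; as_file guards against the package not being unpacked on disk):

import subprocess
from importlib.resources import as_file

from package import binary  # the Traversable defined in __init__.py above

with as_file(binary) as binary_path:
    # run the C++ binary; "--example-arg" is a hypothetical parameter
    subprocess.run([str(binary_path), "--example-arg"], check=True)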
EDIT: Looks like someone also pointed out the error in my ways before I finished ^^

Related

Why do I get an import error when a child file is trying to access a parent file

My directory is structured like this
>project
    >tests
        >test_to_json.py
    >app.py
    >x.db
in my test_to_json.py I want to access a class from app.py
How would I import it? I've tried from ..app import myClass but that gives me ImportError: attempted relative import with no known parent package. Is there something I'm missing?
Relative imports like .. only work when the importing file is itself part of a package; when you run test_to_json.py directly, Python has no known parent package to resolve .. against, which is exactly what the error says. The usual resolution for this problem is to convert your project into a Python package.
Extensive tutorials for doing so can be found here, but I will give an example of how to convert your project into a package.
Step 1
The new file structure should look like this:
>project
    >tests
        >__init__.py #Note this file
        >test_to_json.py
    >__init__.py #Note this file
    >setup.py #Note this file
    >app.py
    >x.db
Step 2
Write your setup.py.
Here is a generic setup.py that should work for your project:
from setuptools import setup, find_packages

setup(
    name='project_package',  # the name of your desired import
    version='0.0.1',
    author='An Awesome Coder',
    packages=find_packages(),
    description='An awesome package that does something',
    install_requires=[],  # a list of python dependencies for your package
)
find_packages() looks for all the __init__.py files in your package to identify submodules.
Step 3
Install your package. In the folder with your new setup.py, run pip install -e . This will install your package on your computer, basically adding the files to your python system path.
Step 4
From any python terminal on your computer you should now be able to import your package using the package_name you specified.
import project_package
from project_package.app import myClass
myClass()
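A minimal sketch of what tests/test_to_json.py could then look like (the class name myClass comes from the question; note that the importable name is ultimately determined by your package directory, so adjust the import if it differs from the name= you chose):

import unittest

from project_package.app import myClass  # assumed import path, matching the lines above

class TestToJson(unittest.TestCase):
    def test_can_instantiate(self):
        # smoke test: the class imports and instantiates
        self.assertIsNotNone(myClass())

if __name__ == "__main__":
    unittest.main()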

CX_freeze with ruamel.yaml

I can't get CX_Freeze to include the package ruamel.yaml with the build_exe.
I've tried adding it to the "packages" option like
build_exe_options = {
    ...
    "packages": [
        ...
        "ruamel.yaml",
        ...
    ]
    ...
}
cx_Freeze.setup(
    ...
    executables=[cx_Freeze.Executable("pyhathiprep/__main__.py",
                                      targetName="pyhathiprep.exe", base="Console")],
)
and I get
File "C:\Users\hborcher\PycharmProjects\pyhathiprep\.env\lib\site-packages\cx_Freeze\finder.py", line 350, in _ImportModule
raise ImportError("No module named %r" % name)
ImportError: No module named 'ruamel.yaml'
I've tried adding it to the "namespace_packages" like
build_exe_options = {
    ...
    "namespace_packages": ["ruamel.yaml"]
    ...
}
cx_Freeze.setup(
    ...
    executables=[cx_Freeze.Executable("pyhathiprep/__main__.py",
                                      targetName="pyhathiprep.exe", base="Console")],
)
and I get
File "C:\Users\hborcher\PycharmProjects\pyhathiprep\.env\lib\site-packages\cx_Freeze\finder.py", line 221, in _FindModule
return None, module.__path__[0], info
TypeError: '_NamespacePath' object does not support indexing
What am I doing wrong?
The doc for ruamel.yaml clearly states that you have to use a recent version of pip and setuptools to install ruamel.yaml.
CX_Freeze is not calling pip, nor does it support installing from the (correctly pre-configured) .whl files. Instead it does seem to call setup() in a way of its own.
What you can try to do is create a ruamel directory in your source directory, then in that directory create an empty __init__.py file and yaml directory. In that yaml directory copy all of the .py files from an unpacked latest version of ruamel.yaml skipping setup.py and all of the other install cruft. Alternatively you can check those files out from Bitbucket, but then there is even more unnecessary cruft to deal with, and you run the slight risk of having a non-released intermediate version if you don't check out by release tag.
Once that works you'll have a "pure" Python version of ruamel.yaml in your frozen application.
If you are using yaml = YAML(typ='safe') or yaml = YAML(typ='unsafe') and you expect the speed-up from the C-based loader and dumper, then you should look at using the Windows .whl files provided on PyPI. They include the _ruamel_yaml.cpXY-win_NNN.pyd files. If you don't know your target (Python version and/or win32|win_amd64), you should be able to include all of them and ruamel.yaml will pick the right one when it starts (actually it only does from _ruamel_yaml import CParser, CEmitter and assumes the Python interpreter knows what to do).
Okay, I figured out a solution. I think it might be a bug in cx_Freeze. If I pip install both ruamel.base and ruamel.yaml, cx_Freeze seems to include everything correctly. This is true even if I ask it to only include ruamel.yaml.
If I have both ruamel.base and ruamel.yaml installed, then this works...
build_exe_options = {
    ...
    "namespace_packages": ["ruamel.yaml"]
    ...
}
cx_Freeze.setup(
    ...
    executables=[cx_Freeze.Executable("pyhathiprep/__main__.py",
                                      targetName="pyhathiprep.exe", base="Console")],
)
I had this same problem with azure. The issue is the way Microsoft structured the azure package - you can import azure.something.something_else.module, but you can't import azure directly. cx_Freeze needs to be able to find the folder azure (or in your case, the folder ruamel) directly, not just the subfolders.
I had to go to each directory under the azure folder that I was accessing and ensure there was an __init__.py file there. After that, cx_Freeze was able to find it perfectly.
Another option would be to just directly copy the folder from a path that you know (direct link to your site-packages, or copy the ruamel directory into your program directory and copy it from there) into the build folder as part of your setup. I do this for things like my data files:
import shutil
shutil.copytree("icons","build/exe.win32-3.6/icons")
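Depending on your cx_Freeze version, the include_files build option can do the same copy for you, so you don't have to hard-code the build folder name; a rough sketch reusing the paths from this answer:

build_exe_options = {
    # copy the icons folder (or a ruamel folder prepared as described above)
    # next to the frozen executable; entries are (source, destination) pairs
    "include_files": [("icons", "icons")],
}
cx_Freeze.setup(
    options={"build_exe": build_exe_options},
    executables=[cx_Freeze.Executable("pyhathiprep/__main__.py",
                                      targetName="pyhathiprep.exe", base="Console")],
)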

How to properly write the import statement

How do I properly write something like this:
from ../lib.model import mymodel
Here is the tree:
lib----->model---->mynodel.py
|
|----->myscript--->myscript.py
If your script is using lib, you can create a setup.py file for your project using setuptools.
Using the setuptools develop command will create a "development mode" install of your project and put it on your Python path. It then becomes easy to use it like you would use any Python package.
Your setup.py can be as simple as:
from setuptools import setup, find_packages

setup(
    name="lib",
    version="0.1dev",
    packages=find_packages(),
)
Then you can develop on your project like
python setup.py develop
Now you can import your package into any script you want
from lib.model import mymodel
If lib is a package, run myscript as a module and import mymodel like this:
from ..model import mymodel # relative import
Or:
from lib.model import mymodel # absolute import
To run myscript.py as a module in the lib package, do one of the following:
run a program in the folder containing lib that imports lib.myscript.myscript
run myscript.py as a module from the folder containing lib, using python -m lib.myscript.myscript
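For either approach to work, lib and its subdirectories need to be packages; a sketch of the assumed layout (the __init__.py files are the part the tree in the question does not show):

lib/
    __init__.py
    model/
        __init__.py
        mymodel.py
    myscript/
        __init__.py
        myscript.py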
Assuming you are calling from myscript.py.
Try this:
import sys
sys.path.insert(0, '../model/')
import mynodel
mynodel is probably mymodel, I think you made a typo in your post.
Never put the extension in the import statement.
sys.path is a list of paths where python will look for library files. You can simply add a relative path to the directory you want. By putting it at the front of the list, you ensure that python will first look for the file at the specified path (say for instance there is a library with the same name, your file will have priority).
Furthermore it could be useful to give the output of tree (a linux and cmd (Windows) command). This gives standardized output.

Python setuptools: how to include a config file for distribution into <prefix>/etc

How can I write setup.py so that:
The binary egg distribution (bdist_egg) includes a sample configuration file and
Upon installation puts it into the {prefix}/etc directory?
A sample project source directory looks like this:
bin/
    myapp
etc/
    myapp.cfg
myapp/
    __init__.py
    [...]
setup.py
The setup.py looks like this:
from distutils.command.install_data import install_data

packages = ['myapp', ]
scripts = ['bin/myapp', ]
cmdclasses = {'install_data': install_data}
data_files = [('etc', ['etc/myapp.cfg'])]

setup_args = {
    'name': 'MyApp',
    'version': '0.1',
    'packages': packages,
    'cmdclass': cmdclasses,
    'data_files': data_files,
    'scripts': scripts,
    # 'include_package_data': True,
    'test_suite': 'nose.collector'
}

try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup

setup(**setup_args)
setuptools is installed in both the build environment and the installation environment.
Commenting 'include_package_data' in or out does not help.
I was doing some research on this issue and I think the answer is in the setuptools documentation: http://peak.telecommunity.com/DevCenter/setuptools#non-package-data-files
Next, I quote the extract that I think has the answer:
Non-Package Data Files
The distutils normally install general "data files" to a
platform-specific location (e.g. /usr/share). This feature is intended to
be used for things like documentation, example configuration files,
and the like. setuptools does not install these data files in a
separate location, however. They are bundled inside the egg file or
directory, alongside the Python modules and packages. The data files
can also be accessed using the Resource Management API [...]
Note, by the way, that this encapsulation of data files means that you
can't actually install data files to some arbitrary location on a
user's machine; this is a feature, not a bug. You can always include a
script in your distribution that extracts and copies the
documentation or data files to a user-specified location, at their
discretion. If you put related data files in a single directory, you
can use resource_filename() with the directory name to get a
filesystem directory that then can be copied with the shutil module.
[...]
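Following that suggestion, a small helper script shipped with the distribution could copy the sample config to a user-chosen directory. A sketch, assuming myapp.cfg is bundled inside the myapp package (e.g. via package_data) so the Resource Management API can see it:

import shutil
import sys
from pkg_resources import resource_filename

def install_config(dest_dir):
    """Copy the bundled sample config to a directory chosen by the user."""
    # assumes etc/myapp.cfg was moved inside the myapp package
    src = resource_filename('myapp', 'myapp.cfg')
    shutil.copy(src, dest_dir)

if __name__ == '__main__':
    install_config(sys.argv[1])  # e.g. python install_config.py /usr/local/etc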

Managing resources in a Python project

I have a Python project in which I am using many non-code files. Currently these are all images, but I might use other kinds of files in the future. What would be a good scheme for storing and referencing these files?
I considered just making a folder "resources" in the main directory, but there is a problem; Some images are used from within sub-packages of my project. Storing these images that way would lead to coupling, which is a disadvantage.
Also, I need a way to access these files which is independent on what my current directory is.
You may want to use the pkg_resources library that comes with setuptools.
For example, I've made up a quick little package "proj" to illustrate the resource organization scheme I'd use:
proj/setup.py
proj/proj/__init__.py
proj/proj/code.py
proj/proj/resources/__init__.py
proj/proj/resources/images/__init__.py
proj/proj/resources/images/pic1.png
proj/proj/resources/images/pic2.png
Notice how I keep all resources in a separate subpackage.
"code.py" shows how pkg_resources is used to refer to the resource objects:
from pkg_resources import resource_string, resource_listdir
# Itemize data files under proj/resources/images:
print resource_listdir('proj.resources.images', '')
# Get the data file bytes:
print resource_string('proj.resources.images', 'pic2.png').encode('base64')
If you run it, you get:
['__init__.py', '__init__.pyc', 'pic1.png', 'pic2.png']
iVBORw0KGgoAAAANSUhE ...
If you need to treat a resource as a fileobject, use resource_stream().
The code accessing the resources may be anywhere within the subpackage structure of your project; it just needs to refer to the subpackage containing the images by its full name: proj.resources.images, in this case.
Here's "setup.py":
#!/usr/bin/env python
from setuptools import setup, find_packages

setup(name='proj',
      packages=find_packages(),
      package_data={'': ['*.png']})
Caveat:
To test things "locally", that is without installing the package first, you'll have to invoke your test scripts from the directory that has setup.py. If you're in the same directory as code.py, Python won't know about the proj package, so things like proj.resources won't resolve.
The new way of doing this is with importlib.resources. On older Python versions (files() needs Python 3.9) you can add a dependency on the importlib_resources backport and do something like
from importlib_resources import files
def get_resource(module: str, name: str) -> str:
    """Load a textual resource file."""
    return files(module).joinpath(name).read_text(encoding="utf-8")
If your resources live inside the foo/resources sub-module, you would then use get_resource like so
resource_text = get_resource('foo.resources', 'myresource')
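For binary resources such as the images from the question, the same pattern works with read_bytes(); a sketch reusing the proj.resources.images layout from the earlier answer:

from importlib_resources import files

def get_binary_resource(module: str, name: str) -> bytes:
    """Load a resource file as raw bytes (e.g. an image)."""
    return files(module).joinpath(name).read_bytes()

pic_bytes = get_binary_resource('proj.resources.images', 'pic1.png')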
You can always have a separate "resources" folder in each subpackage which needs it, and use os.path functions to get to these from the __file__ values of your subpackages. To illustrate what I mean, I created the following __init__.py file in three locations:
c:\temp\topp (top-level package)
c:\temp\topp\sub1 (subpackage 1)
c:\temp\topp\sub2 (subpackage 2)
Here's the __init__.py file:
import os.path
resource_path = os.path.join(os.path.split(__file__)[0], "resources")
print resource_path
In c:\temp\work, I create an app, topapp.py, as follows:
import topp
import topp.sub1
import topp.sub2
This represents the application using the topp package and subpackages. Then I run it:
C:\temp\work>topapp
Traceback (most recent call last):
File "C:\temp\work\topapp.py", line 1, in
import topp
ImportError: No module named topp
That's as expected. We set the PYTHONPATH to simulate having our package on the path:
C:\temp\work>set PYTHONPATH=c:\temp
C:\temp\work>topapp
c:\temp\topp\resources
c:\temp\topp\sub1\resources
c:\temp\topp\sub2\resources
As you can see, the resource paths resolved correctly to the location of the actual (sub)packages on the path.
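For what it's worth, a pathlib version of the same __init__.py (the same idea, just a more modern spelling):

from pathlib import Path

# "resources" directory sitting next to this __init__.py
resource_path = Path(__file__).resolve().parent / "resources"
print(resource_path)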
Update: Here's the relevant py2exe documentation.
At PyCon 2009, there was a presentation on distutils and setuptools. You can find all of the videos here:
Eggs and Buildout Deployment in Python - Part 1
Eggs and Buildout Deployment in Python - Part 2
Eggs and Buildout Deployment in Python - Part 3
In these videos, they describe how to include static resources in your package. I believe it's in part 2.
With setuptools, you can define dependencies; this would allow you to have two packages that use resources from a third package.
Setuptools also gives you a standard way of accessing these resources and allows you to use relative paths inside of your packages, which eliminates the need to worry about where your packages are installed.
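A rough sketch of what that looks like, with hypothetical package names: declare the third package as a dependency in your setup.py, then access its bundled resources through pkg_resources from either consuming package:

# setup.py of a consuming package (names are made up for illustration)
from setuptools import setup, find_packages

setup(
    name='consumer_pkg',
    packages=find_packages(),
    install_requires=['shared_resources_pkg'],  # the third package that ships the resources
)

# anywhere inside consumer_pkg's code
from pkg_resources import resource_string
logo_bytes = resource_string('shared_resources_pkg.resources', 'logo.png')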
