For a Python project, I want to ship a data file with the package.
Following various (and partially contradictory) advice on the mess that is Python package data, I ended up trying different things and got it to work locally on my machine with the following setup.
My setup.cfg contains, among other things that shouldn't matter here,
[options]
include_package_data = True
and no package_data or other data related keys. My MANIFEST.in states
recursive-include lexedata clics3-network.gml.zip
My setup.py is pretty bare, essentially
from setuptools import setup
readline = "readline"
setup(extras_require={"formatguesser": [readline]})
To load the file, I use
pkg_resources.resource_stream("lexedata", "data/clics3-network.gml.zip")
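Wrapped in code, that loading step looks like this (a minimal sketch; the helper name is mine, the package and resource path are from the question):

```python
import pkg_resources  # shipped with setuptools

def open_clics_network():
    # Returns a binary file-like object for the packaged zip; unlike
    # building a path from __file__, this also works when the package
    # is installed as a zipped egg.
    return pkg_resources.resource_stream("lexedata", "data/clics3-network.gml.zip")
```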
I test this using tox, configured with
[tox]
isolated_build = True
envlist = general
[testenv]
passenv = CI
deps =
codecov
pytest
pytest-cov
commands =
pytest --doctest-modules --cov=lexedata {envsitepackagesdir}/lexedata
pytest --cov=lexedata --cov-append test/
codecov
On my local machine, when I run pip install ., the data file lexedata/data/clics3-network.gml.zip is properly deposited inside the site-packages/lexedata/data directory of the corresponding virtual environment, and tox packages it inside .tox/dist/lexedata-1.0.0b3.tar.gz as well as in its venv site-packages directory .tox/general/lib/python3.8/site-packages/lexedata/data/.
However, continuous integration using GitHub Actions fails on all Python 3 versions I'm testing with
UNEXPECTED EXCEPTION: FileNotFoundError(2, 'No such file or directory')
FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/lexedata/lexedata/.tox/general/lib/python3.10/site-packages/lexedata/data/clics3-network.gml.zip'
at the equivalent of that same tox venv path.
What could be going wrong here?
You almost got it right; try updating your MANIFEST.in to any of the following examples:
include src/lexedata/data/*.zip
recursive-include src/* *.zip
recursive-include **/data clics3-network.gml.zip
As the docs explain, the include command defines files as paths relative to the root of the project (which is why the first example starts from the src folder).
recursive-include expects its first argument to be a dir-pattern (glob-style), so it is better to include asterisks.
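As an alternative to MANIFEST.in, the same data file can be declared directly in setup.cfg (a sketch, assuming the src layout and file names used in the question):

```ini
[options.package_data]
lexedata = data/*.zip
```

With package_data, the paths are relative to the package itself, so no src/ prefix is needed.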
Related
I recently started using poetry to manage project dependencies,
rather than using requirements.txt and test-requirements.txt and
pip.
Since making the change, I'm not able to get coverage tests to work
correctly. In both cases, I'm using tox to drive the testing (and I
have the tox-poetry extension installed).
My tox.ini currently looks like this:
[tox]
isolated_build = True
envlist = pep8,unit
[testenv]
whitelist_externals = poetry
[testenv:venv]
commands = {posargs}
[testenv:pep8]
commands =
poetry run flake8 {posargs:symtool}
[testenv:unit]
commands =
poetry run pytest --cov=symtool {posargs} tests/unit
Previously, it looked like this:
[tox]
envlist = pep8,unit
[testenv]
usedevelop = True
install_command = pip install -U {opts} {packages}
deps = -r{toxinidir}/requirements.txt
-r{toxinidir}/test-requirements.txt
[testenv:venv]
commands = {posargs}
[testenv:pep8]
commands =
flake8 {posargs:symtool}
[testenv:unit]
commands =
pytest --cov=symtool {posargs} tests/unit
Since making the change to poetry, when I run e.g. tox -e unit, I see:
unit run-test: commands[0] | poetry run pytest --cov=symtool tests/unit
===================================== test session starts =====================================
platform linux -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
cachedir: .tox/unit/.pytest_cache
rootdir: /home/lars/projects/symtool, configfile: tox.ini
plugins: cov-2.11.1
collected 14 items
tests/unit/test_disasm.py ..... [ 35%]
tests/unit/test_symtool.py ......... [100%]
Coverage.py warning: No data was collected. (no-data-collected)
I'm trying to figure out that no-data-collected issue. According to
pytest --help, the --cov arguments set the path or package
name:
--cov=[SOURCE] Path or package name to measure during execution
The root of the repository (which is rootdir in the above output
from tox) looks like this:
asm
pyproject.toml
README.md
reference
symtool
tests
tox.ini
There's definitely a symtool directory there containing the package.
And even if the tests were somehow not running in the project root
directory, the symtool package is installed in the test environment,
as evidenced by the fact that the unit tests are actually passing (all
of which include some variant of import symtool).
How do I get coverage to work again?
Turning my comments into an answer:
This is an old issue with pytest-cov and tox (first reported in issue #38). tox installs your project as a third-party package and pytest will import it from the site (e.g. from .tox/unit/lib/python3.X/site-packages if the job is named unit), while --cov=symtool instructs pytest-cov to collect coverage over the symtool dir in project root.
One solution is to switch to the src layout:
├── pyproject.toml
├── README.md
├── reference
├── src
| └── symtool
├── tests
└── tox.ini
In your pyproject.toml, you will need to point Poetry to the source dir:
[tool.poetry]
...
packages = [
{ include = "symtool", from = "src" },
]
Now the src layout prevents importing code from the source tree, so --cov=symtool will collect coverage over the installed modules, those being the only copies left to import. For the rest of the dev process you shouldn't run into trouble, as poetry install installs the project in editable mode, so everything else should just work.
Another option is to skip package installation with tox and install in editable mode instead. Example snippet to place in tox.ini:
[testenv]
whitelist_externals =
poetry
skip_install = true
commands_pre =
poetry install
commands =
poetry run pytest ...
This pretty much kills testing of the installed modules though (you stop testing explicitly whether your project can be installed in a blank venv, instead using what's in the source tree), so I'd go with the src layout option.
If you are using poetry, you probably want to use the locked dependencies for pytest, flake8, ... in tox as well. This can be achieved with tox-poetry-installer.
To make sure your tests run against the built and installed package and not the local files, you must use pytest's --import-mode flag and set it to importlib.
To measure the coverage with pytest-cov, you have to point it at the {envsitepackagesdir}.
To put it all together, your tox.ini can look like this:
[tox]
isolated_build = true
requires =
tox-poetry-installer[poetry] == 0.6.0
envlist = py39
[testenv]
locked_deps =
pytest
pytest-cov
commands = pytest --cov {envsitepackagesdir}/mypackage --import-mode=importlib
What does --import-mode do?
To run the tests, pytest needs to import the test modules. Traditionally this is done by prepending the path of the root folder containing the discovered test folder to sys.path. Usually the test folder and the package folder share the same root folder, so the side effect of prepending to sys.path is that tests run against the package folder and not the installed package, because Python finds that folder first.
Instead of prepending to sys.path, one can advise pytest to append the discovered root folder to it. That way, Python will look at the site-packages folder first when importing, so one can test against the installed package.
Manipulating sys.path is almost always a bad idea, as it can lead to unwanted side effects. The pytest docs describe one:
Same as prepend, requires test module names to be unique when the test directory tree is not arranged in packages, because the modules will be put in sys.modules after importing.
The third option, importlib, was introduced with pytest 6. It uses Python's built-in importlib to load a test module dynamically without manipulating sys.path. They plan to make this option the default in a future release.
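The shadowing effect that prepend causes can be illustrated with two directories providing the same module name (a self-contained sketch; demo_mod stands in for the package under test):

```python
import pathlib
import sys
import tempfile

# Two directories each containing a module named "demo_mod", mimicking
# the source checkout vs the installed copy in site-packages.
src = pathlib.Path(tempfile.mkdtemp())
site = pathlib.Path(tempfile.mkdtemp())
(src / "demo_mod.py").write_text("WHERE = 'source tree'\n")
(site / "demo_mod.py").write_text("WHERE = 'site-packages'\n")

# prepend semantics: the source tree ends up earlier on sys.path,
# so it shadows the installed copy.
sys.path.insert(0, str(site))
sys.path.insert(0, str(src))
import demo_mod

print(demo_mod.WHERE)  # -> source tree
```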
I've got a repo on GitHub, for which I wanted to implement Travis CI with pytest for basic testing. Currently the Travis CI build fails when loading data tables within a module, raising a FileNotFoundError.
To make it short, here is the (imho) most important information on the build:
directory of the data tables is included in MANIFEST.in with include mypkg/data_tables/* (see below for a detailed structure)
setuptools.setup method has the include_package_data=True parameter
additionally packages=setuptools.find_packages() is provided
Travis CI installs the package with install: pip install -e .
Travis CI pytest is invoked with script: pytest --import-mode=importlib
during testing the first tests succeed. But when it comes to loading the data tables, pytest raises the error FileNotFoundError: [Errno 2] No such file or directory: '/home/travis/build/myname/mypkg/mypkg/data_tables\\my_data.csv'
Interestingly, the slashes before the file name are backslashes, while the others are not, even though the final path is constructed with os.path.abspath().
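That mixed-separator path hints at a likely cause (an assumption, since the repo is private): a path segment hard-coded with a Windows backslash. On Linux, a backslash is an ordinary filename character, so os.path.abspath() leaves it alone:

```python
import os.path

# A backslash embedded in the relative path survives abspath() on
# POSIX systems, producing a file name that does not exist on disk.
p = os.path.abspath(os.path.join("mypkg", "data_tables\\my_data.csv"))
print(p)  # .../mypkg/data_tables\my_data.csv -- mixed separators
```

If that is indeed the cause, the fix is to build the segment with os.path.join("data_tables", "my_data.csv") instead of embedding the separator in a string.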
Detailed description
Unfortunately the repo is private and I'm not allowed to share it, so I'll try to describe the package layout in as much detail as possible. Let's say my repo is built with a structure like this (general layout taken from this example):
setup.py
MANIFEST.in
mypkg/
some_data_tables/
my_data.csv
my_other_data.pkl
__init__.py
view.py
tests/
test_view.py
My minimum MANIFEST.in looks like this:
include mypkg/data_tables/*
With the setup.py fully reduced to a minimum working example like this:
from setuptools import find_packages, setup
setup(
name='Mypkg',
version='123.456',
description='some_text',
python_requires='>=3.7.7',
packages=find_packages( # <---- this should be sufficient, right?
exclude=["tests", "*.tests", "*.tests.*", "tests.*"]),
include_package_data=True, # <---- also this should work
)
And the .travis.yml file (omitting - pip install -r requirements.txt etc.):
language: python
python:
- "3.7.7"
dist: xenial
install:
- pip install -e .
script:
- pytest --import-mode=importlib
Checking the content of the .egg or tar.gz files, the data tables are included. So I have no idea where the files are "getting lost".
Any idea how to solve this error?
If more information would help, e.g. on the class initialized in test_view, please tell me.
I am adding unit tests to a kind of "legacy" Python package. Some of the modules contain their own doctests embedded in docstrings. My goal is to run both those doctests and new, dedicated unit tests.
Following this Q&A ("How to make py.test run doctests as well as normal tests directory?") I'm using the --doctest-modules option to pytest. When running from the source repository, pytest indeed discovers the embedded doctests from Python modules under the src directory.
However, my goal is to test that the source distribution builds and installs at all, and then test everything against the installed package. To do this I'm using tox, which automates the process of building an sdist (source distribution) tarball, installing it in a virtual environment, and running the tests against the installed version. To ensure that it is the installed version, rather than the one in the source repository, that is imported by the tests, I follow the suggestion in this article, and the repository now looks like this:
repo/
src/
my_package/
__init__.py
module_a.py
module_b.py
...
tests/
test_this.py
test_that.py
requirements.txt
setup.py
tox.ini
(The test scripts under tests import the package as in import my_package, which hits the installed version, because the repository layout makes sure that the src/my_package directory is out of the module search paths.)
And in the tox configuration file, the relevant sections looks like
[tox]
envlist = py27,py36,coverage-report
[testenv]
deps =
-rrequirements.txt
commands =
coverage run -p -m pytest --
[pytest]
addopts = --doctest-modules
So far, the tests run fine, and the doctests are picked up -- from the modules under src/my_package, rather than from the package installed in tox virtual environments.
My questions related to this set-up is as follows:
Is this actually a concern? tox seems to ensure that what you install is what you have in the source repository, but does it?
How can I instruct pytest to actually run doctests from the installed modules in a sort of clean way? I can think of a few solutions such as building a dedicated documentation tree in a doc directory and let pytest find the doctests in there with the --doctest-glob option. But is there a method to do this without building the docs first?
I found the answer to my own question.
To quote pytest issue #2042:
currently doctest-modules is fundamentally incompatible with testing
against a installed package due to module search in checkout vs module
usage from site-packages
So as of now the solution does not exist.
pytest collects tests from the current directory (unless you instruct it otherwise passing an explicit directory). Add the installation directory using tox substitutions in tox.ini. I.e., either pass the directory:
[testenv]
deps =
-rrequirements.txt
commands =
coverage run -p -m pytest {envsitepackagesdir}/my_package
or change directory:
[testenv]
changedir = {envsitepackagesdir}/my_package
deps =
-rrequirements.txt
commands =
coverage run -p -m pytest --
This is how I solved the import mismatch:
$ pytest tests/ --doctest-modules --pyargs myrootpkg <other args>
The problem here is that once you start specifying source paths explicitly (in this case via --pyargs), you have to specify all other source paths as well (tests/ in this example) as pytest will stop scanning the rootdir. This shouldn't be an issue when using the src layout though, as the tests aren't usually scattered around the repository.
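Translated into the tox setup from the question, that could look like this (a sketch; my_package and the tests/ directory as in the layout above):

```ini
[testenv]
deps =
    -rrequirements.txt
commands =
    coverage run -p -m pytest tests/ --doctest-modules --pyargs my_package
```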
I created a private Python package that requires an XML file. When I run the package locally and on CircleCi, everything works great. Now, when I run code that installs the package as a dependency, I keep getting an error:
<urlopen error [Errno 2] No such file or directory: '/home/ubuntu/virtualenvs/venv-system/local/lib/python2.7/site-packages/...../metadata_wsdl.xml'>
Does anyone know what could be wrong? I have not been able to figure this one out.
You need to explicitly include any resources that aren't Python source code (*.py) in your setuptools distribution.
There are several ways to do this. The one I'd recommend is to use a combination of include_package_data = True in your setup() function and a MANIFEST.in file.
So assuming your distribution is laid out as my.package/my/package (i.e., with no intermediate src or lib directory), you could use something along these lines:
setup.py
from setuptools import setup, find_packages
setup(
...
packages = find_packages('my'), # include all packages under my/
include_package_data = True, # include everything in source control
# or included in MANIFEST.in
)
MANIFEST.in
recursive-include my *
recursive-include docs *
global-exclude *.pyc
global-exclude ._*
global-exclude *.mo
This would recursively include any type of file below my.package/my/ as well as my.package/docs/, and globally exclude some other types of files unwanted in a released distribution.
Please refer to Building and Distributing Packages with Setuptools » Including Data Files for more details on the available methods to include data files, and The MANIFEST.in template for more information about how to define your MANIFEST.
Once you've successfully included your data files in your distribution, you should make sure to use the ResourceManager API to access them from your code (as opposed to __file__ trickery or other path hacks, which won't work for certain platforms or zipped eggs).
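A minimal sketch of that ResourceManager-based access (the package and file names mirror the question's error message; the helper function is mine):

```python
import pkg_resources  # part of setuptools

def load_wsdl(package="my.package", resource="metadata_wsdl.xml"):
    # resource_string returns the file's bytes; it resolves resources
    # through the package loader, so it also works from zipped eggs
    # where __file__-based path construction would fail.
    return pkg_resources.resource_string(package, resource)
```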
I have some Python package and some tests. The files are laid out following http://pytest.org/latest/goodpractices.html#choosing-a-test-layout-import-rules
Putting tests into an extra directory outside your actual application
code, useful if you have many functional tests or for other reasons
want to keep tests separate from actual application code (often a good
idea):
setup.py # your distutils/setuptools Python package metadata
mypkg/
__init__.py
appmodule.py
tests/
test_app.py
My problem is, when I run the tests with py.test, I get an error
ImportError: No module named 'mypkg'
I can solve this by installing the package python setup.py install but this means the tests run against the installed package, not the local one, which makes development very tedious. Whenever I make a change and want to run the tests, I need to reinstall, else I am testing the old code.
What can I do?
I know this question has already been closed, but a simple way I often use is to call pytest via python -m, from the root (the parent of the package).
$ python -m pytest tests
This works because the -m option adds the current directory to the Python path, so mypkg is detected as the local package (not the installed one).
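That behaviour can be checked directly (a self-contained sketch using a throwaway module in place of pytest):

```python
import pathlib
import subprocess
import sys
import tempfile

# A tiny module that reports the first sys.path entry when run via -m.
d = pathlib.Path(tempfile.mkdtemp())
(d / "showpath.py").write_text("import sys; print(sys.path[0])\n")

out = subprocess.run(
    [sys.executable, "-m", "showpath"],
    cwd=d, capture_output=True, text=True,
)
print(out.stdout.strip())  # the current working directory, prepended by -m
```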
See:
https://docs.pytest.org/en/latest/usage.html#calling-pytest-through-python-m-pytest
The normal approach for development is to use a virtualenv and run pip install -e . inside it (this is almost equivalent to python setup.py develop). Now your source directory is used as the installed package on sys.path.
There are of course a bunch of other ways to get your package on sys.path for testing, see Ensuring py.test includes the application directory in sys.path for a question with a more complete answer for this exact same problem.
On my side, while developing, I prefer to run tests from the IDE (using a runner extension) rather than using the command line. However, before pushing my code or prior to a release, I like to use the command line.
Here is a way to deal with this issue, allowing you to run tests from both the test runner used by your IDE and the command line.
My setup:
IDE: Visual Studio Code
Testing: pytest
Extension (test runner): https://marketplace.visualstudio.com/items?itemName=LittleFoxTeam.vscode-python-test-adapter
Work directory structure (my solution should be easily adaptable to your context):
project_folder/
src/
mypkg/
__init__.py
appmodule.py
tests/
mypkg/
appmodule_test.py
pytest.ini <- Used so pytest can locate pkgs from ./src
.env <- Used so VS Code and its extension can locate pkgs from ./src
.env:
PYTHONPATH="${PYTHONPATH};./src;"
pytest.ini (tried with pytest 7.1.2):
[pytest]
pythonpath = . src
./src/mypkg/appmodule.py:
def i_hate_configuring_python():
return "Finally..."
./tests/mypkg/appmodule_test.py:
from mypkg import appmodule
def test_demo():
print(appmodule.i_hate_configuring_python())
This should do the trick
Import the package using from .. import mypkg. For this to work you will need to add (empty) __init__.py files to the tests directory and the containing directory. py.test should take care of the rest.