Rationale
With a complex dependency matrix, the list of tox testenv names ends up looking like
py37-pytest5-framework1
py37-pytest5-framework2
py37-pytest6-framework1
py37-pytest6-framework2
py38-pytest5-framework1
py38-pytest5-framework2
py38-pytest6-framework1
py38-pytest6-framework2
...
py310-pytest6-framework2
While the tox.ini syntax allows configuring a lot of things by name fragment, e.g.
[testenv]
basepython =
    py37: python3.7
    py38: python3.8
    py39: python3.9
    py310: python3.10
deps =
    pytest5: pytest ~= 5.0
    pytest6: pytest ~= 6.0
    framework1: framework ~= 1.0
    framework2: framework ~= 2.0
setenv =
    framework2: FOO=bar
I can find no way to tell the tox CLI to run all testenvs matching a name fragment, like tox -e py39 or tox -e framework2.
Issues
The main drawback is that CI testing jobs usually end up being segregated by Python version, so you end up writing instructions like
tox -e $PY-pytest5-framework1,$PY-pytest5-framework2,$PY-pytest6-framework1,$PY-pytest6-framework2
but then the CI job definitions are coupled to the tox test matrix, because they must be aware of:
testenvs being added or removed
matrix exclusions, e.g. pytest-5 not being compatible with python-3.10
This is cumbersome to maintain.
Incomplete workaround
An easy workaround is simply running tox --skip-missing-interpreters, but it has drawbacks:
CI jobs can't be segregated by framework version instead of Python version, for example to reuse some special framework cache
CI VMs could feature system Python installations beyond the one targeted by each job, so you could end up with e.g. python-3.8 being run in all CI jobs.
Question
Am I missing some out-of-the-box mechanism to filter the testenvs to run by name fragment, one that lets me write CI jobs agnostic to the tox dependency matrix? I mean something like tox -e '*-framework2'.
Am I bound to filter and aggregate the output of tox --listenvs with shell tricks?
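You could pass a negated regex pattern through TOX_SKIP_ENV. Note that a character class such as [^-framework2] only excludes single characters rather than the whole suffix, so a negative lookahead is the more robust way to say "skip everything that does not end in -framework2":
$ env TOX_SKIP_ENV='^(?!.*-framework2$)' tox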
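A pattern like this can be sanity-checked offline before handing it to tox. The sketch below assumes that tox 3 applies the expression to each env name with re.match (worth verifying against your tox version); envs_kept is a hypothetical helper, and the negative lookahead is just one way to express "skip every name that does not end in -framework2".

```python
import re


def envs_kept(env_names, skip_pattern):
    """Return the env names that survive a TOX_SKIP_ENV-style filter.

    Assumption: a name is skipped when the pattern matches at its start,
    which mirrors re.match semantics.
    """
    skip = re.compile(skip_pattern)
    return [name for name in env_names if not skip.match(name)]


ENVS = [
    "py38-pytest5-framework1",
    "py38-pytest5-framework2",
    "py38-pytest6-framework1",
    "py38-pytest6-framework2",
]

# Matches (and therefore skips) any name that does NOT end in "-framework2".
SKIP_ALL_BUT_FRAMEWORK2 = r"^(?!.*-framework2$)"

print(envs_kept(ENVS, SKIP_ALL_BUT_FRAMEWORK2))
# → ['py38-pytest5-framework2', 'py38-pytest6-framework2']
```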
tox 4, which is due to be released within the next couple of months, introduces labels. While this may not be an immediate help for your problem, you may see a way to use them to simplify your tox.ini.
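Until something built in is available, the "filter the output of tox --listenvs" route mentioned in the question does not have to be a shell trick. Here is a minimal Python sketch of it; select_envs and list_tox_envs are hypothetical helper names, and it assumes tox --listenvs prints one env name per line.

```python
import fnmatch
import subprocess


def select_envs(env_names, pattern):
    """Keep the env names matching a shell-style glob such as '*-framework2'."""
    return [name for name in env_names if fnmatch.fnmatchcase(name, pattern)]


def list_tox_envs():
    """Ask tox for its env list (assumes one name per output line)."""
    result = subprocess.run(
        ["tox", "--listenvs"], check=True, capture_output=True, text=True
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


# Example with a literal list, so the sketch runs without tox installed:
sample = ["py39-pytest6-framework1", "py39-pytest6-framework2", "py310-pytest6-framework2"]
print(",".join(select_envs(sample, "py39-*")))
# → py39-pytest6-framework1,py39-pytest6-framework2
```

In a CI job the comma-joined result of select_envs(list_tox_envs(), pattern) could then be passed to tox -e, which keeps the job definition independent of the matrix.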
I am trying to create a python package (deb & rpm) from cmake, ideally using cpack. I did read
https://cmake.org/cmake/help/latest/cpack_gen/rpm.html and,
https://cmake.org/cmake/help/latest/cpack_gen/deb.html
The installation works just fine (using component install) for my shared library. However I cannot make sense of the documentation to install the python binding (glue) code. Using the standard cmake install mechanism, I tried:
install(
  FILES __init__.py library.py
  DESTINATION ${ACME_PYTHON_PACKAGE_DIR}/project_name
  COMPONENT python)
And then, using a brute-force approach, I ended up with:
# debian based package (relative path)
set(ACME_PYTHON_PACKAGE_DIR lib/python3/dist-packages)
and
# rpm based package (full path required)
set(ACME_PYTHON_PACKAGE_DIR /var/lang/lib/python3.8/site-packages)
The above is derived from:
debian % python -c 'import site; print(site.getsitepackages())'
['/usr/local/lib/python3.9/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.9/dist-packages']
while:
rpm % python -c 'import site; print(site.getsitepackages())'
['/var/lang/lib/python3.8/site-packages']
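Instead of copying one of these paths by hand, the interpreter can be asked where pure-Python packages go and how that location relates to its prefix. This is only a sketch built on the standard sysconfig module: the helper names are hypothetical, it has to be run by the interpreter the package targets, and distributions that patch their install schemes may report something different from what their packaging policy expects.

```python
import os
import sys
import sysconfig


def purelib_dir():
    """Absolute directory where this interpreter installs pure-Python packages."""
    return sysconfig.get_path("purelib")


def purelib_relative_to_prefix():
    """The same directory expressed relative to sys.prefix (no leading slash)."""
    return os.path.relpath(purelib_dir(), sys.prefix)


print(purelib_dir())
print(purelib_relative_to_prefix())
```

The relative form is the kind of value that could be fed to a relative install() DESTINATION, for example by capturing the script's output from CMake.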
It is pretty clear that the brute-force approach will not be portable, and is doomed to fail on the next release of python. The only possible solution that I can think of is generating a temporary setup.py python script (using setuptools), that will do the install. Typically cmake would call the following process:
% python setup.py install --root ${ACME_PYTHON_INSTALL_ROOT}
My questions are:
Did I understand the cmake/cpack documentation correctly for Python packages? If so, this means I need to generate an intermediate setup.py script.
I have been searching through the cmake/cpack codebase (git grep setuptools) but did not find helper functions to handle generation of setup.py and passing the resulting files back to cpack. Is there an existing cmake module which I could re-use?
I did read some alternative solutions, such as:
How to build debian package with CPack to execute setup.py?
which seems overly complex and geared toward Debian-based systems only. I need to handle RPM in my case.
As mentioned in my other solution, the ugly part is dealing with absolute paths in cmake install() commands. I was able to refactor the code to avoid absolute paths in install(). I simply changed the installation into:
install(
  # trailing slash is important:
  DIRECTORY ${SETUP_OUTPUT}/
  # "." syntax is a reliable mechanism, see:
  # https://gitlab.kitware.com/cmake/cmake/-/issues/22616
  DESTINATION "."
  COMPONENT python)
And then one simply needs to:
set(CMAKE_INSTALL_PREFIX "/")
set(CPACK_PACKAGING_INSTALL_PREFIX "/")
include(CPack)
At this point all install paths need to explicitly include /usr, since we've reset the value of CMAKE_INSTALL_PREFIX.
The above has been tested for deb and rpm packages. CPACK_BINARY_TGZ does properly run with the above solution:
https://gitlab.kitware.com/cmake/cmake/-/issues/22925
I am going to post the temporary solution I am using at the moment, until someone provides something more robust.
So I eventually managed to stumble upon:
https://alioth-lists.debian.net/pipermail/libkdtree-devel/2012-October/000366.html and,
Using CMake with setup.py
Re-using the above to do an install step instead of a build step can be done as follows:
find_package(Python COMPONENTS Interpreter)
set(SETUP_PY_IN "${CMAKE_CURRENT_SOURCE_DIR}/setup.py.in")
set(SETUP_PY "${CMAKE_CURRENT_BINARY_DIR}/setup.py")
set(SETUP_DEPS "${CMAKE_CURRENT_SOURCE_DIR}/project_name/__init__.py")
set(SETUP_OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/build-python")
configure_file(${SETUP_PY_IN} ${SETUP_PY})
add_custom_command(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/setup_timestamp
  COMMAND ${Python_EXECUTABLE} ARGS ${SETUP_PY} install --root ${SETUP_OUTPUT}
  COMMAND ${CMAKE_COMMAND} -E touch ${CMAKE_CURRENT_BINARY_DIR}/setup_timestamp
  DEPENDS ${SETUP_DEPS})
add_custom_target(target ALL DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/setup_timestamp)
And then the ugly part is:
install(
  # trailing slash is important:
  DIRECTORY ${SETUP_OUTPUT}/
  DESTINATION "/" # FIXME may cause issues with other cpack generators
  COMPONENT python)
Turns out that the documentation for install() is pretty clear about absolute paths:
https://cmake.org/cmake/help/latest/command/install.html#introduction
DESTINATION
[...]
As absolute paths are not supported by cpack installer generators,
it is preferable to use relative paths throughout.
For reference, here is my setup.py.in:
from setuptools import setup

if __name__ == '__main__':
    setup(name='project_name_python',
          version='${PROJECT_VERSION}',
          package_dir={'': '${CMAKE_CURRENT_SOURCE_DIR}'},
          packages=['project_name'])
You can be fancy and avoid generating the __pycache__ folder by using the -B flag:
COMMAND ${Python_EXECUTABLE} ARGS -B ${SETUP_PY} install --root ${SETUP_OUTPUT}
You can be extra fancy and add a Debian-specific option such as:
if(CPACK_BINARY_DEB)
  set(EXTRA_ARG "--install-layout" "deb")
endif()
and use it as:
COMMAND ${Python_EXECUTABLE} ARGS -B ${SETUP_PY} install --root ${SETUP_OUTPUT} ${EXTRA_ARG}
In Snakemake, conda environments can be easily set up by adding a directive such as conda: "envs/my_environment.yaml" to a rule. This way, YAML files specify which packages to install prior to running the pipeline.
Some software requires a path to third-party software in order to execute specific commands.
An example of this is when generating a reference index with RSEM (example from GitHub page DeweyLab - RSEM):
rsem-prepare-reference --gtf mm9.gtf \
--star \
--star-path /sw/STAR \
-p 8 \
--prep-pRSEM \
--bowtie-path /sw/bowtie \
--mappability-bigwig-file /data/mm9.bigWig \
/data/mm9 \
/ref/mouse_0
Can I locate or predefine the directory (e.g. [workdir]/.snakemake/conda/STAR) for the STAR aligner software, which is installed via conda in a prior rule?
Currently, one option may be to create a shared environment folder using the command-line option --conda-prefix (Snakemake docs - Command-line interface); however, as this is a single-case issue, I would prefer to define this information in the rules.
There are two ways that I've dealt with this.
1: Let Conda Handle PATH
That specific option (--star-path) only needs to be specified if STAR is not on PATH. However, if STAR is included in your YAML for this rule, then Conda will place it on PATH as part of the environment activation, and so that option won't be needed. Same goes for --bowtie-path. Hence, for such a rule the YAML might be something like:
name: rsem
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- rsem
- star
- bowtie
As per this thread, consider fixing the versions on the packages up to a minor version (e.g., bowtie=1.3).
2: Use config.yaml for Pipeline Options
If for some reason you don't want a fully self-contained pipeline, e.g., your system already has lots of standard genomics software like STAR preinstalled, then you could include entries in your config.yaml where users should adjust the pipeline to their system. For example, here are the relevant parts:
config.yaml
star_path: /sw/STAR
bowtie_path: /sw/bowtie
Snakefile
configfile: "config.yaml"
## this is not a complete rule
rule rsem_prep_ref:
    # needs input, output...
    params:
        star=config['star_path'],
        bowtie=config['bowtie_path']
    threads: 8
    conda: "envs/myenv.yaml"
    shell:
        """
        rsem-prepare-reference --gtf mm9.gtf \
        --star \
        --star-path {params.star} \
        -p {threads} \
        --prep-pRSEM \
        --bowtie-path {params.bowtie} \
        --mappability-bigwig-file /data/mm9.bigWig \
        /data/mm9 \
        /ref/mouse_0
        """
Really, anything your pipeline assumes already exists and is not generated by the pipeline itself should go into your config.yaml (e.g., mm9.gtf or mm9.bigWig).
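Since a Snakefile is Python, externally provided paths like these can be checked up front so that a misconfigured system fails early with a readable message. A minimal sketch, assuming the config keys shown above; missing_paths is a hypothetical helper and not part of Snakemake.

```python
import os


def missing_paths(config, keys):
    """Return the keys that are unset or whose configured path does not exist."""
    return [
        key for key in keys
        if not config.get(key) or not os.path.exists(config[key])
    ]


# Hypothetical values mirroring the config.yaml entries above.
config = {"star_path": "/sw/STAR", "bowtie_path": "/sw/bowtie"}

problems = missing_paths(config, ["star_path", "bowtie_path"])
if problems:
    print("config.yaml entries that do not point to an existing path:", ", ".join(problems))
```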
Note on Sharing Environments
Generally, I advise against trying to share environments. However, you can still conserve space by sharing a package cache across users and making sure environments are created on the same filesystem (this lets Conda use hardlinks instead of copying). You can use the Conda configuration option pkgs_dirs to set package cache locations. If the pipeline itself is already on the same file system as the Conda package cache, I would just let Snakemake use the default location (.snakemake/conda) and not mess with the --conda-prefix argument.
Otherwise, you can give Snakemake the --conda-prefix argument to point to a directory on the same file system in which to create Conda environments. This should be a rather generic directory in which all environments for the pipeline get located. What was proposed in OP ([workdir]/.snakemake/conda/STAR) would not make sense.
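I would like to add a third option to @merv's answer. You could use which to figure out the path dynamically (assuming which is available on your system). Note that the executable is named STAR, and --star-path expects the directory containing it, hence the dirname:
rsem-prepare-reference --star-path "$(dirname "$(which STAR)")" ...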
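The same lookup can be done from Python with shutil.which from the standard library. This is a sketch with a hypothetical helper name, tool_dir. One caveat, stated as an assumption: code evaluated in the Snakefile itself sees the PATH of the process that launched Snakemake, which is not necessarily the PATH of a rule's activated conda environment, so the shell-level lookup inside the rule may be the safer choice there.

```python
import os
import shutil


def tool_dir(executable, search_path=None):
    """Return the directory containing `executable`, looked up on PATH.

    `search_path` optionally overrides PATH (same format, os.pathsep-separated).
    """
    located = shutil.which(executable, path=search_path)
    if located is None:
        raise FileNotFoundError(f"{executable} not found on the search path")
    return os.path.dirname(os.path.abspath(located))


# Usage (raises FileNotFoundError when STAR is not installed):
#   star_dir = tool_dir("STAR")
```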
We have a flake8 build stage in our circle-ci workflow, and more often than not this step fails due to timeout:
Too long with no output (exceeded 10m0s): context deadline exceeded
At the same time, this same stage runs quite ok locally on our macbooks:
% time make lint
poetry run black .
All done! ✨ 🍰 ✨
226 files left unchanged.
isort -y
Skipped 2 files
PYTHONPATH=/path/to/project poetry run flake8 --show-source
0
make lint 44.00s user 4.90s system 102% cpu 47.810 total
We tried to debug the issue by adding the -vv flag to flake8 thinking we would get some plugin name that takes too long, but we don't even have the timestamps in the log:
flake8.processor ForkPoolWorker-31 1004 WARNING Plugin requested optional parameter "visitor" but this is not an available parameter.
flake8.processor ForkPoolWorker-8 1080 WARNING Plugin requested optional parameter "visitor" but this is not an available parameter.
flake8.bugbear ForkPoolWorker-26 1082 INFO Optional warning B950 not present in selected warnings: ['E', 'F', 'W', 'C90']. Not firing it at all.
Are there any known reasons why flake8 would freeze on CircleCI? How can one debug the issue?
When using a virtual environment (e.g. one created with venv), you should exclude its folder in the [flake8] config (that's what happened to me). Assuming you are creating a virtualenv with virtualenv .venv, it would look like this:
[flake8]
exclude = .venv
The same was true for my coverage run, which was fixed by adding an omit entry to its config (solution found here):
# pyproject.toml file content
[tool.coverage.run]
omit = [
    "tests/*",
    ".venv/*",
]
For now, the solution we seem to have found was to limit the number of cores running flake8:
.flake8
[flake8]
...
jobs = 6
Not sure it is the correct solution, but there you go. I will accept a better solution if there is one.
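One plausible explanation, offered as an assumption rather than something confirmed in this thread, is that flake8's default jobs = auto sizes its worker pool from the CPU count Python reports, and inside a CI container that number may reflect the host rather than the share the job actually gets. The ForkPoolWorker-31 entries in the log above are at least consistent with a large pool. A quick way to see what the CI container reports (pick_jobs is a hypothetical helper):

```python
import multiprocessing


def pick_jobs(limit):
    """Cap the worker count at `limit`, never exceeding the reported CPU count."""
    return max(1, min(multiprocessing.cpu_count(), limit))


print("CPUs reported to Python:", multiprocessing.cpu_count())
print("jobs value with a cap of 6:", pick_jobs(6))
```

Running this as an extra CI step and comparing the first number with the resources of the CI plan would show whether the pool is oversubscribed.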
I've also experienced a timeout on CircleCI only, but it was due to the specific way dependencies are installed in the pipeline, which creates a .venv folder that was not excluded in the flake8 configuration.
The -v option helped me notice the huge number of files flake8 was analyzing.
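A rough way to confirm this without reading verbose logs is to count the Python files under the project root with and without the virtualenv folder. This is a sketch only: count_python_files is a hypothetical helper, and it approximates flake8's file discovery rather than reproducing it.

```python
import os


def count_python_files(root, excluded_dirs=()):
    """Count *.py files under `root`, pruning any directory named in `excluded_dirs`."""
    excluded = set(excluded_dirs)
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        total += sum(1 for name in filenames if name.endswith(".py"))
    return total


print("all .py files:        ", count_python_files("."))
print("excluding .venv:      ", count_python_files(".", excluded_dirs=[".venv"]))
```

A large gap between the two numbers on the CI machine, and not locally, would point at an unexcluded environment folder.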
In pytest-cov documentation it says:
Note that this plugin controls some options and setting the option in
the config file will have no effect. These include specifying source
to be measured (source option) and all data file handling (data_file
and parallel options).
However, it doesn't say how to change these options. Is there a way to change them (e.g. parallel=True)?
I want to change this because after upgrading coverage from < 5 to the latest version (5.1) I get this:
Failed to generate report: Couldn't use data file '/path/to/jenkins/workspace/pr/or/branch/.coverage': no such table: line_bits
Note: coverage < 5 does not have this problem.
I have also tried adding .coveragerc with the following but still get the same issue.
[run]
parallel = True
The way it is run in jenkins:
pytest ./tests --mpl -n 4 \
--junitxml=pyTests.xml --log-cli-level=DEBUG -s \
--cov=. --cov-report --cov-report html:coverage-reports
This is due to pytest-cov using coverage combine, which combines all coverage results: when several runs happen in parallel, it mixes in results from other runs, which may or may not be complete and are in any case irrelevant.
I think if you're having the issue, it may be because you're running multiple test runs in parallel, for example for multiple versions of Python.
In that case it's easily solved by specifying a unique COVERAGE_FILE for each run, like:
export COVERAGE_FILE=.coverage.3.7
for the Python 3.7 run, and so on.
See: https://github.com/nedbat/coveragepy/issues/883#issuecomment-650562896
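If the runs are launched from a Python wrapper rather than a shell, the per-version file name can be derived from the running interpreter instead of being hard-coded. A minimal sketch; coverage_file_for is a hypothetical helper, while COVERAGE_FILE is the environment variable read by coverage.py.

```python
import os
import sys


def coverage_file_for(version_info=None, prefix=".coverage"):
    """Build a data-file name such as '.coverage.3.7' for a given Python version."""
    info = sys.version_info if version_info is None else version_info
    return f"{prefix}.{info[0]}.{info[1]}"


# Only affects this process and the children it spawns (e.g. a pytest subprocess).
os.environ["COVERAGE_FILE"] = coverage_file_for()
print(os.environ["COVERAGE_FILE"])
```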
How do you test different Python versions with Tox from within Travis-CI?
I have a tox.ini:
[tox]
envlist = py{27,33,34,35}
recreate = True

[testenv]
basepython =
    py27: python2.7
    py33: python3.3
    py34: python3.4
    py35: python3.5
deps =
    -r{toxinidir}/pip-requirements.txt
    -r{toxinidir}/pip-requirements-test.txt
commands = py.test
which runs my Python unittests in several Python versions and works perfectly.
I want to set up a build in Travis-CI to automatically run this when I push changes to GitHub, so I have a .travis.yml:
language: python
python:
- "2.7"
- "3.3"
- "3.4"
- "3.5"
install:
- pip install tox
script:
- tox
This technically seems to work, but it redundantly runs all my tests in each version of Python...from each version of Python. So a build that takes 5 minutes now takes 45 minutes.
I tried removing the python list from my yaml file, so Travis will only run a single Python instance, but that causes my Python3.5 tests to fail because the 3.5 interpreter can't be found. Apparently, that's a known limitation as Travis-CI won't install Python3.5 unless you specify that exact version in your config...but it doesn't do that for the other versions.
Is there a way I can work around this?
For this I would consider using tox-travis. This is a plugin which allows use of Travis CI’s multiple python versions and Tox’s full configurability.
To do this you will configure the .travis.yml file to test with Python:
sudo: false
language: python
python:
- "2.7"
- "3.4"
install: pip install tox-travis
script: tox
This will run the appropriate testenvs, which are any declared env with py27 or py34 as factors of the name by default. Py27 or py34 will be used as fallback if no environments match the given factor.
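To make "factor of the name" concrete: a tox env name is a dash-separated list of factors, and an env is selected when the interpreter's factor is one of them. The snippet below only illustrates that matching rule and is not tox-travis code; the env names and the helper envs_with_factor are hypothetical.

```python
def envs_with_factor(env_names, factor):
    """Keep the env names that contain `factor` as a whole dash-separated part."""
    return [name for name in env_names if factor in name.split("-")]


# Hypothetical envlist for illustration.
envlist = ["py27", "py27-django19", "py34", "py34-django19", "docs"]

print(envs_with_factor(envlist, "py27"))
# → ['py27', 'py27-django19']
```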
For more control and flexibility you can manually define your matrix so that the Python version and tox environment match up:
language: python
matrix:
  include:
    - python: 2.7
      env: TOXENV=py27
    - python: 3.3
      env: TOXENV=py33
    - python: 3.4
      env: TOXENV=py34
    - python: 3.5
      env: TOXENV=py35
    - python: pypy
      env: TOXENV=pypy
    - env: TOXENV=flake8
install:
  - pip install tox
script:
  - tox
In case it's not obvious, each entry in the matrix starts on a line which begins with a hyphen (-). Any items following that line which are indented are additional lines for that single item.
For example, all entries except the last are two lines. The last entry is only one line and does not contain a python setting; therefore, it simply uses the default Python version (Python 2.7 according to the Travis documentation). Of course, a specific Python version is not as important for that test. If you wanted to run such a test against both Python 2 and 3 (once each), then it is recommended to use the versions Travis installs by default (2.7 and 3.4) so that the tests complete more quickly, as they don't need to install a non-standard Python version first. For example:
    - python: 2.7
      env: TOXENV=flake8
    - python: 3.4
      env: TOXENV=flake8
The same works with pypy (second to last entry on matrix) and pypy3 (not shown) in addition to Python versions 2.5-3.6.
While the various other answers provide shortcuts which give you this result in the end, sometimes it's helpful to define the matrix manually. Then you can define specific things for individual environments within the matrix. For example, you can define dependencies for only a single environment and avoid the wasted time installing that dependency in every environment.
    - python: 3.5
      env: TOXENV=py35
    - env: TOXENV=checkspelling
      before_install: install_spellchecker.sh
    - env: TOXENV=flake8
In the above matrix, the install_spellchecker.sh script is only run for the relevant environment, but not the others. The before_install setting was used (rather than install), as using the install setting would have overridden the global install setting. However, if that's what you want (to override/replace a global setting), simply redefine it in the matrix entry. No doubt, various other settings could be defined for individual environments within the matrix as well.
Manually defining the matrix can provide a lot of flexibility. However, if you don't need the added flexibility, one of the various shortcuts in the other answers will keep your config file simpler and easier to read and edit later on.
Travis provides the python version for each test as TRAVIS_PYTHON_VERSION, but in the form '3.4', while tox expects 'py34'.
If you don't want to rely on an external lib (tox-travis) to do the translation, you can do that manually:
language: python
python:
- "2.7"
- "3.3"
- "3.4"
- "3.5"
install:
- pip install tox
script:
- tox -e $(echo py$TRAVIS_PYTHON_VERSION | tr -d .)
Search this pattern in a search engine and you'll find many projects using it.
This works for pypy as well:
tox -e $(echo py$TRAVIS_PYTHON_VERSION | tr -d . | sed -e 's/pypypy/pypy/')
Source: flask-mongoengine's .travis.yml.
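For readers less fluent in tr and sed, here is the same translation written out in Python. It is a sketch that mirrors the shell pipeline above step by step; travis_to_toxenv is a hypothetical name.

```python
def travis_to_toxenv(travis_python_version):
    """Translate a TRAVIS_PYTHON_VERSION value such as '3.4' into 'py34'."""
    # echo py$TRAVIS_PYTHON_VERSION | tr -d .
    name = ("py" + travis_python_version).replace(".", "")
    # sed -e 's/pypypy/pypy/' (first occurrence only)
    return name.replace("pypypy", "pypy", 1)


for version in ["2.7", "3.4", "pypy", "pypy3"]:
    print(version, "->", travis_to_toxenv(version))
```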
The TOXENV environment variable can be used to select a subset of tests for each version of Python via an explicit matrix:
language: python
python:
  - "2.7"
  - "3.4"
  - "3.5"
env:
  matrix:
    - TOXENV=py27-django-19
    - TOXENV=py27-django-110
    - TOXENV=py27-django-111
    - TOXENV=py34-django-19
    - TOXENV=py34-django-110
    - TOXENV=py34-django-111
    - TOXENV=py35-django-19
    - TOXENV=py35-django-110
    - TOXENV=py35-django-111
install:
  - pip install tox
script:
  - tox -e $TOXENV
In the tox config, specify that missing versions of Python should be skipped:
[tox]
skip_missing_interpreters=true
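The nine TOXENV entries in the matrix above are the cross product of three Python factors and three Django factors, so they can be generated rather than typed by hand. A small sketch, with toxenv_matrix as a hypothetical helper name:

```python
from itertools import product


def toxenv_matrix(python_factors, django_factors):
    """Build env names like 'py27-django-19' for every combination."""
    return [f"{py}-django-{dj}" for py, dj in product(python_factors, django_factors)]


entries = toxenv_matrix(["py27", "py34", "py35"], ["19", "110", "111"])
for entry in entries:
    print(f"- TOXENV={entry}")
```

The printed lines can be pasted under env.matrix in .travis.yml when a factor is added or removed.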