Pip pyproject.toml: Can optional dependency groups require other optional dependency groups?

I am using the latest version of pip (23.0.1). I have a pyproject.toml file with dependencies and optional dependency groups (aka "extras"). To avoid redundancy and make the optional dependency groups easier to manage, I would like to know how to have optional dependency groups require other optional dependency groups.
I have a pyproject.toml where the optional dependency groups have redundant overlaps in dependencies. I guess they could be described as "hierarchical". It looks like this:
[project]
name = 'my-package'
dependencies = [
    'pandas',
    'numpy>=1.22.0',
    # ...
]

[project.optional-dependencies]
# development dependency groups
test = [
    'pytest>=4.6',
    'pytest-cov',
    # ...
    # Redundant overlap with the chem and torch groups' dependencies
    'rdkit',
    # ...
    'torch>=1.9',
    # ...
]
# feature dependency groups
chem = [
    'rdkit',
    # ...
    # Redundant overlap with the torch group's dependencies
    'torch>=1.9',
    # ...
]
torch = [
    'torch>=1.9',
    # ...
]
In the above example, pip install .[test] will include all of the chem and torch groups' packages, and pip install .[chem] will include the torch group's packages.
If I remove the overlaps and the references from one group to another, a user can still get the packages required for chem by doing pip install .[chem,torch], but I work with data scientists who may not immediately realize that the torch group is a requirement for the chem group, and so on.
Therefore, I want a file that's something like this:
[project]
name = 'my-package'
dependencies = [
    'pandas',
    'numpy>=1.22.0',
    # ...
]

[project.optional-dependencies]
# development dependency groups
test = [
    'my-package[chem]',
    'pytest>=4.6',
    'pytest-cov',
    # ...
]
# feature dependency groups
chem = [
    'my-package[torch]',
    'rdkit',
    # ...
]
torch = [
    'torch>=1.9',
    # ...
]
This approach can't work as-is because my-package is hosted in our private pip repository, so having 'my-package[chem]' as in the example above fetches the previously built and published version's chem group packages rather than the local source's.
It appears that using Poetry and its pyproject.toml format/features can make this possible, but I would prefer not to switch our build system around too much. Is this possible with pip?

I think PDM has it solved:
It may be that PDM isn't doing anything special and the same approach works with plain pip; the trick is to make a third extra that depends on the other two. Example from the docs:
[project]
name = "foo"
version = "0.1.0"

[project.optional-dependencies]
socks = ["pysocks"]
jwt = ["pyjwt"]
all = ["foo[socks,jwt]"]
reference: PDM - Manage Dependencies
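Assuming that layout, installing the aggregate extra should pull in both groups (foo being the hypothetical package name from the PDM docs):
pip install 'foo[all]'
which resolves the self-reference to pysocks and pyjwt.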

Related

How can I keep poetry and commitizen versions synced?

I have a pyproject.toml with
[tool.poetry]
name = "my-project"
version = "0.1.0"
[tool.commitizen]
name = "cz_conventional_commits"
version = "0.1.0"
I add a new feature and commit with commit message
feat: add parameter for new feature
That's one commit.
Then I call
commitizen bump
Commitizen will recognize a minor version increase, update my pyproject.toml, and commit again with the updated pyproject.toml and a tag 0.2.0.
That's a second commit.
But now my pyproject.toml is "out of whack" (assuming I want my build version in sync with my git tags).
[tool.poetry]
name = "my-project"
version = "0.1.0"
[tool.commitizen]
name = "cz_conventional_commits"
version = "0.2.0"
I'm two commits in, one tagged, and things still aren't quite right. Is there a workflow to keep everything aligned?
Refer to the commitizen docs on support-for-pep621 and version_files.
You can add "pyproject.toml:^version" to the version_files setting in pyproject.toml:
[tool.commitizen]
version_files = [
    "pyproject.toml:^version"
]
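Putting it together, a minimal sketch (assuming commitizen applies the ^version pattern to every matching line in the file, so a single commitizen bump rewrites both version fields in one commit):
[tool.poetry]
name = "my-project"
version = "0.1.0"

[tool.commitizen]
name = "cz_conventional_commits"
version = "0.1.0"
version_files = [
    "pyproject.toml:^version"
]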

Concatenate 2 arrays in pyproject.toml

I'm giving the pyproject.toml file a shot, and I'm stuck on this simple task. Consider the following optional dependencies:
[project.optional-dependencies]
style = ["black", "codespell", "isort", "flake8"]
test = ["pytest", "pytest-cov"]
all = ["black", "codespell", "isort", "flake8", "pytest", "pytest-cov"]
Is there a way to avoid copy/pasting all the optional dependencies into the all key? Is there at least a way to do all = style + test?
There is no such feature directly in the toml markup.
However, there is a tricky way to do this in Python packaging by depending on yourself:
[project.optional-dependencies]
style = ["black", "codespell", "isort", "flake8"]
test = ["pytest", "pytest-cov"]
all = ["myproject[style]", "myproject[test]"]
Source:
Circular dependency is a feature that Python packaging is explicitly designed to allow, so it works and should continue to work.
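As a usage sketch (assuming the [project] name really is myproject, since the self-reference must match it exactly):
pip install 'myproject[all]'    # from an index
pip install '.[all]'            # from a local source checkout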

Can not add Seaborn dependency using Poetry on Python

I'm trying to add the Seaborn dependency to my module using Poetry.
I've tried it in different ways, but always without success; maybe I'm doing it wrong.
Here's my current toml config file:
[tool.poetry]
name = "seaborn"
version = "0.1.0"
description = ""
authors = ["me"]

[tool.poetry.dependencies]
python = "3.9.6"
pandas = "^1.4.1"
jupyter = "^1.0.0"
scipy = "1.7.0"
numpy = "^1.22.3"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
I've tried on the CLI:
poetry add seaborn
But no success. Here's the output:
poetry add seaborn
Using version ^0.11.2 for seaborn
Updating dependencies
Resolving dependencies... (0.0s)
AssertionError
at ~/.pyenv/versions/3.10.0/lib/python3.10/site-packages/poetry/mixology/incompatibility.py:111 in __str__
107│ )
108│
109│ def __str__(self):
110│ if isinstance(self._cause, DependencyCause):
→ 111│ assert len(self._terms) == 2
112│
113│ depender = self._terms[0]
114│ dependee = self._terms[1]
115│ assert depender.is_positive()
If I try to add it to the toml config file by hand, like seaborn = "^0.0.1", the output is very similar:
poetry update
Updating dependencies
Resolving dependencies... (0.0s)
AssertionError
at ~/.pyenv/versions/3.10.0/lib/python3.10/site-packages/poetry/mixology/incompatibility.py:111 in __str__
107│ )
108│
109│ def __str__(self):
110│ if isinstance(self._cause, DependencyCause):
→ 111│ assert len(self._terms) == 2
112│
113│ depender = self._terms[0]
114│ dependee = self._terms[1]
115│ assert depender.is_positive()
Can anyone help me?
Thank you so much!
After a few hours of dropping modules, restarting PyCharm, and invalidating caches... my project is up to date without any issue!
For future note:
Do not name your modules/scripts after an already existing package (e.g. scipy, seaborn, and so on).
I cannot comment yet, so I need to supply a new answer.
This issue has broad applicability beyond just the Seaborn module and should be renamed something like "Cannot add package using Poetry, AssertionError incompatibility.py:111".
The existing answer by #diguex, which I up-voted, is exactly the fix needed; it helped me with the same problem when attempting to import 'flask-restx' into a demo project named 'flask-restx'.
Long story short, Poetry cannot import a dependency into itself, and naming the module with an already existing package name confuses Poetry into thinking it is doing just that. For more discussion, see: https://github.com/python-poetry/poetry/issues/3491
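In other words, the fix for the pyproject.toml above is simply a project name that doesn't shadow the dependency; a minimal sketch, where my-seaborn-demo is a hypothetical name (anything but "seaborn" works):
[tool.poetry]
name = "my-seaborn-demo"  # hypothetical; must not collide with a dependency's name
version = "0.1.0"

[tool.poetry.dependencies]
python = "3.9.6"
seaborn = "^0.11.2"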

Ensuring minimum version for imported Python package

Most Python packages follow the convention that the version is provided as a string at [package_name].version.version. Let's use Numpy as an example. Say I wanted to import Numpy but ensure that the minimum version is 1.18.1. This is what I do currently:
import numpy as np
if tuple(map(int, np.version.version.split('.'))) < (1, 18, 1):
    raise ImportError('Numpy version too low! Must be >= 1.18.1')
While this seems to work, it requires me to import the package before the version can be checked. It would be nice to not have to import the package if the condition is not satisfied.
It also seems a bit "hacky" and it feels like there's probably a method using the Python standard library that does this. Something like version('numpy') > '1.18.1'. But I haven't been able to find one.
Is there a way to check the version of a package BEFORE importing it within the bounds of the Python standard library?
I am looking for a programmatic solution in Python code. Telling me to use a requirements.txt or pip install is not answering the question.
Edit to add context: Adding this package to my requirements.txt is not useful as the imported package is supposed to be an optional dependency. This code would go in a submodule that is optionally loaded in the __init__.py via a try statement. Essentially, some functionality of the package is only available if a package of minimum version is found and successfully imported.
Run pip show for a specific package using subprocess, then parse the result to compare the installed version to your requirement(s).
>>> import subprocess
>>> result = subprocess.run(['pip', 'show', 'numpy'], stdout=subprocess.PIPE)
>>> result.stdout
b'Name: numpy\r\nVersion: 1.17.4\r\nSummary: NumPy is the fundamental package for array computing with Python.\r\nHome-page: https://www.numpy.org\r\nAuthor: Travis E. Oliphant et al.\r\nAuthor-email: None\r\nLicense: BSD\r\nLocation: c:\\python38\\lib\\site-packages\r\nRequires: \r\nRequired-by: scipy, scikit-learn, perfplot, pandas, opencv-python, matplotlib\r\n'
>>> result = subprocess.run(['pip', 'show', 'pandas'], stdout=subprocess.PIPE)
>>> for thing in result.stdout.splitlines():
...     print(thing)
b'Name: pandas'
b'Version: 0.25.3'
b'Summary: Powerful data structures for data analysis, time series, and statistics'
b'Home-page: http://pandas.pydata.org'
b'Author: None'
b'Author-email: None'
b'License: BSD'
b'Location: c:\\python38\\lib\\site-packages'
b'Requires: numpy, python-dateutil, pytz'
b'Required-by: '
>>>
>>> from email.header import Header
>>> result = subprocess.run(['pip', 'show', 'pandas'], stdout=subprocess.PIPE)
>>> h = Header(result.stdout)
>>> print(str(h))
Name: pandas
Version: 0.25.3
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: c:\python38\lib\site-packages
Requires: python-dateutil, pytz, numpy
Required-by:
>>> d = {}
>>> for line in result.stdout.decode().splitlines():
...     k, v = line.split(':', 1)
...     d[k] = v
>>> d['Version']
' 0.25.3'
>>>
Or look at everything:
>>> result = subprocess.run(['pip', 'list'], stdout=subprocess.PIPE)
>>> for thing in result.stdout.splitlines():
...     print(thing)
b'Package Version '
b'---------------- ----------'
b'Pillow 6.2.1 '
b'aiohttp 3.6.2 '
b'appdirs 1.4.3 '
...
Use containers to control all the dependencies and runtime environment of your program. An easy way to do it would be to create a Docker image that holds the exact version of python that you require. Then use a requirements.txt to install the correct python modules you need with the exact versions.
Lastly, you can create a shell script or something similar to actually spin-up the docker container with one click.
Alternatively (if Docker seems overkill), check out venv
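For what it's worth, since Python 3.8 the standard library can perform this check via importlib.metadata, without importing the package or shelling out to pip; a minimal sketch:
from importlib.metadata import version, PackageNotFoundError

MINIMUM = (1, 18, 1)

try:
    installed = version('numpy')  # reads installed metadata; does NOT import numpy
except PackageNotFoundError:
    raise ImportError('Numpy is not installed')

# Naive comparison assuming a plain 'X.Y.Z' version string; for pre-/post-release
# versions, prefer packaging.version.parse (third-party, but widely available).
if tuple(map(int, installed.split('.')[:3])) < MINIMUM:
    raise ImportError(f'Numpy version too low! Must be >= 1.18.1 (found {installed})')

import numpy as np  # safe to import now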

Where is the bazel rule generating the `gen_io_ops.py` file when building TensorFlow from sources?

I'm trying to determine how the gen_io_ops module is generated by bazel when building TensorFlow from source.
In tensorflow/python/ops/io_ops.py, there is this piece of code:
from tensorflow.python.ops import gen_io_ops
[...]
# used in the TextLineReader initialization
rr = gen_io_ops._text_line_reader_v2(...)
referring to the bazel-genfiles/tensorflow/python/ops/gen_io_ops.py module (generated by bazel when building TensorFlow).
The _text_line_reader_v2 is a wrapper of the TextLineReaderV2 defined in tensorflow/tensorflow/core/kernels/text_line_reader_op.cc.
As far as I understand, the build steps are the following:
1) The kernel library for the text_line_reader_op is built in tensorflow/tensorflow/core/kernels/BUILD
tf_kernel_library(
    name = "text_line_reader_op",
    prefix = "text_line_reader_op",
    deps = IO_DEPS,
)
where tf_kernel_library basically looks for the text_line_reader_op.cc file and builds it.
2) The :text_line_reader_op kernel library is then used as a dependency by the io library defined in the same file:
cc_library(
    name = "io",
    deps = [
        ":text_line_reader_op",
        ...
    ],
)
I suppose the io library now contains the definition of the TextLineReaderV2 kernel.
From what I get from this answer, there should be a third step where the io library is used to generate the Python wrappers that end up in the bazel-genfiles/tensorflow/python/ops/gen_io_ops.py module. This file generation can be done by the tf_gen_op_wrapper_py rule in Bazel or by the tf.load_op_library() method, but neither of them seems to be involved.
Does someone know where this third step is defined in the build process?
I finally got it.
There is indeed a call to tf_gen_op_wrapper_py, but it's hidden in a call to tf_gen_op_wrapper_private_py:
def tf_gen_op_wrapper_private_py(name, out=None, deps=[],
                                 require_shape_functions=True,
                                 visibility=[]):
  if not name.endswith("_gen"):
    fail("name must end in _gen")
  [...]
  bare_op_name = name[:-4]
  tf_gen_op_wrapper_py(name=bare_op_name, ...
So the steps are the following.
In tensorflow/tensorflow/python/BUILD, there is this rule
tf_gen_op_wrapper_private_py(
    name = "io_ops_gen",
    [...]
)
So in this rule the _gen suffix is removed (in tf_gen_op_wrapper_private_py) and a gen_ prefix is added (in tf_gen_op_wrapper_py); io_ops_gen thus becomes gen_io_ops, and the gen_io_ops.py module is generated by this rule.
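Spelling out the name arithmetic from those two macros in plain Python, just to illustrate the string handling:
name = "io_ops_gen"                        # rule name in tensorflow/python/BUILD
bare_op_name = name[:-4]                   # strip "_gen" -> "io_ops"
generated = "gen_" + bare_op_name + ".py"  # -> "gen_io_ops.py"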
