Python wheel force ABI to "none" - python

I would think this is an easy question, but I haven't found an answer yet, so I'm posting here.
I have a Python 3 app, that I package into a platform wheel. I have the setup.py and everything works as expected. The only thing I can't figure out is the generated wheel always includes an ABI tag (like"cp34m"), and when that's included I find that I can't actually install the wheel via pip. (My build script installs the latest pip, setuptools, and wheel before running.)
The solution is simple. I just change the file name of the wheel to change "cp34m" to "none". This is obviously easy to add into my build script, but I wonder if it's possible to set an option for bdist_wheel or something so that the .whl file that's generated has none set on its own?
The command I use to create the wheel is (for example on x64):
python setup.py bdist_wheel --plat-name win_amd64
That creates a wheel like:
mpf_mc-0.30.0.dev269-cp34-cp34m-win_amd64.whl
Which I then rename before uploading to PyPI to:
mpf_mc-0.30.0.dev269-cp34-none-win_amd64.whl
Everything seems to work fine by manually renaming it. But I wonder if this the right way to do it, or am I missing something?

This is difficult to answer without seeing your setup.py file, but I will give my thought:
the naming of the wheel postfix is a combination of the python tag, ABI tag, and platform tag, as shown here. But it sounds like you already knew that.
This link explaining it all is less official, but clearer, and here is the definition of ABI.
In my ignorant understanding, the ABI is the service level agreement (interface) between applications.
For the purposes of Python, the ABI is often the same as the python tag, with some additional info like the m you have in your case, which means the Cython aspects of it used malloc.
Reason Your Solution Works
So, I would guess the reason things do not work when you use the original name and do work once you change the name is:
pip will not let you install things using the m part of the ABI tag... probably because it has not been vetted, and not because of any real problem, and/or
you are not actually using any aspects of Cython that depend upon malloc.
Reason The ABI Tag Is Being Added
Now, if we assume that there is no real need for the m tag, as I said, then the reason it is probably being added is that you have a dependency in your setup file that needs it.
I am primarily a data scientist, so I will use a common example from there: numpy is compiled down to C code and has many Python calls to that lower level language to boost speed. Even if I write pure Python 3 code, and do not even use numpy anywhere, if I have a dependency in my setup file that asks for numpy, then I will have a platform-specific build. However, more common is that I am in fact using numpy, but I may not be using any aspects of it that rely upon C. (in numpy, that is rare, as all array initializations happen in C, but I hope you get the idea.) I do not know what forces ABI-specific builds, but I would guess you have some dependency asking for it, but which is never really used.

It seems like something has changed since you posted your question - I just tried your command to create a wheel file on my own project:
~$ python setup.py bdist_wheel --plat-name win_amd64
and the result file is:
my_project-1.0.0-py2-none-win_amd64.whl

Related

python import yaml - but which one? [duplicate]

This question already has answers here:
Python: How do I find which pip package a library belongs to?
(2 answers)
Closed 24 days ago.
Ok, so you clone a repo, there's an import
import yaml
ok, so you do pip install yaml and you get:
ERROR: No matching distribution found for yaml
Ok, so you look for a package with yaml in it, and there's like a gazillion of them... usually adding py in front does the job, but...
How on earth should I know which one was used?!
And it's not just yaml, oh no... there's:
import cv2 # python-opencv
import PIL # Pillow
and the list goes on and on...
How can I know which import uses which package? Shouldn't there be a PEP for this? Or a naming convention, e.g. import is always the same as the package name?
There's a similar topic here, if you're not frustrated enough :)
[When I clone a repo,] How can I know which import uses which package?
In short: it is the cloned code's responsibility to explain this, and it is an expected courtesy that the cloned code includes an installer that will take care of it.
If this is just some random person's bundle of .py files on GitHub with no installation instructions, look for notes in the associated documentation; failing that, make an issue on the tracker. (Or just give up. Maybe look for a better-engineered project that does the same thing.)
However, most "serious", contemporary Python projects are meant to be installed by using some form of packaging system. These have evolved over the years, and best practices have changed many times; but generally speaking, a properly "packaged" and "distributed" project will have either a setup.py or (newer; better in many ways, but not universally adopted yet) pyproject.toml file at the top level.
A pyproject.toml file is a config file in TOML format that simply describes a bunch of project metadata. This requires a build backend conforming to PEP 517. For a while, this required third-party tools, such as Poetry; but the standard setuptools can handle this since version 40.8.0. (As of this writing, the current release is 65.7.0.)
A setup.py script is executable code that pip will invoke after downloading a package from PyPI (or another package index). Generally, this script will use either setuptools or distutils (the predecessor to setuptools; it has finally been officially deprecated in 3.10, and will be removed in 3.12) to install the project, by calling a function named setup and passing it a big dict with some project metadata.
Security warning: this file is still executable code. It is arbitrary code, and it doesn't have to be following the standard conventions. Also, the package that is actually downloaded from PyPI doesn't necessarily match the project's source shown on GitHub (or another Git provisioning website), if such is even available. (This problem also affects package managers in other languages and ecosystems, notably npm for Javascript.)
With the setup.py based approach, package dependencies are specified using a keyword argument to the setup function. The specification has changed many times; currently, projects still using a setup.py should use the install_requires keyword argument.
With the pyproject.toml based approach, using setuptools' backend, dependencies will be an array (using JSON terminology, as TOML is a superset) stored under project.dependencies. This will vary for other backends; for example, Poetry expects this information under tool.poetry.dependencies.
In any event, pip freeze will output a list of what's installed in the current environment. It's a somewhat common practice for developers to test the code in a virtual environment where the dependencies are installed, dump this output to a requirements.txt file, and include that as documentation.
[When I want to use a third-party library in my own code,] How can I know which import uses which package?
It's worth considering the question the other way around, too: given that we have installed OpenCV for Python using pip install opencv-python, and want to use it in our own code, how do we know to import cv2 specifically?
The answer: there is no convention, and certainly no requirement for the installed package name to match the PyPI name, nor the GitHub etc. repository name. Read the documentation. Everyone who intends for their code to be used as a library, will be more than willing to show how, on at least a basic level.
Watch for requirements.txt . Big projects usually have it. You can import packages from this file. Else just google.
Keep in mind that it might not be a pip package.
Probably what is happening is that the main script is trying to import a secondary script (yaml.py, in this case) with functions or utils for the main script to use.
Check if the repo contains a file named yaml.py. If it's the case make sure to run the main script while the yaml.py is in the same directory.
Also, check for a requirements.txt file.
You can install all the requirements inside the file running in shell this line:
pip install -r *path to your requirements.txt*
Hope that this helps.
Any package on PyPI or cloned from some online repository is free to set itself up with a base directory name it chooses. That base directory xyz determines the import xyz line. Additionally a package name on PyPI doesn't have to match the repository name where its source code revisions are kept (assuming there is any).
This has the disadvantage that there is no one-to-one relation between package name, repo and/or import-line. But the advantage is that you e.g. can install Pillow, which is backwards compatible with PIL and still use import PIL instead of changing all your sources to use import Pillow as PIL.
If the repo you clone has a requirements.txt look there, you can also look in the setup.py for extra_require. But there is no guarantee that these are available, or contain the names of the packages to install (e.g. I use a generic setup.py that reads its info from a datastructure in the __init__.py file when creating/installing a package).
yaml seems to be a reserved name on PyPI (at least when I tried to upload a package with that name a few years ago). So that might be the reason the package is named PyYAML, although the Py is not very informative as the python code will not function in another programming language. PyPI' search is not very helpful as it relevance ordering is not relevant (at least not for yaml).
PyPI has no entry in the metadata for the import line, but you could extract that from .whl package file as the import line is the top level directory that doesn't match .dist-info. This is normally not possible from a .tar.gz` package file. I don't know of any site that does this kind of automatic scraping.
You can click through the packages on PyPI, after searching the import term, and hope you find something that matches the import in the documentation, but that is no guarantee you get the right one.
You might be best of searching for import yaml here on stackoverflow, and hope that the question or the answer mentions the package name.
thank you very much for your help and ideas. Big thanks to Karl Knechter for his exhaustive answer.
tl;dr: I think using some sort of "package" / "distribution" as a standard, would make everyone's lives easier.
However, my question was half-theoretical, to point out something I'd call, an incoherence in Python. You are of course right, there should be setuptools or requirements.txt or at least some documentation. But, if there isn't any, we're prone to error or additional browsing.
GospelBG pointed out something important. There could be a script yaml.py in the main folder and we need to check and/or guess.
Most importantly, naming imports differently than packages is just plainly misleading. There should be a naming convention or a PEP for this. Again, you can of course eventually get the proper package etc., but it's not explicit and obvious, and it should be! Because in programming, we like it that way, don't we?
I'm no seasoned dev in Python and I'm learning C++, but e.g. in C++, you import a header file with a particular name and static or dynamic libraries by their filename. Now I know this is very "step-by-step, on foot method", but at least you use the exact filenames.
On the upper level you have CMake, which would be an equivalent of setuptools where using find_package or find_library you can import package / library. To be honest, I'm not sure if all packages have the exact equivalent name, but at least the ones I used, did match.
Thanks again for your help and answers! I'm open for discussion and comments :)

Install a CMake macro script from within a python package install script (using setup.py)

So I have a Python package – it’s all set up on PyPI, and on GitHub, no problem. This is something I’m relatively familiar with.
What is unknown to me is: the notion of installing a CMake script as part of the python package install process. The python package in question is a development tool – you use it to preprocess some of your C/C++/Obj-C/Obj-C++ source files and generate some predefined macros in a header – and it works well when it’s wrapped in a CMake macro (for example like so) and executed as part of a proper chain of dependencies.
For one, I am not sure how to approach this, as there seem to be significant differences between the setuptools sandbox stance and distutils’ willing systems-level installer integration – and then even if I did know how to go about setting things up correctly in setup.py, I can’t find a good precedent on where a CMake script pertaining to a Python package might live.
All thoughts and insights on the matter are welcome.
It took me a while to understand your question. If I understand correctly, what you are trying to do is provide the IodSymbolize.cmake in the standard installation location of cmake so that other users/projects who rely on your software(symbolizer) can use it in their build process. I think you are thinking in a good direction, trying to provide services for end users of your package. Good question!
Here is my understanding of how things work in the cmake world.
Say I am an end user who wants to use "symbolizer" executable. What I would do is
find_package(symbolizer). This would try to figure out the location of the executable and it would set certain variables which can be used in the build process.
You need to provide Findsymbolizer.cmake file.
Please take a look at : http://www.cmake.org/Wiki/CMake:How_To_Find_Libraries
Also look at the Find*.cmake files provided in /usr//share/cmake/Modules directory if you are Unix/Linux platform.
Once the Findsymbolizer.cmake file is working properly, send it to the cmake mailing list for review. Once accepted it can be packaged in the next release of cmake. Then your module is usable with cmake. Hope I answered your question. Please update if you need more info.

How to install python binding of a C++ library

Imaging that we are given a finished C++ source code of a library, called MyAwesomeLib. The goal is to expose some of its power to python, so we create a wrapper using swig and generated a python package called PyMyAwesomeLib.
The directory structure now looks like
root_dir
|-src/
|-lib/
| |- libMyAwesomeLib.so
| |- _PyMyAwesomeLib.so
|-swig/
| |- PyMyAwesomeLib.py
|-python/
|- Script_using_myawesomelib.py
So far so good. Ideally, all we want to do next is to copy lib/*.so swig/*.py and python/*.py into the corresponding directory in site-packages in a pythonic way, i.e. using
python setup.py install
However, I got very confused when trying to achieve this simple goal using setuptools and distutils. Both tools handles the compilation of python extensions through an internal system, where the source file, compiler flags etc. are passed using setup(ext_module=[Extension(...)]). But this is ridiculous since MyAsesomeLib has a fully functioning build system that is based on makefile. Porting the logic embedded in makefiles would be redundant and completely un-necessary work.
After some research, it seems there are two options left, I can either override setuptools.command.build and setuptools.command.install to use the existing makefile and copy the results directly, or I can somehow let setuptools know about these files and ask it to copy them during installation. The second way is more appealing, but it is what gives me the most headache. I have tried the following optionts without success
package_data, and include_package_data does not work because *.so files are not under version control and they are not inside of any package.
data_files does not seems to work since the files only get included when running python setup.py sdist, but ignored when python setup.py install. This is the opposite of what I want. The .so files should not be included in the source distribution, but get copied during the installation step.
MANIFEST.in failed for the same reason as data_files.
eager_resources does not work either, but honestly I do not know the difference between eager_resources and data_files or MANIFEST.in.
I think this is actually a common situation, and I hope there is a simple solution to it. Any help would be greatly appreciated.
Porting the logic embedded in makefiles would be redundant and
completely un-necessary work.
Unfortunately, that's exactly what I had to do. I've been struggling with this same issue for a while now.
Porting it over actually wasn't too bad. distutils does understand SWIG extensions, but it this was implemented rather haphazardly on their part. Running SWIG creates Python files, and the current build order assumes that all Python files have been accounted for before running build_ext. That one wasn't too hard to fix, but it's annoying that they would claim to support SWIG without mentioning this. Distutils attempts to be cross-platform when compiling things, so there is still an advantage to using it.
If you don't want to port your entire build system over, use the system's package manager. Many complex libraries do this (but they also try their best with setup.py). For example, to get numpy and lxml on Ubuntu you'd just do:
sudo apt-get install python-numpy python-lxml. No pip.
I realize you'd rather write one setup file instead of dealing with every package manager ever so this is probably not very helpful.
If you do try to go the setuptools route there is one fatal flaw I ran into: dependencies.
For instance, if you are distributing a SWIG-based project, it's going to need libpython. If they don't have it, an error like this happens:
#include <Python.h>
error: File not found
That's pretty unhelpful to the average user.
Even worse, if you require a shared library but the user's library is out of date, the user can get some crazy errors. You're at the mercy of their C++ compiler to output Google-friendly error messages so they can figure it out.
The long-term solution would be to get setuptools/distutils to get better at detecting non-python libraries, hopefully as good as Ruby's gem. I pretty much had to roll my own. For instance, in this setup.py I'm working on you can see a few functions at the top I hacked together for dependency detection (still doesn't work on all systems...definitely not Windows).

Preparing a complex python project for submission to launchpad

I'm trying to wrap my head around the whole PPA thing and it seems to be as unnecessarily difficult as everybody is making it out to be. Let's take a project like http://docs.bokeh.org/ which has a node.js dependency and make a .deb out of it. Following this guide, and various posts here, I tried to use stdeb to do it:
pypi-download bokeh
tar xfz bokeh-0.7.0.tar.gz
cd bokeh-0.7.0/bokehjs/
npm install
grunt build
cd ..
python3 setup.py --command-packages=stdeb.command sdist_dsc
The end of the output is
dh clean --with python3 --buildsystem=python_distutils
dh_testdir -O--buildsystem=python_distutils
debian/rules override_dh_auto_clean
make[1]: Entering directory `/home/emre/Desktop/bokeh-0.7.0/deb_dist/bokeh-0.7.0'
python3 setup.py clean -a
/home/emre/Desktop/bokeh-0.7.0/deb_dist/bokeh-0.7.0/bokehjs
ERROR: Cannot install BokehJS: files missing in `./bokehjs/build`.
Please build BokehJS by running setup.py with the `--build_js` option.
Dev Guide: http://docs.bokeh.org/docs/dev_guide.html.
I just did that! Am I missing something? Is this building even necessary for something that's straight off pypi? The guides gloss over these things.
Making good debs can be complicated, yes, especially when you are not the upstream author and aren't sure exactly what their intentions were for installations of their software. The complication is necessary because well-behaved debs must conform to a fairly long list of policies and requirements so that users know what to expect from them in many different situations and cases. Source for debs needs to contain enough information that it can be built by automated systems (including installing any necessary build dependencies). Binary (built) debs must put their files in the right places on the system and not break any other packages and be able to clean up after themselves fully on uninstall. Debs should be installable without a user watching on an interactive terminal. Debs must declare all of their dependencies, and necessary versions of those dependencies, except for a few packages considered "required". Debs should not download anything from the internet during build or install. And so on, and so on. This strictness and the degree to which the community adheres to it is actually one of the most important benefits of running a Debian-based distribution.
Python source distributions such as those you find on PyPI, on the other hand, can pretty much do whatever they want. There are emerging best-practices for build and install commands with setup.py, but they're not always followed, and even when they are, there is still a lot of room for interpretation and variance. Some, such as the one you reference here, might arbitrarily require the user to call setup.py with a different nonstandard option before building normally. Some go ahead and download their own dependencies and put them wherever they want. Most packages beyond the trivial don't know how to uninstall themselves.
Both approaches are fine and are better in different contexts. But hopefully you can see now why it's not possible in the general case to make arbitrary Python source distributions automatically into working debs. There is just too much that the computer has to assume about how the Python will behave.
Having said all that, if you don't care about conforming to Ubuntu/Debian policy and you just want to be able to put something in a personal repository, the easiest path for you might be to change the Python source so that it does its --build_js thing automatically as necessary, rather than complaining and asking the user to do it.

Problems installing a package from PyPI: root files not installed

After installing the BitTorrent-bencode package, either via easy_install BitTorrent-bencode or pip install BitTorrent-bencode, or by downloading the tarball and installing that via easy_install $tarball, I discover that /usr/local/lib/python2.6/dist-packages/BitTorrent_bencode-5.0.8-py2.6.egg/ contains EGG-INFO/ and test/ directories. Although both of these subdirectories contain files, there are no files in the BitTorr* directory itself. The tarball does contain bencode.py, which is meant to be the actual source for this package, but it's not installed by either of those utils.
I'm pretty new to all of this so I'm not sure if this is a problem with the package or with what I'm doing. The package was packaged a while ago (2007), so perhaps it's using some deprecated configuration aspect that I need to supply a command-line flag for.
I'm more interested in learning what's wrong with either the package or my procedures than in getting this particular package installed; there is another package called hunnyb that seems to do a decent enough job of decoding bencoded data. Mostly I'd like to know how to deal with such problems in other packages. I'd also like to let the package maintainer know if the package needs updating.
edit
#Andrey Popp explains that the problem is likely with the setup.py file. I guess the only way I can really get an answer to my question is by actually R-ing TFM. However since I likely won't have time to do that thoroughly for a while yet, I've posted the setup.py file here.
A quick browse through the easy_install manual reveals that the function find_modules(), which this module's setup.py makes use of, searches for files named __init__.py within the package. The source code file in question is named bencode.py, so perhaps this is the problem: it should be named __init__.py?
edit 2
Having now learned Python packaging, I gather that the problem is that this module is using setuptools.find_packages, and has its source at the root of its directory structure, but hasn't passed anything in package_dir. It would seem to be fairly trivial to fix. However, the author is not reachable by his PyPI contact info. The module's PyPI page lists a "Package Index Owner" as well. I'm not sure what that's supposed to mean, but I did manage to get in touch with that person, who I think is maybe not in a position to maintain the module. In any case, it's still in the same state as when I posted this question back in June.
Given that the module seems to be more or less abandoned, and that there's a suitable replacement for it in hunnyb, I've accepted that #andreypopp's answer is about as good of one as I'm going to get.
It seems this package's setup.py is broken — it does not define right package for distribution. I think, you need to check setup.py in source release and if it is true — report a bug to author of this package.

Categories