Managing wheel files with conda - python

So, I have just started using Python and I installed conda to manage the packages, but now I need to install a package (which comes in a "wheel" file, whatever that is?) which is not available from the conda repositories, and I am not sure what to do. I might be able to use pip, but I have read somewhere that this is not a good way, since I will not be able to use conda features to manage it later. I saw this https://docs.conda.io/projects/conda-build/en/latest/user-guide/wheel-files.html article, but it talks about something called a "conda recipe" and other things I have no clue about, and it doesn't provide step-by-step instructions on what to do. So, how do I fully incorporate such "non-conda" packages into conda?

A wheel file is a bit like a Windows installer: it is a pre-built archive that installs a Python package. Note that conda cannot install a .whl file directly; instead, activate your conda environment and install the wheel with pip:
pip install your_package.whl
Replace 'your_package.whl' with the path to your downloaded file. Check out this source for wheel files: Unofficial windows binaries.
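As an aside, despite the installer comparison, a wheel is really just a zip archive with a *.dist-info metadata folder inside, and you can peek into one with the standard library. A minimal sketch (the filename some_package-1.0-py3-none-any.whl is a made-up placeholder):

# Inspect a wheel file: it is a plain zip archive containing the package's modules
# plus a *.dist-info/ metadata folder. The filename below is a placeholder.
import zipfile

with zipfile.ZipFile("some_package-1.0-py3-none-any.whl") as whl:
    for name in whl.namelist():
        print(name)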

Related

Building docs fails due to missing pandoc

The problem
I am having trouble getting my docs to build successfully when clicking "Build" on the readthedocs.io web interface, but it builds just fine on my local machine. To test that it is an environment issue, I created a virtual environment:
conda create virtualenv -n venv
conda env export -n venv
source activate venv
Then I installed my requirements.txt file as:
pip install -r requirements.txt
and then ran
make clean html
In the virtual env and on readthedocs online, I get the error:
Notebook error:
PandocMissing in ex_degassing.ipynb:
Pandoc wasn't found.
Please check that pandoc is installed:
http://pandoc.org/installing.html
make: *** [html] Error 2
I have searched and searched for a solution, but my best guess is that pandoc is not being installed via pip even though it is in the requirements.txt file. I tried also telling it to build from source by replacing pandoc with git+git://github.com/jgm/pandoc#egg=pandoc in my requirements.txt file, but this did not work (see below for what my files look like). I can easily install pandoc on my local machine, but it fails to install via the requirements.txt file in my virtual environment or on readthedocs.
Some of my files
Here is my requirements.txt file:
sphinx>=1.4
sphinx_rtd_theme
ipykernel
nbsphinx
pandas
pandoc
numpy
matplotlib
ipynb
scipy
I read through all of the readthedocs documentation and, ironically, found the documentation for building woefully inadequate for someone of my limited coding ability. I tried making both readthedocs.yml and environment.yaml files, but I have no idea if they are doing anything or if they are properly written. For completeness, here are those files:
readthedocs.yml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py
python:
  version: 3.7
  install:
    - requirements: docs/requirements.txt
environment.yaml
channels:
  - conda-forge
dependencies:
  - python =3.7
  - pandoc
In summary
I have created beautiful documentation with readthedocs and sphinx, but I cannot get it to build on the web. How can I either: a) solve the issue of installing pandoc (is that the issue??) or b) remove the need for pandoc (e.g., is there a way to include jupyter notebook files in my docs without needing pandoc? How to do this?)?
Thanks so much for anyone's advice!
It's currently not very well explained on the PyPI page for pandoc, but pandoc is actually written in Haskell, not Python. The pandoc package for Python, which you can install with pip install pandoc, doesn't come with an executable copy of pandoc; it is just a wrapper that provides bindings to pandoc within Python. It does this by making shell calls to your system copy of pandoc, so you need to separately install the actual pandoc executable on your system in order for the Python bindings to have something to bind to!
For example, on Debian/Ubuntu, you will need to install a system copy of pandoc using apt in addition to installing the python bindings with pip.
sudo apt install pandoc
pip install pandoc
Though you can do this on your local system, this isn't a viable solution for getting it to work on readthedocs.
In contrast to this, conda tries to make packages work irrespective of things installed on the rest of your system. Hence the conda pandoc package ships with an actual copy of pandoc, which it uses instead of trying to use your system copy of pandoc. If you're using conda, there is only one command you need to run to get pandoc working within python:
conda install pandoc
If you're using a conda environment for your readthedocs specification (as is the situation for OP), you can just add pandoc to the conda section instead of the pip section of the env.
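Whichever route you take, you can verify from Python that an actual pandoc executable is reachable on your PATH, which is what the pip wrapper ultimately shells out to. A minimal check using only the standard library:

# Check whether a real pandoc binary is on the PATH and report its version.
import shutil
import subprocess

pandoc_path = shutil.which("pandoc")
if pandoc_path is None:
    print("No pandoc executable found on PATH; install it with conda or your OS package manager.")
else:
    # The first line of "pandoc --version" output gives the version string.
    version = subprocess.run([pandoc_path, "--version"], capture_output=True, text=True).stdout.splitlines()[0]
    print(f"Found {pandoc_path}: {version}")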
The pandoc package on PyPI which you can install with pip isn't actually an official pandoc package for python. There's also pypandoc and py-pandoc, which are equally unofficial.
Supposedly py-pandoc is a one-stop shop for installing pandoc through a python interface. I think you just have to do pip install py-pandoc or conda install pandoc and it works immediately, though I'm not sure due to the lack of documentation.
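Either way, a quick way to confirm that the Python bindings can actually reach a pandoc binary is to ask pypandoc for the version and run a tiny conversion. A small sketch, assuming pypandoc is installed:

# Smoke test for the pypandoc bindings; raises OSError if no pandoc executable is found.
import pypandoc

print("pandoc version:", pypandoc.get_pandoc_version())
# Convert a tiny Markdown snippet to reStructuredText.
print(pypandoc.convert_text("# Hello, pandoc", to="rst", format="md"))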
For myself, I'm using Sphinx to build documentation. My personal solution to the problem of installing pandoc when building python documentation (including on readthedocs) is to add the following code block to conf.py.
import os
from inspect import getsourcefile

# Get path to directory containing this file, conf.py.
DOCS_DIRECTORY = os.path.dirname(os.path.abspath(getsourcefile(lambda: 0)))

def ensure_pandoc_installed(_):
    import pypandoc

    # Download pandoc if necessary. If pandoc is already installed and on
    # the PATH, the installed version will be used. Otherwise, we will
    # download a copy of pandoc into docs/bin/ and add that to our PATH.
    pandoc_dir = os.path.join(DOCS_DIRECTORY, "bin")
    # Add dir containing pandoc binary to the PATH environment variable
    if pandoc_dir not in os.environ["PATH"].split(os.pathsep):
        os.environ["PATH"] += os.pathsep + pandoc_dir
    pypandoc.ensure_pandoc_installed(
        quiet=True,
        targetfolder=pandoc_dir,
        delete_installer=True,
    )

def setup(app):
    app.connect("builder-inited", ensure_pandoc_installed)
with pypandoc added to the requirements.txt file.
This tells Sphinx to check whether pandoc is already on the system PATH; if it isn't, it downloads a copy of pandoc into the docs/bin/ directory of the repository and adds that directory to PATH.
I faced the same issue.
Notebook error:
PandocMissing in getting_started.ipynb:
Pandoc wasn't found.
Please check that pandoc is installed:
https://pandoc.org/installing.html
It seems to make a difference whether you install pandoc via pip or via conda.
Try the following:
pip uninstall pandoc
conda install pandoc
Using the conda version solved the issue on my machine.
On a different front, if you are using poetry, you may also come across the same error message, i.e. PandocMissing.
Windows:
Install Pandoc via its Windows installer: https://pandoc.org/installing.html.
Linux (tested with Ubuntu 22.04):
sudo apt install pandoc
First:
pip uninstall pandoc
Next:
conda install pandoc
Just run these in the Anaconda prompt (restart the kernel if necessary), and done!
This can be done by the non-Conda, pip fanboys.
I finally hit the wall and relented to allow Miniconda to play a small role in managing packages for my Python 3.10 on a Windows PC. I could not get Jupyter Book to convert a markdown file to a MyST markdown file, and the issue was with pandoc.
But I remain determined to keep controlling my Python packages myself using pip; I will not give that control to Anaconda.
With that long qualifier out of the way, I will explain what I did.
I uninstalled pandoc using pip:
pip uninstall pandoc
And then I installed Miniconda: https://docs.conda.io/en/latest/miniconda.html
There was a Windows installer for my version of Python, Python 3.10. It downloads into the Windows Downloads folder.
Years ago, I fought Anaconda for control over my python and will not allow MiniConda any similar control. During the installation of Miniconda, I unselected two default checkboxes which would have given Miniconda control over vsCode, etc... And also did not accept the default installation destination and instead installed Miniconda to a new folder:
C:\miniConda
Then I needed to put conda on the PATH by adding the Miniconda Scripts folder (the directory containing conda.exe) to the system environment variables (System Properties > Environment Variables):
C:\MiniConda\Scripts
Close the cmd prompt and then open it again in the folder holding the .md file. (The smallish cmd prompt is preferred over the large Terminal window.) Run conda and install pandoc:
conda install pandoc
Finally, convert the .md file to MyST markdown:
jupyter-book myst init myBook.md --kernel ir
It works. Miniconda is kept under control, the pandoc package worked, and the process created a MyST markdown file.
And for those who don't read instructions carefully: note that I'm using the R kernel (ir) rather than the Python kernel.
Well, I got it working on my system:
download and install pandoc from https://pandoc.org/installing.html
start a new session (cmd or PowerShell)
call sphinx make again
Simple as that. Hope you can get it working!

Installing dependencies from (Conda) environment.yml without Conda?

I currently use Conda to capture the dependencies for a Python project in an environment.yml.
When I build a Docker service from the project, I need to reinstall these dependencies, and I would like to avoid having to add (mini-)conda to my Docker image.
Is it possible to parse environment.yml with pip/pipenv or transform this into a corresponding requirements.txt?
(I don't want to leave conda just yet, as this is what MLflow captures, when I log models)
Nope.
conda automatically installs dependencies of conda packages. These are resolved differently by pip, so you'd have to resolve the Anaconda dependency tree in your transformation script.
Many conda packages are non-Python. You couldn't install those dependencies with pip at all.
Some conda packages contain binaries that were compiled with the Anaconda compiler toolchain. Even if the corresponding pip package can compile such binaries on installation, it wouldn't be using the Anaconda toolchain. What you'd get would be fundamentally different from the corresponding conda package.
Some conda packages have fixes applied, which are missing from corresponding pip packages.
I hope this is enough to convince you that your idea won't fly.
Installing Miniconda isn't really a big deal. Just do it :-)
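That said, if all you actually need from the environment.yml is its pip: subsection, a few lines of Python will pull it out; everything in the conda dependency list is simply skipped, which is exactly the information loss described above. A rough sketch, assuming PyYAML is available and the file is named environment.yml:

# Extract only the pip: subsection of an environment.yml into requirements.txt.
# Conda-only dependencies are dropped on purpose; they cannot be translated for pip.
import yaml

with open("environment.yml") as f:
    env = yaml.safe_load(f)

pip_deps = []
for dep in env.get("dependencies", []):
    # Conda deps are plain strings; the pip section is a dict like {"pip": [...]}.
    if isinstance(dep, dict) and "pip" in dep:
        pip_deps.extend(dep["pip"])

with open("requirements.txt", "w") as f:
    f.write("\n".join(pip_deps) + "\n")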

Install non-conda packages in Anaconda on air-gapped machine

Trying to get pyNastran onto an air-gapped machine with a new install of Anaconda.
I've tried conda install pyNastran-0.7.1.zip on the zipped source code, and conda install setup.py inside the unzipped folder. Both commands cause conda to try to get "package metadata" from https://repo.continuum.io/pkgs and fail when they can't reach the server, despite this being the method suggested here.
python setup.py install fails due to setuptools not being installed, and installing setuptools through python fails, apparently due to setuptools not being installed (?!).
I must be doing something wrong here. How do I get this to install?
conda can only work with tar.bz2 files.
So, unzip pyNastran-0.7.1.zip and re-zip as pyNastran-0.7.1.tar.bz2 using some zipping tool.
Now, you need to tell conda to work offline with --offline:
conda install --offline pyNastran-0.7.1.tar.bz2

Python Packaging and Distribution Scenario

I am still relatively new to Python packaging. Each time I think I've found "the" solution, I am thrown another curve ball. Here is my problem, followed by what I've tried:
I have CentOS and Ubuntu systems with Python 2.7.3 installed that are partitioned from the net, so I have to create an "all in one" package
The target system does NOT have setuptools, easy_install, pip, virtualenv installed (this is the problem I'm trying to solve here)
The requirements.txt (or setup.py install_dependencies) is fairly heavy (Flask, etc...) for the application (though really, this isn't the problem)
My packaging sophistication has progressed slowly:
For connected systems, I had a really nice process going with
packaging: python2.7 setup.py sdist
installation: create a virtualenv, untar the distribution, python setup.py install
For the disconnected system, I've tried a few things. Wheels seem to be appropriate but I can't get to the "final" installation that includes setuptools, easy_install, pip. I am new to wheels so perhaps I am missing something obvious.
I started with these references:
Python on Wheels, this was super helpful but I could not get my .sh scripts, test data, etc... installed so I am actually using a wheel/sdist hybrid right now
Wheel, the Docs, again, very helpful but I am stuck on "the final mile of a disconnected system"
I then figured out I could package virtualenv as a wheel :-) Yay
I then figured out I could package easy_install as a python program :-) Yay, but it depends on setuptools, boo, I can't find how to get these packaged / installed
Is there a reference around for bootstrapping a system that has Python, is disconnected, but does not have setuptools, pip, wheels, virtualenv? My list of things a person must do to install this simple agent is becoming just way too long :/ I suppose if I can finish the dependency chain there must be a way to latch in a custom script to setup.py to shrink the custom steps back down ...
Your process will likely vary according to what platform you are targeting, but in general, a typical way to achieve what you want is to download packages on an online machine, copy them over to the offline one, and then install them from a file rather than from a URL or repository.
A possible workflow for RPM-based distros may be:
Install python-pip through binary packages (use rpm or yum-downloadonly, to download the package on an online machine, then copy it over and install it on the offline one with rpm -i python-pip.<whatever-version-and-architecture-you-downloaded>).
On your online machine, use pip install --download <pkgname> to download the packages you need.
scp or rsync the packages to a given directory X onto your offline machine
Use pip install --find-links=<your-dir-here> <pkgname> to install packages on your offline machine.
If you have to replicate the process on many servers, I'd suggest you set up your own repositories behind a firewall. In case of pip, it is very easy, as it's just a matter of telling pip to use a directory as its own index:
$ pip install --no-index --find-links=file:///local/dir/ SomePackage
For RPM or DEB repos is a bit more complicated (but not rocket science!), but possibly also not that necessary, as you really only ought to install python-pip once.
The pip install --download option that @mac mentioned has been deprecated and removed. The documentation states that the pip download command should be used instead. So the workflow should be:
Download the python package or installer using your online machine.
Install python using the offline method used by your package manager or the python installer for windows on the offline machine.
On the online machine, use pip download -r requirements.txt, where requirements.txt lists the packages you need in the proper format.
Use pip install --find-links=<your-dir-here> <pkgname> to install packages on your offline machine.
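For scripting purposes, the same two-step workflow can also be driven from Python via subprocess. This is only a sketch: the wheelhouse directory name is a placeholder, and the two calls of course run on different machines.

import subprocess
import sys

# On the online machine: download the packages and their dependencies into ./wheelhouse.
subprocess.run(
    [sys.executable, "-m", "pip", "download", "-r", "requirements.txt", "-d", "wheelhouse"],
    check=True,
)

# On the offline machine (after copying wheelhouse/ over): install from the local directory only.
subprocess.run(
    [sys.executable, "-m", "pip", "install", "--no-index", "--find-links", "wheelhouse", "-r", "requirements.txt"],
    check=True,
)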

Deploy Python package on Windows, that compiles dependencies, without installing Visual Studio?

What's the best way to deploy a Python package to a Windows server if some of the dependencies need to be compiled? Installing Visual Studio is out of the question, and I'm reluctant to pass a 250MB file around every time things need updating. pip install -r requirements.txt is the goal.
Any suggestions?
easy_install - allows installing from exe
easy_install does install from .exe installers, if they are available. pip does not install from .exe, so for C/C++ packages easy_install is often the way to install.
Making pip install binaries
pip supports installation from the wheel format.
If you have a wheel on PyPI or another index server you use, pip can install it. The latest versions of pip use the wheel format by default; older ones required the --use-wheel switch.
Building wheels
You can either build (compile) wheels, or you can convert eggs or .exe installers to wheels.
For this, use the wheel command.
Serving packages in wheel format
Some packages (e.g. lxml) do not provide wheels on the PyPI server. If you want to use them, you have to manage your own index. The easiest option is to use a dedicated directory and configure pip with --find-links pointing there.
See the SO answer on caching packages for more tips and details.
A very good alternative is the devpi index server, which is easy to set up, has a very good workflow, and integrates with pip very well.
Using a compile-once, deploy-many-times strategy might help, built around Python's newest package distribution format, wheels. Wheels are an already-compiled, binary format, so the only requirement is a development/build platform that is similar to, and running the same Python as, the deployment platform (which then no longer needs a build chain).
Install the latest pip, setuptools and wheel packages. You can then use the pip wheel command as you would pip install, except it will instead produce a folder called (by default) wheelhouse, full of the necessary, already-compiled wheels for another python/pip to install what you required.
You can then pass those wheel file paths directly to a pip install command (e.g. on your deployment machine), or use the normal package name(s) with the --no-index --find-links= flags to point to the folder where those wheels are. This location can also be a simple HTTP folder, for example.
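To illustrate the HTTP-folder option, the standard library alone is enough to serve a wheelhouse directory so that another machine can install from it with pip install --no-index --find-links=http://<host>:8000/ <pkgname>. A small sketch, with the directory name and port as placeholders (Python 3.7+ for the directory argument):

# Serve the local "wheelhouse" folder over HTTP so pip --find-links can point at it.
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler

handler = functools.partial(SimpleHTTPRequestHandler, directory="wheelhouse")
HTTPServer(("0.0.0.0", 8000), handler).serve_forever()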
