Supplying NumPy site.cfg arguments to pip

I'm using NumPy built against Intel's Math Kernel Library. I use virtualenv, and typically use pip to install packages.
However, in order for NumPy to find the MKL libraries, it's necessary to create a site.cfg file in the NumPy source directory prior to compiling it, then manually build and install. I could script this whole process, but I was hoping for a simpler solution.
I have a standard site.cfg file that can be used for this purpose under version control. Are there any pip command line options that will tell it to copy a particular file to the source directory before building a package?
Alternatively, are there any environment variables that can be set instead of supplying the library paths in a site.cfg file? Here is the site.cfg file that I use. It was taken almost verbatim from Intel's site.
[mkl]
library_dirs = /opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
include_dirs = /opt/intel/composer_xe_2013.1.117/mkl/include
mkl_libs = mkl_rt
lapack_libs =
For reference, I'm running Ubuntu, Python 2.7, and NumPy 1.6.

From the source (https://github.com/numpy/numpy/blob/master/site.cfg.example):
To assist automatic installation like easy_install, the user's home directory
will also be checked for the file ~/.numpy-site.cfg .
Is that a workable solution? You'd still need to preload the home directories with the global .numpy-site.cfg, but you wouldn't have to muck with the build or installation after that.
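That preload step can be a one-liner. A minimal sketch, assuming your version-controlled site.cfg sits in the current directory:
import os.path
import shutil

# copy the version-controlled site.cfg into place so NumPy's build
# picks it up automatically, then install from source as usual
shutil.copy('site.cfg', os.path.expanduser('~/.numpy-site.cfg'))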

I ended up putting together a script to automate this. Here it is, in case it can help someone else. It's written for Python 2.7 (it uses urllib2); porting it to Python 3 would only need minor changes.
from __future__ import unicode_literals

import io
import os.path
import re
import subprocess
import urllib2

# This downloads, builds, and installs NumPy against the MKL in the
# currently active virtualenv

file_name = 'numpy-1.6.2.tar.gz'
url = ('http://sourceforge.net/projects/numpy/files/NumPy/1.6.2/'
       'numpy-1.6.2.tar.gz/download')


def main():
    # download NumPy and unpack it
    file_data = urllib2.urlopen(url).read()
    with io.open(file_name, 'wb') as fobj:
        fobj.write(file_data)
    subprocess.check_call('tar -xvf {0}'.format(file_name), shell=True)
    base_name = re.search(r'(.*)\.tar\.gz$', file_name).group(1)
    os.chdir(base_name)

    # write out a site.cfg file in the build directory
    site_cfg = (
        '[mkl]\n'
        'library_dirs = /opt/intel/composer_xe_2013.1.117/mkl/lib/intel64\n'
        'include_dirs = /opt/intel/composer_xe_2013.1.117/mkl/include\n'
        'mkl_libs = mkl_rt\n'
        'lapack_libs =\n')
    with io.open('site.cfg', 'wt', encoding='UTF-8') as fobj:
        fobj.write(site_cfg)

    # build and install NumPy
    subprocess.check_call('python setup.py build', shell=True)
    subprocess.check_call('python setup.py install', shell=True)


if __name__ == '__main__':
    main()
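After the install finishes, a quick sanity check (run inside the virtualenv) is to print NumPy's build configuration, which should list the MKL directories from site.cfg:
import numpy

# shows the BLAS/LAPACK libraries NumPy was built against
numpy.__config__.show()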

Your goal of installing NumPy against Intel's Math Kernel Library is now much easier, since Intel publishes pip packages that install MKL + NumPy:
pip uninstall numpy -y # if the standard numpy is present
pip install intel-numpy
as well as intel-scipy, intel-scikit-learn, pydaal, tbb4py, mkl_fft, mkl_random, and the lower-level packages if you need just them. Again, you must first uninstall the standard packages if they're already installed in your virtualenv.
NOTE:
If the standard NumPy, SciPy and scikit-learn packages are already installed, they must be uninstalled before installing the Intel® variants (intel-numpy etc.) to avoid any conflicts. As mentioned earlier, pydaal uses intel-numpy, so it is important to first remove the standard NumPy (if installed) and then install pydaal.

Alternatively, are there any environment variables that can be set instead of supplying the library paths in a site.cfg file?
NumPy 1.21 introduces environment variables for this purpose.
E.g.
NPY_BLAS_ORDER=MKL NPY_LAPACK_ORDER=MKL pip install numpy --no-binary numpy
to auto-detect the MKL library when installing NumPy from source code. If needed, you can set the environment variables NPY_BLAS_LIBS, NPY_CBLAS_LIBS, and NPY_LAPACK_LIBS to linker CLI options which put your chosen libraries on the linker path.
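For instance, here is a minimal sketch that drives the same source build from Python, in the style of the script above; the MKL path and the mkl_rt linker options are assumptions you would adjust for your installation:
import os
import subprocess

# environment for the NumPy 1.21+ build-time variables; the linker
# options below assume the single dynamic MKL library (mkl_rt)
env = dict(os.environ)
env['NPY_BLAS_ORDER'] = 'MKL'
env['NPY_LAPACK_ORDER'] = 'MKL'
env['NPY_BLAS_LIBS'] = '-L/opt/intel/mkl/lib/intel64 -lmkl_rt'
env['NPY_CBLAS_LIBS'] = '-L/opt/intel/mkl/lib/intel64 -lmkl_rt'
env['NPY_LAPACK_LIBS'] = '-L/opt/intel/mkl/lib/intel64 -lmkl_rt'

subprocess.check_call('pip install numpy --no-binary numpy',
                      shell=True, env=env)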
This is easier for a script to do than creating a ~/.numpy-site.cfg file such as:
[openblas]
libraries = openblas
library_dirs = /usr/local/opt/openblas/lib
include_dirs = /usr/local/opt/openblas/include
runtime_library_dirs = /usr/local/opt/openblas/lib
then running
pip install numpy --no-binary numpy
BTW, the file ~/.numpy-site.cfg also works when installing scipy from source code:
pip install scipy --no-binary scipy
NOTE: If you're still using Python 2.7, install numpy first, then install scipy. Attempting to install them together will:
1. invoke a SciPy easy_install installer that requests NumPy,
2. load the latest NumPy installer (even if you specifically asked pip to install numpy==1.14.6 scipy==1.0.1 --no-binary numpy,scipy), and then
3. fail with RuntimeError: Python version >= 3.5 required, because the latest NumPy does not support Python 2.7.

pip install -e . vs setup.py

I have been locally editing (inside a conda env) the package GSTools, cloned from the GitHub repo https://github.com/GeoStat-Framework/GSTools, to adapt it to my own purposes. The package is C++ wrapped in Python (Cython).
So far I've used pip install -e . in the main package dir for my local changes. But now I want to use its OpenMP support by setting the env variable export GSTOOLS_BUILD_PARALLEL=1. Then, doing pip install -e ., I get among other things in the terminal ...
Installing collected packages: gstools
Running setup.py develop for gstools
Successfully installed gstools-1.3.6.dev37
The issue is that nothing actually changed: setup.py (shown below) is supposed to print "OpenMP used: True" if the env variable GSTOOLS_BUILD_PARALLEL=1 is set in the Linux terminal, and print something else if it's not set to 1.
Here is setup.py:
# -*- coding: utf-8 -*-
"""GSTools: A geostatistical toolbox."""
import os

import numpy as np
from Cython.Build import cythonize
from extension_helpers import add_openmp_flags_if_available
from setuptools import Extension, setup

# cython extensions
CY_MODULES = [
    Extension(
        name=f"gstools.{ext}",
        sources=[os.path.join("src", "gstools", *ext.split(".")) + ".pyx"],
        include_dirs=[np.get_include()],
        define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
    )
    for ext in ["field.summator", "variogram.estimator", "krige.krigesum"]
]

# you can set GSTOOLS_BUILD_PARALLEL=0 or GSTOOLS_BUILD_PARALLEL=1
if int(os.getenv("GSTOOLS_BUILD_PARALLEL", "0")):
    added = [add_openmp_flags_if_available(mod) for mod in CY_MODULES]
    print(f"## GSTools setup: OpenMP used: {any(added)}")
else:
    print("## GSTools setup: OpenMP not wanted by the user.")

# setup - do not include package data to ignore .pyx files in wheels
setup(ext_modules=cythonize(CY_MODULES), include_package_data=False)
I tried instead just python setup.py install but that gives
UNKNOWN 0.0.0 is already the active version in easy-install.pth
Installed /global/u1/b/benabou/.conda/envs/healpy_conda_gstools_dev/lib/python3.8/site-packages/UNKNOWN-0.0.0-py3.8-linux-x86_64.egg
Processing dependencies for UNKNOWN==0.0.0
Finished processing dependencies for UNKNOWN==0.0.0
and import gstools no longer works correctly.
So how can I install my edited version of the package with OpenMP support?
developer of GSTools here.
I guess you don't see the printed message because pip suppresses output during the setup. So you could try making pip verbose with:
GSTOOLS_BUILD_PARALLEL=1 pip install -v -e .
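With -v you should see the "## GSTools setup: OpenMP used: ..." line from setup.py in pip's output. If you want to double-check the flag before rebuilding, here is a tiny sketch that uses only what setup.py above reads:
import os

# setup.py requests OpenMP for any value that parses to a nonzero int
print(bool(int(os.getenv('GSTOOLS_BUILD_PARALLEL', '0'))))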
BTW, we are always interested in enhancements, so maybe you are willing to share your edits to GSTools? :-)
Cheers,
Sebastian

How can I ensure and confirm that the scipy package I installed has pythran optimized functions available?

How can I ensure and confirm that the SciPy package I installed has Pythran-optimized functions available? I am installing Python 3.9.5 from the Python Software Foundation website, using the PyCharm IDE for package management, which uses pip to set up packages.
If you've used pip to install the SciPy 1.7.0 wheel from PyPI, it will have the Pythran modules included by default (the extensions can be excluded by setting the environment variable SCIPY_USE_PYTHRAN=0 when building).
>>> import scipy.interpolate._rbfinterp_pythran
>>> help(scipy.interpolate._rbfinterp_pythran)
Help on module scipy.interpolate._rbfinterp_pythran in scipy.interpolate:

NAME
    scipy.interpolate._rbfinterp_pythran

DATA
    __pythran__ = ('0.9.11', '2021-06-19 21:59:20.760235', '3c30425550c454...

FILE
    ...\lib\site-packages\scipy\interpolate\_rbfinterp_pythran.cp39-win_amd64.pyd
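A shorter programmatic check, a sketch based on the help output above (Pythran-built extension modules expose a __pythran__ attribute):
import scipy.interpolate._rbfinterp_pythran as m

# __pythran__ is present only when the module was compiled with Pythran
print(hasattr(m, '__pythran__'))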

Gohlke's numpy + mkl installation - Change MKL install directory on Windows

I've been trying to get a working fast numpy with BLAS on Windows, and so far, the only method that seems feasible is downloading the precompiled library with MKL from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy.
So far so good, but checking numpy.__config__.show() afterwards, I see it points to directories that don't exist, such as C:\program files (x86)\IntelSWTools.
I assume numpy is trying to place the MKL libraries in this directory, but I have no administration privileges for creating files in C:\program files (x86).
Is there any simple way to use this numpy distribution and install the MKL libs in another directory? Such as a pip install filename.whl --some_option_to_install_mkl_in_another_dir?
(Windows 7 64bit, python 3.5.2)
Already attempted:
Use pip install <package> --user: it seems to install everything exactly the same way as the same command without --user. (My default installation folder is already the user folder.)
Use pip install <package> --root <some_path>: installs everything in the passed path, but NumPy's config still points to C:\program files (x86)\IntelSWTools, and Python cannot find numpy, even if I add <some_path> to both the PATH and PYTHONPATH environment vars.
Tried to create the pip.ini file, with the lines [global] and target=E:\destination. The destination folder remains untouched.
Rename the wheel file to .zip, find all files containing the IntelSWTools folder, and change all those paths to one that I have access to. Make it a wheel file again and pip install. Absolutely no file appears in the folder I chose, but numpy's config is pointing to that folder. -- This makes me wonder: does this distribution really install MKL?
Numpy+MKL does not place (or try to place) MKL libraries in C:\program files (x86)\IntelSWTools. The MKL runtime DLLs necessary to use numpy+MKL are copied to sys.prefix\Lib\site-packages\numpy\core during installation with pip.
C:\program files (x86)\IntelSWTools is the location of the MKL development files (link libraries, header files, DLLs, documentation) that were used to build numpy+MKL. If you want to build other software from source that relies on MKL development files, you need to download MKL from Intel.
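If you want to verify that, here is a small sketch (the glob pattern for the MKL runtime DLLs is an assumption) that lists what the wheel copied into numpy\core:
import glob
import os
import sys

# numpy+MKL copies the MKL runtime DLLs here during pip installation
core = os.path.join(sys.prefix, 'Lib', 'site-packages', 'numpy', 'core')
print(glob.glob(os.path.join(core, 'mkl*.dll')))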
I have tried something like this:
pip install --install-option="--prefix=$PREFIX_PATH" package_name
In the above line:
$PREFIX_PATH: change this to the path you want to specify.
package_name: change this to the desired package name or the wheel file.
On Windows, I tried the above and it did not work, but the following does:
python.exe -m pip install --target=c:\data\ pandas
pandas then gets stored in the data folder. The only remaining thing to do is to tell Python about that path so it will fetch the proper library: go into the data folder and run python, and you will be able to access the library.
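One way to do that from a script (the path is the one used above) is to put the --target directory on sys.path before importing:
import sys

# make packages installed with --target=c:\data\ importable
sys.path.insert(0, r'c:\data')
import pandas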
Hope this helps you.

cannot install scipy on openshift

I want to install scikit-learn but this library needs scipy and numpy too.
I tried to add them to setup.py, but I got an error with numpy. I managed to install scikit-learn and numpy from the virtualenv, but I cannot install scipy.
I tried pip install scipy. The procedure finished without any problem, but there isn't any scipy folder in site-packages.
Also, I tried to add only scipy to setup.py. Same as above: the procedure finished without an error, but scipy isn't there.
Any help?
I don't know OpenShift, but maybe you can adapt the work that was done to install ATLAS / numpy / scipy / scikit-learn on Heroku:
https://github.com/dbrgn/heroku-buildpack-python-sklearn
In particular building scipy from source (using pip) requires a fortran compiler (e.g. gfortran) which is probably not installed on OpenShift by default.
Edit: a possible alternative would be to build binary packages for numpy, scipy and scikit-learn using the wheel format and then point the pip install command to an OpenShift blob store that hosts the pre-built packages.
To make sure that the wheel packages will work on OpenShift, you will have to build them on the same OS (I think it's Red Hat 6).
Edit #2: the manylinux1 platform tag was designed to solve this issue and makes it possible to embed the third-party libraries you need inside the wheel package. There should be official numpy and scipy wheel files for x86_64 Linux. In the meantime you can build them yourself by following the instructions at: https://github.com/pypa/manylinux
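A rough sketch of that workflow (package pins and the hosting location are placeholders): build the wheels once on a matching machine, then point pip at them:
pip wheel numpy scipy scikit-learn -w ./wheelhouse
pip install --no-index --find-links=./wheelhouse scikit-learn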
You will probably find more info by sshing into your app and typing tail_all.

Use local directories when installing matplotlib from source

I installed freetype-2.4.10, libpng-1.5.12, and zlib-1.2.7 into local directories. Now, in matplotlib I would like to do:
python setup.py install
When I did this for lxml I did something like:
python setup.py install --with-xml2-config=/home/test/libxml2/bin/xml2-config --with-xslt-config=/home/test/libxslt/bin/xslt-config
How can I point matplotlib to the proper freetype, libpng and zlib libraries?
From the INSTALL file:
If you have installed prerequisites to nonstandard places and need to inform matplotlib where they are, edit setupext.py and add the base dirs to the basedir dictionary entry for your sys.platform. e.g., if the header to some required library is in /some/path/include/someheader.h, put /some/path in the basedir list for your platform.
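For example, a hypothetical edit of setupext.py along those lines; the 'linux2' key and the local install paths are assumptions based on the libraries from the question:
basedir = {
    # ...existing entries for other platforms...
    'linux2': ['/usr/local', '/usr',
               '/home/test/freetype-2.4.10',
               '/home/test/libpng-1.5.12',
               '/home/test/zlib-1.2.7'],
}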
