Numpy installing via PyPI vs distro package manager - python

This is probably a trivial question and maybe even a duplicate.
What is the difference between numpy/scipy installed from PyPI and the version installed from a distribution's repository, say on Ubuntu using apt-get? I have a vague idea: numpy installed from PyPI requires a lot of other tools, like gcc and gfortran, before it can build. I am guessing a distro's numpy package comes with all of these tools? I'm not sure if this is the right picture.
If so, with PyPI I can install numpy and scipy for a particular version of Python, depending on which python I am pointing to. Using apt-get, can you install numpy and scipy for a specific version of Python? Does apt-get use the version of Python I am pointing to?

The main difference is that with pip you always get a fresh version, while the Ubuntu repository always carries a slightly outdated version of a Python package.
And yes, you can install, for example, python-numpy or python3-numpy, and that will download all dependencies -> http://packages.ubuntu.com/precise/python-numpy.
The same goes for PyPI: you can use pip/pip3 to install the package you want, but that can be more 'tricky', because sometimes you must find a dependency manually. Take ipython-notebook: when you install it with apt-get, everything is downloaded and you don't have to care about dependencies, but when you want a fresh version and get it with pip, you must also install tornado, jsonschema and pyzmq manually using pip.
With both pip and apt-get you can install numpy/scikit for different Python versions (in Ubuntu the default version of Python is 2.7, so when you install something for Python 3 you must add a 3 ;) ):
apt-get install python-numpy / pip install numpy
or
apt-get install python3-numpy / pip3 install numpy
and the same with scikit :)

The majority of Linux distributions have a package manager that installs pre-compiled binary packages. In the case of numpy/scipy they would thus install Python source code with the precompiled C/Fortran extensions. No C/Fortran compilers are necessary for the install.
pip, on the other hand, is a package manager for Python that installs from PyPI and is, very roughly, a wrapper around the python setup.py install command. When it has to build from a source distribution, it compiles the necessary C/Fortran extensions from source, which requires compilers such as gcc and gfortran to be present on the system. This takes longer (~15 min for numpy) but has the advantage that the build can potentially be optimized with compilation flags for the current CPU architecture and therefore be marginally faster (that shouldn't matter much in practice though).
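If you are not sure which of the two routes put a package on your machine, here is a rough sketch for checking from Python itself. It assumes a Python 3.8+ interpreter (for importlib.metadata) and relies on the INSTALLER file that pip writes next to a package's metadata; distro packages may record something else there, or nothing at all, so treat the result as a hint rather than proof. The helper names installer_of and describe are made up for this example.

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the recorded installer of a distribution, or None if unknown."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None  # not installed for the interpreter running this script
    text = dist.read_text("INSTALLER")  # None when the file is absent
    return text.strip() if text else None


def describe(dist_name):
    """Build a one-line summary: name, version and recorded installer."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return f"{dist_name}: not installed for this interpreter"
    installer = installer_of(dist_name) or "unknown"
    return f"{dist_name} {version}, installer: {installer}"


if __name__ == "__main__":
    for name in ("numpy", "scipy"):
        print(describe(name))
```

Run it with the exact interpreter you care about (python, python3, or the one in your virtualenv), since each interpreter only sees its own packages.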

Related

Installing f2py in ubuntu

I wonder if anyone could help me with one issue: I am using ubuntu 12.04 and I wanted to install f2py. However the version found here:
https://sysbio.ioc.ee/projects/f2py2e/index.html#installation
gives me an error with Python 2.7.6. Many users run into this issue because the word "as" became a keyword in Python 2.6 (http://comments.gmane.org/gmane.comp.python.f2py.user/1802)
So what is the up-to-date way to install f2py? Or should I use the one from numpy?
Thanks
Vital
The version you link to is very, very old. The installation instructions refer to Python 2.1!
You'll find a newer version by searching PyPI. But the homepage for that package states that, as of 2007-07-19,
F2PY is now part of NumPy. All the development and maintenance of F2PY is carried out under NumPy SVN tree.
So the easiest way to install f2py on ubuntu is
to install numpy:
sudo apt-get install python-numpy
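To confirm that f2py came along with numpy, you can ask the interpreter whether it can locate the numpy.f2py module. This is only a sketch: the helper name module_available is made up, and it uses importlib.util.find_spec, so it needs Python 3.

```python
import importlib.util


def module_available(name):
    """Return True if the interpreter can locate module `name`.

    For a dotted name, find_spec imports the parent package first, so a
    missing parent raises ModuleNotFoundError (an ImportError subclass).
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    for mod in ("numpy", "numpy.f2py"):
        state = "available" if module_available(mod) else "missing"
        print(f"{mod}: {state}")
```

If numpy.f2py shows up as missing while numpy is available, the numpy install is probably incomplete or belongs to a different interpreter than the one you ran the check with.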

Forcing `pip` to recompile a previously installed package (numpy) after switching to a different Python binary

This question is as much a question about my particular problem (which I sort of found a work-around, so it's not a burning issue) as it is about the general process I am using.
Setup (the part that works):
I have Python 2.7.9 installed locally on my Ubuntu 14.04, and I have a virtualenv in which I am running it. Everything is very much separated from the "system" Python, which I am not touching.
The part I did:
It all started well enough, with my Python installed and all libraries running. For example, I also pip installed numpy 1.10.1, it compiled for a while, then it worked just fine.
The problem:
The problem is that for reasons beyond my control, I had to rebuild the python with ucs4 enabled, that is I installed it using
./configure --enable-unicode=ucs4
After doing this, I also uninstalled all libraries and reinstalled them using pip. However, it seems that the numpy library was not properly uninstalled because it installed instantly this time, and when I tried to import numpy into my new Python, I got an error message indicating that the numpy was compiled with the ucs2-enabled Python.
This hypothesis is pretty solid, since I then tried pip install numpy==1.9.3. The installation once again took a long time, and it produced a numpy version that works on the new ucs4-enabled Python.
Now, my question:
How can I get the numpy uninstallation process to delete all traces of the old numpy?
Edit:
I also tried to manually remove numpy by deleting it from my virtualenv site-packages directory. After deleting, import numpy returned an ImportError as expected. I then reinstalled it (pip install numpy) and it came back with the same ucs2-related error.
Edit 2:
The full sys.path seen by my virtualenv Python is
['',
'/home/jkralj/.virtualenvs/work/lib/python27.zip',
'/home/jkralj/.virtualenvs/work/lib/python2.7',
'/home/jkralj/.virtualenvs/work/lib/python2.7/plat-linux2',
'/home/jkralj/.virtualenvs/work/lib/python2.7/lib-tk',
'/home/jkralj/.virtualenvs/work/lib/python2.7/lib-old',
'/home/jkralj/.virtualenvs/work/lib/python2.7/lib-dynload',
'/usr/local/lib/python2.7.9/lib/python2.7',
'/usr/local/lib/python2.7.9/lib/python2.7/plat-linux2',
'/usr/local/lib/python2.7.9/lib/python2.7/lib-tk',
'/home/jkralj/.virtualenvs/work/lib/python2.7/site-packages']
Also, it might be important to mention that the /usr/local/lib/python2.7.9/ installation of python does not have numpy installed.
You can use --no-binary and --ignore-installed to rebuild a package as follows
pip install --user --force-reinstall --ignore-installed --no-binary :all: PackageName
The problem is solved by pip uninstalling numpy (or any other troublesome package), then running
pip install numpy --no-cache-dir
to prevent pip from simply taking the cached installation and repeating it.
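Putting the two suggestions together, here is a small sketch that first reports which unicode width the running interpreter was built with and then assembles a pip command that skips both the cache and prebuilt binaries for one package. The function names are made up for this example. sys.maxunicode is 0xFFFF on narrow (ucs2) Python 2 builds and 0x10FFFF on wide (ucs4) builds; from Python 3.3 on it is always 0x10FFFF, so the check only matters on Python 2.

```python
import subprocess
import sys


def unicode_build(maxunicode=None):
    """Classify a build as 'ucs2' (narrow) or 'ucs4' (wide) from sys.maxunicode."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "ucs4" if maxunicode > 0xFFFF else "ucs2"


def rebuild_command(package):
    """Build a pip command that ignores the cache and compiles `package` from source."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir",         # do not reuse a previously built copy
        "--force-reinstall",      # reinstall even if a copy is already present
        "--no-binary", package,   # build this package from its source distribution
        package,
    ]


if __name__ == "__main__":
    print("this interpreter is a", unicode_build(), "build")
    cmd = rebuild_command("numpy")
    print("would run:", " ".join(cmd))
    # Uncomment to actually rebuild (this can take several minutes):
    # subprocess.check_call(cmd)
```

Calling pip through sys.executable makes sure the rebuild happens for the interpreter that is running the script, which is the point after switching Python binaries.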

Python update from 2.7.6 to 2.7.8 - did I just lose all my previously installed modules?

Title basically states it all. I upgraded my version of Python in order to hopefully play more nicely with Mac OS 10.9, but am now unable to use some modules I need for my work (NumPy, Pandas, SciPy, Scikit-Learn, etc.) Does this upgrade automatically wipe out any previously installed modules? Do I just need to install them again? Thanks in advance.
When you upgraded, it created a new site-packages directory structure. Your packages are not installed any more, so yes, you need to reinstall them into the new version.
Before you do that, take a good look at virtual environments rather than install the modules and packages globally.
http://docs.python-guide.org/en/latest/dev/virtualenvs will get you started, then google virtualenvwrapper.
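If the old interpreter is still around, it helps to write down what was installed before reinstalling. Below is a sketch of one way to do that, assuming a Python 3.8+ interpreter (importlib.metadata does not exist on 2.7); the function name frozen_requirements is made up. The output is in the name==version form that pip install -r accepts.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted 'name==version' pins for every distribution this interpreter sees."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Redirect the output to a file, e.g. requirements.txt
    print("\n".join(frozen_requirements()))
```

Inside a fresh virtualenv created with the new Python, pip install -r requirements.txt then reinstalls the same set.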
I would recommend you try out the Anaconda Python distribution. It comes with all of these packages pre-installed, and it's free. Also, in addition to pip, you can use the conda package manager, which is much better for scientific packages. See http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html for an explanation.
With conda, you can install numpy/scipy/pandas/etc with conda install numpy scipy pandas and it just works, and takes about 10 seconds. No compilation necessary (OTOH pip install scipy can take over 15 minutes, requires a fortran compiler, and is generally very tricky).
link: http://continuum.io/downloads

Why did matrix multiplication using python's numpy become so slow after upgrading ubuntu from 12.04 to 14.04?

I used to have Ubuntu 12.04 and recently did a fresh installation of Ubuntu 14.04. The stuff I'm working on involves multiplications of big matrices (~2000 X 2000), for which I'm using numpy. The problem I'm having is that now the calculations are taking 10-15 times longer.
Going from Ubuntu 12.04 to 14.04 implied going from Python 2.7.3 to 2.7.6 and from numpy 1.6.1 to 1.8.1. However, I think that the issue might have to do with the linear algebra libraries that numpy is linked to. Instead of libblas.so.3gf and liblapack.so.3gf, I can only find libblas.so.3 and liblapack.so.3.
I also installed libopenblas and libatlas:
$ sudo apt-get install libopenblas-base libatlas3-base
and tried them, but the slowdown doesn't change. So, my questions are:
What's the difference between the packages with and without the "gf"?
Is this possibly causing the slowdown in the matrix multiplications?
If so, how can I go back to libblas.so.3gf and liblapack.so.3gf? They seem to be discontinued in Ubuntu 14.04.
Thanks much!
wim is correct, in that the problem is probably caused by numpy linking to a slower BLAS library (e.g. the reference CBLAS library rather than ATLAS).
You can check which BLAS library is being linked at runtime by calling the ldd utility on one of numpy's compiled shared libraries.
For example, if you installed numpy in the standard location using apt-get:
~$ ldd /usr/lib/python2.7/dist-packages/numpy/core/_dotblas.so
...
libblas.so.3 => /usr/lib/libblas.so.3 (0x00007f01f0188000)
...
This output tells me that numpy is linked against /usr/lib/libblas.so.3. This is usually a symlink to the reference CBLAS library, which is pretty slow.
You could, as wim suggests, remove the version of numpy installed via apt-get and build it yourself, either using pip or by downloading the source directly. However, I would strongly discourage you from using sudo pip install ... to install Python modules system-wide. This is a bad habit to get into, since you run the risk of breaking dependencies in your system-wide Python environment.
It is much safer to either install into your ~/.local/ directory using pip install --user ... or even better, to install into a completely self-contained virtualenv.
Another option would be to use update-alternatives to force your system-wide numpy to link against a different BLAS library. I've written a previous answer here that shows how to do this.
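If you want to run the ldd check on several machines or shared libraries, you can pull the BLAS/LAPACK lines out of its output with a few lines of Python. This is only a sketch: the function name and the list of keywords (blas, lapack, atlas, mkl) are my own choices, and the sample text is the ldd output shown above.

```python
import re

# Matches ldd lines of the form "libname => /path/to/lib (0xaddress)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")
_BLAS_HINT = re.compile(r"blas|lapack|atlas|mkl", re.IGNORECASE)


def linked_blas_libs(ldd_output):
    """Map each BLAS/LAPACK-looking library name in ldd output to its resolved path."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and _BLAS_HINT.search(match.group(1)):
            libs[match.group(1)] = match.group(2)
    return libs


SAMPLE = """\
...
libblas.so.3 => /usr/lib/libblas.so.3 (0x00007f01f0188000)
...
"""

if __name__ == "__main__":
    for name, path in linked_blas_libs(SAMPLE).items():
        print(name, "->", path)
```

To feed it real data, capture the output of ldd with subprocess.check_output and decode it. Since the path is usually a symlink, os.path.realpath on it shows which implementation it finally points to.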
Are you installing numpy through package manager?
If so, I recommend going through pip instead, so you can clearly see in the build output what is being successfully linked during setup.
Remove the apt version (sudo apt-get purge python-numpy)
Install build-deps headers and static libraries (sudo apt-get install libblas-dev liblapack-dev gfortran), maybe there are some others but these are the ones I remember.
pip install numpy

Trouble installing scipy despite having python2.7 and numpy installed already

I'm having trouble installing scipy via the binaries provided at http://sourceforge.net/projects/scipy/files/scipy/
Double clicking on the mpkg file after mounting the dmg installer gives the following error:
"scipy 0.13.0 can't be installed on this disk. scipy requires System Python 2.7 to install"
However, I already have python 2.7 and numpy installed. The python 2.7 came default with OSX Lion, so I assume it is System Python. With other python modules, one normally can download the binary then run
python setup.py install
Is there a way to cd through the mpkg file and locate a setup.py? Any advice install via this dmg installer?
I know there are other ways to manage python modules, like port and brew. However, I already installed a bunch of packages through setup.py, and I couldn't figure out how to get port to recognize those packages (for example, it will try to reinstall python and numpy via port)
Thanks!
If you have Mavericks and XCode 5, then you'll have to install Command Line Tools manually from the Apple Developer Site. I found this helpful post
You've got a few misconceptions here.
With other python modules, one normally can download the binary then run python setup.py install
No, that's what you do with source packages.
Is there a way to cd through the mpkg file and locate a setup.py?
No. What's inside an mpkg are pkg files. Which are filled with xar archives filled with cpio archives. Inside there is the built version of SciPy—that is, the files that setup.py would have copied to your site-packages if you'd run it—not the source package.
But you can download the source package yourself.
Or, better, let pip (or easy_install, but pip is better) download and run the setup.py for you.
Any advice install via this dmg installer?
If it won't work, my advice would be to not use it, and instead install with pip.
This blog post explains it, but I'll give you the details relevant to you below.
I know there are other ways to manage python modules, like port and brew. However, I already installed a bunch of packages through setup.py, and I couldn't figure out how to get port to recognize those packages.
You can't. MacPorts will not touch your system Python; it builds its own separate Python 2.7, with a completely independent site-packages directory and everything else. You would have to reinstall everything for this second Python 2.7. And deal with the confusion of having two Python 2.7 installations on the same machine.
Don't do that unless you absolutely have to.
In fact, if you want to use Homebrew for anything (and you do, see below), uninstall MacPorts, unless you really need it for something.
So, here are the steps:
Uninstall MacPorts.
I assume you already have Xcode and its Command Line Tools.
I assume you already have Homebrew.
Install a Fortran compiler with brew install gfortran.
Lion's Python 2.7 comes with easy_install, but not pip. So sudo easy_install pip to fix that. While you're at it, I'd suggest sudo easy_install readline, because you'll want that for ipython, and it won't work right with pip.
Apple's pre-installed NumPy has to be upgraded, and rebuilt with Fortran support, to make SciPy work. Fix that with sudo pip install --upgrade --force-reinstall numpy.
If you want ipython, pandas, etc. sudo pip install each of them as well.
In case you're considering upgrading soon, the exact same steps worked for me with OS X 10.9.0, except for some extra work to get the Xcode 5 command line tools set up.
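Since most of the confusion here comes from having more than one Python on the same machine, it is worth checking which one you are actually running before and after these steps. A minimal sketch (the function name interpreter_info is made up; sysconfig is in the standard library from Python 2.7 on):

```python
import sys
import sysconfig


def interpreter_info():
    """Collect the details that tell two Python installations apart."""
    return {
        "executable": sys.executable,         # path of the running binary
        "version": sys.version.split()[0],    # e.g. "2.7.5"
        "prefix": sys.prefix,                 # installation root
        "site_packages": sysconfig.get_paths()["purelib"],  # where pure-Python packages are installed
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(key + ":", value)
```

If the executable and site_packages paths differ between the python you type at the prompt and the one pip installs into, packages will seem to vanish right after installing them.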
