Namespace package conflict - python

I have a package that I'm working on (LDB_Algebra). It has an extra that depends on another package that I created (LDB_LAPACK). I have created a virtualenv and installed each of these packages as shown:
$ virtualenv -p pypy ve_pypy
$ . ve_pypy/bin/activate
(ve_pypy) $ pip install LDB_LAPACK
...
(ve_pypy) $ python setup.py install
... (Installs LDB_Algebra)
Each has the following for its __init__.py file under the ldb package:
__import__('pkg_resources').declare_namespace(__name__)
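For context, each project lays the namespace out roughly like this (a sketch; only the sub-package differs between the two):
ldb/
    __init__.py        # contains only the declare_namespace line above
    lapack/            # algebra/ in the LDB_Algebra project
        __init__.py
        ...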
Problem:
The trouble is that when I try to use ldb.algebra it reports that it can't find the package. Just to make sure it hasn't completely lost everything, I attempt to import ldb.lapack and that works fine. This suggests to me that I'm having a namespace package problem. It seems a similar question has been asked before (with no answer, sadly). Upon investigating the directory structure of my virtualenv I find that under ve_pypy/site-packages/ there is a folder for the ldb namespace package which includes the lapack package but not the algebra package. I also see an egg file, LDB_Algebra-0.3.2-py2.7.egg. Inside this egg file, in the ldb directory, is an __init__.py file with the appropriate namespace declaration (as above). Presumably this is supposed to be where it gets the ldb.algebra package from, but it's not looking there.
Questions:
Can anyone confirm with a reference that what I'm seeing is a known issue (i.e. that I'm not just doing something slightly wrong that's causing all these troubles)? Are eggs and whatever the pip install method created (the ldb package directory under site-packages) fundamentally incompatible?
Assuming that the answer to the first question is that my method of installing packages is fundamentally flawed, is there an easier way of installing the LDB_LAPACK package from PyPI and the LDB_Algebra package from my local directory? I'm not a setuptools wiz or anything, so the answer may be very straightforward (don't overlook the obvious).

Apparently this is a well-known problem. The solution that was suggested to me, and which seems to work fine, is to use pip install . instead of python setup.py install.
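In other words, the working sequence is (running the last command from the LDB_Algebra source directory):
$ . ve_pypy/bin/activate
(ve_pypy) $ pip install LDB_LAPACK
(ve_pypy) $ pip install .    # instead of python setup.py install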

Related

Installing third party modules in python3 - Ubuntu

In short, my question is, how do I install the latest version of scikit-image into my usr/lib/python3/dist-packages so I can actually use it? I think there is a problem with my understanding of how third-party modules are installed. As a newb, I don’t know how to rectify that, hence this post.
I need help understanding how to install packages in python3. Up until now I have used pip/pip3/apt-get/synaptic etc. and it has worked fine for many packages. However, I have hit several barriers (Skimage, opencv, plantcv in python3). I must emphasise, the problem I am having is using these packages in python3, not 2.7.
For example, I want to use the latest version of scikit-image (0.14) with python3. (http://scikit-image.org/) I have tried using the installation instructions and have not yet successfully managed to install it. I have navigated to my usr/lib/python3/dist-packages and copied scikit-image into this directory (I have all the dependencies installed in here already).
Image of my folder for dist-packages as proof
As you can see, the folder containing skimage is in the directory I want it to be installed in, so how do I actually install it? Do I have to extract skimage out of the folder into the directory and then run the install command? If I navigate to usr/lib/python3/dist-packages/scikit-image and then run pip install -e . I get an error stating that I need numpy. If I write a python script using python3 I can clearly see I have numpy installed (and I have been using it for a long time). So, there must be a problem with how I have this package in my file system. I think a janky workaround would be to copy all the modules into my working directory and import them as if they were modules I have made myself, but this obviously negates the whole point of installing packages.
This has also happened with another package called plantcv, where I went into the directory usr/lib/python3/dist-packages, cloned the source from GitHub and installed as per the instructions. When I import plantcv in my python3 script, it imports fine. But there is nothing in it, as python cannot see the modules which are inside this folder at usr/lib/python3/dist-packages/plantcv/plantcv.
There is clearly some comprehension here that I am missing, as I have a similar problem for two packages now. Please, Internet. Help me understand what I am missing!
You simply need to copy the folder to /usr/lib/python3/dist-packages/package-name.
However, there are certain things that are specific to python packages. The folder named package-name should be a valid package. A good indicator of that is that it will contain an "__init__.py" file. It is very likely that every sub-directory inside this package directory will also contain an "__init__.py" file; it depends on whether there are modules inside those sub-directories.
In your code, simply import the package like the following:
import package_name
where package_name can be, for example, skimage.
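As a quick sanity check, you can ask Python which copy it actually imports (a minimal sketch; skimage exposes __version__):
import skimage
print(skimage.__version__)   # the version that actually got imported
print(skimage.__file__)      # the location it was imported from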

How should I write a simple installer for python package?

I'm writing a simple python package with many helper functions to be used with other projects.
How should I handle packaging and make an installer for said package?
Directory layout
package:
package (/git/dev/package)
|
+-- __init__.py
|
+-- foo.py
|
+-- bar.py
project using said package:
project (/git/dev/project)
|
+-- project.py
How do I make this package available to every local python project (I don't need to distribute it publicly)? Should the installer add the current package location to the path, or use some other way?
Current preferred workflow:
1. checkout package from version control
2. do something so python finds and can use that said package
3. use package in some project
4. edit package (project should use edited package even before I push those changes to repo)
File contents
project.py:
# Doesn't work currently since package isn't added to path
from package import foo
from package import bar
foo.do_stuff()
bar.do_things()
How do I make this package available to every python project?
The standard way to distribute packages is as source distributions on PyPI (or a private pip-compatible repository). The details take up many pages, so I can't explain them all here, but the Python Packaging User Guide has everything you want to know.
The basic idea is that you create a setup.py file that tells Python how to install your program. Look at the official sample linked from the tutorial.
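For the layout in your question, a minimal setup.py might look something like this (a sketch; it assumes setup.py sits at the project root with the package/ directory, containing __init__.py, foo.py and bar.py, beside it, and the name and version are placeholders):
# setup.py -- minimal sketch
from setuptools import setup, find_packages

setup(
    name='package',            # hypothetical distribution name
    version='0.1',
    packages=find_packages(),  # picks up package/ via its __init__.py
)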
Now, someone can just download and unzip your package and run python setup.py install, and it will install your program for them. Or, better, they can run pip install . from that directory.
But, even better, you can run python setup.py sdist to create a source distribution, test it out, upload it to PyPI, and then users don't have to download anything; they can just pip install mylib.
If you want binary installations for different platforms, setuptools knows how to make both Windows installers, which can be double-clicked and run, and wheel files, which can be installed with pip (in fact, it'll automatically find and use wheels instead of source distributions). If you don't have any compiled C extensions, python setup.py bdist_wheel --universal is enough to build a wheel that works everywhere.
Should the installer add the current package location to the path, or use some other way?
The installer should install the package into the user's system or user (or current-virtual-environment-system) site-packages. If you use setuptools, pip will take care of this automatically.
If you're not trying to make this public, you can still use setuptools and pip. Just check the source distribution into source control, then you can install the latest version at any time like this:
pip install --upgrade git+https://github.com/mycompany/mylib
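For step 4 of your workflow (using edited code before you push it), an editable install points the environment at your working copy instead of copying files into site-packages:
pip install -e /git/dev/package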
This also means you can skip a lot of the PyPI metadata in your setup.py file (e.g., nobody cares about your project's classifiers if it's not going to end up in the PyPI repository).
But you can still take advantage of all of the parts of the Python packaging system that make your life easier, while skipping the parts that aren't relevant.

Can we shed some definitive light on how python packaging and import works?

I've had my fair share of attempts at getting through python's management of modules, and every time it's a challenge: packaging is not what people do every day, and it becomes a burden to learn and a burden to remember, even when you actually do it, since this normally happens only once.
I would like to collect here the definitive overview of how import, package management and distribution work in python, so that this question becomes the definitive explanation for all the magic that happens under the hood. Although I understand the broad scope of the question, these things are so intertwined that any focused answer will not solve the main problem: understanding how it all works, what is outdated, what is current, what are just alternatives for the same task, and what the quirks are.
The list of keywords to refer to is the following, but this is just a sample out of the bunch. There's a lot more and you are welcome to add additional details.
PyPI
setuptools / Distribute
distutils
eggs
egg-link
pip
zipimport
site.py
site-packages
.pth files
virtualenv
handling of compiled modules in eggs (with and without installation via easy_install)
use of get_data()
pypm
bento
PEP 376
the cheese shop
eggsecutable
Linking to other answers is probably a good idea. As I said, this question is for the high-level overview.
For the most part, this is an attempt to look at the packaging/distribution side, not the mechanics of import. Unfortunately, packaging is the place where Python provides way more than one way to do it. I'm just trying to get the ball rolling; hopefully others will help fill in what I miss or point out mistakes.
First of all there's some messy terminology here. A directory containing an __init__.py file is a package. However, most of what we're talking about here are specific versions of packages published on PyPI, one of its mirrors, or in a vendor-specific package management system like Debian's Apt, Red Hat's Yum, Fink, MacPorts, Homebrew, or ActiveState's pypm.
These published packages are what folks are trying to call "distributions" going forward, in an attempt to reserve "package" for the Python language construct. You can see some of that usage in PEP 376.
Now, your list of keywords relate to several different aspects of the Python Ecosystem:
Finding and publishing python distributions:
PyPI (aka the cheese shop)
PyPI Mirrors
Various package management tools / systems: apt, yum, fink, macports, homebrew
pypm (ActiveState's alternative to PyPI)
The above are all services that provide a place to publish Python distributions in various formats. Some, like PyPI mirrors and apt / yum repositories, can be run on your local machine or within your company's network, but folks typically use the official ones. Most, if not all, provide a tool (or multiple tools, in the case of PyPI) to help find and download distributions.
Libraries used to create and install distributions:
setuptools / Distribute
distutils
Distutils is the standard infrastructure on which Python packages are compiled and built into distributions. There's a ton of functionality in distutils but most folks just know:
from distutils.core import setup
setup(name='Distutils',
      version='1.0',
      description='Python Distribution Utilities',
      author='Greg Ward',
      author_email='gward@python.net',
      url='http://www.python.org/sigs/distutils-sig/',
      packages=['distutils', 'distutils.command'],
      )
And to some extent that's most of what you need. With the prior nine lines of code you have enough information to install a pure Python package, and also the minimal metadata required to publish that package as a distribution on PyPI.
Setuptools provides the hooks necessary to support the Egg format and all of its features and foibles. Distribute is an alternative to Setuptools that adds some features while trying to be mostly backwards compatible. I believe Distribute is going to be included in Python 3 as the successor to Distutils' from distutils.core import setup.
Both Setuptools and Distribute provide a custom version of the distutils setup command that does useful things like support the Egg format.
Python Distribution Formats:
source
eggs
Distributions are typically provided as source archives (tarballs or zipfiles). The standard way to install a source distribution is to download and uncompress the archive and then run the setup.py file inside.
For example, the following will download, build, and install the Pygments syntax highlighting library:
curl -O -G http://pypi.python.org/packages/source/P/Pygments/Pygments-1.4.tar.gz
tar -zxvf Pygments-1.4.tar.gz
cd Pygments-1.4
python setup.py build
sudo python setup.py install
Alternatively you can download the Egg file and install it. Typically this is accomplished by using easy_install or pip:
sudo easy_install pygments
or
sudo pip install pygments
Eggs were inspired by Java's Jarfiles, and they have quite a few features worth reading about in the setuptools documentation.
Python Package Formats:
uncompressed directories
zipimport (zip compressed directories)
A normal python package is just a directory containing an __init__.py file and an arbitrary number of additional modules or sub-packages. Python also has support for finding and loading source code within *.zip files as long as they are included on the PYTHONPATH (sys.path).
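A minimal sketch of the zip case (assuming a hypothetical archive /home/myname/lib.zip with a mymodule.py at its top level):
import sys
sys.path.insert(0, '/home/myname/lib.zip')  # put the archive on the path
import mymodule                             # loaded via zipimport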
Installing Python Packages:
easy_install: the original egg installation tool, depends on setuptools
pip: currently the most popular way to install python packages. Similar to easy_install but more flexible, with some nice features like requirements files to help document dependencies and reproduce deployments (see the sketch after this list).
pypm, apt, yum, fink, etc
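A quick sketch of the requirements-file feature mentioned above (the file name and pins are hypothetical):
# requirements.txt
Pygments==1.4
SQLAlchemy>=0.6
Everything it lists can then be installed in one shot with:
pip install -r requirements.txt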
Environment Management / Automated Deployment:
bento
buildout
virtualenv (and virtualenvwrapper)
The above tools are used to help automate and manage dependencies for a Python project. Basically they give you tools to describe what distributions your application requires and automate the installation of those specific versions of your dependencies.
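For virtualenv specifically, the basic flow looks like this (a sketch; the environment name is arbitrary, and --no-site-packages is the isolation flag of this era):
virtualenv --no-site-packages myenv   # create an isolated environment
. myenv/bin/activate                  # make it the active interpreter
pip install Pygments                  # installs into myenv only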
Locations of Packages / Distributions:
site-packages
PYTHONPATH
the current working directory (depends on your OS and environment settings)
By default, installing a python distribution is going to drop it into the site-packages directory. That directory is usually something like /usr/lib/pythonX.Y/site-packages.
A simple programmatic way to find your site-packages directory:
from distutils import sysconfig
print sysconfig.get_python_lib()
Ways to modify your PYTHONPATH:
Python's import statement will only find packages that are located in one of the directories included in your PYTHONPATH.
You can inspect and change your path from within Python by accessing:
import sys
print sys.path
sys.path.append("/home/myname/lib")
Besides that, you can set the PYTHONPATH environment variable like you would any other environment variable on your OS or you could use:
.pth files: *.pth files located in directories that are already on your PYTHONPATH are read, and each line of the *.pth file is added to your PYTHONPATH. Basically, any time you would copy a package into a directory on your PYTHONPATH, you could instead create a mypackages.pth. Read more about *.pth files in the documentation for the site module.
egg-link files: part of the internal structure of python eggs, they are a cross-platform alternative to symbolic links. Creating an egg-link file is similar to creating a *.pth file.
site.py modifications
To add the above /home/myname/lib to site-packages with a *.pth file, you'd create one. The name of the file doesn't matter, but you should still probably choose something sensible.
Let's create myname.pth:
# myname.pth
/home/myname/lib
That's it. Drop that into the directory returned by sysconfig.get_python_lib() on your system, or any other directory on your PYTHONPATH, and /home/myname/lib will be added to the path.
For the packaging question, this should help: http://guide.python-distribute.org/
For import, the old article from Fredrik Lundh, http://effbot.org/zone/import-confusion.htm, is still a very good starting point.
I recommend Tarek Ziadé's book on Python. There's a chapter dedicated to packaging and distribution.
I don't think import needs to be explored (Python's namespacing and importing functionality is intuitive IMHO).
I use pip exclusively now. I haven't run into any issues with it.
However, the topic of packaging and distribution is something worth exploring. Instead of giving a lengthy answer, I will say this:
I learned how to package and distribute my own "packages" by simply copying how Pylons or many other open-source packages do it. I then combined that sort-of template with reading up on the docs to flesh it out even further, and have come up with a solid distribution method.
When you grok package management and distribution for python (distutils and pypi) it's actually quite powerful. I like it a lot.
[edit]
I also wanted to add in a bit about virtualenv. USE IT. I create a virtualenv for every project and I always use --no-site-packages; I install all the packages I need for that particular project (even if it's something common amongst them all, like lxml) inside the virtualenv. It keeps everything isolated and it's much easier for me to maintain the grouping in my head (rather than trying to keep track of what's where and for which version of python!)
[/edit]

Migrating to pip+virtualenv from setuptools

So pip and virtualenv sound wonderful compared to setuptools. Being able to uninstall would be great. But my project is already using setuptools, so how do I migrate? The web sites I've been able to find so far are very vague and general. So here's an anthology of questions after reading the main web sites and trying stuff out:
First of all, are virtualenv and pip supposed to be in a usable state by now? If not, please disregard the rest as the ravings of a madman.
How should virtualenv be installed? I'm not quite ready to believe it's as convoluted as explained elsewhere.
Is there a set of tested instructions for how to install matplotlib in a virtual environment? For some reason it always wants to compile it here instead of just installing a package, and it always ends in failure (even after build-dep which took up 250 MB of disk space). After a whole bunch of warnings it prints src/mplutils.cpp:17: error: ‘vsprintf’ was not declared in this scope.
How does either tool interact with setup.py? pip is supposed to replace easy_install, but it's not clear whether it's a drop-in or more complicated relationship.
Is virtualenv only for development mode, or should the users also install it?
Will the resulting package be installed with the minimum requirements (like the current egg), or will it be installed with sources & binaries for all dependencies plus all the build tools, creating a gigabyte monster in the virtual environment?
Will the users have to modify their $PATH and $PYTHONPATH to run the resulting package if it's installed in a virtual environment?
Do I need to create a script from a text string for virtualenv like in the bad old days?
What is with the #egg=Package URL syntax? That's not part of the standard URL, so why isn't it a separate parameter?
Where is #rev included in the URL? At the end I suppose, but the documentation is not clear about this ("You can also include #rev in the URL").
What is supposed to be understood by using an existing requirements file "as a sort of template for the new file"? This could mean any number of things.
Wow, that's quite a set of questions. Many of them would really deserve their own SO question with more details. I'll do my best:
First of all, are virtualenv and pip supposed to be in a usable state by now?
Yes, although they don't serve everyone's needs. Pip and virtualenv (along with everything else in Python package management) are far from perfect, but they are widely used and depended upon nonetheless.
How should virtualenv be installed? I'm not quite ready to believe it's as convoluted as explained elsewhere.
The answer you link to is complex because it is trying to avoid making any changes at all to your global Python installation and to install everything in ~/.local instead. This has some advantages, but is more complex to set up. It's also installing virtualenvwrapper, which is a set of convenience bash scripts for working with virtualenv, but which is not necessary for using virtualenv.
If you are on Ubuntu, aptitude install python-setuptools followed by easy_install virtualenv should get you a working virtualenv installation without doing any damage to your global python environment (unless you also had the Ubuntu virtualenv package installed, which I don't recommend as it will likely be an old version).
Is there a set of tested instructions for how to install matplotlib in a virtual environment? For some reason it always wants to compile it here instead of just installing a package, and it always ends in failure (even after build-dep which took up 250 MB of disk space). After a whole bunch of warnings it prints src/mplutils.cpp:17: error: ‘vsprintf’ was not declared in this scope.
It "always wants to compile" because pip, by design, installs only from source, it doesn't install pre-compiled binaries. This is a controversial choice, and is probably the primary reason why pip has seen widest adoption among Python web developers, who use more pure-Python packages and commonly develop and deploy in POSIX environments where a working compilation chain is standard.
The reason for the design choice is that providing precompiled binaries has a combinatorial explosion problem with different platforms and build architectures (including python version, UCS-2 vs UCS-4 python builds, 32 vs 64-bit...). The way easy_install finds the right binary package on PyPI sort of works, most of the time, but doesn't account for all these factors and can break. So pip just avoids that issue altogether (replacing it with a requirement that you have a working compilation environment).
In many cases, packages that require C compilation also have a slower-moving release schedule and it's acceptable to simply install OS packages for them instead. This doesn't allow working with different versions of them in different virtualenvs, though.
I don't know what's causing your compilation error, it works for me (on Ubuntu 10.10) with this series of commands:
virtualenv --no-site-packages tmp
. tmp/bin/activate
pip install numpy
pip install -f http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0.1/matplotlib-1.0.1.tar.gz matplotlib
The "-f" link is necessary to get the most recent version, due to matplotlib's unusual download URLs on PyPI.
How does either tool interact with setup.py? pip is supposed to replace easy_install, but it's not clear whether it's a drop-in or more complicated relationship.
The setup.py file is a convention of distutils, the Python standard library's package management "solution." distutils alone is missing some key features, and setuptools is a widely-used third-party package that "embraces and extends" distutils to provide some additional features. setuptools also uses setup.py. easy_install is the installer bundled with setuptools. Setuptools development stalled for several years, and distribute was a fork of setuptools to fix some longstanding bugs. Eventually the fork was resolved with a merge of distribute back into setuptools, and setuptools development is now active again (with a new maintainer).
distutils2 was a mostly-rewritten new version of distutils that attempted to incorporate the best ideas from setuptools/distribute, and was supposed to become part of the Python standard library. Unfortunately this effort failed, so for the time being setuptools remains the de facto standard for Python packaging.
Pip replaces easy_install, but it does not replace setuptools; it requires setuptools and builds on top of it. Thus it also uses setup.py.
Is virtualenv only for development mode, or should the users also install it?
There's no single right answer to that; it can be used either way. In the end it's really your user's choice, and your software ideally should be able to be installed inside or out of a virtualenv; though you might choose to document and emphasize one approach or the other. It depends very much on who your users are and what environments they are likely to need to install your software into.
Will the resulting package be installed with the minimum requirements (like the current egg), or will it be installed with sources & binaries for all dependencies plus all the build tools, creating a gigabyte monster in the virtual environment?
If a package that requires compilation is installed via pip, it will need to be compiled from source. That also applies to any dependencies that require compilation.
This is unrelated to the question of whether you use a virtualenv. easy_install is available by default in a virtualenv and works just fine there. It can install pre-compiled binary eggs, just like it does outside of a virtualenv.
Will the users have to modify their $PATH and $PYTHONPATH to run the resulting package if it's installed in a virtual environment?
In order to use anything installed in a virtualenv, you need to use the python binary in the virtualenv's bin/ directory (or another script installed into the virtualenv that references this binary). The most common way to do this is to use the virtualenv's activate or activate.bat script to temporarily modify the shell PATH so the virtualenv's bin/ directory is first. Modifying PYTHONPATH is not generally useful or necessary with virtualenv.
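So a typical session looks something like this (a sketch; myenv is a placeholder name):
$ . myenv/bin/activate        # prepends myenv/bin to PATH
(myenv) $ which python        # now resolves to myenv/bin/python
(myenv) $ deactivate          # restores the previous PATH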
Do I need to create a script from a text string for virtualenv like in the bad old days?
No.
What is with the #egg=Package URL syntax? That's not part of the standard URL, so why isn't it a separate parameter?
The "#egg=projectname-version" URL fragment hack was first introduced by setuptools and easy_install. Since easy_install scrapes links from the web to find candidate distributions to install for a given package name and version, this hack allowed package authors to add links on PyPI that easy_install could understand, even if they didn't use easy_install's standard naming conventions for their files.
Where is #rev included in the URL? At the end I suppose, but the documentation is not clear about this ("You can also include #rev in the URL").
A couple sentences after that quoted fragment there is a link to "read the requirements file format to learn about other features." The #rev feature is fully documented and demonstrated there.
What is supposed to be understood by using an existing requirements file "as a sort of template for the new file"? This could mean any number of things.
The very next sentence says "it will keep the packages listed in devel-req.txt in order and preserve comments." I'm not sure what would be a better concise description.
I can't answer all your questions, but hopefully the following helps.
Both virtualenv and pip are very usable. Many Python devs use these everyday.
Since you have a working easy_install, the easiest way to install both is the following:
easy_install pip
easy_install virtualenv
Once you have virtualenv, just type virtualenv yourEnvName and you'll get your new python virtual environment in a directory named yourEnvName.
From there, it's as easy as source yourEnvName/bin/activate, and the virtual python interpreter will be the active one. I know nothing about matplotlib, but following the installation instructions should work out ok unless there are weird hard-coded path issues.
If you can install something via easy_install you can usually install it via pip. I haven't found anything that easy_install could do that pip couldn't.
I wouldn't count on users being able to install virtualenv (it depends on who your users are). Technically, a virtual python interpreter can be treated as a real one for most cases. Its main use is not cluttering up the real interpreter's site-packages, and helping when you have two libraries/apps that require different and incompatible versions of the same library.
If you or a user install something in a virtualenv, it won't be available in other virtualenvs or the system Python interpreter. You'll need to use the source /path/to/yourvirtualenv/bin/activate command to switch to the virtual environment you installed the library in.
What they mean by "as a sort of template for the new file" is that the pip freeze -r devel-req.txt > stable-req.txt command will create a new file stable-req.txt based on the existing file devel-req.txt. The only difference will be that anything installed but not already specified in the existing file will be added to the new file.
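For example (hypothetical file contents, just to illustrate the shape of the output):
$ cat devel-req.txt
# core dependencies
Pygments
$ pip freeze -r devel-req.txt > stable-req.txt
$ cat stable-req.txt
# core dependencies
Pygments==1.4
(any other installed distributions follow, pinned to their installed versions)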

'setup.py test' egg install location?

I have all the eggs my project requires pre-downloaded in a directory, and I would like setuptools to only install packages from that directory.
In my setup.cfg I have:
[easy_install]
allow_hosts = None
find_links = ../../setup
I run python setup.py develop and it finds and installs all the packages correctly.
For testing, I have an additional requirement, specified in setup.py.
tests_require=["pinocchio==0.2"],
This egg also resides locally in the ../../setup directory.
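For context, the line sits in my setup.py roughly like this (a sketch; the values around the one real line are placeholders):
from setuptools import setup

setup(
    name="myproject",                  # placeholder
    version="0.1",                     # placeholder
    tests_require=["pinocchio==0.2"],  # fetched only by 'setup.py test'
    test_suite="tests",                # hypothetical test suite name
)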
I run python setup.py test and it sees the dependency and finds the egg in ../../setup just fine. However, the egg gets installed to my current directory instead of the site-packages directory with the rest of the eggs.
I've tried specifying the install-dir both in setup.cfg and on the command line, and neither seemed to work for the test command.
I could just add the dependency to the install_requires section, but I'd like to keep what is required for installation and tests separate if possible.
How can I keep the dependency in the tests_require section, but have it installed to the site-packages directory?
Just looking at the source code (setuptools/command/test.py), it looks like setup.py test is not supposed to install anything by design (it is testing, so why put anything in site-packages?). It uses fetch_build_egg (setuptools/dist.py) to get the eggs, which actually does a local easy_install. I suspect you can't trivially make test do what you want.
Notes/ideas:
My experience with setuptools is that there are bugs in it and undocumented behavior. (One especially nasty trip-up I found was that it wouldn't enter softlinked directories, when distutils would.)
I'd recommend either A) not doing this :), B) manually installing the file by calling easy_install package, or C) looking into the setuptools system and maybe adding your own command. It isn't too difficult to understand, and knowing it will help a lot when you get future setuptools hiccups.
