How to include non-PyPI packages in a virtualenv requirements file? - python

Is there a way to include packages/modules not available through pip in the requirements file so that the project is portable?
The default version of lxml seems to have issues with pypy so I need to use a custom fork.
The problem is I need Heroku (where I deploy this application) to use a custom version of lxml and not the one that's available via pip. Is there any way to do this?

You can, by using pip's git package syntax. You would need to add the following line to your requirements.txt:
-e git+https://github.com/aglyzov/lxml.git#egg=lxml

Related

Updating dependencies of downstream projects using pip requirements.txt

I have a python package that is used by other applications across an organization, let's call it buildtools.
Other applications in my organization have installed this package via
pip install git+https://${OAUTH_TOKEN}:x-oauth-basic@github.com/my_organization/buildtools#egg=buildtools
I want to add a new feature to buildtools that requires a 3rd-party package, let's just say it's requests. So within buildtools I add requests to requirements.txt, import it, and it's all good.
But none of the other applications in my organization have requests as one of their dependencies in requirements.txt.
When I merge my new code in and update the package, I believe we will run into some ImportError: No module named requests errors in the downstream applications that use buildtools.
How can I ensure that any application that uses the buildtools package gets the requests package installed when they get the latest buildtools?
In other words, how can I update buildtools's dependencies recursively?
I am aware that I could add requests to requirements.txt across all the applications in my organization that uses buildtools, but I'm trying to avoid that.
Why don't you just run
pip install -r requirements.txt
as discussed, e.g. here?
That's the best and most painless way to update/install required packages recursively.
After further research I found that install_requires within setup.py is exactly what I was looking for. This example explains it well.
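For concreteness, here is a minimal sketch of what that could look like in buildtools's setup.py; the version numbers and the find_packages() layout are assumptions, not the actual project's values:
# setup.py for buildtools -- a minimal sketch; names and versions are hypothetical
from setuptools import setup, find_packages

setup(
    name='buildtools',
    version='1.2.0',
    packages=find_packages(),
    # Everything listed here gets installed automatically whenever someone
    # installs buildtools itself (e.g. via the git+https URL above).
    install_requires=[
        'requests>=2.0',
    ],
)
With this in place, downstream applications pick up requests as soon as they reinstall or upgrade buildtools, without touching their own requirements.txt files.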

How to make a private module pip installable?

I have a python package which needs to be installed to run a Django project.
I go into the python virtual environment and clone the module from git into the site-packages folder inside lib.
What I need is to make that module pip installable, and installation access should be given only to specific people, i.e. that module should not be public to everyone.
Build the python package as you normally would for a public build. For helpful step-by-step instructions on that front, check out the python docs
There are a number of ways to maintain both installability and privacy. When I looked into this for my own packages I started with the suggestions at this site. This site includes instructions on how to build your own equivalent of a PyPi server.
The solution I landed on, though, is quite a bit simpler, I feel. I pushed the entire package to a private git repository hosted on github. You can then install using pip install git+[insert full url to your github repository here]. You can enforce privacy by restricting who has access to your git repository.
To make your package part of the requirements, place it where it will be accessible only to the people you want to have access, e.g. on a private github. Then you can add a line like this to any project's requirements.txt, and pip will fetch it and install it:
-e git+ssh://git@github.com/<user>/<package>.git#egg=<package>
(Replace <user> and <package> with your GitHub user and the name of the package you are distributing.) This line is simply added to the list of plain package names that requirements.txt usually contains. You could also replace the git download with an egg placed on a local fileshare, or whatever.

How to `pip install` a package that has Git dependencies?

I have a private library called some-library (actual names have been changed) with a setup file looking somewhat like this:
setup(
    name='some-library',
    # Omitted some less important stuff here...
    install_requires=[
        'some-git-dependency',
        'another-git-dependency',
    ],
    dependency_links=[
        'git+ssh://git@github.com/my-organization/some-git-dependency.git#egg=some-git-dependency',
        'git+ssh://git@github.com/my-organization/another-git-dependency.git#egg=another-git-dependency',
    ],
)
All of these Git dependencies may be private, so installation via HTTP is not an option. I can use python setup.py install and python setup.py develop in some-library's root directory without problems.
However, installing over Git doesn't work:
pip install -vvv -e 'git+ssh://git@github.com/my-organization/some-library.git@1.4.4#egg=some-library'
The command fails when it looks for some-git-dependency, mistakenly assumes it needs to get the dependency from PyPI and then fails after concluding it's not on PyPI. My first guess was to try re-running the command with --process-dependency-links, but then this happened:
Cannot look at git URL git+ssh://git@github.com/my-organization/some-git-dependency.git#egg=some-git-dependency
Could not find a version that satisfies the requirement some-git-dependency (from some-library) (from versions: )
Why is it producing this vague error? What's the proper way to pip install a package with Git dependencies that might be private?
What's the proper way to pip install a package with Git dependencies that might be private?
Two options
Use dependency_links as you do. See below for details.
Alongside the dependency_links in your setup.py files, use a special dependency-links.txt that collects all the required packages. Then include this file in requirements.txt. That's my recommended option, as explained below.
# dependency-links.txt
git+ssh://...#tag#egg=package-name1
git+ssh://...#tag#egg=package-name2
# requirements.txt (per deployed application)
-r dependency-links.txt
While option 2 adds some extra burden on package management, namely keeping dependency-links.txt up to date, it makes installing packages a lot easier because you can't forget to add the --process-dependency-links option on pip install.
Perhaps more importantly, using dependency-links.txt you get to specify the exact version to be installed on deployment, which is what you want in a CI/CD environment - nothing is more risky than to install just some arbitrary version. From a package maintainer's perspective, however, it is common and considered good practice to specify a minimum version, such as
# setup.py in a package
...
install_requires = [ 'foo>1.0', ... ]
That's great because it makes your packages work nicely with other packages that have similar dependencies yet possibly on different versions. However, in a deployed application this can still cause mayhem if there are conflicting requirements among packages. E.g. package A is ok with foo>1.0, package B wants foo<=1.5 and the most recent version is foo==2.0. Using dependency-links.txt you can be specific, applying one version for all packages:
# dependency-links.txt
foo==1.5
The command fails when it looks for some-git-dependency,
To make it work, you need to add --process-dependency-links for pip to recognize the dependency to github, e.g.
pip install --process-dependency-links -r private-requirements.txt
Note that since pip 8.1.0 you can add this option to requirements.txt. On the downside it gets applied to all packages installed and may have unintended consequences. That said, I find using dependency-links.txt is a safer and more manageable solution.
All of these Git dependencies may be private
There are three options:
Add collaborators on each of the required packages' repositories. These collaborators need to have their ssh keys set up with github for this to work. Then use git+ssh://...
Add a deploy key to each of the repositories. The downside here is that you need to distribute the corresponding private key to all the machines that need to deploy. Again use git+ssh://...
Add a personal access token on the github account that holds the private repositories. Then you can use git+https://accesstoken@github.com/... The downside is that the access token will have read + write access to all repositories, public and private, on the respective github account. On the plus side distributing and managing per-repository private keys is no longer necessary, and cycling the key is a lot simpler. In an all in-house environment where every dev has access to all repositories I have found this to be the most efficient, hassle-free way for everybody. YMMV
This should work for private repositories as well:
dependency_links = [
    'git+ssh://git@github.com/my-organization/some-git-dependency.git@master#egg=some-git-dependency',
    'git+ssh://git@github.com/my-organization/another-git-dependency.git@master#egg=another-git-dependency'
],
Use the git+git form when the URL carries an #egg fragment, like this:
-e git+git@repo.some.la:foo/my-repo.git#egg=my-repo
Use git+ssh in production without #egg, but you can specify a tag with @version or a branch with @master:
git+ssh://git@repo.some.la/foo/my-repo.git@1.1.6
To work with application versions, use git tagging (see Git Basics - Tagging).
If I refer to "pip install dependency links", you would not refer to the GitHub repo itself, but to the tarball image associated to that GitHub repo:
dependency_links=[
    'git+ssh://git@github.com/my-organization/some-git-dependency/tarball/master/#egg=some-git-dependency',
    'git+ssh://git@github.com/my-organization/another-git-dependency/tarball/master/#egg=another-git-dependency',
],
with "some-git-dependency" being the name *and version of the dependency.
"Cannot look at git URL git+ssh://git#github.com/my-organization/some-git-dependency.git#egg=some-git-dependency" means pip cannot fetch an html page from this url to look for direct download links in the page, i.e, pip doesn't recognize the URL as a vcs checkout, because maybe some discrepancy between the requirement specifier and the fragment part in the vcs url.
In the case of a VCS checkout, you should also append #egg=project-version in order to identify for what package that checkout should be used.
Be sure to escape any dashes in the name or version by replacing them with underscores.
Check Dependencies that aren’t in PyPI
replace - with an _ in the package and version string.
git+ssh://git@github.com/my-organization/some-git-dependency.git#egg=some_git_dependency
and --allow-all-external may be useful.

How can I install a python package without pip or virtualenv

I have to deploy a python application to a production server (Ubuntu) that I do not control, nor do I have permission to apt-get, pip, virtualenv, etc. Currently, the server is running python 2.6+. I need to install pycrypto as a dependency for the application, but given my limited permissions, I'm not sure how to do it. The only thing I have permission to do is wget a resource and unpack it, or things along those lines.
First off, is it possible to use it without installing it via the aforementioned approaches? If not, could I download the package and then drop __init__.py files into the pycrypto dir so python knows how to find it, like so:
/my_app
/pycrypto
/__init__.py
/pycrypto.py
According to PEP 370, starting with python 2.6 you can have a per-user site directory (see What's New in Python 2.6).
So you can use the --user option of easy_install to install the package per user instead of system-wide. I believe a similar option exists for pip too.
This doesn't require any privileges since it only uses current user directories.
If you don't have any installer available, you can manually unpack the package into:
~/.local/lib/python2.6/site-packages
Or, if you are on Windows, into:
%APPDATA%/Python/Python26/site-packages
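If you are not sure where that per-user directory lives on a particular machine, the site module (which implements PEP 370) can tell you; a quick sketch:
# locate the per-user site-packages directory (PEP 370, Python 2.6+)
import site
print(site.USER_SITE)  # e.g. ~/.local/lib/python2.6/site-packages on Linux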
In the case of pycrypto, the package requires building before installation because it contains some C code. The sources should contain a setup.py file. You have to build the library by running:
python setup.py build
Afterwards you can install it into the user directory by running:
python setup.py install --user
Note that the building phase might require some C library to already be installed.
If you don't want to do this, the only option is to ship the library together with your application.
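If you do end up shipping the library with your application, a small path tweak at startup is usually all that's needed for the imports to resolve. A minimal sketch, assuming you unpack a built copy of pycrypto into a vendor/ folder next to your entry point (the folder name and layout are just an example, and any C extensions must already be built for the target platform):
# my_app/main.py -- sketch: make a bundled copy of pycrypto importable
# (assumes the built package lives in my_app/vendor/, a hypothetical layout)
import os
import sys

VENDOR_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'vendor')
sys.path.insert(0, VENDOR_DIR)

from Crypto.Cipher import AES  # resolved from the vendored copy, not the system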
By the way: I believe easy_install doesn't really check whether you are root before performing a system wide install. It simply checks whether it can write in the system-wide site directory. So, if you do have the privileges to write there, there's no need to use sudo in the first place. However this would be really odd...
Use easy_install. It should be installed already on Ubuntu for python 2.6+. If not, take a look at these install instructions.

Best practice for installing python modules from an arbitrary VCS repository

I'm newish to the python ecosystem, and have a question about module editing.
I use a bunch of third-party modules, distributed on PyPi. Coming from a C and Java background, I love the ease of easy_install <whatever>. This is a new, wonderful world, but the model breaks down when I want to edit the newly installed module for two reasons:
The egg files may be stored in a folder or archive somewhere crazy on the file system.
Using an egg seems to preclude using the version control system of the originating project, just as using a debian package precludes development from an originating VCS repository.
What is the best practice for installing modules from an arbitrary VCS repository? I want to be able to continue to import foomodule in other scripts. And if I modify the module's source code, will I need to perform any additional commands?
Pip lets you install packages given a URL to a Subversion, git, Mercurial or bzr repository.
pip install -e svn+http://path_to_some_svn/repo#egg=package_name
Example, if I wanted to download the latest version of cmdutils (a random package I decided to pull):
pip install -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
I installed this into a virtualenv (using the -E parameter), and pip installed cmdutils into a src folder at the top level of my virtualenv folder.
pip install -E thisIsATest -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
$ ls thisIsATest/src
cmdutils
Are you wanting to do development but have the developed version handled as an egg by the system (for instance, to get entry points)? If so, then you should check out the source and use Development Mode by doing:
python setup.py develop
If the project happens to not be a setuptools based project, which is required for the above, a quick work-around is this command:
python -c "import setuptools; execfile('setup.py')" develop
Almost everything you ever wanted to know about setuptools (the basis of easy_install) is available from the setuptools docs. Also there are docs for easy_install.
Development mode adds the project to your import path in the same way that easy_install does. Any changes you make will be available to your apps the next time they import the module.
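For reference, a checkout only needs a very small setuptools-based setup.py for python setup.py develop (or pip install -e .) to work; a minimal sketch with placeholder name and version:
# setup.py -- minimal setuptools project (placeholder name/version) so that
# `python setup.py develop` works on a VCS checkout
from setuptools import setup, find_packages

setup(
    name='foomodule',
    version='0.1.dev0',
    packages=find_packages(),
)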
As others mentioned, you can also directly use version control URLs if you just want to get the latest version as it is now without the ability to edit, but that will only take a snapshot, and indeed creates a normal egg as part of the process. I know for sure it does Subversion and I thought it did others but I can't find the docs on that.
You can use the PYTHONPATH environment variable or symlink your code to somewhere in site-packages.
Packages installed by easy_install tend to come from snapshots of the developer's version control, generally made when the developer releases an official version. You're therefore going to have to choose between convenient automatic downloads via easy_install and up-to-the-minute code updates via version control. If you pick the latter, you can build and install most packages seen in the python package index directly from a version control checkout by running python setup.py install.
If you don't like the default installation directory, you can install to a custom location instead, and export a PYTHONPATH environment variable whose value is the path of the installed package's parent folder.
