My colleagues don't want to create their own virtualenv and deploy my tools from git themselves as part of their automation. Instead, they want the tools pre-installed on a shared server.
So I was thinking of creating a directory under /opt, putting the virtualenv there, and then pulling from git every hour to force an update of the Python package. I'm not version-tagging the tools at the moment, so I would just ask pip to force the update every time.
The problem is the race condition. If the tool is called by their automation during the pip upgrade, the tool can fail because the install is not atomic - I believe pip removes the entire package first.
I've thought of various workarounds (forcing the use of flock, using a symlink to atomically switch the virtualenv, wrapping the tool in a script that builds a virtualenv in a temp directory for every use...).
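For what it's worth, here is a rough sketch of how I imagine the symlink switch would work; the paths, repo URL and tool name are made up, and the key point is that the new virtualenv is built on the side and the switch is a single atomic rename:

    # build a brand-new virtualenv next to the live one
    NEW=/opt/mytool/venv-$(date +%Y%m%d%H%M%S)
    python -m venv "$NEW"          # or: virtualenv "$NEW"
    "$NEW"/bin/pip install "git+https://git.example.com/me/mytool.git"

    # switch the 'current' symlink atomically: rename(2) replaces it in one step
    ln -s "$NEW" /opt/mytool/current.new
    mv -T /opt/mytool/current.new /opt/mytool/current   # GNU mv

Callers would always invoke /opt/mytool/current/bin/mytool; the old virtualenvs are never modified in place, so a run that started before the switch isn't having files deleted out from under it, and stale venv directories can be cleaned up later.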
Is there a best practice here I'm not aware of?
edit: I also asked over at devops and got a pretty good answer.
Related
I'm a Java/Scala dev transitioning to Python for a work project. To dust off the cobwebs on the Python side of my brain, I wrote a webapp that acts as a front-end for Docker when doing local Docker work. I'm now working on packaging it up and, as such, am learning about setup.py and virtualenv. Coming from the JVM world, where dependencies aren't "installed" so much as downloaded to a repository and referenced when needed, the way pip handles things is a bit foreign. It seems like best practice for production Python work is to first create a virtual environment for your project, do your coding work, then package it up with setup.py.
My question is, what happens on the other end when someone needs to install what I've written? They too will have to create a virtual environment for the package, but they won't know how to set it up without inspecting the setup.py file to figure out which version of Python to use, etc. Is there a way for me to create a setup.py file that also creates the appropriate virtual environment as part of the install process? If not (or if that's considered a "no", as this respondent to this SO post stated), what is considered "best practice" in this situation?
You can think of virtualenv as an isolated environment for every package you install using pip. It is a simple way to handle different versions of Python and of packages. For instance, say you have two projects which use the same packages but different versions of them. By using virtualenv you can isolate those two projects and install the different versions of the packages separately, rather than into your system Python.
Now, let's say you want to work on a project with your friend. In order to have the same packages installed, you have to somehow share which packages, and which versions, your project depends on. If you are delivering a reusable package (a library), then you need to distribute it, and this is where setup.py helps. You can learn more in the Quick Start.
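As a rough illustration (the project name and dependency below are invented), a minimal setup.py for such a library might look like this:

    # setup.py - minimal sketch for a reusable library; names are placeholders
    from setuptools import setup, find_packages

    setup(
        name="ourlib",
        version="0.1.0",
        packages=find_packages(),
        # the library's own direct dependencies
        install_requires=[
            "requests>=2.0",
        ],
    )

With that in place, anyone can install the library and its dependencies with pip install . from a checkout, or with pip install ourlib once it is published to an index.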
However, if you are working on a web site, all you need is to put the library versions into a separate file. Best practice is to keep separate requirements files for tests, development and production. To see the format of the file, run pip freeze: you will be presented with a list of the packages currently installed on the system (or in the virtualenv). Put that into the file and you can install it later on another PC, into a completely clean virtualenv, using pip install -r development.txt.
And one more thing: please do not pin strict versions of packages the way pip freeze shows them; most of the time you want >= at least some X.X version. The good news is that pip handles dependencies on its own, which means you do not have to list the dependent packages there - pip will sort it out.
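For example (the package names and versions below are only placeholders), the workflow is roughly:

    # snapshot what is installed right now - prints exact pins like Django==1.8.4
    pip freeze > development.txt

    # then hand-edit the file down to your direct dependencies with minimum versions, e.g.
    #   Django>=1.8
    #   requests>=2.7

    # on another machine, inside a clean virtualenv:
    pip install -r development.txt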
Speaking of deployment, you may also want to check out tox, a tool for managing virtualenvs. It helps a lot with deployment.
Python's default package path always points to the system environment, which needs administrator access to install into. virtualenv is able to localise the installation to an isolated environment.
For deployment/distribution of a package, you can choose to:
Distribute the source code, so the user runs python setup.py install (or pip install .), or
Package your Python project and upload it to PyPI or a custom devpi server, so the user can simply run pip install <yourpackage>.
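For the second option, the build-and-upload side might look roughly like this (this assumes the wheel and twine packages are installed, and an index you are allowed to upload to):

    # build a source distribution and a wheel
    python setup.py sdist bdist_wheel

    # upload to PyPI (or point twine at an internal index / devpi instead)
    twine upload dist/*

    # consumers then just run
    pip install <yourpackage>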
However, as you noticed above, there is the issue that without virtualenv the user needs administrator access to install any Python package.
In addition, the PyPI package world contains a certain number of badly tested packages that don't work out of the box.
Note: virtualenv itself is actually a hack to achieve isolation.
I have an existing pip-installable library, but I would like to keep making modifications to it over time. The library is installed from a GitHub project into a virtualenv. The only option I can think of for making edits, before deciding a change is worth actually committing and pushing upstream, is to edit the library directly within site-packages, which is particularly annoying as the virtualenv is stored within a Docker container. Are there any shortcuts or best practices to improve this workflow?
pip offers the -e option for editable installs. This is very similar to running setup.py develop to put a package in "development mode". This way you can keep your code wherever you want and change it as needed, without having to reinstall after every syntax fix or refactoring.
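For instance, inside the container's virtualenv you could clone the repo somewhere convenient and install it in editable mode (the paths and repo name here are hypothetical):

    # clone your copy of the library to a working directory
    git clone https://github.com/you/thelibrary.git /src/thelibrary

    # install it into the active virtualenv in editable mode
    pip install -e /src/thelibrary

    # or, straight from git in one step:
    # pip install -e git+https://github.com/you/thelibrary.git#egg=thelibrary

Edits under /src/thelibrary are picked up immediately, because site-packages only holds a link back to that checkout rather than a copy of the code.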
I'm sure this has been answered elsewhere, but I must not know the right keywords to find the answer...
I'm working on a site that requires several different components deployed on different servers but relying on some shared functions. I've implemented this by putting the shared functions into a pip-installable package in its own git repo, which I can then reference from the requirements.txt file of each project.
This is pretty standard stuff - more or less detailed here:
https://devcenter.heroku.com/articles/python-pip
My question is: now that I have this working for deploying code into production, how do I set up my dev environment so that I can make edits to the code in the shared module without having to do all of the following?
1. Commit the changes
2. Increment the version in setup.py in the shared library
3. Increment the version in requirements.txt
4. pip install -r requirements.txt
That's a lot of steps to do all over again if I make one typo.
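The closest thing to a shortcut I've found so far is pip's editable install, which, as I understand it, would let each project point at a local checkout of the shared repo instead of a pinned release (paths below are made up):

    # one local checkout of the shared library
    git clone git@github.com:me/shared-funcs.git ~/src/shared-funcs

    # in each project's virtualenv, use an editable install instead of the pinned requirement
    pip install -e ~/src/shared-funcs

With that, edits to ~/src/shared-funcs show up in every project immediately, and the commit/version-bump/requirements dance only has to happen when a change is actually ready for production.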
On a similar note, I used Jenkins with a git hook and a simple bash script (4 or 5 lines, maybe, that would install/upgrade from requirements.txt, restart the web server and do a little more). When I commit changes, Jenkins runs my bash script, and voilà: an almost instant upgrade.
But note that this is hack-ish. Jenkins is a continuous integration tool focused on building and testing, and there are probably better and simpler tools for this case; hint: Continuous Integration.
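For reference, the script was along these lines (the paths and service name are from memory and partly made up):

    #!/bin/bash
    # deploy.sh - run by Jenkins after each push (a sketch, not the exact script)
    set -e
    cd /srv/mysite
    git pull origin master
    ./venv/bin/pip install -r requirements.txt
    sudo service mysite restart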
I've been handed a dozen or so legacy Django applications to maintain. The first part of this process is moving them off their ancient Ubuntu 9.04 server (which is long out of support) onto something fresh and safe.
But the projects don't include any sort of dependency listing. From habit I'm used to generating a requirements.txt file as I develop a site and that makes redeployment a simple and automated process.
As it stands I would have to manually step through these projects, making sure to scrape every corner to find possible missing dependencies. Either that or I install everything.
Is there an automated code-analysis option here? Something I can use to scan the local source directory of each project and generate a list of the packages it needs... ideally as PyPI-formatted package names.
z3c.dependencychecker can be used for this purpose.
It's in the z3c namespace, but as far as I know that's only because it was developed with the Zope ecosystem in mind; it can just as well be used for plain Python projects. Unless you want to run its tests, it does not have any dependencies on Zope packages.
It does, however, only consider dependencies declared in setup.py, not in requirements.txt. But it should be pretty easy to copy any missing entries across once the full list of dependencies has been determined.
Usage:
Activate your virtualenv, and install z3c.dependencychecker, e.g. by doing pip install z3c.dependencychecker
Make sure you have run python setup.py develop for your project recently, so you have an up-to-date *.egg-info.
cd into your project's source directory
run dependencychecker
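Put together, a full run looks roughly like this (the virtualenv and project paths are hypothetical):

    source ~/venvs/legacy-site/bin/activate
    pip install z3c.dependencychecker
    cd ~/projects/legacy-site
    python setup.py develop        # refreshes the *.egg-info metadata
    dependencychecker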
Note that z3c.dependencychecker isn't perfect (pretty much by definition, because of the way it works), so it can report some false positives. But in my experience it's a very good start, and it should be pretty easy to verify why it reported a particular dependency, and weed out false positives.
First let me explain the current situation:
We have several Python applications which depend on custom (not publicly released) packages as well as generally known ones. These dependencies are all installed into the system Python installation. Distribution of the applications is done from source via git. All of these computers are hidden inside a corporate network and don't have internet access.
This approach is a bit of a pain since it has the following downsides:
Libs have to be installed manually on each computer :(
How can we deploy an application better? I recently came across virtualenv, which seems to be the solution, but I don't quite see how yet.
virtualenv creates a clean Python instance for my application. How exactly should I deploy this so that users of the software can easily start it?
Should there be a startup script inside the application which creates the virtualenv during start?
The next problem is that the computers don't have internet access. I know that I can specify a custom location for packages (network share?) but is that the right approach? Or should I deploy the zipped packages too?
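To make that part of the question concrete, I'm imagining something like a directory of downloaded packages on a network share that pip is pointed at instead of the internet (paths are made up):

    # on a machine with internet access: collect the packages once
    pip download -d /mnt/share/packages -r requirements.txt

    # on the offline machines: install only from the share
    pip install --no-index --find-links=/mnt/share/packages -r requirements.txt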
Would another approach be to ship the whole Python instance, so the user doesn't have to set up the virtualenv? In that Python instance all necessary packages would be pre-installed.
Since our apps are growing fast, we have a fast release cycle (2 weeks). Deploying via git was very easy: users could pull from a stable branch via an update script to get the latest release - would that still be possible, or are there better approaches?
I know that's a lot of questions. Hopefully someone can answer them or give me some advice.
You can use pip to install directly from git:
pip install -e git+http://192.168.1.1/git/packagename#egg=packagename
This applies whether you use virtualenv (which you should) or not.
You can also create a requirements.txt file containing all the stuff you want installed:
-e git+http://192.168.1.1/git/packagename#egg=packagename
-e git+http://192.168.1.1/git/packagename2#egg=packagename2
And then you just do this:
pip install -r requirements.txt
So the deployment procedure would consist of getting the requirements.txt file and then executing the above command. Adding virtualenv would make it cleaner, not easier; without virtualenv you would pollute the system-wide Python installation. virtualenv is meant to provide a solution for running many apps, each in its own distinct virtual Python environment; it doesn't have much to do with how you actually install stuff into that environment.
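A minimal sketch of that procedure with virtualenv added (the paths are placeholders):

    # one-time setup on the target machine
    virtualenv /srv/myapp/venv
    /srv/myapp/venv/bin/pip install -r requirements.txt

    # on each release: fetch the updated requirements.txt and re-run the install
    /srv/myapp/venv/bin/pip install -r requirements.txt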