Including Git submodules on pythonpath when using virtualenv


I'm using Git for version control on a Django project.
As much as possible, all the source code that is not part of the project per se, but that the project depends on, is brought in as Git submodules. These live in a lib directory and have to be included on the Python path. The directory/file layout looks like:
.git
docs
lib
my_project
    apps
    static
    templates
    __init__.py
    urls.py
    manage.py
    settings.py
.gitmodules
README
What, would you say, is the best practice for including the libs on python path?
I am using virtualenv, so I could easily symlink the libraries into the virtualenv's site-packages directory. However, this would tie the virtualenv to this specific project. My understanding is that the virtualenv should not depend on my files. Instead, my files should depend on the virtualenv.
I was thinking of using the same virtualenv for different local copies of this project, but if I do things this way I will lose that capability. Any better idea how to approach this?
Update:
The best solution turned out to be letting pip manage all the dependencies.
However, this means not being able to use Git submodules, as pip can't yet handle relative paths properly. So, external dependencies will have to live in the virtualenv (typically: my_env/src/a_python_module).
I'd still prefer to use submodules, to have some of the dependencies living on my project tree. This makes more sense to me as I've already needed to fork those repos to change some bits of them, and will likely have to change them some more in the future.
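If the submodules do stay under lib, one option is to put them on the import path at startup rather than symlinking them into site-packages. This is only a minimal sketch: the helper name add_submodules_to_path is hypothetical, and it assumes every immediate subdirectory of lib is the root of an importable checkout.

```python
import os
import sys


def add_submodules_to_path(lib_dir, path=None):
    """Prepend each immediate subdirectory of lib_dir to the import path.

    Hypothetical helper: assumes every subdirectory of lib_dir is the root
    of a Git submodule whose packages should be importable.
    """
    if path is None:
        path = sys.path
    added = []
    for name in sorted(os.listdir(lib_dir)):
        candidate = os.path.join(lib_dir, name)
        # Skip plain files and anything that is already on the path.
        if os.path.isdir(candidate) and candidate not in path:
            path.insert(0, candidate)
            added.append(candidate)
    return added
```

Called early from manage.py or settings.py (for example with the lib directory that sits next to my_project), this keeps the virtualenv independent of any one working copy, because the path is computed from the project tree each time it runs.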

Dump all your installed packages into a requirements file (requirements.txt seems to be the standard name) using:
pip freeze > requirements.txt
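Every time you need a fresh virtualenv you just have to run the following (isolation from the system site-packages has been the default since virtualenv 1.7, so the --no-site-packages flag can be dropped on newer versions):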
virtualenv <name> --no-site-packages
pip install -r requirements.txt
pip install -r requirements.txt also works well if you want to update to newer packages.
Just keep requirements.txt in sync with your packages (by running pip freeze every time something changes) and you're done, no matter how many virtualenvs you have.
NOTE: if you need to do some development on a package, you can install it with the -e (editable) option. This way you can edit the package without having to uninstall and reinstall it every time you want to test new stuff :)
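Keeping requirements.txt in sync is easy to forget, so a small check can help. The following is a rough sketch, not a replacement for pip: the function names are made up for illustration, only name==version lines are understood, and package names are compared case-insensitively without any further normalization.

```python
from importlib import metadata


def parse_pins(lines):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line or line.startswith("-"):
            continue  # skips -e/-r options, blank lines and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_versions():
    """Versions of the distributions visible to the running interpreter."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            found[name.lower()] = dist.version
    return found


def find_drift(pins, installed):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }
```

Run inside the activated virtualenv, find_drift(parse_pins(open("requirements.txt")), installed_versions()) returns an empty dict when every pinned package is installed at the pinned version.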

Related

What directory do I install a virtualenvironment? [duplicate]

I'm confused as to where I should put my virtualenvs.
With my first django project, I created the project with the command
django-admin.py startproject djangoproject
I then cd'd into the djangoproject directory and ran the command
virtualenv env
which created the virtual environment directory at the same level as the inner djangoproject directory.
Is this the wrong place in which to create the virtualenv for this particular project?
I'm getting the impression that most people keep all their virtualenvs together in an entirely different directory, e.g. ~/virtualenvs, and then use virtualenvwrapper to switch back and forth between them.
Is there a correct way to do this?

Many people use the virtualenvwrapper tool, which keeps all virtualenvs in the same place (the ~/.virtualenvs directory) and allows shortcuts for creating and keeping them there. For example, you might do:
mkvirtualenv djangoproject
and then later:
workon djangoproject
It's probably a bad idea to keep the virtualenv directory in the project itself, since you don't want to distribute it (it might be specific to your computer or operating system). Instead, keep a requirements.txt file using pip:
pip freeze > requirements.txt
and distribute that. This will allow others using your project to reinstall all the same requirements into their virtualenv with:
pip install -r requirements.txt

Changing the location of the virtualenv directory breaks it
This is one advantage of putting the directory outside of the repository tree, e.g. under ~/.virtualenvs with virtualenvwrapper.
Otherwise, if you keep it in the project tree, moving the project location will break the virtualenv.
See: Renaming a virtualenv folder without breaking it
There is --relocatable but it is known to not be perfect.
Another minor advantage: you don't have to .gitignore it.
The advantages of putting it gitignored in the project tree itself are:
keeps related stuff close together.
you will likely never reuse a given virtualenv across projects, so putting it somewhere else does not give much advantage
This is an annoying design flaw in my opinion. They should implement virtualenv in a way where it does not matter where the directory is, as storing it in-tree is just simpler and more isolated. Node.js' NPM package manager does it without any problem. And while we are at it: pip should just use local directories by default, just like NPM. Having this separate virtualenv layer is wonky. Node.js just has NPM, which does it all without extra typing. I can't believe I'm praising the JavaScript ecosystem on a Python post, but it's true.
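The relocation problem mentioned in this answer comes largely from absolute paths baked into the environment, such as the shebang line of the scripts in its bin directory. Here is a minimal sketch for spotting that after a move, assuming a POSIX layout (bin rather than Scripts); the function name stale_shebangs is hypothetical, and scripts that go through /usr/bin/env python would be reported as well.

```python
import os


def stale_shebangs(env_dir):
    """Return scripts in env_dir/bin whose Python shebang points outside env_dir."""
    env_dir = os.path.abspath(env_dir)
    bin_dir = os.path.join(env_dir, "bin")  # POSIX layout assumed
    stale = []
    for name in sorted(os.listdir(bin_dir)):
        script = os.path.join(bin_dir, name)
        if not os.path.isfile(script):
            continue
        # Read only the first line; binaries simply fail the "#!" check below.
        with open(script, "rb") as handle:
            first_line = handle.readline().decode("utf-8", "replace").strip()
        if not first_line.startswith("#!") or "python" not in first_line:
            continue
        interpreter = first_line[2:].strip()
        if not interpreter.startswith(env_dir + os.sep):
            stale.append(name)
    return stale
```

An empty list suggests the scripts still point at the environment's own interpreter. Any names returned are candidates for breaking, and recreating the environment from requirements.txt is usually the simpler fix.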

The generally accepted place to put them is the same place that the default installation of virtualenvwrapper puts them: ~/.virtualenvs
Related: virtualenvwrapper is an excellent tool that provides shorthands for the common virtualenv commands. http://www.doughellmann.com/projects/virtualenvwrapper/

If you use pyenv to install Python, then pyenv-virtualenv is a good choice. If you set a .python-version file, it can automatically activate or deactivate the virtual environment when you change your working directory. pyenv-virtualenv also puts all virtual environments into the $HOME/.pyenv/versions folder.

From my personal experience, I would recommend keeping all virtual environments in one single directory, unless you have an extremely sharp memory and can remember files and folders scattered across the file system.
I'm not a big fan of using other tools just to manage virtual environments. In VSCode, if I point the python.venvPath setting at the directory containing all my virtual environments, it automatically recognizes all of them.

For Anaconda installations of Python, the "conda create" command puts each environment in a directory within the anaconda3 folder by default. Specifically (for Windows):
C:\Users\username\anaconda3\envs
This allows other conda commands to work without specifying the path. One advantage, not noted above, is that putting environments in the project folder allows you to use the same name for all of them (but that is not much of an advantage for me). For more info, see:
https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

How to switch between test code and production code in python?

I have a project that is constantly undergoing development. I have installed a release of the project in my python distribution's site-packages directory using the setup.py script for the project.
However, when I make changes to the project I would like my test scripts to find the files that are under the project's directory and not those that it finds in site-packages. What is the proper way to do this? I only know of one approach which is to modify the search path in the test script itself using sys.path, but then it means that I cannot use the same scripts to test the "installed" version of my codes without editing the sys.path again.

I'm not quite sure what you are asking, but you could use
python setup.py develop to create a development version of your project:
https://pythonhosted.org/setuptools/setuptools.html#development-mode
Under normal circumstances, the distutils assume that you are going to
build a distribution of your project, not use it in its “raw” or
“unbuilt” form. If you were to use the distutils that way, you would
have to rebuild and reinstall your project every time you made a
change to it during development.
Another problem that sometimes comes up with the distutils is that you
may need to do development on two related projects at the same time.
You may need to put both projects’ packages in the same directory to
run them, but need to keep them separate for revision control
purposes. How can you do this?
Setuptools allows you to deploy your projects for use in a common
directory or staging area, but without copying any files. Thus, you
can edit each project’s code in its checkout directory, and only need
to run build commands when you change a project’s C extensions or
similarly compiled files. You can even deploy a project into another
project’s checkout directory, if that’s your preferred way of working
(as opposed to using a common independent staging area or the
site-packages directory).

Use "Editable" package installation like:
pip install -e path/to/SomeProject
Assuming we are in the same directory with setup.py, the command will be:
pip install -e .
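To confirm which copy a test script is actually importing (the checkout or the one in site-packages), you can ask the import system where it would load the module from. A small sketch; the helper names are my own, and "site-packages"/"dist-packages" are just the usual directory names, not a guarantee for every platform.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_site_packages_copy(name):
    """True if the module resolves to a site-packages/dist-packages directory."""
    origin = module_origin(name)
    if origin is None:
        return False
    parts = os.path.normpath(origin).split(os.sep)
    return "site-packages" in parts or "dist-packages" in parts
```

With an editable install, module_origin("yourpackage") should point into the project directory; with a regular install it should point into site-packages. Here yourpackage stands in for whatever your project is called.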

Virtualenv, Django and PyCharm. File structure

I am new to VirtualEnv and recently tried to create one using PyCharm. During the process, PyCharm asked me to specify the project location, the application name, and the VirtualEnv name and location. My question is: after I specify the name and location of the VirtualEnv, must the Django project files be located inside the VirtualEnv? Or is it possible to have the VirtualEnv files in a different location than the Django project files?
Maybe I am not understanding the purpose of the VirtualEnv. Perhaps a VirtualEnv is just a list of the dependencies of my project (Python version, Django version, Pip version, Jinja2 version and all other required files), but not necessarily the Django application files (the website that is being developed).
Thanks in advance.

Ya, I think you misunderstand what virtualenv does:
https://virtualenv.pypa.io/en/latest/
The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.
Your project files don't need to be (and shouldn't be) where the virtualenv files are.
virtualenv installs your app's Python dependencies in a folder for the specific virtualenv that is being used.
Let's say you are not using virtualenv: the dependencies would be installed into the site-packages folder for your system. The dependencies aren't installed in your project directory, and your project directory isn't in your system's site-packages directory.
Using virtualenv doesn't change that, it just changes where the dependencies are installed.

virtualenv is not just a list of dependencies! It actually has all the modules under its umbrella. Think of a virtualenv as a space which isolates all the packages used by your project from the other packages that were installed previously or at a later time. Yes, there is an option to have the virtualenv make use of packages that are "outside" of the environment, but that's just an option.
The main purpose of having a virtualenv is to let the user use the package versions of their choice and keep them isolated from everything else. Usually, the list of packages belonging to a specific virtualenv is captured in a file, requirements.txt. If you want to run the project on a different machine or share it with someone, having requirements.txt makes it easy to recreate the environment via pip install -r requirements.txt from within the virtualenv.
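You can see this separation from inside Python itself. The sketch below (helper names are my own) reports whether the running interpreter belongs to a virtual environment and where pip would put pure-Python dependencies. It assumes Python 3, plus the sys.real_prefix attribute that older virtualenv releases set.

```python
import sys
import sysconfig


def in_virtualenv():
    """True if the running interpreter belongs to a virtual environment."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.base_prefix differ from sys.prefix instead.
    legacy = hasattr(sys, "real_prefix")
    base = getattr(sys, "base_prefix", sys.prefix)
    return legacy or base != sys.prefix


def dependency_dir():
    """Directory where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
    print("dependencies live in:", dependency_dir())
```

Running it with and without the environment activated shows the dependency directory changing, while the location of your Django project files plays no part in either result.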

Migrating virtualenv and Github between computers

I primarily work these days with Python 2.7 and Django 1.3.3 (hosted on Heroku) and I have multiple projects that I maintain. I've been working on a desktop with Ubuntu running inside VirtualBox, but recently had to take a trip and wanted to get everything loaded up on my notebook. What I quickly discovered was that virtualenv + GitHub makes it really easy to create projects, but I struggled to get them moved over to my notebook. The approach I came up with was to create a new virtualenv and then clone the code from GitHub. But I couldn't do it in the folder that I really wanted, because git would say the folder is not empty. So I would clone it to a tmp folder and then cut/paste everything into where I really wanted it. Not TERRIBLE, but I just feel like I'm missing something here and that it should be easier. Maybe clone first, then mkvirtualenv?
It's not a crushing problem, but I'm thinking about making some more changes (like getting rid of VirtualBox and just going with a dual-boot system) and it would be great if I could make it a bit smoother. :)
Finally, I found and read a few posts about moving git repos between computers, but I didn't see any dealing with Virtualenv (maybe I just missed it).
EDIT: Just to be clear and avoid confusion, I'm not trying to "move" the virtualenv. I'm just asking about the best way to create a new one, install the packages, and then clone the repo from GitHub.

The only workflow you should need is:
git clone repo_url somedir
cd somedir
virtualenv <name of environment directory>
source <name of environment directory>/bin/activate
pip install -r requirements.txt
This assumes that you have run pip freeze > requirements.txt (while the venv is activated) to list all the virtualenv-pip-installed libraries and checked it into the repo.
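The environment-creation part of that workflow can also be scripted. This is only a sketch under a few assumptions: it swaps in the standard-library venv module (Python 3) for the virtualenv command used above, the function names are my own, and the git clone step is left to the shell.

```python
import os
import subprocess
import venv


def env_python(env_dir):
    """Path of the interpreter inside env_dir (POSIX and Windows layouts)."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def bootstrap(env_dir, requirements=None, with_pip=True):
    """Create an environment and, if a requirements file is given, install it."""
    venv.create(env_dir, with_pip=with_pip)
    if requirements is not None:
        # Use the environment's own interpreter so packages land inside it.
        subprocess.check_call(
            [env_python(env_dir), "-m", "pip", "install", "-r", requirements]
        )
    return env_python(env_dir)
```

After cloning, something like bootstrap("env", "requirements.txt") run from the repository root would create the environment and install the pinned packages, with no activation step needed because pip is invoked through the environment's interpreter.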

That's because you're not even supposed to move virtualenvs to different locations on one system (there's relocation support, but it's experimental), let alone from one system to another. Create a new virtualenv:
Install virtualenv on the other system
Get a requirements.txt, either by writing one or by storing the output of pip freeze (and editing the output)
Move the requirements.txt to the other system, create a new virtualenv, and install the libraries via pip install -r requirements.txt.
Clone the git repository on the other system
For more advanced needs, you can create a bootstrapping script which includes virtualenv + custom code to set up anything else.
EDIT: Having the root of the virtualenv and the root of your repository in the same directory seems like a pretty bad idea to me. Put the repository in a directory inside the virtualenv root, or put them into completely separate trees. Not only do you avoid git complaining about existing files (rightfully: usually, everything not tracked by git is fair game to delete), you can also use the virtualenv for multiple repositories and avoid name collisions.

In addition to scripting the creation of a new virtualenv, you should make a requirements.txt file that lists all of your dependencies (e.g. Django==1.3). You can then run pip install -r requirements.txt and this will install all of your dependencies for you.
You can even have pip create this file for you by running pip freeze > requirements.txt, which will print out your dependencies as they are in your current virtualenv. You can then keep requirements.txt under version control.

The nice thing about a virtualenv is that you can describe how to make one, and you can make it repeatedly on multiple platforms.
So, instead of cloning the whole thing, clone a method to create the virtualenv consistently, and have that in your git repository. This way you avoid platform-specific nasties.

Developing Python Module

I'd like to start developing an existing Python module. It has a source folder and the setup.py script to build and install it. The build script just copies the source files since they're all python scripts.
Currently, I have put the source folder under version control and whenever I make a change I re-build and re-install. This seems a little slow, and it doesn't settle well with me to "commit" my changes to my python install each time I make a modification. How can I cause my import statement to redirect to my development directory?

Use a virtualenv and use python setup.py develop to link your module to the virtual Python environment. This will make your project's Python packages/modules show up on the sys.path without having to run install.
Example:
% virtualenv ~/virtenv
% . ~/virtenv/bin/activate
(virtenv)% cd ~/myproject
(virtenv)% python setup.py develop
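Under the hood, classic develop mode mostly works through path configuration: it records your project directory in a .pth file inside site-packages, and Python adds every directory listed there to sys.path at startup. The sketch below imitates that mechanism by hand in a throwaway directory. The helper and the myproject-dev.pth file name are hypothetical, and this is an illustration of the idea rather than what setuptools literally executes.

```python
import os
import site
import sys


def link_source_dir(site_dir, source_dir, pth_name="myproject-dev.pth"):
    """Write a .pth file in site_dir that points at source_dir and load it."""
    pth_path = os.path.join(site_dir, pth_name)
    with open(pth_path, "w") as handle:
        # Each line of a .pth file names one directory to add to sys.path.
        handle.write(os.path.abspath(source_dir) + "\n")
    # Process the .pth files found in site_dir, as Python does at startup
    # for the real site-packages directory.
    site.addsitedir(site_dir)
    return pth_path
```

Once the source directory is on sys.path this way, import statements resolve to the files in your checkout, so edits take effect without running install again.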

Virtualenv was already mentioned.
And as your files are already under version control you could go one step further and use Pip to install your repo (or a specific branch or tag) into your working environment.
See the docs for Pip's editable option:
-e VCS+REPOS_URL[#REV]#egg=PACKAGE, --editable=VCS+REPOS_URL[#REV]#egg=PACKAGE
Install a package directly from a checkout. Source
will be checked out into src/PACKAGE (lower-case) and
installed in-place (using setup.py develop).
Now you can work on the files that pip automatically checked out for you and when you feel like it, you commit your stuff and push it back to the originating repository.
To get a good, general overview concerning Pip and Virtualenv see this post: http://www.saltycrane.com/blog/2009/05/notes-using-pip-and-virtualenv-django

Install the distribute package (since merged back into setuptools), then use the developer mode. Just run python setup.py develop --user and that will place path pointers in your user directory that point to your workspace.

Change the PYTHONPATH to your source directory. A good idea is to work with an IDE like Eclipse that overrides the default PYTHONPATH.
