Pip - Change directory of pip cache on Linux?

I heard that changing XDG_CACHE_DIR or XDG_DATA_HOME fixes this, so I did
export XDG_CACHE_DIR=<new path>
export XDG_DATA_HOME=<new path>
I've also tried
pip cache dir --cache-dir <new path>
and
pip cache --cache-dir <new path>
and
--cache-dir <new path>
and
python --cache-dir <new path>
from https://pip.pypa.io/en/stable/reference/pip/#cmdoption-cache-dir
and when I type
pip cache dir
it's still in the old location. How do I change the directory of the pip cache?

TL;DR: Changing XDG_CACHE_HOME globally with export, as some people suggest, will affect not only pip but other applications as well. And you simply do not want to mess with things that much, because it is not necessary. So, long story short, before I dive into alternatives: do not change XDG_CACHE_HOME globally unless you are really sure you want to do that!
So what are your alternatives then? You should be using pip's --cache-dir <dir> command line argument instead or, at the very least, if you want to go the environment variable route, you should override the XDG_CACHE_HOME value for the pip invocation only:
XDG_CACHE_HOME=<path> pip ...
which can also be made more permanent with a shell alias:
alias pip="XDG_CACHE_HOME=<path> pip"
BUT, but, but... you do not need to touch XDG_CACHE_HOME at all, as pip can have its own configuration file, in which you can override all of the defaults to match your needs, including an alternative location for the cache directory. Moreover, every command line switch has an accompanying environment variable that pip looks for at runtime, which is arguably the cleanest approach for this kind of tweaking.
In your particular case, --cache-dir can be provided via the PIP_CACHE_DIR environment variable. So you can either set it globally:
export PIP_CACHE_DIR=<path>
or per invocation:
PIP_CACHE_DIR=<path> pip ...
or you can create pip's configuration file and set it there.
See the docs for more information about pip's configuration file and environment variables.
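As a quick sanity check, the per-invocation override can be verified directly; this should print the new location (the path here is just an example):
# verify that pip honours PIP_CACHE_DIR
PIP_CACHE_DIR=/tmp/pip-cache pip cache dir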

To simplify the other answer:
# find the config file location under variant "global"
pip config list -v
# create that file and put the following two lines in it
[global]
cache-dir=/path/to/dir
# test if it worked
pip config list
pip cache dir

Can I tell the "pip config" command to put the pip.config it generates in another location?

I'm setting up pypi auth in some build and deployment automation.
It would be most useful to me if I could just programmatically generate a pip.config in the location of my choosing instead of having to write that file out manually or write it to the default location and then copy it somewhere else.
I'm using these commands to update the config:
pip config --user set global.index-url https://....myprivate repo
pip config --user set global.trusted-host myhost
Works as expected. I looked in the docs and in pip config --help, but could not find a switch for changing the location.
Is there support to have pip config write the config file to another location?
Or is there some other command I could run first that tells pip the config is in another place and then when I run pip config commands it updates/creates a file there instead?
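This is not answered in the thread above, but one possible approach, sketched here under the assumption that writing the file yourself is acceptable: generate the pip.conf directly in a location of your choosing, then point pip at it via the documented PIP_CONFIG_FILE environment variable. Whether pip config set itself honours PIP_CONFIG_FILE can vary with the pip version, so verify the result with pip config list (the path, index URL, and host below are hypothetical):
# write the config file programmatically
mkdir -p /opt/build
printf '[global]\nindex-url = https://my.private.repo/simple\ntrusted-host = myhost\n' > /opt/build/pip.conf
# tell pip to load it from there
export PIP_CONFIG_FILE=/opt/build/pip.conf
# confirm the values are picked up
pip config list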

In psynet, if I use the "newenv" approach, how can I change the psynet code?

I want to use the "newenv" approach, but now I need to make changes to psynet (for example, adding a log line or rpdb.set_trace()). Where is this local copy? And can I simply change it, or do I need to reinstall the environment with reenv?
I believe this is the right workflow:
In a new project, start by creating a new environment, with the command newenv, which currently stands for
alias newenv="python3 -m venv env && source env/bin/activate && pip install -r constraints.txt"
This will install all the dependencies in your constraints file, which should include the Psynet version too.
If you make any changes in the code (e.g., experiment.py) that do not involve any change in your dependencies/packages, then you do not need to do anything. But if your changes involve the packages you are working with in your environment (a change in PsyNet or some of your working packages, like REPP or melody-experiments), then you will need to first push the changes to GitHub and then reinstall your working environment, using the command reenv, which stands for:
alias reenv="rm -rf env && newenv && source env/bin/activate && pip install -r constraints.txt"
There are other situations in which one only wants to make a change in requirements.txt, such as adding a new package or changing PsyNet's version. In this case, after changing your requirements, you should run dallinger generate-constraints, which will update your constraints.txt accordingly. Then run reenv.
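A minimal sketch of that flow (the package name and version are hypothetical):
# add or bump a dependency
echo 'repp==0.2.0' >> requirements.txt
# regenerate constraints.txt from requirements.txt
dallinger generate-constraints
# rebuild the environment against the new constraints
reenv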
If you want to achieve this, you should edit the source code files within your env directory, which will be located in your project directory. PsyNet will be located somewhere like this:
env/lib/python3.9/site-packages/psynet
Note that any changes will be reset the next time you run newenv.
The simplest way to do this is certainly to edit the file in the venv, but as has been pointed out it's easy to lose track of that change and overwrite it. I do that anyway if it's a short term test.
However, my preferred approach when I want to modify a library like this is to clone its source code from Git, do the modifications in my cloned sandbox, and do a pip install -e . inside the module after having activated the venv. That will make this venv replace the previously installed version of the library with my sandbox version of it.
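A concrete sketch of that workflow, assuming the venv created by newenv lives in your project directory (the repository URL is hypothetical; use wherever the PsyNet source actually lives):
# clone the library source outside the project
git clone https://example.com/psynet.git ~/src/psynet
# activate the project's venv
source /path/to/project/env/bin/activate
# install the clone in editable mode, replacing the pinned version
cd ~/src/psynet && pip install -e .
# edits under ~/src/psynet now take effect in the venv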

What are the possibilities in order to reduce the size of a python virtual environment?

How is it possible to reduce the size of a python virtual environment?
This might be:
Removing packages from site-packages, but which ones can be removed?
Removing the *.pyc files
Checking for used files as mentioned here: https://medium.com/@mojodna/slimming-down-lambda-deployment-zips-b3f6083a1dff
...
What else can be removed or stripped down? Or are there other ways?
The use case is, for example, the upload of the virtualenv to a server with limited space (e.g. an AWS Lambda function with a 512 MB limit)
If there is a .pyc file, you can remove the corresponding .py file; just be aware that you will lose stack trace information from these files, which will most likely mess up any error/exception logging you have.
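A minimal sketch of that keep-only-.pyc approach, assuming a Python 3 venv (the path and version are examples). The -b flag matters: it writes the .pyc files next to the sources, where the interpreter can import them even after the .py files are gone:
# pre-compile all sources to legacy-layout .pyc files
python -m compileall -b /path/to/virtualenv/lib/python3.9/site-packages
# then strip the sources
find /path/to/virtualenv/lib/python3.9/site-packages -name '*.py' -delete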
Apart from that there is no universal way of reducing the size of a virtualenv - it will be highly dependent on the packages you have installed, and you will most likely have to resort to trial and error or reading source code to figure out exactly what you can remove.
The best you can do is look for the packages that take up the most disk space and then investigate those further. On a *nix system with the standard coreutils commands available, you can run the following command:
du -ha /path/to/virtualenv | sort -h | tail -20
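To cover the *.pyc bullet from the question with the opposite trade-off (keep the sources, drop the bytecode): bytecode caches are safe to remove before upload, since Python regenerates them on demand, at the cost of slower first imports:
# remove bytecode caches from the virtualenv
find /path/to/virtualenv -type d -name '__pycache__' -exec rm -rf {} +
find /path/to/virtualenv -name '*.pyc' -delete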
After you have installed all packages, you could try to remove the packages in the virtualenv that are only needed for installing other packages. They live in the environment's site-packages directory (adjust the path to your Python version):
cd /path/to/virtualenv/lib/python3.9/site-packages
rm -r pip*
rm -r pkg_resources*
rm -r setuptools*
Depending on which packages you have installed, the result might still work as desired, since most packages won't have runtime dependencies on these three packages. Use at your own risk.
When you create your virtualenv, you can tell it to use your system site-packages. If you install all required packages globally on the system, then when you create your virtualenv it will essentially be empty.
$ pip install package1 package2 ...
$ virtualenv --system-site-packages venv
$ source venv/bin/activate
(venv) $ # now you can use package1, package2, ...
With this method you can also overinstall a package: if you install a package inside your virtualenv, it will be used instead of whatever is on the system.
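For instance (package name and versions hypothetical):
$ pip install requests==2.25.0        # system-wide copy
$ virtualenv --system-site-packages venv
$ source venv/bin/activate
(venv) $ pip install requests==2.28.0 # shadows the system copy inside the venv
(venv) $ python -c 'import requests; print(requests.__version__)'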

Anaconda: Permanently include external packages (like in PYTHONPATH)

I know how to install packages in Anaconda using conda install and also how to install packages that are on PyPi which is described in the manual.
But how can I permanently include packages/folders into the PYTHONPATH of an Anaconda environment so that code that I am currently working on can be imported and is still available after a reboot?
My current approach is to use sys:
import sys
sys.path.append(r'/path/to/my/package')
which is not really convenient.
Any hints?
I found two answers to my question in the Anaconda forum:
1.) Put the modules into site-packages, i.e. the directory $HOME/path/to/anaconda/lib/pythonX.X/site-packages, which is always on sys.path. This should also work by creating a symbolic link.
2.) Add a .pth file to the directory $HOME/path/to/anaconda/lib/pythonX.X/site-packages. This can be named anything (it just must end with .pth). A .pth file is just a newline-separated listing of the full path-names of directories that will be added to your path on Python startup.
Alternatively, if you only want to link to a particular conda environment then add the .pth file to ~/anaconda3/envs/{NAME_OF_ENVIRONMENT}/lib/pythonX.X/site-packages/
Both are straightforward, and I went for the second option as it is more flexible.
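A one-line sketch of option 2.) (the file name and paths are examples; adjust the Python version to yours):
# add the package directory to sys.path for every Python run from this install
echo '/path/to/my/package' > $HOME/anaconda3/lib/python3.9/site-packages/my_package.pth
# verify
python -c "import sys; print('/path/to/my/package' in sys.path)"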
*** UPDATE:
3.) Use conda develop, i.e. conda-develop /path/to/module/, to add the module; this creates a .pth file as described under option 2.).
4.) Create a setup.py in the folder of your package and install it using pip install -e /path/to/package, which is the cleanest option from my point of view, because you can also see all installations using pip list. Note that the option -e allows you to edit the package code. See here for more information.
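A minimal sketch for option 4.) (the package name is hypothetical):
# create a minimal setup.py in the package folder
cat > /path/to/package/setup.py <<'EOF'
from setuptools import setup, find_packages
setup(name="mypackage", version="0.1", packages=find_packages())
EOF
# editable install into the active environment
pip install -e /path/to/package
pip list | grep mypackage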
Thanks anyway!
I'm able to include local modules using the following:
conda-develop /path/to/module/
I hope it helps.
The way I do this, which I believe is the most native to conda, is by creating env_vars.sh files in my environment, as per the official documentation here.
For macOS and Linux users, the steps are as follows:
Go to your environment folder (e.g. /miniconda1/env/env_name). $CONDA_PREFIX is the environment variable for your environment path.
cd $CONDA_PREFIX
Create the activate.d and deactivate.d directories.
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
Inside each respective directory, create one env_vars.sh file. The one in the activate.d directory will set (or export) your environment variables when you conda activate your environment. The file in the deactivate.d directory will serve to unset the environment variables when you conda deactivate your environment.
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh
First edit the $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh to export the desired environment variables.
#!/bin/sh
export VAR_A='some-thing-here'
export VAR_B=/path/to/my/file/
Afterwards, open and edit the $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh, in order to unset the environment variables when you conda deactivate, like so:
#!/bin/sh
unset VAR_A
unset VAR_B
Again, the source of my description comes straight from the conda docs here.

Move the virtualenvs to another host folder

By mistake, I forgot to specify the WORKON_HOME variable before creating my virtual environments, and they were created in the /root/.virtualenvs directory. They worked fine, and I did some testing by activating a given environment and then doing (env)$ pip freeze to see which specific modules were installed there.
So, when I discovered the workon home path error, I needed to change the host directory to /usr/local/pythonenv. I created it and moved all the contents of the /root/.virtualenvs directory to /usr/local/pythonenv, and changed the value of the WORKON_HOME variable. Now, activating an environment using the workon command seems to work fine (i.e., the prompt changes to (env)$); however, if I do (env)$ pip freeze, I get a much longer list of modules than before, and it does not include the ones installed in that particular env before the move.
I guess that just moving the files and specifying another dir for the WORKON_HOME variable was not enough. Is there some config where I should specify the new location of the host directory, or some config files for the particular environment?
Virtualenvs are not relocatable by default. You can use virtualenv --relocatable <virtualenv> to turn an existing virtualenv into a relocatable one and see if that works, but that option is experimental and not really recommended for use.
The most reliable way is to create new virtualenvs. Use pip freeze -l > requirements.txt in the old ones to get a list of installed packages, create the new virtualenv, and use pip install -r requirements.txt to install the packages in the new one.
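A sketch of that, using the paths from the question and assuming virtualenvwrapper is loaded (the environment name myenv is hypothetical; repeat for each environment):
# capture the old environment's packages
source /root/.virtualenvs/myenv/bin/activate
pip freeze -l > /tmp/myenv-requirements.txt
deactivate
# recreate it under the new WORKON_HOME
export WORKON_HOME=/usr/local/pythonenv
mkvirtualenv myenv
pip install -r /tmp/myenv-requirements.txt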
I used the virtualenv --relocatable feature. It seemed to work, but then I found a different Python version installed:
$ . VirtualEnvs/moslog/bin/activate
(moslog)$ ~/VirtualEnvs/moslog/bin/mosloganalisys.py
python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
Remember to recreate the same virtualenv tree on the destination host.
