How can I configure a Docker container to use a custom pip.conf file?
This does not (seem to) work for me:
from python:3.9
COPY pip.conf ~/.config/pip/pip.conf
where pip.conf is a copy of the pip configuration that points to a proprietary package repository.
The problem is the ~: COPY does not expand it, so the file never lands in root's home directory. Spell out the destination path explicitly instead:
from python:3.9
COPY pip.conf /root/.config/pip/pip.conf
If you want to use something other than /root/.config, consider adding a WORKDIR instruction and specifying paths relative to it.
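As a quick sanity check, here is a minimal sketch (assuming the build runs as the default root user, so the config lands in root's home) that copies the file and then prints the configuration pip actually sees:
FROM python:3.9
COPY pip.conf /root/.config/pip/pip.conf
# print the effective pip configuration during the build to confirm the custom index is picked up
RUN pip config list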
If I search for a class in non-project items, I get too many matches:
All except one match are from virtualenvs in a .tox directory.
I know that I can change the workdir via the tox.ini file.
Goal: I don't want my IDE to see these venvs.
But, which directory to use?
/tmp/ is not good, because it gets deleted on reboot.
/var/tmp is not good, because some team members use Windows PCs.
How do I set the workdir of tox so that IDEs don't see these files? It needs to work on Linux, Windows, and macOS.
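For what it's worth, a minimal tox.ini sketch of that idea; the path here is only a placeholder, and toxworkdir plus the {homedir} substitution are tox ini settings whose exact names can vary between tox versions:
[tox]
# keep the tox environments outside the project tree so IDEs do not index them
toxworkdir = {homedir}/.tox_envs/myproject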
I have a Python file containing generic functions, named utils.py, and another set of programs, say pgm1.py, pgm2.py, pgm3.py, which import utils.py and invoke its functions, e.g. utils.send_email(), utils.time_convert(), etc.
My requirement is to dockerize utils, pgm1, pgm2, and pgm3 in different containers and still be able to access the generic functions.
Can someone tell me how this can be achieved?
You have to declare the dependency using the normal Python packaging tools (in your setup.cfg file, or using a tool like Pipenv or Poetry) and install it into each image.
A Docker image contains an application, plus all of its dependencies. A container can't access the files, libraries, or applications in other containers. So it doesn't make sense to have "a container of generic functions" that's not running an application, and you can't "import libraries from another container"; since the filesystems are isolated from each other, one container can't access the *.py files in another.
As you've described the problem, you're dealing with a small number of individual files, and the size of the Python interpreter will be much larger than any individual script. In this case it's fine to create an image that includes all of them:
FROM python:3.10
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY ./ ./
# Run one program by default; do not use ENTRYPOINT here, so the command
# is easy to override at `docker run` time
CMD ["./pgm1.py"]
You can then easily override that CMD when you run the container to select a different program.
docker run -d --name program-1 my-image ./pgm1.py
docker run -d --name program-2 my-image ./pgm2.py
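For ./pgm1.py to be runnable like that, each script needs the executable bit and a shebang line; a rough sketch of what pgm1.py might look like, using the function names mentioned in the question:
#!/usr/bin/env python3
# pgm1.py: utils.py sits in the same /app directory inside the image,
# so a plain import works without any sys.path tricks
import utils

if __name__ == "__main__":
    utils.send_email()
    utils.time_convert()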
I want to use the "newenv" approach, but now I need to make changes to PsyNet (for example adding a log line or rpdb.set_trace()). Where is this local copy? And can I simply change it, or do I need to reinstall the environment with reenv?
I believe this is the right workflow:
In a new project, start by creating a new environment, with the command newenv, which currently stands for
alias newenv="python3 -m venv env && source env/bin/activate && pip install -r constraints.txt"
This will install all the dependencies in your constraints file, which should include the PsyNet version too.
If you make any changes in the code (e.g., experiment.py) that do not involve any change in your dependencies/packages, then you do not need to do anything. But if your changes involve the packages you are working with in your environment (a change in PsyNet or one of your working packages, like REPP or melody-experiments), then you will need to first push the changes to GitHub and then reinstall your working environment, using the command reenv, which stands for:
alias reenv="rm -rf env && newenv && source env/bin/activate && pip install -r constraints.txt"
There are other situations in which you only want to make a change in requirements.txt, such as adding a new package or changing PsyNet's version. In this case, after changing your requirements, you should run dallinger generate-constraints, which will update your constraints.txt accordingly. Then run reenv.
If you want to achieve this, you should edit the source code files within your env directory, which will be located in your project directory. PsyNet will be located somewhere like this:
env/lib/python3.9/site-packages/psynet
Note that any changes will be reset the next time you recreate the environment (e.g. with reenv).
The simplest way to do this is certainly to edit the file in the venv, but as has been pointed out, it's easy to lose track of that change and overwrite it. I do that anyway if it's a short-term test.
However, my preferred approach when I want to modify a library like this is to clone its source code from Git, do the modifications in my cloned sandbox, and do a pip install -e . inside the module after having activated the venv. That will make this venv replace the previously installed version of the library with my sandbox version of it.
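A sketch of that workflow, assuming the project's virtualenv lives in ./env and using a placeholder repository URL:
# clone the library you want to patch (placeholder URL)
git clone https://github.com/example/psynet.git psynet-src
# activate the project's virtualenv, then install the clone in editable mode
source env/bin/activate
pip install -e ./psynet-src
# the venv now imports psynet from psynet-src/, so edits there take effect immediately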
I heard that changing XDG_CACHE_DIR or XDG_DATA_HOME fixes that, so I tried
export XDG_CACHE_DIR=<new path>
export XDG_DATA_HOME=<new path>
I've also tried
pip cache dir --cache-dir <new path>
and
pip cache --cache-dir <new path>
and
--cache-dir <new path>
and
python --cache-dir <new path>
from https://pip.pypa.io/en/stable/reference/pip/#cmdoption-cache-dir
and when I type
pip cache dir
it still points to the old location. How do I change the pip cache directory?
TL;DR: Changing XDG_CACHE_HOME globally with export, as some people suggest, will affect not only pip but other apps as well, and you simply do not want to mess with things that much because it is not necessary. So, long story short, before I dive into alternatives: do not change XDG_CACHE_HOME globally unless you are really sure you want to do that!
So what are your alternatives then? You should be using pip's --cache-dir <dir> command line argument instead or, at least, if you want to go that way, you should override the XDG_CACHE_HOME value for the pip invocation only:
XDG_CACHE_HOME=<path> pip ...
which can also be made more permanent by using the shell's alias feature:
alias pip="XDG_CACHE_HOME=<path> pip"
BUT, but, but... you do not need to touch XDG_CACHE_HOME at all, as pip can have its own configuration file, in which you can override all of the defaults to match your needs, including an alternative location for the cache directory. Moreover, all command line switches have accompanying environment variables that pip looks for at runtime, which is probably the cleanest approach for this kind of tweak.
In your particular case, --cache-dir can be provided via PIP_CACHE_DIR env variable. So you can either set it globally:
export PIP_CACHE_DIR=<path>
or per invocation:
PIP_CACHE_DIR=<path> pip ...
or you create the aforementioned pip configuration file and set it there.
See the docs for more information about pip's config file and environment variables.
To simplify the other answer:
# find the config file location under variant "global"
pip config list -v
# create the file and add
[global]
cache-dir=/path/to/dir
# test if it worked
pip config list
pip cache dir
For reference, I've looked at the following links:
Python Imports, Paths, Directories & Modules
Importing modules from parent folder
I understand that what I'm doing is wrong, and I'm trying to avoid relative paths and changing things via sys.path as much as possible, though if those are my only options, please help me come up with a solution.
Note, here is an example of my current working directory structure. I think I should add a little more context. I started off adding __init__.py to every directory so they would be considered packages and subpackages, but I'm not sure that is what I actually want.
myapp/
  pack/
    __init__.py
    helper.py
  runservice/
    service1/
      Dockerfile
    service2/
      install.py
      Dockerfile
The only packages I will be calling exist in the pack/ directory, so I believe that should be the only directory considered a package by Python.
Next, the reason why this might get a little tricky: ultimately, this is just a service that builds various different containers. The entrypoints will live in service*/install.py, run as python install.py after I cd into the script's working directory. The reason for this is that I don't want container1 (service1) to know about the codebase in service2, as it's irrelevant, and I would like the code to be separated.
But, by running install.py, I need to be able to do from pack.helper import function, and clearly I am doing something wrong.
Can someone help me come up with a solution, so I can keep my container's entrypoint as cd service2, python install.py?
Another important thing to note, within the script I have logic like:
if not os.path.isdir(os.path.expanduser(tmpDir)):
I am hoping any solution we come up with will not affect the logic here.
I apologize for the noob question.
EDIT:
Note, I think I can do something like
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
But as far as I understand, that is bad practice.
Fundamentally what you've described is a supporting library that goes with a set of applications that run on top of it. They happen to be in the same repository (a "monorepo") but that's okay.
The first step is to take your library and package it up like a normal Python library would be. The Python Packaging User Guide has a section on Packaging and distributing projects, which is mostly relevant; though you're not especially interested in uploading the result to PyPI. You at the very least need the setup.py file described there.
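A minimal setup.py for the library could look something like this (the name and version here are placeholders):
# pack/setup.py
from setuptools import setup, find_packages

setup(
    name="pack",          # placeholder project name
    version="0.1.0",      # placeholder version
    packages=find_packages(),
)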
With this reorganization you should be able to do something like
$ ls pack
pack/ setup.py
$ ls pack/pack
__init__.py helper.py
$ virtualenv vpy
$ . vpy/bin/activate
(vpy) $ pip install -e ./pack
The last two lines are important: in your development environment they create a Python virtual environment, an isolated set of packages, and then install your local library package into it. Still within that virtual environment, you can now run your scripts
(vpy) $ cd runservice/service2
(vpy) $ ./install.py
Your scripts do not need to modify sys.path; your library is installed in an "expected" place.
You can and should do live development in this environment. pip install -e makes the code the virtual environment uses for whatever's in pack be your actual local source tree, so edits show up immediately. If service2 happens to depend on other Python libraries, listing them out in a requirements.txt file is good practice.
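At that point install.py imports the library like any other installed package; a sketch, using the import from the question:
#!/usr/bin/env python
# runservice/service2/install.py (sketch)
from pack.helper import function  # resolved via the installed "pack" package, no sys.path tricks

function()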
Once you've migrated everything into the usual Python packaging scheme, it's straightforward to transplant this into Docker. The Docker image here plays much the same role as a Python virtual environment, in that it has an isolated Python installation and an isolated library tree. So a Dockerfile for this could more or less look like
FROM python:2.7
# Copy and install the library
WORKDIR /pack
COPY pack/ ./
RUN pip install .
# Now copy and install the application
WORKDIR /app
COPY runservice/service2/ ./
# RUN pip install -r requirements.txt
# Set standard metadata to run the application
CMD ["./install.py"]
That Dockerfile depends on being run from the root of your combined repository tree:
sudo docker build -f runservice/service2/Dockerfile -t me/service2 .
A relevant advanced technique is to break this up into separate Docker images. One contains the base Python plus your installed library, and the per-application images build on top of that. This avoids reinstalling the library multiple times if you need to build all of the applications, but it also leads to a more complicated sequence with multiple docker build steps.
# pack/Dockerfile
FROM python:2.7
WORKDIR /pack
COPY ./ ./
RUN pip install .
# runservice/service2/Dockerfile
FROM me/pack
WORKDIR /app
COPY runservice/service2/ ./
CMD ["./install.py"]
#!/bin/sh
set -e
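# build the shared library image first, then the service image that builds FROM it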
(cd pack && docker build -t me/pack .)
(cd runservice/service2 && docker build -t me/service2 .)