I would like to pipe a file, using the cat command, into a Docker container along with arguments for running a Python file. The command to run this Python file is going to be specified in the Dockerfile as
RUN ["python", "myfile.py"].
After building the Docker image using docker build -t test ., I will call the command cat file.txt | docker run -i test --param1 value.
I understand how arguments are accepted in the Python file using the argparse module, and here is what I have:
parser = argparse.ArgumentParser()
parser.add_argument("--param1")
args = parser.parse_args()
value = args.param1
My question is: how do I configure the Dockerfile to route the parameter passed from the command line (param1) into the Python file's argument parser?
What I have found so far only shows me how to write the cat .. | docker run ... command.
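For reference, a minimal sketch of how this kind of setup is usually wired up (note that RUN executes at image build time; ENTRYPOINT or CMD is what runs when the container starts, and the exec form of ENTRYPOINT appends docker run arguments to the command line):
# Dockerfile (sketch)
FROM python:3
COPY myfile.py .
# exec form: "--param1 value" from docker run is appended here
ENTRYPOINT ["python", "myfile.py"]
Inside myfile.py, the piped file is then available on standard input (for example via sys.stdin.read()), while --param1 is picked up by argparse as usual:
cat file.txt | docker run -i test --param1 value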
Related
I'm wondering how to use a simple script with a docker container.
The script is:
example python script
# Example python script
import argparse
import pathlib


def run(
    *,
    input: pathlib.Path | str,
    output: pathlib.Path | str,
) -> None:
    pathlib.Path(output).write_text(pathlib.Path(input).read_text().upper())


def main() -> int:
    desc = "example script"
    parser = argparse.ArgumentParser(
        description=desc,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "-i",
        "--input",
        help="input file",
        required=True,
    )
    parser.add_argument(
        "-o",
        "--output",
        help="output file",
    )
    parser.add_argument(
        "-x",
        "--overwrite",
        help="Whether to overwrite previously created file.",
        action="store_true",
    )
    args = parser.parse_args()
    if not pathlib.Path(args.input).exists():
        raise FileNotFoundError(f"input file {args.input} not found")
    if not args.output:
        parser.error("output not given")
    if pathlib.Path(args.output).exists() and not args.overwrite:
        raise FileExistsError(f"{args.output} already exists.")
    run(input=args.input, output=args.output)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
The script works fine on my system (without docker).
example docker file
The Dockerfile is:
FROM python:3.10.6-bullseye
COPY . .
ENTRYPOINT ["python", "example.py"]
This works (ish) after the following:
# build
docker build -t demo .
# run
docker run demo --help
Which outputs:
usage: example.py [-h] -i INPUT [-o OUTPUT] [-x]
example.
options:
-h, --help show this help message and exit
-i INPUT, --input INPUT
input file
-o OUTPUT, --output OUTPUT
output file
-x, --overwrite Whether to overwrite previously created file.
But I'm not sure how to use it with the -i and -o arguments.
what I'd like to do
I'd like to be able to do the following:
echo "text" > input.txt
# Create output from input
docker run demo -i input.txt -o output.txt
# Create output from input and say it's ok to overwrite
docker run demo -i input.txt -o output.txt -x
And after this there should be an output.txt file created which has TEXT in it.
Error
I've tried to do this with the above command, and it doesn't work.
Eg:
echo "this" > input.txt
docker run demo -i input.txt -o output.txt -x
After this, there is no output.txt file containing THIS.
Attempted solution (--mount within the shell command)
Using the following seems to work, but it feels like a lot to put in a shell command:
docker run \
--mount type=bind,source="$(pwd)",target=/check \
--workdir=/check demo:latest \
-i input.txt -o output.txt -x
Is there a way to do the --mount within the dockerfile itself?
I am doing a similar thing by running a compiler inside the docker container.
Obviously the docker image gets built whenever there is a new version of the compiler or the underlying image.
The container runs whenever I want to compile something. Here I also have to mount source and target directories, but my docker command looks smaller than yours:
docker run --rm -v /sourcecode:/project:ro -v /compiled:/output:rw -v cache:/cache:rw compilerimagename
All the rest is defined within the image.
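If the command length is the main objection, a common pattern (a sketch; the wrapper name run-demo.sh is an assumption) is to hide the mount in a small wrapper script, since as far as I know a bind mount cannot be declared inside the Dockerfile itself; mounts are a run-time property of the container:
#!/bin/sh
# run-demo.sh: bind-mount the current directory and forward all
# arguments ("$@") to the containerized script
exec docker run --rm \
    --mount type=bind,source="$(pwd)",target=/check \
    --workdir=/check \
    demo:latest "$@"
After chmod +x run-demo.sh, the call shrinks to ./run-demo.sh -i input.txt -o output.txt -x.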
I want to check whether a server/device is reachable or not, using a Python script running inside Docker.
I'm using Python 3.7.
The script looks like the following snippet (stripped down):
import platform
import subprocess
import asyncio
from argparse import ArgumentParser
from time import sleep

from models.configuration import Configuration

parser = ArgumentParser()
# device ip or hostname
parser.add_argument(
    '-d', '--device',
    type=str,
    required=True,
)


async def main():
    args = parser.parse_args()
    configuration = Configuration(args)  # the configuration object stores the arguments
    param = '-n' if platform.system().lower() == 'windows' else '-c'
    while True:
        result = subprocess.call(['ping', param, '1', configuration.device])
        print(f'## {result}')
        # TODO: result = 0 => success, result > 0 => failure
        sleep(5)


if __name__ == '__main__':
    asyncio.run(main())
My Dockerfile:
FROM python:3.7
WORKDIR /usr/src/app
COPY . .
RUN pip install --no-cache-dir -r requierments.txt
ENTRYPOINT [ "python3", "./main.py", "-d IP_OR_HOSTNAME" ]
I also tried CMD instead of ENTRYPOINT.
I build and start the container with the following commands
docker build -t my-app .
docker run -it --network host --name my-app my-app
Running the script via Docker, the ping command exits with exit code 2 (Name or Service not known).
When I start the script from a shell inside the container (python3 /usr/src/app/main.py -d IP_OR_HOSTNAME), the device is reachable (exit code 0).
As far as I know, I have to use network mode host.
Any ideas why the script cannot reach the device when launched by docker, but from shell inside the container?
(I am open to suggestions for a better title)
The various Dockerfile commands that run commands have two forms. In shell form, without special punctuation, Docker internally runs a shell and the command is broken into words using that shell's normal rules. If you write the command as a JSON array, though, it uses an exec form and the command is executed with exactly the words you give it.
In your command:
ENTRYPOINT [ "python3", "./main.py", "-d IP_OR_HOSTNAME" ]
There are three words in the command: python3, ./main.py, and then as a single argument -d IP_OR_HOSTNAME including an embedded space. When the Python argparse module sees this, it interprets it as a single option with the -d short option and text afterwards, and so the value of the "hostname" option is IP_OR_HOSTNAME including a leading space.
There are various alternate ways to "spell" this that will have the effect you want:
# Split "-d" and the argument into separate words
ENTRYPOINT ["python3", "./main.py", "-d", "IP_OR_HOSTNAME"]
# Remove the space between "-d" and its option
ENTRYPOINT ["python3", "./main.py", "-dIP_OR_HOSTNAME"]
# Use the shell form to parse the command into words "normally"
ENTRYPOINT python3 ./main.py -d IP_OR_HOSTNAME
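If the host should be chosen per run rather than baked into the image, another option (a sketch building on the same Dockerfile) is to leave the argument out entirely; with an exec-form ENTRYPOINT, anything after the image name on the docker run command line is appended to it:
# Dockerfile: no hard-coded host
ENTRYPOINT ["python3", "./main.py"]

# supply the device at run time (192.168.1.10 is a placeholder address)
docker run -it --network host my-app -d 192.168.1.10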
I have a very simple Dockerfile
FROM python:3
WORKDIR /usr/src/app
ENV CODEPATH=default_value
ENTRYPOINT ["python3"]
CMD ["/usr/src/app/${CODEPATH}"]
Here is my container command
docker run -e TOKEN="subfolder/testmypython.py" --name mycontainer -v /opt/testuser/pythoncode/:/usr/src/app/ -t -d python-image:latest
When I look at the container logs, they show
python3: can't open file '/usr/src/app/${TOKEN}': [Errno 2] No such file or directory
It looks like what you want to do is override the default path to the Python file which is run when you launch the container. Rather than passing this option in as an environment variable, you can just pass the path to the file as an argument to docker run, which is the purpose of CMD in your Dockerfile. What you set as the CMD option is the default, which users of your image can easily override by passing an argument to the docker run command.
docker run --name mycontainer -v /opt/testuser/pythoncode/:/usr/src/app/ -t -d python-image:latest "subfolder/testmypython.py"
The environment variable name is CODEPATH, but you are setting TOKEN as the environment variable.
Could you please try setting CODEPATH in the following way:
docker run -e CODEPATH="subfolder/testmypython.py" --name mycontainer -v /opt/testuser/pythoncode/:/usr/src/app/ -t -d python-image:latest
The way you've split ENTRYPOINT and CMD doesn't make sense, and it makes it impossible to do variable expansion here. You should combine the two parts together into a single CMD, and then use the shell form to run it:
# no ENTRYPOINT
CMD python3 /usr/src/app/${CODEPATH}
(Having done this, better still is to use the approach in #allan's answer and directly docker run python-image python3 other-script-name.py.)
The Dockerfile syntax doesn't allow environment expansion in RUN, ENTRYPOINT, or CMD commands. Instead, these commands have two forms.
Exec form requires you to format the command as a JSON array, and doesn't do any processing on what you give it; it runs the command with an exact set of shell words and the exact strings in the command. Shell form doesn't have any special syntax, but wraps the command in sh -c, and that shell handles all of the normal things you'd expect a shell to do.
Using RUN as an example:
# These are the same:
RUN ["ls", "-la", "some directory"]
RUN ls -la 'some directory'
# These are the same (and print a dollar sign):
RUN ["echo", "$FOO"]
RUN echo \$FOO
# These are the same (and a shell does variable expansion):
RUN echo $FOO
RUN ["/bin/sh", "-c", "echo $FOO"]
If you have both ENTRYPOINT and CMD this expansion happens separately for each half. This is where the split you have causes trouble: none of these options will work:
# Docker doesn't expand variables at all in exec form
ENTRYPOINT ["python3"]
CMD ["/usr/src/app/${CODEPATH}"]
# ["python3", "/usr/src/app/${CODEPATH}"] with no expansion
# The "sh -c" wrapper gets interpreted as an argument to Python
ENTRYPOINT ["python3"]
CMD /usr/src/app/${CODEPATH}
# ["python3", "/bin/sh", "-c", "/usr/src/app/${CODEPATH}"]
# "sh -c" only takes one argument and ignores the rest
ENTRYPOINT python3
CMD ["/usr/src/app/${CODEPATH}"]
# ["/bin/sh", "-c", "python3", ...]
The only real effect of this ENTRYPOINT/CMD split is to make a container that can only run Python scripts, without special configuration (an awkward docker run --entrypoint option); you're still providing most of the command line in CMD, but not all of it. I tend to recommend that the whole command go in CMD, and you reserve ENTRYPOINT for a couple of more specialized uses; there is also a pattern of putting the complete command in ENTRYPOINT and trying to use the CMD part to pass it options. Either way, things will work better if you put the whole command in one directive or the other.
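Putting that together, a minimal sketch of the single-CMD variant using the names from the question:
# Dockerfile
FROM python:3
WORKDIR /usr/src/app
ENV CODEPATH=default_value
# shell form: /bin/sh expands ${CODEPATH} when the container starts
CMD python3 /usr/src/app/${CODEPATH}

# override the variable at run time
docker run -e CODEPATH="subfolder/testmypython.py" \
    -v /opt/testuser/pythoncode/:/usr/src/app/ \
    -t -d python-image:latest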
tl;dr: how do I do file I/O + argument-passing with docker? Or should I give up on trying to use containers like scripts?
I am trying to learn docker and I am having a hard time with some minimal examples of common I/O and argument-passing situations. I have looked through a lot of StackOverflow content such as here as well as Docker documentation, but it seems like this is so simple-minded that no one bothered to answer it. The closest is here, but the answers are not helpful and mostly seem to say "don't do this with Docker". But people seem to talk about containers as if they can do this kind of thing in standalone applications.
Briefly, it seems like in Docker all I/O paths need to be hard-coded, but I want to be able to have these paths be flexible because I want to use the container as flexibly as I could a script.
In some cases people have solved this by leaving the container idling and then passing arguments into it (e.g. here or here) but that seems rather convoluted for a simple purpose.
I am not looking for a way to do this using venvs/conda whatever, I would like to see if it is possible using Docker.
Say I have a simple Python script called test.py:
#!/usr/bin/env python3
import argparse


def parse_args():
    '''Parse CLI arguments

    Returns:
        dict: CLI arguments
    '''
    parser = argparse.ArgumentParser(description='Parse arguments for test')
    parser.add_argument('--out_file', '-o', required=True, type=str, help='output file')
    parser.add_argument('--in_file', '-i', required=True, type=str, help='input file')
    args = parser.parse_args()
    return vars(args)


args = parse_args()

with open(args["in_file"]) as input_handle:
    print(input_handle.readline())

with open(args["out_file"], "w") as output_handle:
    output_handle.write("i wrote to a file")
Which natively in Python I can run on some input files:
% cat ../input.txt
i am an input file
% python test.py -i ../input.txt -o output.txt
i am an input file
% cat output.txt
i wrote to a file%
Let's say that for whatever reason this script needs to be dockerized while preserving the way arguments/files are passed so that people can run it without docker. I can write a very simple-minded Dockerfile:
FROM continuumio/miniconda3
COPY . .
ENTRYPOINT ["python", "test.py"]
and this will accept the arguments, but it can't access the input file; even if it did finish, I couldn't access the output:
% docker build .
Sending build context to Docker daemon 5.632kB
Step 1/3 : FROM continuumio/miniconda3
---> 52daacd3dd5d
Step 2/3 : COPY . .
---> 2e8f439e6766
Step 3/3 : ENTRYPOINT ["python", "test.py"]
---> Running in 788c40568687
Removing intermediate container 788c40568687
---> 15e93a7e47ed
Successfully built 15e93a7e47ed
% docker run 15e93a7e47ed -i ../input.txt -o output.txt
Traceback (most recent call last):
File "test.py", line 19, in <module>
with open(args["in_file"]) as input_handle:
FileNotFoundError: [Errno 2] No such file or directory: '../input.txt'
I can then attempt to mount the input file's directory as the /inputs volume, which gets me most of the way there (though it's irritating to pass 2 arguments for 1 file), but this doesn't seem to work:
docker run --volume /path/to/input_dir/:/inputs 15e93a7e47ed -i input.txt -o output.txt
Traceback (most recent call last):
File "test.py", line 19, in <module>
with open(args["in_file"]) as input_handle:
FileNotFoundError: [Errno 2] No such file or directory: 'input.txt'
I am clearly not understanding something about how volumes are mounted here (probably setting WORKDIR would do a lot of this work), but even if I can mount the volume, it is not at all clear how to get the outputs onto the mounted volume so they can be accessed from outside the container. There are some manual solutions to this using docker cp but the whole point is to be somewhat automated.
It seems that string manipulation of the ENTRYPOINT or CMD within the Dockerfile is not possible. It seems that approaches like this are not feasible:
ENTRYPOINT ["python", "test.py", "-i data/{i_arg}", "-o data/{o_arg}"]
Where I could just write a file to some variable filename on a mounted volume /data/ that I can substitute in at run-time.
If you really want to run this script in Docker, a minimum set of options that are pretty much always required are:
# sudo: needed since you can bind-mount an arbitrary host directory
# --rm: clean up the container when done
# -it: some things depend on having a tty as stdout
# -u: use the host uid/gid
# -v: mount the current directory into the container
# -w: set the working directory in the container
sudo docker run \
  --rm \
  -it \
  -u "$(id -u):$(id -g)" \
  -v "$PWD:$PWD" \
  -w "$PWD" \
  image-name \
  -i input.txt -o output.txt   # .. won't work here
As the last comment notes, this makes the current directory accessible to the container on the same path, but if the file you want to access is in the parent directory, it can't reach there.
Fundamentally, a Docker container is intended to be fairly isolated from the host system. A container can't normally access host files, or host devices, or see the host uid-to-name mappings. That isolation leads to many of the things you note: since a container is already isolated, you don't need a virtual environment for additional isolation; since a container is isolated, /input is a much easier directory name to remember than /home/docker/src/my-project/data/input.
Since a container is isolated from the host, any host files that need to be accessed – either inputs or outputs – need to be bind-mounted into the container. In my example I bind-mount the current directory. In your example where you have separate /input and /output container directories, both need to be bind-mounted into the container.
There's not really a way to make this easier and still use Docker; running processes on host data isn't what it's designed for. All of your examples are in Python, and Linux and macOS systems generally come with Python pre-installed, so you might find it much more straightforward to run the script, possibly in a virtual environment:
python3 -m venv venv       # once only
./venv/bin/pip install .   # once only
./venv/bin/the_script -i ../input.txt -o output.txt
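If the containerized route is still preferred, the long command can be written down once in a wrapper (a sketch; the run.sh name and the image-name placeholder are assumptions) so the container behaves more like a local script:
#!/bin/sh
# run.sh: mount the current directory and forward all arguments
exec docker run --rm -it \
    -u "$(id -u):$(id -g)" \
    -v "$PWD:$PWD" -w "$PWD" \
    image-name "$@"
./run.sh -i input.txt -o output.txt then works for any files under the current directory, though paths like ../input.txt still fall outside the mount.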
I can run a bash script local to my docker client (not local to the docker host or targeted container), without using volumes or copying the script to the container:
docker run debian bash -c "`cat script.sh`"
Q1 How do I do the equivalent on a django container? The following have not worked, but may help demonstrate what I'm asking for (the bash script printfs the python script lines with the expanded args):
docker run django shell < `cat script.py`
cat script.py | docker run django shell
Q2 How do I pass arguments to script.py when it is passed to a dockerized manage.py? Again, examples of what does not work (for me):
./script.sh arg1 arg2 | docker run django shell
docker run django shell < echo "$(./script.sh arg1 arg2)"
I think the best way for you is to use a custom Dockerfile that uses the COPY or ADD command to move whatever scripts you need into the container.
As for passing arguments, you can use the ENTRYPOINT command in your image, like the example below (in exec form, so that arguments given to docker run are appended):
ENTRYPOINT ["django", "shell", "/home/script.sh"]
Then you can use docker run <image> arg1 arg2 to pass the arguments.
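For completeness, a minimal self-contained sketch of that approach (shown with plain python rather than a Django manage.py shell, and the image name myapp is an assumption):
# Dockerfile
FROM python:3
COPY script.py /home/script.py
# exec form: docker run arguments are appended to this list
ENTRYPOINT ["python", "/home/script.py"]

docker build -t myapp .
docker run myapp arg1 arg2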
This link covers passing command-line arguments to Python: http://www.tutorialspoint.com/python/python_command_line_arguments.htm
e.g.: python script.py -param1
If the script is already available inside the image, you can trigger it from the Dockerfile (passing parameters):
RUN /script.py -param1 <value>
Extra:
Having said that, it is always difficult to change the Dockerfile if there are parameters that need to change frequently. Hence, a small shell script can be written as a wrapper around the Dockerfile, like this:
Dockerwrapper.sh
# pass parameters to the Dockerfile as build arguments
docker build --build-arg param1="$1" --tag <name> .
Dockerfile
ARG param1
RUN python script.py -param1 $param1
If the script is not present inside the image, you can copy it inside and then delete it using the COPY and RUN commands...
(Reason: since Docker is an isolated environment, running from outside is not possible (I guess).)
Hope that answered your question.
All the best