I'm trying to port to Python an ECS services deployment that is currently done with a bunch of bash scripts containing commands like the following:
ecs-cli compose -f foo.yml -p foo --cluster bar --ecs-params "dir/ecs-params.yml" service up
I thought the easiest/fastest way would be to use boto3 (which I already use extensively elsewhere, so it's a safe spot), but I couldn't work out from the documentation which calls would be equivalent to the command above.
Thanks in advance.
UPDATE: this is the content of foo.yml:
version: '3'
services:
  my-service:
    image: ecr-image:version
    env_file:
      - ./some_envs.env
      - ./more_envs.env
    command: python3 src/main.py param1 param2
    logging:
      driver: awslogs
      options:
        awslogs-group: /my-service-log-group
        awslogs-region: my-region
        awslogs-stream-prefix: my-prefix
UPDATE2: this is the content of dir/ecs-params.yml:
version: 1
task_definition:
  task_role_arn: my-role
  services:
    my-service:
      cpu_shares: my-cpu-shares
      mem_reservation: my-mem-reservation
The ecs-cli is a high-level construct that creates a workflow wrapping many lower-level API calls. It's NOT the same thing, but you can think of the ecs-cli compose up command as the trigger to deploy what's included in your foo.yml file. Based on what's in your foo.yml file you can walk backwards and try to map it to single atomic ECS API calls.
None of this answers your question but, for background, the ecs-cli is no longer what we suggest using for deploying on ECS. Its evolution is Copilot (if you are not starting from a docker compose story) or the new docker compose integration with ECS (if docker compose is your jam).
If you can post the content of your foo.yml file, I can take a stab at how many lower-level API calls you'd need to make to do the same (or suggest some other alternatives).
[UPDATE]
Based on the content of your two files you could try this one docker compose file:
services:
  my-service:
    image: ecr-image:version
    env_file:
      - ./some_envs.env
      - ./more_envs.env
    x-aws-policies:
      - <my-role>
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 2048M
Some of the ECS params are interpreted from the compose spec (e.g. resource limits). Some others do not have a specific compose-to-ECS mapping, so they are managed through x-aws extensions (e.g. the IAM role). Please note that compose only deploys to Fargate, so the CPU shares do not make much sense and you'd need to use limits (to pick the right Fargate task size). As a reminder, this is an alternative CLI way to deploy the service to ECS, but it does not solve for how you translate ALL API calls to boto3.
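For completeness, since the original question was about boto3: below is a rough, hedged sketch of the two lower-level calls that ecs-cli compose service up roughly boils down to for this foo.yml / ecs-params.yml. The family/service names, the expanded environment variables and the concrete cpu/memory numbers are assumptions, and a real deployment would also deal with networking, desired count, deregistration, etc.

import boto3

ecs = boto3.client("ecs")

task_def = ecs.register_task_definition(
    family="foo",                       # assumed task definition family
    taskRoleArn="my-role",
    containerDefinitions=[{
        "name": "my-service",
        "image": "ecr-image:version",
        "command": ["python3", "src/main.py", "param1", "param2"],
        # ecs-cli expands env_file entries itself; with boto3 you expand them yourself.
        "environment": [{"name": "SOME_VAR", "value": "some-value"}],
        "cpu": 512,                     # cpu_shares (example value)
        "memoryReservation": 1024,      # mem_reservation (example value)
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/my-service-log-group",
                "awslogs-region": "my-region",
                "awslogs-stream-prefix": "my-prefix",
            },
        },
    }],
)

ecs.create_service(                     # or update_service for an existing service
    cluster="bar",
    serviceName="foo",                  # assumed service name
    taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
    desiredCount=1,
)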
Related
I have been assigned a task to develop a solution that shuts down all microservice processes running on a given EC2 instance in parallel, then shuts down the EC2 instance itself, and does this on a set of EC2 instances in parallel. It is supposed to take input from a YAML configuration file (let's call it the parent) similar to the following that identifies the mount points of the microservices:
fabric: usprod1
sequence:
  stateless:
    - admin-portal
    - dashboard
    - haraka
    - vm-prometheus
    - watchtower-server
    - web-analytics-service
  dbclusters:
    - kafka
    - druid
    - rabbitmq
  zkclusters:
    - zookeeper
  shared:
    - eureka
  bootstrap:
    - consul
    - census
My solution is supposed to create "child" SSM documents that correspond to each mount point within a service group, where there might be multiple EC2 instances associated with each mount point. I've reviewed the following web pages, but they don't give me any insight into how I'm supposed to use the parent YAML file to generate the children:
How do I pass multiple parameters to AWS SSM send_command with Boto3
https://docs.aws.amazon.com/systems-manager/latest/userguide/create-ssm-document-api.html
This case can be tackled using a few approaches, but I would suggest doing some reading on Lambda or Step Functions first; then you should be good to go.
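For the part about turning the parent file into children, that bit is mostly plain Python regardless of whether it ends up in a Lambda or a Step Functions task: parse the YAML, then call the SSM CreateDocument API once per mount point. A rough sketch (the file name, the child document naming scheme and the systemctl-based shutdown command are all assumptions):

import json
import boto3
import yaml  # PyYAML

ssm = boto3.client("ssm")

with open("parent.yml") as f:           # assumed file name for the parent config
    parent = yaml.safe_load(f)

for group, mount_points in parent["sequence"].items():
    for mount_point in mount_points:
        content = {
            "schemaVersion": "2.2",
            "description": f"Shut down {mount_point} ({group})",
            "mainSteps": [{
                "action": "aws:runShellScript",
                "name": "shutdownService",
                "inputs": {"runCommand": [f"systemctl stop {mount_point}"]},  # assumed shutdown mechanism
            }],
        }
        ssm.create_document(
            Content=json.dumps(content),
            Name=f"shutdown-{mount_point}",   # assumed child document naming scheme
            DocumentType="Command",
        )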
I have a DAG in Airflow that uses the KubernetesPodOperator, and I am trying to get some files that are generated by the container running in the pod back to the Airflow host. For development my host is a Docker container running an Airflow image with a docker-desktop K8s cluster, and for production I am using an AWS EC2 box with EKS.
volume_mount = VolumeMount('dbt-home',
                           mount_path=<CONTAINER_DIR>,
                           sub_path=None,
                           read_only=False)

volume_config = {
    'hostPath':
        {'path': <HOST_DIR>, 'type': 'DirectoryOrCreate'}
}
volume = Volume(name="dbt-home", configs=volume_config)

dbt_run = KubernetesPodOperator(
    namespace='default',
    image=<MY_IMAGE>,
    cmds=["bash", "-cx"],
    arguments=[command],
    env_vars=MY_ENVIRONMENT,
    volumes=[volume],
    volume_mounts=[volume_mount],
    name="test-run",
    task_id="test-run-task",
    config_file=config_file,
    get_logs=True,
    reattach_on_restart=True,
    dag=dag
)
I tried using the hostPath type for the volume, but I think that refers to the host of the pod. I looked in the Kubernetes documentation around volumes, where I found the emptyDir one, which didn't work out either.
Based on your comment, you are asking how one task run in a pod can complete and write logs to a location that another task run in a pod can read when it starts. It seems like you could do a few things.
You could just have the task that starts get the logs of the previous pod that completed, either via kubectl get logs (i.e. put kubectl into your task image and give its service account permission to get the logs of pods in that namespace) or via the Kubernetes Python API (see the sketch after these options).
You could mount a PVC into your initial Task at a certain location and write the logs there; when it completes, you can mount that same PVC into your next Task and it can read the logs from that location. You could use EBS if it will only be mounted into one pod at a time, or NFS if it will be mounted into many pods at a time. NFS would probably make sense so that you could share your logs across many Tasks in pods at once.
You can ship your logs to CloudWatch via Fluentd. Your task could then query CloudWatch for the previous task's logs. I think shipping your logs to CloudWatch is good practice anyway, so you may as well do that.
I am not sure if you are looking for a more Airflow-native way of doing this, but those are ideas that come to mind that would solve your problem.
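If you go the Kubernetes Python API route from the first option, a minimal sketch could look like this (pod and namespace names are placeholders, and it assumes the running pod's service account is allowed to read pod logs in that namespace):

from kubernetes import client, config

# Inside a pod use the in-cluster config; outside the cluster, config.load_kube_config().
config.load_incluster_config()
v1 = client.CoreV1Api()

previous_logs = v1.read_namespaced_pod_log(
    name="previous-task-pod",   # placeholder: the completed pod's name
    namespace="default",
)
print(previous_logs)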
I know this question has been asked in various ways already, but so far none of the existing answers seem to work as they all reference docker-compose which I'm already using.
I'm trying to start a multi-container service (locally for now). One is a web frontend container running flask and exposing port 5000 (labeled 'web_page' in my docker-compose file). The other container is a text generation model (labeled "model" in my docker-compose file).
Here is my docker-compose.yml file:
version: '3'
services:
  web_page:
    build: ./web_app
    ports:
      - "5000:5000"
  model:
    build: ./gpt-2-cloud-run
    ports:
      - "8080:8080"
After I run docker-compose up and use a browser (or Postman) to go to 0.0.0.0:5000 or 0.0.0.0:8080, I get back a response and it shows exactly what I expect. So both services are up and running and responding on the correct IP/port. But when I click "submit" on the web_page to send the request to the 'model' I get a connection error, even though both IPs/ports respond if I test them.
If I run the 'model' container as a standalone container and start up the web_page app NOT in a container, it works fine. When I put BOTH in containers, the web_page immediately gives me
requests.exceptions.ConnectionError
Within the web_page.py code is:
requests.post('http://0.0.0.0:8080',json={'length': 100, 'temperature': 0.85,"prefix":question})
which goes out to that IP with the payload and receives the response back. Again, this works fine when the 'model' is running in a container and has port 8080:8080 mapped. When the web_page is running in the container it can't reach the model endpoint for some reason. Why would this be and how could I fix it?
It looks like you're using the default network that gets spun up by docker-compose (it'll be named something like <directory-name>_default). If you switch the base URL for your requests to the host name of the backend container (model rather than 0.0.0.0), your requests should succeed. Environment variables are good here (see the sketch below).
By the way, in case you weren't aware, you don't need to expose the backend application if you only ever intend for it to be accessed by the frontend one. They both sit on the same docker network, so they'll be able to talk to one another.
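A minimal sketch of that change on the web_page side, assuming a hypothetical MODEL_URL environment variable and falling back to the compose service name (which docker's embedded DNS resolves inside the network):

import os
import requests

# MODEL_URL is an assumed variable name; the default uses the compose service name.
MODEL_URL = os.environ.get("MODEL_URL", "http://model:8080")

question = "example prompt"  # placeholder for the real user input

response = requests.post(
    MODEL_URL,
    json={"length": 100, "temperature": 0.85, "prefix": question},
)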
Elements of the other answers are correct, but there are a couple of points that were missing or assumed in the other answers but not made explicit.
According to the Docker documentation, the default bridge network will NOT provide DNS resolution by name; containers on it can only reach each other by IP address: https://docs.docker.com/network/bridge/#differences-between-user-defined-bridges-and-the-default-bridge
So, my final compose file was:
version: '3'
services:
  web_page:
    build: ./web_app
    ports:
      - "5000:5000"
    networks:
      - bot-net
    depends_on:
      - model
  model:
    image: sports_int_bot_api_model
    networks:
      - bot-net
networks:
  bot-net:
    external: true
This was after I created a 'bot-net' network first on the CLI (docker network create bot-net). I don't know that that is necessarily what has to be done; perhaps you can create a non-default bridge network in the docker-compose file as well. But it does seem that you cannot use the default bridge network and resolve a name (per the docs).
The final endpoint that I pointed to is:
'http://model:8080'
I suppose this was alluded to in the other answers, but they omitted the need to include the 'http' part. This also isn't shown in the docs, where the Docker example uses the service name with a different protocol, as in postgres://db:5432
https://docs.docker.com/compose/networking/
I have a python program that can control a Kubernetes cluster (from outside). During that program execution, it obtains a byte array.
I have a full pod spec ready to be created.
I need to modify the pod spec (adding an init container) so that when the main container starts, there is a file somewhere with those exact bytes.
What's the easiest way to do that?
If I understand your question correctly, you want to run a Python script that will extract or derive a byte array from somewhere before your Pod starts, and write this byte array to a file for your actual application to read from when running inside the Pod.
I can see 2 ways to achieve this:
Slightly modify your Docker image to run your script as an entrypoint and then run your application (command: & args: in your Pod spec). You would ship both together and won't need an initContainer.
Or, as you were leaning towards: use a combination of an initContainer and Volumes.
For the latter:
template:
  spec:
    volumes:
      - name: byte-array
        emptyDir: {}
    initContainers:
      - name: byte-array-generator
        image: your/init-image:latest
        command: ["/usr/bin/python", "byte_array_generator.py"]
        volumeMounts:
          - mountPath: /my/byte-array/
            name: byte-array
    containers:
      - name: application
        image: your/actual-app:latest
        volumeMounts:
          - name: byte-array
            mountPath: /byte-array/
I condensed all 3 parts:
1 empty volume definition used to pass over the file
1 initContainer with your script generating the byte array and writing it to disk in let's say /my/byte-array/bytearray.bin (where the volume has been mounted)
1 actual container running your application and reading the byte array from /byte-array/bytearray.bin (where the volume has been mounted)
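For illustration, the byte_array_generator.py run by that initContainer could be as small as this (the byte derivation itself is a placeholder, and the path matches the volume mount above):

# byte_array_generator.py
byte_array = b"\x00\x01\x02"            # placeholder for however the real bytes are obtained

with open("/my/byte-array/bytearray.bin", "wb") as f:
    f.write(byte_array)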
One important thing you also need to take into consideration: if you mount your volume on a pre-existing folder with actual files, they will all be "overwritten" by your volume. The source will take the destination's place.
You might be able to prevent that using subPath, but I never tried it this way; I only know it works if you mount ConfigMaps as volumes.
Edit: the answer to your comment is too long for a comment
Outside the container or outside the Kubernetes cluster? Maybe there is a misconception, so just in case: an initContainer doesn't have to use the same image as your Pod's other containers. You could even load your script as a ConfigMap and mount it into the initContainer, using a Python base image to run it if you want...
But if your script really has to run outside of the cluster and send a file enabling your Pod to start, I'd suggest you add logic to your byte generation that outputs it to a file named after, for example, the Pod hostname (from the Kubernetes API), and scp it to the Kubernetes node running it (pulled from the Kubernetes API too) into a known destination. Just define a folder on each of your nodes, like /var/data/your_app/, and mount it on all your pods.
volumes:
  - hostPath:
      path: /var/data/your_app
      type: Directory
    name: bite-arrays
then mount bite-arrays wherever you want in whatever container needs to read it by reusing its hostname (to allow you to scale if necessary).
Since you said your script is controlling the cluster, I assume it's already talking to the Kubernetes API... You might also want to add some logic to clean up leftovers...
Or maybe we got it all wrong and your script is somehow also generating and applying the Pod spec on the fly, in which case this could be solved by an environment variable or a ConfigMap shipped alongside...
Pod spec:
volumes:
  - name: byte-array
    configMap:
      name: your-app-bytes

volumeMounts:
  - name: byte-array
    mountPath: /data/your-app/byte-array
    readOnly: true
    subPath: byte-array
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: your-app-bytes
data:
  byte-array: |-
    WHATEVERBYTESAREGENERATEDHERE
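If your script already talks to the cluster with the official kubernetes Python client, creating that ConfigMap directly from the byte array could look roughly like this (the namespace is assumed; binary_data is used here because arbitrary bytes may not be valid UTF-8):

import base64
from kubernetes import client, config

config.load_kube_config()               # the script runs outside the cluster
v1 = client.CoreV1Api()

byte_array = b"\x00\x01\x02"            # whatever your program obtained

config_map = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="your-app-bytes"),
    binary_data={"byte-array": base64.b64encode(byte_array).decode("ascii")},
)
v1.create_namespaced_config_map(namespace="default", body=config_map)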
This is more of an opinion question/answer.
If it happens that your Python script generates specific bytes, I would go with an initContainer and a volume. Something like this:
initContainers:
  - name: init-container
    image: container:version
    command: [ 'your-python.py' ]
    volumeMounts:
      - name: conf
        mountPath: /mnt/conf.d
containers:
  - name: app-container
    image: container:version
    command: [ 'your-actual-app' ]
    volumeMounts:
      - name: conf
        mountPath: /mnt/conf.d
If your bytes are straight UTF-8 characters, for example, it's easier to just use a ConfigMap.
I want to use Celery to implement a task queue to perform long(ish) running tasks like interacting with external APIs (e.g. Twilio for SMS sending). However, I use different API credentials in production and in development.
I can't figure out how to statically configure Celery (i.e. from the command line) to pass in the appropriate API credentials. Relatedly, how does my application code (which launches Celery tasks) specify which Celery queue to talk to if there are both development and production queues?
Thanks for any help you can offer.
Avi
EDIT: additional bonus for a working example of how to use the --config option of celery.
The way that I do it is using an environment variable. As a simple example...
# By convention, my configuration files are in a "configs/XXX.ini" file, with
# XXX being the configuration name (e.g., "staging.ini")
config_filename = os.path.join('configs', os.environ['CELERY_CONFIG'] + '.ini')
configuration = read_config_file(config_filename)

# Now you can create the Celery object using your configuration...
celery = Celery('mymodule', broker=configuration['CELERY_BROKER_URL'])

@celery.task
def add_stuff(x, y):
    ...
You end up running from the command line like so...
export CELERY_CONFIG=staging
celery -A mymodule worker
This question has an example of doing something like this, but they say "how can I do this in a way that is not so ugly?" As far as I'm concerned, this is quite acceptable, and not "ugly" at all.
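A related pattern, in case it helps with the --config bonus: this isn't the --config command-line option itself, but Celery can load a whole configuration module whose name comes from an environment variable via config_from_envvar, which keeps credentials and broker URLs off the command line. A rough sketch (the module names are assumptions):

from celery import Celery

celery = Celery('mymodule')

# Reads the module name from the CELERY_CONFIG_MODULE environment variable,
# e.g. export CELERY_CONFIG_MODULE=celeryconfig_staging
# where celeryconfig_staging.py contains settings such as:
#   broker_url = 'redis://staging-broker:6379/0'
celery.config_from_envvar('CELERY_CONFIG_MODULE')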
According to the twelve-factor app methodology, you should use environment variables instead of command-line parameters.
This is especially true if you are using sensitive information like access credentials, because command-line arguments are visible in the ps output. The other idea (storing credentials in config files) is far from ideal because you should avoid storing sensitive information in the VCS.
That is why many container services and PaaS providers favor this approach: easier instrumentation and automated deployments.
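In practice that boils down to reading everything sensitive at application startup, e.g. (the variable names here are only examples):

import os

# Credentials come from the environment, so nothing sensitive shows up in `ps`
# output or in version control.
TWILIO_ACCOUNT_SID = os.environ['TWILIO_ACCOUNT_SID']
TWILIO_AUTH_TOKEN = os.environ['TWILIO_AUTH_TOKEN']
CELERY_BROKER_URL = os.environ.get('CELERY_BROKER_URL', 'redis://localhost:6379/0')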
You may want to take a look at Python Deployment Anti-patterns.