I am looking to write a docker-compose file to run Airflow locally in a production-like environment.
For the older Airflow v1.10.14 the docker-compose below works fine, but the same docker-compose does not work for the latest stable version: the Airflow scheduler and webserver keep failing, with an error message along the lines of being unable to create the audit tables.
docker-compose.yaml:
version: "2.1"
services:
postgres:
image: postgres:12
environment:
- POSTGRES_USER=airflow
- POSTGRES_PASSWORD=airflow
- POSTGRES_DB=airflow
ports:
- "5433:5432"
scheduler:
image: apache/airflow:1.10.14
restart: always
depends_on:
- postgres
- webserver
env_file:
- .env
ports:
- "8793:8793"
volumes:
- ./dags:/opt/airflow/dags
- ./airflow-logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
command: scheduler
healthcheck:
test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
interval: 30s
timeout: 30s
retries: 5
webserver:
image: apache/airflow:1.10.14
hostname: webserver
restart: always
depends_on:
- postgres
env_file:
- .env
volumes:
- ./dags:/opt/airflow/dags
- ./scripts:/opt/airflow/scripts
- ./airflow-logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
ports:
- "8080:8080"
entrypoint: ./scripts/airflow-entrypoint.sh
healthcheck:
test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
interval: 30s
timeout: 30s
retries: 5
.env:
AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=False
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgres+psycopg2://airflow:airflow@postgres:5432/airflow
AIRFLOW__CORE__FERNET_KEY=81HqDtbqAywKSOumSha3BhWNOdQ26slT6K0YaZeZyPs=
AIRFLOW_CONN_METADATA_DB=postgres+psycopg2://airflow:airflow@postgres:5432/airflow
AIRFLOW_VAR__METADATA_DB_SCHEMA=airflow
AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC=10
./scripts/airflow-entrypoint.sh:
#!/usr/bin/env bash
airflow upgradedb
airflow webserver
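Note that airflow upgradedb is the 1.10.x spelling; Airflow 2.0 replaced it with airflow db upgrade. A minimal 2.x-compatible entrypoint would look roughly like this (a sketch, not taken from the original post):

#!/usr/bin/env bash
# Airflow 2.x: "airflow upgradedb" was replaced by "airflow db upgrade"
airflow db upgrade
exec airflow webserver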
There is an official docker-compose.yml; see the official Airflow documentation.
You will also find more information about Docker and Kubernetes deployments in the official docs.
The official Docker image for Airflow 2.0 is available now. Here is the list of 2.0.0 Docker images.
Example Docker image:
# example:
apache/airflow:2.0.0-python3.8
The below docker-compose handles:
- airflow db schema upgrade
- admin user creation
Here is a version that I use:
version: "3.7"
# Common sections extracted out
x-airflow-common:
&airflow-common
image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.0-python3.8}
environment:
&airflow-common-env
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow#postgres/airflow
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow#postgres/airflow
AIRFLOW__CELERY__BROKER_URL: redis://:#redis:6379/0
AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
# Change log level when needed
AIRFLOW__LOGGING__LOGGING_LEVEL: 'INFO'
volumes:
- ./airflow/dags:/opt/airflow/dags
depends_on:
- postgres
- redis
networks:
- default_net
volumes:
postgres-db-volume:
services:
postgres:
image: postgres:13
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: airflow
POSTGRES_DB: airflow
volumes:
- postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow"]
interval: 5s
retries: 5
restart: always
networks:
- default_net
redis:
image: redis:6.0.10
ports:
- 6379:6379
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 30s
retries: 50
restart: always
networks:
- default_net
airflow-webserver:
<<: *airflow-common
command: webserver
ports:
- 8080:8080
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 10s
timeout: 10s
retries: 5
restart: always
environment:
<<: *airflow-common-env
# It is sufficient to run db upgrade and create admin user only from webserver service
_AIRFLOW_WWW_USER_PASSWORD: 'yourAdminPass'
_AIRFLOW_DB_UPGRADE: 'true'
_AIRFLOW_WWW_USER_CREATE: 'true'
airflow-scheduler:
<<: *airflow-common
command: scheduler
restart: always
airflow-worker:
<<: *airflow-common
command: celery worker
restart: always
networks:
default_net:
attachable: true
Also update your docker-compose file format version to version: "3.7".
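Under the hood, the _AIRFLOW_DB_UPGRADE and _AIRFLOW_WWW_USER_CREATE flags just ask the image's entrypoint to run the schema upgrade and admin-user creation before starting the webserver. A rough manual equivalent with the 2.0 CLI (a sketch; the user details below are illustrative values, not defaults from the image) would be:

# what the _AIRFLOW_* flags trigger, expressed as plain CLI calls
airflow db upgrade
airflow users create \
  --username admin \
  --password yourAdminPass \
  --firstname Admin --lastname User \
  --role Admin --email admin@example.com

After docker-compose up -d, the webserver is reachable at http://localhost:8080 and you can log in with the password set in _AIRFLOW_WWW_USER_PASSWORD.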
Related
I have the following docker-compose setup in order to store logs from a FastAPI application.
apm-server:
  image: docker.elastic.co/apm/apm-server:7.13.0
  cap_add: ["CHOWN", "DAC_OVERRIDE", "SETGID", "SETUID"]
  cap_drop: ["ALL"]
  networks:
    - es-net
  ports:
    - 8200:8200
  command: >
    apm-server -e
    -E apm-server.rum.enabled=true
    -E setup.kibana.host=kibana:5601
    -E setup.template.settings.index.number_of_replicas=0
    -E apm-server.kibana.enabled=true
    -E apm-server.kibana.host=kibana:5601
    -E output.elasticsearch.hosts=["elasticsearch:9200"]
  healthcheck:
    interval: 10s
    retries: 12
    test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:8200/
elasticsearch:
  image: docker.elastic.co/elasticsearch/elasticsearch:7.4.0
  container_name: elasticsearch
  environment:
    - xpack.security.enabled=false
    - discovery.type=single-node
  networks:
    - es-net
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  cap_add:
    - IPC_LOCK
  volumes:
    - elasticsearch-data:/usr/share/elasticsearch/data
  ports:
    - 9200:9200
    - 9300:9300
kibana:
  container_name: kibana
  image: docker.elastic.co/kibana/kibana:7.4.0
  networks:
    - es-net
  environment:
    - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
  ports:
    - 5601:5601
  depends_on:
    - elasticsearch
volumes:
  elasticsearch-data:
    driver: local
networks:
  es-net:
    driver: bridge
The problem is that Kibana doesn't recognise the APM service.
The FastAPI app is configured with the following:
from elasticapm.contrib.starlette import make_apm_client, ElasticAPM
from fastapi import FastAPI

apm = make_apm_client({
    'SERVICE_NAME': 'service'
})
app = FastAPI()
app.add_middleware(ElasticAPM, client=apm)
Any idea how I can solve this?
Thanks
The problem was solved by changing the apm-server image to 7.4.0, matching the Elasticsearch and Kibana versions.
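In compose terms that just means pinning the apm-server image to the same release as Elasticsearch and Kibana, roughly:

apm-server:
  image: docker.elastic.co/apm/apm-server:7.4.0  # match the elasticsearch/kibana version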
I am running a dockerized Django app using the following docker-compose file:
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.prod
    command: gunicorn PriceOptimization.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - static_volume:/home/app/web/staticfiles
    networks:
      - dbnet
    ports:
      - "8000:8000"
    environment:
      aws_access_key_id: ${aws_access_key_id}
  redis:
    restart: always
    image: redis:latest
    networks:
      - dbnet
    ports:
      - "6379:6379"
  celery:
    restart: always
    build:
      context: .
    command: celery -A PriceOptimization worker -l info
    volumes:
      - ./PriceOptimization:/PriceOptimization
    depends_on:
      - web
      - redis
    networks:
      - dbnet
    environment:
      access_key_id: ${access_key_id}
  nginx:
    build: ./nginx
    ports:
      - "80:80"
    volumes:
      - static_volume:/home/app/web/staticfiles
    depends_on:
      - web
    networks:
      - dbnet
  database:
    image: "postgres" # use latest official postgres version
    restart: unless-stopped
    env_file:
      - ./database.env # configure postgres
    networks:
      - dbnet
    ports:
      - "5432:5432"
    volumes:
      - database-data:/var/lib/postgresql/data/ # persist data even if container shuts down
volumes:
  database-data:
  static_volume:
  media_volume:
I have added celery.py to my app, and I am building / running the docker container as follows:
docker-compose -f $HOME/PriceOpt/PriceOptimization/docker-compose.prod.yml up -d --build
Running the application in my development environment lets me check at the command line that the Celery app is connected correctly, etc. Is there a way to test whether my Celery app has started properly after the containers are built and running?
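One quick check (a sketch, assuming the service and project names from the compose file above) is to ask the running worker to answer a ping and to look at its startup log:

# a healthy worker replies with "pong"
docker-compose -f docker-compose.prod.yml exec celery celery -A PriceOptimization inspect ping

# the worker's startup log should end with a "celery@<hostname> ready." line
docker-compose -f docker-compose.prod.yml logs celery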
I am getting an error when loading Airflow with Docker. I already have the package installed in my Anaconda env. I am new to Airflow and am following this course; under the video there is also a link to GitHub with the code that I am using to recreate the task:
https://www.youtube.com/watch?v=wAyu5BN3VpY&t=717s [29:30min]
version: '3'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
  webserver:
    image: puckel/docker-airflow:1.10.1
    build:
      context: https://github.com/puckel/docker-airflow.git#1.10.1
      dockerfile: Dockerfile
      args:
        AIRFLOW_DEPS: gcp_api,s3
        PYTHON_DEPS: sqlalchemy==1.2.0
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
      - EXECUTOR=Local
      - FERNET_KEY=jsDPRErfv8Z_eVTnGfF8ywd19j4pyqE3NpdUBA_oRTo=
    volumes:
      - ./examples/gcloud-example/dags:/usr/local/airflow/dags
      # Uncomment to include custom plugins
      # - ./plugins:/usr/local/airflow/plugins
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
#!/bin/bash
docker-compose -f docker-compose-gcloud.yml up --abort-on-container-exit
I have a Docker Compose setup with 4 containers.
How do I run a Python script (python manage.py setup) in the server_1 container once postgres_1 is up, but only once? The state should be persisted somewhere, maybe via a volume.
I persist PostgreSQL data to disk via a volume.
I want to make setting up and running the software very easy, just docker-compose up. It should not matter whether this is the first run or a later run; only the first run needs the python manage.py setup invocation. Is there a nice way of doing this?
My idea was to check for the existence of a flag file in a mounted volume, but I don't know how to wait in server_1 for postgres_1 to be up.
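For the waiting part, one common approach (a sketch, assuming the postgres client tools are installed in the server image and using the qc credentials from the compose file below) is a small loop before the setup/run step:

#!/bin/sh
# block until the postgres service accepts connections, then continue
until pg_isready -h postgres -p 5432 -U qc; do
  echo "waiting for postgres..."
  sleep 2
done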
Here is my docker-compose.yml
version: '3'
services:
  server:
    build:
      context: .
      dockerfile: docker/backend/Dockerfile
    restart: always
    working_dir: /srv/scanmycode/
    entrypoint: python
    command: /srv/scanmycode/manage.py runserver
    ports:
      - 5000:5000
    volumes:
      - ./data1:/srv/scanmycode/quantifiedcode/data/
      - ./data2:/srv/scanmycode/quantifiedcode/backend/data/
    links:
      - "postgres"
  postgres:
    image: postgres:13.2
    restart: unless-stopped
    environment:
      POSTGRES_DB: qc
      POSTGRES_USER: qc
      POSTGRES_PASSWORD: qc
      PGDATA: /var/lib/postgresql/data/pgdata
    ports:
      - "5432:5432"
    volumes:
      - type: bind
        source: ./postgres-data
        target: /var/lib/postgresql/data
  worker_1:
    build:
      context: .
      dockerfile: docker/worker/Dockerfile
      args:
        - GIT_TOKEN
    hostname: worker_1
    restart: on-failure
    depends_on:
      - rabbitmq3
    working_dir: /srv/scanmycode/
    entrypoint: python
    command: /srv/scanmycode/manage.py runworker
    volumes:
      - ./data1:/srv/scanmycode/quantifiedcode/data/
      - ./data2:/srv/scanmycode/quantifiedcode/backend/data/
    links:
      - "rabbitmq3"
      - "server"
      - "postgres"
  rabbitmq3:
    container_name: "rabbitmq"
    image: rabbitmq:3.8-management-alpine
    environment:
      - RABBITMQ_DEFAULT_USER=qc
      - RABBITMQ_DEFAULT_PASS=qc
    ports:
      - 5672:5672
      - 15672:15672
    healthcheck:
      test: [ "CMD", "nc", "-z", "localhost", "5672" ]
      interval: 5s
      timeout: 15s
      retries: 1
I ended up using this:
version: '3'
services:
  server:
    build:
      context: .
      dockerfile: docker/backend/Dockerfile
    restart: always
    depends_on:
      - postgres
    working_dir: /srv/scanmycode/
    entrypoint: sh
    command: -c "if [ -f /srv/scanmycode/setup_state/setup_done ]; then python /srv/scanmycode/manage.py runserver; else python /srv/scanmycode/manage.py setup && mkdir -p /srv/scanmycode/setup_state && touch /srv/scanmycode/setup_state/setup_done; fi"
    ports:
      - 5000:5000
    volumes:
      - ./data1:/srv/scanmycode/quantifiedcode/data/
      - ./data2:/srv/scanmycode/quantifiedcode/backend/data/
      - ./setup_state:/srv/scanmycode/setup_state
    links:
      - "postgres"
  postgres:
    image: postgres:13.2
    restart: unless-stopped
    environment:
      POSTGRES_DB: qc
      POSTGRES_USER: qc
      POSTGRES_PASSWORD: qc
      PGDATA: /var/lib/postgresql/data/pgdata
    ports:
      - "5432:5432"
    volumes:
      - db-data:/var/lib/postgresql/data
  worker_1:
    build:
      context: .
      dockerfile: docker/worker/Dockerfile
    hostname: worker_1
    restart: on-failure
    depends_on:
      - rabbitmq3
      - postgres
      - server
    working_dir: /srv/scanmycode/
    entrypoint: python
    command: /srv/scanmycode/manage.py runworker
    volumes:
      - ./data1:/srv/scanmycode/quantifiedcode/data/
      - ./data2:/srv/scanmycode/quantifiedcode/backend/data/
    links:
      - "rabbitmq3"
      - "server"
      - "postgres"
  rabbitmq3:
    container_name: "rabbitmq"
    image: rabbitmq:3.8-management-alpine
    environment:
      - RABBITMQ_DEFAULT_USER=qc
      - RABBITMQ_DEFAULT_PASS=qc
    ports:
      - 5672:5672
      - 15672:15672
    healthcheck:
      test: [ "CMD", "nc", "-z", "localhost", "5672" ]
      interval: 5s
      timeout: 15s
      retries: 1
volumes:
  db-data:
    driver: local
I have a Python application using Django and Celery, and I am trying to run it with Docker and docker-compose because I am also using Redis and DynamoDB.
The problem is the following:
I am not able to run both services, WSGI and Celery, because only the first command works: the Celery worker stays in the foreground, so the uwsgi command after the && is never reached.
version: '3.3'
services:
  redis:
    image: redis:3.2-alpine
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
  dynamodb:
    image: dwmkerr/dynamodb
    ports:
      - "3000:8000"
    volumes:
      - dynamodb_data:/data
  jobs:
    build:
      context: nubo-async-cfe-seces
      dockerfile: Dockerfile
    environment:
      - REDIS_HOST=redisrvi
      - PYTHONUNBUFFERED=0
      - CC_DYNAMODB_NAMESPACE=None
      - CC_DYNAMODB_ACCESS_KEY_ID=anything
      - CC_DYNAMODB_SECRET_ACCESS_KEY=anything
      - CC_DYNAMODB_HOST=dynamodb
      - CC_DYNAMODB_PORT=8000
      - CC_DYNAMODB_IS_SECURE=False
    command: >
      bash -c "celery worker -A tasks.async_service -Q dynamo-queue -E --loglevel=ERROR &&
      uwsgi --socket 0.0.0.0:8080 --protocol=http --wsgi-file nubo_async/wsgi.py"
    depends_on:
      - redis
      - dynamodb
    volumes:
      - .:/jobs
    ports:
      - "9090:8080"
volumes:
  redis_data:
  dynamodb_data:
Has anyone had the same problem?
You may refer to the docker-compose setup of the Saleor project. I would suggest letting Celery run as its own service that depends only on Redis as the broker. See the configuration of their docker-compose.yml file:
services:
  web:
    build:
      context: .
      dockerfile: ./Dockerfile
      args:
        STATIC_URL: '/static/'
    restart: unless-stopped
    networks:
      - saleor-backend-tier
    env_file: common.env
    depends_on:
      - db
      - redis
  celery:
    build:
      context: .
      dockerfile: ./Dockerfile
      args:
        STATIC_URL: '/static/'
    command: celery -A saleor worker --app=saleor.celeryconf:app --loglevel=info
    restart: unless-stopped
    networks:
      - saleor-backend-tier
    env_file: common.env
    depends_on:
      - redis
Note also that the connections from both services to Redis are configured separately via environment variables, as shown in the common.env file:
CACHE_URL=redis://redis:6379/0
CELERY_BROKER_URL=redis://redis:6379/1
Here's the docker-compose as suggested by @Satevg, running the Django and Celery applications in separate containers. Works fine!
version: '3.3'
services:
  redis:
    image: redis:3.2-alpine
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
  dynamodb:
    image: dwmkerr/dynamodb
    ports:
      - "3000:8000"
    volumes:
      - dynamodb_data:/data
  jobs:
    build:
      context: nubo-async-cfe-services
      dockerfile: Dockerfile
    environment:
      - REDIS_HOST=redis
      - PYTHONUNBUFFERED=0
      - CC_DYNAMODB_NAMESPACE=None
      - CC_DYNAMODB_ACCESS_KEY_ID=anything
      - CC_DYNAMODB_SECRET_ACCESS_KEY=anything
      - CC_DYNAMODB_HOST=dynamodb
      - CC_DYNAMODB_PORT=8000
      - CC_DYNAMODB_IS_SECURE=False
    command: bash -c "uwsgi --socket 0.0.0.0:8080 --protocol=http --wsgi-file nubo_async/wsgi.py"
    depends_on:
      - redis
      - dynamodb
    volumes:
      - .:/jobs
    ports:
      - "9090:8080"
  celery:
    build:
      context: nubo-async-cfe-services
      dockerfile: Dockerfile
    environment:
      - REDIS_HOST=redis
      - PYTHONUNBUFFERED=0
      - CC_DYNAMODB_NAMESPACE=None
      - CC_DYNAMODB_ACCESS_KEY_ID=anything
      - CC_DYNAMODB_SECRET_ACCESS_KEY=anything
      - CC_DYNAMODB_HOST=dynamodb
      - CC_DYNAMODB_PORT=8000
      - CC_DYNAMODB_IS_SECURE=False
    command: celery worker -A tasks.async_service -Q dynamo-queue -E --loglevel=ERROR
    depends_on:
      - redis
      - dynamodb
    volumes:
      - .:/jobs
volumes:
  redis_data:
  dynamodb_data:
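With the worker and the WSGI app split into separate containers, a quick sanity check (a sketch) is to bring the stack up and confirm that both stay running:

docker-compose up -d --build
docker-compose ps            # both "jobs" and "celery" should show as Up
docker-compose logs celery   # worker startup log ends with "celery@<hostname> ready."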