I recently created a Redis pod in Kubernetes, and I am having trouble writing to the Redis database. I am getting an error: can't write to a read-only replica.
When I do kubectl get pv I see the following:
local-pv1 1Gi RWO Retain Bound default/data-redis-0 local-storage 61m
local-pv2 1Gi RWO Retain Bound default/data-redis-1 local-storage 61m
local-pv3 2Gi RWO Retain Bound default/data-redis-2 local-storage 61m
Running kubectl get pods shows:
redis-0 1/1 Running 0 63m
redis-1 1/1 Running 0 63m
redis-2 1/1 Running 0 62m
redis-python-deployment-9cf76d5b5-hw2nf 1/1 Running 2 (28m ago) 28m
redis-python-deployment-9cf76d5b5-vf2f4 1/1 Running 1 (28m ago) 28m
redis-python-deployment-9cf76d5b5-wjxwn 1/1 Running 2 (28m ago) 28m
redis-python-deployment-9cf76d5b5-wlsjj 1/1 Running 0 28m
Here redis-0 is the master and redis-1 and redis-2 are the replicas.
Now I'm able to run kubectl exec -it redis-0 sh, start redis-cli, and successfully perform: xadd test * value 1
Now I'm testing pushing data to this Redis database from Python:
import os
import random

import redis


def connect_to_redis():
    hostname = os.environ.get("REDISHOST", "localhost")
    password = os.environ["REDISPASSWORD"]
    port = os.environ["REDISPORT"]
    r = redis.Redis(hostname, port=int(port), password=password)
    return r


def send_data(redis_conn: redis.Redis, max_messages: int) -> None:
    count = 0
    while count < max_messages:
        try:
            data = {"price": random.random() * 100, "volume": random.random() * 10}
            resp = redis_conn.xadd(name="omega-orderbook", fields=data)
            print(resp)
            count += 1
        except Exception as e:
            print(f"error occurred: {e}")


if __name__ == "__main__":
    r = connect_to_redis()
    send_data(r, 10)
I've called the above Python file and built it into the following image: docker build -t redis_practice
The error I'm receiving is: You can't write against a read only replica.
My Redis headless service:
# Headless service for stable DNS entries of StatefulSet members.
apiVersion: v1
kind: Service
metadata:
  name: redis-service
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  selector:
    app: redis
I do have a Redis config that I took from here: https://gist.github.com/bharathirajatut/dcebde585eba5ac8b1398b8ed653d32d
Is there any way for me to refer to redis-0 in my Python code? Currently I have it as:
redis.Redis('redis-service', port=6379, password=mysecretpass)
Thanks
Edit: OK, I managed to connect to Redis after getting the IP address of the master pod. Is there another way to connect?
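One option (a minimal sketch, assuming the StatefulSet is named redis, the headless service is redis-service, and both live in the default namespace): a headless service gives each StatefulSet pod a stable DNS name of the form <pod>.<service>.<namespace>.svc.cluster.local, so the master can be addressed by name instead of by IP:

import os

import redis

# Hypothetical: REDISHOST points at the master pod's stable DNS name rather than
# the headless service name, which resolves to master and replicas alike.
master_host = os.environ.get("REDISHOST", "redis-0.redis-service.default.svc.cluster.local")
r = redis.Redis(host=master_host, port=6379, password=os.environ["REDISPASSWORD"])
print(r.xadd(name="test", fields={"value": 1}))

Alternatively, StatefulSet pods carry the label statefulset.kubernetes.io/pod-name, so a second, non-headless Service whose selector is statefulset.kubernetes.io/pod-name: redis-0 would route traffic only to the master.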
I have the following docker-compose file:
version: '2.3'
networks:
  default: { external: true, name: $NETWORK_NAME }  # NETWORK_NAME in .env file is `uv_atp_network`.
services:
  car_parts_segmentor:
    # container_name: uv-car-parts-segmentation
    image: "uv-car-parts-segmentation:latest"
    ports:
      - "8080:8080"
    volumes:
      - ../../../../uv-car-parts-segmentation/configs:/uveye/configs
      - /isilon/:/isilon/
      # - local_data_folder:local_data_folder
    command: "--run_service rabbit"
    runtime: nvidia
    depends_on:
      rabbitmq_local:
        condition: service_started
    links:
      - rabbitmq_local
    restart: always
  rabbitmq_local:
    image: 'rabbitmq:3.6-management-alpine'
    container_name: "rabbitmq"
    ports:
      - ${RABBIT_PORT:?unspecified_rabbit_port}:5672
      - ${RABBIT_MANAGEMENT_PORT:?unspecified_rabbit_management_port}:15672
When this runs, docker ps shows
21400efd6493 uv-car-parts-segmentation:latest "python /uveye/app/m…" 5 seconds ago Up 1 second 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp joint_car_parts_segmentor_1
bf4ab8581f1f rabbitmq:3.6-management-alpine "docker-entrypoint.s…" 5 seconds ago Up 4 seconds 4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, :::5672->5672/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp, :::15672->15672/tcp rabbitmq
I want to create a connection to that rabbitmq instance. The user:pass is guest:guest.
I was unable to do it, getting only a very uninformative AMQPConnectionError in all cases:
Below code runs in another, unrelated container.
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@rabbitmq/"))
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@localhost/"))
Also tried with
$ docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' rabbitmq
172.27.0.2
and
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@172.27.0.2/"))
Also tried with
credentials = pika.credentials.PlainCredentials(
    username="guest",
    password="guest",
)
parameters = pika.ConnectionParameters(
    host=ip_address,  # tried all of the above options
    port=5672,
    credentials=credentials,
    heartbeat=10,
)
Note that the container car_parts_segmentor is able to see the container rabbitmq. Both are started by docker-compose.
My assumption is that this has to do with the uv_atp_network both containers live in: I am trying to access a container inside that network from outside the network.
Is this really the problem?
If so, how can this be achieved?
For the future - how to get more informative errors from pika?
As I suspected, the problem was that the name rabbitmq existed only in the network uv_atp_network.
The code attempting to connect to it runs inside a container of its own, which was not attached to that network.
Solution: connect the current container to the network:
import socket

import docker

client = docker.from_env()
network_name = "uv_atp_network"

# By default a container's hostname is its own container ID.
atp_container = client.containers.get(socket.gethostname())
client.networks.get(network_name).connect(container=atp_container.id)
After this, the code above from the question does work, because rabbitmq can be resolved.
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@rabbitmq/"))
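As for getting more informative errors from pika: pika logs through Python's standard logging module, so turning on DEBUG output (a minimal sketch; nothing beyond the stdlib is assumed) surfaces the connection attempts and the reason each one fails:

import logging

import pika

# pika emits detailed connection and AMQP handshake records via the stdlib logger.
logging.basicConfig(level=logging.DEBUG)

connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@rabbitmq/"))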
This question might seem like a duplicate of this.
I am trying to run an Apache Beam Python pipeline using Flink on an offline instance of Kubernetes. However, since I have user code with external dependencies, I am using the Python SDK harness as an external service, which is causing the errors described below.
The Kubernetes manifest I use to launch the Beam Python SDK:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: beam-sdk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: beam
      component: python-beam-sdk
  template:
    metadata:
      labels:
        app: beam
        component: python-beam-sdk
    spec:
      hostNetwork: True
      containers:
      - name: python-beam-sdk
        image: apachebeam/python3.7_sdk:latest
        imagePullPolicy: "Never"
        command: ["/opt/apache/beam/boot", "--worker_pool"]
        ports:
        - containerPort: 50000
          name: yay
---
apiVersion: v1
kind: Service
metadata:
  name: beam-python-service
spec:
  type: NodePort
  ports:
  - name: yay
    port: 50000
    targetPort: 50000
  selector:
    app: beam
    component: python-beam-sdk
When I launch my pipeline with the following options:
beam_options = PipelineOptions([
    "--runner=FlinkRunner",
    "--flink_version=1.9",
    "--flink_master=10.101.28.28:8081",
    "--environment_type=EXTERNAL",
    "--environment_config=10.97.176.105:50000",
    "--setup_file=./setup.py",
])
I get the following error message (within the Python SDK service):
NAME READY STATUS RESTARTS AGE
beam-sdk-666779599c-w65g5 1/1 Running 1 4d20h
flink-jobmanager-74d444cccf-m4g8k 1/1 Running 1 4d20h
flink-taskmanager-5487cc9bc9-fsbts 1/1 Running 2 4d20h
flink-taskmanager-5487cc9bc9-zmnv7 1/1 Running 2 4d20h
(base) [~]$ sudo kubectl logs -f beam-sdk-666779599c-w65g5
2020/02/26 07:56:44 Starting worker pool 1: python -m apache_beam.runners.worker.worker_pool_main --service_port=50000 --container_executable=/opt/apache/beam/boot
Starting worker with command ['/opt/apache/beam/boot', '--id=1-1', '--logging_endpoint=localhost:39283', '--artifact_endpoint=localhost:41533', '--provision_endpoint=localhost:42233', '--control_endpoint=localhost:44977']
2020/02/26 09:09:07 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:39283 --artifact_endpoint=localhost:41533 --provision_endpoint=localhost:42233 --control_endpoint=localhost:44977
2020/02/26 09:11:07 Failed to obtain provisioning information: failed to dial server at localhost:42233
caused by:
context deadline exceeded
I have no idea what the logging or artifact endpoint (etc.) is, and from inspecting the source code it seems that the endpoints have been hard-coded to localhost.
(You said in a comment that the answer to the referenced post is valid, so I'll just address the specific error you ran into in case someone else hits it.)
Your understanding is correct; the logging, artifact, etc. endpoints are essentially hardcoded to use localhost. These endpoints are meant to be used only internally by Beam and are not configurable, so the Beam worker is implicitly assumed to be on the same host as the Flink task manager. Typically, this is accomplished by making the Beam worker pool a sidecar of the Flink task manager pod, rather than a separate service.
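A rough sketch of that sidecar arrangement (the container names, Flink image, and tags here are illustrative assumptions, not taken from your cluster): the worker pool runs as a second container in the task manager pod, so all the hardcoded localhost endpoints resolve within the shared pod network:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
      - name: taskmanager
        image: flink:1.9            # illustrative tag
        args: ["taskmanager"]
      - name: beam-worker-pool      # sidecar: shares the pod's network namespace
        image: apachebeam/python3.7_sdk:latest
        command: ["/opt/apache/beam/boot", "--worker_pool"]
        ports:
        - containerPort: 50000

With this layout, --environment_config=localhost:50000 is correct from the task manager's point of view.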
I'm running a uwsgi+Flask application.
The app runs as a Kubernetes pod.
When I deploy a new pod (a new version), the existing pod gets SIGTERM.
This causes the master to stop accepting new connections at that same moment, which causes issues because the load balancer still passes requests to the pod for a few more seconds.
I would like the master to wait 30 seconds BEFORE it stops accepting new connections (when getting SIGTERM), but I couldn't find a way. Is it possible?
My uwsgi.ini file:
[uwsgi]
;https://uwsgi-docs.readthedocs.io/en/latest/HTTP.html
http = :8080
wsgi-file = main.py
callable = wsgi_application
processes = 2
enable-threads = true
master = true
reload-mercy = 30
worker-reload-mercy = 30
log-5xx = true
log-4xx = true
disable-logging = true
stats = 127.0.0.1:1717
stats-http = true
single-interpreter= true
;https://github.com/containous/traefik/issues/615
http-keepalive=true
add-header = Connection: Keep-Alive
It seems this is not possible to achieve using uwsgi alone:
https://github.com/unbit/uwsgi/issues/1974
The solution, as mentioned in this Kubernetes issue:
https://github.com/kubernetes/contrib/issues/1140
is to use a preStop hook. Quite ugly, but it helps to achieve zero downtime:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep","5"]
The template is taken from this answer: https://stackoverflow.com/a/39493421/3659858
Another option is to use the CLI option:
--hook-master-start "unix_signal:15 gracefully_kill_them_all"
or in the .ini file (remove the double quotes):
hook-master-start = unix_signal:15 gracefully_kill_them_all
which will gracefully terminate workers after receiving a SIGTERM (signal 15).
See the following for reference.
When I tried the above, though, it didn't work as expected from within a Docker container. Instead, you can use uWSGI's master FIFO. The master FIFO file can be specified like:
--master-fifo <filename>
or
master-fifo = /tmp/master-fifo
Then you can simply write a q character to the FIFO, and uWSGI will gracefully shut down its workers before exiting.
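A hedged sketch of combining this with Kubernetes (the FIFO path and sleep duration are assumptions to adapt): a preStop hook writes q to the master FIFO, giving uWSGI a graceful window while the endpoint is being removed from the load balancer:

lifecycle:
  preStop:
    exec:
      # Ask uWSGI for a graceful shutdown via the master FIFO, then keep the
      # container alive briefly so in-flight requests can drain.
      command: ["/bin/sh", "-c", "echo q > /tmp/master-fifo; sleep 30"]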
I have a Docker swarm mode cluster with one HAProxy container and 3 Python web apps. The HAProxy container exposes port 80 and should load balance across the 3 containers of my app (by leastconn).
Here is my docker-compose.yml file:
version: '3'
services:
  scraper-node:
    image: scraper
    ports:
      - 5000
    volumes:
      - /profiles:/profiles
    command: >
      bash -c "
      cd src;
      gunicorn src.interface:app \
      --bind=0.0.0.0:5000 \
      --workers=1 \
      --threads=1 \
      --timeout 500 \
      --log-level=debug \
      "
    environment:
      - SERVICE_PORTS=5000
    deploy:
      replicas: 3
      update_config:
        parallelism: 5
        delay: 10s
      restart_policy:
        condition: on-failure
        max_attempts: 3
        window: 120s
    networks:
      - web
  proxy:
    image: dockercloud/haproxy
    depends_on:
      - scraper-node
    environment:
      - BALANCE=leastconn
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 80:80
    networks:
      - web
networks:
  web:
    driver: overlay
When I deploy this swarm (docker stack deploy --compose-file=docker-compose.yml scraper) I get all of my containers:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
245f4bfd1299 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
995aefdb9346 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb
a51474322583 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e
3f97f34678d1 dockercloud/haproxy "/sbin/tini -- doc..." 21 hours ago Up 19 minutes 80/tcp, 443/tcp, 1936/tcp scraper_proxy.1.rng5ysn8v48cs4nxb1atkrz73
And when I display the HAProxy container log, it looks like it recognizes the 3 Python containers:
INFO:haproxy:dockercloud/haproxy 1.6.6 is running outside Docker Cloud
INFO:haproxy:Haproxy is running in SwarmMode, loading HAProxy definition through docker api
INFO:haproxy:dockercloud/haproxy PID: 6
INFO:haproxy:=> Add task: Initial start - Swarm Mode
INFO:haproxy:=> Executing task: Initial start - Swarm Mode
INFO:haproxy:==========BEGIN==========
INFO:haproxy:Linked service: scraper_scraper-node
INFO:haproxy:Linked container: scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e, scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb, scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
INFO:haproxy:HAProxy configuration:
global
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  log-send-hostname
  maxconn 4096
  pidfile /var/run/haproxy.pid
  user haproxy
  group haproxy
  daemon
  stats socket /var/run/haproxy.stats level admin
  ssl-default-bind-options no-sslv3
  ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES128-SHA:AES256-GCM-SHA384:AES256-SHA256:AES256-SHA:DHE-DSS-AES128-SHA:DES-CBC3-SHA
defaults
  balance leastconn
  log global
  mode http
  option redispatch
  option httplog
  option dontlognull
  option forwardfor
  timeout connect 5000
  timeout client 50000
  timeout server 50000
listen stats
  bind :1936
  mode http
  stats enable
  timeout connect 10s
  timeout client 1m
  timeout server 1m
  stats hide-version
  stats realm Haproxy\ Statistics
  stats uri /
  stats auth stats:stats
frontend default_port_80
  bind :80
  reqadd X-Forwarded-Proto:\ http
  maxconn 4096
  default_backend default_service
backend default_service
  server scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e 10.0.0.5:5000 check inter 2000 rise 2 fall 3
  server scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb 10.0.0.6:5000 check inter 2000 rise 2 fall 3
  server scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp 10.0.0.7:5000 check inter 2000 rise 2 fall 3
INFO:haproxy:Launching HAProxy
INFO:haproxy:HAProxy has been launched(PID: 12)
INFO:haproxy:===========END===========
But when I try to GET http://localhost I get an error message:
<html>
<body>
<h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body>
</html>
There were two problems:
1. The command in the docker-compose.yml file should be one line.
2. The scraper image should expose port 5000 (in its Dockerfile).
Once I fixed those, I deployed the swarm the same way (with stack), and the proxy container recognized the Python containers and was able to load balance between them.
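For concreteness, a sketch of the first fix (this is simply the original command collapsed to a single line, not a separately verified configuration):

command: bash -c "cd src; gunicorn src.interface:app --bind=0.0.0.0:5000 --workers=1 --threads=1 --timeout 500 --log-level=debug"

The second fix is an EXPOSE 5000 line in the scraper's Dockerfile, so the proxy's service discovery can pick up the port.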
A 503 error usually means a failed health check to the backend server.
Your stats page might be helpful here: if you mouse over the LastChk column of one of your DOWN backend servers, HAProxy will give you a vague summary of why that server is DOWN.
It does not look like you configured a health check (option httpchk) for your default_service backend. Can you reach any of your backend servers directly (e.g. curl --head 10.0.0.5:5000)? From the HAProxy documentation:
[R]esponses 2xx and 3xx are
considered valid, while all other ones indicate a server failure, including
the lack of any response.
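For reference, a hedged sketch of adding an explicit HTTP health check to the generated backend (the check path / is an assumption; per the documentation above, anything outside 2xx/3xx marks a server DOWN):

backend default_service
  # Hypothetical health check: probe each server with HEAD / every 2 seconds.
  option httpchk HEAD / HTTP/1.0
  server scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e 10.0.0.5:5000 check inter 2000 rise 2 fall 3
  server scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb 10.0.0.6:5000 check inter 2000 rise 2 fall 3
  server scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp 10.0.0.7:5000 check inter 2000 rise 2 fall 3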