I have the following Docker Compose file:
---
version: '3.7'
services:
  myapi:
    image: tiangolo/uwsgi-nginx-flask:python3.7
    env_file: apivars.env
    logging:
      driver: syslog
      options:
        syslog-address: "udp://127.0.0.1:514"
        tag: tags
        labels: labels
    ports:
      - "8080:80"
    deploy:
      placement:
        constraints:
          - node.role != manager
      mode: replicated
      replicas: 32
      update_config:
        parallelism: 4
        delay: 5s
        order: start-first
...
I have a load balancer that forwards requests to this swarm manager.
My understanding is that if I hit www.myapi.com, the request goes to the load balancer, then to the swarm manager, and the swarm manager then sends it to one of the 32 replicas.
The issue is that the load balancer logs report some 502 errors.
# head -n1 /var/log/haproxy.log
Apr 28 09:35:28 localhost haproxy[43117]: 172.19.9.1:50220 [28/Apr/2020:09:35:08.549] main~ API_Production/swarmnode5 0/0/1/19952/19953 502 309 - - ---- 97/97/10/1/0 0/0 "GET /v2/students/?includeFields=name,id&per_page=1000&page=88 HTTP/1.1"
How can I check whether the request is reaching the swarm manager or swarmnode5?
I checked the nginx logs, but they do not report any 502 errors. There are some exceptions, but if the exception is in my code, why doesn't nginx log that API call and its response?
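As a starting point for reading the HAProxy line above, here is a minimal Python sketch that splits an HTTP-format log line into its fields. The function name `parse_haproxy_http_log` and the timer labels are illustrative assumptions, and the regular expression assumes the default `option httplog` layout (backend/server, five slash-separated timers in milliseconds, status, bytes, two cookie fields, termination state). The backend/server field names the server HAProxy dispatched the request to, as that server is named in the HAProxy configuration.

```python
import re

# Assumes HAProxy's default HTTP log layout; adjust if a custom log-format is used.
LOG_RE = re.compile(
    r'(?P<client>\S+) \[(?P<accepted>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<timers>-?\d+/-?\d+/-?\d+/-?\d+/\+?-?\d+) '
    r'(?P<status>\d{3}) \+?(?P<bytes>\d+) \S+ \S+ (?P<termination>\S{4}) '
)

# Illustrative labels for the five timers, in the order they appear.
TIMER_NAMES = ("request", "queue", "connect", "response", "total")


def parse_haproxy_http_log(line):
    """Return a dict of fields from one HAProxy HTTP log line, or None."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["bytes"] = int(record["bytes"])
    timers = (int(t) for t in record.pop("timers").split("/"))
    record["timers_ms"] = dict(zip(TIMER_NAMES, timers))
    return record
```

Applied to the line above, this reports server `swarmnode5`, a connect time of 1 ms, and a response wait of 19952 ms. That suggests the TCP connection to that server succeeded and the failure happened while HAProxy was waiting for the response.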
I have the following docker-compose file:
version: '2.3'
networks:
  default: { external: true, name: $NETWORK_NAME } # NETWORK_NAME in .env file is `uv_atp_network`.
services:
  car_parts_segmentor:
    # container_name: uv-car-parts-segmentation
    image: "uv-car-parts-segmentation:latest"
    ports:
      - "8080:8080"
    volumes:
      - ../../../../uv-car-parts-segmentation/configs:/uveye/configs
      - /isilon/:/isilon/
      # - local_data_folder:local_data_folder
    command: "--run_service rabbit"
    runtime: nvidia
    depends_on:
      rabbitmq_local:
        condition: service_started
    links:
      - rabbitmq_local
    restart: always
  rabbitmq_local:
    image: 'rabbitmq:3.6-management-alpine'
    container_name: "rabbitmq"
    ports:
      - ${RABBIT_PORT:?unspecified_rabbit_port}:5672
      - ${RABBIT_MANAGEMENT_PORT:?unspecified_rabbit_management_port}:15672
When this runs, docker ps shows
21400efd6493 uv-car-parts-segmentation:latest "python /uveye/app/m…" 5 seconds ago Up 1 second 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp joint_car_parts_segmentor_1
bf4ab8581f1f rabbitmq:3.6-management-alpine "docker-entrypoint.s…" 5 seconds ago Up 4 seconds 4369/tcp, 5671/tcp, 0.0.0.0:5672->5672/tcp, :::5672->5672/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp, :::15672->15672/tcp rabbitmq
I want to create a connection to that rabbitmq. The user:pass is guest:guest.
I was unable to do so; every attempt failed with the very uninformative AMQPConnectionError.
The code below runs in another, unrelated container:
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@rabbitmq/"))
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@localhost/"))
Also tried with
$ docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' rabbitmq
172.27.0.2
and
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@172.27.0.2/"))
Also tried with
credentials = pika.credentials.PlainCredentials(
    username="guest",
    password="guest"
)
parameters = pika.ConnectionParameters(
    host=ip_address, # tried all above options
    port=5672,
    credentials=credentials,
    heartbeat=10,
)
Note that the container car_parts_segmentor is able to see the container rabbitmq. Both are started by docker-compose.
My assumption is that this has to do with the uv_atp_network that both containers live in, and that I am trying to access a container inside that network from outside the network.
Is this really the problem?
If so, how can this be achieved?
For the future: how can I get more informative errors from pika?
As I suspected, the problem was that the name rabbitmq existed only in the network uv_atp_network.
The code attempting to connect runs inside a container of its own, which was not attached to that network.
Solution: connect the current container to the network:
import socket

import docker

client = docker.from_env()
network_name = "uv_atp_network"
# Inside a container, the hostname is the container ID by default.
atp_container = client.containers.get(socket.gethostname())
client.networks.get(network_name).connect(container=atp_container.id)
After this, the above code in the question does work, because rabbitmq can be resolved.
connection = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@rabbitmq/"))
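On the question of more informative errors: pika logs through Python's standard `logging` module, so raising the log level before connecting usually reveals more than the bare AMQPConnectionError. Independently of pika, a plain socket check can separate "name does not resolve" from "nothing is listening". The sketch below is a standard-library illustration only; `diagnose` and its return strings are hypothetical names, not part of pika.

```python
import logging
import socket

# pika uses the standard logging module; DEBUG level surfaces its connection messages.
logging.basicConfig(level=logging.DEBUG)


def diagnose(host: str, port: int = 5672, timeout: float = 3.0) -> str:
    """Classify why a TCP connection to host:port fails, or return 'ok'."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # The name is unknown from where this code runs (e.g. outside the Docker network).
        return f"dns_failure: {exc}"
    family, socktype, proto, _, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except ConnectionRefusedError:
        return "connection_refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"os_error: {exc}"
    return "ok"
```

Calling `diagnose("rabbitmq")` from the client container before it joined uv_atp_network would presumably have reported a DNS failure, which points at the network rather than at credentials.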
I'm having a problem with my Python Flask server deployed on Google Kubernetes Engine. The code below is a simple Flask server that supports text/event-stream. The problem is that after exactly 60 seconds of inactivity from the server (no messages on the stream), the client shows a 502 Bad Gateway error.
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
Whenever this happens, the client no longer receives any data from the server. I already tried adding timeouts, as you can see in the Kubernetes config file.
I tried spinning up a Google Compute Engine instance without Kubernetes, deployed the same code on it, and added a domain. To my surprise it works: it shows no 502 Bad Gateway error even if I leave the browser open.
It probably has something to do with the Kubernetes config I'm running. I'd appreciate any help or ideas.
Update 1
I tried changing the kube service type to LoadBalancer instead of NodePort.
Accessing the IP endpoint generated works perfectly without showing a 502 error even after 60s of inactivity.
Update 2
Here are the errors from the load balancer's Stackdriver logs:
{
httpRequest: {
referer: "http://sse-dev.[REDACTED]/test"
remoteIp: "[REDACTED]"
requestMethod: "GET"
requestSize: "345"
requestUrl: "http://sse-dev.[REDACTED]/stream"
responseSize: "488"
serverIp: "[REDACTED]"
status: 502
userAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
}
insertId: "ptb7kfg2w2zz01"
jsonPayload: {
@type: "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
statusDetails: "backend_timeout"
}
logName: "projects/[REDACTED]-assist-dev/logs/requests"
receiveTimestamp: "2020-01-03T06:27:44.361706996Z"
resource: {
labels: {
backend_service_name: "k8s-be-30808--17630a0e8199e99b"
forwarding_rule_name: "k8s-fw-default-[REDACTED]-dev-ingress--17630a0e8199e99b"
project_id: "[REDACTED]-assist-dev"
target_proxy_name: "k8s-tp-default-[REDACTED]-dev-ingress--17630a0e8199e99b"
url_map_name: "k8s-um-default-[REDACTED]-dev-ingress--17630a0e8199e99b"
zone: "global"
}
type: "http_load_balancer"
}
severity: "WARNING"
spanId: "4b0767cace9b9500"
timestamp: "2020-01-03T06:26:43.381613Z"
trace: "projects/[REDACTED]-assist-dev/traces/d467f39f76b94c02d9a8e6998fdca17b"
}
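The `statusDetails: "backend_timeout"` value above, together with the failure after exactly 60 seconds of silence, is consistent with a timeout being hit while the stream is idle. One common way to keep an idle event stream active is to emit an SSE comment line (a line starting with `:`, which clients ignore) whenever no event arrives in time. The sketch below is only an illustration of that idea: `stream_with_heartbeat`, `HEARTBEAT`, and `idle_seconds` are hypothetical names, and the standard-library `queue.Queue` stands in for the `gevent.queue.Queue` used in sse.py.

```python
import queue
from typing import Iterator

# An SSE comment: ignored by EventSource clients, but it is still traffic on the wire.
HEARTBEAT = ": keep-alive\n\n"


def stream_with_heartbeat(q: "queue.Queue", idle_seconds: float = 15.0) -> Iterator[str]:
    """Yield encoded events from q, or a heartbeat comment after idle_seconds of silence."""
    while True:
        try:
            yield q.get(timeout=idle_seconds)
        except queue.Empty:
            yield HEARTBEAT
```

Whether this is enough depends on which timeout is actually being hit; it does not replace configuring the backend timeout itself.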
sse.py
from typing import Iterator
import random
import string
from collections import deque

from flask import Response, request
from gevent.queue import Queue
import gevent


def generate_id(size=6, chars=string.ascii_lowercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))


class ServerSentEvent(object):
    """Class to handle server-sent events."""

    def __init__(self, data, event):
        self.data = data
        self.event = event
        # No trailing comma here: the ID must be a string, not a one-element tuple.
        self.event_id = generate_id()
        self.retry = 5000
        # Maps each field value to its SSE field name.
        self.desc_map = {
            self.data: "data",
            self.event: "event",
            self.event_id: "id",
            self.retry: "retry"
        }

    def encode(self) -> str:
        """Encodes events as a string."""
        if not self.data:
            return ""
        lines = ["{}: {}".format(name, key)
                 for key, name in self.desc_map.items() if key]
        return "{}\n\n".format("\n".join(lines))


class Channel(object):
    def __init__(self, history_size=32):
        self.subscriptions = []
        self.history = deque(maxlen=history_size)
        self.history.append(ServerSentEvent('start_of_history', None))

    def notify(self, message):
        """Notify all subscribers with message."""
        for sub in self.subscriptions[:]:
            sub.put(message)

    def event_generator(self, last_id) -> Iterator[ServerSentEvent]:
        """Yields ServerSentEvents for one subscriber."""
        q = Queue()
        self._add_history(q, last_id)
        self.subscriptions.append(q)
        try:
            while True:
                yield q.get()
        except GeneratorExit:
            self.subscriptions.remove(q)

    def subscribe(self):
        def gen(last_id) -> Iterator[str]:
            for sse in self.event_generator(last_id):
                yield sse.encode()

        return Response(
            gen(request.headers.get('Last-Event-ID')),
            mimetype="text/event-stream",
            headers={
                "Cache-Control": "no-cache",
                "Connection": "keep-alive",
                "Content-Type": "text/event-stream"
            })

    def _add_history(self, q, last_id):
        # Queue every event that came after the one the client saw last.
        add = False
        for sse in self.history:
            if add:
                q.put(sse)
            if sse.event_id == last_id:
                add = True

    def publish(self, message, event=None):
        sse = ServerSentEvent(str(message), event)
        self.history.append(sse)
        gevent.spawn(self.notify, sse)

    def get_last_id(self) -> str:
        return self.history[-1].event_id
service.py
import json
import os
import requests
from app.controllers.sse import Channel
from flask import send_file, \
    jsonify, request, Blueprint, Response
from typing import Iterator

blueprint = Blueprint(__name__, __name__, url_prefix='')
flask_channel = Channel()


@blueprint.route("/stream")
def stream():
    return flask_channel.subscribe()


@blueprint.route('/sample/create', methods=['GET'])
def sample_create():
    branch_id = request.args.get('branch_id', None)
    params = request.get_json()
    if not params:
        params = {
            'id': 'sample_id',
            'description': 'sample_description'
        }
    flask_channel.publish(json.dumps(params), event=branch_id)
    return jsonify({'success': True}), 200
kubernetes-config.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: sse-service
  labels:
    app: sse-service
spec:
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP
      name: http
  selector:
    app: sse-service
  sessionAffinity: ClientIP
  type: NodePort
---
apiVersion: "extensions/v1beta1"
kind: "Deployment"
metadata:
  name: "sse-service"
  namespace: "default"
  labels:
    app: "sse-service"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 25%
  selector:
    matchLabels:
      app: "sse-service"
  template:
    metadata:
      labels:
        app: "sse-service"
    spec:
      containers:
        - name: "sse-service"
          image: "{{IMAGE_NAME}}"
          imagePullPolicy: Always
          ports:
            - containerPort: 5000
          livenessProbe:
            httpGet:
              path: /health/check
              port: 5000
            initialDelaySeconds: 25
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /health/check
              port: 5000
            initialDelaySeconds: 25
            periodSeconds: 15
---
apiVersion: "autoscaling/v2beta1"
kind: "HorizontalPodAutoscaler"
metadata:
  name: "sse-service-hpa"
  namespace: "default"
  labels:
    app: "sse-service"
spec:
  scaleTargetRef:
    kind: "Deployment"
    name: "sse-service"
    apiVersion: "apps/v1beta1"
  minReplicas: 1
  maxReplicas: 7
  metrics:
    - type: "Resource"
      resource:
        name: "cpu"
        targetAverageUtilization: 80
---
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: sse-service
spec:
  timeoutSec: 120
  connectionDraining:
    drainingTimeoutSec: 3600
Dockerfile
FROM python:3.6.5-jessie
ENV GUNICORN_PORT=5000
ENV PYTHONUNBUFFERED=TRUE
ENV GOOGLE_APPLICATION_CREDENTIALS=/opt/creds/account.json
COPY requirements.txt /opt/app/requirements.txt
COPY app /opt/app
COPY creds/account.json /opt/creds/account.json
WORKDIR /opt/app
RUN pip install -r requirements.txt
EXPOSE ${GUNICORN_PORT}
CMD gunicorn -b :${GUNICORN_PORT} wsgi:create_app\(\) --reload --timeout=300000 --config=config.py
Base.py
from flask import jsonify, Blueprint

blueprint = Blueprint(__name__, __name__)


@blueprint.route('/health/check', methods=['GET'])
def check_health():
    response = {
        'message': 'pong!',
        'status': 'success'
    }
    return jsonify(response), 200
bitbucket-pipelines.yml
options:
  docker: true
pipelines:
  branches:
    dev:
      - step:
          name: Build - Push - Deploy to Dev environment
          image: google/cloud-sdk:latest
          caches:
            - docker
            - pip
          deployment: development
          script:
            # Export all bitbucket credentials to the environment
            - echo $GOOGLE_APPLICATION_CREDENTIALS | base64 -di > ./creds/account.json
            - echo $CONTAINER_CREDENTIALS | base64 -di > ./creds/gcr.json
            - export CLOUDSDK_CONFIG='pwd'/creds/account.json
            - export GOOGLE_APPLICATION_CREDENTIALS='pwd'/creds/account.json
            # Configure docker to use gcp service account
            - gcloud auth activate-service-account $KUBERNETES_SERVICE_ACCOUNT --key-file=creds/gcr.json
            - gcloud config list
            - gcloud auth configure-docker -q
            # Build docker image with name and tag
            - export IMAGE_NAME=$HOSTNAME/$PROJECT_ID/$IMAGE:v0.1.$BITBUCKET_BUILD_NUMBER
            - docker build -t $IMAGE_NAME .
            # Push image to Google Container Repository
            - docker push $IMAGE_NAME
            # Initialize configs for kubernetes
            - gcloud config set project $PROJECT_ID
            - gcloud config set compute/zone $PROJECT_ZONE
            - gcloud container clusters get-credentials $PROJECT_CLUSTER
            # Run kubernetes configs
            - cat kubernetes-config.yaml | sed "s#{{IMAGE_NAME}}#$IMAGE_NAME#g" | kubectl apply -f -
ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
annotations:
ingress.kubernetes.io/backends: '{"k8s-be-30359--17630a0e8199e99b":"HEALTHY","k8s-be-30599--17630a0e8199e99b":"HEALTHY","k8s-be-30808--17630a0e8199e99b":"HEALTHY","k8s-be-30991--17630a0e8199e99b":"HEALTHY","k8s-be-31055--17630a0e8199e99b":"HEALTHY","k8s-be-31467--17630a0e8199e99b":"HEALTHY","k8s-be-31596--17630a0e8199e99b":"HEALTHY","k8s-be-31948--17630a0e8199e99b":"HEALTHY","k8s-be-32702--17630a0e8199e99b":"HEALTHY"}'
ingress.kubernetes.io/forwarding-rule: k8s-fw-default-[REDACTED]-dev-ingress--17630a0e8199e99b
ingress.kubernetes.io/https-forwarding-rule: k8s-fws-default-[REDACTED]-dev-ingress--17630a0e8199e99b
ingress.kubernetes.io/https-target-proxy: k8s-tps-default-[REDACTED]-dev-ingress--17630a0e8199e99b
ingress.kubernetes.io/ssl-cert: k8s-ssl-d6db2a7a17456a7b-64a79e74837f68e3--17630a0e8199e99b
ingress.kubernetes.io/static-ip: k8s-fw-default-[REDACTED]-dev-ingress--17630a0e8199e99b
ingress.kubernetes.io/target-proxy: k8s-tp-default-[REDACTED]-dev-ingress--17630a0e8199e99b
ingress.kubernetes.io/url-map: k8s-um-default-[REDACTED]-dev-ingress--17630a0e8199e99b
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{},"name":"[REDACTED]-dev-ingress","namespace":"default"},"spec":{"rules":[{"host":"bot-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"bot-service","servicePort":80}}]}},{"host":"client-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"client-service","servicePort":80}}]}},{"host":"team-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"team-service","servicePort":80}}]}},{"host":"chat-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"chat-service","servicePort":80}}]}},{"host":"chatb-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"chat-builder-service","servicePort":80}}]}},{"host":"action-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"action-service","servicePort":80}}]}},{"host":"message-dev.[REDACTED]","http":{"paths":[{"backend":{"serviceName":"message-service","servicePort":80}}]}}],"tls":[{"hosts":["bots-dev.[REDACTED]","client-dev.[REDACTED]","team-dev.[REDACTED]","chat-dev.[REDACTED]","chatb-dev.[REDACTED]","message-dev.[REDACTED]"],"secretName":"[REDACTED]-ssl"}]}}
creationTimestamp: "2019-08-09T09:19:14Z"
generation: 7
name: [REDACTED]-dev-ingress
namespace: default
resourceVersion: "73975381"
selfLink: /apis/extensions/v1beta1/namespaces/default/ingresses/[REDACTED]-dev-ingress
uid: c176cc8c-ba86-11e9-89d6-42010a940181
spec:
rules:
- host: bot-dev.[REDACTED]
http:
paths:
- backend:
serviceName: bot-service
servicePort: 80
- host: client-dev.[REDACTED]
http:
paths:
- backend:
serviceName: client-service
servicePort: 80
- host: team-dev.[REDACTED]
http:
paths:
- backend:
serviceName: team-service
servicePort: 80
- host: chat-dev.[REDACTED]
http:
paths:
- backend:
serviceName: chat-service
servicePort: 80
- host: chatb-dev.[REDACTED]
http:
paths:
- backend:
serviceName: chat-builder-service
servicePort: 80
- host: action-dev.[REDACTED]
http:
paths:
- backend:
serviceName: action-service
servicePort: 80
- host: message-dev.[REDACTED]
http:
paths:
- backend:
serviceName: message-service
servicePort: 80
- host: sse-dev.[REDACTED]
http:
paths:
- backend:
serviceName: sse-service
servicePort: 80
tls:
- hosts:
- bots-dev.[REDACTED]
- client-dev.[REDACTED]
- team-dev.[REDACTED]
- chat-dev.[REDACTED]
- chatb-dev.[REDACTED]
- message-dev.[REDACTED]
- sse-dev.[REDACTED]
secretName: [REDACTED]-ssl
status:
loadBalancer:
ingress:
- ip: [REDACTED]
The health check from our Load Balancer comes from the readinessProbe configured in the deployment. You configured the path to be /health/check, however, your flask environment has nothing listening on that path. This means that the readinessProbe is likely failing and the health check from your Load Balancer is also failing.
With the health checks failing, your Load Balancer does not see any healthy backends so it returns a 502 error message.
You can verify this in 3 ways:
1. Check the Stackdriver logs. You will see the 502 responses logged; check the details of a log entry to learn more about the 502. You will likely see that there are no healthy backends.
2. Check the status of your pods using kubectl get po | grep sse-service; the pods are likely not Ready.
3. Test the check from another pod in the cluster. (NOTE: you will need a pod that has curl installed or allows you to install it. If you don't have one, use a busybox or nginx base image.)
a. Run kubectl get po -o wide | grep sse-service and note the IP of one of the pods.
b. Run kubectl exec [test_pod] -- curl [sse-service_cluster_ip]/health/check. This does a curl from a pod in the cluster to one of your sse-service pods and checks whether anything replies to /health/check. There likely is not.
To address this, you should have a route such as @blueprint.route('/health/check', methods=['GET']) registered in the app. Define the function to just return 200.
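For the first check, the `statusDetails` field of a load balancer log entry is the part that separates one cause of a 502 from another. The sketch below is a rough illustration: `explain_502` and `STATUS_HINTS` are hypothetical names, the table covers only two values, and the wording of each hint is a paraphrase rather than official text.

```python
# Illustrative subset of statusDetails values; other values exist.
STATUS_HINTS = {
    "backend_timeout": "the backend did not respond within the backend timeout",
    "failed_to_pick_backend": "no healthy backend was available for the request",
}


def explain_502(entry: dict) -> str:
    """Return a short hint for the statusDetails of a load balancer log entry."""
    details = entry.get("jsonPayload", {}).get("statusDetails", "")
    return STATUS_HINTS.get(details, "unrecognized statusDetails: " + repr(details))
```

For the entry posted in the question, `explain_502` returns the timeout hint, so it is worth comparing that against the health-check explanation before changing the probes.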
I'm working to set up Jupyter notebook servers on Kubernetes that are able to launch pyspark. Each user can have multiple servers running at once, and would access each by navigating to the appropriate host combined with a path to the server's fully-qualified name. For example: http://<hostname>/<username>/<notebook server name>.
I have a top-level function defined that allows a user to create a SparkSession that points to the Kubernetes master URL and sets their pod to be the Spark driver.
This is all well and good, but I would like to enable end users to access the URL for the Spark Web UI so that they can track their jobs. The Spark on Kubernetes documentation recommends port forwarding for this. It seems to me that for any security-minded organization, allowing any random user to set up port forwarding in this way would be unacceptable.
I would like to use a Kubernetes Ingress definition to allow external access to the driver's Spark Web UI. I've set up something like the following:
# Service
apiVersion: v1
kind: Service
metadata:
  namespace: <notebook namespace>
  name: <username>-<notebook server name>-svc
spec:
  type: ClusterIP
  sessionAffinity: None
  selector:
    app: <username>-<notebook server name>-notebook
  ports:
    - name: app-svc-port
      protocol: TCP
      port: 8888
      targetPort: 8888
    - name: spark-ui-port
      protocol: TCP
      port: 4040
      targetPort: 4040
# Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: workspace
  name: <username>-<notebook server name>-ing
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
    - host: <hostname>
      http:
        paths:
          - path: /<username>/<notebook server name>
            backend:
              serviceName: <username>-<notebook server name>-svc
              servicePort: app-svc-port
          - path: /<username>/<notebook server name>/spark-ui
            backend:
              serviceName: <username>-<notebook server name>-svc
              servicePort: spark-ui-port
However, under this setup, when I navigate to http://<hostname>/<username>/<notebook server name>/spark-ui/, I'm redirected to http://<hostname>/jobs. This is because /jobs is the default entry point to Spark's Web UI. However, I don't have an ingress rule for that path, and can't set such a rule since every user's Web UI would collide with each other in the load balancer (unless I have a misunderstanding, which is totally possible).
Under the Spark UI configuration settings, there doesn't seem to be a way to set a root path for the Spark session. You can change the port on which it runs, but what I'd like to do is make the UI serve at something like: http://<hostname>/<username>/<notebook server name>/spark-ui/<jobs, stages, etc>. Is there really no way of changing what comes after the hostname of the URL and before the last part?
1: Set your Spark config:
spark.ui.proxyBase: /foo
2: Set the nginx annotations in the Ingress:
annotations:
  nginx.ingress.kubernetes.io/proxy-redirect-from: http://$host/
  nginx.ingress.kubernetes.io/proxy-redirect-to: http://$host/foo/
3: Add the annotation to rewrite the target:
annotations:
  nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
    - host: <host>
      http:
        paths:
          - backend:
              serviceName: <service>
              servicePort: <port>
            path: /foo(/|$)(.*)
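To see what the path regex and the rewrite target do together, here is a small Python sketch that mimics the mapping from the external path to the path the Spark UI receives. It only approximates nginx's matching with Python's `re` module, and `rewrite_path` is a hypothetical helper, not part of any ingress tooling.

```python
import re
from typing import Optional


def rewrite_path(path: str, base: str = "/foo") -> Optional[str]:
    """Mimic path `<base>(/|$)(.*)` combined with rewrite-target `/$2`."""
    match = re.match(re.escape(base) + r"(/|$)(.*)", path)
    if match is None:
        return None  # the ingress rule would not apply to this path
    return "/" + match.group(2)


print(rewrite_path("/foo/jobs/"))  # /jobs/
print(rewrite_path("/foo"))        # /
print(rewrite_path("/foobar"))     # None
```

The second capture group carries everything after the prefix, so each user's prefix is stripped before the request reaches the driver, while spark.ui.proxyBase makes the UI generate links that include the prefix again.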
Yes, you can achieve this. Specifically, you can set the spark.ui.proxyBase property in spark-defaults.conf or at run time.
Example:
echo "spark.ui.proxyBase $SPARK_UI_PROXYBASE" >> /opt/spark/conf/spark-defaults.conf;
Then this should work.
I have a Docker swarm with one HAProxy container and 3 Python web app containers. The HAProxy container exposes port 80 and should load balance across the 3 app containers (using leastconn).
Here is my docker-compose.yml file:
version: '3'
services:
  scraper-node:
    image: scraper
    ports:
      - 5000
    volumes:
      - /profiles:/profiles
    command: >
      bash -c "
      cd src;
      gunicorn src.interface:app \
      --bind=0.0.0.0:5000 \
      --workers=1 \
      --threads=1 \
      --timeout 500 \
      --log-level=debug \
      "
    environment:
      - SERVICE_PORTS=5000
    deploy:
      replicas: 3
      update_config:
        parallelism: 5
        delay: 10s
      restart_policy:
        condition: on-failure
        max_attempts: 3
        window: 120s
    networks:
      - web
  proxy:
    image: dockercloud/haproxy
    depends_on:
      - scraper-node
    environment:
      - BALANCE=leastconn
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 80:80
    networks:
      - web
networks:
  web:
    driver: overlay
When I deploy this swarm (docker stack deploy --compose-file=docker-compose.yml scraper) I get all of my containers:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
245f4bfd1299 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
995aefdb9346 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb
a51474322583 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e
3f97f34678d1 dockercloud/haproxy "/sbin/tini -- doc..." 21 hours ago Up 19 minutes 80/tcp, 443/tcp, 1936/tcp scraper_proxy.1.rng5ysn8v48cs4nxb1atkrz73
And when I display the HAProxy container log, it looks like it recognizes the 3 Python containers:
INFO:haproxy:dockercloud/haproxy 1.6.6 is running outside Docker Cloud
INFO:haproxy:Haproxy is running in SwarmMode, loading HAProxy definition through docker api
INFO:haproxy:dockercloud/haproxy PID: 6
INFO:haproxy:=> Add task: Initial start - Swarm Mode
INFO:haproxy:=> Executing task: Initial start - Swarm Mode
INFO:haproxy:==========BEGIN==========
INFO:haproxy:Linked service: scraper_scraper-node
INFO:haproxy:Linked container: scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e, scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb, scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
INFO:haproxy:HAProxy configuration:
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
log-send-hostname
maxconn 4096
pidfile /var/run/haproxy.pid
user haproxy
group haproxy
daemon
stats socket /var/run/haproxy.stats level admin
ssl-default-bind-options no-sslv3
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES128-SHA:AES256-GCM-SHA384:AES256-SHA256:AES256-SHA:DHE-DSS-AES128-SHA:DES-CBC3-SHA
defaults
balance leastconn
log global
mode http
option redispatch
option httplog
option dontlognull
option forwardfor
timeout connect 5000
timeout client 50000
timeout server 50000
listen stats
bind :1936
mode http
stats enable
timeout connect 10s
timeout client 1m
timeout server 1m
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
stats auth stats:stats
frontend default_port_80
bind :80
reqadd X-Forwarded-Proto:\ http
maxconn 4096
default_backend default_service
backend default_service
server scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e 10.0.0.5:5000 check inter 2000 rise 2 fall 3
server scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb 10.0.0.6:5000 check inter 2000 rise 2 fall 3
server scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp 10.0.0.7:5000 check inter 2000 rise 2 fall 3
INFO:haproxy:Launching HAProxy
INFO:haproxy:HAProxy has been launched(PID: 12)
INFO:haproxy:===========END===========
But when I send a GET request to http://localhost, I get an error message:
<html>
<body>
<h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body>
</html>
There were two problems:
1. The command in the docker-compose.yml file should be on one line.
2. The scraper image should expose port 5000 (in its Dockerfile).
Once I fixed those, I deployed the swarm the same way (with stack); the proxy container recognized the Python containers and was able to load balance between them.
A 503 error usually means a failed health check to the backend server.
Your stats page might be helpful here: if you mouse over the LastChk column of one of your DOWN backend servers, HAProxy will give you a brief summary of why that server is DOWN.
It does not look like you configured the health check (option httpchk) for your default_service backend: can you reach any of your backend servers directly (e.g. curl --head 10.0.0.5:5000)? From the HAProxy documentation:
[R]esponses 2xx and 3xx are considered valid, while all other ones indicate a server failure, including the lack of any response.
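The direct check suggested above (curl --head against a backend address) can also be scripted. The following is a minimal sketch using only the Python standard library; `backend_is_healthy` is a hypothetical helper that applies the same rule as the quoted documentation, treating 2xx and 3xx as valid and everything else, including no response, as a failure.

```python
import http.client


def backend_is_healthy(host: str, port: int, path: str = "/", timeout: float = 2.0) -> bool:
    """Return True if a HEAD request to the backend yields a 2xx or 3xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply all count as failures.
        return False
    finally:
        conn.close()
    return 200 <= status < 400
```

Running it against each address from the generated backend section (for example 10.0.0.5 on port 5000) from inside the overlay network would show whether the gunicorn processes are actually listening.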