What is a minimal example of using Kafka with Python?

What I tried
I cloned https://github.com/wurstmeister/kafka-docker and executed sudo docker-compose up.
I started the producer.py listed below.
I started the consumer.py listed below.
This didn't work. I changed the ports in the docker-compose.yml to
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
After starting that, the producer.py finished execution and the docker-compose terminal showed
zookeeper_1 | 2018-09-05 14:21:44,001 [myid:] - INFO [SessionTracker:ZooKeeperServer#358] - Expiring session 0x165aa1acb900000, timeout of 6000ms exceeded
zookeeper_1 | 2018-09-05 14:21:44,002 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#487] - Processed session termination for sessionid: 0x165aa1acb900000
kafka_1 | [2018-09-05 14:21:44,028] INFO Creating /controller (is it secure? false) (kafka.zk.KafkaZkClient)
kafka_1 | [2018-09-05 14:21:44,033] INFO Result of znode creation at /controller is: OK (kafka.zk.KafkaZkClient)
zookeeper_1 | 2018-09-05 14:21:44,141 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#649] - Got user-level KeeperException when processing sessionid:0x165aa1c265e0000 type:delete cxid:0x32 zxid:0x6f txntype:-1 reqpath:n/a Error Path:/admin/reassign_partitions Error:KeeperErrorCode = NoNode for /admin/reassign_partitions
zookeeper_1 | 2018-09-05 14:21:44,152 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#649] - Got user-level KeeperException when processing sessionid:0x165aa1c265e0000 type:delete cxid:0x34 zxid:0x70 txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election
zookeeper_1 | 2018-09-05 14:21:47,621 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor#649] - Got user-level KeeperException when processing sessionid:0x165aa1c265e0000 type:setData cxid:0x3c zxid:0x71 txntype:-1 reqpath:n/a Error Path:/config/topics/mytopic Error:KeeperErrorCode = NoNode for /config/topics/mytopic
kafka_1 | [2018-09-05 14:21:47,628] INFO Topic creation Map(mytopic-0 -> ArrayBuffer(1003)) (kafka.zk.AdminZkClient)
kafka_1 | [2018-09-05 14:21:47,639] INFO [KafkaApi-1003] Auto creation of topic mytopic with 1 partitions and replication factor 1 is successful (kafka.server.KafkaApis)
The topic was created, which is good. But then, when I execute the consumers, they don't do anything, while docker-compose shows
kafka_1 | [2018-09-05 14:24:52,566] ERROR [KafkaApi-1003] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
How can I have a minimal Kafka installation / setup to see Kafka working with Python?
producer.py
from confluent_kafka import Producer
p = Producer({'bootstrap.servers': 'localhost:9092'})
p.produce('mytopic', key='hello', value='world')
print("produce done")
p.flush(10)
consumer.py
from confluent_kafka import Consumer, KafkaError
c = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'mygroup',
    'default.topic.config': {
        'auto.offset.reset': 'smallest'
    }
})
c.subscribe(['mytopic'])
while True:
    msg = c.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        if msg.error().code() == KafkaError._PARTITION_EOF:
            continue
        else:
            print(msg.error())
            break
    print('Received message: {}'.format(msg.value().decode('utf-8')))
c.close()

Try this env section
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://localhost:9094
KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
Add 9094:9094 to the ports, and point Python at localhost:9094 if you are not running your code inside a Docker container.
The Confluent images have a similar setup, but use port 29092 instead.
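For example (a minimal sketch assuming the 9094 mapping above; the topic and message are the ones from the question), producer.py only needs its bootstrap server changed:

from confluent_kafka import Producer

# Produce via the OUTSIDE listener advertised as localhost:9094
p = Producer({'bootstrap.servers': 'localhost:9094'})
p.produce('mytopic', key='hello', value='world')
p.flush(10)

The same change applies to consumer.py ('bootstrap.servers': 'localhost:9094').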

Related

Kafka producer and consumer not working properly on python with docker

I'm working on a project that uses a Kafka producer and consumer to acquire articles (with specific topics) from a news API every two hours and then, with a consumer, save them in MongoDB.
So I made three classes: one for KafkaAdminClient, one for KafkaProducer, and one for KafkaConsumer.
The Kafka servers run in a Docker container. The main application is a Flask app, and that's where I start all the threads, including the Kafka ones.
I've been trying to change a lot of little things, but it just seems very unstable and I don't know why. Firstly, the data reaches the consumer, and eventually MongoDB, at a random time. Then the old topics in the consumer don't get deleted and the database keeps getting populated with new and old values.
Now that I've put a group on the consumer and added the KafkaAdminClient class, I don't get messages in the consumer at all. All I get is this:
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic health is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic business is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic war is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic motorsport is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic sources is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic science is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic technology is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic education is not available during auto-create initialization
articleretrieval-flask_api-1 | WARNING:kafka.cluster:Topic space is not available during auto-create initialization
articleretrieval-flask_api-1 | INFO:kafka.consumer.subscription_state:Updated partition assignment: []
articleretrieval-flask_api-1 | INFO:kafka.conn:<BrokerConnection node_id=bootstrap-0 host=kafka:29092 <connected> [IPv4 ('172.19.0.4', 29092)]>: Closing connection.
kafkaConsumerThread.py:
class KafkaConsumerThread:
    def __init__(self, topics, db, logger):
        self.topics = topics
        self.db = db
        self.logger = logger

    def start(self):
        self.logger.debug("Getting the kafka consumer")
        try:
            consumer = KafkaConsumer(bootstrap_servers=['kafka:29092'],
                                     auto_offset_reset='earliest',
                                     # group_id='my_group',
                                     enable_auto_commit=False,
                                     value_deserializer=lambda x: json.loads(x.decode('utf-8')))
        except NoBrokersAvailable as err:
            self.logger.error("Unable to find a broker: {0}".format(err))
            time.sleep(1)
        consumer.subscribe(self.topics + ["sources"])
        for message in consumer:
            self.logger(message)
            if message.topic == "sources":
                self.db.insert_source_info(message.value["source_name"], message.value["source_info"])
            else:
                self.db.insert_article(message.topic, [message.value])
def on_send_success(record_metadata):
    return
    # print(record_metadata.topic)
    # print(record_metadata.partition)

def on_send_error(excp):
    print(excp)

def call_apis(self, topics, news_api, media_api):
    try:
        producer = KafkaProducer(bootstrap_servers=['kafka:29092'],
                                 max_block_ms=100000,
                                 value_serializer=lambda x: json.dumps(x).encode('utf-8'))
    except NoBrokersAvailable as err:
        # self.logger.error("Unable to find a broker: {0}".format(err))
        time.sleep(1)
    domains = []
    try:
        if producer:
            for topic in topics:
                articles = news_api.get_articles(topic)
                for article in articles:
                    if article['source'] != '':
                        if article['source'] not in domains:
                            domains.append(article['source'])
                    producer.send(topic, value=article).add_callback(on_send_success).add_errback(on_send_error)
            producer.flush()
            for domain in domains:
                source_info = media_api.get_source_domain_info(domain)
                if source_info:
                    producer.send("sources", value={"source_name": domain, "source_info": source_info}).add_callback(on_send_success).add_errback(on_send_error)
            # Flush the producer to ensure all messages are sent
            producer.flush()
    except AttributeError:
        self.logger.error("Unable to send message. The producer does not exist.")

class KafkaProducerThread:
    def __init__(self, topics, logger):
        self.topics = topics
        self.news_api = NewsApi()
        self.media_api = MediaWikiApi()
        self.logger = logger

    def start(self):
        # Call the APIs immediately when the thread starts
        call_apis(self, self.topics, self.news_api, self.media_api)
        # Use a timer to schedule the next API call
        timer = Timer(7200, self.start)
        timer.start()
kafkaAdminClient.py:
class KafkaAdminThread:
    def __init__(self, topics):
        self.topics = topics

    def start(self):
        admin_client = KafkaAdminClient(
            bootstrap_servers=['kafka:29092'],
            client_id='my_client'
        )
        topic_list = []
        for topic in self.topics:
            topic_list.append(NewTopic(name=topic, num_partitions=1, replication_factor=1))
        admin_client.create_topics(new_topics=topic_list, validate_only=False)
app.py:
if __name__ == "__main__":
    # Creating a new connection with mongo
    # threading.Thread(target=lambda: app.run(port=8080, host="0.0.0.0", debug=True, use_reloader=False)).start()
    executor = ThreadPoolExecutor(max_workers=4)
    producerThread = KafkaProducerThread(TOPICS, logging)
    adminThread = KafkaAdminThread(TOPICS)
    executor.submit(adminThread.start)
    flaskThread = threading.Thread(target=lambda: app.run(port=8080, host="0.0.0.0", debug=True, use_reloader=False))
    executor.submit(flaskThread.start())
    time.sleep(15)
    executor.submit(producerThread.start)
    consumerThread = KafkaConsumerThread(TOPICS, db, logging)
    executor.submit(consumerThread.start)
docker-compose.yml:
zookeeper:
  image: wurstmeister/zookeeper
  ports:
    - "2181:2181"
kafka:
  container_name: kafka_broker_1
  image: wurstmeister/kafka
  links:
    - zookeeper
  ports:
    - "9092:9092"
    - "29092:29092"
  depends_on:
    - zookeeper
  environment:
    KAFKA_ADVERTISED_HOSTNAME: kafka
    KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:29092,OUTSIDE://localhost:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
    KAFKA_LISTENERS: INSIDE://0.0.0.0:29092,OUTSIDE://0.0.0.0:9092
    KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
flask_api:
  build:
    context: .  # Very important: it refers to where the root of the build will be.
    dockerfile: Dockerfile
  links:
    - kafka
  environment:
    - FLASK-KAFKA_BOOTSTRAP-SERVERS=kafka:29092
    - SERVER_PORT=8080
  ports:
    - "8080:8080"
  depends_on:
    - kafka
the old topics in the consumer dont get deleted
Nothing in your shown code is deleting topics. The only way they would be deleted is if the Kafka container restarted since you've not mounted a volume for Kafka or Zookeeper to persist them.
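Deleting topics would have to be an explicit admin call; a minimal sketch with kafka-python's admin client (the broker address and client_id are taken from the question, TOPICS is the same list used in app.py, and it assumes topic deletion is enabled on the broker):

from kafka.admin import KafkaAdminClient

admin_client = KafkaAdminClient(bootstrap_servers=['kafka:29092'], client_id='my_client')
# Explicitly remove the topics created by KafkaAdminThread; nothing in the question's code does this.
admin_client.delete_topics(topics=TOPICS)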
and the database keeps getting populated with new and old values.
I assume your producer isn't keeping track of which sources it has already read? If so, you'll end up with duplicates in the topic. I suggest using kafka-console-consumer to debug whether the producer is actually working the way you want.
Likewise, you've disabled consumer auto commits, and I see no code committing manually, so when the consumer restarts, it'll re-process any existing data in the topics. Group / AdminClient settings shouldn't affect that, but setting a group will allow you to actually maintain offset tracking.
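A rough sketch of what a manually committing consumer loop could look like (based on the consumer in the question; the group name here is just an example):

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(bootstrap_servers=['kafka:29092'],
                         group_id='article-consumers',   # example group name; a group is needed so offsets are stored
                         auto_offset_reset='earliest',
                         enable_auto_commit=False,
                         value_deserializer=lambda x: json.loads(x.decode('utf-8')))
consumer.subscribe(['sources'])
for message in consumer:
    # ... insert message.value into the database ...
    consumer.commit()  # commit only after the write succeeded, so restarts don't re-process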
In terms of stability, I've used Flask and Kafka together without threads before and it worked fine, at least on the producer side. My suggestion would be to make a completely separate container for the consumer that's responsible for writing to the database; you don't need the overhead of the Flask framework for that. Or, recommended, use a Kafka Connect MongoDB sink instead.
The wurstmeister container supports creating topics on its own, via environment variables, by the way.

Cannot create local docker-compose file for Confluent Cloud Setup

I want to create a local kafka setup using docker-compose that replicates very closely the secured kafka setup in confluent cloud.
The cluster I have in Confluent Cloud can be connected to using
c = Consumer(
    {
        "bootstrap.servers": "broker_url",
        "sasl.mechanism": "PLAIN",
        "security.protocol": "SASL_SSL",
        "sasl.username": "key",
        "sasl.password": "secret",
        "group.id": "consumer-name",
    }
)
But I am unable to create a docker-compose.yml locally that has the same config and can be connected to using the same code.
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:6.2.0
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:6.2.0
    depends_on:
      - zookeeper
    ports:
      - '9092:9092'
      - '19092:19092'
    expose:
      - '29092'
    environment:
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENERS: INSIDE-DOCKER-NETWORK://0.0.0.0:29092,OTHER-DOCKER-NETWORK://0.0.0.0:19092,HOST://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: INSIDE-DOCKER-NETWORK://kafka:29092,OTHER-DOCKER-NETWORK://host.docker.internal:19092,HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE-DOCKER-NETWORK:PLAINTEXT,OTHER-DOCKER-NETWORK:PLAINTEXT,HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE-DOCKER-NETWORK
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      # Allow to swiftly purge the topics using retention.ms
      KAFKA_LOG_RETENTION_CHECK_INTERVAL_MS: 100
      # Security stuff
      KAFKA_LISTENER_NAME_EXTERNAL_PLAIN_SASL_JAAS_CONFIG: |
        org.apache.kafka.common.security.plain.PlainLoginModule required \
        username="broker" \
        password="broker" \
        user_alice="alice-secret";
      KAFKA_SASL_ENABLED_MECHANISMS: PLAIN
      KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL: SASL_SSL
Here is what I have in terms of the local docker-compose file, but it's not working.
This is the error I get when I try connecting using the same code:
%3|1628019607.757|FAIL|rdkafka#consumer-1| [thrd:sasl_ssl://localhost:9092/bootstrap]: sasl_ssl://localhost:9092/bootstrap: SSL handshake failed: Disconnected: connecting to a PLAINTEXT broker listener? (after 9ms in state SSL_HANDSHAKE)
Here's your hint: Disconnected: connecting to a PLAINTEXT broker listener?
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP only has PLAINTEXT mappings, so there is no SASL_SSL connection that your client can use.
As for what you did configure with SASL_SSL: you only have one broker, so KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL doesn't really do anything.
In this demo repo you can find brokers that use all possible protocol mappings.
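If you just want to check that the rest of the local setup works, the client can be pointed at the PLAINTEXT HOST listener that the compose file does expose; a sketch (it drops the SASL settings rather than replicating the Confluent Cloud security):

from confluent_kafka import Consumer

# Matches the HOST://localhost:9092 PLAINTEXT listener from the compose file above
c = Consumer(
    {
        "bootstrap.servers": "localhost:9092",
        "security.protocol": "PLAINTEXT",
        "group.id": "consumer-name",
    }
)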

Docker, Celery, method failing on compose

I'm trying out a FastAPI-based API with Celery, Redis, and RabbitMQ for the background tasks.
When doing docker-compose up, the Redis, RabbitMQ, and Flower parts work; I'm able to access the Flower dashboard.
But it then gets stuck on the Celery part.
The error:
rabbitmq_1 | 2020-09-08 06:32:38.552 [info] <0.716.0> connection <0.716.0> (172.22.0.6:49290 -> 172.22.0.2:5672): user 'user' authenticated and granted access to vhost '/'
celery-flower_1 | [W 200908 06:32:41 control:44] 'stats' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'active_queues' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'registered' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'scheduled' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'active' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'reserved' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'revoked' inspect method failed
celery-flower_1 | [W 200908 06:32:41 control:44] 'conf' inspect method failed
My docker-compose file:
version: "3.7"
services:
rabbitmq:
image: "bitnami/rabbitmq:3.7"
ports:
- "4000:4000"
- "5672:5672"
volumes:
- "rabbitmq_data:/bitnami"
redis:
image: "bitnami/redis:5.0.4"
environment:
- REDIS_PASSWORD=password123
ports:
- "5000:5000"
volumes:
- "redis_data:/bitnami/redis/data"
celery-flower:
image: gregsi/latest-celery-flower-docker:latest
environment:
- AMQP_USERNAME=user
- AMQP_PASSWORD=bitnami
- AMQP_ADMIN_USERNAME=user
- AMQP_ADMIN_PASSWORD=bitnami
- AMQP_HOST=rabbitmq
- AMQP_PORT=5672
- AMQP_ADMIN_HOST=rabbitmq
- AMQP_ADMIN_PORT=15672
- FLOWER_BASIC_AUTH=user:test
ports:
- "5555:5555"
depends_on:
- rabbitmq
- redis
fastapi:
build: .
ports:
- "8000:8000"
depends_on:
- rabbitmq
- redis
volumes:
- "./:/app"
command: "poetry run uvicorn app/app/main:app --bind 0.0.0.0:8000"
worker:
build: .
depends_on:
- rabbitmq
- redis
volumes:
- "./:/app"
command: "poetry run celery worker -A app.app.worker.celery_worker -l info -Q test-queue -c 1"
volumes:
rabbitmq_data:
driver: local
redis_data:
driver: local
My celery app:
celery_app = Celery(
    "worker",
    backend="redis://:password123@redis:6379/0",
    broker="amqp://user:bitnami@rabbitmq:5672//"
)
celery_app.conf.task_routes = {
    "app.app.worker.celery_worker.compute_stock_indicators": "stocks-queue"
}
celery_app.conf.update(task_track_started=True)
celery worker:
@celery_app.task(acks_late=True)
def compute_stock_indicators(stocks: list, background_task):
    stocks_with_indicators = {}
    for stock in stocks:
        current_task.update_state(state=Actions.STARTED,
                                  meta={f"starting to fetch {stock}'s indicators"})
        stock_indicators = fetch_stock_indicators(stock)  # Fetch the stock most recent indicators
        current_task.update_state(state=Actions.FINISHED,
                                  meta={f"{stock}'s indicators fetched"})
        stocks_with_indicators.update({stock: stock_indicators})
    current_task.update_state(state=Actions.PROGRESS,
                              meta={f"predicting {stocks}s..."})
The Fast API function:
log = logging.getLogger(__name__)
rabbit = RabbitMQHandler(host='localhost', port=5672, level="DEBUG")
log.addHandler(rabbit)

def celery_on_message(body):
    """
    Logs the initiation of the function
    """
    log.warning(body)

def background_on_message(task):
    """
    Logs the function when it is added to the queue
    """
    log.warning(task.get(on_message=celery_on_message, propagate=False))

app = FastAPI(debug=True)

@app.post("/")
async def initiator(stocks: FrozenSet, background_task: BackgroundTasks):
    """
    :param stocks: stocks to be analyzed
    :type stocks: set
    :param background_task: initiate the tasks queue
    :type background_task: starlette.background.BackgroundTasks
    """
    log.warning(msg=f'beginning analysis on: {stocks}')
    task_name = "app.app.worker.celery_worker.compute_stock_indicators"
    task = celery_app.send_task(task_name, args=[stocks, background_task])
    background_task.add_task(background_on_message, task)
    return {"message": "Stocks indicators successfully calculated, stocks sent to prediction"}
In the docker-compose file, in the worker section, the command reads:
command: "poetry run celery worker -A app.app.worker.celery_worker -l info -Q test-queue -c 1"
So essentially you are asking the worker to "watch" a queue named test-queue.
But in the celery_app, in the following section:
celery_app.conf.task_routes = {
    "app.app.worker.celery_worker.compute_stock_indicators": "stocks-queue"
}
you are defining a queue named stocks-queue.
Either change the docker-compose's or the celery_app's queue name to match the other.
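For example, keeping the worker command as it is, the route could point at the same queue (a sketch using the explicit {'queue': ...} form):

celery_app.conf.task_routes = {
    # Route the task to the queue the worker is actually watching (-Q test-queue)
    "app.app.worker.celery_worker.compute_stock_indicators": {"queue": "test-queue"}
}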
If you use Docker Toolbox on Windows, you should add port 5555 to the VirtualBox VM network:
First, run the following command in cmd:
docker-machine stop default
Then open VirtualBox, go to Settings > Network > Advanced > Port Forwarding, add a row with port 5555 and leave the name field blank.
Click OK and, back in cmd, run the following command:
docker-machine start default

Where to check swarm load balancer logs?

I have a docker-compose file as follows:
---
version: '3.7'
services:
  myapi:
    image: tiangolo/uwsgi-nginx-flask:python3.7
    env_file: apivars.env
    logging:
      driver: syslog
      options:
        syslog-address: "udp://127.0.0.1:514"
        tag: tags
        labels: labels
    ports:
      - "8080:80"
    deploy:
      placement:
        constraints:
          - node.role != manager
      mode: replicated
      replicas: 32
      update_config:
        parallelism: 4
        delay: 5s
        order: start-first
...
I have a load balancer which redirects requests to this swarm manager.
My understanding is that if I hit www.myapi.com, the request goes to the load balancer, then to the swarm manager, and the swarm manager sends it to one of the 32 replicas.
Now the issue is that the load balancer logs report some 502 errors.
# head -n1 /var/log/haproxy.log
Apr 28 09:35:28 localhost haproxy[43117]: 172.19.9.1:50220 [28/Apr/2020:09:35:08.549] main~ API_Production/swarmnode5 0/0/1/19952/19953 502 309 - - ---- 97/97/10/1/0 0/0 "GET /v2/students/?includeFields=name,id&per_page=1000&page=88 HTTP/1.1"
How can I check whether the request reached the swarm manager or swarmnode5?
I checked the nginx logs, but they don't report any 502 errors. There are some exceptions, but if the exception is in the code, why doesn't nginx log that API call and response?

Load balance docker swarm

I have a Docker swarm (swarm mode) with one HAProxy container and 3 Python web apps. The HAProxy container exposes port 80 and should load balance across the 3 containers of my app (by leastconn).
Here is my docker-compose.yml file:
version: '3'
services:
  scraper-node:
    image: scraper
    ports:
      - 5000
    volumes:
      - /profiles:/profiles
    command: >
      bash -c "
      cd src;
      gunicorn src.interface:app \
        --bind=0.0.0.0:5000 \
        --workers=1 \
        --threads=1 \
        --timeout 500 \
        --log-level=debug \
      "
    environment:
      - SERVICE_PORTS=5000
    deploy:
      replicas: 3
      update_config:
        parallelism: 5
        delay: 10s
      restart_policy:
        condition: on-failure
        max_attempts: 3
        window: 120s
    networks:
      - web
  proxy:
    image: dockercloud/haproxy
    depends_on:
      - scraper-node
    environment:
      - BALANCE=leastconn
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 80:80
    networks:
      - web
networks:
  web:
    driver: overlay
When I deploy this swarm (docker stack deploy --compose-file=docker-compose.yml scraper) I get all of my containers:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
245f4bfd1299 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
995aefdb9346 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb
a51474322583 scraper:latest "/docker-entrypoin..." 21 hours ago Up 19 minutes 80/tcp, 5000/tcp, 8000/tcp scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e
3f97f34678d1 dockercloud/haproxy "/sbin/tini -- doc..." 21 hours ago Up 19 minutes 80/tcp, 443/tcp, 1936/tcp scraper_proxy.1.rng5ysn8v48cs4nxb1atkrz73
And when I display the haproxy container log, it looks like it recognizes the 3 Python containers:
INFO:haproxy:dockercloud/haproxy 1.6.6 is running outside Docker Cloud
INFO:haproxy:Haproxy is running in SwarmMode, loading HAProxy definition through docker api
INFO:haproxy:dockercloud/haproxy PID: 6
INFO:haproxy:=> Add task: Initial start - Swarm Mode
INFO:haproxy:=> Executing task: Initial start - Swarm Mode
INFO:haproxy:==========BEGIN==========
INFO:haproxy:Linked service: scraper_scraper-node
INFO:haproxy:Linked container: scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e, scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb, scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp
INFO:haproxy:HAProxy configuration:
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
log-send-hostname
maxconn 4096
pidfile /var/run/haproxy.pid
user haproxy
group haproxy
daemon
stats socket /var/run/haproxy.stats level admin
ssl-default-bind-options no-sslv3
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES128-SHA:AES256-GCM-SHA384:AES256-SHA256:AES256-SHA:DHE-DSS-AES128-SHA:DES-CBC3-SHA
defaults
balance leastconn
log global
mode http
option redispatch
option httplog
option dontlognull
option forwardfor
timeout connect 5000
timeout client 50000
timeout server 50000
listen stats
bind :1936
mode http
stats enable
timeout connect 10s
timeout client 1m
timeout server 1m
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
stats auth stats:stats
frontend default_port_80
bind :80
reqadd X-Forwarded-Proto:\ http
maxconn 4096
default_backend default_service
backend default_service
server scraper_scraper-node.1.0u8q4zn432n7p5gl93ohqio8e 10.0.0.5:5000 check inter 2000 rise 2 fall 3
server scraper_scraper-node.2.wem9v2nug8wqos7d97zknuvqb 10.0.0.6:5000 check inter 2000 rise 2 fall 3
server scraper_scraper-node.3.iyi33hv9tikmf6m2wna0cypgp 10.0.0.7:5000 check inter 2000 rise 2 fall 3
INFO:haproxy:Launching HAProxy
INFO:haproxy:HAProxy has been launched(PID: 12)
INFO:haproxy:===========END===========
But when I try to GET to http://localhost I get an error message:
<html>
<body>
<h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body>
</html>
There were two problems:
The command in the docker-compose.yml file should be one line.
The scraper image should expose port 5000 (in its Dockerfile).
Once I fixed those, I deployed the swarm the same way (with stack), and the proxy container recognized the Python containers and was able to load balance between them.
A 503 error usually means a failed health check to the backend server.
Your stats page might be helpful here: if you mouse over the LastChk column of one of your DOWN backend servers, HAProxy will give you a vague summary of why that server is DOWN.
It does not look like you configured the health check (option httpchk) for your default_service backend: can you reach any of your backend servers directly (e.g. curl --head 10.0.0.5:5000)? From the HAProxy documentation:
[R]esponses 2xx and 3xx are considered valid, while all other ones indicate a server failure, including the lack of any response.
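To mirror that curl check from Python (a sketch; the address comes from one of the backend lines in the generated HAProxy config above):

import requests

# A HEAD request against one backend task directly; a 2xx/3xx response is what
# HAProxy's HTTP health check would consider healthy.
resp = requests.head("http://10.0.0.5:5000", timeout=5)
print(resp.status_code)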
