Client keeps waiting for RabbitMQ response - Python

I am using RabbitMQ to launch processes on remote hosts located in other parts of the world. E.g., RabbitMQ runs on an Oregon host and receives a client message to launch processes in Ireland and California.
Most of the time the processes are launched and, when they finish, RabbitMQ returns the output to the client. But sometimes the jobs finish successfully yet RabbitMQ never returns the output, and the client keeps hanging, waiting for the response. These processes can take 10 minutes to execute, so the client hangs for 10 minutes waiting for a response that never arrives.
I am using Celery to connect to RabbitMQ, and the client calls are blocking, using task.get(). In other words, the client hangs until it receives the response to its call. I would like to understand why the client does not get the response when the jobs have finished. How can I debug this problem?
Here is my celeryconfig.py
import os
import sys
# add hadoop python to the env, just for the running
sys.path.append(os.path.dirname(os.path.basename(__file__)))
# broker configuration
# medusa-rabbitmq is the name of the hosts where rabbitmq is running
BROKER_URL = "amqp://celeryuser:celery@medusa-rabbitmq/celeryvhost"
CELERY_RESULT_BACKEND = "amqp"
TEST_RUNNER = 'celery.contrib.test_runner.run_tests'
# for debug
# CELERY_ALWAYS_EAGER = True
# module loaded
CELERY_IMPORTS = ("medusa.mergedirs", "medusa.medusasystem",
                  "medusa.utility", "medusa.pingdaemon", "medusa.hdfs", "medusa.vote.voting")

Related

Send tasks to a Celery app on a remote server

I have a server (Ubuntu Server) on the local network at IP address 192.168.1.9.
This server is running RabbitMQ in Docker.
I defined a basic Celery app:
from celery import Celery

app = Celery(
    'tasks',
    brocker='pyamqp://<username>:<password>@localhost//',
    backend='rpc://',
)

@app.task
def add(x, y):
    return x + y
Connected to the server, I run the worker with celery -A tasks worker --loglevel=INFO -c 2 -E
On my local laptop, in a Python shell, I try to execute the task remotely by creating a new Celery instance, this time with the IP address of my remote server.
from celery import Celery

app = Celery(
    'tasks',
    brocker='pyamqp://<username>:<password>@192.168.1.9//',
    backend='rpc://',
)

result = app.send_task('add', (2, 2))
# Note: I also tried app.send_task('tasks.add', (2, 2))
And from there nothing happens: the task stays PENDING forever, I can't see anything in the logs, and the server doesn't seem to pick up the task.
If I connect to the server and run the same commands locally (but with localhost as the address) it works fine.
What is wrong? How can I send tasks remotely?
Thank you.
The task name is your Celery app module's path plus the task name, because that is the file you defined it in.
Alternatively, you can start your worker with DEBUG logging, which will list all registered tasks:
celery -A tasks worker -l DEBUG
It should be
result = app.send_task('tasks.<celery_file>.add', (2,2))
But IMO you should use something like the Flower API (https://flower.readthedocs.io/en/latest/api.html) for a more stable interface.
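As an added illustration (not part of the original answer), you can also confirm the exact task names the workers registered from the client side with Celery's inspect API; this sketch assumes the broker URL from the question is reachable:
from celery import Celery

app = Celery('tasks', broker='pyamqp://<username>:<password>@192.168.1.9//', backend='rpc://')
# {worker_name: [registered task names], ...}, or None if no worker replies
registered = app.control.inspect().registered()
print(registered)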
Actually there was just a typo: a brocker argument instead of broker.
In [1]: from celery import Celery
In [2]: app = Celery('tasks', broker='amqp://<username>:<password>@192.168.31.9:5672//', backend='rpc://')
In [3]: result = app.send_task('tasks.add', (2, 3))
In [4]: result.get()
Out[4]: 5

Slow response time from Twisted Web Server

I've created a web server using Twisted to handle requests for a "real time" game.
The server loops at 60 Hz on a separate thread and updates all the clients.
Setup:
Twisted 16.6.0
Debian 64 bits
Python 2.7.12
The problem is that when no messages are sent by the clients, the server starts to respond only once per second instead of 60 times per second, affecting all clients. As soon as at least one client starts to send messages (such as its mouse position at 60 Hz) everything works fine. This happens even when only one client is connected.
I've tried both the select() and poll() reactors, but no luck.
That's how the reactor gets started:
# imports for the snippet (AppGameFactory, AppGameServerProtocol and the root
# resource are defined elsewhere in the application; WebSocketResource comes
# from autobahn.twisted.resource in a typical setup)
import sys

from twisted.python import log
from twisted.internet import reactor
from twisted.web.server import Site

if __name__ == "__main__":
    log.startLogging(sys.stdout)
    factory = AppGameFactory(u"ws://127.0.0.1:8080")
    factory.protocol = AppGameServerProtocol
    resource = WebSocketResource(factory)
    # websockets resource on "/ws" path
    root.putChild(u"ws", resource)
    site = Site(root)
    reactor.listenTCP(8080, site)
    reactor.run()
Any ideas? Am I missing anything?
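One thing worth checking, added here purely as an illustration: the Twisted reactor is not thread-safe, so a 60 Hz loop running on a separate thread must reach Twisted objects through reactor.callFromThread, and a common alternative is to let the reactor drive the loop itself with twisted.internet.task.LoopingCall. A minimal sketch, where update_clients and the factory.clients attribute are hypothetical stand-ins for the asker's own broadcast logic:
from twisted.internet import reactor
from twisted.internet.task import LoopingCall

def update_clients(factory):
    # hypothetical broadcast step; the real update logic lives in the asker's code
    for proto in getattr(factory, "clients", []):
        proto.sendMessage(b"tick")

loop = LoopingCall(update_clients, factory)   # 'factory' from the snippet above
loop.start(1.0 / 60.0)                        # runs inside the reactor at ~60 Hz
reactor.run()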

Where to place register code to zookeeper when using nd_service_registry with uwsgi+Django stack?

I'm using nd_service_registry to register my Django service with ZooKeeper; the service is launched with uWSGI.
versions:
uWSGI==2.0.10
Django==1.7.5
My question is: what is the correct way to place the nd_service_registry.set_node code so that the service registers itself with the ZooKeeper server, avoiding duplicate register or deregister calls?
my uwsgi config ini, with processes=2, enable-threads=true, threads=2:
[uwsgi]
chdir = /data/www/django-proj/src
module = settings.wsgi:application
env = DJANGO_SETTINGS_MODULE=settings.test
master = true
pidfile = /tmp/uwsgi-proj.pid
socket = /tmp/uwsgi_proj.sock
processes = 2
threads = 2
harakiri = 20
max-requests = 50000
vacuum = true
home = /data/www/django-proj/env
enable-threads = true
buffer-size = 65535
chmod-socket=666
register code:
from nd_service_registry import KazooServiceRegistry
nd = KazooServiceRegistry(server=ZOOKEEPER_SERVER_URL)
nd.set_node('/web/test/server0', {'host': 'localhost', 'port': 80})
I've tested these cases and both worked as expected: the Django service registered only once, at uWSGI master process startup.
place the code in settings.py
place the code in wsgi.py
Even if I kill uWSGI worker processes (the master process then relaunches another worker), or let a worker be killed and restarted by the uWSGI harakiri option, no new register action is triggered.
So my question is whether my register code is correct for Django + uWSGI with processes and threads enabled, and where to place it.
The problem happens when you use uWSGI in master/worker mode. When the uWSGI master process spawns workers, the ZooKeeper connection, which is maintained by a thread inside the ZooKeeper client, is not copied to the workers correctly. So in a uWSGI application you should use the uwsgidecorators.postfork decorator to run the register code: a function decorated with @postfork is called each time a new worker is spawned.
Hope it helps.
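For example, a minimal sketch of that approach (added here as an illustration, not part of the original answer; it reuses ZOOKEEPER_SERVER_URL and the node path from the question and assumes the module is one the workers import, e.g. wsgi.py):
from uwsgidecorators import postfork
from nd_service_registry import KazooServiceRegistry

# ZOOKEEPER_SERVER_URL as in the question's snippet
@postfork
def register_to_zookeeper():
    # runs in every worker right after fork, so each worker owns its own ZooKeeper connection
    nd = KazooServiceRegistry(server=ZOOKEEPER_SERVER_URL)
    nd.set_node('/web/test/server0', {'host': 'localhost', 'port': 80})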

Celery Closes Unexpectedly After Longer Inactivity

So I am using RabbitMQ + Celery to create a simple RPC architecture. I have one RabbitMQ message broker and one remote worker which runs the Celery daemon.
There is a third server which exposes a thin RESTful API. When it receives an HTTP request, it sends a task to the remote worker, waits for the response and returns it.
This works great most of the time. However, I have noticed that after a longer period of inactivity (say 5 minutes with no incoming requests), the Celery worker behaves strangely. The first 3 tasks received after such an idle period return this error:
exchange.declare: connection closed unexpectedly
After those three erroneous tasks it works again. If there are no tasks for a longer period of time, the same thing happens. Any ideas?
My init script for the Celery worker:
# description "Celery worker using sync broker"
console log
start on runlevel [2345]
stop on runlevel [!2345]
setuid richard
setgid richard
script
chdir /usr/local/myproject/myproject
exec /usr/local/myproject/venv/bin/celery worker -n celery_worker_deamon.%h -A proj.sync_celery -Q sync_queue -l info --autoscale=10,3 --autoreload --purge
end script
respawn
My celery config:
# Synchronous blocking tasks
BROKER_URL_SYNC = 'amqp://guest:guest@localhost:5672//'
# Asynchronous non blocking tasks
BROKER_URL_ASYNC = 'amqp://guest:guest@localhost:5672//'
#: Only add pickle to this list if your broker is secured
#: from unwanted access (see userguide/security.html)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True
CELERY_BACKEND = 'amqp'
# http://docs.celeryproject.org/en/latest/userguide/tasks.html#disable-rate-limits-if-they-re-not-used
CELERY_DISABLE_RATE_LIMITS = True
# http://docs.celeryproject.org/en/latest/userguide/routing.html
CELERY_DEFAULT_QUEUE = 'sync_queue'
CELERY_DEFAULT_EXCHANGE = "tasks"
CELERY_DEFAULT_EXCHANGE_TYPE = "topic"
CELERY_DEFAULT_ROUTING_KEY = "sync_task.default"
CELERY_QUEUES = {
    'sync_queue': {
        'binding_key': 'sync_task.#',
    },
    'async_queue': {
        'binding_key': 'async_task.#',
    },
}
Any ideas?
EDIT:
Ok, now it appears to happen randomly. I noticed this in RabbitMQ logs:
=WARNING REPORT==== 6-Jan-2014::17:31:54 ===
closing AMQP connection <0.295.0> (some_ip_address:36842 -> some_ip_address:5672):
connection_closed_abruptly
Is your RabbitMQ server or your Celery worker behind a load balancer by any chance? If so, the load balancer is closing the TCP connection after some period of inactivity, in which case you will have to enable heartbeats from the client (worker) side. If you do, I would not recommend using the pure-Python amqp library for this; replace it with librabbitmq instead.
connection_closed_abruptly is logged when a client disconnects without going through the proper AMQP shutdown handshake:
channel.close(...)
Request a channel close.
This method indicates that the sender wants to close the channel.
This may be due to internal conditions (e.g. a forced shut-down) or due to
an error handling a specific method, i.e. an exception.
When a close is due to an exception, the sender provides the class and method id of
the method which caused the exception.
After sending this method, any received methods except Close and Close-OK MUST be discarded. The response to receiving a Close after sending Close must be to send Close-Ok.
channel.close-ok():
Confirm a channel close.
This method confirms a Channel.Close method and tells the recipient
that it is safe to release resources for the channel.
A peer that detects a socket closure without having received a
Channel.Close-Ok handshake method SHOULD log the error.
Here is an issue about that.
Can you set custom values for BROKER_HEARTBEAT and BROKER_HEARTBEAT_CHECKRATE and check again? For example:
BROKER_HEARTBEAT = 10
BROKER_HEARTBEAT_CHECKRATE = 2.0

How to stop a Flask server running gevent-socketio

I have a Flask application running with gevent-socketio that I create this way:
server = SocketIOServer(('localhost', 2345), app, resource='socket.io')
gevent.spawn(send_queued_messages_loop, server)
server.serve_forever()
I launch send_queued_messages_loop in a gevent greenlet that keeps polling a gevent.Queue where my program stores data to send to the connected socket.io clients.
I tried different approaches to stop the server (such as using sys.exit), either from the socket.io handler (when the client sends a socket.io message) or from a normal route (when the client makes a request to /shutdown), but in every case sys.exit seems to fail because of the presence of greenlets.
I tried to call gevent.shutdown() first, but this does not seem to change anything.
What would be the proper way to shut down the server?
Instead of using serve_forever() create a gevent.event.Event and wait for it. To actually initiate shutdown, trigger the event using its set() method:
from gevent.event import Event

stopper = Event()
server = SocketIOServer(('localhost', 2345), app, resource='socket.io')
server.start()
gevent.spawn(send_queued_messages_loop)

try:
    stopper.wait()
except KeyboardInterrupt:
    print
No matter from where you now want to terminate your process, all you need to do is call stopper.set().
The try..except is not really necessary, but I prefer not getting a stack trace on a clean CTRL-C exit.
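To tie this back to the /shutdown route mentioned in the question, a minimal sketch (the route handler body is an illustrative assumption, not part of the original answer):
# hypothetical Flask view; 'app' and 'stopper' refer to the objects created above
@app.route('/shutdown')
def shutdown():
    stopper.set()          # wakes the main greenlet blocked in stopper.wait()
    return 'shutting down'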
