How to pause and resume consumption gracefully in rabbitmq, pika python - python

I'm using basic_consume() for receiving messages and basic_cancel for canceling consuming, but there is a problem.
Here is the code of pika.channel
def basic_consume(self, consumer_callback, queue='', no_ack=False,
exclusive=False, consumer_tag=None):
"""Sends the AMQP command Basic.Consume to the broker and binds messages
for the consumer_tag to the consumer callback. If you do not pass in
a consumer_tag, one will be automatically generated for you. Returns
the consumer tag.
For more information on basic_consume, see:
http://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.consume
:param method consumer_callback: The method to callback when consuming
:param queue: The queue to consume from
:type queue: str or unicode
:param bool no_ack: Tell the broker to not expect a response
:param bool exclusive: Don't allow other consumers on the queue
:param consumer_tag: Specify your own consumer tag
:type consumer_tag: str or unicode
:rtype: str
"""
self._validate_channel_and_callback(consumer_callback)
# If a consumer tag was not passed, create one
consumer_tag = consumer_tag or 'ctag%i.%s' % (self.channel_number,
uuid.uuid4().get_hex())
if consumer_tag in self._consumers or consumer_tag in self._cancelled:
raise exceptions.DuplicateConsumerTag(consumer_tag)
self._consumers[consumer_tag] = consumer_callback
self._pending[consumer_tag] = list()
self._rpc(spec.Basic.Consume(queue=queue,
consumer_tag=consumer_tag,
no_ack=no_ack,
exclusive=exclusive),
self._on_eventok,
[(spec.Basic.ConsumeOk,
{'consumer_tag': consumer_tag})])
return consumer_tag
def basic_cancel(self, callback=None, consumer_tag='', nowait=False):
"""This method cancels a consumer. This does not affect already
delivered messages, but it does mean the server will not send any more
messages for that consumer. The client may receive an arbitrary number
of messages in between sending the cancel method and receiving the
cancel-ok reply. It may also be sent from the server to the client in
the event of the consumer being unexpectedly cancelled (i.e. cancelled
for any reason other than the server receiving the corresponding
basic.cancel from the client). This allows clients to be notified of
the loss of consumers due to events such as queue deletion.
:param method callback: Method to call for a Basic.CancelOk response
:param str consumer_tag: Identifier for the consumer
:param bool nowait: Do not expect a Basic.CancelOk response
:raises: ValueError
"""
self._validate_channel_and_callback(callback)
if consumer_tag not in self.consumer_tags:
return
if callback:
if nowait is True:
raise ValueError('Can not pass a callback if nowait is True')
self.callbacks.add(self.channel_number,
spec.Basic.CancelOk,
callback)
self._cancelled.append(consumer_tag)
self._rpc(spec.Basic.Cancel(consumer_tag=consumer_tag,
nowait=nowait),
self._on_cancelok,
[(spec.Basic.CancelOk,
{'consumer_tag': consumer_tag})] if nowait is False else [])
As you can see every time I'm cancelling consumption consumer_tag is added to _canceled list. And if I would use this tag in basic_consume again the duplicateConsumer exception will be raised.
Well, I could use a new consumer_tag every time, but in fact I'm not. Because sooner or later generated tag would be exactly match some of the previous ones.
How should I pause and resume consumption gracefully in pika?

Is there a reason why you define your own consumer_tags? You can pass an empty string and let RabbitMQ generate consumer tags for you. The reply from basic.consume, which is basic.consume-ok will return the generated consumer_tag, so you can use it later to stop consuming.
See: http://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.consume-ok

That looks like Pika is doing more than it should - it doesn't need to create a consumer tag if one is not supplied (the server will) and it also doesn't need to watch for duplicated consumer tags (resuming with the same tag is supported by the server).
So I'm not sure how to do this with Pika - file a bug I suppose.

Related

How to wait for coroutines to complete synchronously within method if event loop is already running?

I'm trying to create a Python-based CLI that communicates with a web service via websockets. One issue that I'm encountering is that requests made by the CLI to the web service intermittently fail to get processed. Looking at the logs from the web service, I can see that the problem is caused by the fact that frequently these requests are being made at the same time (or even after) the socket has closed:
2016-09-13 13:28:10,930 [22 ] INFO DeviceBridge - Device bridge has opened
2016-09-13 13:28:11,936 [21 ] DEBUG DeviceBridge - Device bridge has received message
2016-09-13 13:28:11,937 [21 ] DEBUG DeviceBridge - Device bridge has received valid message
2016-09-13 13:28:11,937 [21 ] WARN DeviceBridge - Unable to process request: {"value": false, "path": "testcube.pwms[0].enabled", "op": "replace"}
2016-09-13 13:28:11,936 [5 ] DEBUG DeviceBridge - Device bridge has closed
In my CLI I define a class CommunicationService that is responsible for handling all direct communication with the web service. Internally, it uses the websockets package to handle communication, which itself is built on top of asyncio.
CommunicationService contains the following method for sending requests:
def send_request(self, request: str) -> None:
logger.debug('Sending request: {}'.format(request))
asyncio.ensure_future(self._ws.send(request))
...where ws is a websocket opened earlier in another method:
self._ws = await websockets.connect(websocket_address)
What I want is to be able to await the future returned by asyncio.ensure_future and, if necessary, sleep for a short while after in order to give the web service time to process the request before the websocket is closed.
However, since send_request is a synchronous method, it can't simply await these futures. Making it asynchronous would be pointless as there would be nothing to await the coroutine object it returned. I also can't use loop.run_until_complete as the loop is already running by the time it is invoked.
I found someone describing a problem very similar to the one I have at mail.python.org. The solution that was posted in that thread was to make the function return the coroutine object in the case the loop was already running:
def aio_map(coro, iterable, loop=None):
if loop is None:
loop = asyncio.get_event_loop()
coroutines = map(coro, iterable)
coros = asyncio.gather(*coroutines, return_exceptions=True, loop=loop)
if loop.is_running():
return coros
else:
return loop.run_until_complete(coros)
This is not possible for me, as I'm working with PyRx (Python implementation of the reactive framework) and send_request is only called as a subscriber of an Rx observable, which means the return value gets discarded and is not available to my code:
class AnonymousObserver(ObserverBase):
...
def _on_next_core(self, value):
self._next(value)
On a side note, I'm not sure if this is some sort of problem with asyncio that's commonly come across or whether I'm just not getting it, but I'm finding it pretty frustrating to use. In C# (for instance), all I would need to do is probably something like the following:
void SendRequest(string request)
{
this.ws.Send(request).Wait();
// Task.Delay(500).Wait(); // Uncomment If necessary
}
Meanwhile, asyncio's version of "wait" unhelpfully just returns another coroutine that I'm forced to discard.
Update
I've found a way around this issue that seems to work. I have an asynchronous callback that gets executed after the command has executed and before the CLI terminates, so I just changed it from this...
async def after_command():
await comms.stop()
...to this:
async def after_command():
await asyncio.sleep(0.25) # Allow time for communication
await comms.stop()
I'd still be happy to receive any answers to this problem for future reference, though. I might not be able to rely on workarounds like this in other situations, and I still think it would be better practice to have the delay executed inside send_request so that clients of CommunicationService do not have to concern themselves with timing issues.
In regards to Vincent's question:
Does your loop run in a different thread, or is send_request called by some callback?
Everything runs in the same thread - it's called by a callback. What happens is that I define all my commands to use asynchronous callbacks, and when executed some of them will try to send a request to the web service. Since they're asynchronous, they don't do this until they're executed via a call to loop.run_until_complete at the top level of the CLI - which means the loop is running by the time they're mid-way through execution and making this request (via an indirect call to send_request).
Update 2
Here's a solution based on Vincent's proposal of adding a "done" callback.
A new boolean field _busy is added to CommunicationService to represent if comms activity is occurring or not.
CommunicationService.send_request is modified to set _busy true before sending the request, and then provides a callback to _ws.send to reset _busy once done:
def send_request(self, request: str) -> None:
logger.debug('Sending request: {}'.format(request))
def callback(_):
self._busy = False
self._busy = True
asyncio.ensure_future(self._ws.send(request)).add_done_callback(callback)
CommunicationService.stop is now implemented to wait for this flag to be set false before progressing:
async def stop(self) -> None:
"""
Terminate communications with TestCube Web Service.
"""
if self._listen_task is None or self._ws is None:
return
# Wait for comms activity to stop.
while self._busy:
await asyncio.sleep(0.1)
# Allow short delay after final request is processed.
await asyncio.sleep(0.1)
self._listen_task.cancel()
await asyncio.wait([self._listen_task, self._ws.close()])
self._listen_task = None
self._ws = None
logger.info('Terminated connection to TestCube Web Service')
This seems to work too, and at least this way all communication timing logic is encapsulated within the CommunicationService class as it should be.
Update 3
Nicer solution based on Vincent's proposal.
Instead of self._busy we have self._send_request_tasks = [].
New send_request implementation:
def send_request(self, request: str) -> None:
logger.debug('Sending request: {}'.format(request))
task = asyncio.ensure_future(self._ws.send(request))
self._send_request_tasks.append(task)
New stop implementation:
async def stop(self) -> None:
if self._listen_task is None or self._ws is None:
return
# Wait for comms activity to stop.
if self._send_request_tasks:
await asyncio.wait(self._send_request_tasks)
...
You could use a set of tasks:
self._send_request_tasks = set()
Schedule the tasks using ensure_future and clean up using add_done_callback:
def send_request(self, request: str) -> None:
task = asyncio.ensure_future(self._ws.send(request))
self._send_request_tasks.add(task)
task.add_done_callback(self._send_request_tasks.remove)
And wait for the set of tasks to complete:
async def stop(self):
if self._send_request_tasks:
await asyncio.wait(self._send_request_tasks)
Given that you're not inside an asynchronous function you can use the yield from keyword to effectively implement await yourself. The following code will block until the future returns:
def send_request(self, request: str) -> None:
logger.debug('Sending request: {}'.format(request))
future = asyncio.ensure_future(self._ws.send(request))
yield from future.__await__()

Stop channel.basic_consume if the connection is idle/Not consuming from long time

I am having a use case in which i want get the last idle time(last message processed time) of a pika consumer(pika.BlockingConnection).
Usecase:
If the last processed time is greater than Threshold time(ex: 1 hr). I want the consumer to get exited or have a callback method to decide on what i need to do? Like sending a notification to a user.
Is there any way to do this?
pika supports a timeout callback.
You could add this callback at the end of each message receipt, keeping the reference to it, and removing it at the start of each message receipt.
def close_connec():
# close here
timer_id = None
def on_message(chan, method, props, body):
if timer_id is not None:
chan.connection.remove_timeout(timer_id)
# process message
timer_id = chan.connection.add_timeout(3600, close_connec)

Rabbitmq remote call with Pika

I am new to rabbitmq and trying to figure out how I can make a client request a server with information about memory and CPU utilization with this tutorial (https://www.rabbitmq.com/tutorials/tutorial-six-python.html).
So the client requests for CPU and memory ( I believe I will need two queues) and the server respond with the values.
Is there anyway to simple create a client.py and server.py with this case using the Pika library in Python.
I would recommend you to follow the first RabbitMQ tutorials if you haven't already. The RPC example builds on concepts covered on previous examples (direct queues, exclusive queues, acknowledgements, etc.).
The RPC solution proposed on the tutorial requires at least two queues, depending on how many clients you want to use:
One direct queue (rpc_queue), used to send requests from the client to the server.
One exclusive queue per client, used to receive responses.
The request/response cycle:
The client sends a message to the rpc_queue. Each message includes a reply_to property, with the name of the client exclusive queue the server should reply to, and a correlation_id property, which is just an unique id used to track the request.
The server waits for messages on the rpc_queue. When a message arrives, it prepares the response, adds the correlation_id to the new message, and sends it to the queue defined in the reply_to message property.
The client waits on its exclusive queue until it finds a message with the correlation_id that was originally generated.
Jumping straight to your problem, the first thing to do is to define the message format you'll want to use on your responses. You can use JSON, msgpack or any other serialization library. For example, if using JSON, one message could look something like this:
{
"cpu": 1.2,
"memory": 0.3
}
Then, on your server.py:
def on_request(channel, method, props, body):
response = {'cpu': current_cpu_usage(),
'memory': current_memory_usage()}
properties = pika.BasicProperties(correlation_id=props.correlation_id)
channel.basic_publish(exchange='',
routing_key=props.reply_to,
properties=properties,
body=json.dumps(response))
channel.basic_ack(delivery_tag=method.delivery_tag)
# ...
And on your client.py:
class ResponseTimeout(Exception): pass
class Client:
# similar constructor as `FibonacciRpcClient` from tutorial...
def on_response(self, channel, method, props, body):
if self.correlation_id == props.correlation_id:
self.response = json.loads(body.decode())
def call(self, timeout=2):
self.response = None
self.correlation_id = str(uuid.uuid4())
self.channel.basic_publish(exchange='',
routing_key='rpc_queue',
properties=pika.BasicProperties(
reply_to=self.callback_queue,
correlation_id=self.correlation_id),
body='')
start_time = time.time()
while self.response is None:
if (start_time + timeout) < time.time():
raise ResponseTimeout()
self.connection.process_data_events()
return self.response
As you see, the code is pretty much the same as the original FibonacciRpcClient. The main differences are:
We use JSON as data format for our messages.
Our client call() method doesn't require a body argument (there's nothing to send to the server)
We take care of response timeouts (if the server is down, or if it doesn't reply to our messages)
Still, there're a lot of things to improve here:
No error handling: For example, if the client "forgets" to send a reply_to queue, our server is gonna crash, and will crash again on restart (the broken message will be requeued infinitely as long as it isn't acknowledged by our server)
We don't handle broken connections (no reconnection mechanism)
...
You may also consider replacing the RPC approach with a publish/subscribe pattern; in this way, the server simply broadcasts its CPU/memory state every X time interval, and one or more clients receive the updates.

How to create a delayed queue in RabbitMQ?

What is the easiest way to create a delay (or parking) queue with Python, Pika and RabbitMQ? I have seen an similar questions, but none for Python.
I find this an useful idea when designing applications, as it allows us to throttle messages that needs to be re-queued again.
There are always the possibility that you will receive more messages than you can handle, maybe the HTTP server is slow, or the database is under too much stress.
I also found it very useful when something went wrong in scenarios where there is a zero tolerance to losing messages, and while re-queuing messages that could not be handled may solve that. It can also cause problems where the message will be queued over and over again. Potentially causing performance issues, and log spam.
I found this extremely useful when developing my applications. As it gives you an alternative to simply re-queuing your messages. This can easily reduce the complexity of your code, and is one of many powerful hidden features in RabbitMQ.
Steps
First we need to set up two basic channels, one for the main queue, and one for the delay queue. In my example at the end, I include a couple of additional flags that are not required, but makes the code more reliable; such as confirm delivery, delivery_mode and durable. You can find more information on these in the RabbitMQ manual.
After we have set up the channels we add a binding to the main channel that we can use to send messages from the delay channel to our main queue.
channel.queue_bind(exchange='amq.direct',
queue='hello')
Next we need to configure our delay channel to forward messages to the main queue once they have expired.
delay_channel.queue_declare(queue='hello_delay', durable=True, arguments={
'x-message-ttl' : 5000,
'x-dead-letter-exchange' : 'amq.direct',
'x-dead-letter-routing-key' : 'hello'
})
x-message-ttl (Message - Time To Live)
This is normally used to automatically remove old messages in the
queue after a specific duration, but by adding two optional arguments we
can change this behaviour, and instead have this parameter determine
in milliseconds how long messages will stay in the delay queue.
x-dead-letter-routing-key
This variable allows us to transfer the message to a different queue
once they have expired, instead of the default behaviour of removing
it completely.
x-dead-letter-exchange
This variable determines which Exchange used to transfer the message from hello_delay to hello queue.
Publishing to the delay queue
When we are done setting up all the basic Pika parameters you simply send a message to the delay queue using basic publish.
delay_channel.basic_publish(exchange='',
routing_key='hello_delay',
body="test",
properties=pika.BasicProperties(delivery_mode=2))
Once you have executed the script you should see the following queues created in your RabbitMQ management module.
Example.
import pika
connection = pika.BlockingConnection(pika.ConnectionParameters(
'localhost'))
# Create normal 'Hello World' type channel.
channel = connection.channel()
channel.confirm_delivery()
channel.queue_declare(queue='hello', durable=True)
# We need to bind this channel to an exchange, that will be used to transfer
# messages from our delay queue.
channel.queue_bind(exchange='amq.direct',
queue='hello')
# Create our delay channel.
delay_channel = connection.channel()
delay_channel.confirm_delivery()
# This is where we declare the delay, and routing for our delay channel.
delay_channel.queue_declare(queue='hello_delay', durable=True, arguments={
'x-message-ttl' : 5000, # Delay until the message is transferred in milliseconds.
'x-dead-letter-exchange' : 'amq.direct', # Exchange used to transfer the message from A to B.
'x-dead-letter-routing-key' : 'hello' # Name of the queue we want the message transferred to.
})
delay_channel.basic_publish(exchange='',
routing_key='hello_delay',
body="test",
properties=pika.BasicProperties(delivery_mode=2))
print " [x] Sent"
You can use RabbitMQ official plugin: x-delayed-message .
Firstly, download and copy the ez file into Your_rabbitmq_root_path/plugins
Secondly, enable the plugin (do not need to restart the server):
rabbitmq-plugins enable rabbitmq_delayed_message_exchange
Finally, publish your message with "x-delay" headers like:
headers.put("x-delay", 5000);
Notice:
It does not ensure your message's safety, cause if your message expires just during your rabbitmq-server's downtime, unfortunately the message is lost. So be careful when you use this scheme.
Enjoy it and more info in rabbitmq-delayed-message-exchange
FYI, how to do this in Spring 3.2.x.
<rabbit:queue name="delayQueue" durable="true" queue-arguments="delayQueueArguments"/>
<rabbit:queue-arguments id="delayQueueArguments">
<entry key="x-message-ttl">
<value type="java.lang.Long">10000</value>
</entry>
<entry key="x-dead-letter-exchange" value="finalDestinationTopic"/>
<entry key="x-dead-letter-routing-key" value="finalDestinationQueue"/>
</rabbit:queue-arguments>
<rabbit:fanout-exchange name="finalDestinationTopic">
<rabbit:bindings>
<rabbit:binding queue="finalDestinationQueue"/>
</rabbit:bindings>
</rabbit:fanout-exchange>
NodeJS implementation.
Everything is pretty clear from the code.
Hope it will save somebody's time.
var ch = channel;
ch.assertExchange("my_intermediate_exchange", 'fanout', {durable: false});
ch.assertExchange("my_final_delayed_exchange", 'fanout', {durable: false});
// setup intermediate queue which will never be listened.
// all messages are TTLed so when they are "dead", they come to another exchange
ch.assertQueue("my_intermediate_queue", {
deadLetterExchange: "my_final_delayed_exchange",
messageTtl: 5000, // 5sec
}, function (err, q) {
ch.bindQueue(q.queue, "my_intermediate_exchange", '');
});
ch.assertQueue("my_final_delayed_queue", {}, function (err, q) {
ch.bindQueue(q.queue, "my_final_delayed_exchange", '');
ch.consume(q.queue, function (msg) {
console.log("delayed - [x] %s", msg.content.toString());
}, {noAck: true});
});
Message in Rabbit queue can be delayed in 2 ways
- using QUEUE TTL
- using Message TTL
If all messages in queue are to be delayed for fixed time use queue TTL.
If each message has to be delayed by varied time use Message TTL.
I have explained it using python3 and pika module.
pika BasicProperties argument 'expiration' in milliseconds has to be set to delay message in delay queue.
After setting expiration time, publish message to a delayed_queue ("not actual queue where consumers are waiting to consume") , once message in delayed_queue expires, message will be routed to a actual queue using exchange 'amq.direct'
def delay_publish(self, messages, queue, headers=None, expiration=0):
"""
Connect to RabbitMQ and publish messages to the queue
Args:
queue (string): queue name
messages (list or single item): messages to publish to rabbit queue
expiration(int): TTL in milliseconds for message
"""
delay_queue = "".join([queue, "_delay"])
logging.info('Publishing To Queue: {queue}'.format(queue=delay_queue))
logging.info('Connecting to RabbitMQ: {host}'.format(
host=self.rabbit_host))
credentials = pika.PlainCredentials(
RABBIT_MQ_USER, RABBIT_MQ_PASS)
parameters = pika.ConnectionParameters(
rabbit_host, RABBIT_MQ_PORT,
RABBIT_MQ_VHOST, credentials, heartbeat_interval=0)
connection = pika.BlockingConnection(parameters)
channel = connection.channel()
channel.queue_declare(queue=queue, durable=True)
channel.queue_bind(exchange='amq.direct',
queue=queue)
delay_channel = connection.channel()
delay_channel.queue_declare(queue=delay_queue, durable=True,
arguments={
'x-dead-letter-exchange': 'amq.direct',
'x-dead-letter-routing-key': queue
})
properties = pika.BasicProperties(
delivery_mode=2, headers=headers, expiration=str(expiration))
if type(messages) not in (list, tuple):
messages = [messages]
try:
for message in messages:
try:
json_data = json.dumps(message)
except Exception as err:
logging.error(
'Error Jsonify Payload: {err}, {payload}'.format(
err=err, payload=repr(message)), exc_info=True
)
if (type(message) is dict) and ('data' in message):
message['data'] = {}
message['error'] = 'Payload Invalid For JSON'
json_data = json.dumps(message)
else:
raise
try:
delay_channel.basic_publish(
exchange='', routing_key=delay_queue,
body=json_data, properties=properties)
except Exception as err:
logging.error(
'Error Publishing Data: {err}, {payload}'.format(
err=err, payload=json_data), exc_info=True
)
raise
except Exception:
raise
finally:
logging.info(
'Done Publishing. Closing Connection to {queue}'.format(
queue=delay_queue
)
)
connection.close()
Depends on your scenario and needs, I would recommend the following approaches,
Using the official plugin, https://www.rabbitmq.com/blog/2015/04/16/scheduling-messages-with-rabbitmq/, but it will have a capacity issue if the total count of delayed messages exceeds certain number (https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/issues/72), it will not have the high availability option and it will suffer lose of data when it runs out of delayed time during a MQ restart.
Implement a set of cascading delayed queues just like NServiceBus did (https://docs.particular.net/transports/rabbitmq/delayed-delivery).

How to wait for messages on multiple queues using py-amqplib

I'm using py-amqplib to access RabbitMQ in Python. The application receives requests to listen on certain MQ topics from time to time.
The first time it receives such a request it creates an AMQP connection and a channel and starts a new thread to listen for messages:
connection = amqp.Connection(host = host, userid = "guest", password = "guest", virtual_host = "/", insist = False)
channel = connection.channel()
listener = AMQPListener(channel)
listener.start()
AMQPListener is very simple:
class AMQPListener(threading.Thread):
def __init__(self, channel):
threading.Thread.__init__(self)
self.__channel = channel
def run(self):
while True:
self.__channel.wait()
After creating the connection it subscribes to the topic of interest, like this:
channel.queue_declare(queue = queueName, exclusive = False)
channel.exchange_declare(exchange = MQ_EXCHANGE_NAME, type = "direct", durable = False, auto_delete = True)
channel.queue_bind(queue = queueName, exchange = MQ_EXCHANGE_NAME, routing_key = destination)
def receive_callback(msg):
self.queue.put(msg.body)
channel.basic_consume(queue = queueName, no_ack = True, callback = receive_callback)
The first time this all works fine. However, it fails on a subsequent request to subscribe to another topic. On subsequent requests I re-use the AMQP connection and AMQPListener thread (since I don't want to start a new thread for each topic) and when I call the code block above the channel.queue_declare() method call never returns. I've also tried creating a new channel at that point and the connection.channel() call never returns, either.
The only way I've been able to get it to work is to create a new connection, channel and listener thread per topic (ie. routing_key), but this is really not ideal. I suspect it's the wait() method that's somehow blocking the entire connection, but I'm not sure what to do about it. Surely I should be able to receive messages with several routing keys (or even on several channels) using a single listener thread?
A related question is: how do I stop the listener thread when that topic is no longer of interest? The channel.wait() call appears to block forever if there are no messages. The only way I can think of is to send a dummy message to the queue that would "poison" it, ie. be interpreted by the listener as a signal to stop.
If you want more than one comsumer per channel just attach another one using basic_consume() and use channel.wait() after. It will listen to all queues attached via basic_consume(). Make sure you define different consumer tags for each basic_consume().
Use channel.basic_cancel(consumer_tag) if you want to cancel a specific consumer on a queue (cancelling listen to a specific topic).

Categories