What is the easiest way to create a delay (or parking) queue with Python, Pika and RabbitMQ? I have seen similar questions, but none for Python.
I find this a useful pattern when designing applications, as it allows us to throttle messages that need to be re-queued.
There is always the possibility that you will receive more messages than you can handle; maybe the HTTP server is slow, or the database is under too much stress.
I have also found it very useful in scenarios where there is zero tolerance for losing messages: re-queuing messages that could not be handled may solve that, but it can also cause the same message to be queued over and over again, potentially causing performance issues and log spam.
I found this extremely useful when developing my applications, as it gives you an alternative to simply re-queuing your messages. It can easily reduce the complexity of your code, and it is one of many powerful hidden features in RabbitMQ.
Steps
First we need to set up two basic channels, one for the main queue and one for the delay queue. In my example at the end, I include a couple of additional flags that are not required but make the code more reliable, such as confirm_delivery, delivery_mode and durable. You can find more information on these in the RabbitMQ manual.
After we have set up the channels, we add a binding on the main queue to the amq.direct exchange, which will be used to transfer messages from the delay queue to our main queue.
channel.queue_bind(exchange='amq.direct',
queue='hello')
Next we need to configure our delay channel to forward messages to the main queue once they have expired.
delay_channel.queue_declare(queue='hello_delay', durable=True, arguments={
'x-message-ttl' : 5000,
'x-dead-letter-exchange' : 'amq.direct',
'x-dead-letter-routing-key' : 'hello'
})
x-message-ttl (Message - Time To Live)
This is normally used to automatically remove old messages from the queue after a specific duration, but by adding the two optional arguments below we can change this behaviour, so that this parameter instead determines how long, in milliseconds, messages will stay in the delay queue.
x-dead-letter-routing-key
This variable allows us to transfer the message to a different queue once it has expired, instead of the default behaviour of removing it completely.
x-dead-letter-exchange
This variable determines which exchange is used to transfer the message from the hello_delay queue to the hello queue.
Publishing to the delay queue
When we are done setting up all the basic Pika parameters, you simply send a message to the delay queue using basic_publish.
delay_channel.basic_publish(exchange='',
routing_key='hello_delay',
body="test",
properties=pika.BasicProperties(delivery_mode=2))
Once you have executed the script, you should see the hello and hello_delay queues created in your RabbitMQ management module.
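If you prefer the command line, the same check can be done with rabbitmqctl on the broker host:
rabbitmqctl list_queues name messages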
Example.
import pika
connection = pika.BlockingConnection(pika.ConnectionParameters(
'localhost'))
# Create normal 'Hello World' type channel.
channel = connection.channel()
channel.confirm_delivery()
channel.queue_declare(queue='hello', durable=True)
# We need to bind this channel to an exchange, that will be used to transfer
# messages from our delay queue.
channel.queue_bind(exchange='amq.direct',
queue='hello')
# Create our delay channel.
delay_channel = connection.channel()
delay_channel.confirm_delivery()
# This is where we declare the delay, and routing for our delay channel.
delay_channel.queue_declare(queue='hello_delay', durable=True, arguments={
'x-message-ttl' : 5000, # Delay until the message is transferred in milliseconds.
'x-dead-letter-exchange' : 'amq.direct', # Exchange used to transfer the message from A to B.
'x-dead-letter-routing-key' : 'hello' # Name of the queue we want the message transferred to.
})
delay_channel.basic_publish(exchange='',
routing_key='hello_delay',
body="test",
properties=pika.BasicProperties(delivery_mode=2))
print " [x] Sent"
You can use the official RabbitMQ plugin rabbitmq_delayed_message_exchange, which adds the x-delayed-message exchange type.
First, download the plugin's .ez file and copy it into Your_rabbitmq_root_path/plugins.
Second, enable the plugin (there is no need to restart the server):
rabbitmq-plugins enable rabbitmq_delayed_message_exchange
Finally, publish your message with an "x-delay" header, for example (in Java):
headers.put("x-delay", 5000);
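For the Pika users this question is about, a rough equivalent would be to declare an exchange of the plugin's x-delayed-message type and pass the delay in the message headers. This is only a sketch, assuming the plugin is enabled and a reasonably recent pika; the exchange and queue names are made up:
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# The plugin provides the 'x-delayed-message' exchange type; 'x-delayed-type'
# tells it how to route messages once the delay has elapsed.
channel.exchange_declare(exchange='delayed_exchange',
                         exchange_type='x-delayed-message',
                         arguments={'x-delayed-type': 'direct'})

channel.queue_declare(queue='hello', durable=True)
channel.queue_bind(exchange='delayed_exchange', queue='hello', routing_key='hello')

# The per-message delay (in milliseconds) goes into the headers.
channel.basic_publish(exchange='delayed_exchange',
                      routing_key='hello',
                      body='test',
                      properties=pika.BasicProperties(headers={'x-delay': 5000}))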
Note:
It does not guarantee your message's safety: if your message expires while your rabbitmq-server is down, the message is unfortunately lost. So be careful when you use this scheme.
Enjoy it and more info in rabbitmq-delayed-message-exchange
FYI, how to do this in Spring 3.2.x.
<rabbit:queue name="delayQueue" durable="true" queue-arguments="delayQueueArguments"/>
<rabbit:queue-arguments id="delayQueueArguments">
<entry key="x-message-ttl">
<value type="java.lang.Long">10000</value>
</entry>
<entry key="x-dead-letter-exchange" value="finalDestinationTopic"/>
<entry key="x-dead-letter-routing-key" value="finalDestinationQueue"/>
</rabbit:queue-arguments>
<rabbit:fanout-exchange name="finalDestinationTopic">
<rabbit:bindings>
<rabbit:binding queue="finalDestinationQueue"/>
</rabbit:bindings>
</rabbit:fanout-exchange>
NodeJS implementation.
Everything is pretty clear from the code.
Hope it will save somebody's time.
var ch = channel;
ch.assertExchange("my_intermediate_exchange", 'fanout', {durable: false});
ch.assertExchange("my_final_delayed_exchange", 'fanout', {durable: false});
// Set up the intermediate queue, which will never be consumed from directly.
// All messages are TTL'd, so when they are "dead" they are routed to the final exchange.
ch.assertQueue("my_intermediate_queue", {
deadLetterExchange: "my_final_delayed_exchange",
messageTtl: 5000, // 5sec
}, function (err, q) {
ch.bindQueue(q.queue, "my_intermediate_exchange", '');
});
ch.assertQueue("my_final_delayed_queue", {}, function (err, q) {
ch.bindQueue(q.queue, "my_final_delayed_exchange", '');
ch.consume(q.queue, function (msg) {
console.log("delayed - [x] %s", msg.content.toString());
}, {noAck: true});
});
A message in a RabbitMQ queue can be delayed in 2 ways:
- using a queue TTL
- using a message TTL
If all messages in the queue are to be delayed for a fixed time, use a queue TTL.
If each message has to be delayed by a different time, use a message TTL.
I have explained it using Python 3 and the pika module.
The pika BasicProperties argument 'expiration' (in milliseconds) has to be set to delay a message in the delay queue.
After setting the expiration time, publish the message to a delayed_queue (not the actual queue where consumers are waiting to consume); once the message in delayed_queue expires, it will be routed to the actual queue via the 'amq.direct' exchange.
def delay_publish(self, messages, queue, headers=None, expiration=0):
"""
Connect to RabbitMQ and publish messages to the queue
Args:
queue (string): queue name
messages (list or single item): messages to publish to rabbit queue
expiration(int): TTL in milliseconds for message
"""
delay_queue = "".join([queue, "_delay"])
logging.info('Publishing To Queue: {queue}'.format(queue=delay_queue))
logging.info('Connecting to RabbitMQ: {host}'.format(
host=self.rabbit_host))
credentials = pika.PlainCredentials(
RABBIT_MQ_USER, RABBIT_MQ_PASS)
parameters = pika.ConnectionParameters(
rabbit_host, RABBIT_MQ_PORT,
RABBIT_MQ_VHOST, credentials, heartbeat_interval=0)
connection = pika.BlockingConnection(parameters)
channel = connection.channel()
channel.queue_declare(queue=queue, durable=True)
channel.queue_bind(exchange='amq.direct',
queue=queue)
delay_channel = connection.channel()
delay_channel.queue_declare(queue=delay_queue, durable=True,
arguments={
'x-dead-letter-exchange': 'amq.direct',
'x-dead-letter-routing-key': queue
})
properties = pika.BasicProperties(
delivery_mode=2, headers=headers, expiration=str(expiration))
if type(messages) not in (list, tuple):
messages = [messages]
try:
for message in messages:
try:
json_data = json.dumps(message)
except Exception as err:
logging.error(
'Error Jsonify Payload: {err}, {payload}'.format(
err=err, payload=repr(message)), exc_info=True
)
if (type(message) is dict) and ('data' in message):
message['data'] = {}
message['error'] = 'Payload Invalid For JSON'
json_data = json.dumps(message)
else:
raise
try:
delay_channel.basic_publish(
exchange='', routing_key=delay_queue,
body=json_data, properties=properties)
except Exception as err:
logging.error(
'Error Publishing Data: {err}, {payload}'.format(
err=err, payload=json_data), exc_info=True
)
raise
except Exception:
raise
finally:
logging.info(
'Done Publishing. Closing Connection to {queue}'.format(
queue=delay_queue
)
)
connection.close()
Depending on your scenario and needs, I would recommend the following approaches:
Using the official plugin, https://www.rabbitmq.com/blog/2015/04/16/scheduling-messages-with-rabbitmq/. However, it has a capacity issue if the total count of delayed messages exceeds a certain number (https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/issues/72), it has no high-availability option, and it can lose data if a message's delay elapses while the broker is down.
Implementing a set of cascading delay queues, just like NServiceBus did (https://docs.particular.net/transports/rabbitmq/delayed-delivery).
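For the second approach, here is a minimal Pika sketch of the fixed-tier variant, with made-up queue names and delay levels. Each tier is just a per-queue TTL plus dead-lettering into the real work queue; NServiceBus goes further and chains the tiers so arbitrary delays can be composed, which this sketch does not show:
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='work', durable=True)
channel.queue_bind(exchange='amq.direct', queue='work')

# One delay queue per fixed delay level, each dead-lettering into 'work'.
for seconds in (5, 30, 300):
    channel.queue_declare(
        queue='work_delay_{0}s'.format(seconds),
        durable=True,
        arguments={
            'x-message-ttl': seconds * 1000,
            'x-dead-letter-exchange': 'amq.direct',
            'x-dead-letter-routing-key': 'work',
        })

# Publish into whichever tier matches the delay you need.
channel.basic_publish(exchange='',
                      routing_key='work_delay_30s',
                      body='test',
                      properties=pika.BasicProperties(delivery_mode=2))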
Related
In our applications we have enabled exactly-once in both the Producer and the Consumer.
The Producer is a Python component. We have enabled:
idempotence
transactions (a new transactional id is used every time we send messages)
The Consumer is a Spring Boot application. We have enabled:
the read_committed isolation level
manual acknowledgement of messages
We have a multi-partition Kafka topic (let's say 3 partitions) on Confluent Cloud.
Our application design is as follows:
multiple Producer app instances
for performance, we have lots of Consumer app instances (currently around 24)
Problem:
We noticed that sometimes the same Kafka message is consumed more than once in the Consumer. We detected this using the following consumer code: we keep the previously consumed Kafka message id (with its offset) in Redis and compare it with each newly consumed message.
Consumer code:
@KafkaListener(topics = "${datalake.datasetevents.topic}", groupId = "${spring.kafka.consumer.group-id}")
public void listen(@Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) String key,
                   @Header(KafkaHeaders.OFFSET) String offset,
                   @Payload InputEvent inputEvent, Acknowledgment ack) {
//KafkaHeaders.
Event event = new Event();
event.setCorrId(inputEvent.getCorrId());
event.setQn(inputEvent.getQn());
event.setCreatedTs(new Date());
event.setEventTs(inputEvent.getEventTs());
event.setMeta(inputEvent.getMeta() != null ? inputEvent.getMeta(): new HashMap<>());
event.setType(inputEvent.getType());
event.setUlid(key);
//detect message duplications
try {
String eventRedisKey = "tg_e_d_" + key.toLowerCase();
String redisVal = offset;
String tmp = redisTemplateString.opsForValue().get(eventRedisKey);
if (tmp != null) {
dlkLogging.error("kafka_event_dup", "Event consumed more than once ulid:" + event.getUlid()+ " redis offset: "+tmp+ " event offset:"+offset);
redisTemplateString.delete(eventRedisKey);
}
redisTemplateString.opsForValue().set(eventRedisKey, redisVal, 30, TimeUnit.SECONDS);
} catch (Exception e) {
dlkLogging.error("kafka_consumer_redis","Redis error at kafka consumere ", e);
}
//process the message and ack
try {
eventService.saveEvent(persistEvent, event);
ack.acknowledge();
} catch (Exception ee) {
//Refer : https://stackoverflow.com/questions/62413270/kafka-what-is-the-point-of-using-acknowledgment-nack-if-i-can-simply-not-ack
ack.nack(1);
dlkLogging.error("event_sink_error","error sinking kafka event.Will retry", ee);
}
}
Behavior:
We notice that "kafka_event_dup" is logged several times per day.
Error message: Event consumed more than once ulid:01G77G8KNTSM2Q01SB1MK60BTH redis offset: 659238 event offset:659238
Question:
Why does the Consumer read the same message even though we have configured exactly-once in both the Producer and the Consumer?
Update: After reading several SO posts, it seems we still need to implement de-duplication logic on the Consumer side even though exactly-once is configured?
Additional Info:
Consumer configuration:
public DefaultKafkaConsumerFactory kafkaDatasetEventConsumerFactory(KafkaProperties properties) {
Map<String, Object> props = properties.buildConsumerProperties();
props.put(ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer.class);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer.class);
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
props.put(ErrorHandlingDeserializer.KEY_DESERIALIZER_CLASS, StringDeserializer.class);
props.put(ErrorHandlingDeserializer.VALUE_DESERIALIZER_CLASS, CustomJsonDeserializer.class.getName());
props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, "com.fr.det.datalake.eventdriven.model.kafka.InputEvent");
return new DefaultKafkaConsumerFactory(props);
}
Producer code (python):
def __get_producer(self):
conf = {
'bootstrap.servers': self.server,
'enable.idempotence': True,
'acks': 'all',
'retry.backoff.ms': self.sleep_seconds * 100
}
if self.sasl_mechanism:
conf['sasl.mechanisms'] = self.sasl_mechanism
if self.security_protocol:
conf['security.protocol'] = self.security_protocol
if self.sasl_username:
conf['sasl.username'] = self.sasl_username
if self.sasl_username:
conf['sasl.password'] = self.sasl_password
if self.transaction_prefix:
conf['transactional.id'] = self.__get_transaction_id()
producer = Producer(conf)
return producer
@_retry_on_error
def send_messages(self, messages, *args, **kwargs):
ts = time.time()
producer = kwargs.get('producer', None)
if producer is not None:
for message in messages:
key = message.get('key', str(ulid.from_timestamp(ts)))
value = message.get('value', None)
topic = message.get('topic', self.topic)
producer.produce(topic=topic,
value=value,
key=key,
on_delivery=self.acked)
producer.commit_transaction(30)
def _retry_on_error(func, *args, **kwargs):
def inner(self, messages, *args, **kwargs):
attempts = 0
while True:
attempts += 1
sleep_time = attempts * self.sleep_seconds
try:
producer = self.__get_producer()
self.logger.info(f"Producer: {producer}, Attempt: {attempts}")
producer.init_transactions(30)
producer.begin_transaction()
res = func(self, messages, *args, producer=producer, **kwargs)
return res
except KafkaException as e:
if attempts <= self.retry_count:
if e.args[0].txn_requires_abort():
producer.abort_transaction(30)
time.sleep(sleep_time)
continue
self.logger.error(str(e), exc_info=True, extra=extra)
break
return inner
Kafka exactly-once is essentially a Kafka Streams feature, although it can be used with regular consumers and producers as well.
Exactly-once can only be achieved in a context where your applications interact only with Kafka: there is no XA nor any other kind of distributed transaction across technologies that would let a Kafka consumer interact with some other storage (like Redis) in an exactly-once manner.
In a distributed world, we have to accept that this is not desirable, since it introduces locking, contention, and exponentially degrading performance under load. If we don't need to be in a distributed world, then we don't need Kafka and many things become easier.
Transactions in Kafka are meant to be used within one application that interacts only with Kafka. They let you guarantee that the app will 1) read from some topic partitions, 2) write some result to some other topic partitions and 3) commit the read offsets related to 1, or do none of those things. If several apps are put back-to-back and interact through Kafka in this manner, then you can achieve exactly-once if you're very careful. If your consumer also needs to 4) interact with Redis or 5) interact with some other storage or perform a side effect somewhere (like sending an email), then there is in general no way to perform steps 1 to 5 atomically as part of a distributed application. You can achieve this kind of thing with other storage technologies (yes, Kafka is essentially a storage system), but they cannot be distributed, and neither can your application. That's essentially what the CAP theorem tells us.
That's also why exactly-once is essentially a Kafka Streams thing: Kafka Streams is just a smart wrapper around the Kafka consumer/producer clients that lets you build applications that interact only with Kafka.
You can also achieve exactly-once stream processing with other data-processing frameworks, like Spark Streaming or Flink.
In practice it's often much simpler not to bother with transactions and just de-duplicate in the consumer. You have the guarantee that at most one consumer in the consumer group is connected to each partition at any point in time, so duplicates will always happen in the same instance of your app (until it re-scales), and, depending on your config, the duplication should typically only happen within a single Kafka consumer buffer, so you don't need to store much state in your consumer to de-duplicate. If you use some kind of event id that can only increase (which is essentially what the Kafka offset is, by the way, and that's no coincidence), then you only need to keep, in the state of each instance of your app, the maximum event id per partition that you've successfully processed.
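To make that last point concrete, here is a rough Python sketch of that kind of de-duplication using confluent-kafka; the topic, group and the process() helper are made-up placeholders. It keeps the highest successfully processed offset per partition and skips anything at or below it:
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'my-group',
    'isolation.level': 'read_committed',
    'enable.auto.commit': False,
})
consumer.subscribe(['events'])

# Highest offset successfully processed, per partition (key by
# (topic, partition) instead if you subscribe to several topics).
max_processed = {}

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    last = max_processed.get(msg.partition())
    if last is not None and msg.offset() <= last:
        consumer.commit(msg)  # already handled, e.g. redelivered after a rebalance
        continue
    process(msg)              # placeholder for your side effects (DB, Redis, ...)
    max_processed[msg.partition()] = msg.offset()
    consumer.commit(msg)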
I can see you have set ENABLE_AUTO_COMMIT_CONFIG to false, which means you have a manual commit process in place. If we do not commit the offsets of the messages we have read correctly, we will end up processing duplicate messages.
Kindly refer to section 4.6 of https://www.baeldung.com/kafka-exactly-once.
Also, with processing.guarantee: exactly_once you do not need to set the following parameters explicitly:
isolation.level=read_committed
enable.idempotence=true
MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION=5
I use the kafka-python==2.0.0 library.
With the piece of code below, if I do not receive a message for 1 hour, the next messages pushed to the Kafka topic are not processed by the consumer, yet the loop does not stop.
I would like my listener to run 24/7 without losing the connection.
consumer = KafkaConsumer(
os.environ.get('MY_TOPIC'),
bootstrap_servers=broker,
api_version=my_version,
security_protocol='SASL_PLAINTEXT',
sasl_mechanism='GSSAPI',
sasl_kerberos_service_name=service_name,
group_id='MY_GRP_ID',
max_poll_records=1
)
try:
for msg in consumer:
##PROCESS function ...
consumer.commit()
finally:
consumer.close()
I finally used the poll method:
from kafka import KafkaConsumer
# To consume latest messages and auto-commit offsets
consumer = KafkaConsumer('my-topic',
group_id='my-group',
bootstrap_servers=['localhost:9092'])
while True:
    # Response format is {TopicPartition('topic1', 1): [msg1, msg2]}
    msg_pack = consumer.poll(timeout_ms=500)
    for tp, messages in msg_pack.items():
        for message in messages:
            # message value and key are raw bytes -- decode if necessary!
            # e.g., for unicode: `message.value.decode('utf-8')`
            print("%s:%d:%d: key=%s value=%s" % (tp.topic, tp.partition,
                                                 message.offset, message.key,
                                                 message.value))
The advantage of this syntax, from my point of view, is better visibility into how messages are retrieved. It is not in my example, but I could also manage stopping the program more cleanly by looking for a SIGTERM signal, as sketched below.
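For completeness, a sketch of that SIGTERM handling wrapped around the same poll loop might look like this (the running flag and timings are arbitrary):
import signal
from kafka import KafkaConsumer

consumer = KafkaConsumer('my-topic',
                         group_id='my-group',
                         bootstrap_servers=['localhost:9092'])

running = True

def _stop(signum, frame):
    # Just flip a flag; the loop exits after the current poll() returns.
    global running
    running = False

signal.signal(signal.SIGTERM, _stop)
signal.signal(signal.SIGINT, _stop)

try:
    while running:
        msg_pack = consumer.poll(timeout_ms=500)
        for tp, messages in msg_pack.items():
            for message in messages:
                print("%s:%d:%d: key=%s value=%s" % (tp.topic, tp.partition,
                                                     message.offset, message.key,
                                                     message.value))
        consumer.commit()
finally:
    consumer.close()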
I am looking at RabbitPy for consuming and publishing messages. The consumption part is fine; however, I want to send messages back to a specific queue after doing something with the inbound message, and I cannot find any way to do this in the documentation.
https://rabbitpy.readthedocs.io/en/latest/api/message.html
Here is my consume code:
with rabbitpy.Connection('amqp://guest:guest@localhost:5672/%2f') as conn:
with conn.channel() as channel:
queue_read = rabbitpy.Queue(channel, QUEUE_NAME)
queue_write = rabbitpy.Queue(channel, QUEUE_NAME_RESPONSE)
# Exit on CTRL-C
try:
# Consume the message
for body in queue_read:
# do something on inboud...
Here is what it seems I can set when creating and sending a message:
message = rabbitpy.Message(channel, result)
message.publish(EXCHANGE, RESPONSE_ROUTING_KEY, mandatory=False)
There is nothing related to a queue. How can I send a message to a specific queue with a routing key?
In Pika I would use the following:
channel = cnn.channel()
channel.exchange_declare(exchange=EXCHANGE, exchange_type='direct', durable=True)
channel.queue_declare(queue=QUEUE_NAME_RESPONSE, durable=True)
channel.queue_bind(exchange=EXCHANGE, queue=QUEUE_NAME_RESPONSE, routing_key=RESPONSE_ROUTING_KEY)
channel.basic_publish(exchange=EXCHANGE, routing_key=RESPONSE_ROUTING_KEY, body=return_JSON,properties=pika.BasicProperties(content_type='text/plain', delivery_mode=2))
channel.close()
How can I do this with RabbitPy?
I think I have solved this by binding the routing key to the exchange and the queue. I have the following:
queue_write.bind(EXCHANGE, routing_key=RESPONSE_ROUTING_KEY, arguments=None)
Using EXCHANGE and RESPONSE_ROUTING_KEY, the queue is bound.
For example:
We have two queues, Queue1 and Queue2, one exchange called Exchange, and two routing keys, routingkey1 and routingkey2.
Exchange + routingkey1 ==> Queue1
Exchange + routingkey2 ==> Queue2
Assume that we have the above bindings.
When your publisher sends a message using Exchange + routingkey1, it will be delivered to Queue1. If the publisher uses routingkey2, the message will be delivered to Queue2.
You can create the necessary bindings using the RabbitMQ Management UI or from your code. More details can be found here.
In Rabbitpy you can do something like this,
amqp = rabbitpy.AMQP(channel)
amqp.queue_bind(queue='Queue1', exchange='Exchange', routing_key='RoutingKey1')
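Putting it together, a minimal rabbitpy sketch could declare the exchange and the response queue, bind them, and then publish with the routing key. The names are illustrative, and I am assuming the declare()/bind()/publish() calls as shown in the rabbitpy docs:
import rabbitpy

with rabbitpy.Connection('amqp://guest:guest@localhost:5672/%2f') as conn:
    with conn.channel() as channel:
        # Declare the exchange and the response queue, then bind them.
        exchange = rabbitpy.Exchange(channel, 'EXCHANGE', durable=True)
        exchange.declare()
        queue_write = rabbitpy.Queue(channel, 'QUEUE_NAME_RESPONSE', durable=True)
        queue_write.declare()
        queue_write.bind('EXCHANGE', routing_key='RESPONSE_ROUTING_KEY')

        # Publish to the exchange; the binding routes the message to the queue.
        message = rabbitpy.Message(channel, 'my response body')
        message.publish('EXCHANGE', 'RESPONSE_ROUTING_KEY')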
I am new to RabbitMQ and trying to figure out how I can make a client request information about memory and CPU utilization from a server, following this tutorial (https://www.rabbitmq.com/tutorials/tutorial-six-python.html).
So the client requests CPU and memory (I believe I will need two queues) and the server responds with the values.
Is there any way to simply create a client.py and server.py for this case using the Pika library in Python?
I would recommend following the first RabbitMQ tutorials if you haven't already. The RPC example builds on concepts covered in the previous examples (direct queues, exclusive queues, acknowledgements, etc.).
The RPC solution proposed in the tutorial requires at least two queues, depending on how many clients you want to use:
One direct queue (rpc_queue), used to send requests from the client to the server.
One exclusive queue per client, used to receive responses.
The request/response cycle:
The client sends a message to the rpc_queue. Each message includes a reply_to property, with the name of the client's exclusive queue the server should reply to, and a correlation_id property, which is just a unique id used to track the request.
The server waits for messages on the rpc_queue. When a message arrives, it prepares the response, adds the correlation_id to the new message, and sends it to the queue defined in the reply_to message property.
The client waits on its exclusive queue until it finds a message with the correlation_id that was originally generated.
Jumping straight to your problem, the first thing to do is to define the message format you want to use for your responses. You can use JSON, msgpack or any other serialization library. For example, if using JSON, one message could look something like this:
{
"cpu": 1.2,
"memory": 0.3
}
Then, on your server.py:
def on_request(channel, method, props, body):
response = {'cpu': current_cpu_usage(),
'memory': current_memory_usage()}
properties = pika.BasicProperties(correlation_id=props.correlation_id)
channel.basic_publish(exchange='',
routing_key=props.reply_to,
properties=properties,
body=json.dumps(response))
channel.basic_ack(delivery_tag=method.delivery_tag)
# ...
And on your client.py:
class ResponseTimeout(Exception): pass
class Client:
# similar constructor as `FibonacciRpcClient` from tutorial...
def on_response(self, channel, method, props, body):
if self.correlation_id == props.correlation_id:
self.response = json.loads(body.decode())
def call(self, timeout=2):
self.response = None
self.correlation_id = str(uuid.uuid4())
self.channel.basic_publish(exchange='',
routing_key='rpc_queue',
properties=pika.BasicProperties(
reply_to=self.callback_queue,
correlation_id=self.correlation_id),
body='')
start_time = time.time()
while self.response is None:
if (start_time + timeout) < time.time():
raise ResponseTimeout()
self.connection.process_data_events()
return self.response
As you see, the code is pretty much the same as the original FibonacciRpcClient. The main differences are:
We use JSON as data format for our messages.
Our client's call() method doesn't require a body argument (there's nothing to send to the server).
We take care of response timeouts (if the server is down, or if it doesn't reply to our messages)
Still, there are a lot of things to improve here:
No error handling: for example, if the client "forgets" to send a reply_to queue, our server is going to crash, and will crash again on restart (the broken message will be requeued indefinitely as long as it isn't acknowledged by our server); see the sketch after this list.
We don't handle broken connections (no reconnection mechanism).
...
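For that first point, a small guard in the server's on_request callback is enough; a sketch only, reusing the handler from above:
def on_request(channel, method, props, body):
    if not props.reply_to:
        # Nothing to reply to: drop the message instead of crashing and
        # having it redelivered forever.
        channel.basic_reject(delivery_tag=method.delivery_tag, requeue=False)
        return
    response = {'cpu': current_cpu_usage(),
                'memory': current_memory_usage()}
    properties = pika.BasicProperties(correlation_id=props.correlation_id)
    channel.basic_publish(exchange='',
                          routing_key=props.reply_to,
                          properties=properties,
                          body=json.dumps(response))
    channel.basic_ack(delivery_tag=method.delivery_tag)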
You may also consider replacing the RPC approach with a publish/subscribe pattern; that way, the server simply broadcasts its CPU/memory state every X seconds, and one or more clients receive the updates; a sketch follows.
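A sketch of that broadcast variant, using a fanout exchange and psutil as an assumed metrics source (any other readings would do), could be:
import json
import time

import pika
import psutil  # assumed metrics library; substitute your own readings

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.exchange_declare(exchange='metrics', exchange_type='fanout')

while True:
    payload = json.dumps({'cpu': psutil.cpu_percent(),
                          'memory': psutil.virtual_memory().percent})
    # A fanout exchange ignores the routing key; every bound client queue gets a copy.
    channel.basic_publish(exchange='metrics', routing_key='', body=payload)
    time.sleep(5)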
I've implemented a Server Sent Event API in my Django app to stream realtime updates from my backend to the browser. The backend is a Redis pubsub. My Django view looks like this:
def event_stream(request):
"""
Stream worker events out to browser.
"""
listener = events.Listener(
settings.EVENTS_PUBSUB_URL,
channels=[settings.EVENTS_PUBSUB_CHANNEL],
buffer_key=settings.EVENTS_BUFFER_KEY,
last_event_id=request.META.get('HTTP_LAST_EVENT_ID')
)
return http.HttpResponse(listener, mimetype='text/event-stream')
And the events.Listener class that I'm returning as an iterator looks like this:
class Listener(object):
def __init__(self, rcon_or_url, channels, buffer_key=None,
last_event_id=None):
if isinstance(rcon_or_url, redis.StrictRedis):
self.rcon = rcon_or_url
elif isinstance(rcon_or_url, basestring):
self.rcon = redis.StrictRedis(**utils.parse_redis_url(rcon_or_url))
self.channels = channels
self.buffer_key = buffer_key
self.last_event_id = last_event_id
self.pubsub = self.rcon.pubsub()
self.pubsub.subscribe(channels)
def __iter__(self):
# If we've been initted with a buffer key, then get all the events off
# that and spew them out before blocking on the pubsub.
if self.buffer_key:
buffered_events = self.rcon.lrange(self.buffer_key, 0, -1)
# check whether msg with last_event_id is still in buffer. If so,
# trim buffered_events to have only newer messages.
if self.last_event_id:
# Note that we're looping through most recent messages first,
# here
counter = 0
for msg in buffered_events:
if (json.loads(msg)['id'] == self.last_event_id):
break
counter += 1
buffered_events = buffered_events[:counter]
for msg in reversed(list(buffered_events)):
# Stream out oldest messages first
yield to_sse({'data': msg})
try:
for msg in self.pubsub.listen():
if msg['type'] == 'message':
yield to_sse(msg)
finally:
logging.info('Closing pubsub')
self.pubsub.close()
self.rcon.connection_pool.disconnect()
I'm able to successfully stream events out to the browser with this setup. However, it seems that the disconnect calls in the listener's "finally" don't ever actually get called. I assume that they're still camped out waiting for messages to come from the pubsub. As clients disconnect and reconnect, I can see the number of connections to my Redis instance climbing and never going down. Once it gets to around 1000, Redis starts freaking out and consuming all the available CPU.
I would like to be able to detect when the client is no longer listening and close the Redis connection(s) at that time.
Things I've tried or thought about:
A connection pool. But as the redis-py README states, "It is not safe to pass PubSub or Pipeline objects between threads."
A middleware to handle the connections, or maybe just disconnections. This won't work because a middleware's process_response() method gets called too early (before http headers are even sent to the client). I need something called when the client disconnects while I'm in the middle of streaming content to them.
The request_finished and got_request_exception signals. The first, like process_response() in a middleware, seems to fire too soon. The second doesn't get called when a client disconnects mid-stream.
Final wrinkle: In production I'm using Gevent so I can get away with keeping a lot of connections open at once. However, this connection leak issue occurs whether I'm using plain old 'manage.py runserver', or Gevent monkeypatched runserver, or Gunicorn's gevent workers.
UPDATE: As of Django 1.5, you'll need to return a StreamingHttpResponse instance if you want to lazily stream things out as I'm doing in this question/answer.
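A minimal sketch of that Django 1.5+ variant, keeping the same Listener iterator, would be:
from django.http import StreamingHttpResponse

def event_stream(request):
    listener = events.Listener(
        settings.EVENTS_PUBSUB_URL,
        channels=[settings.EVENTS_PUBSUB_CHANNEL],
        buffer_key=settings.EVENTS_BUFFER_KEY,
        last_event_id=request.META.get('HTTP_LAST_EVENT_ID')
    )
    return StreamingHttpResponse(listener, content_type='text/event-stream')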
ORIGINAL ANSWER BELOW
After a lot of banging on things and reading framework code, I've found what I think is the right answer to this question.
According to the WSGI PEP, if your application returns an iterator with a close() method, it should be called by the WSGI server once the response has finished. Django supports this too. That's a natural place to do the Redis connection cleanup that I need.
There's a bug in Python's wsgiref implementation, and by extension in Django's 'runserver', that causes close() to be skipped if the client disconnects from the server mid-stream. I've submitted a patch.
Even if the server honors close(), it won't be called until a write to the client actually fails. If your iterator is blocked waiting on the pubsub and not sending anything, close() won't be called. I've worked around this by sending a no-op message into the pubsub each time a client connects. That way when a browser does a normal reconnect, the now-defunct threads will try to write to their closed connections, throw an exception, then get cleaned up when the server calls close(). The SSE spec says that any line beginning with a colon is a comment that should be ignored, so I'm just sending ":\n" as my no-op message to flush out stale clients.
Here's the new code. First the Django view:
def event_stream(request):
"""
Stream worker events out to browser.
"""
return events.SSEResponse(
settings.EVENTS_PUBSUB_URL,
channels=[settings.EVENTS_PUBSUB_CHANNEL],
buffer_key=settings.EVENTS_BUFFER_KEY,
last_event_id=request.META.get('HTTP_LAST_EVENT_ID')
)
And the Listener class that does the work, along with a helper function to format the SSEs and an HTTPResponse subclass that lets the view be a little cleaner:
class Listener(object):
def __init__(self,
rcon_or_url=settings.EVENTS_PUBSUB_URL,
channels=None,
buffer_key=settings.EVENTS_BUFFER_KEY,
last_event_id=None):
if isinstance(rcon_or_url, redis.StrictRedis):
self.rcon = rcon_or_url
elif isinstance(rcon_or_url, basestring):
self.rcon = redis.StrictRedis(**utils.parse_redis_url(rcon_or_url))
if channels is None:
channels = [settings.EVENTS_PUBSUB_CHANNEL]
self.channels = channels
self.buffer_key = buffer_key
self.last_event_id = last_event_id
self.pubsub = self.rcon.pubsub()
self.pubsub.subscribe(channels)
# Send a superfluous message down the pubsub to flush out stale
# connections.
for channel in self.channels:
# Use buffer_key=None since these pings never need to be remembered
# and replayed.
sender = Sender(self.rcon, channel, None)
sender.publish('_flush', tags=['hidden'])
def __iter__(self):
# If we've been initted with a buffer key, then get all the events off
# that and spew them out before blocking on the pubsub.
if self.buffer_key:
buffered_events = self.rcon.lrange(self.buffer_key, 0, -1)
# check whether msg with last_event_id is still in buffer. If so,
# trim buffered_events to have only newer messages.
if self.last_event_id:
# Note that we're looping through most recent messages first,
# here
counter = 0
for msg in buffered_events:
if (json.loads(msg)['id'] == self.last_event_id):
break
counter += 1
buffered_events = buffered_events[:counter]
for msg in reversed(list(buffered_events)):
# Stream out oldest messages first
yield to_sse({'data': msg})
for msg in self.pubsub.listen():
if msg['type'] == 'message':
yield to_sse(msg)
def close(self):
self.pubsub.close()
self.rcon.connection_pool.disconnect()
class SSEResponse(HttpResponse):
def __init__(self, rcon_or_url, channels, buffer_key=None,
last_event_id=None, *args, **kwargs):
self.listener = Listener(rcon_or_url, channels, buffer_key,
last_event_id)
super(SSEResponse, self).__init__(self.listener,
mimetype='text/event-stream',
*args, **kwargs)
def close(self):
"""
This will be called by the WSGI server at the end of the request, even
if the client disconnects midstream. Unless you're using Django's
runserver, in which case you should expect to see Redis connections
build up until http://bugs.python.org/issue16220 is fixed.
"""
self.listener.close()
def to_sse(msg):
"""
Given a Redis pubsub message that was published by a Sender (ie, has a JSON
body with time, message, title, tags, and id), return a properly-formatted
SSE string.
"""
data = json.loads(msg['data'])
# According to the SSE spec, lines beginning with a colon should be
# ignored. We can use that as a way to force zombie listeners to try
# pushing something down the socket and clean up their redis connections
# when they get an error.
# See http://dev.w3.org/html5/eventsource/#event-stream-interpretation
if data['message'] == '_flush':
return ":\n" # Administering colonic!
if 'id' in data:
out = "id: " + data['id'] + '\n'
else:
out = ''
if 'name' in data:
out += 'name: ' + data['name'] + '\n'
payload = json.dumps({
'time': data['time'],
'message': data['message'],
'tags': data['tags'],
'title': data['title'],
})
out += 'data: ' + payload + '\n\n'
return out