Usage of RabbitMQ queues with Django - python

I'm trying to add some real-time features to my Django application, and for that I'm using RabbitMQ and Celery in my Django project. What I would like to do is this: I have an external Python script which sends data to RabbitMQ, and from RabbitMQ that data should be retrieved by the Django app.
I'm sending some dummy data, like this:
import pika

# Open a connection to the local RabbitMQ broker and publish one message.
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='Test')
channel.basic_publish(exchange='',
                      routing_key='Test',
                      body='Hello world!')
print(" [x] Sent 'Hello World!'")
connection.close()
What I would like to do is: as soon as I send 'Hello World!', my Django app should receive the string, so that I can perform some operations with it, such as saving it to my database, passing it to an HTML template, or simply printing it to my console.
My actual problem is that I still have no idea how to do this. I added Celery to my Django project, but I don't know how to connect to RabbitMQ and receive the message. Would I have to do it with Django Channels? Is there some tutorial on this? I found plenty of material about using RabbitMQ and Celery with Django, but nothing on this particular question.

This is not directly connected to Celery.
You could solve it like this:
create a management command in Django that does whatever needs to be done with the incoming message/data (https://docs.djangoproject.com/en/3.0/howto/custom-management-commands/)
create a consumer, which could also be a Django management command that is started in a separate process; it is not part of the regular Django process (though it can be part of your Django code). In this consumer, you listen to the queue, and whenever data comes in, the management command from (1.) is called; see the sketch below.
Of course, 1. and 2. could also be done in one single command; I've separated them to better illustrate the different aspects. You might also have several different tasks and reuse one consumer. And if you already have (1.), you can reuse it like this and test it easily without the overhead of the consumer.
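Here is a minimal sketch of (2.) as a Django management command using pika, assuming the queue name 'Test' from your question and a hypothetical handler command called process_message (the names are illustrative, not a fixed API):

# yourapp/management/commands/consume_queue.py
import pika
from django.core.management import call_command
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Blocking consumer that forwards queue messages to another management command"

    def handle(self, *args, **options):
        connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        channel = connection.channel()
        channel.queue_declare(queue='Test')

        def callback(ch, method, properties, body):
            # Hand the payload to the command from step (1.)
            call_command('process_message', body.decode())
            ch.basic_ack(delivery_tag=method.delivery_tag)

        channel.basic_consume(queue='Test', on_message_callback=callback)
        channel.start_consuming()

You would then run it in its own process with python manage.py consume_queue.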
For reference, here is Celery's own consumer, because Celery of course has its own:
https://github.com/celery/celery/blob/master/celery/worker/consumer/consumer.py
It looks rather generic, though; a simple consumer should be much less complex.
(I have only ever written NSQ Python consumers as part of Django, so no direct experience with RabbitMQ consumers, only with RabbitMQ as a backend for Celery.)
EDIT: What you should ask yourself is - do I want the realtime data saved and stored in my Django app, first of all?
If yes - then RabbitMQ+Consumer is a very valid approach.
If no, if this is just for the user - you could also think about directly exposing it via an API to your frontend (and there use ajax calls to fetch it).
If no, but you want to buffer the data to avoid hitting the other app that generates it - then a queue is a very nice tool. In this case, though, you might change the consumer not to save the data but to expose it to your frontend. If you only have to support modern browsers, you could use WebSockets, which are supported now with Django 3:
https://blog.heroku.com/in_deep_with_django_channels_the_future_of_real_time_apps_in_django
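If you go the Django Channels route, a minimal WebSocket consumer might look like the sketch below (the class name and wiring are illustrative assumptions; in a real app you would push whatever data arrived from RabbitMQ instead of echoing):

# consumers.py (requires the channels package and an ASGI setup)
from channels.generic.websocket import AsyncWebsocketConsumer

class DataConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # Accept the WebSocket handshake from the browser.
        await self.accept()

    async def receive(self, text_data=None, bytes_data=None):
        # Echo back whatever the client sends; replace this with pushing
        # data that your consumer process received from the queue.
        await self.send(text_data=text_data)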

Related

Best practices for often repeating background tasks in django and pythonanywhere

So, I am currently working on a Django project hosted on PythonAnywhere, which includes a notifications feature and also receives data externally from sensors through AWS. I have been thinking about the best practice for implementing this.
I currently have a simple implementation: a view that checks all notifications and performs the required actions, with an always-on task (which simply means a script that runs independently) sending a REST request to the server every minute.
Server side:
views.py:
def checkNotifications(request):
    notificationsObject = notifications.objects.order_by('thing').values_list('thing').distinct()
    thingsList = list(notificationsObject)
    for thing in thingsList:
        valuesDic = returnAllField(thing)
        thingNotifications = notifications.objects.filter(thing=thing)
        # Do stuff for each notification
urls:
path('notifications/',views.checkNotifications,name="checkNotification")
and the client just sends a GET request to my URL/notifications/, which works.
Now, while researching I saw some other options such as the ones discussed here with django background tasks and/or celery:
How to initialize repeating tasks using Django Background Tasks?
Celery task best practices in Django/Python
as well as some other options.
My question is: is there a benefit to moving from my first implementation to one of these? The only benefit I can see directly is avoiding abuse from another service hitting my URL to check notifications too often, but I can require (and already have) authentication to avoid that. And is there a certain "best practice" here? Considering that I am running this repeating check so often, it feels like there should be a more proper/cleaner solution. For one, I am not sure if running a repeating task is the best option with PythonAnywhere.
(https://help.pythonanywhere.com/pages/AsyncInWebApps/ suggests using always-on tasks, but it also mentions django background tasks)
Thank you
To use Django Background Tasks on PythonAnywhere you need to run it using an always-on task, so it is not an alternative, just another use of always-on tasks.
You can also access your Django code from your always-on task directly with some kind of long-running management command, so you do not need to hit your web app with a special request.
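A minimal sketch of such a long-running management command, assuming a hypothetical check_notifications() helper that contains the logic currently in your view (the names are illustrative):

# yourapp/management/commands/watch_notifications.py
import time
from django.core.management.base import BaseCommand
from yourapp.notifications import check_notifications  # hypothetical module holding your existing logic

class Command(BaseCommand):
    help = "Run the notification check in a loop, suitable for an always-on task"

    def handle(self, *args, **options):
        while True:
            check_notifications()  # the notification logic, moved out of the view
            time.sleep(60)         # check once a minute, like the current REST poller

Started as python manage.py watch_notifications from the always-on task, this has full access to your models without going through HTTP.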

Trouble with understanding how RabbitMQ can be used

I'm currently working on a Python web app that needs to use RabbitMQ.
The app is structured like this:
The client connects to an HTTP server
The connection is sent to a message queue that is connected to the main service of my app
The main service receives the message and gives the user their information
I understand how to make RabbitMQ work using the documentation and tutorials on the website, but I have trouble seeing how it can work with real tasks like displaying a web page or printing a file. How does my service, connected to the message queue, read the received message and say: "oh, I'm going to display this web page"?
Sorry if this is confusing; if you need further explanation of what I'm trying to get at, just tell me.
Thanks for reading!
RabbitMQ is good for sending messages to a service which can execute a long-running process - e.g. downloading a big file or generating a complex animation. A web server can't (or shouldn't) execute long-running processes.
The web page sends a message to RabbitMQ (e.g. with the parameters for the long-running process) and gets back a unique number. When the service has a free worker, it checks whether there is a new message in the queue, takes it (with the unique number) and starts the worker. When the worker finishes the job, the service sends the result back to RabbitMQ with the same unique number.
At the same time, the web page uses JavaScript to run a loop which periodically checks RabbitMQ for a result with this unique number. If there is no result yet, it may display a progress bar; if there is a result, it displays it.
Example: Celery - Distributed Task Queue.
Celery can use RabbitMQ to communicate with Django or Flask.
(but it can use other modules ie. Redis)
Using Celery with Django.
Flask - Celery Background Tasks
From Celery repo
Celery is usually used with a message broker to send and receive messages.
The RabbitMQ, Redis transports are feature complete, but there's also
experimental support for a myriad of other solutions, including using
SQLite for local development.
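To make the request / unique-number / poll pattern above concrete, here is a minimal sketch using Celery with RabbitMQ as the broker; the task name, payload, and result backend are illustrative assumptions, not something from your app:

# tasks.py
from celery import Celery
from celery.result import AsyncResult

# Broker is RabbitMQ; a result backend (here Redis) is needed so the page can poll for results.
app = Celery('myapp', broker='amqp://localhost', backend='redis://localhost')

@app.task
def generate_report(params):
    # ... long-running work ...
    return {'status': 'done', 'params': params}

# In the web view: start the job and hand the unique id to the browser.
result = generate_report.delay({'size': 'big'})
task_id = result.id

# In the endpoint the JavaScript loop polls:
res = AsyncResult(task_id, app=app)
if res.ready():
    data = res.result   # finished: return the result to the page
else:
    pass                # not finished: the page keeps showing its progress bar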

Python Embed Web Server in Data Processing Node

I am working on a Python 2.7 project with a simple event loop that checks a variety of data sources (rabbitmq, mongodb, postgres, etc) for new data, processes the data and writes data to the next stage.
I would like to embed a web server in the application so it can receive simple REST commands, for shutting it down, diagnosis etc.
However, from reading the documentation on the available web servers, it wasn't clear whether they will allow the event loop described above to run outside of the web server's own event loop. I.e., it looks like I would have to do something like launch the event loop using a REST call and have the loop live on an I/O thread, or similar.
Can someone explain which embedded server (CherryPy, Bottle, Flask, etc.) / concurrency framework (Tornado, gevent, Twisted, etc.) is best suited for this problem?
Thank you in advance!
I would recommend you use a separate process for your app that will receive REST commands (use Pyramid or Flask), and have it send messages over RabbitMQ to the real time part. I like Kombu myself for interfacing with RabbitMQ, and your message bus will nicely decouple your web/rest needs from your event driven needs. Your event driven part just gets messages off the bus, and doesn't need to know anything about REST.
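A minimal sketch of that decoupling with Kombu (the queue name, payload, and handle_command dispatch function are illustrative assumptions):

from kombu import Connection

# In the Flask/Pyramid REST process: push the command onto RabbitMQ.
with Connection('amqp://guest:guest@localhost//') as conn:
    queue = conn.SimpleQueue('control_commands')
    queue.put({'command': 'shutdown'})
    queue.close()

# In the event-loop process: poll the bus as one step of the existing loop.
with Connection('amqp://guest:guest@localhost//') as conn:
    queue = conn.SimpleQueue('control_commands')
    try:
        message = queue.get(block=False)   # non-blocking so the loop keeps running
        handle_command(message.payload)    # your own dispatch function
        message.ack()
    except queue.Empty:
        pass
    queue.close()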

Using RabbitMQ with Django to get information from internal servers

I've been trying to make a decision about my student project before going further. The main idea is to get disk usage data, active Linux user data, and so on from multiple internal servers and publish it with Django.
Before I came to RabbitMQ I was thinking about developing a client application for each Linux server and getting this data through a socket. But I want to keep this student project simple. Also, I don't know how difficult it is to make a socket connection via Django.
So I thought I could solve my problem with RabbitMQ without socket programming. Basically, I send a message to a Rabbit queue, then get whatever I want from the consumer server.
On the Django side, the client will select one of the internal servers and click the "details" button. Then I want to show this information on a web page.
I have already read almost all the documentation about RabbitMQ, Celery, and pika. Sending messages to all internal servers (clients) and calculating the information I want is okay, but I can't figure out how to put this data on a web page with Django.
How would you solve this problem if you were me?
Thank you.
I solved my problem on my own. The solution is a RabbitMQ RPC call. You can execute your Python code on a remote server and get the result of the process via RPC requests. Details can be found here:
http://www.rabbitmq.com/tutorials/tutorial-six-python.html
Thank you guys.
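For reference, the client side of that RPC pattern with pika looks roughly like the sketch below (the queue name 'rpc_queue' and the request body are illustrative, following the tutorial's conventions):

import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Exclusive queue where the remote server will post its reply.
result = channel.queue_declare(queue='', exclusive=True)
callback_queue = result.method.queue

response = None
corr_id = str(uuid.uuid4())

def on_response(ch, method, properties, body):
    global response
    if properties.correlation_id == corr_id:   # only accept the reply to our request
        response = body

channel.basic_consume(queue=callback_queue, on_message_callback=on_response, auto_ack=True)

# Send the request, telling the server where and how to reply.
channel.basic_publish(
    exchange='',
    routing_key='rpc_queue',
    properties=pika.BasicProperties(reply_to=callback_queue, correlation_id=corr_id),
    body='disk_usage')

while response is None:
    connection.process_data_events()
print("Got reply:", response)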
Looks like you have already done the hard work (Celery, Rabbit, etc.) but are missing the Django basics. Go through the polls tutorial, the getting-started-with-Django docs, or the many other resources on the web, and it should be quite simple. Basically:
create the models (objects represented in the db)
declare URLs
set up views to pass the data from the model to the page template
create the templates (or do it with a client-side framework and return a JSON response)
EDIT (after you clarified the question): Actually, I just hit the same problem too. The answer is to run another Python process in parallel to the Django process (in the same virtualenv). In this process you set up a Rabbit consumer (using pika, puka, kombu, or whatever) and call specific Django functions/methods to do something with the information from RabbitMQ. You can also just call Celery tasks from there, to be executed in the Django app context.
A Procfile, for example (just to illustrate; you can run both processes in many other ways):
web: python manage.py runserver
worker: python listen_from_servers.py
Note that you'll have to set the DJANGO_SETTINGS_MODULE environment variable to your settings file for the Django imports to work.
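A minimal sketch of what listen_from_servers.py could look like, assuming a queue named 'server_info' and a hypothetical ServerInfo model (both illustrative):

# listen_from_servers.py -- runs alongside the Django web process
import os
import django

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')
django.setup()   # must happen before importing models

import pika
from myapp.models import ServerInfo   # hypothetical model

def callback(ch, method, properties, body):
    # Persist the incoming data so the Django views can simply query it.
    ServerInfo.objects.create(payload=body.decode())
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='server_info')
channel.basic_consume(queue='server_info', on_message_callback=callback)
channel.start_consuming()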
You need the following two programs running at all times:
The producer, which will populate the queue. This is the program that will collect the various messages and then post them on the queue.
The consumer, which will process messages from the queue. This consumer's job is to read the message and do something with it, so that it is processed and removed from the queue. What the consumer does is entirely up to you, but in this scenario you want to write the information from the message to a database model - the same database that is part of your Django app.
As the producer pushes messages and the consumer removes them from the queue, your database will get updated.
On the Django side, the process is simply to filter this database and display records for a particular machine. Django does not need to be aware of how the records are being populated - all Django is doing is fetching, filtering, passing data to the template, and rendering the views.
The question then becomes how best (well, actually, how most easily) to populate the database. You could do it the traditional way, using Python's well-documented DB-API and writing your own SQL statements; but since Celery is so well integrated with Django, you can use Django's ORM to do this work for you as well.
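For example, a minimal sketch of a Celery task writing through the ORM (the MachineStat model and its fields are illustrative assumptions):

# tasks.py
from celery import shared_task
from myapp.models import MachineStat   # hypothetical model

@shared_task
def store_stat(machine, disk_usage, active_users):
    # Celery runs this inside the Django app context, so the ORM is available.
    MachineStat.objects.create(
        machine=machine,
        disk_usage=disk_usage,
        active_users=active_users,
    )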
I hope this gets you going in the right direction.

Need help understanding Comet in Python (with Django)

After spending two entire days on this I'm still finding it impossible to understand all the choices and configurations for Comet in Python. I've read all the answers here as well as every blog post I could find. It feels like I'm about to hemorrhage at this point, so my utmost apologies for anything wrong with this question.
I'm entirely new to all of this, all I've done before were simple non-real-time sites with a PHP/Django backend on Apache.
My goal is to create a real-time chat application; hopefully tied to Django for users, auth, templates, etc.
Every time I read about a tool it says I need another tool on top of it, it feels like a never-ending chain.
First of all, can anybody categorize all the tools needed for this job?
I've read about different servers, networking libraries, engines, JavaScripts for the client side, and I don't know what else. I never imagined it would be this complex.
Twisted / Twisted Web seems to be popular, but I have no idea how to integrate it or what else I need (guessing I need client-side JS at least).
If I understand correctly, Orbited is built on Twisted, do I need anything else with it?
Are Gevent and Eventlet in the same category as Twisted? How much else do I need with them?
Where do things like Celery, RabbitMQ, or KV stores like Redis come into this? I don't really understand the concept of a message queue. Are they essential and what service do they provide?
Are there any complete chat app tutorials I should look at?
I'll be entirely indebted to anybody who helps me past this mental roadblock, and if I left anything out please don't hesitate to ask. I know it's a pretty loaded question.
You could use Socket.IO. There are gevent and tornado handlers for it. See my blog post on gevent-socketio with Django here: http://codysoyland.com/2011/feb/6/evented-django-part-one-socketio-and-gevent/
I feel your pain, having had to go through the same research over the past few months. I haven't had time to write proper documentation yet, but I have a working example of using Django with socket.io and TornadIO at http://bitbucket.org/virtualcommons/vcweb - I was hoping to set up direct communication from the Django server side to the TornadIO server process using queues (i.e., logic in a Django view pushes a message onto a queue that then gets handled by TornadIO, which pushes a JSON-encoded version of that message out to all interested subscribers), but I haven't implemented that part fully yet. The way I've currently got it set up involves:
An external Tornado (TornadIO) server, running on another port, accepting socket.io requests and working with Django models. The only writes this server process makes to the database are the chat messages that need to be stored. It has full access to all Django models, etc., and all real-time interactions go directly through this server process.
Django template pages that require real-time access include the socket.io JavaScript and establish direct connections to the TornadIO server.
I looked into Orbited, Hookbox, and gevent but decided to go with socket.io + Tornado as it seemed to give me the cleanest JavaScript + Python code. I could be wrong about that though, having only started learning Python/Django over the past year.
Redis is relevant as a persistence layer that also supports native publish/subscribe. So instead of polling the db looking for new messages, you can subscribe to a channel and have messages pushed out to you.
I found a working example of the type of system you describe. The magic happens in the socketio view:
from django.http import HttpResponse
from gevent import Greenlet
# redis_client, username and room_channel are helpers defined elsewhere in the example app.

def socketio(request):
    """The socket.io view."""
    io = request.environ['socketio']
    redis_sub = redis_client().pubsub()
    user = username(request.user)

    # Subscribe to incoming pubsub messages from redis.
    def subscriber(io):
        redis_sub.subscribe(room_channel())
        redis_client().publish(room_channel(), user + ' connected.')
        while io.connected():
            for message in redis_sub.listen():
                if message['type'] == 'message':
                    io.send(message['data'])
    greenlet = Greenlet.spawn(subscriber, io)

    # Listen to incoming messages from client.
    while io.connected():
        message = io.recv()
        if message:
            redis_client().publish(room_channel(), user + ': ' + message[0])

    # Disconnected. Publish disconnect message and kill subscriber greenlet.
    redis_client().publish(room_channel(), user + ' disconnected')
    greenlet.throw(Greenlet.GreenletExit)

    return HttpResponse()
Take the view step-by-step:
Set up socket.io, get a redis client and the current user
Use Gevent to register a "subscriber" - this takes incoming messages from Redis and forwards them on to the client browser.
Run a "publisher" which takes messages from socket.io (from the user's browser) and pushes them into Redis
Repeat until the socket disconnects
The Redis Cookbook gives a little more detail on the Redis side, as well as discussing how you can persist messages.
Regarding the rest of your question: Twisted is an event-based networking library, it could be considered an alternative to Gevent in this application. It's powerful and difficult to debug in my experience.
Celery is a "distributed task queue" - basically, it lets you spread units of work out across multiple machines. The "distributed" angle means some sort of transport is required between the machines. Celery supports several types of transport, including RabbitMQ (and Redis too).
In the context of your example, Celery would only be appropriate if you had to do some sort of costly processing on each message like scanning for profanity or something. Even still, something would have to initiate the Celery task, so there would need to be some code listening for the socket.io callback.
(Just in case you weren't totally confused, Celery itself can be made to use Gevent as its underlying concurrency library.)
Hope that helps!
