Using RabbitMQ with Django to get information from internal servers - python

I've been trying to make a decision about my student project before going further. The main idea is to get disk usage data, active Linux user data, and so on from multiple internal servers and publish them with Django.
Before I came to RabbitMQ I was thinking about developing a client application for each Linux server and getting this data through a socket. But I want to keep this student project simple. Also, I don't know how difficult it is to make a socket connection via Django.
So, I thought I could solve my problem with RabbitMQ without socket programming. Basically, I send a message to a Rabbit queue and then get whatever I want back from the consumer server.
On the Django side, the client will select one of the internal servers and click the "details" button. Then I want to show this information on a web page.
I have already read almost all the documentation about RabbitMQ, Celery and pika. Sending messages to all the internal servers (clients) and calculating the information I want is fine, but I can't figure out how to put this data on a web page with Django.
How would you approach this problem if you were me?
Thank you.

I solved my problem on my own. The solution is a RabbitMQ RPC call. You can execute your Python code on a remote server and get the result of the process back via RPC requests. Details can be found here.
http://www.rabbitmq.com/tutorials/tutorial-six-python.html
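Roughly, the client side of that RPC pattern (adapted from the tutorial linked above; the 'server_info' queue name, host and command string are placeholders, not part of the tutorial) looks like this:

import uuid
import pika

class ServerInfoRpcClient:
    """Minimal RPC client sketch, following the tutorial linked above."""

    def __init__(self):
        self.connection = pika.BlockingConnection(
            pika.ConnectionParameters(host='localhost'))
        self.channel = self.connection.channel()
        # Exclusive, auto-named queue for the replies.
        result = self.channel.queue_declare(queue='', exclusive=True)
        self.callback_queue = result.method.queue
        self.channel.basic_consume(
            queue=self.callback_queue,
            on_message_callback=self.on_response,
            auto_ack=True)
        self.response = None
        self.corr_id = None

    def on_response(self, ch, method, props, body):
        # Only accept the reply that matches our request.
        if self.corr_id == props.correlation_id:
            self.response = body

    def call(self, command):
        self.response = None
        self.corr_id = str(uuid.uuid4())
        self.channel.basic_publish(
            exchange='',
            routing_key='server_info',  # queue the remote server listens on
            properties=pika.BasicProperties(
                reply_to=self.callback_queue,
                correlation_id=self.corr_id),
            body=command)
        while self.response is None:
            self.connection.process_data_events()
        return self.response

# e.g. disk_usage = ServerInfoRpcClient().call('disk_usage')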
Thank you guys.

Looks like you've already done the hard work (Celery, RabbitMQ, etc.) but are missing the Django basics. Go through the polls tutorial, "Getting Started with Django", or the many other resources on the web, and it will be quite simple. Basically:
create the models (objects represented in db)
declare urls
set up views to pass the data from the model to the web page template
create the templates (or do it with a client-side framework and return a JSON response); a minimal sketch of these pieces follows below
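A minimal sketch of those steps for this use case might look like this (model, field and template names are made up for illustration, spread over the usual files):

# models.py - one row per measurement pushed in from an internal server
from django.db import models

class ServerStat(models.Model):
    hostname = models.CharField(max_length=100)
    disk_usage_percent = models.FloatField()
    active_users = models.IntegerField()
    created = models.DateTimeField(auto_now_add=True)

# views.py - fetch the latest stats for one server and hand them to a template
from django.shortcuts import render
from .models import ServerStat

def server_details(request, hostname):
    stats = ServerStat.objects.filter(hostname=hostname).order_by('-created')[:20]
    return render(request, 'servers/details.html', {'stats': stats})

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('servers/<str:hostname>/', views.server_details, name='server-details'),
]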
EDIT (after you clarified the question): Actually I just hit the same problem too. The answer is running another Python process parallel to the Django process (in the same virtualenv). In this process you set up a RabbitMQ consumer (using pika, puka, kombu or whatever) and call specific Django functions/methods to do something with the information from RabbitMQ. You can also just call Celery tasks from there to be executed in the Django app context.
A Procfile, for example (just illustrating; you can run both processes in many other ways):
web: python manage.py runserver
worker: python listen_from_servers.py
Notice that you'll have to set the DJANGO_SETTINGS_MODULE environment variable to your settings module for the Django imports to work.
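A minimal sketch of what listen_from_servers.py could look like with pika; the settings module, queue name and the ServerStat model are assumptions, not part of the original answer:

# listen_from_servers.py - run as its own process next to the Django server.
import os
import json

import django
import pika

# Point Django at your settings before importing any models.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')
django.setup()

from myapp.models import ServerStat  # hypothetical model


def handle_message(channel, method, properties, body):
    # Whatever an internal server published, e.g. JSON with its stats.
    data = json.loads(body)
    ServerStat.objects.create(
        hostname=data['hostname'],
        disk_usage_percent=data['disk_usage_percent'],
        active_users=data['active_users'],
    )
    channel.basic_ack(delivery_tag=method.delivery_tag)


connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='server_stats')
channel.basic_consume(queue='server_stats', on_message_callback=handle_message)
channel.start_consuming()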

You need the following two programs running at all times:
The producer, which will populate the queue. This is the program that will collect the various messages and then post them on the queue.
The consumer, which will process messages from the queue. This consumer's job is to read the message and do something with it; so that it is processed and removed from the queue. The function that this consumer does is entirely up to you, but what you want to do in this scenario is write information from the message to a database model; the same database that is part of your django app.
As the producer pushes messages and the consumer removes them from the queue, your database will get updated.
On the django side, the process is simply to filter this database and display records for a particular machine. As such, django does not need to be aware of how the records are being populated in the database - all django is doing is fetching, filtering, sending to the template and rendering the views.
The question then becomes how best (well actually, how easily) to populate the database. You can do it the traditional way, using Python's well-documented DB-API and writing your own SQL statements; but since Celery is so well integrated with Django, you can use Django's ORM to do this work for you as well.
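For completeness, the producer on each internal server could be as small as the sketch below; the broker host, queue name and payload fields are assumptions:

# producer sketch for each internal server - collect a few stats and publish them.
import json
import shutil
import socket

import pika

usage = shutil.disk_usage('/')
payload = {
    'hostname': socket.gethostname(),
    'disk_usage_percent': usage.used / usage.total * 100,
    'active_users': 0,  # fill in e.g. by parsing the output of `who`
}

connection = pika.BlockingConnection(pika.ConnectionParameters('rabbit-host'))
channel = connection.channel()
channel.queue_declare(queue='server_stats')
channel.basic_publish(exchange='', routing_key='server_stats',
                      body=json.dumps(payload))
connection.close()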
I hope this gets you going in the right direction.

Related

Best practices for frequently repeating background tasks in Django and PythonAnywhere

So, I am currently working on a Django project hosted at PythonAnywhere, which includes a feature for notifications, while also receiving data externally from sensors through AWS. I have been thinking about the best practice for implementing this.
I currently have a simple implementation: a view that checks all notifications and performs the required actions, plus an always-on task (which simply means a script that runs independently) sending a REST request to the server every minute.
Server side:
views.py:
def checkNotifications(request):
    notificationsObject = notifications.objects.order_by('thing').values_list('thing').distinct()
    thingsList = list(notificationsObject)
    for thing in thingsList:
        valuesDic = returnAllField(thing)
        thingNotifications = notifications.objects.filter(thing=thing)
        # Do stuff for each notification
urls:
path('notifications/',views.checkNotifications,name="checkNotification")
and the client just sends a GET request to my URL /notifications/, which works.
Now, while researching I saw some other options, such as the ones discussed here with django-background-tasks and/or Celery:
How to initialize repeating tasks using Django Background Tasks?
Celery task best practices in Django/Python
as well as some other options.
My question is: is there a benefit to moving from my first implementation to one of these? The only benefit I can see directly is avoiding abuse from another service hitting my URL to check notifications too often, but I can (and do) require authentication to avoid that. And is there a certain "best practice" here? Considering how often this repeating task runs, it feels like there should be a more proper/cleaner solution. For one, I am not sure if running a repeating task is the best option on PythonAnywhere.
(https://help.pythonanywhere.com/pages/AsyncInWebApps/ suggests using always-on tasks, but it also mentions django background tasks)
Thank you
To use Django background tasks on PythonAnywhere you need to run it using an always-on task, so it is not an alternative, just another use of always-on tasks.
You can also access your Django code in your always-on task directly with some kind of long-running management command, so you do not need to hit your web app with a special request.
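A rough sketch of such a long-running management command, reusing the notification-checking logic from the view above (the app name and the returnAllField import are assumptions):

# myapp/management/commands/check_notifications.py
import time

from django.core.management.base import BaseCommand

from myapp.models import notifications
from myapp.utils import returnAllField  # wherever that helper actually lives


class Command(BaseCommand):
    help = "Check notifications once a minute, forever."

    def handle(self, *args, **options):
        while True:
            things = notifications.objects.order_by('thing') \
                                          .values_list('thing', flat=True).distinct()
            for thing in things:
                valuesDic = returnAllField(thing)
                thingNotifications = notifications.objects.filter(thing=thing)
                # Do stuff for each notification
            time.sleep(60)

The always-on task would then just run python manage.py check_notifications instead of hitting the web app.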

Usage of RabbitMQ queues with Django

I'm trying to add some real-time features to my Django applications. For that I'm using RabbitMQ and Celery on my Django project, and what I would like to do is this: I have an external Python script which sends data to RabbitMQ, and from RabbitMQ it should be retrieved by the Django app.
I'm sending some dummy data, like this:
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='Test')
channel.basic_publish(exchange='',
                      routing_key='Test',
                      body='Hello world!')
print(" [x] Sent 'Hello World!'")
connection.close()
What I would like to do is: as soon as I send Hello World!, my Django app should receive the string, so that I can perform some operations with it, such as saving it to my database, passing it to an HTML template, or simply printing it to my console.
My actual problem is that I still have no idea how to do this. I added Celery to my Django project but I don't know how to connect to RabbitMQ and receive the message. Would I have to do it with Django Channels? Is there some tutorial on this? I found plenty of material about using RabbitMQ and Celery with Django, but nothing on this particular matter.
This is not directly connected to Celery.
You could solve it like this:
create a management command in Django that does whatever needs to be done with the incoming message/data (https://docs.djangoproject.com/en/3.0/howto/custom-management-commands/)
create a consumer, which could also be a Django management command that is then started in a separate process; it is not part of the regular Django process (though it can be part of your Django code). In this consumer, you listen to the queue and whenever data comes in, the command from (1.) is called.
Of course, 1. and 2. could also be done in one single command. I've separated them to better illustrate the different aspects, and you might have different tasks and want to reuse one consumer. Also, if you already have (1.) you can reuse it like this, and you can test it easily without the overhead of the consumer.
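A rough sketch of (2.) as a management command; it listens on the 'Test' queue from your script, and the name of the command called for (1.) is an assumption:

# myapp/management/commands/consume_test_queue.py
# Run in its own process: python manage.py consume_test_queue
import pika

from django.core.management import call_command
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Listen on the 'Test' queue and hand each message to another command."

    def handle(self, *args, **options):
        connection = pika.BlockingConnection(
            pika.ConnectionParameters('localhost'))
        channel = connection.channel()
        channel.queue_declare(queue='Test')

        def callback(ch, method, properties, body):
            # (1.): do whatever needs doing with the incoming data,
            # here delegated to a hypothetical management command.
            call_command('handle_incoming_message', body.decode())
            ch.basic_ack(delivery_tag=method.delivery_tag)

        channel.basic_consume(queue='Test', on_message_callback=callback)
        channel.start_consuming()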
Celery of course has its own consumer; here it is for reference, although it is rather generic and a simple consumer should be less complex:
https://github.com/celery/celery/blob/master/celery/worker/consumer/consumer.py
(I have only ever written NSQ Python consumers as part of Django, so no direct experience with RabbitMQ consumers, only as a backend for Celery.)
EDIT: What you should ask yourself first of all is: do I want the real-time data saved and stored in my Django app?
If yes - then RabbitMQ+Consumer is a very valid approach.
If no, and this is just for the user - you could also think about directly exposing it via an API to your frontend (and using Ajax calls there to fetch it).
If no, but you want to buffer the data to avoid hammering the other app that generates it - then a queue is a very nice tool. In this case though, you might change the consumer not to save the data but to expose it to your frontend. If you only have to support newer browsers you could use WebSockets, which are supported now with Django 3 via Channels:
https://blog.heroku.com/in_deep_with_django_channels_the_future_of_real_time_apps_in_django
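Just as an illustration of the websocket side (not taken from the article above), a minimal Channels consumer could look like this; routing, INSTALLED_APPS and the channel-layer wiring to your queue consumer are left out:

# consumers.py - minimal Django Channels websocket consumer sketch.
import json

from channels.generic.websocket import AsyncWebsocketConsumer


class RealtimeDataConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()

    async def receive(self, text_data=None, bytes_data=None):
        # In a real setup the data would come from your RabbitMQ consumer
        # (e.g. via a channel layer group), not be echoed back from the client.
        await self.send(text_data=json.dumps({'echo': text_data}))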

Django: run a thread/task with database access always parallel to the webserver

I want to build an MQTT client which stores some data in my Django database.
This client should always run, when the webserver is running.
What is the best way to run a thread with database access (Django models) parallel to the webserver?
I've read about the django-background-tasks module, but I am not sure if it's a good way.
Celery is the most common solution for this. You can also create a custom management command and execute it using cron or something similar.
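A sketch of such a management command using the paho-mqtt package (1.x callback style); the broker address, topic and SensorReading model are assumptions:

# myapp/management/commands/mqtt_client.py
import paho.mqtt.client as mqtt

from django.core.management.base import BaseCommand

from myapp.models import SensorReading  # hypothetical model


class Command(BaseCommand):
    help = "Subscribe to the MQTT broker and store incoming messages."

    def handle(self, *args, **options):
        client = mqtt.Client()

        def on_connect(client, userdata, flags, rc):
            client.subscribe('sensors/#')

        def on_message(client, userdata, msg):
            SensorReading.objects.create(
                topic=msg.topic,
                payload=msg.payload.decode(),
            )

        client.on_connect = on_connect
        client.on_message = on_message
        client.connect('localhost', 1883)
        client.loop_forever()  # blocks; run this alongside the webserver

You could then start this command via a process manager, cron @reboot, or a similar mechanism so it runs whenever the webserver does.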

What’s the correct way to run a long-running task in Django whilst returning a page to the user immediately?

I’m writing a tiny Django website that’s going to provide users with a way to delete all their contacts on Flickr.
It’s mainly an exercise to learn about Selenium, rather than something actually useful — because the Flickr API doesn’t provide a way to delete contacts, I’m using Selenium to make an actual web browser do the actual deleting of contacts.
Because this might take a while, I’d like to present the user with a message saying that the deleting is being done, and then notify them when it’s finished.
In Django, what’s the correct way to return a web page to the user immediately, whilst performing a task on the server that continues after the page is returned?
Would my Django view function use the Python threading module to make the deleting code run in another thread whilst it returns a page to the user?
Consider using a task queue - the solution most favoured by the Django community is Celery with RabbitMQ.
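Roughly, the view just enqueues a Celery task and returns a page immediately; the task and template names below are placeholders, and the Selenium work is only hinted at:

# tasks.py
from celery import shared_task

@shared_task
def delete_flickr_contacts(user_id):
    # ... drive the browser with Selenium here, contact by contact ...
    pass

# views.py
from django.shortcuts import render
from .tasks import delete_flickr_contacts

def start_deletion(request):
    delete_flickr_contacts.delay(request.user.id)  # returns immediately
    return render(request, 'deleting_in_progress.html')

The page can then poll a small status view (or you notify the user some other way) to tell them when the task has finished.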
Once, when I needed this, I set up another Python process that communicated with Django via XML-RPC - this other process would take care of the long requests and be able to answer the status of each. The Django views would call that other process (via XML-RPC) to queue jobs and query job status. I made a couple of proper JSON views in Django to query the XML-RPC process, and would update the HTML page using asynchronous JavaScript calls to those views (aka Ajax).
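That separate process can be as small as the standard library allows; a toy sketch, with the actual job handling omitted:

# worker.py - toy sketch of the separate XML-RPC process.
from xmlrpc.server import SimpleXMLRPCServer

jobs = {}  # job_id -> status

def queue_job(job_id, payload):
    jobs[job_id] = 'queued'
    # ... hand the payload to a worker thread that does the long work ...
    return True

def job_status(job_id):
    return jobs.get(job_id, 'unknown')

server = SimpleXMLRPCServer(('localhost', 8001), allow_none=True)
server.register_function(queue_job)
server.register_function(job_status)
server.serve_forever()

The Django views would then talk to it through xmlrpc.client.ServerProxy('http://localhost:8001').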

Need help understanding Comet in Python (with Django)

After spending two entire days on this I'm still finding it impossible to understand all the choices and configurations for Comet in Python. I've read all the answers here as well as every blog post I could find. It feels like I'm about to hemorrhage at this point, so my utmost apologies for anything wrong with this question.
I'm entirely new to all of this, all I've done before were simple non-real-time sites with a PHP/Django backend on Apache.
My goal is to create a real-time chat application; hopefully tied to Django for users, auth, templates, etc.
Every time I read about a tool it says I need another tool on top of it, it feels like a never-ending chain.
First of all, can anybody categorize all the tools needed for this job?
I've read about different servers, networking libraries, engines, JavaScripts for the client side, and I don't know what else. I never imagined it would be this complex.
Twisted / Twisted Web seems to be popular, but I have no idea how to integrate it or what else I need (I'm guessing I need client-side JS at least).
If I understand correctly, Orbited is built on Twisted, do I need anything else with it?
Are Gevent and Eventlet in the same category as Twisted? How much else do I need with them?
Where do things like Celery, RabbitMQ, or KV stores like Redis come into this? I don't really understand the concept of a message queue. Are they essential and what service do they provide?
Are there any complete chat app tutorials I should look at?
I'll be entirely indebted to anybody who helps me past this mental roadblock, and if I left anything out please don't hesitate to ask. I know it's a pretty loaded question.
You could use Socket.IO. There are gevent and tornado handlers for it. See my blog post on gevent-socketio with Django here: http://codysoyland.com/2011/feb/6/evented-django-part-one-socketio-and-gevent/
I feel your pain, having had to go through the same research over the past few months. I haven't had time to write proper documentation yet, but I have a working example of using Django with socket.io and tornadio at http://bitbucket.org/virtualcommons/vcweb - I was hoping to set up direct communication from the Django server side to the tornadio server process using queues (i.e., logic in a Django view pushes a message onto a queue that then gets handled by tornadio, which pushes a JSON-encoded version of that message out to all interested subscribers) but haven't implemented that part fully yet. The way I've currently got it set up involves:
An external tornado (tornadio) server, running on another port, accepting socket.io requests and working with Django models. The only writes this server process makes to the database are the chat messages that need to be stored. It has full access to all Django models, etc., and all real-time interactions need to go directly through this server process.
Django template pages that require real-time access include the socket.io javascript and establish direct connections to the tornadio server
I looked into orbited, hookbox, and gevent but decided to go with socket.io + tornado as it seemed to allow me the cleanest javascript + python code. I could be wrong about that though, having just started to learn Python/Django over the past year.
Redis is relevant as a persistence layer that also supports native publish/subscribe. So instead of a situation where you are polling the db looking for new messages, you can subscribe to a channel, and have messages pushed out to you.
I found a working example of the type of system you describe. The magic happens in the socketio view:
def socketio(request):
    """The socket.io view."""
    io = request.environ['socketio']
    redis_sub = redis_client().pubsub()
    user = username(request.user)

    # Subscribe to incoming pubsub messages from redis.
    def subscriber(io):
        redis_sub.subscribe(room_channel())
        redis_client().publish(room_channel(), user + ' connected.')
        while io.connected():
            for message in redis_sub.listen():
                if message['type'] == 'message':
                    io.send(message['data'])
    greenlet = Greenlet.spawn(subscriber, io)

    # Listen to incoming messages from client.
    while io.connected():
        message = io.recv()
        if message:
            redis_client().publish(room_channel(), user + ': ' + message[0])

    # Disconnected. Publish disconnect message and kill subscriber greenlet.
    redis_client().publish(room_channel(), user + ' disconnected')
    greenlet.throw(Greenlet.GreenletExit)

    return HttpResponse()
Take the view step-by-step:
Set up socket.io, get a redis client and the current user
Use Gevent to register a "subscriber" - this takes incoming messages from Redis and forwards them on to the client browser.
Run a "publisher" which takes messages from socket.io (from the user's browser) and pushes them into Redis
Repeat until the socket disconnects
The Redis Cookbook gives a little more detail on the Redis side, as well as discussing how you can persist messages.
Regarding the rest of your question: Twisted is an event-based networking library, it could be considered an alternative to Gevent in this application. It's powerful and difficult to debug in my experience.
Celery is a "distributed task queue" - basically, it lets you spread units of work out across multiple machines. The "distributed" angle means some sort of transport is required between the machines. Celery supports several types of transport, including RabbitMQ (and Redis too).
In the context of your example, Celery would only be appropriate if you had to do some sort of costly processing on each message like scanning for profanity or something. Even still, something would have to initiate the Celery task, so there would need to be some code listening for the socket.io callback.
(Just in case you weren't totally confused, Celery itself can be made to use Gevent as its underlying concurrency library.)
Hope that helps!
