Dramatiq doesn't add tasks to the queue (Python)

I'm trying to run some dramatiq actors from my Falcon API method, like this:
def on_post(self, req, resp):
    begin_id = int(req.params["begin_id"])
    count = int(req.params["count"])
    for page_id in range(begin_id, begin_id + count):
        process_vk_page.send(f"https://vk.com/id{page_id}")
    resp.status = falcon.HTTP_200
My code reaches the "send" call and goes through the loop without any problems, but there are no new tasks in the queue! The actor itself is never called, and the "default" queue in my broker is empty. If I set a custom queue, it is still empty. My actor looks like this:
@dramatiq.actor(broker=broker)
def process_vk_page(link: str):
    pass
Where broker is
broker = RabbitmqBroker(url="amqp://guest:guest@rabbitmq:5672")
The RabbitMQ logs show that it connects fine.
I've done some additional research in the debugger. The message (which is meant to be sent to the broker) is built fine, and broker.enqueue in Actor.send_with_options() raises no exceptions, although I can't really follow its internal logic. I don't know why it fails, but it is definitely RabbitmqBroker.enqueue() that is causing the problem.
Broker is RabbitMQ 3.8.2 on Erlang 22.2.1, running in Docker from rabbitmq Docker Hub image with default settings. Dramatiq version is 1.7.0.
In the RabbitMQ logs there are only connections to the broker when the app starts and disconnections when I shut it down, like this:
2020-01-05 08:25:35.622 [info] <0.594.0> accepting AMQP connection <0.594.0> (172.20.0.1:51242 -> 172.20.0.3:5672)
2020-01-05 08:25:35.627 [info] <0.594.0> connection <0.594.0> (172.20.0.1:51242 -> 172.20.0.3:5672): user 'guest' authenticated and granted access to vhost '/'
2020-01-05 08:28:35.625 [error] <0.597.0> closing AMQP connection <0.597.0> (172.20.0.1:51246 -> 172.20.0.3:5672):
missed heartbeats from client, timeout: 60s
The broker is defined in __init__.py of the main package and imported in the subpackages. I'm not sure that specifying the same broker instance in the decorators of all the functions is fine, but there is nothing in the docs that forbids it. I guess it doesn't matter, since if I create a new broker for each actor it still doesn't work.
I've tried setting Redis as the broker, but I still get the same issue.
What might be the reason for this?

Most likely the issue is that you're not telling the workers which broker to use, since you're not declaring a default broker.
You haven't mentioned how your files are laid out in your application, but assuming your broker is defined as broker inside tasks.py, you would have to let your workers know about it like so:
dramatiq tasks:broker
See the examples at the end of dramatiq --help for more information and patterns.
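Alternatively, you can declare the broker as the global default so that a plain dramatiq tasks invocation picks it up. A minimal sketch, assuming your actors live in a tasks.py module (the module layout here is an assumption):
import dramatiq
from dramatiq.brokers.rabbitmq import RabbitmqBroker

# make this broker the default for every actor declared afterwards
broker = RabbitmqBroker(url="amqp://guest:guest@rabbitmq:5672")
dramatiq.set_broker(broker)

@dramatiq.actor
def process_vk_page(link: str):
    ...
Start the worker with dramatiq tasks and have the Falcon process import process_vk_page from the same module, so both sides agree on which broker to use.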

Related

Running kafka consumer with Django

I've set up a Kafka server on AWS and I already have a Django project acting as the producer, using kafka-python.
I've also set up a second Django project to act as the consumer (kafka-python), but I'm trying to figure out a way to run the consumer automatically after the server has started, without having to trigger the consumer through an API call.
Everything I've tried so far either runs the consumer and blocks the server from starting, or runs the server and blocks the consumer.
I did something like this on a Django project: I put the consumer launch into a daemon thread inside a method, and I call this method from the manage.py file.
I'm not really sure about the impact of modifying manage.py, but it works fine.
import threading

def run_consumers():
    # main.lauch_consumer is the project-specific function that starts the Kafka consumer loop
    thread = threading.Thread(name="my_consumer", target=main.lauch_consumer, args=())
    thread.daemon = True
    thread.start()
And in manage.py I added:
if os.environ.get('RUN_MAIN'):
    # Run consumers once at server start
    run_consumers()

What to do if the request is not completed within a certain time

I am trying to design an API with FastAPI. I have clients who limit the response time for their requests; for some clients it may be tens of seconds, for others milliseconds.
The idea is that the user sends a request (e.g. /v5/bla/info), the API checks who sent it and determines the allowed response time. If the request finishes within that time, the answer is returned; if not, at the end of the allotted time the API returns some kind of request ID, so that the user can later query another endpoint (e.g. /v5/check_request), which reports the execution status (pending, done, error) of the request by that ID.
The question is how to implement task execution and runtime checking while holding the session with the client.
EDIT: I was thinking that the API would write every incoming request to a database, and some "executor" would take the data from the database, execute the requests and update their status; meanwhile, the API would check the record's status every n seconds and return the result.
How good or bad is this option? The load is approximately 30 million requests per 24 hours.
As mentioned above, I would recommend Celery. With Celery you can schedule long-running tasks on a task queue: when you place a task on the queue, a worker picks it up and runs the computation in a separate process.
Celery can be run in different configurations. You will need to select a broker and a result backend, e.g. RabbitMQ as the broker and Redis as the backend. More information is here.
There is already a cookiecutter template for a bigger setup you could use, but you can also build a simpler setup with your own docker-compose file. Once you have configured Celery, you can create tasks on the queue with:
import os

from celery import Celery

celery = Celery(__name__)
celery.conf.broker_url = os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379")  # or any config you prefer
celery.conf.result_backend = os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379")  # or any config you prefer

@celery.task(name="long_task")
def long_task():
    # long running code
    return True
And then add the task with something like this in the first endpoint:
task = long_task.delay()
# task.id will give you the id
And then the second endpoint can look the task up by its id and get its status:
from celery.result import AsyncResult
task_result = AsyncResult(task_id)
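Putting the two endpoints together, here is a minimal sketch. It assumes the Celery app and long_task from above are importable, and it hard-codes the per-client time limit to 5 seconds purely for illustration:
from celery.exceptions import TimeoutError as CeleryTimeoutError
from celery.result import AsyncResult
from fastapi import FastAPI

# assumed module layout: from tasks import celery, long_task

app = FastAPI()

@app.post("/v5/bla/info")
def submit_request():
    task = long_task.delay()
    try:
        # block for up to the client's time limit waiting for the result
        return {"status": "done", "result": task.get(timeout=5)}
    except CeleryTimeoutError:
        # not finished in time: hand back the id so the client can poll later
        return {"status": "pending", "request_id": task.id}

@app.get("/v5/check_request")
def check_request(request_id: str):
    result = AsyncResult(request_id)
    # result.status is PENDING, STARTED, SUCCESS or FAILURE
    return {"request_id": request_id, "status": result.status}
Note that task.get() blocks the request handler while it waits, so at 30 million requests a day you would want to keep that timeout short or skip the synchronous wait entirely.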

max_clients limit reached error on tornado-botocore server

I've developed a Tornado server using the tornado-botocore package for interacting with the Amazon SQS service.
When I load-test the server, I get the following log:
[simple_httpclient:137:fetch_impl] max_clients limit reached, request queued. 10 active, 89 queued requests.
I assume it comes from the AsyncHTTPClient used by the botocore package.
I've tried to set max_clients to a higher number, but with no success:
def _connect(self, operation):
    sqs_connection = Botocore(
        service='sqs', operation=operation,
        region_name=options.aws_sqs_region_name,
        session=session)
    sqs_connection.http_client.configure(None, defaults=dict(max_clients=5000))
What am I doing wrong?
Thanks.
configure is a class method that must be called before an AsyncHTTPClient is created: tornado.httpclient.AsyncHTTPClient.configure(None, max_clients=100).
The log message does not indicate an error (it's logged at debug level). It's up to you whether it's appropriate for this service to respond to load by using more connections or queuing things up. 5000 connections for a single application process seems like too much to me.
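A minimal sketch of that call order, assuming you configure the client once at process startup (the max_clients value is just an example):
from tornado.httpclient import AsyncHTTPClient

# must run before any AsyncHTTPClient instance is created in this process
AsyncHTTPClient.configure(None, max_clients=100)

# ...only afterwards create the Botocore/SQS connections and start the IOLoop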

Using RabbitMQ with Plone - Celery or not?

I hope I am posting this in the right place.
I am researching RabbitMQ for potential use in our Plone sites. We currently use Async on a dedicated worker client in the Plone server, but we are thinking about building a dedicated RabbitMQ server that will handle all Plone messaging and other activity.
My specific question is: what are the advantages of using Celery to work with RabbitMQ in Plone versus just using RabbitMQ? I found this Plone add-on for Celery integration, but I'm not sure if that is the best route to go. I noticed Celery has the Flower tool for monitoring the queues, which would be a huge plus.
As a side question, if you feel so inclined, does anyone have any tips or references for integrating RabbitMQ with Plone to handle all of these requests? I have been doing research and get the general gist of RabbitMQ, but I can't seem to make the connection with Plone activities, such as Content Rules and PloneFormGen submissions for example. So far I see this add-on that I am going to install and see if I can figure out, but I am just trying to get a little guidance if I can.
Thanks for your time!
First, ask yourself whether you need the features of RabbitMQ or just want to run some asynchronous tasks in Python with Plone.
If you don't really need RabbitMQ, you could look into David Glick's gists for how to integrate Celery with Plone (and still use RabbitMQ with Celery):
https://gist.github.com/davisagli/5824662
https://gist.github.com/davisagli/5824709
You could also look into collective.taskqueue (simple queues without Celery nor RabbitMQ), but it does not provide any monitoring solution yet.
If you really need RabbitMQ, skip Celery and try out collective.zamqp. Celery tries to be a broker by itself and would prevent you from using most of AMQP's and RabbitMQ's built-in features.
RabbitMQ ships with a great web admin plugin for monitoring, and there are also plugins for 3rd-party monitoring systems (like Zenoss).
I'm sorry that collective.zamqp is still missing narrative documentation, but you can look into collective.zamqpdemo for various examples of its configuration and usage.
In short, c.zamqp allows you to configure broker usage in terms of producers and consumers:
from five import grok
from zope.interface import Interface
from collective.zamqp.producer import Producer
from collective.zamqp.consumer import Consumer

class CreateItemProducer(Producer):
    """Produces item creation requests"""
    grok.name("amqpdemo.create")  # is also used as the default routing key
    connection_id = "superuser"
    serializer = "msgpack"
    queue = "amqpdemo.create"
    durable = False

class ICreateItemMessage(Interface):
    """Marker interface for item creation message"""

class CreateItemConsumer(Consumer):
    """Consumes item creation messages"""
    grok.name("amqpdemo.create")  # is also used as the queue name
    connection_id = "superuser"
    marker = ICreateItemMessage
    durable = False
Publish messages through a transaction-bound producer (to publish messages only after a successful transaction):
import uuid
from zope.component import getUtility
from collective.zamqp.interfaces import IProducer

producer = getUtility(IProducer, name="amqpdemo.create")
producer._register()  # register so publishing is bound to a successful transaction

message = {"title": u"My title"}
producer.publish(message)
And consume the messages in a familiar content event handler environment:
from zope.component.hooks import getSite
from collective.zamqp.interfaces import IMessageArrivedEvent
from plone.dexterity.utils import createContentInContainer

@grok.subscribe(ICreateItemMessage, IMessageArrivedEvent)
def createItem(message, event):
    """Consume item creation message"""
    portal = getSite()
    obj = createContentInContainer(
        portal, "Document", checkConstraints=True, **message.body)
    message.ack()
Finally, it decouples the broker connection configuration from code, and the actual connection parameters can be defined in buildout.cfg (allowing the required number of consuming instances):
[instance]
recipe = plone.recipe.zope2instance
...
zope-conf-additional =
    %import collective.zamqp
    <amqp-broker-connection>
        connection_id superuser
        heartbeat 120
        # These are defaults, but can be defined when required:
        # hostname localhost
        # virtual_host /
        # username guest
        # password guest
    </amqp-broker-connection>
    <amqp-consuming-server>
        connection_id superuser
        site_id Plone
        user_id admin
        vhm_method_prefix /VirtualHostBase/https/example.com:443/Plone/VirtualHostRoot
    </amqp-consuming-server>
c.zamqp cannot be called directly from RestrictedPython, so integrating it with PloneFormGen would need either a custom action adapter or a custom External Method called from PFG's Python script adapter.
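As a rough illustration rather than a definitive implementation, such an External Method could simply wrap the transaction-bound publishing snippet shown above; the file name, function name and form-data argument here are hypothetical:
# extensions/publish_submission.py -- hypothetical External Method called from PFG's script adapter
from zope.component import getUtility
from collective.zamqp.interfaces import IProducer

def publish_submission(form_data):
    """Publish a PloneFormGen submission to the amqpdemo.create queue."""
    producer = getUtility(IProducer, name="amqpdemo.create")
    producer._register()  # bind publishing to the current transaction
    producer.publish(form_data)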

Dynamic pages with Django & Celery

I have a Celery task registered in my tasks.py file. When someone POSTs to /run/pk I run the task with the given parameters. This task also executes other tasks (normal Python functions), and I'd like to update my page (the HttpResponse returned at /run/pk) whenever a subtask finishes its work.
Here is my task:
from celery.decorators import task

@task
def run(project, branch=None):
    if branch is None:
        branch = project.branch

    print 'Creating the virtualenv'
    create_virtualenv(project, branch)
    print 'Virtualenv created'  ##### Here I want to send a signal or something to update my page

    runner = runner(project, branch)
    print 'Using {0}'.format(runner)

    try:
        result, output = runner.run()
    except Exception as e:
        print 'Error: {0}'.format(e)
        return False

    print 'Finished'
    run = Run(project=project, branch=branch,
              output=output, **result._asdict())
    run.save()
    return True
Sending push notifications to the client's browser using Django isn't easy, unfortunately. The simplest implementation is to have the client continuously poll the server for updates, but that increases the amount of work your server has to do by a lot. Here's a better explanation of your different options:
Django Push HTTP Response to users
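If you do go the polling route, a minimal sketch (assuming the run task above is dispatched with .delay() and its task id is handed to the page; the view name and URL are hypothetical) would be a small status endpoint that the page hits with JavaScript every few seconds:
# views.py -- hypothetical polling endpoint
from celery.result import AsyncResult
from django.http import JsonResponse

def run_status(request, task_id):
    result = AsyncResult(task_id)
    # state is PENDING/STARTED/SUCCESS/FAILURE; the return value is available once ready
    return JsonResponse({"state": result.state,
                         "result": result.result if result.ready() else None})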
If you weren't using Django, you'd use websockets for these notifications. However Django isn't built for using websockets. Here is a good explanation of why this is, and some suggestions for how to go about using websockets:
Making moves w/ websockets and python / django ( / twisted? )
Many years have passed since this question was asked, and Channels is now a way you could achieve this with Django.
The Channels website describes the project as one "to make Django able to handle more than just plain HTTP requests, including WebSockets and HTTP2, as well as the ability to run code after a response has been sent for things like thumbnailing or background calculation."
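As a minimal sketch of the Channels side (the consumer, group name and event type are hypothetical; the Celery task would push to the same group through the channel layer after each step):
# consumers.py -- hypothetical WebSocket consumer for build progress
import json

from asgiref.sync import async_to_sync
from channels.generic.websocket import WebsocketConsumer

class RunProgressConsumer(WebsocketConsumer):
    def connect(self):
        self.accept()
        async_to_sync(self.channel_layer.group_add)("run-progress", self.channel_name)

    def disconnect(self, close_code):
        async_to_sync(self.channel_layer.group_discard)("run-progress", self.channel_name)

    def progress(self, event):
        # invoked when the task does group_send("run-progress", {"type": "progress", "message": ...})
        self.send(text_data=json.dumps({"message": event["message"]}))
The task would then call async_to_sync(get_channel_layer().group_send)("run-progress", {"type": "progress", "message": "Virtualenv created"}) after each step, with get_channel_layer imported from channels.layers.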
There is a service called Pusher that will take care of all the messy parts of push notifications in HTML5. They supply client-side and server-side libraries to handle all the messaging and notifications, while taking care of all the WebSocket nuances.
