Architecture suggestions - How to implement autoscaled async tasks - python

We have a large application that uses Django as its ORM and Celery as its task-running infrastructure.
We run complex pipelines triggered by events (user-driven or automatic), which look something like this:
def pipeline_a():
    # every line is synchronous, so the second line must only run
    # after the first has finished successfully
    first_res = a1()
    all_results = in_parallel.do(a2, a3, a4)  # a2, a3, a4 run in parallel
    a5(first_res, all_results)
We wish to run a1, a2, ... on different machines (each task may need different resources), and the number of pipelines running in parallel is always changing.
Today we use Celery, which is super convenient for implementing the above, but it isn't suitable for auto-scaling (we hacked it to work with Kubernetes, but it has no native support for it).
The main issues I want to solve are:
1. How to run the next pipeline step only after all previous ones are done. I may not know in advance which steps will run; they depend on the results of previous steps, so the steps are dynamic in nature.
2. Today we try to use Kubernetes (EKS) to autoscale some of the tasks (SQS queue size is the HPA metric). How to make Kubernetes not terminate currently running tasks, but still start pods when a new task arrives on the queue (many tasks take ~half an hour to complete)?
My experience so far says that Celery is the most convenient way to solve (1), but then it clashes with (2). So how would you solve (1) without Celery, and how could I harness Kubernetes for long-running tasks?

If I understand your question correctly:
You have async jobs which can run for up to 30 minutes.
Jobs run on Kubernetes.
The output of the current job may decide the next job.
You are able to use SQS.
You can maintain a queue for each task and implement a consumer for each queue. Using Django, first add the task to the 'a1' queue and update the job status in the DB.
When the consumer of 'a1' finishes execution, it updates the status in the DB and pushes the follow-up task to the right queue, say 'a3'.
The consumer of 'a3' will read the task, update the DB, execute it, push the next task to the right queue, and update the DB again.
If you use SQS you can store a practically unlimited number of tasks in the queue. You will have to scale the number of consumers based on the size of the SQS queue; for this you can use https://github.com/Wattpad/kube-sqs-autoscaler
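Roughly, each consumer could look like the sketch below. This is a hedged example: it assumes boto3/SQS, and mark_status, the step handlers and the queue naming are placeholders you would replace with your own DB models and logic.

import json
import boto3

sqs = boto3.resource("sqs")

def mark_status(job_id, step, status):
    # placeholder for "update the job status in the DB" (e.g. a Django model save)
    print(job_id, step, status)

def consume(step_name, handler, next_step_for):
    # Generic consumer for one pipeline step. `handler(task)` runs the step;
    # `next_step_for(task, result)` returns the next step name (or None) based
    # on the result - that is what makes the pipeline dynamic.
    queue = sqs.get_queue_by_name(QueueName="pipeline-{}".format(step_name))
    while True:
        for msg in queue.receive_messages(WaitTimeSeconds=20, MaxNumberOfMessages=1):
            task = json.loads(msg.body)
            mark_status(task["job_id"], step_name, "running")
            result = handler(task)
            mark_status(task["job_id"], step_name, "done")
            next_step = next_step_for(task, result)
            if next_step is not None:
                next_queue = sqs.get_queue_by_name(QueueName="pipeline-{}".format(next_step))
                next_queue.send_message(MessageBody=json.dumps(dict(task, prev_result=result)))
            msg.delete()  # delete only after success, so a crashed consumer retries the task

Each step's consumer runs as its own deployment, so Kubernetes can scale each of them independently from its queue length.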

Related

Persisting all job results in a separate db in Celery

We are running an API server where users submit jobs for calculation, which take between 1 second and 1 hour. They then make requests to check the status and get their results, which could be (much) later, or even never.
Currently jobs are added to a pub/sub queue, and processed by various worker processes. These workers then send pub/sub messages back to a listener, which stores the status/results in a postgres database.
I am looking into using Celery to simplify things and allow for easier scaling.
Submitting jobs and getting results isn't a problem in Celery, using celery_app.send_task. However, I am not sure how best to ensure the results are stored, particularly for long-running or possibly abandoned jobs.
Some solutions I considered include:
1. Give all workers access to the database and let them handle updates. The main limitation to this seems to be the DB connection pool limit, as worker processes can scale to 50 replicas in some cases.
2. Listen to Celery events in a separate pod, and write changes based on them to the jobs DB. Only one connection is needed, but as far as I understand, this would miss out on events while this pod is redeploying.
3. Only check job results when the user asks for them. It seems this could lead to lost results when the user takes too long, or slowly clog the results cache.
4. As in (3), but periodically check on all jobs not marked completed in the DB. A tad complicated, but doable?
Is there a standard pattern for this, or am I trying to do something unusual with Celery? Any advice on how to tackle this is appreciated.
In the past I solved a similar problem by modifying the tasks to not only return the result of the computation, but also store it in a cache server (Redis) right before returning. I had a task that periodically (every 5 minutes) collected these results and wrote the data (in bulk, so quite efficiently) to a relational database. That worked well until we started filling the cache with hundreds of thousands of results, at which point we implemented a tiny service that does this instead of the periodic task.
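A rough sketch of that pattern, assuming redis-py and a hypothetical Django model named JobResult (both the key name and the model are placeholders):

import json
import redis

r = redis.Redis()

def store_result(task_id, result):
    # called by the task right before it returns: park the result in Redis
    r.hset("pending_results", task_id, json.dumps(result))

def flush_results_to_db():
    # periodic collector (celery beat, cron, or a tiny service): drain the
    # cache and bulk-insert everything into the relational database
    pending = r.hgetall("pending_results")
    if not pending:
        return
    JobResult.objects.bulk_create(
        [JobResult(task_id=key.decode(), payload=value.decode())
         for key, value in pending.items()]
    )
    r.hdel("pending_results", *pending.keys())  # only remove the keys we just wrote

Because bulk_create inserts everything in a handful of queries, one flush every few minutes keeps the DB connection count at one, no matter how many workers are producing results.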

What's the problem with sharing job stores in APScheduler?

I didn't quite get the problem that arises from sharing a job store across multiple schedulers in APScheduler.
The official documentation mentions that
"Job stores must never be shared between schedulers"
but doesn't discuss the problems behind that. Can someone please explain them?
Also, if I deploy a Django application containing APScheduler in production, will a separate job store be created for each worker process?
There are multiple reasons for this. In APScheduler 3.x, schedulers do not have any means to signal each other about changes happening in the job stores. When a scheduler starts, it queries the job store for jobs due for execution, processes them, and then asks how long it should sleep until the next due job. If another scheduler then adds a job that is due before that wake-up time, the first scheduler will happily sleep past it, because there is no mechanism by which it could receive a notification about the new (or updated) job.
Additionally, schedulers do not have the ability to enforce the maximum number of running instances of a job since they don't communicate with other schedulers. This can lead to conflicts when the same job is run on more than one scheduler process at the same time.
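To make the failure mode concrete, here is a sketch of the setup the documentation warns against; the store URL and the job are placeholders, and in reality the two schedulers would live in separate worker processes:

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

def make_scheduler():
    # both processes build an identical scheduler pointing at the same store
    scheduler = BackgroundScheduler(
        jobstores={"default": SQLAlchemyJobStore(url="sqlite:///jobs.sqlite")}
    )
    scheduler.start()
    return scheduler

scheduler_a = make_scheduler()  # process A: sleeps until its next known job
scheduler_b = make_scheduler()  # process B: shares the same job store

# Scheduler A never hears about this job, may sleep straight past its due time,
# and nothing stops both schedulers from running the same job concurrently.
scheduler_b.add_job(print, "interval", seconds=10, args=["hello"])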
These shortcomings are addressed in the upcoming 4.x series and the ability to share job stores could be considered one of its most significant new features.

Best approach to tackle long polling on the server side

I have a use case where I need to poll an API every second (basically an infinite while loop). The polling is initiated dynamically by a user through an external system, which means there can be multiple polling jobs running at the same time. A polling job completes when the API returns 400. Anyway, my current implementation looks something like this:
A Flask app deployed on Heroku.
The Flask app has an endpoint which the external system calls to start polling.
That endpoint adds a message to a queue, and as soon as a worker picks it up, it starts polling. I am using the Heroku Redis To Go add-on; under the hood it uses python-rq and Redis.
The problem is that when one polling process goes on for a long time, the other jobs just sit in the queue. I want to be able to do all of the polling concurrently.
What's the best approach to tackle this problem? Fire up multiple workers?
What if there could potentially be more than 100 concurrent processes?
You could implement a "weighted"/priority queue. There may be multiple ways of implementing this, but the simplest example that comes to my mind is using a min or max heap.
You should keep track of how many events are in the queue for each process; as the number of queued events for one process grows, the weight of its newly inserted events should decrease. Every time an event is processed, you move on to the pending event with the greatest weight.
P.S. More workers will also speed up the work.
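A minimal sketch of that weighted-queue idea using Python's heapq (the class and names are illustrative, not a drop-in for python-rq):

import heapq
from collections import defaultdict
from itertools import count

class WeightedQueue:
    def __init__(self):
        self._heap = []                   # (priority, tie_breaker, process_id, event)
        self._pending = defaultdict(int)  # events currently queued per process
        self._order = count()             # tie-breaker so equal priorities stay FIFO

    def push(self, process_id, event):
        # the more events a process already has queued, the later it is served
        # (heapq is a min-heap, so a larger priority number means lower weight)
        priority = self._pending[process_id]
        self._pending[process_id] += 1
        heapq.heappush(self._heap, (priority, next(self._order), process_id, event))

    def pop(self):
        priority, _, process_id, event = heapq.heappop(self._heap)
        self._pending[process_id] -= 1
        return process_id, event

A worker loop would pop from this queue and run one polling iteration per event, so no single long-running process can starve the others.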

Spread Celery Django tasks out over 24 hours

I have tasks that make a GET request to an API.
I have around 70,000 requests that I need to make, and I want to spread them out over 24 hours, so that not all 70k requests run at, for example, 10 AM.
How would I do that with Celery and Django? I have been searching for hours but can't find a good, simple solution.
The database has a list of games that need to be refreshed. Currently I have a cron job that creates tasks every hour. But is it better to create a task for every game and make it repeat every hour?
The typical approach is to send tasks whenever you need some work done, no matter how many there are (even hundreds of thousands). The execution, however, is controlled by how many workers (and worker processes) you have subscribed to a dedicated queue. The key here is the dedicated queue: that is a common way of not letting all workers immediately start executing the newly created tasks. This goes beyond basic Celery usage; you need to use celery multi for this use case, or create two or more separate Celery workers manually with different queues.
If you do not want to over-complicate things, you can keep your current setup but give these tasks the lowest priority, so that if any new, more important task gets created, it will be executed first. The problem with this approach is that, as far as I know, only the Redis and RabbitMQ brokers support priorities.
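As one concrete way to combine the existing hourly cron with a dedicated queue, here is a hedged sketch (the task, the queue name and the Game model are placeholders): each hourly run enqueues roughly 1/24th of the games onto a queue consumed by its own worker(s), so the ~70k requests end up spread across the day.

from celery import shared_task

@shared_task
def refresh_game(game_id):
    ...  # placeholder for the GET request that refreshes one game

@shared_task
def enqueue_hourly_batch(hour):
    # run once per hour (cron or celery beat); each run handles the slice of
    # games whose id falls into this hour's bucket, i.e. about 70k / 24 tasks
    game_ids = Game.objects.values_list("id", flat=True)
    for game_id in game_ids:
        if game_id % 24 == hour:
            refresh_game.apply_async(args=[game_id], queue="game_refresh")

A worker started with -Q game_refresh then works through that hour's batch at whatever rate its concurrency allows.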

How to ensure task execution order per user using Celery, RabbitMQ and Django?

I'm running Django, Celery and RabbitMQ. What I'm trying to achieve is to ensure that tasks related to one user are executed in order (specifically, one at a time; I don't want task concurrency per user):
whenever a new task is added for a user, it should depend on the most recently added task. Additional functionality might include not adding a task to the queue if a task of this type is already queued for this user and has not yet started.
I've done some research and:
I couldn't find a way to link a newly created task with an already queued one in Celery itself; chains seem to only be able to link new tasks.
I think both functionalities could be implemented with a custom RabbitMQ message handler, though it might be hard to code after all.
I've also read about celery-tasktree, and this might be the easiest way to ensure execution order, but how do I link a new task with a task tree or queue that has already been applied_async? Is there any way I could implement the additional no-duplicates functionality using this package?
Edit: There is also this "lock" example in the Celery cookbook, and while the concept is fine, I can't see a way to make it work as intended in my case: if I can't acquire the lock for a user, the task has to be retried, but that means pushing it to the end of the queue.
What would be the best course of action here?
If you configure the celery workers so that they can only execute one task at a time (see worker_concurrency setting), then you could enforce the concurrency that you need on a per user basis. Using a method like
NUMBER_OF_CELERY_WORKERS = 10

def get_task_queue_for_user(user):
    # map every user deterministically onto one of the per-user queues
    return "user_queue_{}".format(user.id % NUMBER_OF_CELERY_WORKERS)
to get the task queue based on the user id, all of a given user's tasks will be assigned to the same queue. Each worker would need to be configured to consume tasks from a single queue only.
It would play out like this:
User 49 triggers a task
The task is sent to user_queue_9
When the one and only celery worker that is listening to user_queue_9 is ready to consume a new task, the task is executed
This is a hacky answer though, because
requiring just a single celery worker for each queue is a brittle system -- if the celery worker stops, the whole queue stops
the workers are running inefficiently
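Tying it together, a sketch of how tasks would be routed with that helper (process_user_event and the worker command are illustrative, not from the question):

from celery import shared_task

NUMBER_OF_CELERY_WORKERS = 10

def get_task_queue_for_user(user):
    return "user_queue_{}".format(user.id % NUMBER_OF_CELERY_WORKERS)

@shared_task
def process_user_event(user_id, payload):
    ...  # whatever per-user work needs to stay serialized

def enqueue_for_user(user, payload):
    # all of this user's tasks land on the same queue; with one worker per
    # queue running at concurrency 1, they execute strictly one at a time
    process_user_event.apply_async(args=[user.id, payload],
                                   queue=get_task_queue_for_user(user))

# each worker consumes exactly one queue, e.g.:
#   celery -A proj worker -Q user_queue_9 --concurrency=1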
