I am trying to build a Flask application on Windows where the user uploads a big Excel file that is then processed in Python, which takes 4-5 minutes. I need to process those tasks in the background after the user uploads the file.
I tried RQ, Celery, etc., but they are not working on Windows, and I have never worked on Linux. I need some advice on how to achieve this.
Celery and RQ can work on Windows, but both have some trouble there.
For RQ, use this,
and for Celery, use this.
I don't think it's accurate to say that you can't run RQ on Windows; it just has some limitations (as you can see in the documentation).
Since you can run Redis on Windows, you might want to try other Redis-based task queues. One such example is huey. There are at least examples of people who have run it successfully on Windows (e.g. look at this SO question).
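If it helps, here is a minimal sketch of what a huey setup could look like for your upload case (the module, queue name and function below are just illustrative, not from your code):

# tasks.py - requires `pip install huey` and a Redis instance reachable from Windows
from huey import RedisHuey

huey = RedisHuey('excel-processing', host='localhost')

@huey.task()
def process_excel(path):
    ...  # the 4-5 minute Excel processing goes here

# The Flask upload view just calls process_excel(saved_path), which only enqueues
# the job; the actual work happens in the consumer, started separately with:
#   huey_consumer.py tasks.huey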
I solved this by using WSL (Windows Subsystem for Linux) and running my RQ worker inside WSL.
I am not sure whether I will face any issues in the future, but as of now it is queuing and processing tasks as I want.
Might be useful for somebody with the same problem.
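In case it's useful, this is roughly how my pieces fit together (file and function names below are placeholders, not my real project): the Flask app enqueues the job from Windows, and the worker runs inside WSL against the same Redis instance.

# tasks.py
def process_excel(path):
    ...  # stands in for the 4-5 minute Excel processing

# Flask upload view (runs on Windows) enqueues the job:
from redis import Redis
from rq import Queue

q = Queue(connection=Redis())                # the same Redis the WSL worker uses
job = q.enqueue(process_excel, "uploads/report.xlsx")
print(job.get_id())                          # can be stored to check status later

# inside WSL, from the project directory:
#   rq worker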
I have a page where the user selects a Python script, and then this script executes.
My issue is that some scripts take a while to execute (up to 30 minutes), so I'd like to run them in the background while the user can still navigate the website.
I tried to use Celery, but as I'm on Windows I couldn't do better than using --pool=solo, which, while allowing the user to do something else, can only serve one user at a time.
I also saw this thread while searching for a solution, but didn't manage to really understand how it worked or how to implement it, nor could I determine whether it really answered my problem...
So here is my question: how can I get multiple threads/processes with Celery on Windows? Or, if there's another way, how can I execute several tasks simultaneously in the background?
Have you identified whether your slow scripts are CPU-bound or I/O-bound?
If they're I/O-bound, you can use the eventlet or gevent worker pools, based on Strategy 1 in the blog post from distributedpython.com.
But if they're CPU-bound, you may have to work around Celery's billiard issue on Windows by setting the environment variable FORKED_BY_MULTIPROCESSING=1 (for example on a dedicated Celery Windows box or a Windows Docker container), based on Strategy 2 in the same blog post. A minimal sketch of that second strategy is below.
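The sketch assumes a project called proj and a Redis broker (both just examples); the important part is setting the variable before the worker starts:

# proj/celery.py
import os

# Strategy 2: make billiard take the Windows-compatible code path
os.environ.setdefault('FORKED_BY_MULTIPROCESSING', '1')

from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

# start the worker as usual:
#   celery -A proj worker --loglevel=info
# for I/O-bound tasks, Strategy 1 instead uses an eventlet pool:
#   celery -A proj worker --pool=eventlet --concurrency=100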
I have a Django web application hosted on IIS. A subprocess should run consistently alongside the web application. When I run the application locally using
python manage.py runserver
the background task runs perfectly while the application is running. However, when hosted on IIS, the background task does not appear to run. How do I make the task run even when hosted on IIS?
In the manage.py file of Django I have the following code:
import subprocess

def run_background():
    # creationflags (plural) is the correct Popen argument name
    return subprocess.Popen(["python", "background.py"],
                            creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)

run_background()
execute_from_command_line(sys.argv)
What can be done to make the background task always run even on IIS?
Celery is a classic option for a background task manager.
https://pypi.org/project/celery/
Alternatively, I've used a library called schedule when I wanted something a little more lightweight. Note that schedule is still in its infancy. If you need something that will stay supported down the line, go with Celery to be safe.
https://pypi.org/project/schedule/
Without knowing the context of your project, I can't say which I would choose, but they're both good options for task management.
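For reference, a minimal sketch of what using schedule looks like (the job body is just a placeholder):

import time
import schedule

def background_job():
    print("running the long task...")   # placeholder for the real work

schedule.every(10).minutes.do(background_job)

# this loop needs its own long-lived process (not an IIS request handler)
while True:
    schedule.run_pending()
    time.sleep(1)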
On Windows, you can use Task Scheduler to automatically start your background process when Windows starts, using an arbitrary user account.
This was the "officially suggested" solution for Celery 3 on Windows until some years ago, and I believe it can be easily adapted to run any process.
You can find a detailed explanation here:
https://www.calazan.com/windows-tip-run-applications-in-the-background-using-task-scheduler/
I need a way to execute external long-running processes from a web app written in Django and Python.
Right now I'm using Supervisord and its API. My problem with this solution is that it's very static: I need to build the commands from my app instead of preconfiguring Supervisord with all possible commands, since both the command and its arguments are dynamic.
I need to execute the external process, save a pid/identifier, and later be able to check whether it's still alive and running, and stop it.
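For context, this is roughly how I check on and stop processes today through supervisord's XML-RPC interface (the port and program name are just examples); what I can't do this way is register a completely new command at runtime:

from xmlrpc.client import ServerProxy

# supervisord must expose its HTTP/XML-RPC interface ([inet_http_server] section)
server = ServerProxy('http://localhost:9001/RPC2')

info = server.supervisor.getProcessInfo('my_long_task')   # example program name
print(info['statename'], info['pid'])                     # e.g. RUNNING, 12345

if info['statename'] == 'RUNNING':
    server.supervisor.stopProcess('my_long_task')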
I've found https://github.com/mnaberez/supervisor_twiddler to add processes on the fly to supervisord. Maybe that's the best way to go?
Any other ideas how to best solve this problem?
I suggest you have a look at this post:
Processing long-running Django tasks using Celery + RabbitMQ + Supervisord + Monit
As the title says, there are a few additional components involved (mainly Celery and RabbitMQ), but these are good and proven technologies for this kind of requirement.
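To tie it back to your requirements: with Celery, the "identifier" you store is the task id rather than a pid, and you can check on or revoke the task later. A rough sketch (the task body and broker URL are placeholders):

import subprocess

from celery import Celery
from celery.result import AsyncResult

app = Celery('runner', broker='amqp://localhost', backend='rpc://')

@app.task
def run_command(cmd_args):
    # build the command dynamically instead of preconfiguring supervisord
    return subprocess.run(cmd_args, capture_output=True, text=True).returncode

# in the Django view: start the task and keep its id
result = run_command.delay(['mycommand', '--some-arg', 'value'])
task_id = result.id                      # store this in the database

# later: check whether it's still running, or stop it
res = AsyncResult(task_id, app=app)
print(res.state)                         # e.g. PENDING, STARTED, SUCCESS
res.revoke(terminate=True)               # terminate the task if still running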
I'd like to run periodic tasks in my Django project, but I don't want all the complexity of celery/django-celery (with celerybeat) bundled into my project.
I'd also like to store the config (the times and which command to run) in my SCM.
My production machine is running Ubuntu 10.04.
While I could learn and use cron, I feel like there should be a higher level (user friendly) way to do it. (Much like UFW is to iptables).
Is there such thing? Any tips/advice?
Thanks!
There are several Django-based scheduling apps, such as django-chronograph, django-chroniker and django-cron. I forked django-chronograph into django-chroniker to fix a few bugs and extend it for my own use case. I still use Celery in some projects, but as you point out, it's a bit overcomplicated and has a large stack.
In my personal opinion, I would learn how to use cron. This won't take more than 5 to 10 minutes, and it's an essential tool when working on a Linux server.
What you could do is set up a cron job that requests one page of your Django instance every minute, and have the Django view figure out what time it is and what needs to be done, depending on the configuration stored in your database. This is the approach I've seen in other similar applications.
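A rough sketch of that pattern, with a made-up URL and model (both would be whatever fits your project); the crontab line can live in your SCM next to the code:

# crontab entry (kept in your SCM, installed with `crontab mycrontab`):
# * * * * * curl -s http://localhost:8000/cron/tick/ > /dev/null

# views.py - the view runs whatever is due according to your own config table
from django.core.management import call_command
from django.http import HttpResponse
from django.utils import timezone

from myapp.models import ScheduledCommand   # hypothetical model: name + next_run

def cron_tick(request):
    now = timezone.now()
    for entry in ScheduledCommand.objects.filter(next_run__lte=now):
        call_command(entry.name)             # run the stored management command
        entry.schedule_next_run()            # hypothetical method that bumps next_run
        entry.save()
    return HttpResponse("ok")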
I want to give Celery a try. I'm interested in a simple way to schedule crontab-like tasks, similar to Spring's Quartz.
I see from Celery's documentation that it requires running celeryd as a daemon process. Is there a way to avoid running another external process and simply run this embedded in my Django instance? Since I'm not interested in distributing the work at the moment, I'd rather keep it simple.
Add the CELERY_ALWAYS_EAGER = True option to your Django settings file and all your tasks will be executed locally (synchronously, in the calling process). It seems that for periodic tasks you still have to run celery beat as well.
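A small sketch of what that looks like in the settings (the task path and schedule are just examples):

# settings.py
CELERY_ALWAYS_EAGER = True          # .delay() runs the task synchronously, in-process

# periodic tasks still need the beat scheduler running somewhere, e.g.:
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    'nightly-cleanup': {
        'task': 'myapp.tasks.cleanup',      # example task path
        'schedule': crontab(hour=3, minute=0),
    },
}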