The docs for the python dev server say this about running tasks:
When your app is running in the
development server, task queues are
not processed automatically. Instead,
task queues accrue tasks which you can
examine and execute from the developer
console...
But the release notes for version 1.3.4 of the python sdk (which I am using) say:
Auto task execution is now enabled in
the dev_appserver. To turn this off
use the flag --disable_task_running.
So maybe the docs are a little behind, right? Except when I go to "http://localhost:8080/_ah/admin/tasks?queue=default", I see this:
Tasks will not run automatically. Push the 'Run' button to execute each task.
Can tasks be run automatically or not? If so, what is the trick?
It seems the problem was that I was running the dev server with python 2.6 instead of 2.5. When using 2.5, everything worked.
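For reference, the invocation that worked looked roughly like this (a sketch, assuming a Python 2.5 interpreter is available as python2.5 and myapp/ is the application directory):

python2.5 dev_appserver.py myapp/

With the SDK 1.3.4 dev server started this way, queued tasks run automatically; per the release notes quoted above, that behaviour can be turned off again with --disable_task_running.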
I have some Python scripts that I would like to run daily from a Windows PC.
My current workflow is:
The desktop PC stays on all day, every day, except for a weekly restart over the weekend
After the restart I open VS Code and run a little bash script ./start.sh that kicks off the tasks.
The above works reasonably well, but it is also fairly painful. I need to re-run start.sh if I ever close VS Code (e.g. for an update). Also, the processes use some local Python libraries, so I need to stop them whenever I want to update those libraries.
With regard to how to do this properly, four tools came to mind:
Windows Scheduler
Airflow
Prefect (https://www.prefect.io/)
Rocketry (https://rocketry.readthedocs.io/en/stable/)
However, I can't quite get my head around the fundamental issue: if Prefect/Airflow/Rocketry run on my PC, then nothing will restart them after the PC reboots. I'm also not sure they will give me the isolation I'd prefer.
Docker also came to mind: I could put each task into a Docker image and run them via some form of Docker Swarm or something like that. But I'm not sure if I'm reinventing the wheel.
I'm 100% sure I'm not the first person in this situation. Could anyone point me to a guide on how this could be done well?
Note:
I am not considering running the Python scripts in the cloud. They interact with local tools that are only licensed for my PC.
You can definitely use Prefect for that - it's very lightweight and seems to match what you're looking for. You install it with pip install prefect, start the Orion API server with prefect orion start, and once you create a Deployment and start an agent with prefect agent start -q default, you can even configure the schedule from the UI.
For more information about Deployments, check our FAQ section.
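For reference, a minimal flow might look like the sketch below (the flow and task names are illustrative; the Deployment, agent, and schedule are then configured as described above):

from prefect import flow, task

@task
def run_daily_job():
    # placeholder for the work one of the daily scripts actually does
    print("running the daily job")

@flow
def daily_flow():
    run_daily_job()

if __name__ == "__main__":
    daily_flow()  # runs once locally; the Deployment + agent handle the daily schedule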
It sounds like Rocketry could also be suitable. Rocketry can shut itself down using a task. You could write a task that:
Runs on the main thread and process (blocking new tasks from starting)
Waits for or terminates all the currently running tasks (using the session)
Calls session.shut_down(), which sets a shutdown flag on the scheduler.
There is also an app configuration option, shut_cond, which is simply a condition. If this condition is true, the scheduler exits, so alternatively you can use that.
Then, after the line app.run(), you simply have a line that runs the shutdown -r (restart) command in a shell, using the subprocess library for example (see the sketch below). Then you need something that starts Rocketry again when the restart is completed. For this, perhaps this could be an answer: https://superuser.com/a/954957, or use the Windows scheduler to have a simple startup task that starts Rocketry.
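A rough sketch of that idea (assuming Rocketry's Session special argument and the Windows shutdown command; the daily condition string and the names are illustrative):

import subprocess
from rocketry import Rocketry
from rocketry.args import Session

app = Rocketry()

@app.task("daily", execution="main")
def shut_down_scheduler(session=Session()):
    # Runs on the main thread/process, so no new tasks start in the meantime.
    # Currently running tasks could also be waited on or terminated via the session first.
    session.shut_down()

if __name__ == "__main__":
    app.run()
    # app.run() returns once the scheduler has shut down; now restart the machine.
    subprocess.run(["shutdown", "-r", "-t", "0"])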
Especially if you had Linux machines (Raspberry Pis, for example), you could integrate Rocketry with FastAPI and make a small cluster in which the Rocketry apps communicate with each other; just set up the Rocketry script as a startup service. One machine could be a backup that calls another machine's API, which runs the Linux restart command. The backup then executes tasks until the primary machine answers requests again (i.e. is up and running).
As the author of the library, I'm possibly biased toward my own projects, but Rocketry is very capable with complex scheduling problems; that's the purpose of the project.
You can use schtasks on Windows to schedule tasks such as running a bash script or a Python script, and it's pretty reliable too.
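For example, to run a script every day at 06:00 (the task name and paths are placeholders):

schtasks /Create /SC DAILY /ST 06:00 /TN "DailyPythonJob" /TR "C:\Python39\python.exe C:\scripts\daily_job.py"

Task Scheduler can also trigger tasks at startup (/SC ONSTART), which covers restarting the jobs after a reboot.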
I am running Python 3.9, Windows 10, Celery 4.3, Redis as the backend, and AWS SQS as the broker (I wasn't intending to use a backend, but it became more and more apparent that, due to the library's restrictions on Windows, I'd be better off using one if I could get it to work; otherwise I would've just used Redis as both the broker and the backend).
To give you some context, I have a webpage that a user interacts with to run a resource-intensive task. If the user has a task running and decides to resend the task, I need to kill the running task and use the new information sent by the user to create a new one.
The problem for me arrives after this line of thinking:
Me: "Hmmm, the prefork pool is used for heavy cpu background tasks... I want to use that..."
Me: Goes and configures settings.py,
updates the celery library,
sets the environment variable to allow windows to run prefork pool -
os.environ.setdefault('FORKED_BY_MULTIPROCESSING', '1'),
sets a few other configuration settings, etc,
runs the worker and it works.
Me: "Hey, hey. It works... Oh, I still can't revoke a task DESPITE RUNNING THE PREFORK POOL!?!?!
Oh, that's okay... I can just set a session variable to let me know if the user already started a task,
and if they have, just have celery tell me if the task that they started is finished
before I allow the user to request to run a task again."
Me: Goes and configures django sessions,
configures redis,
updates the views to include the session variable, etc,
Me: "Great! Everything is working, so far..."
Me: Runs a test to see if the redis server returns the status...
Celery: "PENDING"
Me: "Yo! Is my task done, yet!?"
Celery: "No - PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Me: Searches stackoverflow for why it's only pending...
Me: Finds out that you must use --pool=solo for the worker...
Me: Dies on the inside.
Ideally - I'd like to be able to use the prefork pool to do intense processing and to kill the task if need be. The thing is that everything that I read tells me prefork is what I want, but solo is the only way I can think of to get it to work.
Questions:
How bad is it for me to compromise these desires and just go with solo, expecting that the tasks will be heavily CPU-bound and that there will be many users? Assume hundreds, if not thousands, submitting tasks at once.
What other solutions should I consider?
In my experience on Windows, I cannot use anything other than --pool=solo.
What other solutions should I consider?
The way I do it is to use the solo pool for Windows development and more on production (Linux); at least in my case, using the solo pool for development is fine (example commands below).
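Here proj is a placeholder for your Celery app module:

celery -A proj worker --pool=solo --loglevel=info
celery -A proj worker --pool=prefork --concurrency=8 --loglevel=info

The first form is what I use on Windows during development; the second is a typical Linux production invocation.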
I've got a small application (https://github.com/tkoomzaaskz/cherry-api) and I would like to integrate it with Travis. In fact, Travis is probably not important here. My question is: how can I configure a build/job to execute the following sequence?
start the server that serves the application
run tests
close the server (which means close the build)
The application is written in Python/CherryPy (a basic webapp framework). On my localhost I do it using two consoles: one runs the server and the other runs the tests - it's pretty easy and works fine. But when I want to execute all this in the CI environment, I run into trouble: I'm unable to regain control after the server is started, because the server process waits for requests... and waits... and waits... and the tests are never run (https://travis-ci.org/tkoomzaaskz/cherry-api/builds/10855029 - this build runs forever). Additionally, I don't know how to close the server. This is my .travis.yml:
before_script: python src/hello.py
script: nosetests
src/hello.py starts the built-in CherryPy server (listening on localhost:8080). I know I can move it to the background by adding an &: before_script: python src/hello.py & but then I'd have to find the process ID in the CI environment and kill the process, which seems like a very dirty solution, and I guess there's something better than that.
I'd appreciate any hints on how I can configure this.
Edit: I've configured this dirty run-in-the-background-and-then-kill-the-process approach in this file (sketched below). The build passes now. Still, I think it's ugly...
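For reference, the background-and-kill approach looks roughly like this (the sleep and the pid file name are illustrative; this assumes the server keeps running until it is killed):

before_script:
  - python src/hello.py & echo $! > server.pid
  - sleep 3
script:
  - nosetests
after_script:
  - kill $(cat server.pid)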
I need to debug Celery task from the Eclipse debugger.
I'm using Eclipse, PyDev and Django.
First, I open my project in Eclipse and put a breakpoint at the beginning of the task function.
Then, I'm starting the Celery workers from Eclipse by Right Clicking on manage.py from the PyDev Package Explorer and choosing "Debug As->Python Run" and specifying "celeryd -l info" as the argument. This starts MainThread, Mediator and three more threads visible from the Eclipse debugger.
After that I go back to the PyDev view and start the main application by right-clicking the project and choosing Run As/PyDev:Django.
My issue is that once the task is submitted by mytask.delay(), it doesn't stop at the breakpoint. I put some traces within the task's code, so I can see that it was executed in one of the worker threads.
So, how do I make the Eclipse debugger stop at the breakpoint placed within the task when it is executed in a Celery worker thread?
You should consider running the Celery task in the same thread as the main process (normally it runs in a separate process); this will make debugging much easier.
You can tell Celery to run the task synchronously by adding this setting to your settings.py module:
CELERY_TASK_ALWAYS_EAGER = True
# use this if you are on older versions of celery
# CELERY_ALWAYS_EAGER = True
Note: this is only meant to be used for debugging or development!
You can do it using Celery's rdb:
from celery.contrib import rdb
rdb.set_trace()
Then, in a different terminal, type telnet localhost 6900, and you will get the debug prompt.
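For context, a minimal sketch of where the set_trace() call would go (the add task here is hypothetical):

from celery import shared_task
from celery.contrib import rdb

@shared_task
def add(x, y):
    rdb.set_trace()  # the worker blocks here and logs which port to telnet into
    return x + y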
CELERYD_POOL defaults to celery.concurrency.prefork:TaskPool which will spawn separate processes for each worker and PyDev can't see inside them. If you change it to one of the threaded options then you can use the debugger.
For example, for Celery 3.1 you can use this setting:
CELERYD_POOL = 'celery.concurrency.threads:TaskPool'
Note that this requires the threadpool module to be installed.
Also make sure to have CELERY_ALWAYS_EAGER = False, otherwise changing the pool class makes no sense.
I created a management command to test the task... I find it easier than running it from the shell.
If it runs only on a different thread, it should work on the latest PyDev versions (I think there was an issue before where a spawned thread would not be debugged, but this was fixed).
Now, if it's launched in a different process, you need to use the remote debugger (even if it's on the same machine); see http://pydev.org/manual_adv_remote_debugger.html and the sketch below.
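A rough sketch of that approach, assuming the pydevd module bundled with PyDev is importable from the worker and that Eclipse's Debug Server is already listening on its default port (the path is illustrative):

import sys
sys.path.append('/path/to/eclipse/plugins/org.python.pydev/pysrc')  # illustrative path to PyDev's bundled debugger
import pydevd

def connect_to_eclipse():
    # Call this at the top of the Celery task you want to break into.
    # Start the Debug Server in Eclipse first (Debug perspective: PyDev > Start Debug Server); 5678 is its default port.
    pydevd.settrace('localhost', port=5678, stdoutToServer=True, stderrToServer=True)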
I have a WSGI server that uses Paste. For some unknown reason it often crashes, so I want an application, or just some package, that can help me solve this: when it crashes, automatically kill the process and restart it. Any advice is welcome.
I'd use your operating system's service integration to do that. For example, on Debian Linux there's start-stop-daemon; on Windows there's the service manager.
It's the proven, well-integrated way, provided by the operating system itself, to keep an application running.
Just make your installation program register your service with the native service management system.
You can use supervisord to run your service. It provides an autorestart option in the program configuration; you can refer to the autorestart section in this document.
To learn how to use it with Python, you can refer to my answer on this topic. A minimal example is sketched below.
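The program section could look like this (the program name and command are placeholders):

[program:mywsgi]
command=/usr/bin/python /path/to/paste_server.py
autostart=true
autorestart=true

With autorestart=true, supervisord restarts the process whenever it exits, which covers the crash-and-restart requirement.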