A normal approach to cron jobs with a django site would be to use cron to run custom management commands periodically.
But I found this http://code.google.com/p/django-cron/
How does it work, without needing cron? What invokes it to poll?
If it just sets up a URL for an HTTP request to hit periodically, what happens if the job takes a long time? Won't the server time out?
It continually fires off a Timer thread, whose whole purpose is to wait a defined amount of time (the polling frequency you set in settings.py) and then run the execute on the django-cron queue again.
It depends on Django being a long-lived process, which it is if configured correctly. It runs a thread that checks every 5 minutes (by default) for jobs that need to be run and, if there are any, runs them.
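The re-arming Timer pattern described in these two answers can be sketched roughly as follows. This is a minimal illustration, not django-cron's actual code: the PollingRunner class, its execute and stop methods, and the (callable, run_every_seconds) job format are all hypothetical names chosen for the example.

```python
import threading
import time


class PollingRunner:
    """Hypothetical helper: polls a job list, then re-arms a Timer for the next poll."""

    def __init__(self, interval_seconds, jobs):
        self.interval_seconds = interval_seconds  # polling frequency
        self.jobs = jobs                          # list of (callable, run_every_seconds)
        self._last_run = {}                       # job index -> monotonic time of last run
        self._lock = threading.Lock()
        self._timer = None
        self._stopped = False

    def execute(self):
        # Run every job whose own interval has elapsed since its last run.
        now = time.monotonic()
        for index, (job, run_every) in enumerate(self.jobs):
            last = self._last_run.get(index)
            if last is None or now - last >= run_every:
                job()
                self._last_run[index] = now
        # Fire off a new Timer thread that will call execute() again later.
        with self._lock:
            if not self._stopped:
                self._timer = threading.Timer(self.interval_seconds, self.execute)
                self._timer.daemon = True  # do not keep the process alive on exit
                self._timer.start()

    def stop(self):
        # Prevent further re-arming and cancel the pending Timer, if any.
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()
```

Calling execute() once (for example, when the process starts) is enough to begin polling. Because the Timer lives inside the web server process, the jobs only run for as long as that process stays alive.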
Related
I am using CherryPy to create an application that takes user input and manipulates that data. Basically, it executes a time-consuming script and, when all that is done, displays a new page. My problem is that by the time my script finishes executing, the browser has lost the connection and displays
"The page at myexample.com isn't working" or "No data received", although the whole script doesn't take more than a minute to execute. Any leads on how to go about this would be appreciated.
CherryPy is a multi-threaded Python web server. Because of the Python GIL, you cannot run a time-consuming script while answering a request without making CherryPy unresponsive to new users for as long as your script is running.
You need to run your time-consuming script in a separate Python process. The best way to do this is with a queue manager like Celery or RQ.
See this answer for a detailed example of how to do this with CherryPy.
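To illustrate the "separate process" idea without a queue manager, here is a minimal sketch that uses only the standard library's subprocess module instead of Celery or RQ. The ScriptJobs class and its method names are hypothetical; a request handler would call start() and return immediately, and a later request would call status() with the job id.

```python
import subprocess
import sys
import uuid


class ScriptJobs:
    """Hypothetical registry of scripts launched as separate OS processes."""

    def __init__(self):
        self._procs = {}  # job id -> subprocess.Popen

    def start(self, args):
        # Launch the script and return at once; the web request does not wait for it.
        job_id = uuid.uuid4().hex
        self._procs[job_id] = subprocess.Popen(args)
        return job_id

    def status(self, job_id):
        # poll() returns None while the child is still running.
        returncode = self._procs[job_id].poll()
        if returncode is None:
            return "running"
        return "done" if returncode == 0 else "failed"

    def wait(self, job_id, timeout=None):
        # Block until the child exits (mainly useful in scripts and tests).
        return self._procs[job_id].wait(timeout=timeout)
```

For example, jobs.start([sys.executable, "my_script.py"]) would launch a script (my_script.py is a placeholder name), and the page could poll status() until it reports "done". A queue manager adds persistence, retries, and multiple workers on top of this basic idea.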
In my Django project, I need to collect data from about 50 remote servers into the local database every minute or every 30 seconds. Although this works with crontab on the remote servers, I want to do it within the project. I first considered django-celery. However, it is geared toward asynchronous processing, and the data-collection task cannot be delayed, so I think it may not be a good fit. What if I do this with a Python timer, and what should I pay particular attention to? Excuse my ignorance of Python and Django. I'd appreciate any other advice or ideas. Many thanks.
Basically, you can use Celery's periodic tasks with the expires option, which ensures that your tasks will not be executed twice.
You could also run your own script with an infinite loop that runs the calculation. If the calculation takes more than a minute, you can spawn the tasks using eventlet or gevent. Another option is to create Celery tasks from this script, so that your tasks are executed every N seconds, as you prefer.
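The "own script with an infinite loop" option might look roughly like the sketch below. It is plain Python rather than eventlet or gevent, and the run_every function with its iterations, clock, and sleep parameters is a hypothetical helper for illustration. It schedules each run relative to the previous deadline so that the time spent in the task does not accumulate as drift.

```python
import time


def run_every(interval_seconds, task, iterations=None,
              clock=time.monotonic, sleep=time.sleep):
    """Call task() every interval_seconds; iterations=None means loop forever."""
    next_run = clock()
    count = 0
    while iterations is None or count < iterations:
        task()
        count += 1
        if iterations is not None and count >= iterations:
            break
        # Aim for the next slot on a fixed grid, not "now + interval".
        next_run += interval_seconds
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        else:
            # The task overran its slot: start the next run immediately
            # and restart the grid from the current time.
            next_run = clock()
    return count
```

With interval_seconds=30 and a task that takes 5 seconds, the loop sleeps about 25 seconds between runs. If a run regularly takes longer than the interval, a single loop like this cannot keep up, which is where spawning the work concurrently comes in.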
I don't need the threads to be aware of each other. They just need to perform a task that shouldn't take more than two or three seconds, tops. What can I do to guarantee that the thread will not be killed before the task is completed? Also, I occasionally need to use a timer thread. The timer is only for a minute, but I'm nervous about that being too long for Apache.
Why not start these threads in the background? Why do they need to be part of the web server? I would suggest that you write some scripts that either sit idle in the background all the time or are called periodically by a cron job. The Python scripts could look up info in the database, or even use a file to indicate what they need to do, then run and exit.
I have found http://code.activestate.com/recipes/576451-how-to-create-a-windows-service-in-python/
But that service does nothing. How can I use that service to run a specific Python file?
You can put your business code in SvcDoRun. The sample at your link just logs a message every three seconds. Just don't forget to check self.hWaitStop periodically.
Sometimes it is convenient to create a worker thread and do all work on that thread, or maybe start a child process. An additional complication in this case is that you have to think about synchronization.
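The "do the work, but keep checking the stop handle" loop can be sketched in portable Python as below. Here a threading.Event stands in for self.hWaitStop so the example runs anywhere; service_main and do_work are hypothetical names. In the actual pywin32 service, the equivalent check would be a call to win32event.WaitForSingleObject on self.hWaitStop with a timeout in milliseconds.

```python
import threading


def service_main(stop_event, do_work, poll_seconds=3.0):
    """Sketch of a SvcDoRun-style body: run do_work until a stop is requested."""
    runs = 0
    # Event.wait() returns False on timeout and True once the event is set,
    # so the loop sleeps between work items but reacts promptly to a stop.
    while not stop_event.wait(poll_seconds):
        do_work()  # business code goes here
        runs += 1
    return runs
```

Keeping each do_work call short matters: the stop request is only noticed between calls, so a long-running call delays the service shutdown by that much.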
Not talking about the delay method.
I want to be able to get a task, given its task_id, and change its ETA on the fly, before it is executed.
For now I have to cancel it and schedule a new one. That is troublesome if the scheduled process involves a lot of work.
You should store some 'pause' value outside of the Celery task queue. I do this with a mailer that uses Celery. I can pause parts of the system by setting values in either memcache or MySQL. The tasks then query the outside resource before doing any work. If a task is meant to be paused, it calls task.retry(), which sends it back through the retry delay.
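The check-then-retry flow can be sketched without Celery as follows. Everything here is a stand-in for illustration: pause_flags plays the role of memcache or MySQL, RetryLater plays the role of the exception raised by a Celery retry, and send_mail_task is a hypothetical task. In a real bound Celery task, the raise line would be a call to the task's retry method with a countdown.

```python
class RetryLater(Exception):
    """Stand-in for the exception a Celery task raises when it retries."""

    def __init__(self, countdown):
        super().__init__(f"retry in {countdown}s")
        self.countdown = countdown


# Stand-in for the external store (memcache, MySQL, ...).
pause_flags = {}


def is_paused(name):
    # Query the outside resource before doing any work.
    return bool(pause_flags.get(name, False))


def send_mail_task(recipient, retry_delay=60):
    """Hypothetical task body that honours an external pause flag."""
    if is_paused("mailer"):
        # Paused: do no work now and ask to be run again after retry_delay.
        raise RetryLater(retry_delay)
    return f"sent to {recipient}"
```

Setting pause_flags["mailer"] = True makes every call back off; clearing the flag lets the next retry go through. Changing the flag never touches the queued tasks themselves, which is why it avoids the cancel-and-reschedule step.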
Assuming you are using django-celery and PeriodicTask with DatabaseScheduler, you need to modify your PeriodicTask interval or crontab and save it. If your task is defined by an interval, modify the last_run_at property.
You run celerybeat with the database scheduler with:
python manage.py celerybeat -S djcelery.schedulers.DatabaseScheduler