Heroku scheduled task running every 10 minutes, scheduled once an hour - python

So I've just pushed my Twitter bot to Heroku and set it to run every hour on the half hour with the Heroku Scheduler add-on. However, for whatever reason it's running every 10 minutes instead. Is this a bug with the scheduler? Here's an excerpt of my logs from when the scheduler ran it successfully and then tried to run it again ten minutes later:
2013-01-30T19:30:20+00:00 heroku[scheduler.4875]: Starting process with command `python ff7ebooks.py`
2013-01-30T19:30:21+00:00 heroku[scheduler.4875]: State changed from starting to up
2013-01-30T19:30:24+00:00 heroku[scheduler.4875]: Process exited with status 0
2013-01-30T19:30:24+00:00 heroku[scheduler.4875]: State changed from up to complete
2013-01-30T19:34:34+00:00 heroku[web.1]: State changed from crashed to starting
2013-01-30T19:34:42+00:00 heroku[web.1]: Starting process with command `python ff7ebooks.py`
2013-01-30T19:34:44+00:00 heroku[web.1]: Process exited with status 0
2013-01-30T19:34:44+00:00 heroku[web.1]: State changed from starting to crashed
I can provide whatever info anyone needs to help diagnose this issue. The [web.1] log messages repeat every couple of minutes, and I don't want to spam my followers.

If anyone else has this issue, I figured it out. I enabled the scheduler and then scaled to 0 dynos, so that a dyno is only allocated when the task is scheduled to run. Before that, Heroku was running my process continuously, and (my assumption is that) Twitter only let it connect to a socket every few minutes, which resulted in the sporadic tweeting.

I'd like to share the solution from someone who helped me with a one-off script (a Python script that starts, does its work, and exits, rather than running continuously).
If you have any questions, let me know and I will help you --> andreabalbo.com
Hi Andrea
I have also just created a random process-type in my Procfile:
tmp-process-type: command:test
I did not toggle on the process-type in the Heroku Dashboard. After
installing the Advanced Scheduler, I created a trigger with command
"tmp-process-type" that runs every minute. Looking at my logs I can
see that every minute a process started with "command:test",
confirming that the process-type in the Procfile is working. I then
toggled on the process-type in the Heroku Dashboard. This showed up
immediately in my logs:
Scaled to tmp-process-type#1:Free web#0:Free by user ...
This is because after toggling, Heroku will spin up a normal dyno that
it will try to keep up. Since your script is a task that ends, the
dyno dies and Heroku will automatically restart it, causing your task
to be run multiple times.
In summary, the following steps should solve your problem:
1. Toggle your process-type off (but leave it in the Procfile)
2. Install advanced-scheduler
3. Create a trigger (recurring or one-off) with command "tmp-process-type"
4. Look at your logs to see if anything weird shows up
With kind regards, Oscar

Fixed this problem with only one action in the end:
I set the number of workers to 0.
The scheduler still has "python ELO_voetbal.py" set as its command and automatically starts a worker for that,
so I did not use Advanced Scheduler or place "tmp-process-type" anywhere.

Related

Python/Django Prevent a script from being run twice

I have a big script that retrieves a lot of data via an API (Magento invoices). This script is launched by a cron job every 20 minutes, but sometimes we need to refresh manually to get the latest invoiced orders. I have a dedicated page for this.
I would like to prevent manual launches of the script while it is already running, because both the API and the script take a lot of resources and time.
I tried adding a "process" model with an is_active = True/False field that is checked to avoid re-launching the script if it is already active. At the start of the script I set the status to True, and I set it back to False when the script has finished.
But it seems that the second instance of the script waits for the first to finish before starting, so in the end both scripts run because process.is_active is always False when it is checked.
I also tried with a request.session variable, but hit the same issue.
I've spent a lot of time on this but haven't found a way to achieve my goal.
Has anyone already faced such a case?
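One common way to make this kind of guard reliable is an OS-level file lock instead of a plain read-then-write on a flag, since taking the lock is atomic and it is released automatically if the script crashes. Below is a minimal sketch, assuming a Unix host; the lock-file path and messages are illustrative, not from the question:
import fcntl
import sys

# Open (or create) a lock file; the path is arbitrary as long as it is stable.
lock_file = open("/tmp/magento_invoice_import.lock", "w")
try:
    # Take an exclusive, non-blocking lock; this fails immediately if another
    # instance of the script already holds it.
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    print("Import already running, not starting a second instance.")
    sys.exit(0)

# ... long-running API import goes here ...
# The lock is released automatically when the process exits.
Unlike the is_active column, nothing has to be reset at the end, so a crash mid-import cannot leave the guard stuck in the "running" state.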

Heroku how to see logs of clock process

I recently implemented a clock process in my Heroku app (Python) to scrape data into a database every X hours. In the script I have a line that is supposed to send me an email when the scraping begins and again when it ends. Today was the first day it was supposed to run at 8 AM UTC, and it seems to have run perfectly fine, as the data on my site has been updated.
However, I didn't receive any emails from the scraper, so I was trying to find the logs for that specific dyno to see if it hinted at why the email wasn't sent. However I am unable to see anything that even shows the process ran this morning.
With the below command all I see is that the dyno process is up as of my last Heroku deploy. But there is nothing that seems to suggest it ran successfully today... even though I know it did.
heroku logs --tail --dyno clock
yields the following output, which corresponds to the last time I deployed my app to Heroku.
2021-04-10T19:25:54.411972+00:00 heroku[clock.1]: State changed from up to starting
2021-04-10T19:25:55.283661+00:00 heroku[clock.1]: Stopping all processes with SIGTERM
2021-04-10T19:25:55.402083+00:00 heroku[clock.1]: Process exited with status 143
2021-04-10T19:26:07.132470+00:00 heroku[clock.1]: Starting process with command `python clock.py --log-file -`
2021-04-10T19:26:07.859629+00:00 heroku[clock.1]: State changed from starting to up
My question is, is there any command or place to check on Heroku to see any output from my logs? For example any exceptions that were thrown? If I had any PRINT statements in my clock-process, where would those be printed to?
Thanks!
Although this is not the full answer, the Ruby gem 'ruby-clock' gives us an insight from its developer:
Because STDOUT does not flush until a certain amount of data has gone
into it, you might not immediately see the ruby-clock startup message
or job output if viewing logs in a deployed environment such as Heroku
where the logs are redirected to another process or file. To change
this behavior and have logs flush immediately, add $stdout.sync = true
to the top of your Clockfile.
So I'm guessing it has something to do with flushing STDOUT when logging, although I am not sure how to do that in Python.
I did a quick search and found this Stack Overflow post.
Namely:
In Python 3, print can take an optional flush argument:
print("Hello, World!", flush=True)
In Python 2, after calling print, do:
import sys
sys.stdout.flush()
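For a long-lived clock process, the closer equivalent of ruby-clock's $stdout.sync = true is to make stdout line-buffered once at startup instead of passing flush=True to every print. A minimal sketch, assuming Python 3.7+ (for reconfigure()) and the clock.py entry point from the question:
import sys

# Flush stdout/stderr after every line so print output reaches the Heroku log
# router immediately instead of sitting in a buffer.
sys.stdout.reconfigure(line_buffering=True)
sys.stderr.reconfigure(line_buffering=True)

print("scrape job starting")  # now shows up right away in `heroku logs --dyno clock`
Setting the PYTHONUNBUFFERED=1 config var on the app (or running the script with python -u) has the same effect without touching the code.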

TaskManager: Process is Terminated At Exactly 2 Hours

I'm running a Python script remotely from a task machine, and it creates a process that is supposed to run for 3 hours. However, it seems to be terminating prematurely at exactly 2 hours. I don't believe it is a problem with the code, because I log to a log file after the while loop ends, and the log file never shows that it exits that while loop successfully. Is there a specific setting on the machine that I need to look into that's interrupting my Python process?
Is this perhaps a Scheduled Task? If so, have you checked the task's properties?
On my Windows 7 machine under the "Settings" tab is a checkbox for "Stop the task if it runs longer than:" with a box where you can specify the duration.
One of the suggested durations on my machine is "2 hours."

Celery worker stops after being idle for a few hours

I have a Flask app that uses WSGI. For a few tasks I'm planning to use Celery with RabbitMQ. But as the title says, I am facing an issue where the Celery worker runs tasks for a few minutes, and then after a long period of inactivity it just dies off.
Celery config:
CELERY_BROKER_URL = 'amqp://guest:guest@localhost:5672//'
BROKER_HEARTBEAT = 10
BROKER_HEARTBEAT_CHECKRATE = 2.0
BROKER_POOL_LIMIT = None
From this question, I added BROKER_HEARTBEAT and BROKER_HEARTBEAT_CHECKRATE.
I run the worker inside the venv with celery -A acmeapp.celery worker & to run it in the background. While checking the status during the first few minutes, it shows that one node is online and gives an OK response. But after a few hours of the app being idle, when I check the Celery status it shows Error: No nodes replied within time constraint.
I am new to Celery and I don't know what to do now.
Your Celery worker might be trying to reconnect to the broker until it reaches the retry limit. If that is the case, setting these options in your config file will fix the problem.
BROKER_CONNECTION_RETRY = True
BROKER_CONNECTION_MAX_RETRIES = 0
The first line will make it retry whenever it fails, and the second one will disable the retry limit.
If that solution is not enough for you, you can also try a higher connection timeout (specified in seconds) for your app using this option:
BROKER_CONNECTION_TIMEOUT = 120
Hope it helps!
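For a Flask project laid out like the one in the question (acmeapp.celery), those options can be applied where the Celery app is created. A minimal sketch using the newer lowercase setting names, which correspond to the uppercase ones above; the app name and broker URL are taken from the question:
from celery import Celery

celery = Celery("acmeapp", broker="amqp://guest:guest@localhost:5672//")
celery.conf.update(
    broker_heartbeat=10,              # send a heartbeat every 10 seconds
    broker_heartbeat_checkrate=2.0,   # check for missed heartbeats twice per interval
    broker_pool_limit=None,
    broker_connection_retry=True,     # reconnect whenever the broker link drops
    broker_connection_max_retries=0,  # 0 = keep retrying forever
    broker_connection_timeout=120,    # connect timeout in seconds
)
With the retry limit disabled, a worker that loses its RabbitMQ connection during a long idle period keeps reconnecting instead of silently dropping off.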

Heroku Scheduler doesnt change DB automatically - Script works if executed manually

I am using the free Heroku Scheduler add-on for two tasks.
(Small tasks that take less than 10 minutes should be easily handled by Heroku Scheduler.)
In the logs I can see Heroku Scheduler executing, but it seems not to do anything.
Here is the task (my_task.py):
from alchemy import db_session
from models import User

all_users = User.query.all()
for user in all_users:
    if user.remind_email_sent == False:
        print "SEND REMIND EMAIL HERE", user.id
        setattr(user, "remind_email_sent", True)
db_session.commit()
Here is the task as set up in Heroku Scheduler:
The logs show Heroku Scheduler being executed (but I see no prints):
Aug 10 14:31:26 monteurzimmer-test app[api] notice Starting process with command `heroku run python my_task.py` by user scheduler#addons.heroku.com
Aug 10 14:31:32 monteurzimmer-test heroku[scheduler] notice Starting process with command `heroku run python my_task.py`
Aug 10 14:31:33 monteurzimmer-test heroku[scheduler] notice State changed from starting to up
Aug 10 14:31:34 monteurzimmer-test heroku[scheduler] notice State changed from up to complete
Aug 10 14:31:34 monteurzimmer-test heroku[scheduler] notice Process exited with status 127
EDIT:
Okay, there is indeed an error; I was not displaying info-level messages in the logs. The error shows up here:
Aug 10 15:02:32 monteurzimmer-test app[scheduler] info bash: heroku: command not found
If I run the script manually (heroku run python my_task.py) through the Heroku CLI, it works fine: the flag is set to True for all of the currently 6 users.
So why does the scheduler not work here? There is indeed an error, see the EDIT.
Right now it is a small test database; in the future the plan is to send an email to each user instead of the print, and there will be a few hundred users.
After I found the error, the solution is actually very easy. It tells me:
bash: heroku: command not found
This means bash on the dyno is already executing your command, and the heroku CLI is not available there. You are already in your project folder, so simply remove heroku run and that's it:
python my_task.py
As a note here: it took me a very long time to find the error, because I assumed there was none; I never displayed the info-level entries in the logs (I use LogDNA). So in cases like this you should look at everything in the logs.
