I am using the free Heroku Scheduler add-on for two tasks.
(Small tasks that take less than 10 minutes should be easily handled by Heroku Scheduler.)
In the logs I see the Heroku Scheduler executing, but it does not seem to do anything.
Here is the task (my_task.py):
from alchemy import db_session
from models import User

all_users = User.query.all()
for user in all_users:
    if user.remind_email_sent == False:
        # Placeholder for the actual reminder email
        print "SEND REMIND EMAIL HERE", user.id
        setattr(user, "remind_email_sent", True)
db_session.commit()
Here is the task in Heroku Scheduler:
The logs show the Heroku Scheduler job being executed (but I see none of the prints):
Aug 10 14:31:26 monteurzimmer-test app[api] notice Starting process with command `heroku run python my_task.py` by user scheduler#addons.heroku.com
Aug 10 14:31:32 monteurzimmer-test heroku[scheduler] notice Starting process with command `heroku run python my_task.py`
Aug 10 14:31:33 monteurzimmer-test heroku[scheduler] notice State changed from starting to up
Aug 10 14:31:34 monteurzimmer-test heroku[scheduler] notice State changed from up to complete
Aug 10 14:31:34 monteurzimmer-test heroku[scheduler] notice Process exited with status 127
EDIT:
Okay, there is indeed an error; in the logs I was not displaying info-level messages. The error shows up here:
Aug 10 15:02:32 monteurzimmer-test app[scheduler] info bash: heroku: command not found
If I run the script manually (heroku run python my_task.py) through the Heroku CLI, it works fine; all items of the currently 6 users are set to True.
So why does the scheduler not work here? There is indeed an error, see the EDIT.
Right now it is a small test database; in the future it is planned to send an email to each user instead of the print, and there will be a few hundred users.
After I found the error, the solution is actually very easy. It tells me:
bash: heroku: command not found
This means the scheduler is already executing the command inside bash on a dyno, so the heroku CLI command is not available there. You are already in your project folder; simply remove heroku run and that's it:
python my_task.py
As a note here: it took me very long to find the error. I assumed there was no error, because I never displayed the info-level messages in the logs (I use LogDNA). So you should be looking at everything in the logs in such cases.
Related
I recently implemented a clock process in my Heroku app (Python) to scrape data into a database for me every X hours. In the script, I have a line that is supposed to send me an email when the scraping begins and again when it ends. Today was the first day it was supposed to run at 8 AM UTC, and it seems to have run perfectly fine, as the data on my site has been updated.
However, I didn't receive any emails from the scraper, so I was trying to find the logs for that specific dyno to see if they hinted at why the email wasn't sent. However, I am unable to see anything that even shows the process ran this morning.
With the command below, all I see is that the dyno process is up as of my last Heroku deploy. But there is nothing that suggests it ran successfully today... even though I know it did.
heroku logs --tail --dyno clock
yields the following output, which corresponds to the last time I deployed my app to Heroku.
2021-04-10T19:25:54.411972+00:00 heroku[clock.1]: State changed from up to starting
2021-04-10T19:25:55.283661+00:00 heroku[clock.1]: Stopping all processes with SIGTERM
2021-04-10T19:25:55.402083+00:00 heroku[clock.1]: Process exited with status 143
2021-04-10T19:26:07.132470+00:00 heroku[clock.1]: Starting process with command `python clock.py --log-file -`
2021-04-10T19:26:07.859629+00:00 heroku[clock.1]: State changed from starting to up
My question is, is there any command or place to check on Heroku to see any output from my process in the logs? For example, any exceptions that were thrown? If I had any print statements in my clock process, where would those be printed to?
Thanks!
Although this is not the full answer, the Ruby gem 'ruby-clock' gives us an insight from its developer:
Because STDOUT does not flush until a certain amount of data has gone
into it, you might not immediately see the ruby-clock startup message
or job output if viewing logs in a deployed environment such as Heroku
where the logs are redirected to another process or file. To change
this behavior and have logs flush immediately, add $stdout.sync = true
to the top of your Clockfile.
So I'm guessing that it has something to do with flushing STDOUT when logging, although I am not sure how to do that in Python.
I did a quick search and found this Stack Overflow post.
Namely:
In Python 3, print can take an optional flush argument:
print("Hello, World!", flush=True)
In Python 2, after calling print, do:
import sys
sys.stdout.flush()
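As a minimal sketch of how this could be applied in a Heroku clock process (assuming a Python 3.7+ runtime; the print messages are placeholders, not from the original clock.py):
import sys

# Make stdout line-buffered so each print reaches the Heroku log drain
# immediately instead of waiting for the buffer to fill (Python 3.7+).
sys.stdout.reconfigure(line_buffering=True)

print("scrape started")                  # visible in `heroku logs` right away
# ... scrape data and send the notification emails ...
print("scrape finished", flush=True)     # per-call flush works as well
Alternatively, running the dyno command as python -u clock.py or setting the PYTHONUNBUFFERED=1 config var disables stdout buffering entirely.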
I am using an ECS task which runs a Docker container to execute some terraform commands.
I would like to log the results of the terraform commands to CloudWatch, live if possible. I am using the logging package of Python 3.
The function I use to output the result of the command is the following:
import logging
import subprocess

def execute_command(command):
    """
    This method is used to execute the several commands
    :param command: The command to be executed
    :return decoded: The result of the command execution
    """
    logging.info('Executing: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    communicate = process.communicate()
    decoded = (communicate[0].decode('utf-8'), communicate[1].decode('utf-8'))
    # Log stdout lines as info and stderr lines as warnings
    for stdout in decoded[0].split('\n'):
        if stdout != '':
            logging.info(stdout)
    for stderr in decoded[1].split('\n'):
        if stderr != '':
            logging.warning(stderr)
    return decoded
Which is called the following way:
apply_command = 'terraform apply -input=false -auto-approve -no-color {}'.format(plan_path)
terraform_apply_output = utils.execute_command(apply_command)
if terraform_apply_output[1] != '':
    logging.info('Apply has failed. See above logs')
    aws_utils.remove_message_from_queue(metadata['receipt_handle'])
    utils.exit_pipeline(1)
When the terraform command succeeds, I can see its output after the command has been executed (i.e. the result of the apply command after the resources have been applied), which is what the code expects.
When the terraform command fails (let's say because some resources were already deployed and not saved in a .tfstate), then I cannot see the log output and the ECS task quits without an error message.
I can see 2 reasons for it:
The failed terraform command returns a non-zero exit code, which means the ECS task exits before outputting the logs to stdout (and so, to CloudWatch).
The output of the failed terraform command is sent to stderr, which is not correctly logged.
What is my error here, and how could I fix it? Any help greatly appreciated :)
This question sounds suspiciously familiar to me. Anyway:
Adding a sleep(10) just before exiting the task will fix the issue.
From AWS support:
I’ve been investigating the issue further and I noticed an internal
ticket regarding CloudWatch logs sometimes being truncated for Fargate
tasks. The problem was reported as a known issue in the latest Fargate
platform version (1.3.0). [1] Looking at our internal tickets for the
same, as you mentioned in the case description, the current workaround
to avoid this situation is extending the lifetime of the existing
container by adding a delay (~>10 seconds) between the logging output
of the application and the exit of the process (exit of the
container). I can confirm that our service team are still working to
get a permanent resolution for this reported issue. Unfortunately,
there is no ETA shared for when the fix will be deployed. However,
I've taken this opportunity to add this case to the internal ticket to
inform the team of the similar and try to expedite the process. In
addition, I'd recommend keeping an eye on the ECS release notes for
updates to the Fargate platform version which address this behaviour:
-- https://aws.amazon.com/new/
-- https://docs.aws.amazon.com/AmazonECS/latest/developerguide/document_history.html
"
I have an application written in Web2Py that contains some modules. I need to call some functions out of a module on a periodic basis, say once daily. I have been trying to get a scheduler working for that purpose but am not sure how to get it working properly. I have referred to this and this to get started.
I have a scheduler.py file in the models directory, which contains code like this:
from gluon.scheduler import Scheduler
from Module1 import Module1
def daily_task():
    module1 = Module1()
    module1.action1(arg1, arg2, arg3)
daily_task_scheduler = Scheduler(db, tasks=dict(my_daily_task=daily_task))
In default.py I have following code for the scheduler:
def daily_periodic_task():
    daily_task_scheduler.queue_task('daily_running_task', repeats=0, period=60)
[For testing I am running it every 60 seconds; for daily runs I plan to use period=86400.]
In my Module1.py class, I have this kind of code:
def action1(self, arg1, arg2, arg3):
    for row in db().select(db.table1.ALL):
        row.processed = 'processed'
        row.update_record()
One of the issues I am facing is that I don't clearly understand how to make this scheduler automatically handle the execution of action1 on a daily basis.
When I launch my application using syntax similar to: python web2py.py -K my_app it shows this in the console:
web2py Web Framework
Created by Massimo Di Pierro, Copyright 2007-2015
Version 2.11.2-stable+timestamp.2015.05.30.16.33.24
Database drivers available: sqlite3, imaplib, pyodbc, pymysql, pg8000
starting single-scheduler for "my_app"...
However, when I visit the following URL in the browser:
http://127.0.0.1:8000/my_app/default/daily_periodic_task
I just see "None" as text displayed on the screen and I don't see any changes produced by the scheduled task in my database table.
While when I visit:
http://127.0.0.1:8000/my_app/default/index
I get an error stating "This web page is not available", basically indicating my application never started.
When I start my application normally using python web2py.py, it loads fine, but I don't see any changes produced by the scheduled task in my database table.
I am unable to figure out what I am doing wrong here and how to properly use the scheduler with Web2Py. Basically, I need to know how I can start my application normally along with the scheduled tasks properly running in the background.
Any help in this regard would be highly appreciated.
Running python web2py.py starts the built-in web server, enabling web2py to respond to HTTP requests (i.e., serving web pages to a browser). This has nothing to do with the scheduler and will not result in any scheduled tasks being run.
To run scheduled tasks, you must start one or more background workers via:
python web2py.py -K myapp
The above does not start the built-in web server and therefore does not enable you to visit web pages. It simply starts a worker process that will be available to execute scheduled tasks.
Also, note that the above does not actually result in any tasks being scheduled. To schedule a task, you must insert a record in the db.scheduler_task table, which you can do via any of the usual methods of inserting records (including using appadmin) or programmatically via the scheduler.queue_task method (which is what you use in your daily_periodic_task action).
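For illustration, a minimal sketch of the daily_periodic_task action queueing the task programmatically (this assumes the daily_task_scheduler instance from the question's model file; note that the name passed to queue_task has to match a key of the tasks dict, my_daily_task in this case):
def daily_periodic_task():
    # Queue the function registered as 'my_daily_task', repeating forever
    # (repeats=0) with a 24-hour period.
    return daily_task_scheduler.queue_task('my_daily_task', repeats=0, period=86400)
Returning the result of queue_task should also make the action display the id and uuid of the queued task instead of None.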
Note, you can simultaneously start the built-in web server and a scheduler worker process via:
python web2py.py -a yourpassword -K myapp -X
So, to schedule a daily task and have it actually executed, you need to (a) start a scheduler worker and (b) schedule the task. You can schedule the task by visiting your daily_periodic_task action, but note that you only need to visit that action once, as once the task has been scheduled, it remains in effect indefinitely (given that you have set repeats=0).
If the task does not appear to be working, it is possible there is something wrong with the task itself that is resulting in an error.
So I've just pushed my Twitter bot to Heroku and set it to run every hour on the half hour with the Heroku Scheduler add-on. However, for whatever reason it's running every 10 minutes instead. Is this a bug with the scheduler? Here's an excerpt from my logs from when the scheduler ran it successfully and then it tried to run again ten minutes later:
2013-01-30T19:30:20+00:00 heroku[scheduler.4875]: Starting process with command `python ff7ebooks.py`
2013-01-30T19:30:21+00:00 heroku[scheduler.4875]: State changed from starting to up
2013-01-30T19:30:24+00:00 heroku[scheduler.4875]: Process exited with status 0
2013-01-30T19:30:24+00:00 heroku[scheduler.4875]: State changed from up to complete
2013-01-30T19:34:34+00:00 heroku[web.1]: State changed from crashed to starting
2013-01-30T19:34:42+00:00 heroku[web.1]: Starting process with command `python ff7ebooks.py`
2013-01-30T19:34:44+00:00 heroku[web.1]: Process exited with status 0
2013-01-30T19:34:44+00:00 heroku[web.1]: State changed from starting to crashed
I can provide whatever info anyone needs to help me diagnose this issue. The [web.1] log messages repeat every couple of minutes. I don't want to spam my followers.
If anyone else has this issue, I figured it out. I enabled the scheduler and then allocated 0 dynos to the process type, so Heroku only allocates a dyno when the task is scheduled to run. Before that, it was running my process continuously, and (my assumption is that) Twitter only let it connect to a socket every few minutes, which resulted in the sporadic tweeting.
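In practice that means scaling the process type that runs the bot down to zero dynos, for example (assuming the bot was declared as the web process, as the [web.1] log lines above suggest):
heroku ps:scale web=0
With the process scaled to zero, only the Heroku Scheduler spins up a one-off dyno at the configured interval.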
I would like to share the solution from someone who helped me with a one-off script (a Python script that starts and then ends, rather than keeps running).
Any questions, let me know and I will help you --> andreabalbo.com
Hi Andrea
I have also just created a random process-type in my Procfile:
tmp-process-type: command:test
I did not toggle on the process-type in the Heroku Dashboard. After
installing the Advanced Scheduler, I created a trigger with the command
"tmp-process-type" that runs every minute. Looking at my logs I can
see that every minute a process started with "command:test",
confirming that the process-type in the Procfile is working. I then
toggled on the process-type in the Heroku Dashboard. This showed up
immediately in my logs:
Scaled to tmp-process-type#1:Free web#0:Free by user ...
This is because after toggling, Heroku will spin up a normal dyno that
it will try to keep up. Since your script is a task that ends, the
dyno dies and Heroku will automatically restart it, causing your task
to be run multiple times.
In summary, the following steps should solve your problem:
1. Toggle your process-type off (but leave it in the Procfile)
2. Install advanced-scheduler
3. Create a trigger (recurring or one-off) with command "tmp-process-type"
4. Look at your logs to see if anything weird shows up
With kind regards, Oscar
I fixed this problem with only one action in the end:
I set the number of workers to 0.
The scheduler still has "python ELO_voetbal.py" as its command and automatically starts a one-off dyno for that.
So I did not use the Advanced Scheduler or place "tmp-process-type" anywhere.
I have a Gunicorn server running a Django application which has a tendency to crash quite frequently. Unfortunately, when it crashes, all the Gunicorn workers go down simultaneously and silently bypass Django's and django-sentry's logging. All the workers return "Internal Server Error", but the arbiter does not crash, so supervisord does not register it as a crash and thus does not restart the process.
My question is, is there a way to hook into a Gunicorn worker crash and possibly send an email or write a log entry? Secondly, is there a way to get supervisord to restart a Gunicorn server that is returning nothing but 500s?
Thanks in advance.
I highly recommend using zc.buildout. Here is an example using the Superlance plugin for supervisord with buildout:
[supervisor]
recipe = collective.recipe.supervisor
plugins =
    superlance
...
programs =
    10 zeo ${zeo:location}/bin/runzeo ${zeo:location}
    20 instance1 ${instance1:location}/bin/runzope ${instance1:location} true
...
eventlisteners =
    Memmon TICK_60 ${buildout:bin-directory}/memmon [-p instance1=200MB]
    HttpOk TICK_60 ${buildout:bin-directory}/httpok [-p instance1 -t 20 http://localhost:8080/]
The HttpOk listener will make an HTTP request to the instance every 60 seconds (TICK_60) and restart the process if it does not respond within the 20-second timeout (-t 20).
http://pypi.python.org/pypi/collective.recipe.supervisor/0.16
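Superlance can also be used without buildout; here is a minimal sketch of the equivalent plain supervisord.conf entries for a Gunicorn program (the program name, port, and health-check URL are assumptions, not taken from the question):
[program:gunicorn]
command = /path/to/venv/bin/gunicorn myproject.wsgi:application --bind 127.0.0.1:8000
autorestart = true

; Restart the gunicorn program when the health check stops returning 200 OK.
[eventlistener:httpok]
command = /path/to/venv/bin/httpok -p gunicorn http://127.0.0.1:8000/
events = TICK_60
Superlance's crashmail listener (subscribed to PROCESS_STATE_EXITED events) can likewise send an email whenever a supervised process exits unexpectedly, which covers the notification part of the question.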