I have a question about manifest.yml files, and the command argument. I am trying to run multiple python scripts, and I was wondering if there was a better way that I can accomplish this?
command: python3 CD_Subject_Area.py python3 CD_SA_URLS.py
Please let me know how I could call more than one script at a time. Thanks!
To run a couple of short-lived commands (i.e. ones that run and eventually exit), use Cloud Foundry tasks. The reason to use tasks rather than a custom command in manifest.yml or a Procfile is that a task runs only once.
If you add the commands above, as you have them, they may run many times. This is because an application on Cloud Foundry is expected to run forever. If it exits, the platform considers it to have crashed and restarts it. Thus when your script ends, even if it is successful (i.e. exit 0), the platform still treats it as a crash and runs it again, and again, and again, until you stop your app.
With a task, you'd do the following instead:
cf push your application. This will stage and start the application. You can simply leave the command/-c argument empty and not include a Procfile[1][2]. The push will execute, the buildpack will run and stage your app, and then it will fail to start because there is no command. That is OK.
Run cf stop to put your app into the stopped state. This will tell the platform to stop trying to restart it.
Run cf run-task <app-name> <command>. For example, cf run-task my-cool-app "python3 CD_Subject_Area.py". This will execute the task in its own container. The task will run to completion. Looking at cf tasks <app-name> will show you the result. Using cf logs <app-name> --recent will show you the output.
You can then repeat this to run any number of other task commands. You don't need to wait for the original one to finish. They all execute in separate containers, so one task is totally separated from another.
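The run-task step can also be scripted. The following is a minimal Python sketch, not part of the original answer: it builds one cf run-task invocation per script, using the positional form shown above. The helper names build_run_task_commands and run_tasks are hypothetical, and some cf CLI versions may expect the command to be passed differently, so check cf run-task --help before relying on it.

```python
import shlex
import subprocess


def build_run_task_commands(app_name, scripts, interpreter="python3"):
    """Return one `cf run-task` argv list per script (hypothetical helper)."""
    return [
        ["cf", "run-task", app_name, f"{interpreter} {script}"]
        for script in scripts
    ]


def run_tasks(app_name, scripts):
    """Submit each task in turn; assumes the cf CLI is installed and logged in."""
    for argv in build_run_task_commands(app_name, scripts):
        # check=True raises CalledProcessError if the cf command fails.
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in build_run_task_commands(
        "my-cool-app", ["CD_Subject_Area.py", "CD_SA_URLS.py"]
    ):
        print(shlex.join(argv))
```

Passing the command as a single list element keeps "python3 CD_Subject_Area.py" together as one argument, which is what the quoted form on the command line does.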
[1] - An alternative is to set the command to sleep 99999 or something like that, don't map a route, and set the health check type to process. The app should then start successfully. You'll still stop it, but this just avoids an unseemly error.
[2] - If you've already set a command and want to back that out, remove it from your manifest.yml and run cf push -c null, or simply cf delete your app and cf push it again. Otherwise, the command will be retained in Cloud Controller, which isn't what you'd want.
I'm trying to create a task on Unix similar to a task of Task Scheduler in Windows which will run at specific times of the day and even gets triggered after server restarts. Aim of this job is to execute a python file.
My question is two parts:
1). How to write a job which I can schedule at multiple times of the day. I tried to write a cron job using the crontab command, but it gives "You <user> are not allowed to access to (crontab) because of pam configuration". I would like to know a way to schedule the triggering of a python script without needing root/admin rights.
2). How can I schedule a job whose schedule stays in effect even after the server is restarted? While going through various resources, I found systemd, which can be used to start and stop services. For example, https://linuxconfig.org/how-to-write-a-simple-systemd-service
But I'm unable to find how I can write a service script which will run my python script.
Can someone please guide me on how I can run a job which executes my python script at some specific times of day and keeps working even after a server bounce?
First, the PAM error says you do not have permission, so check /etc/security/access.conf and add the line:
+ : youruser : cron crond :0 tty1 tty2 tty3 tty4 tty5 tty6
To run a cron job on boot, add a line like this to your crontab:
@reboot /path/to/your_program
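For the "multiple times of the day" part of the question, the crontab entries can be generated rather than typed by hand. This is a minimal Python sketch under stated assumptions: crontab_lines is a hypothetical helper, the interpreter and script paths are placeholders, and the @reboot keyword is assumed to be supported by the installed cron (it is in common implementations, but not guaranteed everywhere).

```python
def crontab_lines(command, times, run_at_boot=True):
    """Build crontab entries that run `command` daily at each HH:MM time."""
    lines = []
    for hhmm in times:
        hour, minute = (int(part) for part in hhmm.split(":"))
        if not (0 <= hour <= 23 and 0 <= minute <= 59):
            raise ValueError(f"invalid time: {hhmm}")
        # Field order: minute hour day-of-month month day-of-week command
        lines.append(f"{minute} {hour} * * * {command}")
    if run_at_boot:
        # @reboot runs the command once when the cron daemon starts.
        lines.append(f"@reboot {command}")
    return lines


if __name__ == "__main__":
    # Placeholder paths; replace with the real interpreter and script.
    for line in crontab_lines(
        "/usr/bin/python3 /path/to/your_program.py", ["08:00", "14:30"]
    ):
        print(line)
```

The printed lines can be pasted into the editor opened by crontab -e once the PAM restriction has been lifted.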
I have a python3.9 script I want to have running 24/7. In it, I use python-daemon to keep it running like so:
import daemon
with daemon.DaemonContext():
    %%script%%
And it works fine but after a few hours or days, it just crashes randomly. I always start it with sudo but I can't seem to figure out where to find the log file of the daemon process for debugging. What can I do to ensure logging? How can I keep the script running or auto-restart it after crashing?
You can find the full code here.
If you really want to run a script 24/7 in background, the cleanest and easiest way to do it would surely be to create a systemd service.
There are already many descriptions of how to do that, for example here.
One of the advantages of systemd, in addition to being able to launch a service at startup, is to be able to restart it after failure.
Restart=on-failure
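To show where that setting lives, here is a minimal Python sketch that renders a complete unit file. It is an illustration, not the linked description: render_unit, the description text, the interpreter path, the script path and the RestartSec value are all assumptions to adapt.

```python
UNIT_TEMPLATE = """\
[Unit]
Description={description}
After=network.target

[Service]
ExecStart={python} {script}
Restart=on-failure
RestartSec={restart_sec}

[Install]
WantedBy=multi-user.target
"""


def render_unit(script, description="My Python script",
                python="/usr/bin/python3", restart_sec=5):
    """Return the text of a systemd service unit that runs `script`."""
    return UNIT_TEMPLATE.format(
        description=description,
        python=python,
        script=script,
        restart_sec=restart_sec,
    )


if __name__ == "__main__":
    # Placeholder path; print the unit so it can be reviewed before installing.
    print(render_unit("/path/to/script.py"))
```

Saved under /etc/systemd/system/ with a name such as myscript.service (a hypothetical name), the unit would typically be activated with systemctl daemon-reload followed by systemctl enable --now myscript.service. Output written by the script to stdout or stderr then normally ends up in the journal, readable with journalctl -u myscript.service, which also covers the logging part of the question.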
If all you want to do is automatically restart the program after a crash, the easiest method would probably be to use a bash script.
You can use the until loop, which is used to execute a given set of commands as long as the given condition evaluates to false.
#!/bin/bash
until python /path/to/script.py; do
    echo "The program crashed at $(date +%H:%M:%S). Restarting the script..."
done
If the command returns a non-zero exit status, the script is restarted. Once it exits with status 0, the loop ends.
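The same restart-on-failure idea can be written in Python, which makes it easy to add a pause between restarts or a cap on their number. This is a minimal sketch, not the answer's script: run_until_success and its delay and max_restarts parameters are hypothetical names introduced here.

```python
import subprocess
import sys
import time


def run_until_success(argv, delay=1.0, max_restarts=None):
    """Rerun argv until it exits with status 0; return the number of restarts."""
    restarts = 0
    while True:
        result = subprocess.run(argv)
        if result.returncode == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            raise RuntimeError(
                f"giving up after {restarts} restarts "
                f"(last exit status {result.returncode})"
            )
        restarts += 1
        print(
            f"The program exited with status {result.returncode} at "
            f"{time.strftime('%H:%M:%S')}. Restarting...",
            file=sys.stderr,
        )
        # A short pause avoids a tight loop if the program fails immediately.
        time.sleep(delay)


if __name__ == "__main__":
    # Placeholder path, as in the bash version above.
    run_until_success([sys.executable, "/path/to/script.py"], delay=5.0)
```

Unlike the bash loop, this version can stop retrying after max_restarts failures, which helps when the crash is caused by a permanent error rather than a transient one.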
I would start by familiarizing myself with these two questions:
How to make a Python script run like a service or daemon in Linux
Run a python script with supervisor
Looks like you need a supervisor that will make sure that your script/daemon is still running. You can take a look at supervisord.
I would like to run a python script on Heroku, but I would like to run it only once and stop at the end of the script.
Right now my script is running endlessly, so at the end of it, it restarts from the beginning.
How can I stop it at the end of the script?
Right now my Procfile looks like the following:
web: python ValueAppScript.py
worker: python ValueAppScript.py
Thank you
First of all, you probably don't want to declare the same command as both a web and a worker. If your script listens for HTTP requests it should be a web process, otherwise a worker makes more sense:
worker: python ValueAppScript.py
Since you don't want your worker running all the time, scale it down to zero dynos:
heroku ps:scale worker=0
If you wish to run it once interactively, you can use heroku run:
heroku run python ValueAppScript.py
If you want it to run on a schedule, e.g. once per day, you can use the Heroku Scheduler. Since you have defined this as a worker process you should be able to just use worker as the command.
I have a python3 application that I want to run continually on an Ubuntu server. I'm managing it using pm2, but running into a very strange error.
I am starting the pm2 process using:
pm2 start --name python_app --watch --interpreter /usr/bin/python3.8 python_app.py
When I first run this, it doesn't start properly: pm2 continually stops and restarts the process multiple times per second, and will keep doing this unless stopped. The output of the pm2 error logs makes very little sense: it's a lot of very long tracebacks through python libraries (almost always Flask), but without any actual errors attached to them, other than KeyboardInterrupts, which I am not making.
After manually stopping and starting the app (using the commands below), everything runs as expected (and then continues to work fine for subsequent restarts).
pm2 stop python_app
pm2 start python_app
I have repeated this process (deleting and remaking the pm2 process to see the error, and then stopping and restarting to make it work) multiple times, with the same results every time. I wonder whether this is the result of something else that's not right? Or, whether there's an equivalent command in pm2 to 'setup' a new process without launching it, then starting it separately.
I tried increasing the startup memory that pm2 could use, but it just uses 100% of whatever I give it, and still experiences the same restarting issue (just much faster, haha).
Ah, I think I know what the issue was: I noticed that watching appeared as 'disabled' when I listed processes, even though I had run with the --watch flag.
When I removed --watch from the start command, it worked fine: perhaps it's not meant to be used with Python?
Would love to hear from anyone who knows more, but problem solved.
I was trying to run slurm jobs with srun in the background. Unfortunately, right now, because I have to run things through docker, it's a bit annoying to use sbatch, so I am trying to find out if I can avoid it altogether.
From my observations, whenever I run srun, say:
srun docker image my_job_script.py
and close the window where I was running the command (to avoid receiving all the print statements) and open another terminal window to see if the command is still running, it seems that my running script is for some reason cancelled or something. Since it isn't through sbatch it doesn't send me a file with the error log (as far as I know) so I have no idea why it closed.
I also tried:
srun docker image my_job_script.py &
to give control back to me in the terminal. Unfortunately, if I do that it still keeps printing things to my terminal screen, which I am trying to avoid.
Essentially, I log into a remote computer through ssh and then do a srun command, but it seems that if I terminate the communication of my ssh connection, the srun command is automatically killed. Is there a way to stop this?
Ideally I would like to essentially send the script to run and not have it be cancelled for any reason unless I cancel it through scancel and it should not print to my screen. So my ideal solution is:
keep running srun script even if I log out of the ssh session
keep running my srun script even if I close the window from where I sent the command
keep running my srun script and let me leave the srun session and not print to my screen (i.e. essentially run in the background)
this would be my ideal solution.
For the curious crowd that want to know the issue with sbatch, I want to be able to do (which is the ideal solution):
sbatch docker image my_job_script.py
however, as people will know it does not work because sbatch receives the command docker which isn't a "batch" script. Essentially a simple solution (that doesn't really work for my case) would be to wrap the docker command in a batch script:
#!/usr/bin/sh
docker image my_job_script.py
Unfortunately, I am actually using my batch script to encode a lot of information (sort of like a config file) about the task I am running. So doing that might affect my jobs because their underlying file is changing. That is avoided by sending the job directly to sbatch since it essentially creates a copy of the batch script (as noted in this question: Changing the bash script sent to sbatch in slurm during run a bad idea?). So the real solution to my problem would be to have my batch script contain all the information that my script requires and then somehow call docker from python and at the same time pass it all the information. Unfortunately, some of the information consists of function pointers and objects, so it's not even clear to me how I would pass such a thing to a docker command run from python.
Or maybe being able to submit docker directly to sbatch instead of using a batch script would also solve the problem.
The outputs can be redirected with the options -o for stdout and -e for stderr.
So, the job can be launched in background and with the outputs redirected:
$ srun -o file.out -e file.err docker image my_job_script.py &
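If the launch is done from Python rather than from the shell, the same command can be started in its own session so that it is not tied to the terminal's streams. This is a minimal sketch under stated assumptions: build_srun_argv and launch_detached are hypothetical helpers, start_new_session is POSIX-only, and whether the process really survives an ssh logout depends on how the login node is configured.

```python
import subprocess


def build_srun_argv(command, stdout_path, stderr_path):
    """Prefix `command` with srun and its -o/-e output redirection options."""
    return ["srun", "-o", stdout_path, "-e", stderr_path, *command]


def launch_detached(argv):
    """Start argv in a new session, disconnected from the terminal's streams."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: the child calls setsid()
    )


if __name__ == "__main__":
    argv = build_srun_argv(
        ["docker", "image", "my_job_script.py"], "file.out", "file.err"
    )
    # Dry run: show the command; call launch_detached(argv) to start it.
    print(argv)
```

Nothing is printed to the screen because srun's own streams go to /dev/null while the job output goes to the files named by -o and -e.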
Another approach is to use a terminal multiplexer like tmux or screen.
For example, to create a new tmux window, type tmux. In that window, use srun with your script. From there, you can detach the tmux window, which returns you to your main shell so you can go about your other business, or you can log off entirely. When you want to check in on your script, just reattach to the tmux window. See the documentation tmux -h for how to detach and reattach on your OS.
Any output redirects using the -o or -e will still work with this technique and you can run multiple srun commands concurrently in different tmux windows. I’ve found this approach useful to run concurrent pipelines (genomics in this case).
I was wondering this too because the differences between sbatch and srun are not very clearly explained or motivated. I looked at the code and found:
sbatch
sbatch pretty much just sends a shell script to the controller, tells it to run it and then exits. It does not need to keep running while the job is happening. It does have a --wait option to stay running until the job is finished but all it does is poll the controller every 2 seconds to ask it.
sbatch can't run a job across multiple nodes - the code simply isn't in sbatch.c. sbatch is not implemented in terms of srun, it's a totally different thing.
Also its argument must be a shell script. Bit of a weird limitation but it does have a --wrap option so that it can automatically wrap a real program in a shell script for you. Good luck getting all the escaping right with that!
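One way to sidestep the escaping problem is to let Python's shlex module quote the command and to pass the result to sbatch as a single argument. This is a minimal sketch, not something from the answer above: sbatch_wrap_argv is a hypothetical helper, and it assumes the command is available as a list of arguments.

```python
import shlex
import subprocess


def sbatch_wrap_argv(command):
    """Build an sbatch argv that wraps `command` via the --wrap option."""
    # shlex.join quotes each argument so the generated shell script
    # sees the same argument boundaries as the original list.
    return ["sbatch", f"--wrap={shlex.join(command)}"]


def submit(command):
    """Submit the wrapped command; assumes sbatch is on PATH."""
    # Passing a list (no shell=True) avoids a second layer of shell quoting.
    return subprocess.run(sbatch_wrap_argv(command), check=True)


if __name__ == "__main__":
    # Dry run: show what would be passed to sbatch.
    print(sbatch_wrap_argv(["docker", "image", "my_job_script.py"]))
```

Because the command line is built at submission time, there is no batch file on disk that could change while the job is queued.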
srun
srun is more like an MPI runner. It directly starts tasks on lots of nodes (one task per node by default though you can override that with --ntasks). It's intended for MPI so all of the jobs will run simultaneously. It won't start any until all the nodes have a slot free.
It must keep running while the job is in progress. You can send it to the background with & but this is still different to sbatch. If you need to start a million sruns you're going to have a problem. A million sbatchs should (in theory) work fine.
There is no way to have srun exit and leave the job still running like there is with sbatch. srun itself acts as a coordinator for all of the nodes in the job, and it updates the job status etc. so it needs to be running for the whole thing.