I have a GCE instance set up and already in use, with some services set up and running. I need to be able to stop and start it with bash or Python scripts in a cron job, as I want it to be running only at specific times and days. Is this possible? It would also be nice if I could make a snapshot and restore from it.
You can use the command line (the gcloud tool) or the Google Compute Engine API to start or stop instances, and call either method from your script.
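For example, here is a minimal Python sketch using the google-api-python-client; the project, zone, instance, and snapshot names are placeholders, and application-default credentials are assumed:

    # Sketch only: assumes google-api-python-client is installed and
    # application-default credentials are available on the machine.
    from googleapiclient import discovery

    compute = discovery.build('compute', 'v1')
    PROJECT, ZONE, INSTANCE = 'my-project', 'us-central1-a', 'my-instance'  # placeholders

    def start_instance():
        # Boots the stopped VM; returns a zone operation you can poll.
        return compute.instances().start(
            project=PROJECT, zone=ZONE, instance=INSTANCE).execute()

    def stop_instance():
        # Stops the VM so you are no longer billed for its CPU and RAM.
        return compute.instances().stop(
            project=PROJECT, zone=ZONE, instance=INSTANCE).execute()

    def snapshot_boot_disk(disk=INSTANCE, name='weekly-snapshot'):
        # Snapshots a persistent disk (the boot disk often shares the VM's name).
        return compute.disks().createSnapshot(
            project=PROJECT, zone=ZONE, disk=disk, body={'name': name}).execute()

A cron job can then invoke a small start wrapper in the morning and a stop wrapper at night, at whatever times and days you need.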
Moreover, you can take a look at preemptible instances, which were recently announced. These short-lived, lower-cost instances are well suited to jobs like batch processing.
So far, when dealing with web-scraping projects, I've used Google Apps Script, meaning I can easily trigger the script to run once a day.
Is there an equivalent service for Python scripts? I have a Raspberry Pi, so I guess I could keep it on 24/7 and use cron jobs to trigger the script daily. But that seems rather wasteful, since I'm talking about a few small scripts that take only a few seconds to run.
Is there any service that lets me trigger a Python script once a day, without needing to keep a local machine on 24/7? The simpler the solution the better; I wouldn't want to overengineer such a basic use case if a ready-made system already exists.
The only service I've found so far that does this is WayScript, and here's a Python example running in the cloud. The free tier should be enough for most simple/hobby-tier use cases.
I have a number of very simple Python scripts that are constantly hitting API endpoints 24/7. The amount of data being pulled is minimal, and they all query the APIs every few seconds. My question is: is it okay to run multiple simple scripts under tmux on a single-core AWS Lightsail instance, or is it better practice to create a new instance for each Python script?
I don't see any Lightsail limits that affect your use case. As long as the endpoints are owned by you, or you at least don't get blocked for hitting them continuously, all seems good.
https://aws.amazon.com/lightsail/faq/
You can also set some alarms on Lightsail instance usage to know if you've hit any limits.
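For instance, here is a rough boto3 sketch; the instance and alarm names are placeholders, and it assumes your AWS credentials and a verified email contact method are already set up:

    # Sketch only: 'my-instance' and the alarm name are placeholders; assumes
    # boto3 credentials and a verified email contact method are configured.
    import boto3

    lightsail = boto3.client('lightsail')

    lightsail.put_alarm(
        alarmName='scripts-cpu-high',
        metricName='CPUUtilization',
        monitoredResourceName='my-instance',
        comparisonOperator='GreaterThanOrEqualToThreshold',
        threshold=90.0,
        evaluationPeriods=2,  # alarm after two consecutive breaching periods
        contactProtocols=['Email'],
        notificationEnabled=True,
    )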
I have a script that downloads a large amount of data from an API; it takes around two hours to run. I would like to run the script on GCP and schedule it to run once a week on Sundays, so that we have the newest data in our SQL database (also on GCP) by the next day.
I am aware of cron jobs, but would not like to run an entire server just for this single script. I have taken a look at Cloud Functions and Cloud Scheduler, but because the script takes so long to execute, I cannot run it on Cloud Functions, where the maximum execution time is 9 minutes (from here). Is there any other way to schedule the Python script to run?
Thank you in advance!
To run a script for longer than one hour, you need to use Compute Engine (a Cloud Run request can live for at most 1 hour).
However, you can still use Cloud Scheduler to drive it. Here's how:
Create a Cloud Scheduler job with the frequency that you want
On this scheduler job, call the Compute Engine start API
In the advanced part, select a service account (create one or reuse one) which has the right to start a VM instance
Select OAuth token as the authentication mode (not OIDC)
Create a Compute Engine instance (the one that the Cloud Scheduler job will start)
Add a startup script that triggers your long job
At the end of the script, add a line to shut the VM down (with gcloud, for example; a sketch of this follows below)
Note: the startup script runs as the root user. Take care with the default home directory and the permissions of the files it creates.
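As a rough, hypothetical sketch of the last two steps: the startup script can hand off to a Python wrapper like the one below, which runs the job and then stops its own VM through the Compute API rather than gcloud (either works). run_job is a placeholder for the two-hour download; requests and google-api-python-client are assumed to be installed on the VM.

    # Hypothetical wrapper called by the startup script: run the long job,
    # then stop the VM it is running on so billing stops.
    import requests
    from googleapiclient import discovery

    METADATA = 'http://metadata.google.internal/computeMetadata/v1'
    HEADERS = {'Metadata-Flavor': 'Google'}

    def metadata(path):
        # The metadata server identifies this VM with no extra configuration.
        return requests.get(f'{METADATA}/{path}', headers=HEADERS).text

    def run_job():
        ...  # placeholder: the ~2 hour API download and SQL import go here

    if __name__ == '__main__':
        run_job()
        project = metadata('project/project-id')
        zone = metadata('instance/zone').rsplit('/', 1)[-1]  # 'projects/N/zones/Z'
        name = metadata('instance/name')
        compute = discovery.build('compute', 'v1')
        compute.instances().stop(project=project, zone=zone, instance=name).execute()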
I wrote a Python script to send data from a local DB via REST to Kafka.
My goal: I would like this script to run indefinitely, either by restarting at set intervals (e.g. every 5 minutes) or whenever the DB gets new entries. I assume the set-interval option would be good enough, and easier and safer.
Someone suggested that I either run it via a cron job and use a monitoring tool, or do it using Jenkins (which he considered better).
My setting: I am not a DevOps engineer and would like to know about the possibilities and risks of setting this script up. It would be no trouble to recreate the script in Java if that improves the situation.
My question: I did try to learn what Jenkins is about, and I think I understood the CI and CD parts. But I don't see how this could help me with my goal. Can someone with experience on this topic elaborate?
If you would suggest a cron job, what are common methods or tools to monitor such a setup? I think the main risks are failing to send the data, due to connection issues from the local machine to the REST endpoint or to the local DB, or the job not being started properly at the specified time.
Jobs can be scheduled at regular intervals in Jenkins just like with cron; in fact, it uses the same syntax. What's nice about scheduling the job via Jenkins is that it's very easy to have it send an email if the job exits with a non-zero return code. I've moved all of my cron jobs into Jenkins and it's working well. So by running it via Jenkins you're covering the execution side and the monitoring side at the same time.
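For that monitoring to work, the script only has to report failure through its exit code. A minimal sketch, where sync_db_to_kafka is a placeholder for your actual DB-to-REST logic:

    # Hypothetical wrapper: exit non-zero on any failure so Jenkins (or cron
    # plus a mail step) can detect the failed run and alert you.
    import sys

    def sync_db_to_kafka():
        ...  # placeholder: read new DB rows and POST them to the REST endpoint

    if __name__ == '__main__':
        try:
            sync_db_to_kafka()
        except Exception as exc:
            print(f'sync failed: {exc}', file=sys.stderr)
            sys.exit(1)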
I have a Python script that can really eat some CPU and memory, so I figure the most efficient way to run it is to containerize it in a Docker container. The script is not meant to run forever; rather, it gets dependency information from environment variables, does its work, and then terminates. Once the script is over, by default Docker will remove the container from memory.
This is good: I am only paying for compute resources while the script is running.
My problem is this: I have a number of different types of scripts I can run. What I want to do is create a manager that, given the name of a script type to run, launches the corresponding container in Google Container Engine, in such a way that the invocation is configured with a predefined CPU, disk, and memory allocation intended to run the script as fast as possible.
Then, once the script finishes, I want the container removed from the environment so that I am no longer paying for the resources. In other words, I want to do in an automated manner in Container Engine what I can do manually from my local machine at the command line.
I am trying to learn how to get Container Engine to support this in an automated manner. It seems to me that using Kubernetes might be overkill, in that I do not really want to guarantee constant availability. Rather, I just want the container to run and die. If for some reason the script fails or terminates before success, the architecture is designed to detect the unsuccessful attempt.
You could use a Kubernetes Job, a controller object that 'runs to completion'.
A job object such as this can be used to run a single pod.
Once the job (in this case your script) has completed, the pod terminates and will therefore no longer use any resources. The pod isn't deleted (unless the job itself is deleted) but remains in a terminated state. If configured correctly, no further pods will be created.
The job object can also be configured to start a new pod should the job fail for any reason, if you require that functionality.
For more detailed information on this, please see this page.
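For illustration, here is a rough sketch using the official kubernetes Python client; the job name, image, environment variable, and resource requests are placeholders, and backoff_limit provides the retry-on-failure behaviour mentioned above:

    # Sketch only: assumes the 'kubernetes' package and cluster credentials.
    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() in-cluster

    job = client.V1Job(
        api_version='batch/v1',
        kind='Job',
        metadata=client.V1ObjectMeta(name='my-script-job'),  # placeholder name
        spec=client.V1JobSpec(
            backoff_limit=3,  # recreate a failed pod up to 3 times, then give up
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy='Never',  # let the Job controller handle retries
                    containers=[client.V1Container(
                        name='script',
                        image='gcr.io/my-project/my-script:latest',  # placeholder
                        env=[client.V1EnvVar(name='SCRIPT_TYPE', value='example')],
                        resources=client.V1ResourceRequirements(
                            requests={'cpu': '1', 'memory': '2Gi'}),  # placeholder sizing
                    )],
                ),
            ),
        ),
    )

    client.BatchV1Api().create_namespaced_job(namespace='default', body=job)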
Also, just to add: to keep your billing to a minimum, you could reduce the number of nodes in the cluster to zero when you are not running the job, and increase it to the required number when the jobs need to be executed. This could be done programmatically through API calls if required (a sketch follows below). That should keep your billing as low as possible, as you are only billed for nodes while they are running.
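As a hedged example, one way is to shell out to gcloud from Python; the cluster, zone, and node-pool names are placeholders:

    # Sketch only: cluster, zone, and node-pool names are placeholders.
    import subprocess

    def resize_cluster(num_nodes, cluster='my-cluster',
                       zone='us-central1-a', pool='default-pool'):
        subprocess.run([
            'gcloud', 'container', 'clusters', 'resize', cluster,
            '--node-pool', pool, '--num-nodes', str(num_nodes),
            '--zone', zone, '--quiet',
        ], check=True)

    resize_cluster(0)    # after the job finishes: stop paying for nodes
    # resize_cluster(3)  # before the next run: bring capacity back up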