Dedicated Colab VM stops execution 1 hour after closing the tab - python

As the title says, I have spun up a dedicated GCE VM for Colab and connected it to my notebook, but after I close the tab it keeps running for only one more hour, even though the documentation states: "Connecting to a custom GCE VM puts you in control of your machine lifecycle. You will still experience disconnections from your VM from interruptions to your connection, but Colab will not alter the VM state: your work and progress will be saved and available when you reconnect."
Hovering over the RAM & Disk section does indeed show that I am connected to the VM.
Is there something I'm doing wrong, or do I have to enable something else in the UI?
Thanks a lot!

Related

How to script auto-shutdown of expensive linux dev server after X hours of interactive user idle time?

I and other team members develop, test, and debug our compute-intensive Python code on a cloud-based Linux server using large datasets and many CPUs/GPUs. During the day there can be one or more users with interactive sessions on this machine (e.g. SSH console or PyCharm over SSH) specifically so we can debug.
The cloud instance we run on costs $10-$20 per hour, which is fine as long as people are using it. We try to remember to shut it down manually when nobody is using it (which requires checking that others aren't logged in). Sometimes we forget, which can cost ~$300 overnight or $1,000 if left idle over a weekend. Note that user sessions can be set to timeout on the client side by configuring OpenSSH, but that leaves the server running.
How do I set up scripts and configurations that either:
detect that all interactive users have been idle for X hours ("ideal" condition); or
detect that there have been no interactive sessions for X >= 0 hours ("good-enough" condition); and
sudo shutdown now when the condition is detected?
I'm aware that (for example) on AWS there are some hacky/complex/proprietary/unreliable ways to sort of do this by setting up external monitor services, and I assume there are similar kludges for GCP and Azure. We may want to do similar things on different cloud platforms (AWS, GCP, Azure), but on all of them we'd likely use Ubuntu 20.04+ as the common environment, so I'm looking for implementations that can be coded at the Ubuntu/Linux level.
I would prefer that solutions are based on bash or python. Assume all users are sudoers.
I've already tried proprietary services that are unreliable and not portable.
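One portable way to implement the "good-enough" condition on Ubuntu is to parse the idle column of `who -u` output and only shut down when every interactive session is past the limit. A minimal sketch under that assumption; the function names and the 2-hour default are illustrative, not from any particular tool:

```python
# Sketch: decide whether the box is idle, based on `who -u` output.
# `who -u` prints one line per interactive session; its idle column is
# "." (active within the last minute), HH:MM, or "old" (> 24 h idle).

def idle_minutes(field: str) -> int:
    """Convert a `who -u` idle field to minutes."""
    if field == ".":
        return 0
    if field == "old":
        return 24 * 60
    hours, minutes = field.split(":")
    return int(hours) * 60 + int(minutes)

def all_users_idle(who_output: str, limit_min: int = 120) -> bool:
    """True when every session is idle past limit_min (or none exist)."""
    for line in who_output.strip().splitlines():
        # Typical columns: name tty yyyy-mm-dd hh:mm idle pid [host]
        parts = line.split()
        if len(parts) >= 6 and idle_minutes(parts[4]) < limit_min:
            return False
    return True
```

A root cron job could then feed `who -u` output (via `subprocess.check_output(["who", "-u"], text=True)`) into `all_users_idle` and invoke `shutdown now` when it returns True; for the "ideal" condition you would also want to exclude long-running batch jobs before powering off.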

Jupyter notebook online and uninterrupted session

I'm trying to train a model in a Jupyter notebook on my computer. Unfortunately, training will take a long time, likely more than a week.
Is there a Jupyter notebook somewhere on the Internet that I could start, disconnect (turn off the laptop) and then come back after a few days and connect to the still ongoing session?
What you are looking for is a server: you need to host your Jupyter Notebook session on a remote host. Since the notebook needs to run continuously, you need a machine that runs continuously. The complication is hardware: if you need a specific setup, such as a graphics card, a cloud provider offering that exact configuration will be harder to find than simply training your model on a server's CPU. A reasonable starting point is to browse Amazon's services and use a free trial to train your model.
Remote GPU Machine learning training service Amazon: https://aws.amazon.com/sagemaker/train/?sagemaker-data-wrangler-whats-new.sort-by=item.additionalFields.postDateTime&sagemaker-data-wrangler-whats-new.sort-order=desc
Jupyter Notebook on remote server config: https://www.digitalocean.com/community/tutorials/how-to-install-run-connect-to-jupyter-notebook-on-remote-server

How to run graphical automation in Azure VM (or similar)

I made a python script that runs a graphical automation using pyautogui (mouse movements) over a huge number of PDFs.
The automation appears to need an active display, for the mouse movements and the PDF to be opened.
If I connect to the Azure VM (with Windows OS) with SSH and start the python script, I get an error from pyautogui as below:
pyautogui.FailSafeException:
PyAutoGUI fail-safe triggered from mouse moving to a corner of the screen.
To disable this fail-safe, set pyautogui.FAILSAFE to False.
DISABLING FAIL-SAFE IS NOT RECOMMENDED.
I have tried disabling the fail-safe, and it still doesn't work.
As I have read, this happens because there is no active display opened.
If I connect to the VM using RDP, the automation starts and works as expected until I minimize or close the window; when I do, I get the same fail-safe error from pyautogui.
But I cannot keep the window open, because I would need to start the same automation on 16 more VMs.
Is there a way to run such graphical automations in Azure VMs or any other similar solution? Docker maybe?
Is there any solution to run or host VMs with a permanently active display? Is something like this possible?
I ended up using Virtual Box.
I first created one VM with the needed software and files.
You can start the VM in headless mode, then show its window to log in to Windows and start the automation script. After that you can close the VM window and let it run in the background; the graphical automation keeps working, and PyAutoGUI no longer crashes the way it does when there is no active display.
I cloned the original VM 16 times and split the work evenly between the VMs.
Using ssh I am able to connect to each VM and monitor the status of the automations.
It would be great if there were a solution similar to kubernetes that could help with the deployment of such automations and with the monitoring of the status, but until then this is ok.
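The clone-and-run-headless workflow above can be scripted with VirtualBox's `VBoxManage` CLI (the `clonevm --register` and `startvm --type headless` subcommands are real; the VM name below is illustrative). A minimal sketch that builds the command lines:

```python
# Sketch: VBoxManage invocations for cloning a source VM and starting
# the clones headless. "pdf-worker" is a hypothetical VM name.

def clone_cmd(src: str, idx: int) -> list:
    """Command to clone `src` into a registered copy named `src`-idx."""
    return ["VBoxManage", "clonevm", src,
            "--name", "%s-%d" % (src, idx), "--register"]

def start_headless_cmd(name: str) -> list:
    """Command to start a VM with no visible window."""
    return ["VBoxManage", "startvm", name, "--type", "headless"]
```

You would run each command with `subprocess.run(cmd, check=True)` in a loop over the 16 clones, then monitor them over SSH as described above.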

How to run a google Colab Notebook from terminal?

Suppose I have a Google Colab Notebook in an address like below:
https://colab.research.google.com/drive/XYZ
I want to keep it running for 12 hours, but I also want to turn my computer off. As a workaround, I can connect to our lab's server via ssh; that server runs all the time. I would like to know if it's possible to load and run the notebook there.
I found a solution to connect to a Google Colab Session via ssh (colab_ssh package), but it again needs a running Colab Session.
I also tried to browse the link with lynx, but it needs login and this isn't supported by this browser.
Yes, it is possible. You would first need to download your colab notebook as an .ipynb file, then copy it to your server. Then, you can follow one of the guides on how to connect to a remotely running jupyter notebook session, like this one. All you need is the jupyter notebook software on your server, and an ssh client on your local computer.
Edit: I forgot to mention this: To keep your session alive even after closing the ssh connection, you can use tools like screen. The link provides more detailed explanation, but the general idea is that after connecting to your server, first you need to create a session like this:
screen -S <session_name>
which will create a new session and attach you to it (which is the term used when you are inside a session). Then, you can fire up your jupyter notebook here, and it will keep running even after closing the ssh connection. (You just have to make sure you don't kill the screen session using Ctrl+a followed by k)
Now, you have an indefinitely running jupyter notebook session on your server. You can connect to it via
ssh -N -f -L localhost:YYYY:localhost:XXXX remoteuser@remotehost
as mentioned in the first linked guide, use the browser to run a code cell on your jupyter notebook, and then turn off your laptop without worrying about interrupting your notebook session.
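If you'd rather not keep a browser attached at all, the downloaded .ipynb can also be executed non-interactively inside the same screen session using jupyter's real `nbconvert --execute` mode, which runs all cells and writes an executed copy of the notebook. A small sketch building that command (the filenames are illustrative):

```python
# Sketch: build a `jupyter nbconvert` command that executes a notebook
# top to bottom and saves the result. The flags are real nbconvert
# options; "train.ipynb" is a hypothetical filename.

def nbconvert_cmd(notebook: str) -> list:
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute",
            "--output", "executed_" + notebook, notebook]
```

Launched inside `screen` via `subprocess.run(nbconvert_cmd("train.ipynb"))`, this keeps running after you disconnect, with no port forwarding needed.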

monitoring remote windows machines (CPU%, RAM, Process, restart) in python 2.7

I'm trying to build a monitoring program for our eight render machines (incl. refresh every second). Displaying the current status will be made with the use of Tkinter which is working great so far, but I can't figure out how to connect to the remote machines.
I've tried...
WMI (works, but not from multiple threads, so I can't check every machine simultaneously)
psutil (only local stats)
Popen (can't remember why that didn't work - been fighting with this application for quite a while now, sorry)
glances (same here - and I think because it's more of a stand-alone program and needs to be installed together with a webserver on all the render machines? Really not sure on that one though!)
I'm starting to lose my mind here and maybe someone can give me a hint in the right direction?
Thanks!
Just as additional info, here is what I'm trying to achieve:
Display icon if machine is running (this works - Yay!)
Display CPU% of process A
Display CPU% of process B
System CPU%
Free RAM
Restart process A
Restart process B
Restart machine
View last screenshot taken
Thanks again! Any help is appreciated. I guess this isn't the ideal project for a novice programmer.
I would recommend using SaltStack. You can set up the salt minion on a windows machine, and then use it to get information about the minion. You can also run commands on the minions, to restart applications, etc.
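Whichever transport you end up with (Salt minions, WMI, or something else), the every-second refresh is easiest if you poll all eight machines in parallel and hand the combined result to Tkinter. A minimal sketch; `query_machine` is a stub you would replace with the real per-host call, and the hostnames are made up:

```python
# Sketch: poll several render machines concurrently so one slow or
# dead host can't stall the whole UI refresh.
from concurrent.futures import ThreadPoolExecutor

def query_machine(host):
    """Stub: replace with the real per-host query (WMI, Salt, ...)."""
    return {"host": host, "cpu": 0.0, "ram_free_mb": 0}

def poll_all(hosts, query=query_machine, timeout=2.0):
    """Query every host concurrently; a host whose query fails or
    times out maps to None instead of raising."""
    with ThreadPoolExecutor(max_workers=max(len(hosts), 1)) as pool:
        futures = {h: pool.submit(query, h) for h in hosts}
        results = {}
        for host, fut in futures.items():
            try:
                results[host] = fut.result(timeout=timeout)
            except Exception:
                results[host] = None  # host down or query failed
        return results
```

In the Tkinter app you would call `poll_all` from a worker thread (or via `root.after`) once per second and repaint the icons from the returned dict.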