How can I hide non-sensitive credentials on a Google Colab Notebook? - python

I have a Google Colab notebook that uses psycopg2 to connect to a free Heroku PostgreSQL instance. I'd like to share the notebook with some colleagues for educational purposes, so they can view and run the code.
There is nothing sensitive related to the account or database, but I would still like to hide the credentials used to make the initial connection without restricting their access.

My workaround was to create a Python module containing a function that performs the initial connection with the credentials. I compiled the module to a binary .pyc, uploaded it to Google Drive, downloaded the binary into the notebook's contents via a shell command, and then imported it.
It obviously isn't secure, but it provides the obfuscation layer I was looking for.
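For anyone wanting to reproduce this, here is a minimal sketch of the pattern. It assumes a hypothetical module named db_connect.py compiled locally with the standard-library py_compile module, and uses gdown to pull the shared file from Drive; the file ID and module name are placeholders, not part of the original setup.

    # On your own machine: compile the credentials module to bytecode.
    # db_connect.py is a hypothetical module exposing get_connection(),
    # which wraps psycopg2.connect(...) with the credentials filled in.
    # The bytecode must be built with the same Python version as the Colab runtime.
    import py_compile
    py_compile.compile("db_connect.py", cfile="db_connect.pyc")

    # In the Colab notebook: fetch the .pyc from Drive and import it.
    # YOUR_FILE_ID is a placeholder for the shared file's Drive ID.
    !pip install -q gdown
    !gdown https://drive.google.com/uc?id=YOUR_FILE_ID -O db_connect.pyc

    import db_connect
    conn = db_connect.get_connection()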

Related

Run selenium script that handles files remotely

I'm using Selenium to extract some data (as a JSON file). This JSON is the final output of the script.
I've managed to do it locally so far in two different ways:
With a local webdriver (for Chrome).
With a Docker container.
However, I need it to run from anywhere, on systems that have neither a webdriver nor Docker installed.
I have thought about deploying the script to Heroku and working from there, but I have no idea how to handle the output data in that situation.
I think cloud services are meant for exactly this situation.
A storage service (S3 on Amazon or Blob Storage on Azure) lets you access the data from anywhere, with almost no limit on space, through its API or SDKs.
You can also set access policies if the data should not be publicly accessible.
Since you have already packaged your script as a Docker container, you are ready to run it on almost any cloud provider (for example, push the image to Amazon ECR and run it on ECS/Fargate).
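As a rough illustration of the storage idea, here is a minimal sketch of pushing the script's JSON output to S3 with boto3. The bucket name, key, and file name are placeholders, and AWS credentials are assumed to be configured via environment variables or an IAM role.

    import boto3

    # Upload the scraper's final JSON output to an S3 bucket.
    # "my-scraper-output" and "result.json" are placeholder names.
    s3 = boto3.client("s3")
    s3.upload_file("result.json", "my-scraper-output", "scrapes/result.json")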

Fetching Google Sheet data from locally mounted drive via gspread

I am working on a Google Colab notebook that requires the user to mount Google Drive using the google.colab drive library. They then input relative paths on the local directory tree (/content/drive/... by default on that mount) to the files of interest for analysis. Now, I want to use a Google Sheet they can create as a configuration file. There is lots of info on how to authenticate gspread and fetch a sheet from its HTTPS URL, but I can't find any info on how to use gspread to access a .gsheet file that is already mounted on the local filesystem of the Colab runtime.
There are many tutorials using this flow: https://colab.research.google.com/notebooks/io.ipynb#scrollTo=yjrZQUrt6kKj , but I don't want to make the user authenticate twice (having already done so for the initial mount), and I don't want to make them input some files as relative paths and others as HTTPS URLs.
I had assumed this would be much like using gspread to work with Google Sheets on my own locally mounted drive, but I haven't seen that workflow anywhere either. Any pointers in that direction would help me out as well.
Thank you!
Instead of placing the .gsheet file on Colab's local filesystem, you can try storing it in the user's Drive and fetching it from there when needed. That way, as long as the kernel is running, you won't have to re-authenticate the user.
I also haven't found a way to authenticate into Colab from another device, so you might consider modifying your flow a bit.
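One possible shape for that flow, as a sketch only: authenticate once per session in Colab and reuse those credentials for gspread, opening the configuration sheet from the user's Drive by title (the sheet name "config" is a placeholder).

    # Authenticate once for the session, then reuse the credentials for gspread.
    from google.colab import auth
    auth.authenticate_user()

    import gspread
    from google.auth import default

    creds, _ = default()
    gc = gspread.authorize(creds)

    # Open the user's configuration sheet from their Drive by title
    # ("config" is a placeholder name) and read it as rows of dicts.
    sheet = gc.open("config")
    rows = sheet.sheet1.get_all_records()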

How to allow users to run jupyter notebooks stored in my database on my server?

I have a problem and need a hint on how to approach it.
I have a Django application in which some Jupyter notebooks are stored in my database. At the moment, users can download the notebooks and run them on their own computers.
I would like to add functionality where a user could run a notebook online. I was thinking of two solutions:
the first is to use some free online service, like Google Colab, but I haven't found any with an API I could send a file to from my database (maybe you know of one?);
the second is to run JupyterHub on my server. I have seen how to run JupyterHub remotely, but I don't know how to grant users access so they can run notebooks simultaneously without giving them access to the server itself, and how to do all of this from Django.
Do you have any hints that could help me get this functionality?
JupyterHub is a good approach if you trust your users. However, if you want to run untrusted code (like Google Colab does), you need sandboxing. In that case, you can use a Docker image to run notebooks. For example, mikebirdgeneau/jupyterlab. And there is a docker-compose file example: https://github.com/mikebirdgeneau/jupyterlab-docker/blob/master/docker-compose.yml
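If you go the JupyterHub route with Docker sandboxing, a minimal jupyterhub_config.py using DockerSpawner might look roughly like this. This is a sketch only, assuming the dockerspawner package is installed; the image name is simply the one mentioned above.

    # jupyterhub_config.py -- JupyterHub injects the `c` config object when loading this file.
    # Spawn each user's notebook server in its own Docker container.
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
    c.DockerSpawner.image = "mikebirdgeneau/jupyterlab"
    c.DockerSpawner.remove = True        # remove the container when the server stops
    c.Spawner.default_url = "/lab"       # open JupyterLab rather than the classic UI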

Hosting a PyTorch Environment on Azure with Jupyter Notebooks using Active Directory Authentication

I am looking for a way for non-technical employees within my company to access a Jupyter Notebook running in a Python environment (with PyTorch) hosted on Azure, with Active Directory authentication. I do not want to use the Azure Machine Learning tools. The goal is to set up an Azure VM with the Jupyter Notebook script, along with the dataset/files the script needs, stored in the VM's resources. A simple use case: a user clicks a link directing them to Active Directory authentication, and once logged in is taken directly to the Jupyter notebook containing a single script that the user can execute but not edit. The VM also needs dependencies such as PyTorch and scikit-learn. Does anybody know a possible solution to this?
I would use notebooks.azure.com; however, it currently has no functionality for privately sharing notebooks.

Running python script from Google Apps script

I have a Python script (on my local machine) that queries a Postgres database and updates a Google Sheet via the Sheets API. I want the script to run when the sheet is opened. I am aware of Google Apps Script, but I'm not quite sure how I can use it to achieve this.
Thanks
Google Apps Script runs on the server side, so it can't be used to run a local script.
You will need several changes. First, move the script to the cloud (see Google Compute Engine) and make sure you can access your database from there.
Then, in Apps Script, look at the onOpen trigger. From there you can use UrlFetchApp to call your Python server and start the work.
You could also add a custom "Refresh" menu to the sheet that calls your server, which is nicer than having to reload the sheet.
Note that onOpen runs server-side at Google, so it cannot access files on your local machine.
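The Python side that the onOpen or menu trigger would call could be as small as a single HTTP endpoint. Here is a rough sketch using Flask (one option among many); the route name, port, and the update_sheet_from_postgres helper are placeholders standing in for the existing script.

    from flask import Flask

    app = Flask(__name__)

    def update_sheet_from_postgres():
        # Placeholder for the existing logic: query Postgres and
        # write the results to the Google Sheet via the Sheets API.
        pass

    @app.route("/refresh", methods=["GET", "POST"])
    def refresh():
        # Called by Apps Script, e.g. UrlFetchApp.fetch("https://your-host/refresh").
        update_sheet_from_postgres()
        return "ok", 200

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)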
