Noob and beginner here. Just trying to learn the basics of GCP.
I have a series of Google Cloud Storage buckets that contain text files. I also have a VM instance that I've set up via GCP.
Now I'm trying to write some code to extract the data from the buckets and run the script from GCP's command line.
How can I extract data from GCS buckets in Python?
I think you can use the Listing Objects and Downloading Objects GCS methods with Python; this way you can get a list of the objects stored in your Cloud Storage buckets and then download them into your VM instance. Additionally, keep in mind that the service account you use to perform these tasks must have the required roles assigned in order to access your GCS buckets, and that you need to provide the credentials to your application, either through environment variables or by explicitly pointing to your service account key file in code.
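For example, here is a minimal sketch using the google-cloud-storage client library; the bucket name is a placeholder and the credentials are assumed to come from the environment or a key file:

# pip install google-cloud-storage
from google.cloud import storage

# Uses GOOGLE_APPLICATION_CREDENTIALS if it is set; alternatively:
# client = storage.Client.from_service_account_json("key.json")
client = storage.Client()

bucket_name = "my-text-bucket"  # placeholder bucket name

# List the objects in the bucket and download each one to the VM's disk
# (assumes flat object names without "/" in them)
for blob in client.list_blobs(bucket_name):
    print("Downloading", blob.name)
    blob.download_to_filename(blob.name)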
Once you have your code ready, you can simply execute your Python program with the python command. You can take a look at this link for instructions on installing Python in your new environment.
Related
I am new to Azure and Python and would like to ask some questions. Using Azure Python SDK:
Is there a direct way for me to get a list of snapshots of a managed OS disk ordered by creation date? As of now, I can only get all snapshots within a resource group or under a subscription.
How do I write Python SDK code to create snapshots asynchronously? I have an idea of using multi-threading, and I also want to be notified when a snapshot is successfully created.
My questions might be confusing since I have little experience with Azure and Python. Any help will be appreciated.
I'm using selenium in order to extract some data (as a json file). This json is the final output of the script.
I've managed to do it locally so far in two different ways:
With a local webdriver (for Chrome).
With a Docker container.
However, I need it to be accessible from anywhere, on systems that have neither webdrivers nor Docker installed.
I have thought about deploying the script to Heroku and working around that idea, but I have no idea how to handle the data in this situation.
I think that cloud services are meant for these situations.
A storage account (S3 in Amazon or Blob Storage in Azure) lets you access the data from anywhere, with almost no limit on space, through its API or the provider's SDKs.
Also, you can specify access policies if your data should not be publicly accessible.
Since you have already packaged your script into a Docker container, you are ready to run it on almost any cloud provider (for example, by pushing it to Amazon ECR and running it on ECS).
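For instance, here is a minimal sketch of uploading the JSON output to S3 with boto3; the bucket and key names are placeholders, and Azure Blob Storage has an equivalent SDK:

# pip install boto3
import boto3

# Credentials come from the environment / AWS config files
s3 = boto3.client("s3")

# Upload the JSON produced by the selenium script so it can be read
# from anywhere that can reach S3
s3.upload_file("output.json", "my-scraper-bucket", "results/output.json")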
I have a python script that uses a python api to fetch data from a data provider, then after manipulating the data, writes some of that data to Google Sheets (via the Google Sheets python api). So my workflow to update the data is to open the python file in VS Code, and run it from there, then switch to the spreadsheet to see the updated data.
Is there a way to call this python script from Google Sheets using Google Apps Script? If so, that would be more efficient; I could link the GAS script to a macro button on the spreadsheet.
Apps Script runs in the cloud on Google's servers rather than on your computer, and has no access to local resources such as Python scripts on your system.
To call a resource in the cloud, use the URL Fetch Service.
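On the Python side, one way to make the script reachable by URL Fetch is to wrap it in a small HTTP endpoint and host it somewhere Apps Script can reach; here is a minimal sketch assuming Flask, where run_update() stands in for your existing data-fetching logic:

# pip install flask
from flask import Flask, jsonify

app = Flask(__name__)

def run_update():
    # Placeholder for the existing data-fetching / Sheets-writing logic
    return "done"

@app.route("/update", methods=["POST"])
def update():
    result = run_update()
    return jsonify({"status": result})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

The button's Apps Script function would then call this endpoint with UrlFetchApp.fetch().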
Currently, I am using the following hack.
The Apps Script behind the gsheet command button writes parameters into a sheet called Parameters. The Python function on my local machine checks that Parameters sheet in the Google workbook every 5 seconds. If there are no parameters, it exits; if there are parameters, it executes the main code.
When the code is deployed on the service account, the polling portion remains inactive, and the Apps Script calls the Python code on the service account directly.
There are many reasons why I need to call a Python function on my LOCAL machine from the gsheet. One reason is that debugging is better on the local machine and cumbersome on the service account. Another reason is that certain files can be kept on local machines and we do not want to move these files to the workspace, yet the gsheet needs data from these files.
This is a hack, and I am looking for a better way than this "Python code keeps polling" method.
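For reference, one way the polling side described above could look, assuming gspread, a placeholder spreadsheet key, and main() standing in for the existing code:

# pip install gspread
import time
import gspread

def main(rows):
    # Placeholder for the existing main code
    print("Running with parameters:", rows)

gc = gspread.service_account(filename="service_account.json")
params_ws = gc.open_by_key("SPREADSHEET_KEY").worksheet("Parameters")

while True:
    rows = params_ws.get_all_values()
    if rows:
        params_ws.clear()   # consume the parameters so they run only once
        main(rows)
    time.sleep(5)           # check the Parameters sheet every 5 seconds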
I have a bunch of files in a Google Cloud Storage bucket, including some Python scripts and text files. I want to run the Python scripts on the text files. What would be the best way to go about doing this (App Engine, Compute Engine, Jupyter)? Thanks!
I recommend using Google Cloud Functions, which can be triggered automatically each time you upload a new file to Cloud Storage so the file can be processed. You can see the workflow for this in the Cloud Functions Storage tutorial.
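A minimal sketch of such a function (a 1st-gen Python Cloud Function triggered by a Cloud Storage finalize event; the processing itself is a placeholder):

# main.py for a function deployed with a Cloud Storage trigger
from google.cloud import storage

def process_text_file(event, context):
    """Runs each time a new object is uploaded to the trigger bucket."""
    client = storage.Client()
    blob = client.bucket(event["bucket"]).blob(event["name"])
    text = blob.download_as_text()
    print("Got %d characters from %s" % (len(text), event["name"]))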
You will need to at least download the Python scripts onto an environment first (be it GCE or GAE). To access the GCS text files, you can use the https://pypi.org/project/google-cloud-storage/ library. I don't think you can execute Python scripts from the object bucket itself.
If it is troublesome to change the Python code to read the text files from GCS, you will have to download everything into your environment (e.g. using gsutil).
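Putting the library approach together, here is a minimal sketch that pulls a script and a text file out of the bucket and then runs the script; all object and bucket names are placeholders:

import subprocess
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")  # placeholder bucket name

bucket.blob("scripts/process.py").download_to_filename("process.py")
bucket.blob("data/input.txt").download_to_filename("input.txt")

# Run the downloaded script against the downloaded text file
subprocess.run(["python", "process.py", "input.txt"], check=True)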
I'm attempting to deploy an Azure Worker Role written in Python to our account. It's in Python to include a specific library (moviepy) needed to accomplish the task at hand. However, moviepy expects filenames as strings in the arguments for its objects. In C#, there is a method buried in the Azure libraries
LocalResource localResource = RoleEnvironment.GetLocalResource(workerRoleStorageName);
that returns an object with the necessary path in a property. However, the documentation for the Python Azure library fails to mention any such method, or indeed, anything similar. I've attempted to bumble my way through the library (via IntelliSense) to find it, but have failed so far.
We have already created the Local Storage in the Cloud Service Deployment project VS creates. Does anyone have any experience with accessing Azure Local Storage from a Python Worker Role, or even a link addressing how to do so?
According to the official reference for the WorkerRole schema:
%ROLEROOT% is an environment variable maintained by Azure and it represents the root folder location for your role. The \%ROLEROOT%\Approot folder represents the application folder for your role.
So you can try to get the root location of the worker role on Azure via the code below.
import os

# %ROLEROOT% is set by the Azure runtime on the role instance
roleroot = os.environ.get('ROLEROOT')
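If you also need the application folder mentioned in the documentation above, you could build it from that value (the exact casing of the folder name on the VM is an assumption):

app_root = os.path.join(roleroot, 'approot')  # corresponds to %ROLEROOT%\Approot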