I have a bunch of files in a Google Cloud Storage bucket, including some Python scripts and text files. I want to run the Python scripts on the text files. What would be the best way to go about doing this (App Engine, Compute Engine, Jupyter)? Thanks!
I recommend using Google Cloud Function, that can be triggered automatically each time you upload new file to the Cloud Storage to process it. You can see workflow for this in Cloud Function Storage Tutorial
You will need to at least download the python scripts onto an environment first (be it GCE or GAE). To access the GCS text files, you can use https://pypi.org/project/google-cloud-storage/ library. I don't think you can execute python scripts from the object bucket itself.
If it is troublesome to change the python codes for reading the text files from GCS, you will have to download everything into your environment (e.g. using gsutil)
Related
I have a python program that ultimately writes a csv file using pandas:
df.to_csv(r'path\file.csv)
I was able to upload the files to the server via FileZilla and was also able to run the program on the EC2 server normally. However, I would now like to export a csv file to my local machine, but I don't know how to.
Do I have to write the csv file directly to a cloud drive (e.g. google drive via Pydrive)? What would be the easiest way?
You probably do not want to expose your computer to the dangers of the Internet. Therefore, it is better for your computer to 'pull' the data down, rather than allowing something on the Internet to 'push' something to it.
You could send the data to Amazon S3 or, if you are using a cloud-based storage service like Google Drive or Dropbox, use their SDK to upload the file to their storage.
Noob and beginner here. Just trying to learn the basics of GCP.
I have a series of Google Cloud Buckets that are text files. I also have a VM instance that I've set up via GCP.
Now, I'm trying to write some code to extract the data from Google buckets and run the script via GCP's command prompt.
How can I extract GCP buckets in Python
I think that you can use the Listing Objects and Downloading Objects GCS methods with Python; in this way, you will be able to get a list of the objects stored in your Cloud Storage buckets to then extract them into you VM instance. Additionally, keep in mind that it is important to verify that the service account that you implement to perform these tasks, has the required roles assigned in order to access to your GCS buckets, as well as provide the credentials to your application by using environment variables or explicitly pointing to your service account file in code.
Once you have your code ready, you can simply execute your Python program by using the
python command. You can take a look on this link to get the instructions to install Python in your new environment.
I am writing an python application which reads/parses a file of this this kind.
myalerts.ini,
value1=1
value2=3
value3=10
value4=15
Currently I store this file in local filesystem. If I need to change this file I need to have physical access to this computer.
I want to move this file to cloud so that I can change this file anywhere (another computer or from phone).
If this application is running on some machine I should be able to change this file on the cloud and the application which is running on another machine which I don't have physical access to will be able to read updated file.
Notes,
I am new to both python and aws.
I am currently running it on my local mac/linux and planning on deploying on aws.
There are many options!
Amazon S3: This is the simplest option. Each computer could download the file at regular intervals or just before they run a process. If the file is big, the app could instead check whether the file has changed before downloading.
Amazon Elastic File System (EFS): If your applications are running on multiple Amazon EC2 instances, EFS provides a shared file system that can be mounted on each instance.
Amazon DynamoDB: A NoSQL database instead of a file. Much faster than parsing a file, but less convenient for updating values — you'd need to write a program to update values, eg from the command-line.
AWS Systems Manager Parameter Store: A managed service for storing parameters. Applications (anywhere on the Internet) can request and update parameters. A great way to configure cloud-based application!
If you are looking for minimal change and you want it accessible from anywhere on the Internet, Amazon S3 is the easiest choice.
Whichever way you go, you'll use the boto3 AWS SDK for Python.
I currently have a Python program which reads a local file (containing a pickled database object) and saves to that file when it's done. I'd like to branch out and use this program on multiple computers accessing the same database, but I don't want to worry about synchronizing the local database files with each other, so I've been considering cloud storage options. Does anyone know how I might store a single data file in the cloud and interact with it using Python?
I've considered something like Google Cloud Platform and similar services, but those seem to be more server-oriented whereas I just need to access a single file on my own machines.
You could install gsutil and the boto library and use that.
Is it possible to create a new excel spreadsheet file and save it to an Amazon S3 bucket without first saving to a local filesystem?
For example, I have a Ruby on Rails web application which now generates Excel spreadsheets using the write_xlsx gem and saving it to the server's local file system. Internally, it looks like the gem is using Ruby's IO.copy_stream when it saves the spreadsheet. I'm not sure this will work if moving to Heroku and S3.
Has anyone done this before using Ruby or even Python?
I found this earlier question, Heroku + ephemeral filesystem + AWS S3. So, it would seem this is not possible using Heroku. Theoretically, it would be possible using a service which allows adding an Amazon EBS.
You have dedicated Ruby Gem to help you moving file to Amazon S3:
https://rubygems.org/gems/aws-s3
If you want more details about the implementation, here is the git repository. The documentation on the page is very complete, and explain how to move file to S3. Hope it helps.
Once your xls file is created, the library helps you create a S3Objects and store it into a Bucket (which you can also create with the library).
S3Object.store('keyOfYourData', open('nameOfExcelFile.xls'), 'bucketName')
If you want more choice, Amazon also delivered an official Gem for this purpose: https://rubygems.org/gems/aws-sdk