How to download an object from a Google Cloud Storage bucket?

I uploaded a CSV file, test.csv, to a Google Cloud Storage bucket. The resulting public_url looks like this:
https://storage.googleapis.com/mybucket/test-2017-04-11-025727.csv
Originally, test.csv has rows and columns containing numbers like this:
6,148,72,35,0,33.6,0.627,50,1
8,183,64,0,0,23.3,0.672,32,1
...
...
I uploaded the file by following the bookshelf tutorial (https://github.com/GoogleCloudPlatform/getting-started-python, step 6-pubsub). The uploaded file is saved with a timestamp added to its name.
Now I want to download the file that I uploaded to the bucket by using requests.
Here is what I've been working on. The original sample is at 6-pubsub/bookshelf/crud.py. Below is the script that I edited based on the original sample.
from machinelearning import get_model, oauth2, storage, tasks
from flask import Blueprint, current_app, redirect, render_template, request, session, url_for
import requests
import os
...
...
crud = Blueprint('crud', __name__)
save_folder = 'temp/'
def upload_csv_file(file):
    ...
    ...
    return public_url
...
...
@crud.route('/add', methods=['GET', 'POST'])
def add():
    data = request.form.to_dict(flat=True)
    # This will upload the file that I pushed from local directory to GCS
    if request.method == 'POST':
        csv_url = upload_csv_file(request.files.get('file'))
        if csv_url:
            data['csvUrl'] = csv_url
        # I think this is not working. This should download back the file and save it to a temporary folder inside current working directory
        response = requests.get(csv_url)
        if not os.path.exists(save_folder):
            os.makedirs(save_folder)
        with open(save_folder + 'testdata.csv', 'wb') as f:
            f.write(response.content)
    ...
    ...
I opened the temp folder and checked testdata.csv. Instead of the CSV data, the file contains this error:
<?xml version='1.0' encoding='UTF-8'?><Error><Code>AccessDenied</Code><Message>Access denied.</Message><Details>Anonymous users does not have storage.objects.get access to object mybucket/test-2017-04-11-025727.csv.</Details></Error>
I was hoping testdata.csv would have the same contents as test.csv, but it does not.
I have already rechecked my OAuth client ID and secret and the bucket ID in config.py, but the error is still there.
How do I solve this kind of error?
Thank you in advance.

I solved it. As @Brandon Yarbrough said, the object in the bucket was not publicly readable.
To make the bucket's objects public, I used this command, taken from https://github.com/GoogleCloudPlatform/gsutil/issues/419:
gsutil defacl set public-read gs://<bucket_name>
Note that defacl sets the bucket's default object ACL, which is applied to objects uploaded afterwards.
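The code in the question wrote whatever the server returned into testdata.csv, which is how the AccessDenied XML ended up in the file. Below is a minimal sketch of a guard against that. It assumes a response object that exposes status_code and content the way requests.Response does; the helper name save_response_body is hypothetical and not part of the bookshelf sample.

```python
import os


def save_response_body(response, dest_path):
    """Write response.content to dest_path only if the request succeeded.

    `response` is assumed to expose `status_code` and `content`,
    as requests.Response does. This helper is hypothetical.
    """
    if response.status_code != 200:
        # Surface the failure instead of saving the error document as CSV.
        raise RuntimeError(
            "download failed with HTTP status %d" % response.status_code
        )
    folder = os.path.dirname(dest_path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(dest_path, "wb") as f:
        f.write(response.content)
    return dest_path
```

With requests, calling response.raise_for_status() before writing the file serves the same purpose.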

Related

How to read a file using the SharePoint REST API in Python

I am an absolute beginner when it comes to working with REST APIs in Python. We have received a SharePoint URL that has multiple folders, with multiple files inside those folders, in the 'document' section. I have been provided an 'app_id' and a 'secret_token'.
I am trying to access the .csv files, read them as dataframes, and perform operations on them.
The code for the operations is ready, since I downloaded the .csv files and worked on them locally, but I need help connecting to SharePoint from Python so that I don't have to download such heavy files ever again.
I know there have been multiple questions about this on Stack Overflow already, but none got me to where I want to be.
I did the following and I am unsure of what to do next:
import json
from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext
from office365.runtime.http.request_options import RequestOptions
site_url = "https://<company-name>.sharepoint.com"
ctx = ClientContext(site_url).with_credentials(UserCredential("{app_id}", "{secret_token}"))
For site_url above, should I use the whole URL, or is it enough to stop at ####.com?
This is what I have so far. Next, I want to read files from the respective folders and convert them into dataframes. The files will always be in .csv format.
The example hierarchy of the folders is as follows:
Documents --> Folder A, Folder B
Folder A --> a1.csv, a2.csv
Folder B --> b1.csv, b2.csv
I should be able to move to whichever folder I want and read the files based on my requirement.
Thanks for the help.

This works for me, using a SharePoint App Identity with an associated client ID and client secret.
First, I demonstrate authenticating and reading a specific file, then getting a list of files from a folder and reading the first one.
import pandas as pd
import json
import io
from office365.sharepoint.client_context import ClientCredential
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
# Authentication (shown for a 'modern teams site', but I think it should work for a company.sharepoint.com site):
site = "https://<myteams.companyname.com>/sites/<site name>/<sub-site name>"
# Read credentials from a JSON configuration file:
spo_conf = json.load(open(r"conf\spo.conf", "r"))
client_credentials = ClientCredential(spo_conf["RMAppID"]["clientId"], spo_conf["RMAppID"]["clientSecret"])
ctx = ClientContext(site).with_credentials(client_credentials)
# Read a specific CSV file into a dataframe:
folder_relative_url = "/sites/<site name>/<sub site>/<Library Name>/<Folder Name>"
filename = "MyFileName.csv"
response = File.open_binary(ctx, "/".join([folder_relative_url, filename]))
df = pd.read_csv(io.BytesIO(response.content))
# Get a list of file objects from a folder and read one into a DataFrame:
def getFolderContents(relativeUrl):
    contents = []
    library = ctx.web.get_list(relativeUrl)
    all_items = library.items.filter("FSObjType eq 0").expand(["File"]).get().execute_query()
    for item in all_items:  # type: ListItem
        cur_file = item.file
        contents.append(cur_file)
    return contents
fldrContents = getFolderContents('/sites/<site name>/<sub site>/<Library Name>')
response2 = File.open_binary(ctx, fldrContents[0].serverRelativeUrl)
df2 = pd.read_csv(io.BytesIO(response2.content))
Some References:
Related SO thread.
Office365 library github site.
Getting a list of contents in a doc library folder.
Additional notes following up on comments:
The site path does not include the full URL of the site home page (ending in .aspx). It just ends with the name of the site (or sub-site, if relevant to your case).
You don't need a configuration file to store the authentication credentials for the SharePoint application identity. You could just replace spo_conf["RMAppID"]["clientId"] with the SharePoint-generated client ID, and do the same for the client secret. Still, here is a simple example of what the text of such a JSON file could look like (the top-level key, "MyAppName" here, plays the role of "RMAppID" in the code above):
{
    "MyAppName": {
        "clientId": "my-client-id",
        "clientSecret": "my-client-secret",
        "title": "name_for_application"
    }
}
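As a sketch of how a configuration file with this shape could be read, the helper below returns the client ID and client secret for one application entry. The function name load_client_credentials and its parameters are hypothetical; only the key names clientId and clientSecret come from the example above.

```python
import json


def load_client_credentials(conf_path, app_key):
    """Return (client_id, client_secret) for one app entry in a JSON config.

    The file is assumed to map an application name to an object holding
    "clientId" and "clientSecret", as in the example above.
    """
    with open(conf_path, "r", encoding="utf-8") as f:
        conf = json.load(f)
    entry = conf[app_key]  # raises KeyError if the entry is missing
    return entry["clientId"], entry["clientSecret"]
```

The returned pair can then be passed to ClientCredential(client_id, client_secret) as in the code above.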

Downloading multiple files in Flask

I am trying to provide the client side the option of downloading some files in Flask. There can be multiple files or a single file available for the user/client to download.
However I am not able to understand how to provide the user the option to download multiple files.
Here is what I have tried so far:
@app.route('/download_files')
def download():
    count=0
    download_list=[]
    for path in pathlib.Path("dir1/dir2").iterdir():
        if path.is_file():
            for i in names:
                if pathlib.PurePosixPath(path).stem == i:
                    count += 1
                    download_list.append(path)
    return send_file(download_list, as_attachment=True, mimetype="text/plain", download_name="Downloaded Files", attachment_filename="Generated Files")
This does not work properly even with a single file. The files I am trying to download are text files with the .sql extension.
Will I somehow have to zip multiple files and then provide the download option? Please advise on my available options.

In order to offer several files together as a single download, your only option is to pack them into an archive.
In my example, all files that match the specified pattern are collected and packed into a zip archive. The archive is written to memory and sent by the server.
from flask import Flask
from flask import send_file
from glob import glob
from io import BytesIO
from zipfile import ZipFile
import os
app = Flask(__name__)
@app.route('/download')
def download():
    target = 'dir1/dir2'
    stream = BytesIO()
    with ZipFile(stream, 'w') as zf:
        for file in glob(os.path.join(target, '*.sql')):
            zf.write(file, os.path.basename(file))
    stream.seek(0)
    return send_file(
        stream,
        as_attachment=True,
        download_name='archive.zip'
    )
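The answer above packs every *.sql file in the directory, whereas the question selects files whose stem appears in a list called names. A minimal sketch of that variant, using only the standard library, could look like the following; the function name zip_selected_files is hypothetical.

```python
import os
from io import BytesIO
from zipfile import ZipFile


def zip_selected_files(directory, names):
    """Pack the regular files in `directory` whose stem is in `names`.

    Returns an in-memory BytesIO positioned at the start of the archive.
    """
    wanted = set(names)
    stream = BytesIO()
    with ZipFile(stream, "w") as zf:
        for entry in sorted(os.listdir(directory)):
            path = os.path.join(directory, entry)
            stem = os.path.splitext(entry)[0]
            if os.path.isfile(path) and stem in wanted:
                # Store the file under its bare name, without the directory.
                zf.write(path, entry)
    stream.seek(0)
    return stream
```

The returned stream can be handed to send_file in the same way as in the answer above, with as_attachment=True and a download_name such as 'archive.zip'.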

You haven't provided a code sample showing where you actually get these files or this file. A minimal working example would look like this:
from flask import Flask, request
app = Flask(__name__)
@app.route('/download_files', methods=['POST'])
def download():
    file = request.files['file'] # for one file
    files = request.files.getlist("file[]") # if there're multiple files provided
    return 'OK' # a Flask view has to return a response
if __name__ == "__main__":
    app.run()
After that, your file variable will be a werkzeug.datastructures.FileStorage object, and the files variable will be a list of such objects.
To download all these files, you can check this question.

How to upload a jpg file and save it in a Flask-RESTPlus API?

I use the Flask-RESTPlus API. I want to upload a jpg file, rename it, save it to a files location, and then save its URL. I searched and found this code at https://flask-restplus.readthedocs.io/en/stable/parsing.html#file-upload, but I don't understand the do_something_with_file statement in it. Could you help me?
from werkzeug.datastructures import FileStorage
upload_parser = api.parser()
upload_parser.add_argument('file', location='files',
                           type=FileStorage, required=True)
@api.route('/upload/')
@api.expect(upload_parser)
class Upload(Resource):
    def post(self):
        args = upload_parser.parse_args()
        uploaded_file = args['file'] # This is FileStorage instance
        url = do_something_with_file(uploaded_file)
        return {'url': url}, 201

You can refer to the official Flask documentation on uploading files ("Uploading Files").
Basically, all you need is the FileStorage.save() method to save the uploaded file.
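As a rough idea of what do_something_with_file could do, the sketch below checks the extension, saves the upload under a new name, and returns a URL path. It assumes the uploaded object behaves like werkzeug's FileStorage, with a filename attribute and a save(destination) method. The names save_jpg_upload, upload_dir, new_stem, and url_prefix are hypothetical, and new_stem is expected to be chosen by the server rather than taken from user input.

```python
import os


def save_jpg_upload(uploaded_file, upload_dir, new_stem, url_prefix="/files/"):
    """Save an uploaded .jpg under a new name and return a URL path for it.

    `uploaded_file` is assumed to offer `filename` and `save(destination)`,
    as werkzeug's FileStorage does. All other names are hypothetical.
    """
    ext = os.path.splitext(uploaded_file.filename or "")[1].lower()
    if ext not in (".jpg", ".jpeg"):
        raise ValueError("only .jpg/.jpeg uploads are accepted")
    os.makedirs(upload_dir, exist_ok=True)
    new_name = new_stem + ext
    # FileStorage.save() writes the uploaded data to the given path.
    uploaded_file.save(os.path.join(upload_dir, new_name))
    return url_prefix + new_name
```

Inside the post method, the call would be something like url = save_jpg_upload(uploaded_file, 'uploads', 'some-new-name'), where both string arguments are placeholders.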

Download file from root directory using flask

I am generating an xlsx file that I would like to download once it is created. The file is created using a module called 'xlsxwriter'. It saves the file in my root directory; however, I can't figure out how to access it via Flask so that it starts a download.
This is how I create the file:
workbook = xlsxwriter.Workbook('images.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write(..someData..)
It saves the file in my root directory.
Now I am trying to access it in order to download it via flask:
app = Flask(__name__, static_url_path='')
@app.route('/download')
def download():
    # do some stuff
    return Response(
        app.send_static_file('images.xlsx'),
        mimetype="xlsx",
        headers={"Content-disposition":
                 "attachment; filename=images.xlsx"})
However, I get a 404 Error. Is using send_static_file the correct way to go here?

I found a solution using send_file instead, providing the path to my file like so:
from flask import send_file
return send_file(pathToMyFile, as_attachment=True)

Excel download with Flask-RestPlus?

How do I implement an API endpoint to download an Excel file using Flask-RestPlus?
Previously I had implemented a similar function using Pyramid. However, that method didn't work here.
Here is the old code snippet:
workBook = openpyxl.Workbook()
fileName = 'Report.xls'
response = Response(content_type='application/vnd.ms-excel',
content_disposition='attachment; filename=%s' % fileName)
workBook.save(response)
return response
Thanks for the help.

send_from_directory provides a secure way to quickly expose static files from an upload folder or something similar when using Flask-RestPlus:
from flask import send_from_directory
import os
@api.route('/download')
class Download(Resource):
    def get(self):
        fileName = 'Report.xls'
        return send_from_directory(os.getcwd(), fileName, as_attachment=True)
I have assumed the file is in the current working directory. The path to the downloaded file can be adjusted accordingly.
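To illustrate why a directory-scoped lookup is safer than joining paths by hand, the sketch below rejects file names that would resolve outside the base directory. It uses only the standard library and is not Flask's implementation; the function name resolve_inside is hypothetical.

```python
import os


def resolve_inside(base_dir, file_name):
    """Return the absolute path of file_name, which must stay inside base_dir.

    Raises ValueError for names such as '../secret.txt' or absolute paths
    that point outside base_dir.
    """
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, file_name))
    # The candidate is inside base only if base is their common ancestor.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory: %r" % file_name)
    return candidate
```

A request for 'Report.xls' resolves to a path under the base directory, while a name that climbs out of it is refused before any file is opened.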
