How to overwrite a file uploading to SharePoint with Microsoft Graph - python

I have a Python script that uploads a file to SharePoint using Microsoft Graph, but it returns a 500 status code when I try to upload the same file twice.
Here is the code for the function that uploads the file:
def upload_file(session,filename,driveid,folder):
    """Upload a file to Sharepoint.
    """
    fname_only = os.path.basename(filename)
    # create the Graph endpoint to be used
    endpoint = f'drives/{driveid}/root:/{folder}/{fname_only}:/createUploadSession'
    start_response = session.put(api_endpoint(endpoint))
    json_response = start_response.json()
    upload_url = json_response["uploadUrl"]
    # upload in chunks
    filesize = os.path.getsize(filename)
    with open(filename, 'rb') as fhandle:
        start_byte = 0
        while True:
            file_content = fhandle.read(10*1024*1024)
            data_length = len(file_content)
            if data_length <= 0:
                break
            end_byte = start_byte + data_length - 1
            crange = "bytes "+str(start_byte)+"-"+str(end_byte)+"/"+str(filesize)
            print(crange)
            chunk_response = session.put(upload_url,
                                         headers={"Content-Length": str(data_length),"Content-Range": crange},
                                         data=file_content)
            if not chunk_response.ok:
                print(f'<Response [{chunk_response.status_code}]>')
                pprint.pprint(chunk_response.json()) # show error message
                break
            start_byte = end_byte + 1
    return chunk_response
Here is the output for the first run:
bytes 0-10485759/102815295
bytes 10485760-20971519/102815295
bytes 20971520-31457279/102815295
bytes 31457280-41943039/102815295
bytes 41943040-52428799/102815295
bytes 52428800-62914559/102815295
bytes 62914560-73400319/102815295
bytes 73400320-83886079/102815295
bytes 83886080-94371839/102815295
bytes 94371840-102815294/102815295
Here is the output for the second run:
bytes 0-10485759/102815295
bytes 10485760-20971519/102815295
bytes 20971520-31457279/102815295
bytes 31457280-41943039/102815295
bytes 41943040-52428799/102815295
bytes 52428800-62914559/102815295
bytes 62914560-73400319/102815295
bytes 73400320-83886079/102815295
bytes 83886080-94371839/102815295
bytes 94371840-102815294/102815295
<Response [500]>
{'error': {'code': 'generalException',
'message': 'An unspecified error has occurred.'}}
I guess I could figure out how to delete the file before I overwrite it but it would be nice to preserve history since Sharepoint keeps versions.
Thanks for any help on this.
Bobby
p.s. I have been hacking the code in https://github.com/microsoftgraph/python-sample-console-app to get it to upload a file to SharePoint so some of the code in the function is from Microsoft's sample application.

For anyone ending up here whilst looking into file name conflict issues, according to the Microsoft article below, if there is a file name collision and you have not correctly specified that it should be replaced, the final byte range upload will fail in the way OP is describing. Hopefully this helps someone.
Handle upload errors
When the last byte range of a file is uploaded, it is possible for an error to occur. This can be due to a name conflict or quota limitation being exceeded. The upload session will be preserved until the expiration time, which allows your app to recover the upload by explicitly committing the upload session.
From: https://learn.microsoft.com/en-us/onedrive/developer/rest-api/api/driveitem_createuploadsession?view=odsp-graph-online#create-an-upload-session
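The same documentation page describes an `@microsoft.graph.conflictBehavior` property that can be sent in the body of the createUploadSession request. A minimal sketch of building such a request, assuming that property and a POST request: the helper name `build_upload_session_request`, the drive ID and the folder are hypothetical, and the commented-out call simply reuses the `session` and `api_endpoint` names from the question without having been tested against a real tenant.

```python
import os


def build_upload_session_request(driveid, folder, filename):
    """Return (endpoint, body) for a createUploadSession call that overwrites.

    Hypothetical helper: it only builds the request pieces and sends nothing.
    """
    fname_only = os.path.basename(filename)
    endpoint = f"drives/{driveid}/root:/{folder}/{fname_only}:/createUploadSession"
    # Ask Graph to replace an existing item with the same name
    body = {"item": {"@microsoft.graph.conflictBehavior": "replace"}}
    return endpoint, body


endpoint, body = build_upload_session_request("DRIVE_ID", "Reports", "/tmp/data.csv")
# Assumed usage with the question's helpers (not executed here):
# start_response = session.post(api_endpoint(endpoint), json=body)
```

The chunked upload loop from the question would then run unchanged against the returned uploadUrl.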

Related

How to download large file from Cloud Run in Django

I have a Django project on Cloud Run. Downloading a small file works with the view below.
def download_view(request,pk):
    file_path = f'media/{pk}.mp3'
    name = f'{pk}.mp3'
    with open(file_path, 'rb') as f:
        response = HttpResponse(f.read(), content_type='audio/wav')
    response['Content-Disposition'] = f'attachment; filename={name}'
    return response
It works fine. However, when I download a larger file (50 MB), I get an error.
The Cloud Run log looks like this. I couldn't find any traceback in it.
2021-05-06 12:00:35.668 JST GET 500 606 B 66 ms Chrome 72 https://***/download/mp3/2500762/
2021-05-06 11:49:49.037 JST GET 500 606 B 61 ms Chrome 72 https://***/download/mp3/2500645/
I'm not sure whether the following log entry is related to the download error.
2021-05-06 16:48:32.570 JST Response size was too large. Please consider reducing response size.
I think this is an upload file size error, so it is not related to the download error.
When I run Django locally and download the same 50 MB file, it works, so I think the download error is related to Cloud Run. Cloud Run stops after the request/response cycle, so I suspect it is stopped while I am still downloading the file.
I don't know how to solve this download error. If you have a solution, please help me!
The Cloud Run HTTP request/response size is limited to 32 MB. Use multipart/form-data to send your big file in chunks rather than the whole file at once.
Thank you @guillaume blaquiere! I solved the download error. I am posting my code for others.
from django.http import StreamingHttpResponse

def _file_iterator(file, chunk_size=512):
    # yield the file piece by piece instead of reading it all into memory
    with open(file, 'rb') as f:
        while True:
            c = f.read(chunk_size)
            if c:
                yield c
            else:
                break

def download_view(request,pk):
    file_path = f'media/{pk}.mp3'
    file_name = f'{pk}.mp3'
    response = StreamingHttpResponse(_file_iterator(file_path))
    response['Content-Type'] = 'audio/mpeg'
    response['Content-Disposition'] = f'attachment;filename="{file_name}"'
    return response
I think StreamingHttpResponse is the key to this problem. It returns the big file in chunks, so the response does not exceed Cloud Run's limit.
When I used multipart/form-data as the Content-Type, I could download the file, but it couldn't be opened on a smartphone because no application could be selected for it, and on a PC the file didn't show an audio file icon. The exact content type must be set.
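The chunking idea can be checked without Django. A minimal sketch using only the standard library: the name `file_chunks` and the 64 KiB chunk size are illustrative choices, not part of the original solution, and the demo file is a throwaway temporary file.

```python
import os
import tempfile


def file_chunks(path, chunk_size=64 * 1024):
    """Yield the file at `path` in pieces of at most `chunk_size` bytes."""
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file
        yield from iter(lambda: f.read(chunk_size), b"")


# Demonstrate on a throwaway 200,000-byte file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(200_000))
    demo_path = tmp.name

chunks = list(file_chunks(demo_path))
with open(demo_path, "rb") as f:
    original = f.read()
os.remove(demo_path)
```

Because only one chunk is held in memory at a time, a generator like this could be passed to StreamingHttpResponse in place of `_file_iterator`.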

save video in python from bytes

I have two microservices: A is written in Java and sends a video as a byte[] to B, which is written in Python.
B does some processing on the video using OpenCV, with this command in particular:
stream = cv2.VideoCapture(video)
The command works fine when given a stream or a ready local video file, but when I give it my request.data, which Java is sending, it says:
TypeError: an integer is required (got type bytes)
So my question is:
Is there any way to save a video to disk from the bytes I'm receiving from Java, or can I just give the bytes to cv2.VideoCapture?
Thank you.
Just a slight improvement to your own solution: using the with context-manager closes the file for you even if something unexpected happens:
FILE_OUTPUT = 'output.avi'
# Checks and deletes the output file
# You can't have an existing file or it will throw an error
if os.path.isfile(FILE_OUTPUT):
    os.remove(FILE_OUTPUT)
# opens the file 'output.avi' which is accessible as 'out_file'
with open(FILE_OUTPUT, "wb") as out_file: # open for [w]riting as [b]inary
    out_file.write(request.data)
I solved my problem like this:
FILE_OUTPUT = 'output.avi'
# Checks and deletes the output file
# You can't have an existing file or it will throw an error
if os.path.isfile(FILE_OUTPUT):
    os.remove(FILE_OUTPUT)
out_file = open(FILE_OUTPUT, "wb") # open for [w]riting as [b]inary
out_file.write(request.data)
out_file.close()
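cv2.VideoCapture expects a file name or a device index rather than raw bytes, which is why the bytes have to reach disk first. A minimal sketch that writes them to a uniquely named temporary file instead of a fixed output.avi: the helper name `save_video_bytes`, the `.avi` suffix and the sample payload are assumptions, and the OpenCV call is left as a comment so the snippet runs without OpenCV installed.

```python
import os
import tempfile


def save_video_bytes(data, suffix=".avi"):
    """Write raw video bytes to a new temporary file and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so another library can open it by name afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as out_file:
        out_file.write(data)
        return out_file.name


video_path = save_video_bytes(b"\x00\x01fake video payload")
# stream = cv2.VideoCapture(video_path)  # assumed usage, requires OpenCV
with open(video_path, "rb") as f:
    saved = f.read()
os.remove(video_path)  # clean up once processing is done
```

A unique temporary name also avoids two concurrent requests overwriting the same output file.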

Python throwing error in reading JSON file

I am writing a function in a Python script which will read a JSON file and print it.
The script reads as:
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        status = json.loads(statusFile.read())
        statusFile.close()
        print(status)
        link_data = json.load[status]
        link = link_data["link"]
        link_ID = link_data["link_id"]
        print(link)
        print(link_ID)
I am getting this error:
link_data = json.load[status]
TypeError: 'function' object is not subscriptable
What is the issue?
The content of ad_link.json is below. The file I receive is saved in this manner:
"{\"link\": \"https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4\", \"link_id\": \"ad_Bprise_ID_Adidas_0000\"}"
The function that receives and writes the JSON file:
def on_message2(client, userdata, message):
    print("New MQTT message received. File %s line %d" % (filename, cf.f_lineno))
    print("message received?/'/'/' ", str(message.payload.decode("utf-8")), \
          "topic", message.topic, "retained ", message.retain)
    global links
    links = str(message.payload.decode("utf-8"))
    logging.debug("Got new mqtt message as %s" % message.payload.decode("utf-8"))
    status_data = str(message.payload.decode("utf-8"))
    print(status_data)
    print("in function on_message2")
    with open("ad_link.json", "w") as outFile:
        json.dump(status_data, outFile)
    time.sleep(3)
The output of this function
New MQTT message received. File C:/Users/arunav.sahay/PycharmProjects/MediaPlayer/venv/Include/mediaplayer_db_mqtt.py line 358
message received?/'/'/' {"link": "https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4", "link_id": "ad_Bprise_ID_Adidas_0000"} topic ios_push retained 1
{"link": "https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4", "link_id": "ad_Bprise_ID_Adidas_0000"}
EDIT
I found out that the error is in the JSON format. I am receiving the JSON data in the wrong format. How do I correct that?
There are two major errors here:
You are trying to use the json.load function as a sequence or dictionary mapping. It's a function, you can only call it; you'd use json.load(file_object). Since status is actually a string, you'd have to use json.loads(status) to actually decode a JSON document stored in a string.
In on_message2, you encoded JSON data to JSON again. Now you have to decode it twice. That's an unfortunate waste of computer resources.
In the on_message2 function, the message.payload object is a bytes-value containing a UTF-8 encoded JSON document, if you want to write that to a file, don't decode to text, and don't encode the text to JSON again. Just write those bytes directly to a file:
def on_message2(client, userdata, message):
    logging.debug("Got new mqtt message as %s" % message.payload.decode("utf-8"))
    with open("ad_link.json", "wb") as out:
        out.write(message.payload)
Note the 'wb' mode; that opens a file in binary mode for writing, at which point you can write the bytes object to that file.
When you open a file without a b in the mode, you open a file in text mode, and when you write a text string to that file object, Python encodes that text to bytes for you. The default encoding depends on your OS settings, so without an explicit encoding argument to open() you can't even be certain that you end up with UTF-8 JSON bytes again! Since you already have a bytes value, there is no need to manually decode then have Python encode again, so use a binary file object and avoid that decode / encode dance here too.
You can now load the file contents with json.load() without having to decode again:
def main(conn):
    with open('ad_link.json', 'rb') as status_file:
        status = json.load(status_file)
    link = status["link"]
    link_id = status["link_id"]
Note that I opened the file as binary again. As of Python 3.6, the json.load() function can work both with binary files and text files, and for binary files it can auto-detect if the JSON data was encoded as UTF-8, UTF-16 or UTF-32.
If you are using Python 3.5 or earlier, open the file as text, but do explicitly set the encoding to UTF-8:
def main(conn):
    with open('ad_link.json', 'r', encoding='utf-8') as status_file:
        status = json.load(status_file)
    link = status["link"]
    link_id = status["link_id"]
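The double-encoding problem described in this answer can be reproduced with the standard library alone. A small sketch, using a shortened stand-in payload rather than the real MQTT message:

```python
import json

# Stand-in for message.payload: UTF-8 bytes that already hold a JSON document
payload = b'{"link": "https://example.com/ad.mp4", "link_id": "ad_0000"}'

# What on_message2 did: decode to text, then JSON-encode that text again
double_encoded = json.dumps(payload.decode("utf-8"))

# The first decode only gives back the original JSON text as a str...
once = json.loads(double_encoded)
# ...and a second decode is needed to reach the dictionary
twice = json.loads(once)

# Decoding the untouched payload works in one step (bytes accepted since Python 3.6)
direct = json.loads(payload)
```

Here `double_encoded` has the same quoted, backslash-escaped shape as the ad_link.json content shown in the question.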
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        link_data = json.loads(statusFile.read())
        link = link_data["link"]
        link_ID = link_data["link_id"]
        print(link)
        print(link_ID)
Replace loads with load when dealing with a file object that supports read-like operations:
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        status = json.load(statusFile)
        status=json.loads(status)
        link = status["link"]
        link_ID = status["link_id"]
        print(link)
        print(link_ID)

Exporting text/plain from Google Drive

I'm trying to export a Google Doc as text. I've tried two approaches, neither's working.
Exporting the contents, I get the contents as a byte object which I haven't been able to convert to a simple string:
req = service.files().export_media(fileId=file_id,mimeType='text/plain')
fh = io.BytesIO()
download = MediaIoBaseDownload(fh, req)
done = False
while done is False:
    status, done = download.next_chunk()
return fh.getvalue()
I then get variants of codec errors
return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 0: character maps to <undefined>
or these:
TypeError: write() argument must be str, not bytes
So that's a type conversion problem, I guess, and if I can solve that it'd work.
BUT, I'd much rather just use the exportLinks property to download the file as text/plain. Problem is, that's simply missing:
file = service.files().get(fileId=id).execute()
pprint.pprint(file)
{'id': 'xxxxxxxxxx',
'kind': 'drive#file',
'mimeType': 'application/vnd.google-apps.document',
'name': 'export'}
file['exportLinks'], unsurprisingly, gives a KeyError:
KeyError: 'exportLinks'
I've relaxed the scope so it's now 'https://www.googleapis.com/auth/drive', so that shouldn't be the problem.
What am I missing?
The Drive platform allows developers to open, import, and export native Google Docs types such as Google Spreadsheets, Presentations, Documents, and Drawings. For instance, if your application is configured to open PDF files, then because Google Documents are exportable to PDF, users will be able to use your application to open those documents.
The app can download the converted file content with the files.export method.
file_id = '1ZdR3L3qP4Bkq8noWLJHSr_iBau0DNT4Kli4SxNc2YEo'
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
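For the text/plain case in the question, the remaining step is turning the downloaded bytes into a string. A minimal sketch, assuming the export is UTF-8 with a leading byte order mark, as the '\ufeff' in the error message suggests: the sample bytes stand in for `fh.getvalue()`, and the output path is a hypothetical temporary file.

```python
import os
import tempfile

# Stand-in for fh.getvalue(): UTF-8 bytes starting with a byte order mark
exported = b"\xef\xbb\xbfHello from a Google Doc\n"

with_bom = exported.decode("utf-8")    # keeps '\ufeff' as the first character
text = exported.decode("utf-8-sig")    # same decoding, but drops a leading BOM

# An explicit encoding avoids the platform default codec (the 'charmap' error)
out_path = os.path.join(tempfile.mkdtemp(), "export.txt")
with open(out_path, "w", encoding="utf-8") as out:
    out.write(text)
```

Decoding first also addresses the "write() argument must be str, not bytes" error, since the file is opened in text mode and receives a str.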

Cherrypy base64 image encoding not working as expected

The Problem:
I have been playing around with CherryPy for the past couple of days, but I'm still having some trouble getting images to work the way I would expect them to. I can save an uploaded image as a jpg without issue, but I can't convert it to a base64 image properly. Here's the simple server I wrote:
server.py
#server.py
import os
import cherrypy #Import framework
frameNumber = 1
lastFrame = ''
lastFrameBase64 = ''
class Root (object):
    def upload(self, myFile, username, password):
        global frameNumber
        global lastFrameBase64
        global lastFrame
        size = 0
        lastFrameBase64 = ''
        lastFrame = ''
        while True:
            data = myFile.file.read(8192)
            if not data:
                break
            size += len(data)
            lastFrame += data
            lastFrameBase64 += data.encode('base64').replace('\n','')
        f = open('/Users/brian/Files/git-repos/learning-cherrypy/tmp_image/lastframe.jpg','w')
        f.write(lastFrame)
        f.close()
        f = open('/Users/brian/Files/git-repos/learning-cherrypy/tmp_image/lastframe.txt','w')
        f.write(lastFrameBase64)
        f.close()
        cherrypy.response.headers['Content-Type'] = 'application/json'
        print "Image received!"
        frameNumber = frameNumber + 1
        out = "{\"status\":\"%s\"}"
        return out % ( "ok" )
    upload.exposed = True
cherrypy.config.update({'server.socket_host': '192.168.1.14',
                        'server.socket_port': 8080,
                       })
if __name__ == '__main__':
    # CherryPy always starts with app.root when trying to map request URIs
    # to objects, so we need to mount a request handler root. A request
    # to '/' will be mapped to HelloWorld().index().
    cherrypy.quickstart(Root())
When I view the lastframe.jpg file, the image renders perfectly. However, when I take the text string found in lastframe.txt and prepend the proper data-uri identifier data:image/jpeg;base64, to the base64 string, I get a broken image icon in the webpage I'm trying to show the image in.
<!DOCTYPE>
<html>
<head>
<title>Title</title>
</head>
<body>
<img src="data:image/jpeg;base64,/9....." >
</body>
</html>
I have tried using another script to convert my already-saved jpg image into a data-uri and it works. I'm not sure what I'm doing wrong in the server example b/c this code gives me a string that works as a data-uri:
Working Conversion
jpgtxt = open('tmp_image/lastframe.jpg','rb').read().encode('base64').replace('\n','')
f = open("jpg1_b64.txt", "w")
f.write(jpgtxt)
f.close()
So basically it comes down to how the data variable taken from myFile.file.read(8192) differs from the data read with open('tmp_image/lastframe.jpg','rb'). I read that the rb mode in the open method tells Python to read the file as a binary file rather than as a string.
Summary
In summary, I don't know enough about Python or the CherryPy framework to see how the actual data is stored when reading from the myFile variable and how the data is stored when reading from the output of the open() method. Thanks for taking the time to look at this problem.
Base64 works by taking every 3 bytes of input and producing 4 characters. But what happens when the input isn't a multiple of 3 bytes? There's special processing for that, appending = signs to the end. But that's only supposed to happen at the end of the file, not in the middle. Since you're reading 8192 bytes at a time and encoding them, and 8192 is not a multiple of 3, you're generating corrupt output.
Try reading 8190 bytes instead, or read and encode the entire file at once.
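The effect of the chunk size can be seen with the standard base64 module. A short sketch in Python 3 (the question's server is Python 2); the sample data and the helper name `encode_in_chunks` are made up for illustration:

```python
import base64

data = bytes(range(256)) * 100  # 25,600 bytes of sample input


def encode_in_chunks(payload, chunk_size):
    """Base64-encode `payload` chunk by chunk and join the pieces."""
    pieces = []
    for start in range(0, len(payload), chunk_size):
        pieces.append(base64.b64encode(payload[start:start + chunk_size]))
    return b"".join(pieces)


whole = base64.b64encode(data)
good = encode_in_chunks(data, 8190)  # multiple of 3: no padding mid-stream
bad = encode_in_chunks(data, 8192)   # not a multiple of 3: '=' lands mid-stream
```

With 8190-byte chunks the joined result matches the one-shot encoding, while 8192-byte chunks insert padding characters between chunks and produce a different, invalid string.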
