Editing Google docs with drive API - python

(For clarity, this post relates to the difference between the Google Documents List API and Google Drive API on Google App Engine with Python)
With the [now deprecated] Documents List API I was able to edit Google Documents by exporting as HTML, modifying the HTML and then re-uploading, either as a new document or as a modification to the original. This was useful for things like generating PDF documents from a template. I have been trying to replicate this functionality with the new Drive API (v2), but so far without success.
Here is what I have come up with:
import StringIO
from apiclient.discovery import build
from apiclient.http import MediaIoBaseUpload

http = None  # placeholder: substitute an authenticated http client
drive_service = build('drive', 'v2', http=http)
# get the file ...
file = drive_service.files().get(fileId=FILE_ID).execute()
# download the html ...
url = file['exportLinks']['text/html']
response, content = http.request(url)
# edit html
html = content.replace('foo', 'bar')
# create new file with modified html ...
body = {'title': 'Modified Doc',
        'description': 'Some description',
        'mimeType': 'text/html'}
media_body = MediaIoBaseUpload(StringIO.StringIO(html), 'text/html', resumable=False)
drive_service.files().insert(body=body, media_body=media_body).execute()
The above code uploads an HTML file as a file into Google Drive, rather than rendering the HTML into a Google Document. Fair enough, this makes sense. But how do I get it to render as a Google Doc, as I was able to do with the Documents List API?
One other thing: if I set resumable=True it throws the following error on App Engine: '_StreamSlice' has no len(). I couldn't figure out how to get resumable=True to work.
And one last thing: the sample code in the docs uses a MediaInMemoryUpload object, but if you look at the source it is now deprecated in favour of MediaIoBaseUpload. Should the sample code be updated?

I suspect the issue is that the default for conversion has changed from true to false. You must explicitly set convert=true on the upload (convert=True in the Python client). See https://developers.google.com/drive/v2/reference/files/insert
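As a minimal sketch of that suggestion, assuming the v2 files().insert() call from the question: the helper name insert_as_google_doc is hypothetical, and drive_service and media_body are expected to be built as in the question.

```python
def insert_as_google_doc(drive_service, media_body, title):
    """Upload HTML and ask Drive (v2) to convert it to a Google Doc.

    drive_service and media_body are assumed to be created as in the
    question (build('drive', 'v2', ...) and MediaIoBaseUpload(...)).
    """
    body = {'title': title, 'mimeType': 'text/html'}
    request = drive_service.files().insert(
        body=body,
        media_body=media_body,
        convert=True,  # the key change: request conversion explicitly
    )
    return request.execute()
```

The only difference from the question's call is the convert=True query parameter; the metadata and media body stay the same.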

Related

I'm trying to get the direct download link for a file using the Google Drive API

I'm trying to get the direct download link for a file in Google Drive using the Google Drive API (v3), but I'm also trying to do this without making the file publicly shared.
Here is what I've tried:
https://www.googleapis.com/drive/v3/files/**FILE_ID**?alt=media&supportsAllDrives=True&includeItemsFromAllDrives=True&key=**API_KEY**
Now this works if the file is shared publicly. But when the file isn't shared publicly you get this message:
{'error': {'errors': [{'domain': 'global', 'reason': 'notFound', 'message': 'File not found: 10k0Qogwcz7k0u86m7W2HK-LO7bk8xAF8.', 'locationType': 'parameter', 'location': 'fileId'}], 'code': 404, 'message': 'File not found: 10kfdsfjDHJ38-UHJ34D82.'}}
After some googling I found a post on Stack Overflow saying that I need to add a request header with my access token, but this doesn't work and the application just hangs.
Here is the full code:
### SETTING UP GOOGLE API
scopes = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
credentials = store.get()
if not credentials or credentials.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', scopes)
    credentials = tools.run_flow(flow, store)
accessToken = credentials.access_token
refreshToken = credentials.refresh_token
drive = build('drive', 'v3', credentials=credentials)
### SENDING REQUEST
req_url = "https://www.googleapis.com/drive/v3/files/"+file_id+"?alt=media&supportsAllDrives=True&includeItemsFromAllDrives=True&key="+GOOGLE_API_KEY
headers={'Authorization': 'Bearer %s' % accessToken}
request_content = json.loads((requests.get(req_url)).content)
print(request_content)
------------------ EDIT: ------------------
I've gotten really close to an answer, but I can't seem to figure out why this doesn't work.
So I've figured out previously that alt=media generates a download link for the file, but when the file is private this doesn't work.
I just discovered that you can add &access_token=.... to access private files, so I came up with this API call:
https://www.googleapis.com/drive/v3/files/**FILE_ID**?supportsAllDrives=true&alt=media&access_token=**ACCESS_TOKEN**&key=**API_KEY**
When I go to that url on my browser I get this message:
We're sorry...
... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.
I find this confusing because if I remove alt=media, I am able to call on that request and I get some metadata about the file.
I believe your goal is as follows.
From "I'm trying to get the direct download link for a file in Google Drive using the Google Drive API (v3)", I understand that you want to retrieve webContentLink.
The files whose webContentLink you want to retrieve are not Google Docs files.
You have already been able to get the file metadata using the Drive API, so your access token can be used for this.
Modification points:
When the file is not shared, the API key alone cannot be used. I think this is why https://www.googleapis.com/drive/v3/files/**FILE_ID**?alt=media&supportsAllDrives=True&includeItemsFromAllDrives=True&key=**API_KEY** returns "File not found".
Judging from the script in your question, it seems that you want to download the file content.
In your script, headers is defined but never passed to requests.get, so the access token is not sent.
The "Files: get" method has no includeItemsFromAllDrives parameter.
I think an error may occur at credentials.access_token. If my understanding is correct, please try accessToken = credentials.token instead. (credentials.token is the attribute on google-auth credentials; oauth2client credentials expose access_token.)
In Drive API v3, the default response does not include webContentLink, so it has to be requested explicitly with fields=webContentLink.
When your script is modified, it becomes as follows.
Modified script:
file_id = '###' # Please set the file ID.
req_url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?supportsAllDrives=true&fields=webContentLink"
headers = {'Authorization': 'Bearer %s' % accessToken}
res = requests.get(req_url, headers=headers)
obj = res.json()
print(obj.get('webContentLink'))
Alternatively, since your script already has drive = build('drive', 'v3', credentials=credentials), you can use the following script.
file_id = '###' # Please set the file ID.
drive = build('drive', 'v3', credentials=credentials)
request = drive.files().get(fileId=file_id, supportsAllDrives=True, fields='webContentLink').execute()
print(request.get('webContentLink'))
Note:
In this modified script:
When the file is in a shared drive and you don't have permission to retrieve the file metadata, an error occurs.
When your access token cannot be used for retrieving the file metadata, an error occurs.
Please keep these points in mind.
When * is used for fields, all file metadata is retrieved.
Reference:
Files: get
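If the goal is the file content itself rather than the link, the same Authorization header can be combined with alt=media. The following is a minimal sketch using only the standard library; the helper name build_media_request is hypothetical, and the token is assumed to be a valid OAuth access token with a Drive scope.

```python
import urllib.parse
import urllib.request

DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files/"


def build_media_request(file_id, access_token):
    """Build (but do not send) a GET request for a private file's content."""
    query = urllib.parse.urlencode({"alt": "media", "supportsAllDrives": "true"})
    url = DRIVE_FILES_URL + urllib.parse.quote(file_id, safe="") + "?" + query
    # The access token travels in the header; no API key is involved.
    headers = {"Authorization": "Bearer " + access_token}
    return urllib.request.Request(url, headers=headers)
```

Passing the returned object to urllib.request.urlopen() would send the request. With requests, the equivalent is requests.get(url, headers=headers).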
Added:
You want to download the binary data from Google Drive by URL.
The files are large, around "2-10 gigabytes".
In this case, unfortunately, webContentLink cannot be used, because for such large files the webContentLink is redirected. So I think that sharing the file publicly and using the API key would suit your goal, but you cannot share the file publicly.
As a workaround for this situation, I would like to propose the method "One Time Download for Google Drive". On Google Drive, when a publicly shared file is being downloaded, the download continues even if the sharing permission is removed partway through. This method relies on that behavior.
Flow
In this sample script, the API key is used.
Send a request to the Web App with the API key and the ID of the file you want to download.
The Web App then does the following:
It changes the permissions of the file with the received ID so that the file is publicly shared.
It installs a time-driven trigger that fires after 1 minute.
When the trigger runs the function, the permissions are changed again and sharing stops. This way the file is shared for only one minute.
The Web App returns the endpoint for downloading the file.
After you receive the endpoint, start the download within 1 minute, because the file is shared for only one minute.
Usage:
1. Create a standalone script
In this workaround, Google Apps Script is used on the server side. Please create a standalone script.
To create one directly, open https://script.new/. If you are not logged in to Google, the login screen opens first; after you log in, the script editor of Google Apps Script opens.
2. Set sample script of Server side
Please copy and paste the following script into the script editor, and set your API key in the variable key in the function doGet(e). The Web App runs the script only when the key sent with the request matches this value.
function deletePermission() {
  const forTrigger = "deletePermission";
  const id = CacheService.getScriptCache().get("id");
  const triggers = ScriptApp.getProjectTriggers();
  triggers.forEach(function(e) {
    if (e.getHandlerFunction() == forTrigger) ScriptApp.deleteTrigger(e);
  });
  const file = DriveApp.getFileById(id);
  file.setSharing(DriveApp.Access.PRIVATE, DriveApp.Permission.NONE);
}

function checkTrigger(forTrigger) {
  const triggers = ScriptApp.getProjectTriggers();
  for (var i = 0; i < triggers.length; i++) {
    if (triggers[i].getHandlerFunction() == forTrigger) {
      return false;
    }
  }
  return true;
}

function doGet(e) {
  const key = "###"; // <--- API key. This is also used for checking the user.
  const forTrigger = "deletePermission";
  var res = "";
  if (checkTrigger(forTrigger)) {
    if ("id" in e.parameter && e.parameter.key == key) {
      const id = e.parameter.id;
      CacheService.getScriptCache().put("id", id, 180);
      const file = DriveApp.getFileById(id);
      file.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
      var d = new Date();
      d.setMinutes(d.getMinutes() + 1);
      ScriptApp.newTrigger(forTrigger).timeBased().at(d).create();
      res = "https://www.googleapis.com/drive/v3/files/" + id + "?alt=media&key=" + e.parameter.key;
    } else {
      res = "unavailable";
    }
  } else {
    res = "unavailable";
  }
  return ContentService.createTextOutput(res);
}
3. Deploy Web Apps
In the script editor, open the dialog box via "Publish" -> "Deploy as web app".
Select "Me" for "Execute the app as:".
Select "Anyone, even anonymous" for "Who has access to the app:". This is a test case.
If "Only myself" is used, only you can access the Web App. In that case, please use your access token.
Click the "Deploy" button as a new "Project version".
The "Authorization required" dialog box opens automatically.
Click "Review Permissions".
Select own account.
Click "Advanced" at "This app isn't verified".
Click "Go to ### project name ###(unsafe)"
Click "Allow" button.
Click "OK"
4. Test run: Client side
This is a sample Python script. Before you test it, please confirm that the above script is deployed as a Web App, and set the URL of the Web App, the file ID and your API key.
import requests
url1 = "https://script.google.com/macros/s/###/exec"
url1 += "?id=###fileId###&key=###your API key###"
res1 = requests.get(url1)
url2 = res1.text
res2 = requests.get(url2)
with open("###sampleFilename###", "wb") as f:
    f.write(res2.content)
This sample script first sends a request to the Web App with the file ID and API key, and the file is shared publicly for 1 minute. The file can then be downloaded. After 1 minute the file is no longer publicly shared, but a download that has already started continues.
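For files in the 2-10 gigabyte range, reading the whole response into memory through res2.content may not be practical. The following is a minimal sketch of a chunked download that uses only the standard library instead of requests; the helper name download_to_file and the 1 MiB chunk size are assumptions, not part of the original workaround.

```python
import shutil
import urllib.request


def download_to_file(url, dest_path, chunk_size=1024 * 1024):
    """Stream the response body to dest_path, chunk_size bytes at a time."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # copyfileobj reads and writes in chunks, so the whole file
        # never has to fit in memory at once.
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path
```

Here url would be the endpoint returned by the Web App (url2 in the script above). With requests, a similar effect is available through requests.get(url, stream=True) together with iter_content(chunk_size=...).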
Note:
When you modify the script of the Web App, please redeploy the Web App as a new version so that the latest script is reflected. Please be careful about this.
References:
One Time Download for Google Drive
Web Apps
Taking advantage of Web Apps with Google Apps Script

Python requests on Instagram media link returns Not Found error

I have this function in Python to get media id of the post from its URL provided:
def get_media_id(self):
    req = requests.get('https://api.instagram.com/oembed/?url={}'.format(self.txtUrl.text()))
    media_id = req.json()['media_id']
    return media_id
When I open the resulting URL in the browser it returns data, but in the code the result is "404 Not Found".
For example, consider this link:
https://www.instagram.com/p/B05bitzp15CE8e3Idcl4DAb8fjsfxOsSUYvkDY0/
When I put it into the oEmbed URL in the browser, data is returned.
But when I run the same through this function, it returns a 404 error.
I tried running your code and, assuming that self.txtUrl.text() returns the post URL, there is nothing wrong with your code. The problem is that you are trying to access the media ID of a post from a private account without an access token.
You can open that link in the browser most likely because you are logged in to that account or are following it. To use the method you have given, you need a public Instagram post; for example, try it with txtUrl.text() returning https://www.instagram.com/p/fA9uwTtkSN/. Your code should work just fine.
Your GET request carries no authorisation token granting access to the post. Other people have written answers on how to get the media_id if you have the access token here: Where do I find the Instagram media ID of a image (having access token and following the image owner)?
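Because a 404 response does not carry the expected JSON, it can help to check the status before reading media_id. The following is a minimal sketch under that assumption; the helper names build_oembed_url and parse_media_id are hypothetical, the endpoint is the one from the question, and whether it still behaves as described is not verified here.

```python
import json
import urllib.parse

OEMBED_ENDPOINT = "https://api.instagram.com/oembed/"  # taken from the question


def build_oembed_url(post_url):
    """URL-encode the post link so it survives as a single query parameter."""
    return OEMBED_ENDPOINT + "?" + urllib.parse.urlencode({"url": post_url})


def parse_media_id(status_code, body_text):
    """Return media_id, or None when the request did not succeed."""
    if status_code != 200:
        return None
    return json.loads(body_text).get("media_id")
```

With requests, this would be called as parse_media_id(req.status_code, req.text), so a private post yields None instead of an exception.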

Using Slack RTM API with Django 2.0

I am a beginner to the Django framework and I am building a Django app that uses the Slack RTM API.
I have coded a program in Python that performs the OAuth authentication process like so:
def initialize():
    url = "https://slack.com/api/rtm.connect"
    payload = {"token": "xxx"}
    r = requests.post(url, payload)
    res = json.loads(r.text)
    url1 = res['url']
    ws = create_connection(url1)
    return ws
My Requirement:
The stream of events I receive (from my slack channel that my slack app is added to) is processed to filter out events of the type - message ,then match the message with a regex pattern and then store the matched string in a database.
As a stand alone python program I am receiving the stream of events from my channel.
My questions:
How do I successfully integrate this code into Django so that I can fulfill my requirement?
Do I put the code in templates/views? What is the recommended method to process this stream of data?
import requests
from websocket import create_connection  # provided by the websocket-client package

def initialize():
    url = "https://slack.com/api/rtm.connect"
    r = requests.get(url, params={'token': '<YOUR TOKEN>'})
    res = r.json()
    url1 = res['url']
    # Note: create_connection must be imported as above, otherwise this raises a NameError
    ws = create_connection(url1)
    return ws
If you read the API web methods documentation, you will see:
Preferred HTTP method: GET
See here: Slack rtm.connect method
Compare this code with yours and note the differences.
To get JSON from a response, r.json() is the more direct choice: it parses the body of r for you. json.loads(r.text) parses the same text, so it is not wrong, just an extra step.
Note that r.text returns the raw text, so you cannot index it by key; with r.json() you get a dict and can read res['url'] as shown above.
Hope this helps.
Could you also tell us more about what you want to do with this in a view? A template directory only holds HTML files, which you don't need for this.
And why views.py?
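The requirement from the question (keep events of type message, match the text against a regex, store the match) can be sketched independently of where it ends up living in Django. This is a minimal sketch under stated assumptions: the function name store_matching_messages and the table name matches are hypothetical, and sqlite3 stands in for a Django model.

```python
import json
import re
import sqlite3


def store_matching_messages(raw_events, pattern, conn):
    """Keep events of type 'message', match their text, store the matches."""
    conn.execute("CREATE TABLE IF NOT EXISTS matches (value TEXT)")
    regex = re.compile(pattern)
    stored = []
    for raw in raw_events:
        event = json.loads(raw)  # each RTM event arrives as a JSON string
        if event.get("type") != "message":
            continue  # skip hello, presence changes and other event types
        found = regex.search(event.get("text", ""))
        if found:
            conn.execute("INSERT INTO matches (value) VALUES (?)", (found.group(0),))
            stored.append(found.group(0))
    conn.commit()
    return stored
```

In the real program, raw_events would be the strings read from the websocket returned by initialize(). A long-running loop like this usually sits outside the request/response cycle, for example in a custom management command, rather than in a view or template.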

Google Drive API v3 updating a file on the go (using realtime data) in python

I am trying to update a file that I created with some new data. Essentially, I am trying to append contents to an already created file. I have tried different ways using different attributes, but nothing seems to work so far. The migration from v2 to v3 seems to have made things harder to develop in Python.
This is my code so far
def updateFile(fileID, contents):
    credentials = get_credentials()  # this function gets the credentials for oauth
    http = credentials.authorize(httplib2.Http())
    service = discovery.build('drive', 'v3', http=http)
    # First retrieve the file from the API.
    #fileCreated = service.files().get(fileId=fileID).execute()
    #print(dir(fileCreated.update()))
    # File's new content.
    file_metadata = {'name': 'notes.txt', 'description': 'this is a test'}
    fh = BytesIO(contents.encode())
    media = MediaIoBaseUpload(fh,
                              mimetype='text/plain')
    # Send the request to the API.
    #print(BytesIO(contents.encode()).read())
    print(fileID)
    updated_file = service.files().update(
        body=file_metadata,
        #uploadType = 'media',
        fileId=fileID,
        #fields = fileID,
        media_body=media).execute()
I have tried using MediaFileUpload (it works only for uploading a file) but I am appending a 'string' that is real time generated using a voice interface and is not stored in any file. So I ended up using MediaIoBaseUpload.
The code runs without any error, but the file is not updated. Any help is appreciated.
Okay, so I figured out what was wrong with my code. Apparently, nothing is wrong with the chunk of code that I posted. I realized that I had not converted the document I created to a "Google document", hence the code didn't update the document. I changed the MIME type of the document I create to be a Google document, and now the code works fine :)
You can also use service.files().get_media(fileId=fileID).execute() to get the file content and append new content to it.
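A minimal sketch of that read-append-write cycle follows. The helper name append_text_to_file and the make_media parameter are hypothetical; make_media is assumed to wrap a file-like object in an upload object, for example lambda fh: MediaIoBaseUpload(fh, mimetype='text/plain'). Note that get_media only applies to files with stored binary content; Google Docs editor files have to be exported instead.

```python
import io


def append_text_to_file(service, file_id, extra_text, make_media):
    """Download the current content, append extra_text, upload the result."""
    # get_media(...).execute() returns the file content as bytes.
    current = service.files().get_media(fileId=file_id).execute()
    combined = current + extra_text.encode("utf-8")
    media = make_media(io.BytesIO(combined))
    return service.files().update(fileId=file_id, media_body=media).execute()
```

Drive's update call replaces the whole content, so the append has to happen on the client side as shown.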

How to display images in html?

I have a working app using the imgur API with python.
from imgurpython import ImgurClient
client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'
client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')
for item in items:
    print(item.link)
It outputs a set of imgur links. I want to display all those pictures on a webpage.
Does anyone know how I can integrate python to do that in an HTML page?
OK, I haven't tested this, but it should work. The HTML is really basic (but you never mentioned anything about formatting) so give this a shot:
from imgurpython import ImgurClient
client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'
client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')
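htmloutput = open('somename.html', 'w')
htmloutput.write("[html headers here]")
htmloutput.write("<body>")
for item in items:
    print(item.link)
    # the markup around item.link did not survive in this copy; an <img> tag is assumed
    htmloutput.write('<img src="' + item.link + '" alt="SOME DESCRIPTION"><br>')
htmloutput.write("</body></html>")
htmloutput.close()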
You can remove the print(item.link) statement in the code above - it's there to give you some comfort that stuff is happening.
I've made a few assumptions:
The item.link returns a string with only the unformatted link. If it's already in html format, then remove the HTML around item.link in the example above.
You know how to add an HTML header. Let me know if you do not.
This script will create the html file in the folder it's run from. This is probably not the greatest of ideas. So adjust the path of the created file appropriately.
Once again, though, if that is a real client secret, you probably want to change it ASAP.
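The same idea can be sketched as a function that builds the whole page as a string, which keeps the HTML generation separate from the imgur call. The function name build_gallery_html is hypothetical, and links is assumed to be a list of plain image URLs such as the item.link values.

```python
import html


def build_gallery_html(links, title="Gallery"):
    """Return a complete HTML page with one <img> tag per link."""
    rows = [
        '<img src="{}" alt="image"><br>'.format(html.escape(link, quote=True))
        for link in links
    ]
    head = "<!DOCTYPE html>\n<html>\n<head><title>{}</title></head>\n<body>\n".format(
        html.escape(title)
    )
    return head + "\n".join(rows) + "\n</body>\n</html>\n"
```

It could be used as build_gallery_html([item.link for item in items]), with the result written to a file using a with open(...) block so the file is closed automatically.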
