First of all, I'm sorry if this is too silly a question... this is the first time I'm trying to use any of the technologies involved in this script (Python, the Drive API, OAuth 2.0, etc.). I swear I've been searching and trying things for about a week before posting the question. hehehe
I'm trying to use the google-api-python-client to upload a big file (3.5 GiB) from a terminal-only Debian Linux box. I've had some success uploading small files, but when I try to upload the big one, the upload stops about 1-2 hours after it started with an HTTP 401 (Unauthorized) error. I've been looking into how to get a new access token, but have had little success.
This is my (updated) code so far:
#!/usr/bin/python
import httplib2
import pprint
import time
from apiclient.discovery import build
from apiclient.http import MediaFileUpload
from apiclient import errors
from oauth2client.client import OAuth2WebServerFlow
# Copy your credentials from the APIs Console
CLIENT_ID = 'myclientid'
CLIENT_SECRET = 'myclientsecret'
# Check https://developers.google.com/drive/scopes for all available scopes
OAUTH_SCOPE = 'https://www.googleapis.com/auth/drive'
# Redirect URI for installed apps
REDIRECT_URI = 'urn:ietf:wg:oauth:2.0:oob'
# Run through the OAuth flow and retrieve credentials
flow = OAuth2WebServerFlow(CLIENT_ID, CLIENT_SECRET, OAUTH_SCOPE, REDIRECT_URI)
authorize_url = flow.step1_get_authorize_url()
print 'Go to the following link in your browser: ' + authorize_url
code = raw_input('Enter verification code: ').strip()
credentials = flow.step2_exchange(code)
# Create an httplib2.Http object and authorize it with our credentials
http = httplib2.Http()
http = credentials.authorize(http)
drive_service = build('drive', 'v2', http=http)
# Insert a file
media_body = MediaFileUpload('bigfile.zip', mimetype='application/octet-stream', chunksize=1024*256, resumable=True)
body = {
    'title': 'bigfile.zip',
    'description': 'Big File',
    'mimeType': 'application/octet-stream'
}
retries = 0
request = drive_service.files().insert(body=body, media_body=media_body)
response = None
while response is None:
    try:
        print http.request.credentials.access_token
        status, response = request.next_chunk()
        if status:
            print "Uploaded %.2f%%" % (status.progress() * 100)
        retries = 0
    except errors.HttpError, e:
        if e.resp.status == 404:
            print "Error 404! Aborting."
            exit()
        elif retries > 10:
            print "Retries limit exceeded! Aborting."
            exit()
        else:
            retries += 1
            time.sleep(2**retries)
            print "Error (%d)... retrying." % e.resp.status
            continue
print "Upload Complete!"
After some digging, I found out that the authorized http object automatically refreshes the access token after receiving a 401. And although the access token really is changing, the upload still isn't continuing as expected... see the output below:
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.28%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.30%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Error (401)... retrying.
ya29.AHES6ZQqp3_qbWsTk4yVDdHnlwc_7GvPZiFIReDnhIIiHao
Error (401)... retrying.
ya29.AHES6ZSqx90ZOUKqDEP4AAfWCVgXZYT2vJAiLwKDRu87JOs
Error (401)... retrying.
ya29.AHES6ZTp0RZ6U5K5UdDom0gq3XHnyVS-2sVU9hILOrG4o3Y
Error (401)... retrying.
ya29.AHES6ZSR-IOiwJ_p_Dm-OnCanVIVhCZLs7H_pYLMGIap8W0
Error (401)... retrying.
ya29.AHES6ZRnmM-YIZj4S8gvYBgC1M8oYy4Hv5VlcwRqgnZCOCE
Error (401)... retrying.
ya29.AHES6ZSF7Q7C3WQYuPAWrxvqbTRsipaVKhv_TfrD_gef1DE
Error (401)... retrying.
ya29.AHES6ZTsGzwIIprpPhCrqmoS3UkPsRzst5YHqL-zXJmz6Ak
Error (401)... retrying.
ya29.AHES6ZSS_1ZBiQJvZG_7t5uW3alsy1piGe4-u2YDnwycVrI
Error (401)... retrying.
ya29.AHES6ZTLFbBS8mSFWQ9zK8cgbX8RPeLghPxkfiKY54hBB-0
Error (401)... retrying.
ya29.AHES6ZQBeMWY50z6fWXvaCcd5_AJr_AYOuL2aiNKpK-mmyU
Error (401)... retrying.
ya29.AHES6ZTs2mYYSEyOqI_Ms4itKDx36t39Oc5RNZHkV4Dq49c
Retries limit exceeded! Aborting.
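(Incidentally, the refresh can also be forced by hand; here's a minimal sketch using the oauth2client credentials object from my code above:)

# Sketch only: force a refresh of the access token. `credentials` is the
# object returned by flow.step2_exchange() earlier in the script.
import httplib2
credentials.refresh(httplib2.Http())
print credentials.access_token  # the newly minted token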
I'm using Debian Lenny with Python 2.5.2, and I installed the ssl and google-api-python-client packages through pip about a week ago.
Thanks in advance for any help.
EDIT: Apparently, the problem isn't with the API. I tried the same code above, but with two small files and one hour between them (time.sleep()). The output was:
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 66.89%
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Upload 1 Complete!
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 57.62%
ya29.AHES6ZQd3o1ciwXpNFImH3CK0-dJAtQba_oeIO9DDbIq154
Upload 2 Complete!
For the second upload, a new access token was used successfully. So perhaps the resumable session expires after some time, or is only valid for the specific access token it was started with?
I filed an issue on the google-api-python-client project, and according to Joe Gregorio from Google, the problem is in the backend:
"This is an issue with the backend and not with the API or with your code. As you deduced, if the upload goes too long the access_token expires and at that point the resumable upload can't be continued. There is work on progress to fix this issue right now, I will update this bug once the issue is fixed on the server side."
I assume the problem is that after the 1-2 hour limit your access token expires, cutting off your connection with the remote server. I think what you could do is look at your host's API manual... they should have something in there about 'refresh tokens' (these get you another access token; note that some hosts only allow one refresh token per session). If an unlimited amount is allowed, you can use a combination of a timer and AJAX to keep asking for more access tokens.
If not, then you would have to make an AJAX request for another authorization token and exchange that for another access token every hour. That sounds like a very rigorous process, but I think it is the only way if your token keeps expiring.
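For concreteness, here is a minimal sketch (my illustration, not code from the question) of that refresh-token exchange, done as a raw POST to Google's OAuth 2.0 token endpoint; CLIENT_ID, CLIENT_SECRET and credentials are assumed to come from the question's script:

import urllib
import urllib2

# Exchange the stored refresh token for a fresh access token. The response
# body is JSON containing the new access_token and its expires_in lifetime.
params = urllib.urlencode({
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'refresh_token': credentials.refresh_token,
    'grant_type': 'refresh_token',
})
resp = urllib2.urlopen('https://accounts.google.com/o/oauth2/token', params)
print resp.read()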
Also, just on another note: have you tried other methods of uploading? If the above script ran for 1-2 hours and only uploaded 1.44% of the file, it could take 100+ hours to fully upload (way too long for only 3.5 GiB).
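On that point, the question's MediaFileUpload uses 256 KiB chunks, so every progress step costs a full HTTP round trip. A hedged tweak (it helps speed, not the 401s) is to raise the chunk size:

# Sketch: the same call as in the question, but with 16 MiB chunks. Chunk
# sizes must be multiples of 256 KiB for Drive resumable uploads.
media_body = MediaFileUpload('bigfile.zip',
                             mimetype='application/octet-stream',
                             chunksize=1024 * 1024 * 16,
                             resumable=True)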
Related
Background
I have a Google Cloud Composer 1 environment running on GCP.
I have written a Google Cloud Function that, when run in the cloud, successfully triggers a DAG run in my Composer environment. My code is based on, and almost identical to, the code in the Trigger DAGs with Cloud Functions guide from the GCP documentation.
Here is the section of code most relevant to my question (source):
from google.auth.transport.requests import Request
from google.oauth2 import id_token
import requests
def make_iap_request(url, client_id, method='GET', **kwargs):
    """Makes a request to an application protected by Identity-Aware Proxy.

    Args:
        url: The Identity-Aware Proxy-protected URL to fetch.
        client_id: The client ID used by Identity-Aware Proxy.
        method: The request method to use
            ('GET', 'OPTIONS', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE')
        **kwargs: Any of the parameters defined for the request function:
            https://github.com/requests/requests/blob/master/requests/api.py
            If no timeout is provided, it is set to 90 by default.

    Returns:
        The page body, or raises an exception if the page couldn't be retrieved.
    """
    # Set the default timeout, if missing
    if 'timeout' not in kwargs:
        kwargs['timeout'] = 90

    # Obtain an OpenID Connect (OIDC) token from the metadata server or using
    # a service account.
    open_id_connect_token = id_token.fetch_id_token(Request(), client_id)

    # Fetch the Identity-Aware Proxy-protected URL, including an
    # Authorization header containing "Bearer " followed by a
    # Google-issued OpenID Connect token for the service account.
    resp = requests.request(
        method, url,
        headers={'Authorization': 'Bearer {}'.format(open_id_connect_token)},
        **kwargs)
    if resp.status_code == 403:
        raise Exception('Service account does not have permission to '
                        'access the IAP-protected application.')
    elif resp.status_code != 200:
        raise Exception(
            'Bad response from application: {!r} / {!r} / {!r}'.format(
                resp.status_code, resp.headers, resp.text))
    else:
        return resp.text
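For context, I invoke the function roughly like this; the Airflow webserver URL, DAG name and IAP client ID below are placeholders, not values from my deployment:

# Hypothetical invocation; every value here is a placeholder.
make_iap_request(
    'https://<airflow-webserver-id>.appspot.com/api/experimental/dags/<dag-name>/dag_runs',
    '<iap-client-id>.apps.googleusercontent.com',
    method='POST',
    json={'conf': '{}'})  # extra kwargs are passed straight through to requests.request()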
Challenge
I want to be able to run the same Cloud Function locally on my dev machine. When I try to do that, the function crashes with this error message:
google.auth.exceptions.DefaultCredentialsError: Neither metadata server or valid service account credentials are found.
This makes sense because the line that throws the error is:
google_open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
Indeed, when running locally, the Metadata Server is not available, and I don't know how to make valid service account credentials available to the call to fetch_id_token().
Question
My question is - What do I need to change in order to be able to securely obtain the OpenID Connect token when I run my function locally?
I've been able to run my code locally without changing it. Below are the details, though I'm not sure this is the most secure way to get it done.
In the Google Cloud Console I browsed to the Service Accounts module.
I clicked on the "App Engine default service account" to see its details.
I switched to the "KEYS" tab.
I clicked on the "Add Key" button and generated a new JSON key.
I downloaded the JSON file and placed it outside of my source code folder.
Finally, on my dev machine*, I set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the JSON file. More details here: https://cloud.google.com/docs/authentication/production
Once I did this, the call to id_token.fetch_id_token() picked up the service account details from the key file and returned the token successfully.
* - In my case I set the environment variable inside my PyCharm Debug Configuration.
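Putting it together, here is a minimal sketch of the local flow; the key file path and client ID are placeholders:

import os
from google.auth.transport.requests import Request
from google.oauth2 import id_token

# Placeholder path; point it at the JSON key downloaded in the steps above.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/service-account-key.json'

# With the variable set, fetch_id_token() reads the service account from the
# key file instead of querying the (locally unavailable) metadata server.
token = id_token.fetch_id_token(Request(), '<iap-client-id>.apps.googleusercontent.com')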
I am trying to validate a user purchase with the token received from the client (internal release).
For that I am using a Python script with the Google API Python Client (https://github.com/googleapis/google-api-python-client).
import httplib2
from googleapiclient.discovery import build  # build() was used but not imported
from oauth2client.service_account import ServiceAccountCredentials

token = "token received from the client"

# self.http_timeout comes from the surrounding class in the original code
http = httplib2.Http(timeout=self.http_timeout)
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    "./service_account.json", "https://www.googleapis.com/auth/androidpublisher"
)
http = credentials.authorize(http)

result = build("androidpublisher", "v3", http=http) \
    .purchases() \
    .products() \
    .get(packageName="<package name>", productId="<subscription id>", token=token) \
    .execute(http=http)
The response I am getting from this call is:
HttpError 401 when requesting https://www.googleapis.com/androidpublisher/v3/applications/<package name>/purchases/products/<subscription id>/tokens/<token>?alt=json returned "The current user has insufficient permissions to perform the requested operation."
The service user being used has been given admin permissions to the account (for the sake of the test) in the Google Play Console, and is set as "Project Owner" in the Google Cloud Console (again, for the sake of the test).
What seems to be wrong here?
The 'Owner' permissions are sufficient for receipt validation [Source].
The error you're getting is most likely a propagation issue; it can actually take ~24 hours for the service credentials to take effect throughout the system.
I am trying to access Google Drive using the Drive API version 3 (Python). Listing files works fine, but I get an Insufficient Permission error when I try to upload a file.
I changed my scope to give full permission to my script:
SCOPES = 'https://www.googleapis.com/auth/drive'
Below is the block that I use to create the file
file_metadata = {
    'name': 'Contents.pdf',
    'mimeType': 'application/vnd.google-apps.file'
}
media = MediaFileUpload('Contents.pdf',
                        mimetype='application/vnd.google-apps.file',
                        resumable=True)
file = service.files().create(body=file_metadata,
                              media_body=media,
                              fields='id').execute()
print('File ID: %s' % file.get('id'))
I get this error message:
ResumableUploadError: HttpError 403 "Insufficient Permission"
I am not sure what is wrong here.
I think that your script works fine. From the error you show, I suspect the access token and refresh token need to be reauthorized, so please try the following flow.
When you authorize using client_secret.json, a credential JSON file is created. With the default Quickstart, it is created in the .credentials folder of your home directory.
For your current situation, please delete the current credential JSON file (which is not client_secret.json) and reauthorize by launching your script. The default file name from the Quickstart is drive-python-quickstart.json.
This way, the https://www.googleapis.com/auth/drive scope is reflected in the access token and refresh token, and they are used for the upload. If the error still occurs after this flow, please also confirm that the Drive API is enabled in the API console.
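If it helps, a minimal sketch of that cleanup step, assuming the Quickstart's default storage location and file name:

import os

# Remove the cached credential so the next run triggers the OAuth flow again
# with the broader https://www.googleapis.com/auth/drive scope.
cred_path = os.path.join(os.path.expanduser('~'), '.credentials',
                         'drive-python-quickstart.json')
if os.path.exists(cred_path):
    os.remove(cred_path)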
If this was not useful for you, I'm sorry.
Maybe you already have a file with the same name there?
I am trying to run watch() on my inbox and send notifications to a Pub/Sub topic.
However, I keep getting this error:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://www.googleapis.com/gmail/v1/users/me.com/watch?alt=json returned "Invalid topicName does not match projects/western-oarlock/topics/*">
The code I am sending is:
request = {
    'labelIds': ['INBOX'],
    'topicName': 'projects/flask-app/topics/myTopic'
}
service.users().watch(userId='me', body=request).execute()
Why is it attempting to contact western-oarlock instead of flask-app?
Check if the access token you are using is the right one.
Check if the .p12 key you are using is from the same project, or try using a new key.
I had the same problem; in my case the cause was the access token I used for Google Cloud API OAuth2 authentication, which had been generated with the wrong service account. However, I've also read somewhere on the Internet that a wrong .p12 key can cause this issue.
It ended up having to do with the JSON secrets file. I was authenticating against the wrong project.
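A quick way to check for that (a hedged sketch; the file name is a placeholder) is to read the project_id field out of the credentials JSON itself:

import json

# Service account JSON key files carry the ID of the project that owns them.
with open('service_account.json') as f:
    info = json.load(f)
print(info.get('project_id'))  # should say flask-app, not western-oarlock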
I have a basic web app that uses the Google OAuth flow. The web app navigates to /oauth, which triggers the OAuth flow; the authorization server then redirects the client to /oauth2callback. I am trying it out locally, and the first step works fine: the server is able to redirect the client to Google's authorization server and receive the code that is meant to be exchanged for an access token. However, when I try to exchange the code using flow.step2_exchange, the HTTP request hangs for a very long time. Usually the request times out. Occasionally I do get a response back with a valid access token, which leads me to believe that the logic is (generally speaking) sound.
Does anybody know what might be causing Google to delay the response for 30 seconds or more? Could it be that Google is throttling development requests? Has anyone encountered something like this before? Is there something I am doing wrong? I should note that I am currently using Google's own oauth2 library, but I've tried constructing the HTTP requests manually and that didn't help either. It still hangs on https://accounts.google.com/o/oauth2/token.
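One way to narrow it down (a hedged diagnostic, not a fix; all values are placeholders) is to time a bare POST against the token endpoint with the same parameters step2_exchange would send:

import time
import urllib
import httplib2

# The same exchange step2_exchange() performs, timed by hand.
body = urllib.urlencode({
    'code': '<auth-code-from-callback>',
    'client_id': '<client-id>',
    'client_secret': '<client-secret>',
    'redirect_uri': 'http://localhost:6060/oauth2callback',
    'grant_type': 'authorization_code',
})
start = time.time()
resp, content = httplib2.Http(timeout=60).request(
    'https://accounts.google.com/o/oauth2/token', 'POST', body=body,
    headers={'Content-Type': 'application/x-www-form-urlencoded'})
print resp.status, "in", time.time() - start, "seconds"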
This is the Flask code for the OAuth flow:
from flask import Flask, redirect, request
from oauth2client.client import OAuth2WebServerFlow

import config  # holds GOOGLE_OAUTH_CLIENT_ID / GOOGLE_OAUTH_CLIENT_SECRET

app = Flask(__name__)

@app.route('/oauth', methods=['GET'])
def oauth():
    print "oauth called: "
    flow = OAuth2WebServerFlow(
        client_id=config.GOOGLE_OAUTH_CLIENT_ID,
        client_secret=config.GOOGLE_OAUTH_CLIENT_SECRET,
        scope='https://www.googleapis.com/auth/userinfo.email',
        redirect_uri='http://localhost:6060/oauth2callback')
    auth_uri = flow.step1_get_authorize_url()
    return redirect(auth_uri)

@app.route('/oauth2callback', methods=["GET"])
def oauth_callback():
    code = request.args['code']
    flow = OAuth2WebServerFlow(
        client_id=config.GOOGLE_OAUTH_CLIENT_ID,
        client_secret=config.GOOGLE_OAUTH_CLIENT_SECRET,
        scope='https://www.googleapis.com/auth/userinfo.email',
        redirect_uri='http://localhost:6060/oauth2callback')
    credentials = flow.step2_exchange(code)