Running a Dataflow streaming job using the 2.11.0 release.
I get the following authentication error after a few hours:
File "streaming_twitter.py", line 188, in <lambda>
File "streaming_twitter.py", line 102, in estimate
File "streaming_twitter.py", line 84, in estimate_aiplatform
File "streaming_twitter.py", line 42, in get_service
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper return wrapped(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 227, in build credentials=credentials)
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper return wrapped(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 363, in build_from_document credentials = _auth.default_credentials()
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_auth.py", line 42, in default_credentials credentials, _ = google.auth.default()
File "/usr/local/lib/python2.7/dist-packages/google/auth/_default.py", line 306, in default raise exceptions.DefaultCredentialsError(_HELP_MESSAGE) DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application.
This Dataflow job makes API requests to AI Platform Prediction, and it seems the authentication token is expiring.
Code snippet:
def get_service():
    # If it hasn't been instantiated yet: do it now
    return discovery.build('ml', 'v1',
                           discoveryServiceUrl=DISCOVERY_SERVICE,
                           cache_discovery=True)
I tried adding the following line to the service function:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/tmp/key.json"
But I get:
DefaultCredentialsError: File "/tmp/key.json" was not found. [while running 'generatedPtransform-930']
I assume this is because the file does not exist on the Dataflow workers.
Another option is to use the developerKey param in the build method, but it doesn't seem to be supported by AI Platform Prediction; I get this error:
Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project."> [while running 'generatedPtransform-22624']
I'm looking to understand how to fix this and what the best practice is. Any suggestions?
Complete logs here
Complete code here
Setting os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/tmp/key.json' only works locally with the DirectRunner. Once deploying to a distributed runner like Dataflow, each worker won't be able to find the local file /tmp/key.json.
If you want each worker to use a specific service account, you can tell Beam which service account to use to identify workers.
First, grant the roles/dataflow.worker role to the service account you want your workers to use. There is no need to download the service account key file :)
Then, if you're letting PipelineOptions parse your command-line arguments, you can simply use the service_account_email option, specifying it as --service_account_email your-email@your-project.iam.gserviceaccount.com when running your pipeline.
The service account pointed by your GOOGLE_APPLICATION_CREDENTIALS is simply used to start the job, but each worker uses the service account specified by the service_account_email. If a service_account_email is not passed, it defaults to the email from your GOOGLE_APPLICATION_CREDENTIALS file.
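For illustration, here is a minimal sketch of passing that option through PipelineOptions; the project, bucket, and service account names below are hypothetical placeholders, not values from the original post:
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# The worker service account is selected via --service_account_email;
# every name below is a placeholder.
options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=your-project',
    '--temp_location=gs://your-bucket/tmp',
    '--service_account_email=dataflow-worker@your-project.iam.gserviceaccount.com',
])

with beam.Pipeline(options=options) as pipeline:
    pipeline | beam.Create(['hello, world']) | beam.Map(print)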
Related
After reading the official Google Translate API documentation, it provides us with the following sample code:
from google.cloud import translate

def translate_text(text="Hello, world!", project_id="weighty-site-333613"):
    client = translate.TranslationServiceClient().from_service_account_json('key.json')
    location = "global"
    parent = f"projects/{project_id}/locations/{location}"
    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",
            "source_language_code": "en-US",
            "target_language_code": "zh-CN",
        }
    )
    for translation in response.translations:
        print("Translated text: {}".format(translation.translated_text))

translate_text()
This code worked properly in the Google Cloud terminal.
However, even when I put the "key.json" file in the same folder, an error like this is shown:
/usr/local/bin/python3.6 /Users/jiajunmao/Documents/GitHub/translator_of_excel/google_trans.py
Traceback (most recent call last):
File "/Users/jiajunmao/Documents/GitHub/translator_of_excel/google_trans.py", line 37, in <module>
translate_text()
File "/Users/jiajunmao/Documents/GitHub/translator_of_excel/google_trans.py", line 22, in translate_text
client = translate.TranslationServiceClient().from_service_account_json('key.json')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/cloud/translate_v3/services/translation_service/client.py", line 354, in __init__
always_use_jwt_access=True,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/cloud/translate_v3/services/translation_service/transports/grpc.py", line 158, in __init__
always_use_jwt_access=always_use_jwt_access,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/cloud/translate_v3/services/translation_service/transports/base.py", line 110, in __init__
**scopes_kwargs, quota_project_id=quota_project_id
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/auth/_default.py", line 488, in default
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Process finished with exit code 1
Can someone tell me what I should do at this step? Thank you so much.
You need a service account JSON file with the correct permissions, created in GCP under IAM & Admin > Service Accounts.
Then you need to set the environment variable:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service_account.json"
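For illustration, a minimal sketch of two common ways to point the client at a key file (the paths are placeholders; also note that from_service_account_json is a classmethod, so it is called on the class itself rather than on an already-constructed instance as in the question's code):
import os
from google.cloud import translate

# Option 1: set the environment variable before constructing any client.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/key.json"  # placeholder path
client = translate.TranslationServiceClient()

# Option 2: load the key file explicitly via the classmethod.
client = translate.TranslationServiceClient.from_service_account_json("/path/to/key.json")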
My goal is to be able to run a Python program using boto3 to access DynamoDB without any local configuration. I've been following this AWS document https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html and it seems to be feasible using the 'IAM role' option https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#iam-role. This means I don't have anything configured locally.
However, after I attached a role with DynamoDB access permission to the EC2 instance the Python program is running on and ran boto3.resource('dynamodb'), I kept getting the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/.local/lib/python3.6/site-packages/boto3/__init__.py", line 100, in resource
return _get_default_session().resource(*args, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/boto3/session.py", line 389, in resource
aws_session_token=aws_session_token, config=config)
File "/home/ubuntu/.local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
aws_session_token=aws_session_token, config=config)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/session.py", line 839, in create_client
client_config=config, api_version=api_version)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/client.py", line 86, in create_client
verify, credentials, scoped_config, client_config, endpoint_bridge)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/client.py", line 328, in _get_client_args
verify, credentials, scoped_config, client_config, endpoint_bridge)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/args.py", line 47, in get_client_args
endpoint_url, is_secure, scoped_config)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/args.py", line 117, in compute_client_args
service_name, region_name, endpoint_url, is_secure)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/client.py", line 402, in resolve
service_name, region_name)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/regions.py", line 122, in construct_endpoint
partition, service_name, region_name)
File "/home/ubuntu/.local/lib/python3.6/site-packages/botocore/regions.py", line 135, in _endpoint_for_partition
raise NoRegionError()
botocore.exceptions.NoRegionError: You must specify a region.
I've searched the internet, and it seems most of the solutions point to having local configuration (e.g. ~/.aws/config, a boto3 config file, etc.).
Also, I have verified that from the EC2 instance I am able to get the region from the instance metadata:
$ curl --silent http://169.254.169.254/latest/dynamic/instance-identity/document
{
...
"region" : "us-east-2",
...
}
My workaround right now is to provide the environment variable AWS_DEFAULT_REGION, passed in via the Docker command line.
Here is the simple code I have to replicate the issue:
>>>import boto3
>>>dynamodb = boto3.resource('dynamodb')
I expect somehow boto3 is able to pick up the region that is already available in the EC2 instance.
There are two types of configuration data in boto3: credentials and non-credentials (including region). How boto3 reads them differs.
See:
Configuring Credentials
https://github.com/boto/boto3/issues/375
Specifically, boto3 retrieves credentials from the instance metadata service but not other configuration items (such as region).
So, you need to indicate which region you want. You can retrieve the current region from metadata and use it, if appropriate. Or use the environment variable AWS_DEFAULT_REGION.
You can pass region as a parameter to any boto3 resource.
dynamodb = boto3.resource('dynamodb', region_name='us-east-2')
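If you'd rather not hard-code the region, here is a minimal sketch (assuming the IMDSv1 metadata endpoint from the curl example above) that reads the region from instance metadata and passes it to boto3 explicitly:
import json
import urllib.request

import boto3

# Read the instance identity document from the EC2 metadata service.
doc = urllib.request.urlopen(
    "http://169.254.169.254/latest/dynamic/instance-identity/document",
    timeout=2,
).read()
region = json.loads(doc)["region"]

# Pass the region explicitly so no local configuration is needed.
dynamodb = boto3.resource("dynamodb", region_name=region)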
I am trying to publish messages to Google Pub/Sub in Python.
Here is the code that I tried:
from google.cloud import pubsub
ps = pubsub.Client()
topic = ps.topic("topic_name")
topic.publish("Message to topic")
I am getting the below error
File "/usr/local/lib/python2.7/dist-packages/google/cloud/iterator.py", line 218, in _items_iter
for page in self._page_iter(increment=False):
File "/usr/local/lib/python2.7/dist-packages/google/cloud/iterator.py", line 247, in _page_iter
page = self._next_page()
File "/usr/local/lib/python2.7/dist-packages/google/cloud/iterator.py", line 445, in _next_page
items = six.next(self._gax_page_iter)
File "/usr/local/lib/python2.7/dist-packages/google/gax/__init__.py", line 455, in next
return self.__next__()
File "/usr/local/lib/python2.7/dist-packages/google/gax/__init__.py", line 465, in __next__
response = self._func(self._request, **self._kwargs)
File "/usr/local/lib/python2.7/dist-packages/google/gax/api_callable.py", line 376, in inner
return a_func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/google/gax/retry.py", line 127, in inner
' classified as transient', exception)
google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.PERMISSION_DENIED, User not authorized to perform this action.)>)
I've downloaded service-account.json, and the path of service-account.json is set in GOOGLE_APPLICATION_CREDENTIALS.
I've also tried installing gcloud and executing gcloud auth application-default login.
Please note that I am able to publish messages using the gcloud command and in Java:
$ gcloud beta pubsub topics publish sample "hello"
messageIds: '127284267552464'
Java code
TopicName topicName = TopicName.create(SRC_PROJECT, SRC_TOPIC);
Publisher publisher = Publisher.defaultBuilder(topicName).build();
ByteString data1 = ByteString.copyFromUtf8("hello");
PubsubMessage pubsubMessage1 = PubsubMessage.newBuilder().setData(data1).build();
publisher.publish(pubsubMessage1);
What is missing in the Python code?
I followed the steps described over here
This was an issue with my setup: my service-account.json did not have enough permissions to publish messages to Pub/Sub.
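For reference, a hedged example of granting the Pub/Sub publisher role to the service account (the topic, project, and account names are placeholders):
$ gcloud pubsub topics add-iam-policy-binding topic_name \
    --member serviceAccount:my-sa@my-project.iam.gserviceaccount.com \
    --role roles/pubsub.publisher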
I have a Flask application where I can run a script (with the help of Flask-Script) that makes use of Google API discovery, using the code below:
app_script.py
import argparse
import csv
import httplib2
from apiclient import discovery
from oauth2client import client
from oauth2client.file import Storage
from oauth2client import tools
def get_auth_credentials():
    flow = client.flow_from_clientsecrets(
        '/path/to/client_screts.json',  # file downloaded from Google Developers Console
        scope='https://www.googleapis.com/auth/webmasters.readonly',
        redirect_uri='urn:ietf:wg:oauth:2.0:oob')
    storage = Storage('/path/to/storage_file.dat')
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        parser = argparse.ArgumentParser(parents=[tools.argparser])
        flags = parser.parse_args(['--noauth_local_webserver'])
        credentials = tools.run_flow(flow=flow, storage=storage, flags=flags)
    return credentials

def main():
    credentials = get_auth_credentials()
    http_auth = credentials.authorize(httplib2.Http())
    # build the service object
    service = discovery.build('webmasters', 'v3', http_auth)
Now the problem is that every time I shut down my computer, then boot and run the script again, I get the following error when trying to build the service object:
Terminal:
$ python app.py runscript
No handlers could be found for logger "oauth2client.util"
Traceback (most recent call last):
File "app.py", line 5, in <module>
testapp.manager.run()
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/flask_script/__init__.py", line 412, in run
result = self.handle(sys.argv[0], sys.argv[1:])
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/flask_script/__init__.py", line 383, in handle
res = handle(*args, **config)
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/flask_script/commands.py", line 216, in __call__
return self.run(*args, **kwargs)
File "/home/user/development/testproject/testapp/__init__.py", line 16, in runscript
metrics_collector.main()
File "/home/user/development/testproject/testapp/metrics_collector.py", line 177, in main
service = discovery.build('webmasters', 'v3', http_auth)
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/oauth2client/util.py", line 140, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/googleapiclient/discovery.py", line 206, in build
credentials=credentials)
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/oauth2client/util.py", line 140, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/user/.virtualenvs/testproject/local/lib/python2.7/site-packages/googleapiclient/discovery.py", line 306, in build_from_document
base = urljoin(service['rootUrl'], service['servicePath'])
KeyError: 'rootUrl'
Installed:
google-api-python-client==1.4.2
httplib2==0.9.2
Flask==0.10.1
Flask-Script==2.0.5
The script runs sometimes*, but that's the problem: I don't know why it runs sometimes and other times it doesn't.
*What I tried to make it work: deleting all cookies, downloading client_secrets.json from the Google Developers Console again, removing storage_file.dat, and removing all .pyc files from the project.
Can anyone help me see what's going on?
From a little bit of research here, it seems that the No handlers could be found for logger "oauth2client.util" error can actually mask a different error. You need to use the logging module and configure your system to output log messages.
Solution
Just add the following to configure logging:
import logging
logging.basicConfig()
Other helpful/related posts
Python - No handlers could be found for logger "OpenGL.error"
SOLVED: Error trying to access "google drive" with python (google quickstart.py source code)
Thank you so much for the tip, Avantol13, you were right: there was an error being masked.
The problem was that the following line:
service = discovery.build('webmasters', 'v3', http_auth)
should actually have been:
service = discovery.build('webmasters', 'v3', http=http_auth)
(googleapiclient's build only accepts the first two arguments positionally, so http must be passed as a keyword argument.) All working now. Thanks!
I am following the example in https://developers.google.com/storage/docs/gspythonlibrary#credentials
I created a client/secret pair by choosing, in the developer console, "create new client id", "installed application", "other".
I have the following code in my Python script:
import boto
from gcs_oauth2_boto_plugin.oauth2_helper import SetFallbackClientIdAndSecret
CLIENT_ID = 'my_client_id'
CLIENT_SECRET = 'xxxfoo'
SetFallbackClientIdAndSecret(CLIENT_ID, CLIENT_SECRET)
uri = boto.storage_uri('foobartest2014', 'gs')
header_values = {"x-goog-project-id": proj_id}
uri.create_bucket(headers=header_values)
and it fails with the following error:
File "/usr/local/lib/python2.7/dist-packages/boto/storage_uri.py", line 555, in create_bucket
conn = self.connect()
File "/usr/local/lib/python2.7/dist-packages/boto/storage_uri.py", line 140, in connect
**connection_args)
File "/usr/local/lib/python2.7/dist-packages/boto/gs/connection.py", line 47, in __init__
suppress_consec_slashes=suppress_consec_slashes)
File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 190, in __init__
validate_certs=validate_certs, profile_name=profile_name)
File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 572, in __init__
host, config, self.provider, self._required_auth_capability())
File "/usr/local/lib/python2.7/dist-packages/boto/auth.py", line 883, in get_auth_handler
'Check your credentials' % (len(names), str(names)))
boto.exception.NoAuthHandlerFound: No handler was ready to authenticate. 3 handlers were checked. ['OAuth2Auth', 'OAuth2ServiceAccountAuth', 'HmacAuthV1Handler'] Check your credentials
I had been struggling with this for the last couple of days; it turns out the boto stuff and that gspythonlibrary are totally obsolete.
The latest example code showing how to use/authenticate Google Cloud Storage is here:
https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/storage/api
You need to provide a client/secret pair in a .boto file and then run gsutil config.
It will create a refresh token, and then it should work!
For more info, see https://developers.google.com/storage/docs/gspythonlibrary#credentials
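As an aside, here is a minimal sketch of the same bucket creation using the current google-cloud-storage library (a different library than the one in the linked samples; the project ID is a placeholder, and credentials are resolved via Application Default Credentials):
from google.cloud import storage

# Credentials come from GOOGLE_APPLICATION_CREDENTIALS or
# `gcloud auth application-default login`.
client = storage.Client(project='your-project-id')  # placeholder project
bucket = client.create_bucket('foobartest2014')
print('Created bucket {}'.format(bucket.name))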
You can also make a console application to handle gsutil authentication, and pass gsutil cp, gsutil rm, and gsutil config -a through the console application to the Cloud SDK to execute.