Polyglot Processor with Local Dataflow Server - python

I have been trying to work with polyglot and build a simple Python processor. I followed the polyglot recipe, but I could not get the stream to deploy. I originally deployed the same processor that is used in the example and got the following errors:
Unknown command line arg requested: spring.cloud.stream.bindings.input.destination
Unknown environment variable requested: SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS
Traceback (most recent call last):
File "/processor/python_processor.py", line 10, in
consumer = KafkaConsumer(get_input_channel(), bootstrap_servers=[get_kafka_binder_brokers()])
File "/usr/local/lib/python2.7/dist-packages/kafka/consumer/group.py", line 353, in init
self._client = KafkaClient(metrics=self._metrics, **self.config)
File "/usr/local/lib/python2.7/dist-packages/kafka/client_async.py", line 203, in init
self.cluster = ClusterMetadata(**self.config)
File "/usr/local/lib/python2.7/dist-packages/kafka/cluster.py", line 67, in init
self._bootstrap_brokers = self._generate_bootstrap_brokers()
File "/usr/local/lib/python2.7/dist-packages/kafka/cluster.py", line 71, in _generate_bootstrap_brokers
bootstrap_hosts = collect_hosts(self.config['bootstrap_servers'])
File "/usr/local/lib/python2.7/dist-packages/kafka/conn.py", line 1336, in collect_hosts
host, port, afi = get_ip_port_afi(host_port)
File "/usr/local/lib/python2.7/dist-packages/kafka/conn.py", line 1289, in get_ip_port_afi
host_and_port_str = host_and_port_str.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
Exception AttributeError: "'KafkaClient' object has no attribute '_closed'" in <bound method KafkaClient.del of <kafka.client_async.KafkaClient object at 0x7f8b7024cf10>> ignored
I then attempted to pass the environment and binding arguments through the deployment stream, but that did not work. When I manually inserted the SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS and spring.cloud.stream.bindings.input.destination values into the Kafka consumer, I was able to deploy the stream as a workaround. I am not entirely sure what is causing the issue: would deploying this on Kubernetes be any different, or is this an issue with Polyglot and Data Flow? Any help with this would be appreciated.
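For reference, the workaround looked roughly like this (the broker address and topic name are placeholders for my local setup, normally supplied via SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS and spring.cloud.stream.bindings.input.destination):
from kafka import KafkaConsumer

# Hard-coded values instead of get_kafka_binder_brokers() / get_input_channel()
consumer = KafkaConsumer('python_processor_input',              # placeholder topic
                         bootstrap_servers=['localhost:9092'])  # placeholder broker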
Steps to reproduce:
Attempt to deploy polyglot-processor stream from polyglot recipe on local dataflow server. I am also using the same stream definition as in the example: http --server.port=32123 | python-processor --reversestring=true | log.
Additional context:
I am attempting to deploy the stream on a local installation of SCDF and Kafka, since I had some issues deploying custom Python applications with Docker.

The recipe you have posted above expects the SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS environment variable to be present as part of the server configuration (since the streams are managed via the Skipper server, you would need to set this environment variable in your Skipper server configuration).
You can check this documentation on how to set SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS as an environment property in the Skipper server deployment.
You can also pass this property as a deployer property when deploying the python-processor stream app. You can refer to this documentation on how to pass a deployment property to set Spring Cloud Stream properties (here, the binder configuration property) at the time of stream deployment.
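On the processor side, a small defensive check also makes this kind of misconfiguration easier to spot. A minimal sketch, assuming the recipe's helper functions ultimately read os.environ; it fails fast with a clear message instead of passing None into KafkaConsumer:
import os

def require_env(name):
    # Fail fast with a readable error instead of the NoneType .strip() traceback above
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError('Required environment variable %s is not set' % name)
    return value

brokers = require_env('SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS')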

Related

Logging Artifacts from MlFlow on GCS Bucket

I have a running MlFlow server on GCS VM instance. I have created a bucket to log the artifacts.
This is the command I'm running to start the server and specify the bucket path:
mlflow server --default-artifact-root gs://gcs_bucket/artifacts --host x.x.x.x
But facing this error:
TypeError: stat: path should be string, bytes, os.PathLike or integer, not ElasticNet
Note: the MLflow server runs fine with the specified host alone; the problem only appears when I specify the storage bucket path.
I have set up access to the storage API by using these commands:
gcloud auth application-default login
gcloud auth login
Also, on printing the artifact URI, this is what I'm getting:
mlflow.get_artifact_uri()
Output:
gs://gcs_bucket/artifacts/0/122481bf990xxxxxxxxxxxxxxxxxxxxx/artifacts
In the above path, where is 0/122481bf990xxxxxxxxxxxxxxxxxxxxx/artifacts coming from, and why is it not getting auto-created at gs://gcs_bucket/artifacts?
After debugging further, I also don't understand why it is not able to resolve the local path on the VM.
This is the error I'm getting on the VM:
WARNING:root:Malformed experiment 'mlruns'. Detailed error Yaml file './mlruns/mlruns/meta.yaml' does not exist.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/mlflow/store/tracking/file_store.py", line 197, in list_experiments
experiment = self._get_experiment(exp_id, view_type)
File "/usr/local/lib/python3.6/dist-packages/mlflow/store/tracking/file_store.py", line 256, in _get_experiment
meta = read_yaml(experiment_dir, FileStore.META_DATA_FILE_NAME)
File "/usr/local/lib/python3.6/dist-packages/mlflow/utils/file_utils.py", line 160, in read_yaml
raise MissingConfigException("Yaml file '%s' does not exist." % file_path)
mlflow.exceptions.MissingConfigException: Yaml file './mlruns/mlruns/meta.yaml' does not exist.
Can someone point me to a solution and tell me what I'm missing?
I think the main issue comes from the deployment structure you are using. For your use case, the suitable structure is the one described here: you are missing the backend store URI, which MLflow uses to store run metadata. So please install an SQL database (PostgreSQL, ...) first, then pass its connection string to --backend-store-uri.
In case you want to use MLflow as a model registry and store artifacts on GCS, you can use the structure described here, adding the flags --artifacts-only --serve-artifacts.
Hope this helps.
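For example, a sketch of what the server command could then look like (the PostgreSQL connection string is a placeholder; replace it with your own database, and a driver such as psycopg2 needs to be installed):
mlflow server \
  --backend-store-uri postgresql://mlflow_user:mlflow_pass@localhost:5432/mlflow_db \
  --default-artifact-root gs://gcs_bucket/artifacts \
  --host x.x.x.x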

AWS NoCredentials in training

I am attempting to run the example code for Amazon Sagemaker on a local GPU. I have copied the code from the Jupyter notebook to the following Python script:
import boto3
import subprocess
import sagemaker
from sagemaker.mxnet import MXNet
from mxnet import gluon
from sagemaker import get_execution_role
import os

sagemaker_session = sagemaker.Session()

instance_type = 'local'
if subprocess.call('nvidia-smi') == 0:
    # Set type to GPU if one is present
    instance_type = 'local_gpu'

# role = get_execution_role()

gluon.data.vision.MNIST('./data/train', train=True)
gluon.data.vision.MNIST('./data/test', train=False)

# successfully connects and uploads data
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/mnist')

hyperparameters = {
    'batch_size': 100,
    'epochs': 20,
    'learning_rate': 0.1,
    'momentum': 0.9,
    'log_interval': 100
}

m = MXNet("mnist.py",
          role=role,
          train_instance_count=1,
          train_instance_type=instance_type,
          framework_version="1.1.0",
          hyperparameters=hyperparameters)

# fails in Docker container
m.fit(inputs)
predictor = m.deploy(initial_instance_count=1, instance_type=instance_type)
m.delete_endpoint()
where the referenced mnist.py file is exactly as specified on GitHub. The script fails at m.fit inside the Docker container with the following error:
algo-1-1DUU4_1 | Downloading s3://<S3-BUCKET>/sagemaker-mxnet-2018-10-07-00-47-10-435/source/sourcedir.tar.gz to /tmp/script.tar.gz
algo-1-1DUU4_1 | 2018-10-07 00:47:29,219 ERROR - container_support.training - uncaught exception during training: Unable to locate credentials
algo-1-1DUU4_1 | Traceback (most recent call last):
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
algo-1-1DUU4_1 | fw.train()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/mxnet_container/train.py", line 169, in train
algo-1-1DUU4_1 | mxnet_env.download_user_module()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/environment.py", line 89, in download_user_module
algo-1-1DUU4_1 | cs.download_s3_resource(self.user_script_archive, tmp)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/utils.py", line 37, in download_s3_resource
algo-1-1DUU4_1 | script_bucket.download_file(script_key_name, target)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/inject.py", line 246, in bucket_download_file
algo-1-1DUU4_1 | ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/inject.py", line 172, in download_file
algo-1-1DUU4_1 | extra_args=ExtraArgs, callback=Callback)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/transfer.py", line 307, in download_file
algo-1-1DUU4_1 | future.result()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 73, in result
algo-1-1DUU4_1 | return self._coordinator.result()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 233, in result
algo-1-1DUU4_1 | raise self._exception
algo-1-1DUU4_1 | NoCredentialsError: Unable to locate credentials
I am confused that I can authenticate to S3 outside of the container (to upload the training/test data) but cannot within the Docker container. So I am guessing the issue has to do with passing the AWS credentials to the Docker container. Here is the generated Docker Compose file:
networks:
  sagemaker-local:
    name: sagemaker-local
services:
  algo-1-1DUU4:
    command: train
    environment:
      - AWS_REGION=us-west-2
      - TRAINING_JOB_NAME=sagemaker-mxnet-2018-10-07-00-47-10-435
    image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/sagemaker-mxnet:1.1.0-gpu-py2
    networks:
      sagemaker-local:
        aliases:
          - algo-1-1DUU4
    stdin_open: true
    tty: true
    volumes:
      - /tmp/tmpSkaR3x/algo-1-1DUU4/input:/opt/ml/input
      - /tmp/tmpSkaR3x/algo-1-1DUU4/output:/opt/ml/output
      - /tmp/tmpSkaR3x/algo-1-1DUU4/output/data:/opt/ml/output/data
      - /tmp/tmpSkaR3x/model:/opt/ml/model
version: '2.1'
Should the AWS credentials be passed in as environment variables?
I upgraded my sagemaker install after reading Using boto3 in install local mode?, but that had no effect. I checked the credentials that are being fetched in the SageMaker session (outside the container) and they appear to be blank, even though I have ~/.aws/config and ~/.aws/credentials files:
{'_token': None, '_time_fetcher': <function _local_now at 0x7f4dbbe75230>, '_access_key': None, '_frozen_credentials': None, '_refresh_using': <bound method AssumeRoleCredentialFetcher.fetch_credentials of <botocore.credentials.AssumeRoleCredentialFetcher object at 0x7f4d2de48bd0>>, '_secret_key': None, '_expiry_time': None, 'method': 'assume-role', '_refresh_lock': <thread.lock object at 0x7f4d9f2aafd0>}
I am new to AWS so I do not know how to diagnose the issue regarding AWS credentials. My .aws/config file has the following information (with placeholder values):
[default]
output = json
region = us-west-2
role_arn = arn:aws:iam::123456789012:role/SageMakers
source_profile = sagemaker-test
[profile sagemaker-test]
output = json
region = us-west-2
Where the sagemaker-test profile has AmazonSageMakerFullAccess in the IAM Management Console.
The .aws/credentials file has the following information (represented by placeholder values):
[default]
aws_access_key_id = 1234567890
aws_secret_access_key = zyxwvutsrqponmlkjihgfedcba
[sagemaker-test]
aws_access_key_id = 0987654321
aws_secret_access_key = abcdefghijklmopqrstuvwxyz
Lastly, these are versions of the applicable libraries from a pip freeze:
awscli==1.16.19
boto==2.48.0
boto3==1.9.18
botocore==1.12.18
docker==3.5.0
docker-compose==1.22.0
mxnet-cu91==1.1.0.post0
sagemaker==1.11.1
Please let me know if I left out any relevant information and thanks for any help/feedback that you can provide.
UPDATE: Thanks for your help, everyone! While attempting some of your suggested fixes, I noticed that boto3 was out of date and updated it (to boto3-1.9.26 and botocore-1.12.26), which appeared to resolve the issue. I was not able to find any documentation on this being an issue with boto3==1.9.18. If someone could help me understand what the issue was with boto3, I would be happy to mark their answer as correct.
SageMaker local mode is designed to pick up whatever credentials are available in your boto3 session, and pass them into the docker container as environment variables.
However, the version of the sagemaker sdk that you are using (1.11.1 and earlier) will ignore the credentials if they include a token, because that usually indicates short-lived credentials that won't remain valid long enough for a training job to complete or endpoint to be useful.
If you are using temporary credentials, try replacing them with permanent ones, or running from an ec2 instance (or SageMaker notebook!) that has an appropriate instance role assigned.
Also, the sagemaker sdk's handling of credentials changed in v1.11.2 and later -- temporary credentials will be passed to local mode containers, but with a warning message. So you could just upgrade to a newer version and try again (pip install -U sagemaker).
Also, boto3's credential handling can change between versions, so try using the latest version of boto3 as well.
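If it helps, here is a minimal sketch of pinning the SDK to a specific set of long-lived credentials (assuming a profile named sagemaker-test backed by permanent keys, as in the question's ~/.aws files):
import boto3
import sagemaker

# Build the boto3 session from a profile that holds permanent (non-temporary) credentials,
# then hand it to the SageMaker session so local mode uses exactly those credentials.
boto_session = boto3.Session(profile_name='sagemaker-test', region_name='us-west-2')
sagemaker_session = sagemaker.Session(boto_session=boto_session)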
I just confirmed that this example works on my machine locally. Please make sure the role you are using has permission to use buckets whose names start with sagemaker; SageMaker by default creates buckets prefixed with sagemaker.
It looks like you have the credentials configured on your host at ~/.aws/credentials but are trying to access them on a docker container running on the host.
The simplest solution seems to be mounting your AWS credentials into the container at the expected location. You appear to be using the sagemaker-mxnet:1.1.0-gpu-py2 image, which appears to run as the root user. Based on this, if you update the volumes in your docker-compose file for algo-1-1DUU4 to include:
volumes:
  ...
  - ~/.aws/:/root/.aws/
this will mount your credentials into the root user's home directory in your container, so that your Python script should be able to access them.
I'll assume that the library you're using has boto3 at its core. boto3 advises that there are several methods of authentication available to you.
Passing credentials as parameters in the boto3.client() method
Passing credentials as parameters when creating a Session object
Environment variables
Shared credential file (~/.aws/credentials)
AWS config file (~/.aws/config)
Assume Role provider
Boto2 config file (/etc/boto.cfg and ~/.boto)
Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
But it sounds like the docker sandbox does not have access to your ~/.aws/credentials file, so I'd consider the other options that may be available to you. As I'm unfamiliar with docker, I can't give you a guaranteed solution for your scenario.
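As an illustration of the first two options (a sketch only; the key values are placeholders), credentials can be handed to boto3 explicitly so that nothing on the container's filesystem is required:
import boto3

# Explicit credentials when creating a Session (placeholder values, not real keys)
session = boto3.Session(
    aws_access_key_id='AKIA...',
    aws_secret_access_key='...',
    region_name='us-west-2',
)
s3 = session.client('s3')
Equivalently, setting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as environment variables in the container (option 3 above) achieves the same result without code changes.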

Set IAM Policy works on local machine but not in GCE instance

The following lines from my Python app execute with no problems on my local machine.
import googleapiclient.discovery
project_id = 'some-project-id'
resource_manager = googleapiclient.discovery.build('cloudresourcemanager', 'v1')
iam_policy_request = resource_manager.projects().getIamPolicy(resource=project_id, body={})
iam_policy_response = iam_policy_request.execute(num_retries=3)
new_policy = dict()
new_policy['policy'] = iam_policy_response
del new_policy['policy']['version']
iam_policy_update_request = resource_manager.projects().setIamPolicy(resource=project_id, body=new_policy)
update_result = iam_policy_update_request.execute(num_retries=3)
When I run the app in a GCE instance, and more precisely from within a Docker container inside the GCE instance, I get the exception:
URL being requested: POST https://cloudresourcemanager.googleapis.com/v1/projects/some-project-id:setIamPolicy?alt=json
Traceback (most recent call last):
File "/env/lib/python3.5/site-packages/google/api_core/grpc_helpers.py", line 54, in error_remapped_callable
return callable_(*args, **kwargs)
File "/env/lib/python3.5/site-packages/grpc/_channel.py", line 487, in __call__
return _end_unary_response_blocking(state, call, False, deadline)
File "/env/lib/python3.5/site-packages/grpc/_channel.py", line 437, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.PERMISSION_DENIED, User not authorized to perform this action.)>
i.e. an authorization error. Oddly, when I open a Python terminal session inside the GCE instance and run the Python code line by line, I do not get the exception. It only throws the exception when the code is running as part of the app.
I am using a service account inside of the GCE instance, as opposed to my regular account on my local machine. But I don't think that is the problem since I am able to run the lines of code one by one inside of the instance while still relying on the service account roles.
I would like to be able to run the app without the exception within the Docker container inside of GCE. I feel like I'm missing something but can't figure out what the missing piece is.
Looking at your issue, it seems to be an authentication problem: your application is not properly authenticated.
1- First, run this command; it will let your application temporarily use your own user credentials:
gcloud beta auth application-default login
The output should look like this:
Credentials saved to file: $SOME_PATH/application_default_credentials.json
2- Then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that credentials file:
export GOOGLE_APPLICATION_CREDENTIALS=$SOME_PATH/application_default_credentials.json
Try to run your application after that.
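Alternatively (a sketch, assuming you have a service account key file available inside the container at the path shown), you can load the credentials explicitly and pass them to the client instead of relying on the environment:
from google.oauth2 import service_account
import googleapiclient.discovery

# The key file path is a placeholder; adjust it to wherever the key is mounted
credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key.json',
    scopes=['https://www.googleapis.com/auth/cloud-platform'])

# Build the client with the explicit credentials rather than application defaults
resource_manager = googleapiclient.discovery.build(
    'cloudresourcemanager', 'v1', credentials=credentials)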

GCS with GKE, 403 Insufficient permission for writing into GCS bucket [duplicate]

This question already has an answer here:
Is it necessary to recreate a Google Container Engine cluster to modify API permissions?
(1 answer)
Closed 5 years ago.
Currently I'm trying to write files into a Google Cloud Storage bucket. For this, I have used the django-storages package.
I deployed my code and got into the running container through the Kubernetes kubectl utility to check that writing to the GCS bucket works.
$ kubectl exec -it foo-pod -c foo-container --namespace=testing python manage.py shell
I am able to read the bucket, but if I try to write into it, it shows the traceback below.
>>> from django.core.files.storage import default_storage
>>> f = default_storage.open('storage_test', 'w')
>>> f.write('hi')
2
>>> f.close()
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 946, in upload_from_file
client, file_obj, content_type, size, num_retries)
File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 867, in _do_upload
client, stream, content_type, size, num_retries)
File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 700, in _do_multipart_upload
transport, data, object_metadata, content_type)
File "/usr/local/lib/python3.6/site-packages/google/resumable_media/requests/upload.py", line 98, in transmit
self._process_response(result)
File "/usr/local/lib/python3.6/site-packages/google/resumable_media/_upload.py", line 110, in _process_response
response, (http_client.OK,), self._get_status_code)
File "/usr/local/lib/python3.6/site-packages/google/resumable_media/_helpers.py", line 93, in require_status_code
status_code, u'Expected one of', *status_codes)
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/storages/backends/gcloud.py", line 75, in close
self.blob.upload_from_file(self.file, content_type=self.mime_type)
File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 949, in upload_from_file
_raise_from_invalid_response(exc)
File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 1735, in _raise_from_invalid_response
raise exceptions.from_http_response(error.response)
google.api_core.exceptions.Forbidden: 403 POST https://www.googleapis.com/upload/storage/v1/b/foo.com/o?uploadType=multipart: Insufficient Permission
>>> default_storage.url('new docker')
'https://storage.googleapis.com/foo.appspot.com/new%20docker'
>>>
It seems to be completely related to the bucket permissions, so I assigned the Storage Admin and Storage Object Creator roles to the Google Cloud Build service account (through bucket -> manage permissions), but it still shows the same error.
A possible explanation for this would be if you haven't assigned your cluster with the correct scope. If this is the case, the nodes in the cluster would not have the required authorisation/permission to write to Google Cloud Storage which could explain the 403 error you're seeing.
If no scope is set when the cluster is created, the default scope is assigned and this only provides read permission for Cloud Storage.
In order to check the cluster's current scopes using the Cloud SDK, you could try running a 'describe' command from Cloud Shell, for example:
gcloud container clusters describe CLUSTER-NAME --zone ZONE
The oauthScopes section of the output contains the current scopes assigned to the cluster/nodes.
The default read only Cloud Storage scope would display:
https://www.googleapis.com/auth/devstorage.read_only
If the Cloud Storage read/write scope is set the output will display:
https://www.googleapis.com/auth/devstorage.read_write
The scope can be set during cluster creation using the --scopes flag followed by the desired scope identifier. In your case, this would be “storage-rw”. For example, you could run something like:
gcloud container clusters create CLUSTER-NAME --zone ZONE --scopes storage-rw
The storage-rw scope, combined with your service account should then allow the nodes in your cluster to write to Cloud Storage.
Alternatively, if you don't want to recreate the cluster, you can create a new node pool with the desired scopes and then delete your old node pool. See the accepted answer for Is it necessary to recreate a Google Container Engine cluster to modify API permissions? for information on how to achieve this.
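For example, a sketch of the node pool approach (the pool, cluster, and zone names are placeholders):
gcloud container node-pools create new-pool --cluster CLUSTER-NAME --zone ZONE --scopes storage-rw
Once your workloads have been rescheduled onto the new pool, the old pool can be removed with gcloud container node-pools delete.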

Authentication for Google Storage using Python

I want to build an app which has easy interactions with Google Storage, i.e., list files in a bucket, download a file, and upload a file.
Following this tutorial, I decided to use a service account (not a user one) for authentication and followed the procedure. I created a public/private key in my console and downloaded the key to my machine. Then I created the .boto file which points to this private key, and finally launched this program, and it worked:
import boto
import gcs_oauth2_boto_plugin
uri = boto.storage_uri('txxxxxxxxxxxxxx9.appspot.com', 'gs')
for obj in uri.get_bucket():
    print '%s://%s/%s' % (uri.scheme, uri.bucket_name, obj.name)
As you can see, the package gcs_oauth2_boto_plugin is not used in the code, so I decided to get rid of it. But magically, when I comment out the import gcs_oauth2_boto_plugin line and run the program again, I get this error:
C:\Users\...\Anaconda3\envs\snakes\python.exe C:/Users/.../Dropbox/Prog/s3_manifest_builder/test.py
Traceback (most recent call last):
File "C:/Users/.../Dropbox/Prog/s3_manifest_builder/test.py", line 10, in <module>
for obj in uri.get_bucket():
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\storage_uri.py", line 181, in get_bucket
conn = self.connect()
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\storage_uri.py", line 140, in connect
**connection_args)
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\gs\connection.py", line 47, in __init__
suppress_consec_slashes=suppress_consec_slashes)
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\s3\connection.py", line 190, in __init__
validate_certs=validate_certs, profile_name=profile_name)
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\connection.py", line 569, in __init__
host, config, self.provider, self._required_auth_capability())
File "C:\Users\...\Anaconda3\envs\snakes\lib\site-packages\boto\auth.py", line 987, in get_auth_handler
'Check your credentials' % (len(names), str(names)))
boto.exception.NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV1Handler'] Check your credentials
So my questions are:
1- how can you explain that deleting an import which IS NOT USED in the code makes it fail?
2- more generally, to be sure I understand the authentication process: if I want to run my app on a machine, must I be sure to have the .boto file (which points to my service account private key) generated beforehand? Or is there a cleaner/easier way to give my application access to Google Storage for in/out interactions?
For instance, I only have to provide the public and private keys as strings to my program when I want to connect to an S3 bucket with boto. I don't need to generate a .boto file, import packages, etc., which makes it so much easier to use, isn't it?
1- how can you explain that deleting an import which IS NOT USED in the code makes it fail?
The first hint is that the module is named a "plugin", although exactly how that's implemented isn't clear on the surface. It intuitively makes some sense that not importing a module could lead to an exception of this kind, though. Initially, I thought it was the bad practice of doing stateful work at global scope during module import. In some ways, that is what it is, but only because class hierarchies are "state" in the meta-programmable Python.
It turns out (as in many cases) that inspecting the location the stacktrace was thrown from (boto.auth.get_auth_handler()) provides the key to understanding the issue.
(See the linked source for the commented version.)
def get_auth_handler(host, config, provider, requested_capability=None):
    ready_handlers = []
    auth_handlers = boto.plugin.get_plugin(AuthHandler, requested_capability)
    for handler in auth_handlers:
        try:
            ready_handlers.append(handler(host, config, provider))
        except boto.auth_handler.NotReadyToAuthenticate:
            pass

    if not ready_handlers:
        checked_handlers = auth_handlers
        names = [handler.__name__ for handler in checked_handlers]
        raise boto.exception.NoAuthHandlerFound(
            'No handler was ready to authenticate. %d handlers were checked.'
            ' %s '
            'Check your credentials' % (len(names), str(names)))
Note the reference to the class AuthHandler, which is defined in boto.auth_handler.
So, you can see that we need to look at the contents of boto.plugin.get_plugin(AuthHandler, requested_capability):
def get_plugin(cls, requested_capability=None):
    if not requested_capability:
        requested_capability = []
    result = []
    for handler in cls.__subclasses__():
        if handler.is_capable(requested_capability):
            result.append(handler)
    return result
So it finally becomes clear when we look at the class definition of OAuth2Auth in gcs_oauth2_boto_plugin.oauth2_plugin, where it is declared as a subclass of boto.auth_handler.AuthHandler and signals its auth capabilities to the boto framework via the following member variable:
capability = ['google-oauth2', 's3']
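To make the mechanism concrete, here is a toy illustration (unrelated to boto's real classes) of how merely importing a module that defines a subclass is enough to "register" it via __subclasses__():
class PluginBase(object):
    # Stands in for boto.auth_handler.AuthHandler
    pass

def discover_plugins():
    # Mirrors what boto.plugin.get_plugin() does at its core
    return PluginBase.__subclasses__()

print(discover_plugins())   # [] -- nothing registered yet

# Importing the plugin module executes a class definition like this one,
# which implicitly registers the subclass with the base class:
class OAuth2LikeHandler(PluginBase):
    capability = ['google-oauth2', 's3']

print(discover_plugins())   # [<class '__main__.OAuth2LikeHandler'>]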
2- more generally, to be sure I understand the authentication process: if I want to run my app on a machine, must I be sure to have the .boto file (which points to my service account private key) generated beforehand? Or is there a cleaner/easier way to give my application access to Google Storage for in/out interactions?
This has a more general answer: you can use a .boto file, but you can also use service account credentials, and you could even use the REST API and go through an OAuth2 flow to get the tokens needed to send in the Authorization header. The various methods of authenticating to Cloud Storage are in the documentation. The tutorial/doc you linked shows some methods; you've used a .boto file for another. You can read about the Cloud Storage REST API (JSON) here, and you can read about Python OAuth2 flows of various kinds here.
