I've been following this guide here: https://aws.amazon.com/blogs/machine-learning/building-an-nlu-powered-search-application-with-amazon-sagemaker-and-the-amazon-es-knn-feature/
I have successfully deployed the model from my notebook instance. I am also able to generate predictions by calling the predict() method from sagemaker.predictor.
This is how I created and deployed the model:
class StringPredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(StringPredictor, self).__init__(endpoint_name, sagemaker_session, content_type='text/plain')

pytorch_model = PyTorchModel(model_data=inputs,
                             role=role,
                             entry_point='inference.py',
                             source_dir='./code',
                             framework_version='1.3.1',
                             py_version='py3',
                             predictor_cls=StringPredictor)

predictor = pytorch_model.deploy(instance_type='ml.m5.large', initial_instance_count=4)
From the SageMaker dashboard, I can even see my endpoint, and its status is "InService".
If I run aws sagemaker list-endpoints I can see my desired endpoint showing up correctly as well.
My issue is that when I run this code (outside of SageMaker), I get an error:
import boto3
sm_runtime_client = boto3.client('sagemaker-runtime')
payload = "somestring that is used here"
response = sm_runtime_client.invoke_endpoint(EndpointName='pytorch-inference-xxxx',ContentType='text/plain',Body=payload)
This is the error thrown
botocore.errorfactory.ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Endpoint pytorch-inference-xxxx of account xxxxxx not found.
This is quite strange, as I'm able to see and run the endpoint just fine from the SageMaker notebook, and I am able to run the predict() method too.
I have verified the region, endpoint name and the account number.
I was having the exact same error; I just fixed mine by setting the correct region.
I know that you have indicated that you have verified the region, but in my case the remote computer had another region configured. So I just ran the following command on my remote computer:
aws configure
Once I set the access key ID and secret key again, I set the correct region and the error was gone.
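If you prefer to pin the region in code instead of relying on the shared AWS config, here is a minimal sketch (the region name below is just an assumption; use whichever region the endpoint was actually deployed in):

import boto3

# Point the runtime client explicitly at the endpoint's region so a stale
# local AWS config can't send the request to the wrong region.
sm_runtime_client = boto3.client('sagemaker-runtime', region_name='us-east-1')

response = sm_runtime_client.invoke_endpoint(
    EndpointName='pytorch-inference-xxxx',
    ContentType='text/plain',
    Body="somestring that is used here",
)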
I am trying to debug and work with an Azure Function in Rider. This error only occurs when I run it locally; deploying the function to Azure works correctly.
When I run this block of code
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

default_credentials = DefaultAzureCredential()
keyvault = SecretClient(
    vault_url=azure_shared.key_vault,
    credential=default_credentials
)
api_key = keyvault.get_secret("apikey").value
I get the following error:
ClientAuthenticationError: (Unauthorized) AKV10032: Invalid issuer. Expected one of https://sts.windows.net/xxxxxx-xxxx-xxxx-xxxx-4a5f0358090a/, https://sts.windows.net/xxxxxx-xxxx-xxxx-xxxx-5f571e91255a/, https://sts.windows.net/xxxxxx-xxxx-xxxx-xxxx-dee5fc7331f4/, found https://sts.windows.net/xxxxxx-xxxx-xxxx-xxxx-579c58293b4b/.
I only have one subscription.
az account show confirms that the account I am logged in as is the one ending in 90a, so it is the expected account.
However, if I run az login and log in with my work account, the tenantId is the b4b one.
Why is Rider / Azure Functions using a different credential than the one I have provided? Is it stored somewhere locally?
Thank you JamesTran-MSFT (Microsoft Docs) and user Madhanlal (Stack Overflow). Posting your suggestions as an answer to help other community members.
You can try the following to resolve the "AKV10032: Invalid issuer. Expected one of https://sts.windows.net/..." error:
This error could be a cross-tenant issue.
If you set the subscription as the default just before, it should work:
az account set --subscription {SubID}
az keyvault secret list --vault-name myVault
Re-execute the code:
default_credentials = DefaultAzureCredential()
keyvault = SecretClient(
    vault_url=azure_shared.key_vault,
    credential=default_credentials
)
api_key = keyvault.get_secret("apikey").value
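If DefaultAzureCredential keeps picking up a cached token from another tenant (for example one stored by Visual Studio or VS Code), another option is to bypass the chain and use the Azure CLI login directly. A rough sketch, assuming az login is pointing at the right tenant (the vault URL here is a placeholder):

from azure.identity import AzureCliCredential
from azure.keyvault.secrets import SecretClient

# Use only the credential from 'az login' instead of DefaultAzureCredential's
# full chain, so cached tokens from other tools/tenants are ignored.
credential = AzureCliCredential()

keyvault = SecretClient(
    vault_url="https://my-vault.vault.azure.net",  # placeholder vault URL
    credential=credential,
)
api_key = keyvault.get_secret("apikey").value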
References: Unable to retrieve password from keyvault - ERROR: AKV10032: Invalid issuer - Microsoft Q&A and How to solve azure keyvault secrets (Unauthorized) AKV10032: Invalid issuer. error in Python - Stack Overflow
blob.upload_from_filename(source) gives the error
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.Forbidden: 403 POST https://www.googleapis.com/upload/storage/v1/b/bucket1-newsdata-bluetechsoft/o?uploadType=multipart: ('Request failed with status code', 403, 'Expected one of', )
I am following the Google Cloud example written in Python here.
from google.cloud import storage
def upload_blob(bucket, source, des):
    client = storage.Client.from_service_account_json('/path')
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket)
    blob = bucket.blob(des)
    blob.upload_from_filename(source)
I used gsutil to upload files, which is working fine.
I tried to list the bucket names using the Python script, which also works fine.
I have the necessary permissions and GOOGLE_APPLICATION_CREDENTIALS set.
This whole thing wasn't working because the service account I am using in GCP didn't have the Storage Admin permission.
Granting Storage Admin to my service account solved my problem.
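For reference, you can grant that role from the CLI with something along these lines (the project ID and service account email below are placeholders):

gcloud projects add-iam-policy-binding my-project-id --member="serviceAccount:my-sa@my-project-id.iam.gserviceaccount.com" --role="roles/storage.admin"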
As other answers have indicated, this is a permissions issue. I have found the following command to be a useful way to create default application credentials for the currently logged-in user.
Assuming you got this error while running the code on some machine, the following steps should be sufficient:
SSH into the VM where the code is running or will be running. Make sure you are a user who has permission to upload to Google Cloud Storage.
Run the following command:
gcloud auth application-default login
The command above will ask you to create a token by clicking on a URL. Generate the token and paste it into the SSH console.
That's it. Any Python application started as that user will use this as the default credential for interacting with storage buckets.
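For example, here is a minimal sketch that relies purely on those application-default credentials (the bucket name and paths are placeholders):

from google.cloud import storage

# No explicit key file: the client picks up the application-default
# credentials created by 'gcloud auth application-default login'.
client = storage.Client()
bucket = client.get_bucket("my-bucket")
blob = bucket.blob("folder/file.txt")
blob.upload_from_filename("local_file.txt")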
Happy GCP'ing :)
This question is more appropriate for a support case.
As you are getting a 403, most likely you are missing a permission in IAM; the Google Cloud Platform support team will be able to inspect your resources and configuration.
This is what worked for me when the Google documentation didn't work. I was getting the same error even with the appropriate permissions.
import pathlib
import google.cloud.storage as gcs

client = gcs.Client()

# Set the target file to write to
target = pathlib.Path("local_file.txt")

# Set the file to download
FULL_FILE_PATH = "gs://bucket_name/folder_name/file_name.txt"

# Open a file stream with write permissions
with target.open(mode="wb") as downloaded_file:
    # Download and write the file locally
    client.download_blob_to_file(FULL_FILE_PATH, downloaded_file)
I am deploying a machine learning image to Azure Container Instances from Azure Machine Learning service according to this article, but I always get stuck with the error message:
Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance xxxxxxx'.
I tried increasing memory_gb to 4 in aci_config, and I did some troubleshooting locally, but I could not find anything.
Below is my score.py
import json
import joblib
import numpy as np
from azureml.core.model import Model

def init():
    global model
    model_path = Model.get_model_path('pofc_fc_model')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    y_hat = model.predict(data)
    return y_hat.tolist()
Have you registered the model 'pofc_fc_model' in your workspace using the register() function on the Model object? If not, there will be no model path, and that can cause the failure.
See this section on model registration: https://learn.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#registermodel
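For reference, a minimal sketch of that registration step (the local model path below is an assumption; point it at your actual serialized model file):

from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()

# Register the serialized model so Model.get_model_path('pofc_fc_model')
# can resolve it inside the scoring container.
model = Model.register(workspace=ws,
                       model_path="outputs/pofc_fc_model.pkl",  # hypothetical local path
                       model_name="pofc_fc_model")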
I've built the following script:
import boto
import sys
import gcs_oauth2_boto_plugin
def check_size_lzo(ds):
    # URI scheme for Cloud Storage.
    CLIENT_ID = 'myclientid'
    CLIENT_SECRET = 'mysecret'
    GOOGLE_STORAGE = 'gs'
    dir_file = 'date_id={ds}/apollo_export_{ds}.lzo'.format(ds=ds)

    gcs_oauth2_boto_plugin.SetFallbackClientIdAndSecret(CLIENT_ID, CLIENT_SECRET)
    uri = boto.storage_uri('my_bucket/data/apollo/prod/' + dir_file, GOOGLE_STORAGE)
    key = uri.get_key()

    if key.size < 45379959:
        raise ValueError('umg lzo file is too small, investigate')
    else:
        print('umg lzo file is %sMB' % round((key.size / 1e6), 2))

if __name__ == "__main__":
    check_size_lzo(sys.argv[1])
It works fine locally, but when I try to run it on a Kubernetes cluster I get the following error:
boto.exception.GSResponseError: GSResponseError: 403 Access denied to 'gs://my_bucket/data/apollo/prod/date_id=20180628/apollo_export_20180628.lzo'
I have updated the .boto file on my cluster and added my OAuth client ID and secret, but I'm still having the same issue.
Would really appreciate help resolving this issue.
Many thanks!
If it works in one environment and fails in another, I assume that you're getting your auth from a .boto file (or possibly from the OAUTH2_CLIENT_ID environment variable), but your kubernetes instance is lacking such a file. That you got a 403 instead of a 401 says that your remote server is correctly authenticating as somebody, but that somebody is not authorized to access the object, so presumably you're making the call as a different user.
Unless you've changed something, I'm guessing that you're getting the default Kubernetes Engine auth, which means a service account associated with your project. That service account probably hasn't been granted read permission for your object, which is why you're getting a 403. Grant it read/write permission for your GCS resources, and that should solve the problem.
Also note that by default the default credentials aren't scoped to include GCS, so you'll need to add that as well and then restart the instance.
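For example, you could grant the node's service account object access on the bucket with something like this (the service account email and bucket name are placeholders):

gsutil iam ch serviceAccount:my-sa@my-project.iam.gserviceaccount.com:objectAdmin gs://my_bucket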
I am trying to run this script:
from __future__ import print_function
import paramiko
import boto3
#print('Loading function')
paramiko.util.log_to_file("/tmp/Dawny.log")
# List of EC2 variables
region = 'us-east-1'
image = 'ami-<>'
keyname = '<>.pem'
ec2 = boto3.resource('ec2')
instances = ec2.create_instances(
    ImageId=image, MinCount=1, MaxCount=1,
    InstanceType='t2.micro', KeyName=keyname)
instance = instances[0]
instance.wait_until_running()
instance.load()
print(instance.public_dns_name)
I am running this script on a server which has all the AWS configuration done (via aws configure).
And when I run it, I get this error:
botocore.exceptions.ClientError: An error occurred (AuthFailure) when calling the RunInstances operation: Not authorized for images:
[ami-<>]
Any reason why? And how do I solve it?
[The image is private. But as I have configured boto on the server, technically it shouldn't be a problem, right?]
There are a few possible causes for this error:
Insufficient parameters, e.g. a missing VPC ID, subnet ID, or security group; but in that case create_instances would give you a different error.
The API access key in your credentials doesn't have the rights to initiate RunInstances. Go to IAM and check whether your user has been granted adequate roles to perform the task.
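If you want to confirm what your IAM user can actually do, something like the following should list the attached policies (the user name is a placeholder):

aws iam list-attached-user-policies --user-name my-user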
You might run into this error if you try to use the key pair file name instead of the actual key pair name from AWS Console > EC2 > Key Pairs:
aws ec2 run-instances --image-id ami-123457916 --instance-type t3.nano --key-name my_ec2_keypair.pem
--key-name should be the name of the key pair as it appears in the console, not the filename of the downloaded .pem file.
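To see the key pair names the API actually expects, a quick check (assuming the CLI is configured for the right region):

aws ec2 describe-key-pairs --query 'KeyPairs[].KeyName'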