Hosting a word2vec model with gensim on AWS lambda
using Python 2.7 with
boto==2.48.0
gensim==3.4.0
and I have a few lines in my function.py file where I load the model directly from S3:
import boto.s3
from boto.s3.connection import OrdinaryCallingFormat
from gensim.models import KeyedVectors

# region, Aws_access_key_id, Aws_secret_access_key, S3_BUCKET and S3_KEY
# are configuration values defined elsewhere
print('################### connecting to s3...')
s3_conn = boto.s3.connect_to_region(
    region,
    aws_access_key_id = Aws_access_key_id,
    aws_secret_access_key = Aws_secret_access_key,
    is_secure = True,
    calling_format = OrdinaryCallingFormat()
)
print('################### connected to s3...')
bucket = s3_conn.get_bucket(S3_BUCKET)
print('################### got bucket...')
key = bucket.get_key(S3_KEY)
print('################### got key...')
model = KeyedVectors.load_word2vec_format(key, binary=True)
print('################### loaded model...')
On the model loading line
model = KeyedVectors.load_word2vec_format(key, binary=True)
I get a mysterious error without much detail.
In CloudWatch I can see all of my print messages up to and including '################### got key...',
then I get:
START RequestId: {req_id} Version: $LATEST
then right after it (no time delay between these two messages):
module initialization error: __exit__
Please, is there a way to get a detailed error or more info?
More background details:
I was able to download the model from S3 to /tmp/, so authorization and retrieval of the model file work, but the function ran out of space (the file is ~2 GB and /tmp/ is 512 MB).
So I switched to loading the model directly with gensim as above, and now I am getting that mysterious error.
Running the function with python-lambda-local works without issues,
so this probably narrows it down to an issue with gensim's smart_open or AWS Lambda. I would appreciate any hints, thanks!
Instead of connecting using boto, simply:
model = KeyedVectors.load_word2vec_format('s3://{}:{}#{}/{}'.format(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, S3_BUCKET, S3_KEY), binary=True)
worked!
But of course, unfortunately, it doesn't answer the question of why the mysterious exit error came up or how to get more info :/
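As for getting more detail in general, one option (a sketch, not specific to gensim or smart_open) is to wrap the failing call in a try/except that prints the full traceback before re-raising, so CloudWatch shows more than the bare 'module initialization error':

import traceback

try:
    model = KeyedVectors.load_word2vec_format(key, binary=True)
except BaseException:
    # dump the full traceback to stdout so it appears in CloudWatch Logs
    traceback.print_exc()
    raise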
Related
I am a beginner in using Watson Visual Recognition and am trying to create a custom classifier to classify dog images. However, when trying to create my classifier as shown in the code snippet below, I get an error.
with open('beagle.zip', 'rb') as beagles, open('golden-retriever.zip', 'rb') as golden_retrievers, open('husky.zip', 'rb') as huskies:
    classifier = visrec.create_classifier(name='dog_classifier', positive_examples={'beagles': beagles, 'golden_retrievers': golden_retrievers, 'huskies': 'huskies'})
Here is the error:
classifier = visrec.create_classifier(name = 'dog_classifier',positive_examples = {'beagles':beagles,'golden_retrievers':golden_retrievers,'huskies': 'huskies'})
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ibm_watson/visual_recognition_v3.py", line 282, in create_classifier
response = self.send(request)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ibm_cloud_sdk_core/base_service.py", line 302, in send
raise ApiException(response.status_code, http_response=response)
ibm_cloud_sdk_core.api_exception.ApiException: Error: <HTML><HEAD>
<TITLE>Internal Server Error</TITLE>
</HEAD><BODY>
<H1>Internal Server Error - Write</H1>
The server encountered an internal error or misconfiguration and was unable to
complete your request.<P>
Reference #4.9436d517.1617113574.3744472
</BODY></HTML>
, Code: 503
How can I fix this?
503 indicates that the service is unavailable. It could have been down when you were trying to hit it.
Current status shows it is up - https://cloud.ibm.com/status
Are you still getting the same error?
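If the outage was transient, retrying the call when a 503 comes back may help; a minimal sketch, assuming ApiException exposes the HTTP status as code (the retry count and delay are arbitrary):

import time
from ibm_cloud_sdk_core.api_exception import ApiException

for attempt in range(3):
    try:
        # note: pass the opened file objects, including for 'huskies'
        with open('beagle.zip', 'rb') as beagles, open('golden-retriever.zip', 'rb') as golden_retrievers, open('husky.zip', 'rb') as huskies:
            classifier = visrec.create_classifier(
                name='dog_classifier',
                positive_examples={'beagles': beagles, 'golden_retrievers': golden_retrievers, 'huskies': huskies})
        break
    except ApiException as e:
        if e.code == 503 and attempt < 2:
            time.sleep(30)  # give the service time to recover before retrying
        else:
            raise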
Sample code from the API documentation (https://cloud.ibm.com/apidocs/visual-recognition/visual-recognition-v3?code=python#createclassifier) is
import json
from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
authenticator = IAMAuthenticator('{apikey}')
visual_recognition = VisualRecognitionV3(
version='2018-03-19',
authenticator=authenticator
)
visual_recognition.set_service_url('{url}')
with open('./beagle.zip', 'rb') as beagle, open(
        './golden-retriever.zip', 'rb') as goldenretriever, open(
        './husky.zip', 'rb') as husky, open(
        './cats.zip', 'rb') as cats:
    model = visual_recognition.create_classifier(
        'dogs',
        positive_examples={'beagle': beagle, 'goldenretriever': goldenretriever, 'husky': husky},
        negative_examples=cats).get_result()
print(json.dumps(model, indent=2))
Make sure that you have set the service URL correctly. The endpoint should match the region in which you created your Visual Recognition service. You can verify what the URL should be, as it will be available in the same place you get your API key from.
Thinking about this as I wrote the previous paragraph, I realise that as you are new to Watson Visual Recognition, you may not have created an instance of the service or generated an API key. As the service has been deprecated, you may not be able to do so. If this is the case, then I am afraid you won't be able to make use of the service.
I have a running MLflow server on a Google Cloud VM instance. I have created a bucket to log the artifacts.
This is the command I'm running to start the server and for specifying bucket path-
mlflow server --default-artifact-root gs://gcs_bucket/artifacts --host x.x.x.x
But I am facing this error:
TypeError: stat: path should be string, bytes, os.PathLike or integer, not ElasticNet
Note: the MLflow server runs fine with the specified host alone. The problem appears only when I specify the storage bucket path.
I have granted access to the Storage API by using these commands:
gcloud auth application-default login
gcloud auth login
Also, on printing the artifact URI, this is what I'm getting:
mlflow.get_artifact_uri()
Output:
gs://gcs_bucket/artifacts/0/122481bf990xxxxxxxxxxxxxxxxxxxxx/artifacts
So, in the above path, where is the 0/122481bf990xxxxxxxxxxxxxxxxxxxxx/artifacts part coming from, and why is it not getting auto-created at gs://gcs_bucket/artifacts?
After debugging more, it seems it is not able to get the local path from the VM.
This is the error I'm getting on the VM:
WARNING:root:Malformed experiment 'mlruns'. Detailed error Yaml file './mlruns/mlruns/meta.yaml' does not exist.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/mlflow/store/tracking/file_store.py", line 197, in list_experiments
experiment = self._get_experiment(exp_id, view_type)
File "/usr/local/lib/python3.6/dist-packages/mlflow/store/tracking/file_store.py", line 256, in _get_experiment
meta = read_yaml(experiment_dir, FileStore.META_DATA_FILE_NAME)
File "/usr/local/lib/python3.6/dist-packages/mlflow/utils/file_utils.py", line 160, in read_yaml
raise MissingConfigException("Yaml file '%s' does not exist." % file_path)
mlflow.exceptions.MissingConfigException: Yaml file './mlruns/mlruns/meta.yaml' does not exist.
Can I get a solution to this, and what am I missing?
I think the main error comes from the tracking-server structure you are trying to deploy. For your use case, the suitable structure is the one described here. You are missing the URI used to store the backend metadata, so please install a SQL database (PostgreSQL, ...) first, then pass its connection string to --backend-store-uri.
In case you want to use MLflow as a model registry and store images on GCS, you can use the structure described here, adding the flags --artifacts-only and --serve-artifacts.
Hope this can help you.
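For example, a sketch of the full command under that suggestion (the PostgreSQL connection string is a placeholder, not from the question):

mlflow server \
    --backend-store-uri postgresql://mlflow_user:mlflow_pass@db_host:5432/mlflow_db \
    --default-artifact-root gs://gcs_bucket/artifacts \
    --host x.x.x.x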
blob.upload_from_filename(source) gives the error
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.Forbidden: 403 POST https://www.googleapis.com/upload/storage/v1/b/bucket1-newsdata-bluetechsoft/o?uploadType=multipart: ('Request failed with status code', 403, 'Expected one of', )
I am following the example of google cloud written in python here!
from google.cloud import storage

def upload_blob(bucket, source, des):
    # note: this client built from the service-account JSON is never used below
    client = storage.Client.from_service_account_json('/path')
    # the upload actually goes through this default-credentials client
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket)
    blob = bucket.blob(des)
    blob.upload_from_filename(source)
I used gsutil to upload files, which is working fine.
I tried listing the bucket names using a Python script, which also works fine.
I have the necessary permissions and GOOGLE_APPLICATION_CREDENTIALS set.
This whole thing wasn't working because the service account I was using in GCP didn't have the Storage Admin permission.
Granting Storage Admin to my service account solved my problem.
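For reference, a sketch of granting that role with gcloud (the project ID and service account name are placeholders, not from the question):

gcloud projects add-iam-policy-binding MY_PROJECT \
    --member="serviceAccount:my-service-account@MY_PROJECT.iam.gserviceaccount.com" \
    --role="roles/storage.admin"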
As other answers have indicated, this is related to a permission issue. I have found the following command to be a useful way to create a default application credential for the currently logged-in user.
Assuming you got this error while running the code on some machine, the following steps should be sufficient:
SSH into the VM where the code is running or will be running. Make sure you are logged in as a user who has permission to upload to Google Cloud Storage.
Run the following command:
gcloud auth application-default login
The command above will ask you to create a token by clicking on a URL. Generate the token and paste it into the SSH console.
That's it. Any Python application started as that user will use this as the default credential for interacting with storage buckets.
Happy GCP'ing :)
This question is more appropriate for a support case.
As you are getting a 403, most likely you are missing a permission in IAM; the Google Cloud Platform support team will be able to inspect your resources and configurations.
This is what worked for me when the Google documentation didn't. I was getting the same error even with the appropriate permissions.
import pathlib
import google.cloud.storage as gcs

client = gcs.Client()

# set target file to write to
target = pathlib.Path("local_file.txt")
# set file to download
FULL_FILE_PATH = "gs://bucket_name/folder_name/file_name.txt"

# open filestream with write permissions
with target.open(mode="wb") as downloaded_file:
    # download and write file locally
    client.download_blob_to_file(FULL_FILE_PATH, downloaded_file)
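For the upload direction that the question originally asked about, a minimal counterpart with the same client would look like this (bucket and object names are placeholders):

bucket = client.bucket("bucket_name")
blob = bucket.blob("folder_name/file_name.txt")
# upload the local file to the given object path in the bucket
blob.upload_from_filename("local_file.txt")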
I am deploying a machine learning image to Azure Container Instances from Azure Machine Learning services according to this article, but I always get stuck with the error message:
Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance xxxxxxx'.
I tried:
increasing memory_gb=4 in aci_config.
I did troubleshooting locally, but I could not find anything.
Below is my score.py
import json
import joblib
import numpy as np
from azureml.core.model import Model

def init():
    global model
    model_path = Model.get_model_path('pofc_fc_model')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    y_hat = model.predict(data)
    return y_hat.tolist()
Have you registered the model 'pofc_fc_model' in your workspace using the register() function on the model object? If not, there will be no model path and that can cause failure.
See this section on model registration: https://learn.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#registermodel
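For reference, a minimal registration sketch, assuming a Workspace object ws and a locally serialized model file (both are placeholders, not taken from the question):

from azureml.core.model import Model

# register the serialized model so Model.get_model_path('pofc_fc_model')
# can resolve it inside the deployed container
model = Model.register(workspace=ws,
                       model_path='outputs/pofc_fc_model.pkl',  # hypothetical local path
                       model_name='pofc_fc_model')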
I cannot give too many details due to confidentiality, but I will try to specify as best as I can.
I have an AWS role that is going to be used to call an API and has the correct permissions.
I am using Boto3 to attempt to assume the role.
In my python code I have
import boto3

sts_client = boto3.client('sts')
response = sts_client.assume_role(
    RoleArn="arn:aws:iam::ACCNAME:role/ROLENAME",
    RoleSessionName="filler",
)
With this code, I get this error:
"An error occurred (InvalidClientTokenId) when calling the AssumeRole operation: The security token included in the request is invalid."
Any help would be appreciated. Thanks
When you construct the client in this way, e.g. sts_client = boto3.client('sts'), it uses the boto3 DEFAULT_SESSION, which pulls from your ~/.aws/credentials file (possibly among other locations; I did not investigate further).
When I ran into this, the values for aws_access_key_id, aws_secret_access_key, and aws_session_token were stale. Updating them in the default configuration file (or simply overriding them directly in the client call) resolved this issue:
sts_client = boto3.client('sts',
                          aws_access_key_id='aws_access_key_id',
                          aws_secret_access_key='aws_secret_access_key',
                          aws_session_token='aws_session_token')
As an aside, I found that enabling stream logging was helpful and used the output to dive into the boto3 source code and find the issue: boto3.set_stream_logger('').
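For completeness, once assume_role succeeds, the temporary credentials in the response can be passed to a client for the target service (S3 below is just an illustrative choice):

creds = response['Credentials']
s3_client = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)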