I am facing an issue with serving TensorFlow models on AWS SageMaker. I trained the model outside of the SageMaker environment, so now I have a saved_model.pb file that I need to deploy to a SageMaker endpoint. I simply packaged the model file into a tar.gz and uploaded it to an S3 bucket.
Now, when I try to create an endpoint, I get the following error in my CloudWatch logs:
tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:369]
FileSystemStoragePathSource encountered a file-system access error:
Could not find base path /opt/ml/model/export/Servo for servable
generic_model
I believe SageMaker expects the tar.gz to follow a particular directory structure. However, all I have is a .pb file.
TensorFlow Serving expects the following folder structure:
export/Servo/{version_number}, where the version number is any positive integer.
SageMaker expects the same directory layout as TensorFlow Serving; there is a GitHub issue about this: https://github.com/aws/sagemaker-python-sdk/issues/599
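As a minimal sketch (the file name saved_model.pb, the version directory 1, and the variables/ folder are assumptions; adjust them to whatever your SavedModel actually contains), you can repackage the .pb into that layout and re-upload the resulting tarball to S3:

import tarfile

# Target layout inside model.tar.gz:
#   export/Servo/1/saved_model.pb
#   export/Servo/1/variables/...   (only if your SavedModel has a variables folder)
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("saved_model.pb", arcname="export/Servo/1/saved_model.pb")
    # tar.add("variables", arcname="export/Servo/1/variables")  # uncomment if present

Upload model.tar.gz to your S3 bucket and point the SageMaker model artifact (ModelDataUrl) at that object.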
I want to download all model pickle files that are registered in an Azure ML workspace to a local folder. Is this possible using Python only? I don't want to manually download each pickle file through the UI.
Once you retrieve your model from your workspace, you can do the following:
model.download(exist_ok=True)
This is shown in the docs: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#azureml-core-model-model-download
This answer from #Ninja_coder for this question might be helpful: Model.get_model_path(model_name="model") throws an error: Model not found in cache or in root at
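If the goal is every registered model rather than a single one, a minimal sketch along these lines should work (the config.json location and the target_dir layout are assumptions):

from azureml.core import Workspace
from azureml.core.model import Model

# Load the workspace from a local config.json downloaded from the Azure portal.
ws = Workspace.from_config()

# Iterate over every model registered in the workspace and download it locally.
for model in Model.list(ws):
    # Give each model its own subfolder to avoid filename collisions.
    model.download(target_dir=f"./models/{model.name}_{model.version}", exist_ok=True)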
I am using kedro in conjunction with databricks-connect to run my machine learning model. I trained and tested the model in a Databricks notebook and saved the pickle file of the model to Azure Blob Storage. To test the pipeline locally, I downloaded the pickle file and stored it locally. Kedro reads the file fine when it is stored locally; however, when I try to read the file into kedro directly from Azure, I get the following error: "Invalid load key '?'".
Unfortunately I cannot post any screenshots since I am doing this for my job. My thinking is that the pickle file is stored differently while on Azure, and when downloaded locally it is stored correctly.
For a bit more clarity, I am attempting to store a logistic regression model trained using statsmodels (sm.logit).
Pickles unfortunately aren't guaranteed to work across environments; at the very least, the Python version (and possibly the OS) needs to be identical. Is there a mismatch here?
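As a quick sanity check (a minimal sketch; the statsmodels import is an assumption based on the model mentioned above), you can run the same snippet in Databricks and locally and compare the output:

import sys
import pickle
import statsmodels

# Run in both environments (Databricks and local) and diff the results.
print("Python:", sys.version)
print("Highest pickle protocol:", pickle.HIGHEST_PROTOCOL)
print("statsmodels:", statsmodels.__version__)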
I'm trying to deploy a .tflite to Firebase ML so that I can distribute it from there.
I used transfer learning on this TF Hub model. I then followed this tutorial to convert the model to .tflite format.
This model gives good results in the Python TFLite interpreter and can be used on Android if I package it with the app.
However, I want to serve the model via Firebase, so I used this tutorial to deploy the .tflite file to Firebase. Following that tutorial, I get an error: firebase_admin.exceptions.FailedPreconditionError: Cannot publish a model that is not verified.
I can't find any information about this error anywhere, and given that the model works on both Android and in Python, I'm at a loss as to what could be causing it.
Did you solve this issue? I had the same one, and it turned out the model size should be < 40 MB. That caused the error, and the detailed error is only reported when uploading a model manually through the Firebase web dashboard.
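If you want to catch this before publishing, a minimal sketch like the following (the file path is a placeholder, and the 40 MB limit is just the figure observed above) checks the size of the .tflite file up front:

import os

MODEL_PATH = "model.tflite"   # placeholder path to your converted model
MAX_SIZE_MB = 40              # limit observed when publishing to Firebase ML

size_mb = os.path.getsize(MODEL_PATH) / (1024 * 1024)
print(f"Model size: {size_mb:.1f} MB")
if size_mb >= MAX_SIZE_MB:
    print("Too large for Firebase ML; consider quantization or a smaller architecture.")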
I see that trained models are served from the file system when using TensorFlow Serving. Is it OK to rely on the file system alone?
Is there a mechanism to restore these models in case we lose all of these files?
You can create a custom serving image that has your model built into the container, which you can deploy using Docker and which will load your model for serving on startup. Ref: TensorFlow Serving with Docker
You can also keep your model in Google Cloud Storage and serve it directly from the gs:// path using the command below.
tensorflow_model_server --port=8080 --model_base_path=gs://bucket_name/folder_models/
I have trained a model with the SageMaker SDK using script mode. When I deploy it, I get this error.
I have also tried with the transformer and get the same error.
It works fine when I train using the container.
Need help on this...