I am trying to deploy my object detection model, trained with TensorFlow, to SageMaker. I was able to deploy it without specifying any entry point during model creation, but it turns out that only works for small images (SageMaker has a 5MB request limit). The code I used for this is:
from sagemaker.tensorflow.serving import Model

# Initialize model ...
model = Model(
    model_data=s3_path_for_model,
    role=sagemaker_role,
    framework_version="1.14",
    env=env)

# Deploy model ...
predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.t2.medium')

# Test using an image ...
import cv2
import numpy as np

image_content = cv2.imread("PATH_TO_IMAGE",
                           1).astype('uint8').tolist()
body = {"instances": [{"inputs": image_content}]}

# Works fine for small images ...
# I get predictions perfectly with this ...
results = predictor.predict(body)
So, I googled around and found that I need to pass an entry_point for Model() in order to predict for larger images. Something like:
model = Model(
    entry_point="inference.py",
    dependencies=["requirements.txt"],
    model_data=s3_path_for_model,
    role=sagemaker_role,
    framework_version="1.14",
    env=env
)
But doing this gives FileNotFoundError: [Errno 2] No such file or directory: 'inference.py'. A little help here please. I am using sagemaker-python-sdk.
My folder structure is as:
model
|__ 001
|   |__ saved_model.pb
|   |__ variables
|       |__ <contents here>
|__ code
    |__ inference.py
    |__ requirements.txt
Note: I have also tried, ./code/inference.py and /code/inference.py.
5MB is a hard limit for real-time endpoints.
Are you sure you need to pass such large images for prediction? Most use cases work fine with smaller, lower resolution images.
If you need real-time prediction, one workaround would be to pass the image's S3 URI in the prediction request (instead of the image itself) and load the image from S3 inside your inference code.
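For that route, an untested sketch of what code/inference.py could look like, assuming the TensorFlow Serving container's input_handler/output_handler hooks and that boto3 and opencv-python-headless are listed in requirements.txt; the "bucket"/"key" field names are made up for illustration:

import json

import boto3
import cv2

s3 = boto3.client("s3")

def input_handler(data, context):
    # The request now carries only an S3 location, not the raw pixels
    payload = json.loads(data.read().decode("utf-8"))
    local_path = "/tmp/input_image"
    s3.download_file(payload["bucket"], payload["key"], local_path)

    # Rebuild the same "instances" body that worked for small images
    image_content = cv2.imread(local_path, 1).astype("uint8").tolist()
    return json.dumps({"instances": [{"inputs": image_content}]})

def output_handler(response, context):
    # Pass the TF Serving response straight through
    return response.content, context.accept_header

The client call then becomes something like predictor.predict({"bucket": "my-bucket", "key": "images/large.jpg"}) (names illustrative), so the request body stays tiny regardless of image size.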
If you don't need real-time prediction, you should look at batch transform, which doesn't enforce that size restriction: https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html
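If batch transform fits your use case, a rough, untested sketch with the Python SDK, reusing the Model object from the question; the instance type and S3 paths are placeholders:

# Batch transform: no real-time endpoint, so the 5MB request limit does not apply
transformer = model.transformer(instance_count=1,
                                instance_type="ml.m5.large")
transformer.transform("s3://my-bucket/batch-input/",   # prefix holding request JSON files
                      content_type="application/json")
transformer.wait()
# Results are written to transformer.output_path on S3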
Related
I want to cache my model results in order to make predictions without redoing the clustering.
I read that I can do that with memory parameter in HDBSCAN.
Instead, I did the following, because I wanted to save the file in the same directory as my script rather than in '/tmp/joblib' as described here (HDBSCAN cluster caching and persistence):
import hdbscan
import joblib

clusterer = hdbscan.HDBSCAN(min_cluster_size=30, prediction_data=True).fit(data)

# save the model to disk
filename = 'finalized_model.joblib'
joblib.dump(clusterer, filename)
I then tried to load the model in a different file:
from joblib import load
# load the model
model = load('finalized_model.joblib')
# make predictions
test_labels, strengths = model.approximate_predict(model, test_points)
But I got this error: AttributeError: 'HDBSCAN' object has no attribute 'approximate_predict'
Last time I got this error, it was because prediction_data was not set to True, but what's the problem now?
approximate_predict() lives in the hdbscan package itself, not on an HDBSCAN object.
Here's what you need to do:
from joblib import load
import hdbscan
# load the model
model = load('finalized_model.joblib')
# make predictions
test_labels, strengths = hdbscan.approximate_predict(model, test_points)
API Reference:
https://hdbscan.readthedocs.io/en/latest/api.html#hdbscan.prediction.approximate_predict
System information
Google Colab
When I run the example provided by the official TensorFlow basic text classification tutorial, everything runs fine up to saving the model, but when I load the model it gives me this error.
RuntimeError: Unable to restore a layer of class TextVectorization. Layers of class TextVectorization require that the class be provided to the model loading code, either by registering the class using @keras.utils.register_keras_serializable on the class def and including that file in your program, or by passing the class in a keras.utils.CustomObjectScope that wraps this load call.
Expected Behavior: Model should be loaded successfully and process the raw input
https://colab.research.google.com/gist/amahendrakar/8b65a688dc87ce9ca07ffb0ce50b84c7/44199.ipynb#scrollTo=fEjmSrKIqiiM
Example Link: https://tensorflow.google.cn/tutorials/keras/text_classification
I also ran into this error message (RuntimeError: Unable to restore a layer of class TextVectorization. [...]) when I implemented (and customized) the code from the "Basic Text Classification" tutorial.
Instead of running the code in a notebook, I have two scripts, one for building, training and saving the model and the other one for loading it and making predictions. (Thus, the error does not seem to be limited to Google Colab).
This is what I had to do (see https://github.com/tensorflow/tensorflow/issues/45231):
First, I added this line in the first script before the function definition and built, trained and saved the model again:
@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    [...]

# Save model as SavedModel
export_model.save(model_path, save_format='tf')
Secondly, I also had to add the same line and the whole function definition in the second script to make sure that it works if I restart(!) ipython (where I currently run the scripts) and only run the second script:
import re
import string

import tensorflow as tf

@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    lowercase = tf.strings.lower(input_data)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    return tf.strings.regex_replace(stripped_html,
                                    '[%s]' % re.escape(string.punctuation),
                                    '')

[...]

# Load model
reloaded_model = tf.keras.models.load_model(model_path)

# Make predictions
predictions = reloaded_model.predict(examples)
Note: If I run the second script without restarting ipython after running the first script, I get this error:
ValueError: Custom>custom_standardization has already been registered [...]
Alternatively, you can just use the default standardization method in the vectorizer layer when building the model:
vectorize_layer = TextVectorization(
    standardize="lower_and_strip_punctuation",
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)
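The error message also mentions a third route: wrapping the load call in a CustomObjectScope instead of registering the function globally. An untested sketch (custom_standardization still has to be defined or imported in the loading script):

import tensorflow as tf

# Map the saved name to the locally defined function while loading
with tf.keras.utils.custom_object_scope(
        {'custom_standardization': custom_standardization}):
    reloaded_model = tf.keras.models.load_model(model_path)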
I got something working as Hassan describes it, I think. Not sure it's the right way, but it seems to work for me...
I define, train, and archive the model in one notebook
I un-archive it, load it, and use it for predictions from another notebook.
See here: https://github.com/OlivierLD/oliv-ai/tree/master/JupyterNotebooks/tf-tutorials/sentiment-analysis
I trained my model in Amazon-SageMaker and downloaded it to my local computer. Unfortunately, I don't have any idea how to run the model locally.
The Model is in a directory with files like:
image-classification-0001.params
image-classification-0002.params
image-classification-0003.params
image-classification-0004.params
image-classification-0005.params
image-classification-symbol.json
model-shapes.json
Would anyone know how to run this locally with Python, or be able to point me to a resource that could help? I am trying to avoid calling the model using the Amazon API.
Edit: The model I used was created with code very similar to this example.
Any help is appreciated, I will award the bounty to whoever is most helpful, even if they don't completely solve the question.
This is not a complete answer, as I do not have a SageMaker setup (and I do not know MXNet), so I cannot test this approach in practice; consider it a probable pointer/approach rather than a complete answer.
The Assumption -
You mentioned that your model is very similar to the notebook you linked. If you read the text in the notebook carefully, you will see something like this at some point -
"In this demo, we are using Caltech-256 dataset, which contains 30608 images of 256 objects. For the training and validation data, we follow the splitting scheme in this MXNet example."
See the mention of MXNet there? Let us assume that you did not change a lot and hence your model is built using MXNet as well.
The Approach -
Assuming what I just mentioned, if you search the documentation of the AWS SageMaker Python SDK, you will see a section about serialization of modules, which itself starts with another assumption -
"If you train function returns a Module object, it will be serialized by the default Module serialization system, unless you've specified a custom save function."
Assuming that this is true in your case, further reading in the same document tells us that "model-shapes.json" is a JSON-serialized representation of your model, "model-symbol.json" is the serialization of the module symbols created by calling the 'save' function on the 'symbol' property of the module, and finally "module.params" is the serialized form of the module parameters (I am not sure whether it is a text or binary format).
Equipped with this knowledge, we go and look into the documentation of MXNet. And voila! We see here how we can save and load models with MXNet. So, as you already have those saved files, you just need to load them into a local MXNet installation and then run them to predict on the unknown data.
I hope this will help you to find a direction to solve your problem.
Bonus -
I am not sure if this can also do the same job (it is also mentioned by @Seth Rothschild in the comments), but it should: the AWS SageMaker Python SDK has a way to load models from saved ones as well.
Following SRC's advice, I was able to get it to work by following the instructions in this question and this doc, which describe how to load an MXNet model.
I loaded the model like so:
lenet_model = mx.mod.Module.load('model_directory/image-classification',5)
image_l = 64
image_w = 64
lenet_model.bind(for_training=False, data_shapes=[('data',(1,3,image_l,image_w))],label_shapes=lenet_model._label_shapes)
Then predicted using the slightly modified helper functions in the previously linked documentation:
import mxnet as mx
import matplotlib.pyplot as plt
import cv2
import numpy as np
from mxnet.io import DataBatch

def get_image(url, show=False):
    # download and show the image
    fname = mx.test_utils.download(url)
    img = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2RGB)
    if img is None:
        return None
    if show:
        plt.imshow(img)
        plt.axis('off')
    # convert into format (batch, RGB, width, height)
    img = cv2.resize(img, (64, 64))
    img = np.swapaxes(img, 0, 2)
    img = np.swapaxes(img, 1, 2)
    img = img[np.newaxis, :]
    return img
def predict(url, labels):
    img = get_image(url, show=True)
    # compute the predict probabilities
    lenet_model.forward(DataBatch([mx.nd.array(img)]))
    prob = lenet_model.get_outputs()[0].asnumpy()
    # print the top-5
    prob = np.squeeze(prob)
    a = np.argsort(prob)[::-1]
    for i in a[0:5]:
        print('probability=%f, class=%s' % (prob[i], labels[i]))
Finally I called the prediction with this code:
labels = ['a','b','c', 'd','e', 'f']
predict('https://eximagesite/img_tst_a.jpg', labels )
If you want to host your trained model locally, and you are using Apache MXNet as your model framework (as you have in the above example), the simplest way is to use MXNet Model Server: https://github.com/awslabs/mxnet-model-server
Once you have installed it locally, you can start serving with:
mxnet-model-server \
--models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model
and then call the local endpoint with the image
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "data=@kitten.jpg"
I'm trying to use google cloud platform to deploy a model to support prediction.
I train the model (locally) with the following instruction
~/$ gcloud ml-engine local train --module-name trainer.task --package-path trainer
and everything works fine (...):
INFO:tensorflow:Restoring parameters from gs://my-bucket1/test2/model.ckpt-45000
INFO:tensorflow:Saving checkpoints for 45001 into gs://my-bucket1/test2/model.ckpt.
INFO:tensorflow:loss = 17471.6, step = 45001
[...]
Loss: 144278.046875
average_loss: 1453.68
global_step: 50000
loss: 144278.0
INFO:tensorflow:Restoring parameters from gs://my-bucket1/test2/model.ckpt-50000
Mean Square Error of Test Set = 593.1018482
But, when I run the following command to create a version,
gcloud ml-engine versions create Mo1 --model mod1 --origin gs://my-bucket1/test2/ --runtime-version 1.3
Then I get the following error.
ERROR: (gcloud.ml-engine.versions.create) FAILED_PRECONDITION: Field: version.deployment_uri
Error: SavedModel directory gs://my-bucket1/test2/ is expected to contain exactly one
of: [saved_model.pb, saved_model.pbtxt].- '#type': type.googleapis.com/google.rpc.BadRequest
fieldViolations:- description: 'SavedModel directory gs://my-bucket1/test2/ is expected
to contain exactly one of: [saved_model.pb, saved_model.pbtxt].'
field: version.deployment_uri
Here is a screenshot of my bucket. I have a saved model with 'pbtxt' format
[screenshot: bucket contents]
Finally, I add the piece of code where I save the model in the bucket.
regressor = tf.estimator.DNNRegressor(
    feature_columns=feature_columns,
    hidden_units=[40, 30, 20],
    model_dir="gs://my-bucket1/test2",
    optimizer='RMSProp'
)
You'll notice that the file in your screenshot is graph.pbtxt whereas saved_model.pb{txt} is needed.
Note that just renaming the file generally will not be sufficient. The training process outputs checkpoints periodically in case restarts happen and recovery is needed. However, those checkpoints (and corresponding graphs) are the training graph. Training graphs tend to have things like file readers, input queues, dropout layers, etc. which are not appropriate for serving.
Instead, TensorFlow requires you to explicitly export a separate graph for serving. You can do this in one of two ways:
During training (typically, after training is complete)
As a separate process after training.
During/After Training
For this, I'll refer you to the Census sample.
First, you'll need a "Serving Input Function", such as:
def serving_input_fn():
    """Build the serving inputs."""
    inputs = {}
    for feat in INPUT_COLUMNS:
        inputs[feat.name] = tf.placeholder(shape=[None], dtype=feat.dtype)

    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in inputs.iteritems()
    }
    return tf.contrib.learn.InputFnOps(features, None, inputs)
Then you can simply call:
regressor.export_savedmodel("path/to/model", serving_input_fn)
Or, if you're using learn_runner/Experiment, you'll need to pass an ExportStrategy like the following to the constructor of Experiment:
export_strategies=[saved_model_export_utils.make_export_strategy(
serving_input_fn,
exports_to_keep=1,
default_output_alternative_key=None,
)]
After Training
Almost exactly the same steps as above, but just in a separate Python script you can run after training is over (in your case, this is beneficial because you won't have to retrain). The basic idea is to construct the Estimator with the same model_dir used in training, then to call export as above, something like:
def serving_input_fn():
    """Build the serving inputs."""
    inputs = {}
    for feat in INPUT_COLUMNS:
        inputs[feat.name] = tf.placeholder(shape=[None], dtype=feat.dtype)

    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in inputs.iteritems()
    }
    return tf.contrib.learn.InputFnOps(features, None, inputs)

regressor = tf.contrib.learn.DNNRegressor(
    feature_columns=feature_columns,
    hidden_units=[40, 30, 20],
    model_dir="gs://my-bucket1/test2",
    optimizer='RMSProp'
)

regressor.export_savedmodel("my_model", serving_input_fn)
EDIT 09/12/2017
One slight change is needed to your training code. You are using tf.estimator.DNNRegressor, but that was introduced in TensorFlow 1.3; CloudML Engine only officially supports TensorFlow 1.2, so you'll need to use tf.contrib.learn.DNNRegressor instead. They are very similar, but one notable difference is that you'll need to use the fit method instead of train.
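For reference, an untested sketch of that change; the input_fn name and step count are placeholders:

# tf.contrib.learn estimators are trained with fit() rather than train()
regressor = tf.contrib.learn.DNNRegressor(
    feature_columns=feature_columns,
    hidden_units=[40, 30, 20],
    model_dir="gs://my-bucket1/test2",
    optimizer='RMSProp')
regressor.fit(input_fn=train_input_fn, steps=50000)  # train_input_fn is a placeholder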
I had the same error message here; in my case there were two problems:
The path to the bucket was misspelled.
A wrong saved_file.pbtxt: after the first error message, I had put another renamed .pbtxt file in the same bucket with my model classes, which made the problem persist even after the path was corrected.
The command worked after deleting the wrong file and correcting the path.
I hope this helps too.
I have created a script to evaluate a TensorFlow convolutional neural network. It loads some images and does some simple preprocessing:
def main(argv):
    classifier = import_model()
    for path in argv[1:]:
        image_reversed = imread(path).astype(np.float32)
        image_unlayered = np.transpose(image_reversed, (1, 0, 2))
        image = np.reshape(image_unlayered, [1, -1, 480, 3])
        angle = infer_steering_angle(classifier, image)
        print("Steering angle %f for image %s." % (angle, path))
It imports the model using a network structure function from another file, which has been verified to at least mostly work, since it is also used to train the network:
def import_model():
    # Load estimator
    classifier = learn.Estimator(
        model_fn=cnn_model_fn,
        model_dir="/tmp/network2"
    )
    return classifier
and finally, it uses the Estimator.predict function to pass the single image to the network, overriding the default batch_size of 10 and setting it to 1. It returns a tensor with a single element, which should correspond to the steering angle (this is an end-to-end autonomous driving regression problem).
def infer_steering_angle(classifier, image):
    output = classifier.predict(
        x=image,
        batch_size=1
    )
    for angle in output:
        return angle
The problem is, it always outputs 0.0 for the steering angle. I've looked over all of this several times, and the only thing I can think of is that I'm misunderstanding the Estimator.predict function. It's rather poorly documented, in that it lacks concrete examples of how it should be used. Does anybody notice anything wrong with how I'm formatting the input or parsing the output?
UPDATE:
I tried putting this code right in the training file, so the importing can't be the problem. I'm starting to become suspicious it's a problem with the model itself. The code is at https://hastebin.com/rakulonebu.py.