I trained my model in Amazon SageMaker and downloaded it to my local computer. Unfortunately, I don't have any idea how to run the model locally.
The model is in a directory with files like:
image-classification-0001.params
image-classification-0002.params
image-classification-0003.params
image-classification-0004.params
image-classification-0005.params
image-classification-symbol.json
model-shapes.json
Would anyone know how to run this locally with Python, or be able to point me to a resource that could help? I am trying to avoid calling the model using the Amazon API.
Edit: The model I used was created with code very similar to this example.
Any help is appreciated, I will award the bounty to whoever is most helpful, even if they don't completely solve the question.
This is not a complete answer, as I do not have SageMaker set up (and I do not know MXNet), so I cannot practically test this approach. As already mentioned, I would not call this a complete answer, rather a probable pointer/approach for solving this issue.
The Assumption -
You mentioned that your model is very similar to the notebook you linked. If you read the text in the notebook carefully, you will see at some point something like this -
"In this demo, we are using Caltech-256 dataset, which contains 30608 images of 256 objects. For the training and validation data, we follow the splitting scheme in this MXNet example."
See the mention of MXNet there? Let us assume that you did not change a lot and hence your model is built using MXNet as well.
The Approach -
Assuming what I just mentioned, if you search the documentation of the AWS SageMaker Python SDK, you will see a section about serialization of modules, which in turn starts with another assumption -
"If you train function returns a Module object, it will be serialized by the default Module serialization system, unless you've specified a custom save function."
Assuming this is true in your case, further reading in the same document tells us that "model-shapes.json" is a JSON-serialized representation of your model, "model-symbol.json" is the serialization of the module's symbols (created by calling the 'save' function on the 'symbol' property of the module), and "module.params" is the serialized form of the module parameters (I am not sure whether it is a text or binary format).
Equipped with this knowledge, we go and look into the documentation of MXNet. And voilà! We see here how we can save and load models with MXNet. So, as you already have those saved files, you just need to load them into a local installation of MXNet and then run them to predict on your unknown data.
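To make the file naming concrete, here is a minimal sketch of that load step, assuming the files from the question were produced by MXNet's standard checkpointing and are sitting in the current directory:
import mxnet as mx

# A checkpoint saved with prefix 'image-classification' at epoch 5 consists of
# 'image-classification-symbol.json' plus 'image-classification-0005.params',
# which matches the file names listed in the question.
sym, arg_params, aux_params = mx.model.load_checkpoint('image-classification', 5)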
I hope this will help you to find a direction to solve your problem.
Bonus -
I am not sure whether this can also do the same job (it is also mentioned by @Seth Rothschild in the comments), but it should: the AWS SageMaker Python SDK has a way to load models from saved ones as well.
Following SRC's advice, I was able to get it to work by following the instructions in this question and this doc, which describe how to load an MXNet model.
I loaded the model like so:
lenet_model = mx.mod.Module.load('model_directory/image-classification', 5)
image_l = 64
image_w = 64
lenet_model.bind(for_training=False,
                 data_shapes=[('data', (1, 3, image_l, image_w))],
                 label_shapes=lenet_model._label_shapes)
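As a sanity check before binding, you can also peek at model-shapes.json from the model directory to confirm the expected input name and shape; a small sketch (the exact JSON structure of that file is an assumption on my part):
import json

# Inspect model-shapes.json to confirm the input shape used in bind();
# adjust the path and parsing to whatever the file actually contains.
with open('model_directory/model-shapes.json') as f:
    print(json.load(f))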
Then predicted using the slightly modified helper functions in the previously linked documentation:
import mxnet as mx
import matplotlib.pyplot as plt
import cv2
import numpy as np
from mxnet.io import DataBatch

def get_image(url, show=False):
    # download and show the image
    fname = mx.test_utils.download(url)
    img = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2RGB)
    if img is None:
        return None
    if show:
        plt.imshow(img)
        plt.axis('off')
    # convert into format (batch, RGB, width, height)
    img = cv2.resize(img, (64, 64))
    img = np.swapaxes(img, 0, 2)
    img = np.swapaxes(img, 1, 2)
    img = img[np.newaxis, :]
    return img

def predict(url, labels):
    img = get_image(url, show=True)
    # compute the predict probabilities
    lenet_model.forward(DataBatch([mx.nd.array(img)]))
    prob = lenet_model.get_outputs()[0].asnumpy()
    # print the top-5
    prob = np.squeeze(prob)
    a = np.argsort(prob)[::-1]
    for i in a[0:5]:
        print('probability=%f, class=%s' % (prob[i], labels[i]))
Finally I called the prediction with this code:
labels = ['a', 'b', 'c', 'd', 'e', 'f']
predict('https://eximagesite/img_tst_a.jpg', labels)
If you want to host your trained model locally, and you are using Apache MXNet as your model framework (as you have in the above example), the simplest way is to use MXNet Model Server: https://github.com/awslabs/mxnet-model-server
Once you have installed it locally, you can start serving using:
mxnet-model-server \
--models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model
and then call the local endpoint with an image:
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "data=@kitten.jpg"
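If you would rather call the endpoint from Python instead of curl, a roughly equivalent request with the requests library (assuming the server is running locally on the default port, as above) would be:
import requests

# Mirror the curl call above: upload kitten.jpg as the 'data' form field
# to the local squeezenet endpoint and print the JSON predictions.
with open('kitten.jpg', 'rb') as f:
    response = requests.post('http://127.0.0.1:8080/squeezenet/predict',
                             files={'data': f})
print(response.json())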
I've decided to make reusable scripts for frequently used classes. So, I made one for my image dataset and imported it in Colab. I can create the dataset object successfully but can't get data from it.
Here is my dataset code:
https://pastebin.com/XfAiTe3A
Here is how I use the script in Google Colab:
from imageDataset import customDataset
dataset = customDataset(train_data)
dataset[0]
Here is the error:
16 target = self.targets[index]
---> 17 image = io.imread(self.image_paths[index])
18
19 if self.augmentations is not None:
SystemError: <built-in function imread> returned NULL without setting an error
But if I copy-paste the code into a Jupyter cell, I can use the class like I normally do. What am I doing wrong?
Any help is appreciated, thanks.
I just had to rerun the notebook and the problem fixed itself. Coding truly amazes me.
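For what it's worth, when an imported script behaves differently from the same code pasted into a cell, a stale cached import is a common culprit in Colab; reloading the module (a small sketch below, using the imageDataset module name from the question) can save you a full restart. Whether that was actually the cause of the imread error here is not certain.
import importlib
import imageDataset

# Force Python to re-read the script from disk instead of reusing the
# module object cached by the earlier import.
importlib.reload(imageDataset)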
System information
Google Colab
When I run the example provided by the official TensorFlow basic text classification tutorial, everything runs fine up to saving the model, but when I load the model it gives me this error:
RuntimeError: Unable to restore a layer of class TextVectorization. Layers of class TextVectorization require that the class be provided to the model loading code, either by registering the class using @keras.utils.register_keras_serializable on the class def and including that file in your program, or by passing the class in a keras.utils.CustomObjectScope that wraps this load call.
Expected Behavior: Model should be loaded successfully and process the raw input
https://colab.research.google.com/gist/amahendrakar/8b65a688dc87ce9ca07ffb0ce50b84c7/44199.ipynb#scrollTo=fEjmSrKIqiiM
Example Link: https://tensorflow.google.cn/tutorials/keras/text_classification
I also ran into this error message (RuntimeError: Unable to restore a layer of class TextVectorization. [...]) when I implemented (and customized) the code from the "Basic Text Classification" tutorial.
Instead of running the code in a notebook, I have two scripts, one for building, training and saving the model and the other one for loading it and making predictions. (Thus, the error does not seem to be limited to Google Colab).
This is what I had to do (see https://github.com/tensorflow/tensorflow/issues/45231):
First, I added this line in the first script before the function definition and built, trained and saved the model again:
@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
[...]
# Save model as SavedModel
export_model.save(model_path, save_format='tf')
Secondly, I also had to add the same line and the whole function definition in the second script to make sure that it works if I restart(!) ipython (where I currently run the scripts) and only run the second script:
@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    lowercase = tf.strings.lower(input_data)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    return tf.strings.regex_replace(stripped_html,
                                    '[%s]' % re.escape(string.punctuation),
                                    '')
[...]
# Load model
reloaded_model = tf.keras.models.load_model(model_path)
# Make predictions
predictions = reloaded_model.predict(examples)
Note: If I run the second script without restarting ipython after running the first script, I get this error:
ValueError: Custom>custom_standardization has already been registered [...]
Alternatively, you can just use the default standardization method in the vectorizer layer when building the model:
vectorize_layer = TextVectorization(
    standardize="lower_and_strip_punctuation",
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)
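For completeness, the error message also offers a third route: wrapping the load call in a keras.utils.CustomObjectScope. A minimal sketch of that variant, assuming the same custom_standardization function and model_path as above (depending on your TF version you may also need to include TextVectorization in the scope):
import tensorflow as tf

# Expose the custom function to the loader only for this call, instead of
# registering it globally with register_keras_serializable.
with tf.keras.utils.CustomObjectScope({'custom_standardization': custom_standardization}):
    reloaded_model = tf.keras.models.load_model(model_path)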
I got something working as Hassan describes it, I think. Not sure it's the right way, but it seems to work for me...
I define, train, and archive the model in one notebook
I un-archive it, load it, and use it for predictions from another notebook.
See here: https://github.com/OlivierLD/oliv-ai/tree/master/JupyterNotebooks/tf-tutorials/sentiment-analysis
I am trying to deploy my object detection model, which was trained using TensorFlow, to SageMaker. I was able to deploy it without specifying any entry point during model creation, but it turns out that only works for small images (SageMaker has a limit of 5 MB). The code I used for this is:
from sagemaker.tensorflow.serving import Model

# Initialize model ...
model = Model(
    model_data=s3_path_for_model,
    role=sagemaker_role,
    framework_version="1.14",
    env=env)

# Deploy model ...
predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.t2.medium')
# Test using an image ...
import cv2
import numpy as np

image_content = cv2.imread("PATH_TO_IMAGE", 1).astype('uint8').tolist()
body = {"instances": [{"inputs": image_content}]}

# Works fine for small images ...
# I get predictions perfectly with this ...
results = predictor.predict(body)
So, I googled around and found that I need to pass an entry_point for Model() in order to predict for larger images. Something like:
model = Model(
    entry_point="inference.py",
    dependencies=["requirements.txt"],
    model_data=s3_path_for_model,
    role=sagemaker_role,
    framework_version="1.14",
    env=env
)
But doing this gives FileNotFoundError: [Errno 2] No such file or directory: 'inference.py'. A little help here please. I am using sagemaker-python-sdk.
My folder structure is as:
model
|__ 001
|   |__ saved_model.pb
|   |__ variables
|       |__ <contents here>
|__ code
    |__ inference.py
    |__ requirements.txt
Note: I have also tried ./code/inference.py and /code/inference.py.
5MB is a hard limit for real-time endpoints.
Are you sure you need to pass such large images for prediction? Most use cases work fine with smaller, lower resolution images.
If you need real-time prediction, one workaround would be to pass the image S3 URI in the prediction request (instead of the image itself), and load the image from S3.
If you don't need real-time prediction, you should look at batch transform, which doesn't enforce that size restriction: https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html
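To make the S3-URI workaround concrete, here is a rough, untested sketch of what such an inference.py could look like, assuming the input_handler/output_handler convention of the SageMaker TensorFlow Serving container; the request format (an "s3_uri" field) and the temporary file path are made up for illustration:
# inference.py (sketch)
import json
import boto3
import cv2

s3 = boto3.client('s3')

def input_handler(data, context):
    # The client sends {"s3_uri": "s3://bucket/key.jpg"} instead of raw pixels
    payload = json.loads(data.read().decode('utf-8'))
    bucket, _, key = payload['s3_uri'].replace('s3://', '').partition('/')
    local_path = '/tmp/input.jpg'
    s3.download_file(bucket, key, local_path)
    # Build the same request body that was sent directly for small images
    image_content = cv2.imread(local_path, 1).astype('uint8').tolist()
    return json.dumps({"instances": [{"inputs": image_content}]})

def output_handler(data, context):
    # Pass the TensorFlow Serving response through unchanged
    return data.content, context.accept_header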
I am referring to the models that can be found here: https://pytorch.org/docs/stable/torchvision/models.html#torchvision-models
As @dennlinger mentioned in his answer: torch.utils.model_zoo is internally called when you load a pre-trained model.
More specifically, the method torch.utils.model_zoo.load_url() is called every time a pre-trained model is loaded. Its documentation mentions:
The default value of model_dir is $TORCH_HOME/models where
$TORCH_HOME defaults to ~/.torch.
The default directory can be overridden with the $TORCH_HOME
environment variable.
This can be done as follows:
import torch
import torchvision
import os
# Suppose you are trying to load the pre-trained ResNet model into the directory models\resnet
os.environ['TORCH_HOME'] = 'models\\resnet'  # setting the environment variable
resnet = torchvision.models.resnet18(pretrained=True)
I came across the above solution by raising an issue in PyTorch's GitHub repository:
https://github.com/pytorch/vision/issues/616
This led to an improvement in the documentation, i.e., the solution mentioned above.
Yes, you can simply copy the URLs and use wget to download them to the desired path. Here's an illustration:
For AlexNet:
$ wget -c https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth
For Google Inception (v3):
$ wget -c https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth
For SqueezeNet:
$ wget -c https://download.pytorch.org/models/squeezenet1_1-f364aa15.pth
For MobileNetV2:
$ wget -c https://download.pytorch.org/models/mobilenet_v2-b0353104.pth
For DenseNet201:
$ wget -c https://download.pytorch.org/models/densenet201-c1103571.pth
For MNASNet1_0:
$ wget -c https://download.pytorch.org/models/mnasnet1.0_top1_73.512-f206786ef8.pth
For ShuffleNetv2_x1.0:
$ wget -c https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
If you want to do it in Python, then use something like:
In [11]: from six.moves import urllib
# resnet 101 host url
In [12]: url = "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth"
# download and rename the file to `resnet_101.pth`
In [13]: urllib.request.urlretrieve(url, "resnet_101.pth")
Out[13]: ('resnet_101.pth', <http.client.HTTPMessage at 0x7f7fd7f53438>)
P.S.: You can find the download URLs in the respective Python modules of torchvision.models.
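Once a file such as resnet_101.pth is on disk, you can load it into the matching architecture yourself; a short sketch (assuming torchvision's resnet101 matches the downloaded weights):
import torch
import torchvision

# Build the architecture without triggering any download, then load the
# manually downloaded weights from the local file.
model = torchvision.models.resnet101(pretrained=False)
model.load_state_dict(torch.load("resnet_101.pth"))
model.eval()  # switch to inference mode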
There is a script available that will output a list of URLs across the entire package.
From within the pytorch/vision package execute the following:
python scripts/collect_model_urls.py .
# ...
# https://download.pytorch.org/models/swin_v2_b-781e5279.pth
# https://download.pytorch.org/models/swin_v2_s-637d8ceb.pth
# https://download.pytorch.org/models/swin_v2_t-b137f0e2.pth
# https://download.pytorch.org/models/vgg11-8a719046.pth
# https://download.pytorch.org/models/vgg11_bn-6002323d.pth
# ...
TL;DR: No, it is not possible directly, but you can easily adapt it.
I think what you want to do is to look at torch.utils.model_zoo, which is internally called when you load a pre-trained model:
If we look at the code for the pre-trained models, for example AlexNet here, we can see that it simply calls the previously mentioned model_zoo function, but without the saved location. You can either modify the PyTorch source to specify this (that would actually be a great addition IMO, so maybe open a pull request for that), or else simply adapt the code in the second link to your own liking (and save it to a custom location under a different name), and then manually insert the relevant location there.
If you want to regularly update PyTorch, I would strongly recommend the second method, since it doesn't involve directly altering PyTorch's code base, which could potentially throw errors during updates.
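If all you need is a custom download location (rather than patching PyTorch itself), note that load_url also accepts a model_dir argument; a small sketch using the AlexNet URL listed in the other answer:
import torch.utils.model_zoo as model_zoo
import torchvision

# Download (or reuse a cached copy of) the weights into ./my_models
# instead of the default $TORCH_HOME/models directory.
url = "https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth"
state_dict = model_zoo.load_url(url, model_dir="./my_models")

# Load them into the architecture manually.
model = torchvision.models.alexnet(pretrained=False)
model.load_state_dict(state_dict)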
I have a simple LinearModel with two sparse and two real-valued features. I trained it and now I want to export it with export_savedmodel. Referencing a few sources, I came up with something along the lines of:
feature_spec = create_feature_spec_for_parsing(
    [
        real_valued_column_1, real_valued_column_2,
        sparse_column_1, sparse_column_2
    ]
)
input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
my_estimator.export_savedmodel('my_model/', serving_input_fn=input_receiver_fn)
where:
real_valued_column_1 = tf.contrib.layers.real_valued_column(
    'avg_consumption_h')
sparse_column_1 = tf.contrib.layers.sparse_column_with_integerized_feature("sparse_1", bucket_size=24)
Unfortunately, I get ValueError: A default input_alternative must be provided. on export_savedmodel. I dug a little into the TensorFlow codebase, and it seems that build_parsing_serving_input_receiver_fn always returns a ServingInputReceiver, but the method that extracts input_alternatives always creates them empty if the serving_input_fn passed to export_savedmodel is not of type InputFnOps.
Is build_parsing_serving_input_receiver_fn somehow deprecated, is something wrong in the extraction of input_alternatives, or am I misunderstanding the process completely and doing something wrong?
I'm using Python 3.6 with TensorFlow 1.2; my model is a simple tf.contrib.learn.LinearRegressor.
You can try the following:
from tensorflow.contrib.learn.python.learn.utils.input_fn_utils import build_parsing_serving_input_fn
input_receiver_fn = build_parsing_serving_input_fn(feature_spec)
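For completeness, a sketch of how the export might then look with this input function, assuming the same feature columns and estimator as in the question (untested):
from tensorflow.contrib.layers import create_feature_spec_for_parsing
from tensorflow.contrib.learn.python.learn.utils.input_fn_utils import build_parsing_serving_input_fn

# Build the feature spec from the same columns used for training, then export
# with the contrib.learn-style serving input function, which returns the
# InputFnOps that this export_savedmodel call expects.
feature_spec = create_feature_spec_for_parsing(
    [real_valued_column_1, real_valued_column_2, sparse_column_1, sparse_column_2]
)
serving_input_fn = build_parsing_serving_input_fn(feature_spec)
my_estimator.export_savedmodel('my_model/', serving_input_fn=serving_input_fn)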