Microsoft recommender model serialization - python

I'm working with Microsoft Recommender models, and I need to serialize the model object after training so that I can store it in Redis. How can I do this?
When I call pickle.dumps(model), this exception is raised: TypeError: can't pickle _thread.RLock objects
import pickle
pickle.dumps(model)
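One common workaround for "can't pickle _thread.RLock objects" is to strip the non-picklable members (locks, sessions, thread pools) from the object before pickling and restore them afterwards. The sketch below is a generic illustration of that pattern plus the Redis round trip; the attribute name model.lock is a hypothetical placeholder, and which member actually holds the RLock depends on the specific Recommenders model class.
import pickle

import redis

# Hypothetical attribute name: inspect your model to find which member
# actually holds the _thread.RLock (it is often a logger, lock, or session).
lock = getattr(model, "lock", None)
if lock is not None:
    model.lock = None          # drop the non-picklable member

payload = pickle.dumps(model)  # succeeds if the lock was the only blocker

if lock is not None:
    model.lock = lock          # restore it on the live object

r = redis.Redis(host="localhost", port=6379)
r.set("recommender:model", payload)

# Later, possibly in another process:
restored = pickle.loads(r.get("recommender:model"))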

Related

Deploying a new model to a sagemaker endpoint without updating the config?

I want to deploy a new model to an existing AWS SageMaker endpoint. The model is trained by a different pipeline and stored as a model.tar.gz in S3. I provisioned the SageMaker endpoint using AWS CDK. Now, within the training pipeline, I want to allow the data scientists to optionally upload their newly trained model to the endpoint for testing. I don't want to create a new model or a new endpoint config, and I don't want to change the infrastructure (AWS CDK) code.
The model is uploaded to the S3 location that the SageMaker endpoint config uses as the model_data_url, so the endpoint should use the new model, but it doesn't load it. I know that SageMaker caches models inside the container, but I don't know how to force a new load.
This documentation suggests storing the model tarball under another name in the same S3 folder and altering the invocation code to select the model. That is not possible for my application, and I don't want SageMaker to fall back to an old model once the TargetModel parameter is not present.
Here is what I am currently doing after uploading the model to S3. Even though the endpoint transitions into Updating state, it does not force a model reload:
from typing import Any, Dict

import boto3


def update_sm_endpoint(endpoint_name: str) -> Dict[str, Any]:
    """Forces the SageMaker endpoint to reload the model from S3."""
    sm = boto3.client("sagemaker")
    return sm.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=[
            {"VariantName": "main", "DesiredWeight": 1},
        ],
    )
Any ideas?
If you want to change the model served by a SageMaker endpoint, you have to create a new model object and a new endpoint configuration, then call update_endpoint. This will not change the name of the endpoint.
A couple of comments on your question and the SageMaker docs:
The documentation you mention ("This documentation suggests to store the model tarball with another name in the same S3 folder, and alter the code to invoke the model") is for SageMaker Multi-Model Endpoints, a feature that hosts multiple models in the same endpoint in parallel. This is not what you need. You need a single-model SageMaker endpoint, and you update it with a new model and endpoint configuration, as described above.
Also, the API you mention, sm.update_endpoint_weights_and_capacities, is not needed for what you want (unless you want to progressively shift traffic from model 1 to model 2).
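As a rough illustration of that flow, the sketch below registers a new model pointing at the new model.tar.gz, creates a fresh endpoint config, and switches the existing endpoint over to it while keeping the endpoint name. The names, image URI, role ARN, and instance type are placeholders, not values from the question.
from datetime import datetime

import boto3

sm = boto3.client("sagemaker")
suffix = datetime.utcnow().strftime("%Y%m%d%H%M%S")

# 1) Register the new model artifact (names and ARNs below are placeholders)
sm.create_model(
    ModelName=f"my-model-{suffix}",
    PrimaryContainer={
        "Image": "<inference-image-uri>",
        "ModelDataUrl": "s3://my-bucket/path/model.tar.gz",
    },
    ExecutionRoleArn="<sagemaker-execution-role-arn>",
)

# 2) Create a new endpoint config that points at the new model
sm.create_endpoint_config(
    EndpointConfigName=f"my-endpoint-config-{suffix}",
    ProductionVariants=[
        {
            "VariantName": "main",
            "ModelName": f"my-model-{suffix}",
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
        }
    ],
)

# 3) Switch the existing endpoint to the new config (the endpoint name is unchanged)
sm.update_endpoint(
    EndpointName="my-endpoint",
    EndpointConfigName=f"my-endpoint-config-{suffix}",
)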

How to fix AttributeError: 'module' object has no attribute 'config' in pytest module

Problem: I am unable to load the pytest.config object in my test cases. It reports that config is not an attribute of the pytest module (at import time itself).
AttributeError: 'module' object has no attribute 'config'
There are certain configuration parameters available on the pytest.config object (set in utility scripts) which need to be consumed by different test cases.
Background: We are migrating an old test framework to a new one. In this migration project, we have to consume plugin objects set on the pytest.config object (I am new to the pytest framework).
search = pytest.config.deployment_object
I am unable to get deployment_object from the pytest.config object.
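Note that the pytest.config global was deprecated and later removed in newer pytest releases, so direct module-level access raises exactly this AttributeError; the fixture-based pattern below is what current versions expect. deployment_object comes from the question, while the conftest.py hook and the placeholder value are hypothetical illustrations of how such an attribute ends up on the config object.
# conftest.py (hypothetical plugin code that attaches the object)
def pytest_configure(config):
    config.deployment_object = {"env": "test"}  # placeholder for your utility object


# test_example.py: access the config through the built-in pytestconfig fixture
def test_deployment(pytestconfig):
    search = pytestconfig.deployment_object
    assert search == {"env": "test"}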

How to deploy multiple TensorFlow models using AWS?

I've trained 10 different TensorFlow models for style transfer; basically, each model is responsible for applying a filter to an image based on a style image. Every model functions independently, and I want to integrate this into an application. Is there any way to deploy these models using AWS?
I've tried deploying these models using AWS SageMaker, then using the endpoint with AWS Lambda, and finally creating an API with API Gateway. The catch here is that this approach only lets me deploy a single model on SageMaker, but in my case I want to deploy 10 different models.
I expect to provide a link to each model in my application, so that the selected filter triggers the corresponding model on AWS and applies the filter.
For something similar, I created my own Docker container with API code capable of loading and predicting with multiple models. When the API starts, it copies a model.tar.gz from an S3 bucket; inside that tar.gz are the weights for all my models. My code then scans the contents and loads all the models. If your models are too big (RAM consumption), you might need to handle this differently; as mentioned here, you can load a model only when predict is called. I load all the models at startup to get faster predictions, and that is not actually a big change in code.
Another approach I'm trying right now is to have API Gateway call multiple SageMaker endpoints, although I did not find good documentation for that.
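A minimal sketch of that "scan and load everything at startup" idea, assuming the tarball has already been downloaded from S3 and extracted into a local directory of Keras .h5 files; the directory path and file naming are placeholders:
import os

from tensorflow.keras.models import load_model

MODEL_DIR = "/opt/ml/models"  # placeholder: wherever the extracted tar.gz lives

# Load every model once at startup and keep them in memory, keyed by filename
models = {
    os.path.splitext(name)[0]: load_model(os.path.join(MODEL_DIR, name))
    for name in os.listdir(MODEL_DIR)
    if name.endswith(".h5")
}


def predict(style_name, image_batch):
    """Route a request to the model for the selected style filter."""
    return models[style_name].predict(image_batch)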
There are a couple of options, and the final choice depends on your priorities in terms of cost, latency, reliability, and simplicity.
A separate SageMaker endpoint per model - one benefit is better robustness, because models are isolated from one another: if one model gets called a lot, it won't bring down the whole fleet. Each endpoint lives its own life and can be hosted on a different type of machine to achieve better economics. Note that for high availability it is even recommended to double the hardware backend (2+ instances per SageMaker endpoint) so that endpoints are multi-zone, as SageMaker does its best to spread an endpoint's backend over different availability zones when it has two or more instances.
One SageMaker TFServing multi-model endpoint - if all your models are TensorFlow models and their artifacts are compatible with TFServing, you may be able to host all of them in a single SageMaker TFServing endpoint. See this section of the docs: Deploying more than one model to your endpoint
One SageMaker Multi-Model Endpoint - a feature released at the end of 2019 that enables hosting multiple models in the same container (see the invocation sketch after this list).
Serverless deployment in AWS Lambda - this can be cost-effective: models generate charges only when called. This is limited to pairs of {DL model ; DL framework} that fit within Lambda memory and storage limits and that do not require a GPU. It has been documented a couple of times in the past, notably with TensorFlow and MXNet.
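For the Multi-Model Endpoint option, the caller selects the artifact with the TargetModel parameter at invocation time. A rough sketch with placeholder names (the endpoint name, the tarball name relative to the S3 prefix, and the payload format are assumptions):
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="style-transfer-mme",          # placeholder endpoint name
    TargetModel="style_3.tar.gz",               # which of the 10 artifacts to serve
    ContentType="application/json",
    Body=json.dumps({"instances": [[0.1, 0.2, 0.3]]}),  # placeholder payload
)
print(response["Body"].read())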

How to keep a Keras model loaded into memory and use it when needed?

I was reading a Keras blog post teaching how to create a simple image-classifier REST API with Flask. I was wondering how to achieve the same approach of loading the model in other web frameworks that do not use Python.
In the code below, the model is loaded into memory just before the server starts and stays loaded for as long as the server is alive:
# if this is the main thread of execution first load the model and
# then start the server
if __name__ == "__main__":
    print(("* Loading Keras model and Flask starting server..."
           "please wait until server has fully started"))
    load_model()
    app.run()
I'm familiar with pickle, and I know how to run Python code from other web frameworks (such as python-shell for Node.js). Pickled models are built once and can be loaded each time they're needed, but I'm looking to achieve what the tutorial suggests: loading the model only once and using it multiple times. Is creating a separate Python server app that serves the loaded model to Node.js requests a good idea?
You can load a model in Keras using load_model and pass in a path:
from keras.models import load_model
model = load_model('model.hd5')
I have created a Flask API that loads a Keras model, you can take a look here if it helps:
https://github.com/Ares513/DetectingTrollsApi/blob/master/api.py
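In the same spirit, here is a minimal sketch of a Flask app that loads the model once at startup and reuses it for every request. The file name model.hd5 comes from the snippet above; the route, input shape, and preprocessing are placeholders:
import numpy as np
from flask import Flask, jsonify, request
from keras.models import load_model

app = Flask(__name__)
model = load_model("model.hd5")  # loaded once, reused for every request


@app.route("/predict", methods=["POST"])
def predict():
    # Placeholder preprocessing: expects a JSON list already shaped for the model
    features = np.array(request.get_json()["instances"])
    predictions = model.predict(features)
    return jsonify(predictions.tolist())


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)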
I managed to develop a Python "model" server in which the ML model is loaded into memory and is shared through sockets. The consumer app, on the other hand, is a simple Node.js web app that forwards the requests to the Python server and retrieves the reply.
You can find the code sample here: Keras deep api
It is an image-classifier app that uses ResNet50 to classify images. Images can be uploaded through the Node.js app and are then passed to the Python server for classification; the result is then sent back to the Node.js app.
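A stripped-down sketch of the Python side of that architecture, assuming a plain TCP socket server that keeps a ResNet50 instance in memory and answers each connection with the top prediction. The port, message framing, and image handling are simplified placeholders; the linked repository will differ in detail.
import socket

import numpy as np
from keras.applications.resnet50 import ResNet50, decode_predictions, preprocess_input
from keras.preprocessing import image

model = ResNet50(weights="imagenet")  # loaded once, before accepting connections

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 9000))       # placeholder host/port
server.listen(1)

while True:
    conn, _ = server.accept()
    # Placeholder protocol: the client sends a local file path in one message;
    # a real server would use proper message framing and send image bytes.
    path = conn.recv(4096).decode().strip()
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    label = decode_predictions(model.predict(x), top=1)[0][0][1]
    conn.sendall(label.encode())
    conn.close()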

Tensorflow Serving: When to use it rather than simple inference inside Flask service?

I am serving a model trained using the Object Detection API. Here is how I did it:
Create a TensorFlow Serving service on port 9000 as described in the basic tutorial
Create Python code calling this service using predict_pb2 from tensorflow_serving.apis, similar to this (a sketch of such a client is shown below)
Call this code inside a Flask server to make the service available over HTTP
Still, I could have done things much more easily the following way:
Create Python inference code like the example in the object detection repo
Call this code inside a Flask server to make the service available over HTTP
As you can see, I could have skipped using TensorFlow Serving.
So, is there any good reason to use TensorFlow Serving in my case? If not, in what cases should I use it?
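For context, the kind of gRPC client the question refers to typically looks like the sketch below. The model name, signature name, input tensor key, host/port, and dummy image are placeholders that depend on how the model was exported and how the server was started.
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:9000")  # placeholder host/port
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "detector"                 # placeholder model name
request.model_spec.signature_name = "serving_default"
# Placeholder input key and dummy image tensor; use the keys from your export
img = np.zeros((1, 300, 300, 3), dtype=np.uint8)
request.inputs["inputs"].CopyFrom(tf.make_tensor_proto(img))

result = stub.Predict(request, timeout=10.0)
print(result.outputs.keys())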
I believe most of the reasons why you would prefer TensorFlow Serving over Flask are related to performance:
TensorFlow Serving makes use of gRPC and Protobuf, while a regular Flask web service uses REST and JSON. JSON relies on HTTP/1.1, while gRPC uses HTTP/2 (there are important differences). In addition, Protobuf is a binary format used to serialize data, and it is more efficient than JSON.
TensorFlow Serving can batch requests to the same model, which uses hardware (e.g. GPUs) more efficiently.
TensorFlow Serving can manage model versioning.
As with almost everything, it depends a lot on your use case and scenario, so it's important to weigh the pros and cons against your requirements. TensorFlow Serving has great features, but these features could also be implemented to work with Flask with some effort (for instance, you could create your own batching mechanism).
Flask is used to handle requests and responses, whereas TensorFlow Serving is built specifically for serving flexible ML models in production.
Let's take some scenarios where you want to:
Serve multiple models to multiple products (many-to-many relations) at the same time.
See which model is making an impact on your product (A/B testing).
Update model weights in production, which is as easy as saving a new model to a folder.
Have performance equal to code written in C/C++.
And you can always get all of those advantages for free by sending requests to TF Serving from Flask.
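The "update weights by saving a new model to a folder" point relies on TF Serving's versioned model directory layout: once a new numbered version directory appears, it is picked up automatically, and a client can also pin a specific version. A rough sketch of calling the REST API from Python, with placeholder model name, host/port, and payload:
import requests

# TF Serving watches a versioned layout such as:
#   /models/style_model/1/...   (old version)
#   /models/style_model/2/...   (new version, picked up automatically)

payload = {"instances": [[0.1, 0.2, 0.3]]}  # placeholder input

# Predict with the latest loaded version
r = requests.post("http://localhost:8501/v1/models/style_model:predict", json=payload)

# Or pin an explicit version
r = requests.post("http://localhost:8501/v1/models/style_model/versions/2:predict", json=payload)

print(r.json())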
