Deploying a new model to a SageMaker endpoint without updating the config? - python

I want to deploy a new model to an existing AWS SageMaker endpoint. The model is trained by a different pipeline and stored as a model.tar.gz in S3, and the SageMaker endpoint config already points to that location as the model data URL. SageMaker, however, doesn't reload the model, and I don't know how to convince it to do so.
I provisioned the SageMaker endpoint using AWS CDK. Now, within the training pipeline, I want to allow the data scientists to optionally upload their newly trained model to the endpoint for testing. I don't want to create a new model or an endpoint config, and I don't want to change the infrastructure (AWS CDK) code.
The model is uploaded to the S3 location that the SageMaker endpoint config uses as the model_data_url, so it should pick up the new model. But it doesn't load it. I know that SageMaker caches models inside the container, but I don't know how to force a fresh load.
This documentation suggests storing the model tarball under another name in the same S3 folder and altering the invocation code to pass that name. That is not possible for my application, and I don't want SageMaker to fall back to an old model when the TargetModel parameter is not present.
Here is what I am currently doing after uploading the model to S3. Even though the endpoint transitions into the Updating state, it does not force a model reload:
from typing import Any, Dict

import boto3


def update_sm_endpoint(endpoint_name: str) -> Dict[str, Any]:
    """Forces the SageMaker endpoint to reload the model from S3."""
    sm = boto3.client("sagemaker")
    return sm.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=[
            {"VariantName": "main", "DesiredWeight": 1},
        ],
    )
Any ideas?

If you want to change the model behind a SageMaker endpoint, you have to create a new model object and a new endpoint configuration, then call update_endpoint. This will not change the name of the endpoint.
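A minimal sketch of that flow with boto3 (the names, image URI, instance type, and S3 path below are placeholders, adjust them to your setup):

import boto3

sm = boto3.client("sagemaker")

# 1. Register a new model object that points at the freshly uploaded artifact
sm.create_model(
    ModelName="my-model-v2",                                            # hypothetical name
    ExecutionRoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",  # hypothetical role ARN
    PrimaryContainer={
        "Image": "<your inference image URI>",                          # same image the endpoint already uses
        "ModelDataUrl": "s3://my-bucket/path/model.tar.gz",             # hypothetical location of the new model
    },
)

# 2. Create a new endpoint config that references the new model
sm.create_endpoint_config(
    EndpointConfigName="my-endpoint-config-v2",
    ProductionVariants=[{
        "VariantName": "main",
        "ModelName": "my-model-v2",
        "InstanceType": "ml.m5.large",                                  # assumption, pick your instance type
        "InitialInstanceCount": 1,
    }],
)

# 3. Point the existing endpoint at the new config; the endpoint name does not change
sm.update_endpoint(
    EndpointName="my-endpoint",
    EndpointConfigName="my-endpoint-config-v2",
)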
Comments on your question and the SageMaker docs:
The documentation you mention (about storing the model tarball under another name in the same S3 folder and altering the invocation code) is for SageMaker Multi-Model Endpoint, a feature for hosting multiple models in parallel on the same endpoint. This is not what you need. You need a single-model SageMaker endpoint, one that you update with a new model and endpoint configuration.
Also, the API you mention, sm.update_endpoint_weights_and_capacities, is not needed for what you want (unless you want a progressive rollout of the traffic from model 1 to model 2).

Related

Is it possible to set up an endpoint for a model I created in AWS SageMaker without using the SageMaker SDK

I've created my own model on an AWS SageMaker instance, with my own training and inference loops. I want to deploy it so that I can call the model for inference from AWS Lambda.
I didn't use the SageMaker package for development at all, but every tutorial (here is one) I've looked at does so.
How do I create an endpoint without using the SageMaker package?
You can use the boto3 library to do this.
Here is an example of pseudocode for this -
import boto3

sm_client = boto3.client('sagemaker')
# container = {'Image': '<inference image URI>', 'ModelDataUrl': 's3://.../model.tar.gz'}
create_model_response = sm_client.create_model(ModelName=model_name, ExecutionRoleArn=role, Containers=[container])
# create_endpoint_config requires at least one production variant that points at the model
create_endpoint_config_response = sm_client.create_endpoint_config(EndpointConfigName=endpoint_config_name, ProductionVariants=[{'VariantName': 'AllTraffic', 'ModelName': model_name, 'InstanceType': 'ml.m5.large', 'InitialInstanceCount': 1}])
create_endpoint_response = sm_client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)
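From Lambda you then call the endpoint with the sagemaker-runtime client rather than the sagemaker client. A minimal sketch, assuming a container that accepts and returns JSON (the payload shape is whatever your inference code expects):

import json

import boto3

runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # Forward the request payload to the SageMaker endpoint and return its prediction
    response = runtime.invoke_endpoint(
        EndpointName="my-endpoint",          # hypothetical, the name used in create_endpoint above
        ContentType="application/json",
        Body=json.dumps(event["payload"]),
    )
    return json.loads(response["Body"].read())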

Does MLflow allow logging artifacts from remote locations like S3?

My setting
I have developed an environment for ML experiments that looks like the following: training happens in the AWS cloud with SageMaker Training Jobs. The trained model is stored in the /opt/ml/model directory, which SageMaker reserves for packaging the model as a .tar.gz into SageMaker's own S3 bucket. Several evaluation metrics are computed during training and testing and recorded to an MLflow infrastructure consisting of an S3-based artifact store (see Scenario 4). Note that this is a different S3 bucket than SageMaker's.
A very useful feature from MLflow is that any model artifacts can be logged to a training run, so data scientists have access to both metrics and more complex outputs through the UI. These outputs include (but are not limited to) the trained model itself.
A limitation is that, as I understand it, the MLflow API for logging artifacts only accepts as input a local path to the artifact itself, and will always upload it to its artifact store. This is suboptimal when the artifacts are stored somewhere outside MLflow, as you have to store them twice. A transformer model may weigh more than 1GB.
My questions
Is there a way to pass an S3 path to MLflow and make it count as an artifact, without having to download it locally first?
Is there a way to avoid pushing a copy of an artifact to the artifact store? If my artifacts already reside in another remote location, it would be ideal to just have a link to such location in MLflow and not a copy in MLflow storage.
You can use a Tracking Server with S3 as a backend.
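A minimal sketch of that setup (the server address, bucket, and file path below are hypothetical): the tracking server is started with an S3 default artifact root, and the client logs runs and artifacts through the server's URI, so the artifact bytes land in your S3 bucket rather than on the tracking server's disk.

# On the server host (shown here as a comment):
#   mlflow server --backend-store-uri sqlite:///mlflow.db \
#                 --default-artifact-root s3://my-mlflow-bucket/artifacts \
#                 --host 0.0.0.0 --port 5000
import mlflow

mlflow.set_tracking_uri("http://mlflow.example.com:5000")  # hypothetical tracking server URL

with mlflow.start_run():
    mlflow.log_metric("test_accuracy", 0.93)
    # log_artifact still takes a local path; the upload target is the S3 artifact root
    mlflow.log_artifact("outputs/model.tar.gz")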

Schedule the deployment of a sagemaker model

I'm trying out SageMaker and I've created a model using Autopilot. The point is that SageMaker only lets you deploy directly to an endpoint. But since I'll only be using the model a couple of times a day, what is the most direct way to schedule deployments by events (for example, when new CSVs land in an S3 directory, or when I see a queue in SQS) or at least periodically?
The other answer here is incorrect: Boto3 is part of the Lambda Python runtime, so all you need to do is create a SageMaker client and invoke the appropriate API.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html
You can use a trigger (e.g. CloudWatch Events/EventBridge, an S3 event, etc.) to run a Lambda function that deploys your SageMaker model. The Lambda function, however, requires a runtime that can call SageMaker APIs. You will have to create a custom runtime (via Layers) for that. If you're using Python, use this as a reference: https://dev.to/vealkind/getting-started-with-aws-lambda-layers-4ipk.
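A minimal sketch of such a Lambda handler, assuming the Model and EndpointConfig were created once up front and the trigger only needs to spin the endpoint up (the names below are hypothetical):

import boto3

sm = boto3.client("sagemaker")

ENDPOINT_NAME = "my-autopilot-endpoint"                  # hypothetical
ENDPOINT_CONFIG_NAME = "my-autopilot-endpoint-config"    # hypothetical, created once up front

def lambda_handler(event, context):
    """Triggered by an S3 upload or an EventBridge schedule; starts endpoint creation."""
    sm.create_endpoint(
        EndpointName=ENDPOINT_NAME,
        EndpointConfigName=ENDPOINT_CONFIG_NAME,
    )
    return {"status": "endpoint creation started"}

A second, scheduled function can call delete_endpoint once the day's predictions are done, so you only pay while the endpoint exists.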

How to deploy multiple TensorFlow models using AWS?

I've trained 10 different TensorFlow models for style transfer; basically, each model is responsible for applying a filter to an image based on a style image. So every model functions independently, and I want to integrate this into an application. Is there any way to deploy these models using AWS?
I've tried deploying these models using AWS SageMaker, then using the endpoint with AWS Lambda, and finally creating an API using API Gateway. But the catch here is that we can only deploy a single model on SageMaker, while in my case I want to deploy 10 different models.
I expect to provide a link to each model in my application, so the selected filter will trigger the model on AWS and will apply the filter.
What I did for something similar was to build my own Docker container with API code capable of loading and predicting with multiple models. When the API starts, it copies a model.tar.gz from an S3 bucket; inside that tar.gz are the weights for all my models, and my code then scans the contents and loads every model. If your models are too big (RAM consumption) you might need to handle this differently; as noted here, you can instead load a model only when predict is called. I load all the models at startup to get faster predictions, and that is not actually a big change in code.
Another approach I'm trying right now is to have API Gateway call multiple SageMaker endpoints, although I did not find good documentation for that.
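A rough sketch of the loading step described above, assuming the extracted model.tar.gz contains one TensorFlow SavedModel directory per style (paths and names are illustrative):

import os

import tensorflow as tf

MODEL_DIR = "/opt/ml/model"  # SageMaker extracts model.tar.gz here inside the container

# Load every SavedModel directory found in the archive at startup
models = {
    entry.name: tf.saved_model.load(entry.path)
    for entry in os.scandir(MODEL_DIR)
    if entry.is_dir()
}

def predict(style_name, inputs):
    """Dispatch to the style model selected by the request."""
    return models[style_name](inputs)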
There are a couple of options, and the final choice depends on your priorities in terms of cost, latency, reliability, and simplicity.
Different SageMaker endpoints per model - one benefit of this is better robustness, because models are isolated from one another: if one model gets called a lot, it won't bring the whole fleet down. Each endpoint lives its own life and can be hosted on a different type of machine to achieve better economics. Note that to achieve high availability it is even recommended to double the hardware backend (2+ instances per SageMaker endpoint) so that endpoints are multi-zone, as SageMaker does its best to spread an endpoint's backend across availability zones when it has two or more instances.
One SageMaker TFServing multi-model endpoint - If all your models are TensorFlow models and if their artifacts are compatible with TFServing, you may be able to host all of them in a single SageMaker TFServing endpoint. See this section of the docs: Deploying more than one model to your endpoint
One SageMaker Multi-Model Endpoint - a feature released at the end of 2019 that enables hosting multiple models in the same container (see the invocation sketch after this list).
Serverless deployment in AWS Lambda - this can be cost-effective: models generate charges only when called. This is limited to {DL model; DL framework} pairs that fit within Lambda memory and storage limits and that do not require a GPU. It has been documented a couple of times in the past, notably with TensorFlow and MXNet.
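For the Multi-Model Endpoint option, the caller selects the artifact with the TargetModel parameter at invocation time. A minimal sketch, assuming the style models were uploaded as separate tarballs under the endpoint's model prefix (endpoint and file names are illustrative):

import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="style-transfer-mme",        # hypothetical multi-model endpoint
    TargetModel="style-van-gogh.tar.gz",      # which artifact under the model prefix to load
    ContentType="application/json",
    Body=json.dumps({"image": "<base64-encoded input image>"}),
)
result = json.loads(response["Body"].read())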

SageMaker delete Models and Endpoint configurations with python API

I've tried deleting/recreating endpoints with the same name, and wasted a lot of time before I realized that changes do not get applied unless you also delete the corresponding Model and Endpoint configuration so that new ones can be created with that name.
Is there a way with the SageMaker Python API to delete all three instead of just the endpoint?
I believe you are looking for something like this:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.delete_endpoint_config
Example:
import boto3

client = boto3.client('sagemaker')
deployment_name = 'my_deployment_name'

# Find the model referenced by the endpoint config
response = client.describe_endpoint_config(EndpointConfigName=deployment_name)
model_name = response['ProductionVariants'][0]['ModelName']

# Delete all three resources so new ones can be created under the same names
client.delete_model(ModelName=model_name)
client.delete_endpoint(EndpointName=deployment_name)
client.delete_endpoint_config(EndpointConfigName=deployment_name)
It looks like AWS is currently in the process of supporting model deletion via API with this pull request.
For the time being, Amazon's only recommendation is to delete everything via the console.
If this is critical to your system, you can probably manage everything via CloudFormation and create/delete stacks containing your SageMaker models and endpoints.
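A sketch of that CloudFormation route, assuming the model, endpoint config, and endpoint live in a stack named my-sagemaker-stack (hypothetical): deleting the stack removes all three resources together.

import boto3

cfn = boto3.client("cloudformation")

# Deleting the stack tears down the Model, EndpointConfig, and Endpoint it created
cfn.delete_stack(StackName="my-sagemaker-stack")
cfn.get_waiter("stack_delete_complete").wait(StackName="my-sagemaker-stack")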
