Schedule the deployment of a SageMaker model - Python

I'm trying out SageMaker and I've created a model using Autopilot. The catch is that SageMaker only lets you deploy directly to an endpoint. Since I'll only be using the model a couple of times a day, what is the most direct way to schedule deployments based on events (for example, when new CSV files are loaded into an S3 prefix, or when messages appear in an SQS queue), or at least periodically?

Contrary to the other answer here, no custom runtime is needed: Boto3 is part of the Lambda Python environment, so all you need to do is create a SageMaker client and invoke the appropriate API.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html
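For example, a minimal Lambda handler along these lines could be wired to an S3 event or an EventBridge schedule. This is a sketch only: the endpoint and endpoint-config names are placeholders, and the endpoint config for your Autopilot model is assumed to already exist.

import boto3

sm = boto3.client("sagemaker")  # boto3 ships with the Lambda Python runtime

ENDPOINT_NAME = "my-autopilot-endpoint"                 # placeholder
ENDPOINT_CONFIG_NAME = "my-autopilot-endpoint-config"   # placeholder, must already exist

def lambda_handler(event, context):
    # Spin up the endpoint; tear it down from another scheduled Lambda
    # (sm.delete_endpoint) once the day's predictions are done, so you
    # only pay while the endpoint exists.
    sm.create_endpoint(EndpointName=ENDPOINT_NAME,
                       EndpointConfigName=ENDPOINT_CONFIG_NAME)
    return {"status": "creating", "endpoint": ENDPOINT_NAME}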

You can use a trigger (e.g. Cloudwatch Events/EventBridge, S3 event, etc.) to run a Lambda function that deploys your SageMaker model. The Lambda function, however, requires a runtime that can call SageMaker APIs. You will have to create a custom runtime (via Layers) for that. If you're using Python, use this as reference: https://dev.to/vealkind/getting-started-with-aws-lambda-layers-4ipk.

Related

Strategies to run user-submitted code on AWS

I'm building an application that can run user-submitted Python code. I'm considering the following approaches:
1. Spinning up a new AWS Lambda function for each user's request to run the submitted code in it, then deleting the function afterwards. I'm aware of Lambda's time limit, so this would be used to run only small functions.
2. Spinning up a new EC2 machine to run a user's code, one instance per user. Keep the instance running while the user is still interacting with my application, and kill it after the user is done.
3. Same as the 2nd approach, but also spinning up a Docker container inside the EC2 instance to add an additional layer of isolation (is this necessary?)
Are there any security vulnerabilities I need to be aware of? Will the user be able to do anything if they gain access to environment variables in their own lambda function/ec2 machine? Are there any better solutions?
Any code you run on AWS Lambda will have the permissions of the function's execution role, so be very careful what you supply.
Even logging and metrics access can be abused to incur additional costs.
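As an illustration of the first approach (and of why the execution role matters), here is a rough boto3 sketch that creates a throwaway function, invokes it, and deletes it again. The role ARN, zip file and function name are all placeholders; the role should be as tightly scoped as possible.

import json
import boto3

lam = boto3.client("lambda")

# Placeholders: a minimal execution role and a zip containing handler.py with a main() entry point
with open("user_code.zip", "rb") as f:
    lam.create_function(FunctionName="user-submission-123",
                        Runtime="python3.12",
                        Role="arn:aws:iam::123456789012:role/minimal-exec-role",
                        Handler="handler.main",
                        Code={"ZipFile": f.read()},
                        Timeout=30)

# Wait until the function is ready before invoking it
lam.get_waiter("function_active_v2").wait(FunctionName="user-submission-123")
resp = lam.invoke(FunctionName="user-submission-123",
                  Payload=json.dumps({"code": "print(2 + 2)"}))
print(resp["Payload"].read())

# Clean up so the function (and its permissions) don't linger
lam.delete_function(FunctionName="user-submission-123")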

Is it possible to set up an endpoint for a model I created in AWS SageMaker without using the SageMaker SDK

I've created my own model on an AWS SageMaker instance, with my own training and inference loops. I want to deploy it so that I can call the model for inference from AWS Lambda.
I didn't use the SageMaker package for development at all, but every tutorial I've looked at (here is one) does so.
How do I create an endpoint without using the SageMaker package?
You can use the boto3 library to do this.
Here is an example; model_name, role, container, endpoint_config_name and endpoint_name are placeholders you define, and the instance type below is just an example:
import boto3

sm_client = boto3.client('sagemaker')
# Register the model (container image plus the S3 location of the model artifacts)
create_model_response = sm_client.create_model(ModelName=model_name, ExecutionRoleArn=role, Containers=[container])
# An endpoint config needs at least one production variant (which instance type/count serves the model)
create_endpoint_config_response = sm_client.create_endpoint_config(EndpointConfigName=endpoint_config_name, ProductionVariants=[{'VariantName': 'AllTraffic', 'ModelName': model_name, 'InstanceType': 'ml.m5.large', 'InitialInstanceCount': 1}])
# Deploy the endpoint from the config
create_endpoint_response = sm_client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)
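Once the endpoint is InService, calling it (for example from your Lambda) goes through the sagemaker-runtime client rather than the sagemaker client. A quick sketch, reusing the endpoint_name placeholder above and a made-up JSON payload:

import json
import boto3

smr = boto3.client('sagemaker-runtime')
# endpoint_name is whatever you passed to create_endpoint above
resp = smr.invoke_endpoint(EndpointName=endpoint_name,
                           ContentType='application/json',
                           Body=json.dumps({"instances": [[1.0, 2.0]]}))  # dummy payload, adapt to your model
print(resp['Body'].read())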

How to deploy multiple TensorFlow models using AWS?

I've trained 10 different TensorFlow models for style transfer; basically, each model is responsible for applying a filter to an image based on a style image. Every model functions independently, and I want to integrate this into an application. Is there any way to deploy these models using AWS?
I've tried deploying these models using AWS SageMaker, then using the endpoint with AWS Lambda, and finally creating an API using API Gateway. But the catch here is that we can only deploy a single model on SageMaker, and in my case I want to deploy 10 different models.
I expect to provide a link to each model in my application, so the selected filter will trigger the corresponding model on AWS and apply the filter.
What I did for something similar was to create my own Docker container with API code capable of loading and predicting with multiple models. When the API starts, it copies a model.tar.gz from an S3 bucket, and inside that tar.gz are the weights for all my models. My code then scans the contents and loads all the models. If your models are too big (RAM consumption) you might need to handle this differently; as others have noted, you can instead load a model only when predict is called. I load all the models at start-up to get faster predictions, and that is not actually a big change in code.
Another approach that I'm trying right now is to have API Gateway call multiple SageMaker endpoints, although I did not find good documentation for that.
There are a couple of options, and the final choice depends on your priorities in terms of cost, latency, reliability and simplicity.
Different SageMaker endpoints per model - one benefit of this is better robustness, because the models are isolated from one another. If one model gets called a lot, it won't bring the whole fleet down. Each endpoint lives its own life and can be hosted on a separate type of machine to achieve better economics. Note that to achieve high availability it is recommended to double the hardware backend (2+ instances per SageMaker endpoint) so that endpoints are multi-zone; SageMaker does its best to host the backend instances in different availability zones when an endpoint has two or more instances.
One SageMaker TFServing multi-model endpoint - if all your models are TensorFlow models and their artifacts are compatible with TFServing, you may be able to host all of them in a single SageMaker TFServing endpoint. See this section of the docs: Deploying more than one model to your endpoint.
One SageMaker Multi-Model Endpoint - a feature released at the end of 2019 that enables hosting multiple models in the same container (see the sketch after this list).
Serverless deployment in AWS Lambda - this can be cost-effective: models generate charges only when called. It is limited to {DL model; DL framework} pairs that fit within Lambda memory and storage limits and that do not require a GPU. This has been documented a couple of times in the past, notably with TensorFlow and MXNet.
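As a concrete illustration of the Multi-Model Endpoint option, invocation selects the model per request via TargetModel. This is a rough sketch in which the endpoint name, artifact name and payload are all made up:

import json
import boto3

smr = boto3.client("sagemaker-runtime")

payload = json.dumps({"instances": [[0.0, 0.1, 0.2]]})  # dummy input, adapt to your models

# Hypothetical multi-model endpoint; TargetModel picks one of the 10 artifacts
# stored under the endpoint's shared S3 prefix.
resp = smr.invoke_endpoint(EndpointName="style-transfer-mme",
                           TargetModel="style_3.tar.gz",
                           ContentType="application/json",
                           Body=payload)
print(resp["Body"].read())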

Communicating with an Azure container using a serverless function

I have created a Python serverless function in Azure that gets executed when a new file is uploaded to Azure Blob storage (BlobTrigger). The function extracts certain properties of the file and saves them in the DB. As the next step, I want this function to copy and process the same file inside a container instance running in ACS. The result of the processing should be returned back to the same Azure function.
This is a hypothetical architecture that I am currently brainstorming on. I wanted to know if this is feasible. Can you provide me some pointers on how I can achieve this?
I don't see any ContainerTrigger kind of functionality that would allow me to trigger the container and process my next steps.
I have tried the code examples mentioned here, but they are not really performing the tasks that I need: https://github.com/Azure-Samples/aci-docs-sample-python/blob/master/src/aci_docs_sample.py
Based on the comments above, you can consider the following.
Azure Container Instances
Deploy your container in ACI (Azure Container Instances) and expose an HTTP endpoint from the container, just like any web URL. Trigger the Azure Function with a blob storage trigger and then pass your blob file URL to the HTTP endpoint exposed by your container. Process the file there and return the response back to the Azure Function just like a normal HTTP request/response.
You can also bypass the Azure Function completely and trigger your ACI (container instance) using Logic Apps, process the file and save it directly to the database.
When you are using an Azure Function, make sure this is a short-lived process, since an Azure Function will time out after a certain period (the default is 5 minutes). For long-running processing you may have to consider Azure Durable Functions.
The following URL can help you understand better.
https://github.com/Azure-Samples/aci-event-driven-worker-queue
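To make the first suggestion concrete, here is a minimal sketch of a Python blob-triggered function that forwards the blob URL to an HTTP endpoint exposed by the container. The ACI URL and /process route are hypothetical, the blob binding is assumed to be configured in function.json, and requests must be listed in requirements.txt.

import logging
import requests
import azure.functions as func

def main(myblob: func.InputStream):
    logging.info("Processing blob: %s", myblob.name)
    # Hypothetical HTTP endpoint exposed by the container instance
    resp = requests.post("http://my-aci-app.westeurope.azurecontainer.io/process",
                         json={"blob_name": myblob.name, "uri": myblob.uri})
    resp.raise_for_status()
    result = resp.json()
    logging.info("Container returned: %s", result)
    # ...save the extracted properties / processing result to the database here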

How to show Spark application's percentage of completion on AWS EMR (and Boto3)?

I am running a Spark step on AWS EMR; the step is added to EMR through Boto3. I would like to return to the user a percentage of completion of the task. Is there any way to do this?
I was thinking of calculating this percentage from the number of completed Spark stages. I know this won't be very precise, as stage 4 may take twice as long as stage 5, but I am fine with that.
Is it possible to access this information with Boto3?
I checked the list_steps method (here are the docs), but the response only tells me whether the step is running, without any other information.
DISCLAIMER: I know nothing about AWS EMR and Boto3
I would like to return to the user a percentage of completion of the task, is there any way to do this?
Any way? Perhaps. Just register a SparkListener and intercept events as they come. That's how the web UI works under the covers (which is the definitive source of truth for Spark applications).
Use spark.extraListeners property to register a SparkListener and do whatever you want with the events.
Quoting the official documentation's Application Properties:
spark.extraListeners A comma-separated list of classes that implement SparkListener; when initializing SparkContext, instances of these classes will be created and registered with Spark's listener bus. If a class has a single-argument constructor that accepts a SparkConf, that constructor will be called; otherwise, a zero-argument constructor will be called.
You could also consider the REST API interface:
In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for both running applications, and in the history server. The endpoints are mounted at /api/v1. E.g., for the history server, they would typically be accessible at http://<server-url>:18080/api/v1, and for a running application, at http://localhost:4040/api/v1.
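Building on that, here is a small polling sketch that estimates progress from completed stages via the REST API. It assumes the Spark UI of the running application is reachable on port 4040; on EMR you may need an SSH tunnel or to go through the resource manager proxy, so adjust the base URL accordingly.

import requests

BASE = "http://localhost:4040/api/v1"  # adjust host/port to wherever the Spark UI is reachable

app_id = requests.get(f"{BASE}/applications").json()[0]["id"]
stages = requests.get(f"{BASE}/applications/{app_id}/stages").json()
done = sum(1 for s in stages if s["status"] == "COMPLETE")
print(f"~{100 * done / max(len(stages), 1):.0f}% of stages complete")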
This is not supported by Boto3 at the moment, and I don't think it will be anytime soon.
You'll just have to follow the application logs the old-fashioned way, so maybe consider formatting your logs in a way that lets you know what has actually finished.
