How to deploy our ML trained model? - python

I am new to machine learning. I'm done with k-means clustering and the ml model is trained. My question is how to pass input for my trained model?
Example:
Consider a google image processing ML model. For that we pass an image that gives the proper output like emotion from that picture.
Now my doubt is how to do like that I'm done the k-means to predict mall_customer who spending more money to buy a product for this I want to call or pass the input to the my trained model.
I am using python and sci-kit learn.

What you want here is an API where you can send request/input and get response/predictions.
You can create a Flask server, save your trained model as a pickle file and load it when making predictions. This might be some work to do.
Please refer these :
https://towardsdatascience.com/deploying-a-machine-learning-model-as-a-rest-api-4a03b865c166
https://hackernoon.com/deploy-a-machine-learning-model-using-flask-da580f84e60c
Note: The Flask inbuilt server is not production ready. You might want to refer uwsgi + ngnix
In case you are using docker : https://hub.docker.com/r/tiangolo/uwsgi-nginx-flask/ this will be a great help.

Since the question was asked in 2019, many Python libraries exist that allow users to quickly deploy machine learning models without having to learn Flask, containerization, and getting a web hosting solution. The best solution depends on factors like how long you need to deploy the model for, and whether it needs to be able handle heavy traffic.
For the use case that the user described, it sounds like the gradio library could be helpful (http://www.gradio.app/), which allows users to soft-deploy models with public links and user interfaces with a few lines of Python code, like below:

Let's say all you know is how to train and save a model and want some way of using it in a real app or some way of presenting it to the world.
Here's what you'll need to do:
Create an API (eg: using Flask, FastAPI, Starlette etc) which will serve your model, ie, it will receive inputs, run your model on them, and send back outputs.
Need to setup a webserver (eg: uvicorn), that will host your Flask App and serve as a bridge between host machine and your Flask App.
Deploy the whole thing behind a cloud provider like (netlify, GCP, AWS etc). This will give you a url that can be used to call your API.
Then there are other optional things like:
Docker, which let you package your model, it's dependencies and your Flask App together inside a docker image, which can be easily deployed on different platforms due to consistency. Your app will then run as a docker container. This solves the environment consistency problems.
Kubernetes, which lets you make sure your Flask App always stays available by spinning up new docker container every time something goes wrong in the one that's up. This solves availability and scalability problem.
There are multiple tools that ease or automate different parts of this process. You can also check out mia which lets you do all the above and also give a nice frontend UI to your model web app. Its a no-code, low-code tool so you can go from a saved model to a deployed Web App and an API endpoint within minutes.
(Edit - Disclaimer: I'm part of the team responsible for building mia)

Related

How to deal with time consuming python functions in Heroku?

I have successfully deployed a Django app to Heroku using Postgres. The only problem is that some of the python functions I have written can take up to several minutes to run (data scraping many pages with selenium and generating 50 different deep learning models with keras). If it takes longer than 30 seconds the app crashes. I ultimately plan on using this Heroku app as an API that I will connect to a frontend using React on netlify. Is there a way to automatically run these functions behind the scenes somehow? If not, how can I deploy a website that runs time consuming python functions in the backend and uses React for the frontend?
Okay, I think we can divide the problems in TWO parts:\
1- Heroku free Tier (assuming it is) "kills" the server after 30min of absence (source), so basically its very difficult to host a backend in heroku. And besides that, since you're training A LOT of deep learning models you could go out of memory and things like that.
2- You might want to redesign your architecture. What about creating a server that once in a while train this machine learning models and the other one just consume and make inferences on those models? You could also separate the scrapping part from the actual server, and just pull data from db.
Since you didn't added constraints to your problem, I see it that way.

How to run a python script on Azure for CSV file analysis

I have a python script on my local machine that reads a CSV file and outputs some metrics. The end goal is to create a web interface where the user uploads the CSV file and the metrics are displayed, while all being hosted on Azure.
I want to use a VM on Azure to run this python script.
The script takes the CSV file and outputs metrics which are stored in CosmosDB.
A web interface reads from this DB and displays graphs from the data generated by the script.
Can someone elaborate on the steps I need to follow to achieve this? Detailed steps are not essentially required, but a brief overview with links to relevant learning sources would be helpful.
There's an article that lists the primary options for hosting sites in Azure: https://learn.microsoft.com/en-us/azure/developer/python/quickstarts-app-hosting
As Sadiq mentioned, Functions is probably your best choice as it will probably be less expensive, less maintenance, and can handle both the script and the web interface. Here is a python tutorial for that method: https://learn.microsoft.com/en-us/azure/developer/python/tutorial-vs-code-serverless-python-01
Option 2 would be to run a traditional website on an App Service plan, with background tasks handled either by Functions or a Webjob- they both use the webjobs SDK, so the code is very similar: https://learn.microsoft.com/en-us/learn/paths/deploy-a-website-with-azure-app-service/
VMs are an option if either of those two don't work, but it comes with significantly more administration. This learning path has info on how to do this. The website is built on the MEAN stack, but is applicable to Python as well: https://learn.microsoft.com/en-us/learn/paths/deploy-a-website-with-azure-virtual-machines/

How to deploy multiple TensorFlow models using AWS?

I've trained 10 different TensorFlow models for style transfer, basically, each model is responsible to apply filters to a image based on a style image. So, every model is functioning independently and I want to integrate this into an application. Is there any way to deploy these models using AWS?
I've tried deploying these models using AWS SageMaker and then using the endpoint with AWS Lambda and then finally creating an API using API Gateway. But the catch here is that we can only deploy a single model on SageMaker, but in my case I want to deploy 10 different models.
I expect to provide a link to each model in my application, so the selected filter will trigger the model on AWS and will apply the filter.
What I did for something similar is that I created my own docker container with an api code capable of loading and predicting with multiple models. The api, when it starts it copies a model.tar.gz from an S3 bucket, and inside that tar.gz are the weights for all my models, my code then scans the content and loads all the models. If your models are too big (RAM consumption) you might need to handle this differently, as it's said here, that it loads the model only when you call predict. I load all the models at the beginning to have faster predicts. That is not actually a big change in code.
Another approach that I'm trying right now is to have the API Gateway call multiple Sagemaker endpoints, although I did not find good documentation for that.
There are couple options, and the final choice depends on your priorities in terms of cost, latency, reliability, simplicity.
Different SageMaker endpoints per model - one benefit of that is that it leads to better robustness, because models are isolated from one another. If one model gets called a lot, it won't put the whole fleet down. They each live their own life, and can also be hosted on separate type of machines, to achieve better economics. Note that to achieve high-availability it is even recommended to double hardware backend (2+ servers per SageMaker endpoint), so that endpoints are multi-zone, as SageMaker does its best to host endpoint backend on different availability zones if an endpoint has two or more instances.
One SageMaker TFServing multi-model endpoint - If all your models are TensorFlow models and if their artifacts are compatible with TFServing, you may be able to host all of them in a single SageMaker TFServing endpoint. See this section of the docs: Deploying more than one model to your endpoint
One SageMaker Multi-Model Endpoint, a feature that was released end of 2019 and that enables hosting of multiple models in the same container.
Serverless deployment in AWS Lambda - this can be cost-effective: models generate charges only when called. This is limited to pairs of {DL model ; DL framework} that fit within Lambda memory and storage limits and that do not require GPU. It's been documented couple times in the past, notably with Tensorflow and MXNet

Tensorflow Serving: When to use it rather than simple inference inside Flask service?

I am serving a model trained using object detection API. Here is how I did it:
Create a Tensorflow service on port 9000 as described in the basic tutorial
Create a python code calling this service using predict_pb2 from tensorflow_serving.apis similar to this
Call this code inside a Flask server to make the service available with HTTP
Still, I could have done things much easier the following way :
Create a python code for inference like in the example in object detection repo
Call this code inside a Flask server to make the service available with HTTP
As you can see, I could have skipped the use of Tensorflow serving.
So, is there any good reason to use Tensorflow serving in my case ? If not, what are the cases where I should use it ?
I believe most of the reasons why you would prefer Tensorflow Serving over Flask are related to performance:
Tensorflow Serving makes use of gRPC and Protobuf while a regular
Flask web service uses REST and JSON. JSON relies on HTTP 1.1 while
gRPC uses HTTP/2 (there are important differences). In addition,
Protobuf is a binary format used to serialize data and it is more
efficient than JSON.
TensorFlow Serving can batch requests to the same model, which uses hardware (e.g. GPUs) more appropriate.
TensorFlow Serving can manage model versioning
As almost everything, it depends a lot on the use case you have and your scenario, so it's important to think about pros and cons and your requirements. TensorFlow Serving has great features, but these features could be also implemented to work with Flask with some effort (for instance, you could create your batch mechanism).
Flask is used to handle request/response whereas Tensorflow serving is particularly built for serving flexible ML models in production.
Let's take some scenarios where you want to:
Serve multiple models to multiple products (Many to Many relations) at
the same time.
Look which model is making an impact on your product (A/B Testing).
Update model weights in production, which is as easy as saving a new
model to a folder.
Have a performance equal to code written in C/C++.
And you can always use all those advantages for FREE by sending requests to TF Serving using Flask.

Sklearn Model (Python) with NodeJS (Express): how to connect both?

I have a web server using NodeJS - Express and I have a Scikit-Learn (machine learning) model pickled (dumped) in the same machine.
What I need is to demonstrate the model by sending/receiving data from it to the server. I want to load the model on startup of the web server and keep "listening" for data inputs. When receive data, executes a prediction and send it back.
I am relatively new to Python. From what I've seen I could use a "Child Process" to execute that. I also saw some modules that run Python script from Node.
The problem is I want to load the model once and let it be for as long as the server is on. I don't want to keep loading the model every time due to it's size. How is the best way to perform that?
The idea is running everything in a AWS machine.
Thank you in advance.
My recommendation: write a simple python web service (personally recommend flask) and deploy your ML model. Then you can easily send requests to your python web service from your node back-end. You wouldn't have a problem with the initial model loading. it is done once in the app startup, and then you're good to go
DO NOT GO FOR SCRIPT EXECUTIONS AND CHILD PROCESSES!!! I just wrote it in bold-italic all caps so to be sure you wouldn't do that. Believe me... it potentially go very very south, with all that zombie processes upon job termination and other stuff. let's just simply say it's not the standard way to do that.
You need to think about multi-request handling. I think flask now has it by default
I am just giving you general hints because your problem has been generally introduced.

Categories