Performance penalty for Tensorflow Serving in docker hosted app - python

I am experiencing a large performance penalty for calls to Tensorflow Serving when the calling app is hosted in a Docker container. I was hoping someone would have some suggestions. Details on the setup and what I have tried are below.
Scenario 1:
Docker (version 18.09.0, build 4d60db4) hosted Tensorflow model, following the instructions here.
Flask app running on the host machine (not in container).
Using gRPC for sending the request to the model.
Performance: 0.0061 seconds per prediction
Scenario 2:
Same Docker-hosted Tensorflow model as above.
Flask app running inside the same container as the model.
Using gRPC for sending the request to the model.
Performance: 0.0107 seconds per prediction
In other words, when the app is hosted in the same container as the model, throughput is roughly 40% lower (0.0107 s vs. 0.0061 s per prediction).
I have logged timing on nearly every step in the app and have tracked the difference down to this line:
result = self.stub.Predict(self.request, 60.0)
In the container-hosted app, the average round trip for this line is 0.006 seconds; for the same app hosted outside the container, it is 0.002 seconds.
This is the function I am using to establish the connection to the model.
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

def TFServerConnection():
    channel = implementations.insecure_channel('127.0.0.1', 8500)
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
    request = predict_pb2.PredictRequest()
    return (channel, stub, request)
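For context, this is roughly how the connection is used and timed per request (a sketch; populating the request with the model's real inputs is omitted):

import time

channel, stub, request = TFServerConnection()
# request.model_spec.name = 'mymodel'   # hypothetical model name
# request.inputs[...] would be filled with the real tensors here

start = time.perf_counter()
result = stub.Predict(request, 60.0)  # the call whose latency differs
print('round trip: %.4f s' % (time.perf_counter() - start))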
I have tried hosting the app and model in separate containers, building a custom Tensorflow Serving container (optimized for my VM), and using the REST API (which slightly decreased performance in both scenarios).
Edit 1
To add a bit more information, I am running the docker container with the following command:
docker run \
--detach \
--publish 8000:8000 \
--publish 8500:8500 \
--publish 8501:8501 \
--name tfserver \
--mount type=bind,source=/home/jason/models,target=/models \
--mount type=bind,source=/home/jason/myapp/serve/tfserve.conf,target=/config/tfserve.conf \
--network host \
jason/myapp:latest
Edit 2
I have now tracked this down to being an issue with stub.Predict(request, 60.0) in Flask apps only. It seems Docker is not the issue. Here are the versions of Flask and Tensorflow I am currently running.
$ sudo pip3 freeze | grep Flask
Flask==1.0.2
$ sudo pip3 freeze | grep tensor
tensorboard==1.12.0
tensorflow==1.12.0
tensorflow-serving-api==1.12.0
I am using gunicorn as my WSGI server:
gunicorn --preload --config config/gunicorn.conf app:app
And the contents of config/gunicorn.conf:
bind = "0.0.0.0:8000"
workers = 3
timeout = 60
worker_class = 'gevent'
worker_connections = 1000
Edit 3
I have now narrowed the issue down to Flask. I ran the Flask app directly with app.run() and got the same performance as when using gunicorn. What could Flask be doing that would slow the call to Tensorflow?

Related

App engine 404 Error for readiness check even with increased app_start_timeout_sec

First off, I would like to state that I have scoured SO for a solution, yet nothing worked for me...
I am trying to deploy a Flask server on App Engine, yet I always get a 404 error with /readiness_check failReason:"null"
This is my app.yaml (yes, I did increase the app_start_timeout_sec)
# yaml config for custom environment that uses docker
runtime: custom
env: flex
service: test-appengine

# change readiness check;
# readiness failure leads to 502 Error
readiness_check:
  path: "/readiness_check"
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 1800
And this is my Dockerfile:
# Use the official Python image.
# https://hub.docker.com/_/python
FROM python:3.8-buster
# Install Python dependencies.
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .
# expose port 8080 for app engine
EXPOSE 8080
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
# CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
CMD ["gunicorn", "main:app", "-b", ":8080", "--timeout", "300"]
Finally, my main.py contains a very basic route, for the sake of argument:
from flask import Flask

app = Flask(__name__)

@app.route("/")
def return_hello():
    return "Hello!"
Could you please let me know what I'm doing wrong? I have been battling this issue for days now... Thank you!
I believe you still need to define the handler for your readiness_check (you're getting 404 which means route not found).
See this article for an example
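For instance, a minimal handler matching the path configured in app.yaml might look like this (a sketch, assuming the Flask app object from main.py above):

@app.route("/readiness_check")
def readiness_check():
    # The path must match readiness_check.path in app.yaml.
    return "", 200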

Can't make a request from FastAPI container to Kafka container

I have two containers: one is a Kafka container with port 9092:9092 published, and the other is a FastAPI container. If I don't dockerize FastAPI, I can make a REST request to FastAPI and it sends the message to Kafka. But when I dockerize FastAPI, the FastAPI container can't connect to the Kafka container.
I can't run the FastAPI container with -p 8000:8000 -p 9092:9092; it says 9092 is already in use.
How can I make a request to the FastAPI container so that FastAPI then connects to the Kafka container?
FastAPI Dockerfile:
FROM python:3.8.10
ADD . .
COPY requirements.txt .
RUN pip3 install -r requirements.txt
CMD ["python3", "main.py"]
My error is:
kafka.errors.NoBrokersAvailable: NoBrokersAvailable
I get the Kafka container's IP address and point the producer at it, for example:
producer = KafkaProducer(
    bootstrap_servers=containerip,
    value_serializer=lambda x: json.dumps(x).encode('utf-8'),
    api_version=(2,),
)
cant run fastapi docker file with -p 8000:8000 -p 9092:9092 it says 9092 is already used.
Remove it, then. It's unclear why you need port 9092 on your API anyway; it's not the Kafka service.
Without seeing your complete Kafka client code, it's hard to say what your other problems are, so please consult Connect to Kafka running in Docker
For example, what is containerip? This should be kafka:9092 (if you follow the instructions in the linked post)
Run docker network create
Make sure you use docker run with --network on both containers
Ensure the KAFKA_ADVERTISED_LISTENERS variable contains at least INTERNAL://kafka:9092
Remove the -p flags for the Kafka container, since you are only interacting with Kafka from another container
Connect Python to kafka:9092 (see the sketch below)
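For the last step, a minimal producer sketch, assuming the broker is reachable as kafka:9092 on the shared network (the topic name and payload are placeholders):

import json
from kafka import KafkaProducer

# "kafka:9092" assumes the broker container is named/aliased "kafka"
# on the Docker network both containers are attached to.
producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda x: json.dumps(x).encode("utf-8"),
)
producer.send("my-topic", {"hello": "world"})  # placeholder topic and payload
producer.flush()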
If you set the network mode of your container to host, it's done.
Run your FastAPI container (pay attention to the --network switch):
docker run --network host -p 8000:8000 fast-api
run kafka:
docker run -p 9092:9092 kafka
run postgres:
docker run -p 5432:5432 postgres
But it's better to use bridge networks.

How to deploy a scalable API using fastapi?

I have a complex API which takes around 7GB of memory when I deploy it using Uvicorn.
I want to understand how I can deploy it so that I can make parallel requests to it. The deployed API should be capable of processing two or three requests at the same time.
I am using FastAPI with uvicorn and nginx for deployment. Here is my deploy command:
uvicorn --host 0.0.0.0 --port 8888
Can someone provide some clarity on how people achieve that?
You can use gunicorn instead of uvicorn to handle your backend. Gunicorn offers multiple workers to load-balance arriving requests effectively. This means you will have as many running gunicorn processes as you specify to receive and process requests. From the docs, gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second. However, the number of workers should be no more than (2 x number_of_cpu_cores) + 1 to avoid out-of-memory errors. You can check this out in the docs.
For example, if you want to use 4 workers for your fastapi-based backend, you can specify it with the -w flag:
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b "0.0.0.0:8888"
In this case, the script where I have my backend functionalities is called main and fastapi is instantiated as app.
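If you'd rather derive the worker count from the machine than hard-code it, the (2 x number_of_cpu_cores) + 1 rule translates directly into a gunicorn config file (a sketch; the file name is arbitrary):

# gunicorn.conf.py -- compute the worker count with the
# (2 x cores) + 1 rule of thumb from the gunicorn docs
import multiprocessing

workers = (2 * multiprocessing.cpu_count()) + 1
worker_class = "uvicorn.workers.UvicornWorker"
bind = "0.0.0.0:8888"

You would then start it with gunicorn -c gunicorn.conf.py main:app. Note that each worker loads its own copy of the app, so with a 7GB API the memory footprint grows with the worker count.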
I'm working on something like this using Docker and NGINX.
There's an official Docker image, created by the developer of FastAPI, that deploys uvicorn/gunicorn for you and can be configured to your needs:
https://hub.docker.com/r/tiangolo/uvicorn-gunicorn-fastapi
It took some time to get the hang of Docker, but I'm really liking it now. You can build an nginx image using the configuration below, then build as many copies of your app as you need in separate containers to serve as hosts.
The example below runs a weighted load balancer for two of my app services, with a third as backup if those two should fail.
nginx Dockerfile:
FROM nginx
# Remove the default nginx.conf
RUN rm /etc/nginx/conf.d/default.conf
# Replace with our own nginx.conf
COPY nginx.conf /etc/nginx/conf.d/
nginx.conf:
upstream loadbalancer {
    server 192.168.115.5:8080 weight=5;
    server 192.168.115.5:8081;
    server 192.168.115.5:8082 backup;
}

server {
    listen 80;
    location / {
        proxy_pass http://loadbalancer;
    }
}
app Dockerfile:
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
RUN pip install --upgrade pip
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . /app
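For completeness, a minimal main.py that this image would pick up (a sketch; by default the tiangolo image imports an app object from /app/main.py):

# main.py -- minimal FastAPI app served by the base image's
# gunicorn/uvicorn setup, which imports `app` from this module.
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"status": "ok"}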

DockerFile for a Python script with Firefox-based Selenium web driver, Flask & BS4 dependencies

Super new to Python, and I have never used Docker before. I want to host my Python script on Google Cloud Run, but I need to package it into a Docker container to submit to Google.
What exactly needs to go in this Dockerfile to upload to Google?
Current info:
Python: v3.9.1
Flask: v1.1.2
Selenium Web Driver: v3.141.0
Firefox Geckodriver: v0.28.0
Beautifulsoup4: v4.9.3
Pandas: v1.2.0
Let me know if further information about the script is required.
I have found the following snippets of code to use as a starting point from here. I just don't know how to adjust them to fit my specifications, nor do I know what 'gunicorn' is used for.
# Use the official Python image.
# https://hub.docker.com/_/python
FROM python:3.7
# Install manually all the missing libraries
RUN apt-get update
RUN apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils
# Install Chrome
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install
# Install Python dependencies.
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
# requirements.txt
Flask==1.0.2
gunicorn==19.9.0
selenium==3.141.0
chromedriver-binary==77.0.3865.40.0
Gunicorn is an application server for running your Python application instance; it is a pure-Python HTTP server for WSGI applications. It allows you to run a Python application concurrently by running multiple Python processes.
Please have a look at the following tutorial, which explains gunicorn in detail.
Regarding Cloud Run: to deploy, please follow the next steps or the Cloud Run Official Documentation:
1) Create a folder
2) In that folder, create a file named main.py and write your Flask code
Example of simple Flask code
import os
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    name = os.environ.get("NAME", "World")
    return "Hello {}!".format(name)

if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
3) Now your app is finished and ready to be containerized and uploaded to Container Registry
3.1) So to containerize your app, you need a Dockerfile in the same directory as the source files (main.py)
3.2) Now build your container image using Cloud Build by running the following command from the directory containing the Dockerfile:
gcloud builds submit --tag gcr.io/PROJECT-ID/FOLDER_NAME
where PROJECT-ID is your GCP project ID. You can get it by running gcloud config get-value project
4) Finally you can deploy to Cloud Run by executing the following command:
gcloud run deploy --image gcr.io/PROJECT-ID/FOLDER_NAME --platform managed
You can also have a look into the Google Cloud Run Official GitHub Repository for a Cloud Run Hello World Sample.

Can a Docker container (and its Python script) be a RESTful API?

I have a python script that will take a single argument. The script makes calls to 3rd party APIs, which obviously need to be server-side because of the API keys.
I want to make a website that makes an ajax call (or similar) to run the python script and give it some browser-side data.
I thought this would be a good project to use Docker for, but I can't find anything to indicate that this will work.
Is python not the right tool? Is Docker not the right tool?
Any help is greatly appreciated!
Edit (also in comments)
I've deployed a test image/service/cluster successfully on DigitalOcean, but I don't see how I can configure the endpoint to be RESTful
Edit (for source code)
To test the setup, I've been using a Docker-provided tutorial.
Dockerfile
#use an official Python runtime as a parent image
FROM python:3.6.2-slim
WORKDIR /app
ADD . /app
RUN pip install -r requirements.txt
EXPOSE 80
ENV NAME World
CMD ["python", "app.py"]
docker-compose.yml
version: "3"
services:
  web:
    # replace username/repo:tag with your name and image details
    image: username/repo:latest
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - "80:80"
    networks:
      - webnet
networks:
  webnet:
app.py
from flask import Flask
from redis import Redis, RedisError
import os
import socket

# Connect to Redis
redis = Redis(host="redis", db=0, socket_connect_timeout=2, socket_timeout=2)

app = Flask(__name__)

@app.route("/")
def hello():
    try:
        visits = redis.incr("counter")
    except RedisError:
        visits = "<i>cannot connect to Redis, counter disabled</i>"
    html = "<h3>Hello {name}!</h3>" \
           "<b>Hostname:</b> {hostname}<br/>" \
           "<b>Visits:</b> {visits}"
    return html.format(name=os.getenv("NAME", "world"), hostname=socket.gethostname(), visits=visits)

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=80)
It suddenly seems like I probably just need to change @app.route("/")...
Short answer: yes, it's possible.
I recommend you first run your Python code outside of a container and try to connect to the droplet before putting it in Docker, just to make sure it works.
Once you have the basic connection working, you can move it into a container and build out the REST API (or GraphQL, if you're into that).
Containers aren't RESTful; the app they host can be. It looks like you don't have any REST endpoints yet, though.
Anyways, you've mapped the port
ports:
  - "80:80"
And the app is running there:
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=80)
Assuming you have opened your firewall to allow port 80 on DigitalOcean, that's all you need.
Just fire up your http://digital-ocean.address
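As a sketch of what a REST endpoint could look like once you add one (the route, parameter, and the call into your script are placeholders):

from flask import jsonify

# Hypothetical REST endpoint that accepts browser-side data and hands
# it to the existing script; my_script stands in for your real code.
@app.route("/run/<arg>")
def run_script(arg):
    result = my_script(arg)  # server-side call that uses the API keys
    return jsonify(result=result)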
Is python not the right tool?
Use whatever you're comfortable with. For example, if you just want to make HTTP requests, plain shell scripts running cURL commands and echoing the HTML responses would work.
Is Docker not the right tool?
It's not required. Even if you have keys, placing them within a container just adds a layer of redirection, not obfuscation.
