I am trying to run a machine learning inference server in a Docker container on AWS SageMaker, using Flask, Nginx, and Gunicorn. I have tried running it on a c5.xlarge instance and a c5.4xlarge instance on SageMaker, and it always breaks on the c5.xlarge instance.
The health-check request loads the ML model, which is around 300 MB. When the inference endpoint is called, it checks whether the model is up and running in the worker, loads it if not, and then runs the prediction on the request data. I usually call the model with <= 5 MB of data.
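A minimal sketch of the lazy-loading pattern described above (the model class, load function, and path are hypothetical placeholders, not the actual serving code):

import flask

app = flask.Flask(__name__)
model = None  # loaded lazily, once per Gunicorn worker


class _DummyModel:  # placeholder for the real ~300 MB model
    def predict(self, payload):
        return b"ok"


def load_model(model_dir):
    # placeholder: real code would deserialize the model artifact here
    return _DummyModel()


def get_model():
    global model
    if model is None:
        model = load_model("/opt/ml/model")
    return model


@app.route("/ping", methods=["GET"])
def ping():
    # Health check: loading a large model here must finish well
    # within SageMaker's 60-second response limit.
    return flask.Response(status=200 if get_model() is not None else 404)


@app.route("/invocations", methods=["POST"])
def invocations():
    return flask.Response(response=get_model().predict(flask.request.data),
                          status=200, mimetype="application/octet-stream")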
Nginx Config:
worker_processes auto;
daemon off; # Prevent forking
pid /tmp/nginx.pid;
error_log /var/log/nginx/error.log;

events {
    # defaults
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    access_log /var/log/nginx/access.log combined;

    upstream gunicorn {
        server unix:/tmp/gunicorn.sock;
    }

    server {
        listen 8080 deferred;
        client_max_body_size 5m;
        keepalive_timeout 10000;

        location ~ ^/(ping|invocations) {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_pass http://gunicorn;
        }

        location / {
            return 404 "{}";
        }
    }
}
Gunicorn:
import subprocess

# model_server_timeout and model_server_workers are configured elsewhere
# (e.g. read from environment variables) before this runs.
subprocess.Popen(['gunicorn',
                  '--timeout', str(model_server_timeout),
                  '-k', 'gevent',
                  '-b', 'unix:/tmp/gunicorn.sock',
                  '-w', str(model_server_workers),
                  '--error-logfile', '-',
                  '--access-logfile', '-',
                  '--preload',
                  'wsgi:app'])
I have looked at the timeout (it is already set to 60 seconds for Gunicorn), tried preloading the app, and the only error in the logs sent to stdout is "upstream prematurely closed connection while reading response".
How long does your container usually take to respond to requests? If you use the container in a hosted endpoint, it has to respond to requests within 60 seconds. It might be helpful to set the Gunicorn timeout a little lower than 60 seconds.
https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html
It looks like the response time depends on the instance type. If that is the case, and you do want to use the c5.xlarge instance type, you can try creating a batch transform job instead of a real-time inference endpoint. A batch transform job allows more than 60 seconds of response time per request.
https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateTransformJob.html
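To illustrate that suggestion, a batch transform job can be created with boto3 roughly as below; the job name, model name, and S3 URIs are placeholders, and the model must already be registered in SageMaker:

import boto3

sm = boto3.client("sagemaker")

# All names and S3 URIs here are placeholders.
sm.create_transform_job(
    TransformJobName="my-batch-inference",
    ModelName="my-registered-model",
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/inference-input/",
            }
        },
        "ContentType": "application/json",
    },
    TransformOutput={"S3OutputPath": "s3://my-bucket/inference-output/"},
    TransformResources={"InstanceType": "ml.c5.xlarge", "InstanceCount": 1},
)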
Hope this helps!
-Han
Related
I have a Python Flask application which I run behind NGINX, and I have dockerised this setup into one image. By default the Flask application runs on port 5000 and NGINX on port 80. If I run the image in a container, all the services work fine: I can access the services through NGINX on port 80, which is internally mapped to Flask's port 5000.
Now I want to add a health check for this image, so I am using the py-healthcheck module in the Flask application like this:
from flask import Flask
from healthcheck import HealthCheck

app = Flask(__name__)
health = HealthCheck()

def redis_available():
    return True, "UP"

health.add_check(redis_available)
app.add_url_rule("/health", "healthcheck", view_func=lambda: health.run())
Now, if I run only the Flask application (without NGINX, on my local system) and open the URL
http://localhost:5000/health
I get the proper response saying the application is up.
To add the health check to the image, I added this command to the Dockerfile:
HEALTHCHECK --interval=30s --timeout=120s --retries=3 CMD wget --no-check-certificate --quiet --tries=1 --spider https://localhost:80/health || exit 1
Here I am assuming that I am accessing the health-check endpoint through NGINX; that is why I am using localhost:80. But when I run the container, it is always unhealthy, even though all the endpoints work fine. Do I have to do some configuration in the NGINX conf file in order to access the Flask health-check endpoint through NGINX?
Here is the nginx config:
# based on default config of nginx 1.12.1
# Define the user that will own and run the Nginx server
user nginx;

# Define the number of worker processes; recommended value is the number of
# cores that are being used by your server
# auto will default to number of vcpus/cores
worker_processes auto;

# altering default pid file location
pid /tmp/nginx.pid;

# turn off daemon mode to be watched by supervisord
daemon off;

# Enables the use of JIT for regular expressions to speed-up their processing.
pcre_jit on;

# events block defines the parameters that affect connection processing.
events {
    # Define the maximum number of simultaneous connections that can be opened by a worker process
    worker_connections 1024;
}

# http block defines the parameters for how NGINX should handle HTTP web traffic
http {
    # Include the file defining the list of file types that are supported by NGINX
    include /opt/conda/envs/analytics_service/etc/nginx/mime.types;

    # Define the default file type that is returned to the user
    default_type text/html;

    # Don't tell nginx version to clients.
    server_tokens off;

    # Specifies the maximum accepted body size of a client request, as
    # indicated by the request header Content-Length. If the stated content
    # length is greater than this size, then the client receives the HTTP
    # error code 413. Set to 0 to disable.
    client_max_body_size 0;

    # Define the format of log messages.
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    # Define the location of the log of access attempts to NGINX
    access_log /opt/conda/envs/analytics_service/etc/nginx/access.log main;

    # Define the location on the file system of the error log, plus the minimum
    # severity to log messages for
    error_log /opt/conda/envs/analytics_service/etc/nginx/error.log warn;

    # Define the parameters to optimize the delivery of static content
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    # Define the timeout value for keep-alive connections with the client
    keepalive_timeout 65;

    # Define the usage of the gzip compression algorithm to reduce the amount of data to transmit
    #gzip on;

    # Include additional parameters for virtual host(s)/server(s)
    include /opt/conda/envs/analytics_service/etc/nginx/conf.d/*.conf;
}
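One detail worth checking (an observation based on the config above, not a verified fix): the HEALTHCHECK probes https://localhost:80/health, but the NGINX config shown does not terminate TLS, so an HTTPS request to port 80 will fail regardless of the upstream. A sketch of a plain-HTTP probe, together with a location block that forwards /health to Flask (assuming Flask listens on 127.0.0.1:5000), could look like this:

HEALTHCHECK --interval=30s --timeout=120s --retries=3 \
  CMD wget --quiet --tries=1 --spider http://localhost:80/health || exit 1

# in the server block of the virtual-host config included above:
location /health {
    # forward the Docker health probe to the Flask app
    proxy_pass http://127.0.0.1:5000/health;
}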
I have a Flask (with Flask-restplus) app running locally on port 5000. When I launch the app locally and go to the URL below, I can see the Swagger UI.
http://localhost:5000/api/#
But when I run it behind NGINX and gunicorn and go to
http://localhost:81/api/#
I get the error below:
Can't read from server. It may not have the appropriate access-control-origin settings.
When I look at the Chrome error, I see a request being made to http://localhost/api/swagger.json. Could this be the problem, since the NGINX container is running on port 81?
$(function () {
    window.swaggerUi = new SwaggerUi({
        url: "http://localhost/api/swagger.json",
        validatorUrl: "" || null,
        dom_id: "swagger-ui-container",
        supportedSubmitMethods: ['get', 'post', 'put', 'delete', 'patch'],
        onComplete: function (swaggerApi, swaggerUi) {
        },
        onFailure: function (data) {
            log("Unable to Load SwaggerUI"); // <<<-- This is where it breaks
        },
    });
    window.swaggerUi.load();
});
However, I am able to make a Postman request to my API through http://localhost:81/api/centres/1 and I get the expected data.
After googling for the last three days, the options are:
Send CORS headers on the response. I would rather not, as it is a security risk.
Configure NGINX to redirect requests to the correct URL (http://flask.pocoo.org/snippets/35/).
This is what my server config looks like
server {
    listen 81;
    charset utf-8;

    # Configure NGINX to reverse proxy HTTP requests to the upstream
    # server (Gunicorn (WSGI server))
    location / {
        # Define the location of the proxy server to send the request to
        proxy_pass http://web:8000;
        proxy_redirect http://localhost/api http://localhost:81/api;

        # Redefine the header fields that NGINX sends to the upstream server
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Scheme $scheme;
    }
}
It still doesn't show me the Swagger UI. I'm a newbie to this world of Docker, NGINX, and gunicorn. How can I fix this issue?
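One hedged observation on the config above: proxy_set_header Host $host strips the port, so the app behind the proxy builds absolute URLs (such as the swagger.json URL) without :81, which matches the failing request shown earlier. A sketch of the alternative, using $http_host to preserve the client's port:

location / {
    proxy_pass http://web:8000;
    # $http_host keeps the port the client used (localhost:81), so
    # Flask-restplus generates the swagger.json URL with the right port
    proxy_set_header Host $http_host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}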
I am trying to run two Django projects on a single VPS server, under two different domains. I am using gunicorn as my application server.
I have created two virtualenvs, two supervisor configs, and two separate files in sites-available, each enabled in sites-enabled.
The projects run, but the problem is that only one project is served at a time, on both domains. Although my nginx sites-available files set different domains as server_name, one Django project is still served on both domains.
Can anyone help?
/etc/nginx/sites-available/VaidText
upstream sample_project_server {
    # fail_timeout=0 means we always retry an upstream even if it failed
    # to return a good HTTP response (in case the Unicorn master nukes a
    # single worker for timing out).
    server unix:/home/example/test.example.com/TestEnv/run/gunicorn.sock fail_timeout=0;
}

server {
    listen 80;
    server_name test.example.com;
    client_max_body_size 4G;

    access_log /home/example/test.example.com/logs/nginx-access.log;
    error_log /home/example/test.example.com/logs/nginx-error.log;

    location /static/ {
        alias /home/ubuntu/static/;
    }

    location /media/ {
        alias /home/ubuntu/media/;
    }

    location / {
        # an HTTP header important enough to have its own Wikipedia entry:
        # http://en.wikipedia.org/wiki/X-Forwarded-For
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # enable this if and only if you use HTTPS, this helps Rack
        # set the proper protocol for doing redirects:
        # proxy_set_header X-Forwarded-Proto https;

        # pass the Host: header from the client right along so redirects
        # can be set properly within the Rack application
        proxy_set_header Host $http_host;

        # we don't want nginx trying to do something clever with
        # redirects, we set the Host: header above already.
        proxy_redirect off;

        # set "proxy_buffering off" *only* for Rainbows! when doing
        # Comet/long-poll stuff. It's also safe to set if you're
        # only serving fast clients with Unicorn + nginx.
        # Otherwise you _want_ nginx to buffer responses to slow
        # clients, really.
        # proxy_buffering off;

        # Try to serve static files from nginx, no point in making an
        # *application* server like Unicorn/Rainbows! serve static files.
        if (!-f $request_filename) {
            proxy_pass http://sample_project_server;
            break;
        }
    }

    # Error pages
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /home/ubuntu/static/;
    }
}
/etc/nginx/sites-available/SheikhText
upstream sample_project_server {
    # fail_timeout=0 means we always retry an upstream even if it failed
    # to return a good HTTP response (in case the Unicorn master nukes a
    # single worker for timing out).
    server unix:/home/example/sheikhnoman.example.com/SheikhEnv/run/gunicorn.sock fail_timeout=0;
}

server {
    listen 80;
    server_name sheikhnoman.example.com;
    client_max_body_size 4G;

    access_log /home/example/sheikhnoman.example.com/logs/nginx-access.log;
    error_log /home/example/sheikhnoman.example.com/logs/nginx-error.log;

    location /static/ {
        alias /home/ubuntu/sheikhnoman/static/;
    }

    location /media/ {
        alias /home/ubuntu/sheikhnoman/media/;
    }

    location / {
        # an HTTP header important enough to have its own Wikipedia entry:
        # http://en.wikipedia.org/wiki/X-Forwarded-For
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # enable this if and only if you use HTTPS, this helps Rack
        # set the proper protocol for doing redirects:
        # proxy_set_header X-Forwarded-Proto https;

        # pass the Host: header from the client right along so redirects
        # can be set properly within the Rack application
        proxy_set_header Host $http_host;

        # we don't want nginx trying to do something clever with
        # redirects, we set the Host: header above already.
        proxy_redirect off;

        # set "proxy_buffering off" *only* for Rainbows! when doing
        # Comet/long-poll stuff. It's also safe to set if you're
        # only serving fast clients with Unicorn + nginx.
        # Otherwise you _want_ nginx to buffer responses to slow
        # clients, really.
        # proxy_buffering off;

        # Try to serve static files from nginx, no point in making an
        # *application* server like Unicorn/Rainbows! serve static files.
        if (!-f $request_filename) {
            proxy_pass http://sample_project_server;
            break;
        }
    }

    # Error pages
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /home/ubuntu/sheikhnoman/static/;
    }
}
Your two nginx config files contain the same upstream name, "sample_project_server". Try setting a different upstream name in each file.
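As an illustrative sketch (the upstream names below are arbitrary), each file would declare and reference its own upstream:

# /etc/nginx/sites-available/VaidText
upstream vaidtext_server {
    server unix:/home/example/test.example.com/TestEnv/run/gunicorn.sock fail_timeout=0;
}
# ... and in its location / block:
#     proxy_pass http://vaidtext_server;

# /etc/nginx/sites-available/SheikhText
upstream sheikhtext_server {
    server unix:/home/example/sheikhnoman.example.com/SheikhEnv/run/gunicorn.sock fail_timeout=0;
}
# ... and in its location / block:
#     proxy_pass http://sheikhtext_server;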
I have my Django app set up on a gunicorn server that is proxied by nginx (which also serves static files), and nginx is "intercepting" the GET request carrying the credentials code from Google. Why is nginx swallowing the request instead of passing it to gunicorn to be processed?
Here is my API info for my web application:
Client ID:
67490467925-v76j4e7bcdrps3ve37q41bnrtjm3jclj.apps.googleusercontent.com
Email address:
67490467925-v76j4e7bcdrps3ve37q41bnrtjm3jclj@developer.gserviceaccount.com
Client secret:
XquTw495rlwsHOodhWk
Redirect URIs: http://www.quickerhub.com
JavaScript origins: https://www.quickerhub.com
and here is the perfect GET request being stolen by nginx:
http://www.quickerhub.com/?code=4/bzqKIpj3UA3bBiyJfQzi3svzPBLZ.QoB_rXWZ6hUbmmS0T3UFEsPMOFF4fwI
and of course sweet nginx is giving me the "Welcome to nginx!" page...
Is there a way to tell nginx to pass these requests on to gunicorn? Or am I doing something incorrectly?
Thanks!
NGINX vhost config:
upstream interest_app_server {
    # fail_timeout=0 means we always retry an upstream even if it failed
    # to return a good HTTP response (in case the Unicorn master nukes a
    # single worker for timing out).
    server unix:/webapps/hello_django/run/gunicorn.sock fail_timeout=0;
}

server {
    listen 80;
    server_name quickerhub.com;
    client_max_body_size 4G;

    access_log /webapps/hello_django/logs/nginx-access.log;
    error_log /webapps/hello_django/logs/nginx-error.log;

    location /static {
        root /webapps/hello_django/interest/;
    }

    location /media {
        root /webapps/hello_django/interest/;
    }

    location /static/admin {
        root /webapps/hello_django/lib/python2.7/site-packages/django/contrib/admin/;
    }

    location / {
        # an HTTP header important enough to have its own Wikipedia entry:
        # http://en.wikipedia.org/wiki/X-Forwarded-For
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # enable this if and only if you use HTTPS, this helps Rack
        # set the proper protocol for doing redirects:
        # proxy_set_header X-Forwarded-Proto https;

        # pass the Host: header from the client right along so redirects
        # can be set properly within the Rack application
        proxy_set_header Host $http_host;

        # we don't want nginx trying to do something clever with
        # redirects, we set the Host: header above already.
        proxy_redirect off;

        # set "proxy_buffering off" *only* for Rainbows! when doing
        # Comet/long-poll stuff. It's also safe to set if you're
        # only serving fast clients with Unicorn + nginx.
        # Otherwise you _want_ nginx to buffer responses to slow
        # clients, really.
        # proxy_buffering off;

        # Try to serve static files from nginx, no point in making an
        # *application* server like Unicorn/Rainbows! serve static files.
        if (!-f $request_filename) {
            proxy_pass http://interest_app_server;
            break;
        }
    }

    # Error pages
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /webapps/hello_django/interest/templates/;
    }
}
You have
server_name quickerhub.com;
but the GET request is coming back to
http://www.quickerhub.com/?code=4/bzqKIpj3UA3bBiyJfQzi3svzPBLZ.QoB_rXWZ6hUbmmS0T3UFEsPMOFF4fwI
quickerhub.com != www.quickerhub.com, so nginx is falling through to serving the default page (for when it can't find a matching vhost).
All you need to do is use
server_name www.quickerhub.com quickerhub.com;
or, even better, add this to "canonicalise" all your URLs to the version without www:
server {
    server_name www.quickerhub.com;
    expires epoch;
    add_header Cache-Control "no-cache, public, must-revalidate, proxy-revalidate";
    rewrite ^ http://quickerhub.com$request_uri permanent;
}
I've got this error in nginx version 1.0.0:
nginx: [emerg] unknown directive "user" in /etc/nginx/sites-enabled/tornado:1
If I remove "user www-data", the worker_processes directive gets the error:
nginx: [emerg] unknown directive "worker_processes" in /etc/nginx/sites-enabled/tornado:1
I've searched on Google but still got nothing. Please help.
This is my tornado file in sites-available:
user www-data www-data;
worker_processes 1;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
}

http {
    # Enumerate all the Tornado servers here
    upstream frontends {
        server 127.0.0.1:8081;
        server 127.0.0.1:8082;
        server 127.0.0.1:8083;
        server 127.0.0.1:8084;
    }

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    access_log /var/log/nginx/access.log;

    keepalive_timeout 65;
    proxy_read_timeout 200;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    gzip on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/html text/css text/xml
               application/x-javascript application/xml
               application/atom+xml text/javascript;

    # Only retry if there was a communication error, not a timeout
    # on the Tornado server (to avoid propagating "queries of death"
    # to all frontends)
    proxy_next_upstream error;

    server {
        listen 8080;

        # Allow file uploads
        client_max_body_size 50M;

        location ^~ /static/ {
            root /var/www;
            if ($query_string) {
                expires max;
            }
        }
        location = /favicon.ico {
            rewrite (.*) /static/favicon.ico;
        }
        location = /robots.txt {
            rewrite (.*) /static/robots.txt;
        }
        location / {
            proxy_pass_header Server;
            proxy_set_header Host $http_host;
            proxy_redirect false;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_pass http://frontends;
        }
    }
}
Probably a bit overdue, but if anyone stumbles on this, here's a hint: it is probably a config collision; check /etc/nginx for a .conf file with the same directive.
Also worth checking is whether nginx.conf has an "include" line. It's very common and is a source of collisions.
For example:
evan@host:~$ cat /etc/nginx/nginx.conf | grep include
include /etc/nginx/mime.types;
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
In this case, a directive in /etc/nginx/sites-enabled/ will clash with the contents of nginx.conf. Make sure you don't double up on anything between the included files.
Just to elaborate on Kjetil M.'s answer, since it worked for me but I did not understand what he meant at first. It wasn't until after a lot of attempts that I fixed the problem and had an "oh, that's what he meant" moment.
If your /etc/nginx/nginx.conf file and one of the other config files in /etc/nginx/sites-enabled/ use the same directive, such as "user", you will run into this error. Just make sure only one version is active and comment out the others.
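A sketch of that split, using the paths from the question (illustrative, not a drop-in config): main-context directives such as user and worker_processes belong in /etc/nginx/nginx.conf, while files under sites-enabled/ are included inside the http block and may contain only http-level directives such as upstream and server.

# /etc/nginx/nginx.conf -- main context only
user www-data;
worker_processes 1;
events {
    worker_connections 1024;
}
http {
    include /etc/nginx/sites-enabled/*;
}

# /etc/nginx/sites-enabled/tornado -- http context only
upstream frontends {
    server 127.0.0.1:8081;
}
server {
    listen 8080;
    location / {
        proxy_pass http://frontends;
    }
}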
The worker_* directives must be at the top of the configuration (with worker_connections inside the events block), which means they must be in /etc/nginx/nginx.conf.
Example, my first lines are:
user www-data;
worker_processes 4;
events {
    # worker_connections lives inside the events block
    worker_connections 1024;
}
If you want to know how many workers are best for your server, you can run this command:
grep processor /proc/cpuinfo | wc -l
This tells you how many cores you have; it doesn't make sense to have more workers than cores for a website.
If you want to know how many connections your workers can handle, you can use this:
ulimit -n
Hope it helps.
I was getting the same error, but when I started nginx with the -c option, as in
nginx -c conf.d/myapp.conf
it worked fine.
Another thing: if you've created the config file on Windows and are using it on Linux, make sure the line endings are correct ("\r\n" vs. "\n") and that the file is not stored as Unicode.
In my case, the error message appeared to show a space before "user", even though there was no space there:
nginx: [emerg] unknown directive " user" in /etc/nginx/nginx.conf:1
It turned out that two of my .conf files had a BOM at the beginning of the file. Removing the BOM fixed the issue.
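A few standard shell commands for diagnosing both of these issues (adjust the paths to your own files):

file /etc/nginx/nginx.conf      # reports CRLF line terminators and/or a BOM if present
dos2unix /etc/nginx/nginx.conf  # convert CRLF line endings to LF in place
sed -i '1s/^\xEF\xBB\xBF//' /etc/nginx/nginx.conf  # strip a UTF-8 BOM from the first line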