mod-wsgi hangs during posting with https and large data packages - python

I created a https environment in win10 with apache24 + openssl(1.1.1) + mod-wsgi(4.5.24+ap24vc14).
It works well for http posting (no matter how big of the posting data package) but I met a problem for https posting.
For https posting:
when the client and the server are the same local machine, also works well no matter how big of the posting data package.
when the client is a different machine in the same domain, it also works well for small or medium posting data packages, maybe less than 3M, no precise number.
when the client is a different machine in the same domain and posts relatively big data packages, about 5 MB or 6 MB, then after several initial successful posts the program hangs at the server's body=environ['wsgi.input'].read(length), with no response and no error (occasionally it succeeds after a long delay, but mostly it will hang until the connection times out).
when debugging the client and the server, the runtime values of length are both correct and the same.
it seems body=environ['wsgi.input'].read(length) comes from sys.stdin.buffer.read(length), but I still can't find the root reason and a solution.
Client code:
import json
import base64

import requests
import requests.packages.urllib3.util.ssl_

# Broaden the cipher list so the client can complete the TLS handshake
# with the server's OpenSSL configuration.
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL'

url = "https://192.168.0.86"
# url = "http://192.168.0.86"

# Bug fix: the original opened the image file and never closed it.
# A context manager guarantees the handle is released.
with open("./PICs/20191024142412.jpg", 'rb') as f_img:
    # f_img=open("./PICs/20191023092645.jpg",'rb')  # alternate test image
    json_data = {
        'type': 'idpic',
        'image': str(base64.b64encode(f_img.read()), 'utf-8'),
    }

# verify=False disables certificate validation -- acceptable only for this
# local test setup, never in production.
result = requests.post(url, json=json_data, verify=False)
result_data = json.loads(result.content)
print(result_data)
Part of server codes:
class WSGICopyBody(object):
    """WSGI middleware that buffers the request body.

    The whole body is read once from ``wsgi.input``, stored in
    ``environ['body_copy']`` for later inspection by the application,
    and ``wsgi.input`` is replaced with a fresh ``BytesIO`` so the
    wrapped application can read the body again from the start.
    """

    def __init__(self, application):
        # The downstream WSGI application being wrapped.
        self.application = application

    def __call__(self, environ, start_response):
        # Local import kept to mirror the original's style; the unused
        # StringIO import was dropped.
        from io import BytesIO

        # CONTENT_LENGTH may be missing or an empty string; treat both
        # as a zero-length body.
        length = environ.get('CONTENT_LENGTH', '0')
        length = 0 if length == '' else int(length)

        body = environ['wsgi.input'].read(length)
        environ['body_copy'] = body
        # Give the wrapped app a rewound, re-readable body stream.
        environ['wsgi.input'] = BytesIO(body)

        app_iter = self.application(environ, self._sr_callback(start_response))
        return app_iter

    def _sr_callback(self, start_response):
        def callback(status, headers, exc_info=None):
            # Bug fix: propagate start_response's return value (the
            # legacy WSGI ``write`` callable per PEP 3333) instead of
            # silently dropping it.
            return start_response(status, headers, exc_info)
        return callback
# Create the Flask app and wrap its WSGI callable so every request body is
# copied into environ['body_copy'] before Flask consumes the input stream.
app = Flask(__name__)
app.wsgi_app = WSGICopyBody(app.wsgi_app)
# NOTE(review): the leading '#' on the two route lines is almost certainly a
# copy-paste mangling of the '@app.route(...)' decorators -- confirm against
# the original source.
#app.route('/',methods=['POST'])
#app.route('/picserver',methods=['POST'])
def picserver():
    # NOTE(review): this function is truncated in the snippet -- the body
    # continues (and presumably returns a JSON response) in the full source.
    print("before request.get_data")
    request_json_data = request.environ['body_copy']

Related

How can I fix ConnectTimeout exceptions from within FastAPI

I am looking to create a server that takes in a request, does some processing, and forwards the request to another endpoint. I seem to be running into an issue at higher concurrency where my client.post is causing a httpx.ConnectTimeout exception.
I haven't completely ruled out the possibility of an issue with the endpoint(I am currently working with them to debug anything that might be on their end), but I'm trying to figure out if there is something wrong on my end or if there are any glaring inefficiencies I can improve upon.
I am running this in ECS, currently on a cluster where tasks have 4 vCPUs. I am using the docker image uvicorn-gunicorn-fastapi(https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker). Currently all default settings minus the bind/port/logging. Here is a minimal code example:
import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()


def process_request(path, request):
    """Build the outbound payload (elided in the question's example)."""


def create_headers(path):
    """Build the outbound headers (elided in the question's example)."""


# The mangled '#app.get' comment was restored to the '@app.get' decorator.
@app.get('/')
async def root(path: str, request: Request):
    endpoint = 'https://endpoint.com/'
    querystring = 'path=' + path
    # Bug fix: the original called process_request(request, path, request)
    # (three args for a two-parameter function) and create_headers(request)
    # (the signature takes the path). Match the declared signatures.
    data = process_request(path, request)
    headers = create_headers(path)
    # NOTE(review): creating a new AsyncClient per request is costly under
    # high concurrency and uses the default 5s timeout -- a shared client
    # with an explicit timeout would reduce ConnectTimeout errors.
    async with httpx.AsyncClient() as client:
        await client.post(endpoint + "?" + querystring, data=data, headers=headers)
    return Response(status_code=200)
Could be that the server on the other side is taking too much and the connection simply times out because httpx doesn't give enough time to the other endpoint to complete the request?
If yes, you could try disabling timeout or increase the limit (which I suggest over disabling).
See https://www.python-httpx.org/quickstart/#timeouts

python flask+gevent doesn't release memory. limit.conf

#packages.
greenlet==0.4.11
Flask==0.11.1
#centos, /etc/security/limit.conf
* soft nofile 65535
* hard nofile 65535
This is my test codes (python 3.5) I ran this and watched memory usage.
At First, It started with 30MB memory with 3 threads.
But After sending bulk "/do" request on this server,
memory increased to 60MB with 12 threads. Even after all requests have completed, this memory usage does not change.
from gevent import monkey; monkey.patch_all(thread=False)

import gevent
import requests
from flask import Flask, request
from gevent.pywsgi import WSGIServer

app = Flask(__name__)


# The mangled '#app.route' comment was restored to the '@app.route'
# decorator -- without it the view is never registered.
@app.route("/do", methods=['GET', 'POST'])
def ping():
    """Accept a request, fire off a background request, reply 'pong'."""
    data = request.get_json()
    # Fire-and-forget: the spawned greenlet outlives this handler.
    gevent.spawn(send_request, data)
    return 'pong'


def send_request(data):
    """Forward *data* to the local /ping endpoint; log unexpected replies."""
    resp = requests.get("http://127.0.0.1:25000/ping", data=data)
    if resp.text != 'pong':
        app.logger.error(resp.text)


if __name__ == "__main__":
    http = WSGIServer(("0.0.0.0", 9999), app)
    # serve_forever() blocks; the two lines below run only after shutdown.
    http.serve_forever()
    end_server = True
    app.logger.info("Server will be closed")
I think this python uses all available 65535 file count.
How can I limit python to use less file count than I configured in limit.conf file?
Python does not seem to reuse sockets when it is busy, so it creates new sockets over and over again—up to the limit.conf nofile limit—when sending requests in spawned greenlets.
So, I just gave a limit for this python process.
# Cap this process's open-file-descriptor limit (soft, hard) at 1024 so
# runaway socket creation fails fast instead of climbing to the system-wide
# nofile limit configured in limit.conf.
import resource
resource.setrlimit(resource.RLIMIT_NOFILE, (1024, 1024))
== updated ==
But requests library still consumes a lot of memory..
I just decided to use tornado http server and AsyncHttpClient with this options below,
AsyncHTTPClient.configure("tornado.simple_httpclient.SimpleAsyncHTTPClient", max_clients=1000)
tornado.netutil.Resolver.configure("tornado.netutil.ThreadedResolver")
you need to write this code on global area below "import" stuffs.
and used gen.moment after finishing request to send it immediately.
# NOTE(review): '#gen.coroutine' is presumably a mangled '@gen.coroutine'
# decorator, and the enclosing tornado RequestHandler class is not shown in
# this snippet. '{..data..}' is a placeholder, not valid Python.
#gen.coroutine
def get(self):
    self.write("pong")
    self.finish()
    # yield gen.moment lets the response flush before the outbound fetch.
    yield gen.moment
    resp = yield self.application.http_client.fetch("...url...", method='POST', headers={"Content-Type": "application/json"},
                                                    body=json.dumps({..data..}))

Run function on Flask server every x seconds to update Redis cache without clients making separate calls

I currently have a flask app that makes a call to S3 as well as an external API with the following structure before rendering the data in javascript:
from flask import Flask, render_template, make_response
from flask import request
import requests
import requests_cache
import redis
from boto3.session import Session
import json

app = Flask(__name__)

# NOTE(review): api_link and limit are referenced below but never defined in
# this snippet -- they must come from elsewhere in the real module.


def _fetch_and_render(object_key):
    """Fetch *object_key* from S3 plus the external API and render the page.

    Extracted because /test and /test2 were identical except for the key.
    """
    bucket_root = 'testbucket'
    session = Session(
        aws_access_key_id='s3_key',
        aws_secret_access_key='s3_secret_key')
    s3 = session.resource('s3')
    # Bug fix: .read() returns bytes and json.dumps(bytes) raises TypeError
    # on Python 3 -- decode before serializing.
    testvalues = json.dumps(
        s3.Object(bucket_root, object_key).get()['Body'].read().decode('utf-8'))
    r = requests.get(api_link)
    return render_template('test_html.html', json_s3_test_response=r.content,
                           limit=limit, testvalues=testvalues)


# The mangled '#app.route' / '#app.errorhandler' comments were restored to
# their '@' decorator form -- without them nothing is registered.
@app.route('/test')
def test1():
    return _fetch_and_render('all1.json')


@app.route('/test2')
def test2():
    return _fetch_and_render('all2.json')


@app.errorhandler(500)
def internal_error(error):
    return "500 error"


@app.errorhandler(404)
def not_found(error):
    return "404 error", 404


@app.errorhandler(400)
def custom400(error):
    return "400 error", 400


# Catch-all handler for any otherwise-unhandled exception.
# (The original '//catch all?' line was not valid Python comment syntax.)
@app.errorhandler(Exception)
def all_exception_handler(error):
    return 'error', 500
Obviously I have a lot of inefficiencies here, but my main question is:
To me it seems like I'm calling S3 and the external API for each client, every time they refresh the page. This increases the chance for the app to crash due to timeouts (and my poor error handling) and diminishes performance. I would like to resolve this by periodically caching the S3 results (say every 10 mins) into a local redis server (already set up and running) as well as just pinging the external API just once from the server every few seconds before passing it onto ALL clients.
I have code that can store the data into redis every 10 mins in a regular python script, however, I'm not sure where to place this within the flask server? Do I put it as it's own function or keep the call to redis in the #app.route()?
Thank you everyone for your time and effort. Any help would be appreciated! I'm new to flask so some of this has been confusing.

Client certificates and mutual authentication in Python [duplicate]

I want to make a little update script for a software that runs on a Raspberry Pi and works like a local server. That should connect to a master server in the web to get software updates and also to verify the license of the software.
For that I set up two python scripts. I want these to connect via a TLS socket. Then the client checks the server certificate and the server checks if it's one of the authorized clients. I found a solution for this using twisted on this page.
Now there is a problem left. I want to know which client (depending on the certificate) is establishing the connection. Is there a way to do this in Python 3 with twisted?
I'm happy with every answer.
In a word: yes, this is quite possible, and all the necessary stuff is
ported to python 3 - I tested all the following under Python 3.4 on my Mac and it seems to
work fine.
The short answer is
"use twisted.internet.ssl.Certificate.peerFromTransport"
but given that a lot of set-up is required to get to the point where that is
possible, I've constructed a fully working example that you should be able to
try out and build upon.
For posterity, you'll first need to generate a few client certificates all
signed by the same CA. You've probably already done this, but so others can
understand the answer and try it out on their own (and so I could test my
answer myself ;-)), they'll need some code like this:
# newcert.py
# Generate a CA-signed client certificate using Twisted's ssl helpers.
from twisted.python.filepath import FilePath
from twisted.internet.ssl import PrivateCertificate, KeyPair, DN


def getCAPrivateCert():
    # Load the CA cert+key from disk if present; otherwise self-sign a new
    # 4096-bit CA named "the-authority" and persist it for later runs.
    privatePath = FilePath(b"ca-private-cert.pem")
    if privatePath.exists():
        return PrivateCertificate.loadPEM(privatePath.getContent())
    else:
        caKey = KeyPair.generate(size=4096)
        caCert = caKey.selfSignedCert(1, CN="the-authority")
        privatePath.setContent(caCert.dumpPEM())
        return caCert


def clientCertFor(name):
    # Issue a client certificate with CN=*name*, signed by the local CA.
    # NOTE(review): sha1 is used as the digest and every cert gets
    # serialNumber=1 -- fine for a demo, but confirm before real use.
    signingCert = getCAPrivateCert()
    clientKey = KeyPair.generate(size=4096)
    csr = clientKey.requestObject(DN(CN=name), "sha1")
    clientCert = signingCert.signRequestObject(
        csr, serialNumber=1, digestAlgorithm="sha1")
    return PrivateCertificate.fromCertificateAndKeyPair(clientCert, clientKey)


if __name__ == '__main__':
    import sys
    # Usage: python newcert.py <name> -> writes <name>.client.private.pem
    name = sys.argv[1]
    pem = clientCertFor(name.encode("utf-8")).dumpPEM()
    FilePath(name.encode("utf-8") + b".client.private.pem").setContent(pem)
With this program, you can create a few certificates like so:
$ python newcert.py a
$ python newcert.py b
Now you should have a few files you can use:
$ ls -1 *.pem
a.client.private.pem
b.client.private.pem
ca-private-cert.pem
Then you'll want a client which uses one of these certificates, and sends some
data:
# tlsclient.py
# TLS client that authenticates with a client certificate and sends a
# single greeting line to the demo server.
from twisted.python.filepath import FilePath
from twisted.internet.endpoints import SSL4ClientEndpoint
from twisted.internet.ssl import (
    PrivateCertificate, Certificate, optionsForClientTLS)
from twisted.internet.defer import Deferred, inlineCallbacks
from twisted.internet.task import react
from twisted.internet.protocol import Protocol, Factory


class SendAnyData(Protocol):
    """Send a greeting once connected; fire a Deferred when disconnected."""

    def connectionMade(self):
        self.deferred = Deferred()
        self.transport.write(b"HELLO\r\n")

    def connectionLost(self, reason):
        self.deferred.callback(None)


# Bug fix: '#inlineCallbacks' was a mangled decorator -- without the '@'
# the generator is never wrapped and main() would not drive the Deferreds.
@inlineCallbacks
def main(reactor, name):
    # Load this client's private cert and the CA cert used to verify the
    # server, then connect with mutual TLS on port 4321.
    pem = FilePath(name.encode("utf-8") + b".client.private.pem").getContent()
    caPem = FilePath(b"ca-private-cert.pem").getContent()
    clientEndpoint = SSL4ClientEndpoint(
        reactor, u"localhost", 4321,
        optionsForClientTLS(u"the-authority", Certificate.loadPEM(caPem),
                            PrivateCertificate.loadPEM(pem)),
    )
    proto = yield clientEndpoint.connect(Factory.forProtocol(SendAnyData))
    # Wait until the server closes the connection.
    yield proto.deferred


import sys
react(main, sys.argv[1:])
And finally, a server which can distinguish between them:
# whichclient.py
# TLS server that identifies connecting clients by the commonName in their
# client certificate.
from twisted.python.filepath import FilePath
from twisted.internet.endpoints import SSL4ServerEndpoint
from twisted.internet.ssl import PrivateCertificate, Certificate
from twisted.internet.defer import Deferred
from twisted.internet.task import react
from twisted.internet.protocol import Protocol, Factory


class ReportWhichClient(Protocol):
    def dataReceived(self, data):
        # The peer certificate is only reliably available after the TLS
        # handshake, i.e. once authenticated data has arrived -- which is
        # why this lives in dataReceived rather than connectionMade.
        peerCertificate = Certificate.peerFromTransport(self.transport)
        print(peerCertificate.getSubject().commonName.decode('utf-8'))
        self.transport.loseConnection()


def main(reactor):
    # For demo simplicity the CA's own cert doubles as the server cert.
    pemBytes = FilePath(b"ca-private-cert.pem").getContent()
    certificateAuthority = Certificate.loadPEM(pemBytes)
    myCertificate = PrivateCertificate.loadPEM(pemBytes)
    serverEndpoint = SSL4ServerEndpoint(
        reactor, 4321, myCertificate.options(certificateAuthority)
    )
    serverEndpoint.listen(Factory.forProtocol(ReportWhichClient))
    # Never-firing Deferred keeps the reactor running forever.
    return Deferred()


react(main, [])
For simplicity's sake we'll just re-use the CA's own certificate for the
server, but in a more realistic scenario you'd obviously want a more
appropriate certificate.
You can now run whichclient.py in one window, then python tlsclient.py a;
python tlsclient.py b in another window, and see whichclient.py print out
a and then b respectively, identifying the clients by the commonName
field in their certificate's subject.
The one caveat here is that you might initially want to put that call to
Certificate.peerFromTransport into a connectionMade method; that won't
work.
Twisted does not presently have a callback for "TLS handshake complete";
hopefully it will eventually, but until it does, you have to wait until you've
received some authenticated data from the peer to be sure the handshake has
completed. For almost all applications, this is fine, since by the time you
have received instructions to do anything (download updates, in your case) the
peer must already have sent the certificate.

Restricting POST requests to a maximum size on Pyramid

I am writing a web application with Pyramid and would like to restrict the maximum length for POST requests, so that people can't post huge amount of data and exhaust all the memory on the server. However I looked pretty much everywhere I could think of (Pyramid, WebOb, Paster) and couldn't find any option to accomplish this. I've seen that Paster has limits for the number of HTTP headers, length each header, etc., but I didn't see anything for the size of the request body.
The server will be accepting POST requests only for JSON-RPC, so I don't need to allow huge request body sizes. Is there a way in the Pyramid stack of accomplishing this?
Just in case this is not obvious from the rest, a solution which has to accept and load the whole request body into memory before checking the length and returning a 4xx error code defeats the purpose of what's I'm trying to do, and is not what I'm looking for.
Not really a direct answer to your question. As far as I know, you can create a wsgi app that will load the request if the body is below the configuration setting you can pass it to the next WSGI layer. If it goes above you can stop reading and return an error directly.
But to be honest, I really don't see the point to do it in pyramid. For example, if you run pyramid behind a reverse proxy with nginx or apache or something else.. you can always limit the size of the request with the frontend server.
unless you want to run pyramid with Waitress or Paster directly without any proxy, you should handle body size in the front end server that should be more efficient than python.
Edit
I did some research; it isn't a complete answer, but here is something that can be used, I guess. You have to read environ['wsgi.input'] as far as I can tell. This is a file-like object that receives chunks of data from nginx or apache, for example.
What you really have to do is read that file until the maximum length is reached. If it is reached, raise an error; if it isn't, continue the request.
You might want to have a look at this answer
You can do it in a variety of ways here's a couple of examples. one using wsgi middleware based on webob(installed when you install pyramid among other things). and one that uses pyramids event mechanism
"""
restricting execution based on request body size
"""
from pyramid.config import Configurator
from pyramid.view import view_config
from pyramid.events import NewRequest, subscriber
from webob import Response, Request
from webob.exc import HTTPBadRequest
import unittest
def restrict_body_middleware(app, max_size=0):
    """WSGI middleware that rejects requests whose body exceeds *max_size*.

    This is straight WSGI middleware and in this case only depends on
    WebOb, so it can be used with any WSGI-compliant web framework.
    """
    def m(environ, start_response):
        r = Request(environ)
        # Bug fix: Request.content_length is None when the header is
        # absent, and ``None <= int`` raises TypeError on Python 3 --
        # treat a missing length as zero.
        content_length = r.content_length or 0
        if content_length <= max_size:
            return r.get_response(app)(environ, start_response)
        else:
            err_body = """
            request content_length(%s) exceeds
            the configured maximum content_length allowed(%s)
            """ % (content_length, max_size)
            res = HTTPBadRequest(err_body)
            return res(environ, start_response)
    return m
def new_request_restrict(event):
    """
    pyramid event handler called whenever there is a new request
    received
    http://docs.pylonsproject.org/projects/pyramid/en/1.2-branch/narr/events.html
    """
    # NOTE(review): the condition is ``content_length >= 0`` so this vetoes
    # essentially every request -- intentional in this demo, since the unit
    # test only checks that the subscriber can reject a request with a 400.
    request = event.request
    if request.content_length >= 0:
        raise HTTPBadRequest("too big")
# Bug fix: '#view_config()' was a mangled '@view_config()' decorator;
# without it config.scan() never registers this view.
@view_config()
def index(request):
    """Trivial view used by the tests; returns a fixed body."""
    return Response("HI THERE")
def make_application():
    """Build the WSGI application, discovering views via config.scan()."""
    configurator = Configurator()
    configurator.scan()
    return configurator.make_wsgi_app()
def make_application_with_event():
    """Build the application with new_request_restrict subscribed to the
    NewRequest event (no views are scanned here)."""
    configurator = Configurator()
    configurator.add_subscriber(new_request_restrict, NewRequest)
    return configurator.make_wsgi_app()
def make_application_with_middleware():
    """Build the application and wrap it in the body-size-restricting
    WSGI middleware (default max_size of 0 rejects any non-empty body)."""
    inner_app = make_application()
    return restrict_body_middleware(inner_app)
class TestWSGIApplication(unittest.TestCase):
    """Exercise the unrestricted app plus both restriction mechanisms.

    Bug fix: on Python 3, WebOb's Request.blank requires a *bytes* body;
    the original passed str. The deprecated ``assert_`` alias was also
    replaced with assertGreater/assertEqual for clearer failures.
    """

    # Shared request body for all three tests.
    BODY = b"i am a request with a body"

    def testNoRestriction(self):
        app = make_application()
        request = Request.blank("/", body=self.BODY)
        self.assertGreater(request.content_length, 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 200,
                         "expected status code 200 got %s" % response.status_int)

    def testRestrictedByMiddleware(self):
        app = make_application_with_middleware()
        request = Request.blank("/", body=self.BODY)
        self.assertGreater(request.content_length, 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 400,
                         "expected status code 400 got %s" % response.status_int)

    def testRestrictedByEvent(self):
        app = make_application_with_event()
        request = Request.blank("/", body=self.BODY)
        self.assertGreater(request.content_length, 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 400,
                         "expected status code 400 got %s" % response.status_int)


if __name__ == "__main__":
    unittest.main()

Categories