Python AIOHTTP.web server multiprocessing load-balancer?

I am currently developing a web app using the aiohttp module. I'm using:
aiohttp.web, asyncio, uvloop, aiohttp_session, aiohttp_security, aiomysql, and aioredis
I have run some benchmarks against it, and while they're pretty good, I can't help but want more. I know that Python is, by nature, single-threaded. aiohttp uses async I/O to be non-blocking, but am I correct in assuming that it is not utilizing all CPU cores?
My idea: run multiple instances of my aiohttp.web code via concurrent.futures in multiprocessing mode. Each process would serve the site on a different port, and I would then put a load balancer in front of them. MySQL and Redis can be used to share state where necessary, such as for sessions. In rough code, the idea is sketched below.
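Just a sketch; app_factory() is a placeholder for whatever builds the aiohttp application:
# Sketch of the idea: one worker process per port, each running its own
# event loop. app_factory() is a placeholder for the real application setup.
from concurrent.futures import ProcessPoolExecutor
from aiohttp import web

def serve(port):
    web.run_app(app_factory(), host="127.0.0.1", port=port)

if __name__ == "__main__":
    ports = [8081, 8082, 8083, 8084]  # the load balancer would front these
    with ProcessPoolExecutor(max_workers=len(ports)) as pool:
        for port in ports:
            pool.submit(serve, port)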
Question: given a server with several CPU cores, will this result in the desired performance increase? If so, is there any specific pattern to follow in order to avert problems? I can't think of anything these aio modules are doing that would require a single thread, though I could be wrong.
Note: this is not a subjective question as I've posed it. Either the module is currently bound to one thread/process or it isn't, and either it can benefit from a multiprocessing module plus a load balancer or it can't.

You're right: asyncio uses only one CPU. (One event loop runs in one thread and thus on one CPU.)
Whether your whole project is network-bound or CPU-bound is something I can't say. You have to try.
You could use nginx or haproxy as the load balancer.
You might even try using no load balancer at all. I never tried this feature for load balancing, just as a proof of concept for a fail-over system.
With recent kernels, multiple processes can listen on the same port (when using the SO_REUSEPORT option), and I guess it's the kernel that does the round-robin.
Here is a link to an article comparing the performance of a typical nginx configuration vs. an nginx setup with the SO_REUSEPORT feature:
https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/
It seems SO_REUSEPORT distributes the CPU load rather evenly, but may increase the variation of response times. Not sure this is relevant in your setup, but I thought I'd let you know.
Added 2020-02-04:
My solution added 2019-12-09 works, but triggers a deprecation warning.
When I have more time to test it myself, I will post the improved solution here. For the time being you can find it at AIOHTTP - Application.make_handler(...) is deprecated - Adding Multiprocessing
Added 2019-12-09:
Here is a small example of an HTTP server that can be started multiple times, listening on the same socket.
The kernel would distribute the tasks. I never checked whether this is efficient or not, though.
reuseport.py:
#!/usr/bin/env python3
import asyncio
import os
import socket
import time

from aiohttp import web


def mk_socket(host="127.0.0.1", port=8000, reuseport=False):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuseport:
        # Fall back to the raw Linux value if the socket module
        # doesn't expose SO_REUSEPORT on this platform.
        SO_REUSEPORT = getattr(socket, "SO_REUSEPORT", 15)
        sock.setsockopt(socket.SOL_SOCKET, SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


async def handle(request):
    name = request.match_info.get('name', "Anonymous")
    pid = os.getpid()
    text = "{:.2f}: Hello {}! Process {} is treating you\n".format(
        time.time(), name, pid)
    time.sleep(0.5)  # intentionally blocking sleep to simulate CPU load
    return web.Response(text=text)


if __name__ == '__main__':
    host = "127.0.0.1"
    port = 8000
    reuseport = True
    app = web.Application()
    sock = mk_socket(host, port, reuseport=reuseport)
    app.add_routes([web.get('/', handle),
                    web.get('/{name}', handle)])

    loop = asyncio.get_event_loop()
    coro = loop.create_server(
        protocol_factory=app.make_handler(),  # deprecated, see note above
        sock=sock,
    )
    srv = loop.run_until_complete(coro)
    loop.run_forever()
And one way to test it:
./reuseport.py & ./reuseport.py &
sleep 2 # sleep a little so servers are up
for n in 1 2 3 4 5 6 7 8 ; do wget -q http://localhost:8000/$n -O - & done
The output might look like:
1575887410.91: Hello 1! Process 12635 is treating you
1575887410.91: Hello 2! Process 12633 is treating you
1575887411.42: Hello 5! Process 12633 is treating you
1575887410.92: Hello 7! Process 12634 is treating you
1575887411.42: Hello 6! Process 12634 is treating you
1575887411.92: Hello 4! Process 12634 is treating you
1575887412.42: Hello 3! Process 12634 is treating you
1575887412.92: Hello 8! Process 12634 is treating you

I think it is better not to reinvent the wheel and to use one of the solutions proposed in the documentation:
https://docs.aiohttp.org/en/stable/deployment.html#nginx-supervisord
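For reference, the nginx side of that setup boils down to an upstream block fanning out to the aiohttp worker processes, along the lines of the example in those docs (ports here are illustrative):
upstream aiohttp {
    # One entry per aiohttp worker process
    server 127.0.0.1:8081 fail_timeout=0;
    server 127.0.0.1:8082 fail_timeout=0;
}

server {
    listen 80;

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_redirect off;
        proxy_buffering off;
        proxy_pass http://aiohttp;
    }
}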

Related

Communication between Python Scripts

I have 2 Python scripts. The 1st is a Flask server and the 2nd is an NRF24L01 receiver/transmitter script (on a Raspberry Pi 3). Both scripts are running at the same time. I want to pass variables (the variables are not constant) between these 2 scripts. How can I do that in the simplest way?
How about a Python RPC setup? I.e., run a server in each script, so that each script can also be a client and invoke remote procedure calls on the other.
https://docs.python.org/2/library/simplexmlrpcserver.html#simplexmlrpcserver-example
I'd like to propose a complete solution based on Sush's proposition. For the last few days I've been struggling with the problem of communicating between two processes run separately (in my case, on the same machine). There are lots of solutions (sockets, RPC, simple RPC or other servers) but all of them had some limitations. What worked for me was the SimpleXMLRPCServer module. Fast, reliable, and better than direct socket operations in every aspect. A fully functioning server which can be cleanly closed from the client is as short as this:
from SimpleXMLRPCServer import SimpleXMLRPCServer

quit_please = 0

s = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)  # allow_none enables methods without a return value
s.register_introspection_functions()  # enables use of s.system.listMethods()
s.register_function(pow)  # example of a function natively supported by Python, exposed as a server method

# Register a function under a different name
def example_method(x):
    # whatever needs to be done goes here
    return 'Entered value is ' + x
s.register_function(example_method, 'example')

def kill():
    global quit_please
    quit_please = 1
s.register_function(kill)

while not quit_please:
    s.handle_request()
My main help was a 15-year-old article found here.
Also, a lot of tutorials use s.serve_forever(), which is a real pain to stop cleanly without multithreading.
To communicate with the server all you need to do is basically 2 lines:
import xmlrpclib
serv = xmlrpclib.ServerProxy('http://localhost:8000')
Example:
>>> import xmlrpclib
>>> serv = xmlrpclib.ServerProxy('http://localhost:8000')
>>> serv.example('Hello world')
'Entered value is Hello world'
And that's it! Fully functional, fast and reliable communication. I am aware that there are always some improvements to be done but for most cases this approach will work flawlessly.

Python SSH server (twisted.conch) has high CPU usage when there is a large amount of echo

I wrote an SSH server with Twisted Conch, but I've encountered a difficult problem. Assume that users A and B log in to the Twisted SSH server through the ssh command. Then user A tails or cats a large file (greater than 100 MB) on the server, which causes a lot of echoing through the Twisted SSH server and drives the Python SSH process's (twisted.conch) CPU usage very high (greater than 95%, or even 100%). User B is then blocked, with no response for a long time. Is there any way to sleep user A's session (0.5 seconds) when user A generates a large amount of echo through the Twisted SSH server, without blocking the other connected users?
import sys

import checkers
import keyvalue
from twisted.python import components, log, logfile
from twisted.cred import portal
from twisted.internet import reactor
from twisted.conch.ssh import factory, keys, session, filetransfer
from twisted.conch.unix import UnixSSHRealm, SSHSessionForUnixConchUser, UnixConchUser

if __name__ == "__main__":
    sshFactory = factory.SSHFactory()
    sshFactory.portal = portal.Portal(UnixSSHRealm())
    sshFactory.portal.registerChecker(checkers.UsernamePasswordChecker())
    sshFactory.publicKeys = {
        'ssh-rsa': keys.Key.fromString(keyvalue.publicKey)}
    sshFactory.privateKeys = {
        'ssh-rsa': keys.Key.fromString(keyvalue.privateKey)}
    components.registerAdapter(
        SSHSessionForUnixConchUser, UnixConchUser, session.ISession)
    log.startLogging(sys.stdout)
    reactor.listenTCP(2222, sshFactory)
    reactor.run()
This is effectively a bug in Twisted. One user using the server should not generate so much load that it's unresponsive to everyone else.
However, it's not an easy one to fix. There are a couple of solutions.
First, before you do anything else, you should ensure your code is using PyPy, which may give you all the additional performance you need to support more users. Even if it isn't sufficient, it should be helpful in combination with these other solutions.
One is that you can run this code in multiple processes, using a strategy like this, which will allow you to preemptively run the process on multiple cores. Of course, that doesn't let you do much concurrently inside one process.
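For illustration, one such strategy might look like the sketch below. This is my assumption of what a pre-fork setup could look like, untested with Conch; build_ssh_factory() is a hypothetical helper that performs the SSHFactory setup from your code above.
# Sketch: several worker processes sharing port 2222 via SO_REUSEPORT,
# each running its own reactor. build_ssh_factory() is hypothetical.
import multiprocessing
import socket

def worker(port=2222):
    from twisted.internet import reactor  # import inside the child process
    sshFactory = build_ssh_factory()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(('', port))
    s.listen(50)
    s.setblocking(False)
    # Hand the already-listening socket to the reactor
    reactor.adoptStreamPort(s.fileno(), socket.AF_INET, sshFactory)
    reactor.run()

if __name__ == '__main__':
    for _ in range(4):  # roughly one worker per core
        multiprocessing.Process(target=worker).start()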
Another option is twisted.protocols.htb, which you could apply to sshFactory to rate-limit incoming traffic and ensure it is processed fairly between competing connections.
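A rough sketch of what that could look like, adapted from the traffic-shaping example in the Twisted documentation (the rates are made-up values, and I haven't verified this against Conch specifically):
# Sketch: wrap the SSH protocol in a traffic shaper so each client host
# gets a bounded share of server bandwidth. Rates are illustrative.
from twisted.protocols import htb
from twisted.conch.ssh import transport

serverFilter = htb.HierarchicalBucketFilter()
serverBucket = htb.Bucket()
serverBucket.maxburst = 64 * 1024    # burst budget for the whole server, bytes
serverBucket.rate = 256 * 1024       # bytes/second for the whole server
serverFilter.buckets[None] = serverBucket

class PerHostBucket(htb.Bucket):
    maxburst = 32 * 1024
    rate = 64 * 1024                 # bytes/second per client host

hostFilter = htb.FilterByHost(serverFilter)
hostFilter.bucketFactory = PerHostBucket

# Replace the factory's protocol with a shaped version of the default
# SSH server transport.
sshFactory.protocol = htb.ShapedProtocolFactory(
    transport.SSHServerTransport, hostFilter)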
Please share any progress that you make on this, as I'm sure it would be interesting to other Twisted users!

Python parallel processing from client-server application

I have a web application (Django) that stores PID numbers of processes from a remote Linux machine in a MySQL database. I designed a simple client-server application that talks to the remote server and gets me some data about a given PID number (CPU %, mem %) ... this data is from a 5-second interval.
But there is a performance problem: I have 200 PIDs to check, each of them takes ~5 seconds, and they are processed in a for loop. So I'm in a situation where I'm waiting at least 200 * 5 seconds.
Can somebody advise me on how to process this in parallel, so my application can fetch, for example, 50 PIDs at one time? I believe a Python client-server library can handle multiple requests coming to the server.
I want to achieve something like:
for pid in my_200_pid_list:
    # Some parallel magic to not wait, and pass another 49...
    result[pid] = askforprocess(pid)
My client code:
import socket

def askforprocess(processpid):
    # Create TCP/IP socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connect on host and port provided in command line arguments
    server_address = ('172.16.1.105', int('5055'))
    sock.connect(server_address)
    # Send the PID and read the reply
    try:
        message = processpid
        sock.sendall(message)
        data = sock.recv(2048)
    finally:
        sock.close()
    return data
In general, it's best to do stuff like this using a single thread when possible; you just have to make sure your functions don't block one another. The built-in lib that comes to mind is select. Unfortunately, it's a bit difficult to explain and I haven't used it in quite some time. Hopefully this link will help you understand it: http://pymotw.com/2/select/. It could look something like the sketch below.
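A sketch only, using the host/port from your question; error handling is omitted:
# Sketch: open one non-blocking connection per PID and multiplex them
# with select instead of querying sequentially.
import select
import socket

def ask_all(pids, addr=('172.16.1.105', 5055)):
    results = {}
    socks = {}            # socket -> pid
    pending_send = set()  # sockets that still need to send their PID
    for pid in pids:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        s.connect_ex(addr)   # non-blocking connect
        socks[s] = pid
        pending_send.add(s)
    while socks:
        readable, writable, _ = select.select(list(socks), list(pending_send), [])
        for s in writable:   # connection established: send the PID
            s.sendall(str(socks[s]).encode())
            pending_send.discard(s)
        for s in readable:   # reply arrived: store it and retire the socket
            results[socks[s]] = s.recv(2048)
            s.close()
            del socks[s]
    return results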
You can also use the multiprocessing lib and poll each PID in a separate process. This can be very difficult to manage if you plan to scale out further! Use threads only as a last resort (that's my usual rule of thumb when it comes to threads). https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing
import socket
from multiprocessing import Process

def askforprocess(processpid):
    # Create TCP/IP socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connect to the monitoring server
    server_address = ('172.16.1.105', int('5055'))
    sock.connect(server_address)
    # Send the PID and read the reply
    try:
        sock.sendall(processpid)
        data = sock.recv(2048)
    finally:
        sock.close()
    return data

if __name__ == '__main__':
    processpid = b'1234'  # example PID to query
    p = Process(target=askforprocess, args=(processpid,))
    p.start()
    p.join()
Lastly, there's the Twisted library, which is probably the most difficult to understand, but it definitely makes concurrent (not necessarily parallel) functions easy to write. The only bad thing is that you'd probably have to rewrite your entire app in order to use Twisted. Don't be put off by this fact; try to use it if you can.
Hope that helps.
Use threads to process your requests in parallel: https://docs.python.org/2/library/threading.html. For example, see the sketch below.
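A sketch, reusing askforprocess() from the question and assuming a my_200_pid_list variable; multiprocessing.pool.ThreadPool gives a thread pool with the same interface as Pool:
# Sketch: 50 worker threads fetch all PIDs concurrently.
from multiprocessing.pool import ThreadPool

pool = ThreadPool(50)
results = dict(zip(my_200_pid_list, pool.map(askforprocess, my_200_pid_list)))
pool.close()
pool.join()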

Python and ZMQ: REP socket not receiving all requests

I have a REP socket that's connected to many REQ sockets, each running on a separate Google Compute Engine instance. I'm trying to accomplish the synchronization detailed in the ZMQ Guide's syncpub/syncsub example, and my code looks pretty similar to that example:
import zmq

context = zmq.Context()
sync_reply = context.socket(zmq.REP)
sync_reply.bind('tcp://*:5555')
# start a bunch of other sockets ...

ready = 0
while ready < len(all_instances):
    sync_reply.recv()
    sync_reply.send(b'')
    ready += 1
And each instance is running the following code:
import zmq

context = zmq.Context()
sync_request = context.socket(zmq.REQ)
sync_request.connect('tcp://IP_ADDRESS:5555')
sync_request.send(b'')
sync_request.recv()
# start other sockets and do other work ...
This system works fine up until a certain number of instances (around 140). Any more, though, and the REP socket will not receive all of the requests. It also seems like the requests it drops are from different instances each time, which leads me to believe that all the requests are indeed being sent, but the socket is just not receiving any more than (about) 140 of them.
I've tried setting the high water mark for the sockets, spacing out the requests over the span of a few seconds, switching to ROUTER/DEALER sockets - all with no improvement. The part that confuses me the most is that the syncsub/syncpub example code (linked above) works fine for me with up to 200 Google Compute Engine instances, which is as many as I can start. I'm not sure what about my code specifically is causing this problem - any help or tips would be appreciated.
Answering my own question - it seems like it was an issue with the large number of sockets I was using, and also possibly the memory limitations of the GCE instances used. See comment thread above for more details.

What is the proper way to handle periodic housekeeping tasks in Python/Pyramid/CherryPy?

I have a python web-app that uses Pyramid/CherryPy for the webserver.
It has a few periodic housekeeping tasks that need to be run - Clearing out stale sessions, freeing their resources, etc...
What is the proper way to manage this? I can fairly easily just run an additional "housekeeping" thread (and use a separate scheduler, like APScheduler), but having a separate thread reach into the running server thread(s) just seems like a really clumsy solution. CherryPy is already running the server in a (multi-threaded) event loop, so it seems like it should be possible to somehow schedule periodic events through that.
I was led to this answer by @fumanchu's answer, but I wound up using an instance of the cherrypy.process.plugins.BackgroundTask plugin:
import cherrypy
import cherrypy.process.plugins

import wsgi_server  # the WSGI application being served


def doHousekeeping():
    print("Housekeeper!")


def runServer():
    cherrypy.tree.graft(wsgi_server.app, "/")

    # Unsubscribe the default server
    cherrypy.server.unsubscribe()

    # Instantiate a new server object
    server = cherrypy._cpserver.Server()

    # Configure the server object
    server.socket_host = "0.0.0.0"
    server.socket_port = 8080
    server.thread_pool = 30

    # Subscribe this server
    server.subscribe()

    # Call doHousekeeping() every 2 seconds from within the CherryPy engine
    cherrypy.engine.housekeeper = cherrypy.process.plugins.BackgroundTask(2, doHousekeeping)
    cherrypy.engine.housekeeper.start()

    # Start the server engine (Option 1 *and* 2)
    cherrypy.engine.start()
    cherrypy.engine.block()
This results in doHousekeeping() being called at 2-second intervals within the CherryPy event loop.
It also doesn't involve doing something as silly as dragging in the entire OS just to call a task periodically.
Have a look at the "main" channel at http://cherrypy.readthedocs.org/en/latest/progguide/extending/customplugins.html
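For completeness, the plugin route from that page boils down to the documented Monitor helper, which ties a periodic callback to the engine's lifecycle. A minimal sketch:
import cherrypy
from cherrypy.process.plugins import Monitor

def doHousekeeping():
    print("Housekeeper!")

# Run doHousekeeping() every 2 seconds for as long as the engine runs
Monitor(cherrypy.engine, doHousekeeping, frequency=2).subscribe()

cherrypy.engine.start()
cherrypy.engine.block()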
Do yourself a favour and just use cron. No need to roll your own scheduling software.
