I am creating a simple TCP server-client script in Python. The server is threaded and forks a new worker/thread for every client connection. So far I have pretty much coded the entire server module, but my handle_clients() function, which is forked for every incoming client connection, is getting very long. To improve the readability of the code I want to split handle_clients() into multiple small functions.
I do understand that when I split handle_client() into smaller functions, the split functions should be wrapped around mutex locks to synchronize shared usage between multiple handle_clients() threads. Doing this would actually reduce the efficiency of the program, because handle_clients() would have to wait for other threads to unlock the shared functions before actually using them. My other thought was to create these smaller functions as threads within the handle_clients() thread and wait for them to finish using Thread.join() before continuing. Is there a better way to do this?
My code:
#!/usr/bin/python
import socket
import threading

import pandas as pd


class TCPServer(object):
    NUMBER_OF_THREADS = 0
    BUFFER = 4096
    threads_list = []

    def __init__(self, port, hostname):
        self.socket = socket.socket(
            family=socket.AF_INET, type=socket.SOCK_STREAM)
        self.socket.bind((hostname, port))

    def listen_for_clients(self):
        self.socket.listen(5)
        while True:
            client, address = self.socket.accept()
            client_ID = client.recv(TCPServer.BUFFER)
            print(f'Connected to client: {client_ID}')
            if client_ID:
                TCPServer.NUMBER_OF_THREADS = TCPServer.NUMBER_OF_THREADS + 1
                thread = threading.Thread(
                    target=TCPServer.create_worker, args=(self, client, address, client_ID))
                TCPServer.threads_list.append(thread)
                thread.start()
            if TCPServer.NUMBER_OF_THREADS > 2:
                break
        TCPServer.wait_for_workers()

    def wait_for_workers():
        for thread in TCPServer.threads_list:
            thread.join()

    def create_worker(self, client, address, client_ID):
        print(f'Spawned a new worker for {client_ID}. Worker #: {TCPServer.NUMBER_OF_THREADS}')
        data_list = []
        data_frame = pd.DataFrame()
        client.send("SEND_REQUEST_TYPE".encode())
        request_type = client.recv(TCPServer.BUFFER).decode('utf-8')

        if request_type == 'KMEANS':
            print(f'Client: REQUEST_TYPE {request_type}')
            client.send("SEND_DATA".encode())
            while True:
                data = client.recv(TCPServer.BUFFER).decode('utf-8')
                if data == 'ROW':
                    client.send("OK".encode())
                    while True:
                        data = client.recv(TCPServer.BUFFER).decode('utf-8')
                        print(f'Client: {data}')
                        if data == 'ROW_END':
                            print('Data received: ', data_list)
                            series = pd.Series(data_list)
                            data_frame = data_frame.append(series, ignore_index=True)
                            data_list = []
                            client.send("OK".encode())
                            break
                        else:
                            data_list.append(int(data))
                            client.send("OK".encode())
                elif data == 'DATA_END':
                    client.send("WAIT".encode())
                    # (Vino) pass data to algorithm
                    print(f'Data received from client {client_ID}: ', data_frame)
        elif request_type == 'NEURALNET':
            pass
        elif request_type == 'LINRIGRESSION':
            pass
        elif request_type == 'LOGRIGRESSION':
            pass


def main():
    port = input("Port: ")
    server = TCPServer(port=int(port), hostname='localhost')
    server.listen_for_clients()


if __name__ == '__main__':
    main()
Note: The following block of code is repetitive and will be used multiple times within the handle_client() function.
while True:
    data = client.recv(TCPServer.BUFFER).decode('utf-8')
    if data == 'ROW':
        client.send("OK".encode())
        while True:
            data = client.recv(TCPServer.BUFFER).decode('utf-8')
            print(f'Client: {data}')
            if data == 'ROW_END':
                print('Data received: ', data_list)
                series = pd.Series(data_list)
                data_frame = data_frame.append(series, ignore_index=True)
                data_list = []
                client.send("OK".encode())
                break
            else:
                data_list.append(int(data))
                client.send("OK".encode())
    elif data == 'DATA_END':
        client.send("WAIT".encode())
        # (Vino) pass data to algorithm
        print(f'Data received from client {client_ID}: ', data_frame)
This is the block I want to place in a separate function and call from within the handle_client() thread.
Your code is already long, so I won't dive into it; instead I'll try to keep things general.
I do understand that when I split handle_client() into smaller functions, the split functions should be wrapped around mutex locks.
That's not exactly true: between threads you already have to use locks to guard against concurrent writes to shared memory, regardless of how your code is split into functions.
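To illustrate: the lock belongs to the shared data, not to whichever function happens to touch it. A tiny sketch (the names here are made up):

import threading

# hypothetical state shared by several handler threads; it needs a lock
# because it is shared, not because the code lives in a separate function
results = {}
results_lock = threading.Lock()

def store_result(client_id, value):
    with results_lock:
        results[client_id] = value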
The server is threaded
Looks like you're doing CPU-intensive work (I see KMEANS, NEURALNET, ...), so it is not a good fit to use threads in Python to dispatch CPU-intensive loads, as the GIL will serialize CPU usage between your threads.
The way to parallelize CPU intensive work in Python is to use processes.
Processes do not share memory, so you'll be able to manipulate variables freely without mutexes, but nothing will be shared at all. I hope your jobs are independent, because processes can't share any state.
If you need to share state, avoid locks: they're complicated to handle, they're the road to deadlocks, and they don't make for readable code. Instead, try to implement your "state sharing" with queues, as a pipeline of jobs: each worker pulls from one queue, does its work, and pushes to another queue. This keeps things clear and easy to understand. Plus, there are queue implementations for both threads and processes, so you'll be able to switch between the two almost seamlessly.
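For example, here is a rough sketch of such a pipeline using multiprocessing queues; the sum() call is only a placeholder for the real algorithm:

import multiprocessing as mp

def worker(jobs, results):
    # pull a job, do the CPU-heavy work, push the result
    for job in iter(jobs.get, None):          # None is the poison pill
        results.put((job["id"], sum(job["data"])))

if __name__ == "__main__":
    jobs, results = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=worker, args=(jobs, results)) for _ in range(4)]
    for w in workers:
        w.start()

    for i in range(10):
        jobs.put({"id": i, "data": list(range(1000))})
    for _ in workers:
        jobs.put(None)                        # one poison pill per worker

    for _ in range(10):
        print(results.get())
    for w in workers:
        w.join()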
if TCPServer.NUMBER_OF_THREADS > 2:
break
Hey, you're breaking out of your main loop when you have more than two threads, exiting your main process and killing your server; I bet that's not what you want. Oh, and if you use processes instead of threads, you should prefork a pool of them, as creating a process costs more than creating a thread. And reuse them: a process can take another job after finishing one, it does not have to die (typically, use queues to send jobs to your processes).
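For instance, a minimal sketch of a preforked, reused pool; run_kmeans is a hypothetical stand-in for whatever the worker really computes:

import multiprocessing as mp

def run_kmeans(rows):
    # placeholder for the real CPU-heavy algorithm
    return len(rows)

if __name__ == "__main__":
    pool = mp.Pool(processes=4)                 # forked once, reused for every job
    jobs = [[[1, 2]], [[3, 4], [5, 6]]]
    async_results = [pool.apply_async(run_kmeans, (job,)) for job in jobs]
    print([r.get() for r in async_results])     # each process handles many jobs over its lifetime
    pool.close()
    pool.join()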
Side note: I'd implement this over HTTP instead of raw TCP to benefit from the notions of request, response, error reporting, existing frameworks, and the ability to use existing clients (curl/wget on the command line, your browser, requests in Python). I'd also implement it fully asynchronously (no blocking HTTP requests): one request to create a job, then follow-up requests to get its status and result, like:
$ curl -X POST http://localhost/linalg/jobs/ -d '{your data}'
201 Created
Location: http://localhost/linalg/jobs/1
$ curl -XGET http://localhost/linalg/jobs/1
200 OK
{"status": "queued"}
Some time later…
$ curl -XGET http://localhost/linalg/jobs/1
200 OK
{"status": "in progress"}
Some time later…
$ curl -XGET http://localhost/linalg/jobs/1
200 OK
{"status": "done", "result": "..."}
To implement this there's a lot of nice work already done, typically aiohttp, apistar, and so on.
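For instance, a rough aiohttp sketch of such a job API; the URL layout, the in-memory job store and the handler names are placeholders, and a real server would hand the job to a worker pool instead of only storing it:

import uuid

from aiohttp import web

jobs = {}  # in-memory job store, good enough for a sketch

async def create_job(request):
    data = await request.json()
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued", "data": data}
    return web.json_response(
        {"status": "queued"},
        status=201,
        headers={"Location": f"/linalg/jobs/{job_id}"},
    )

async def get_job(request):
    job = jobs.get(request.match_info["job_id"])
    if job is None:
        raise web.HTTPNotFound()
    return web.json_response({"status": job["status"]})

app = web.Application()
app.add_routes([
    web.post("/linalg/jobs/", create_job),
    web.get("/linalg/jobs/{job_id}", get_job),
])

if __name__ == "__main__":
    web.run_app(app, port=8080)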
Related
I'm setting up a WebSocket that receives market data from 33 pairs, processes the data and inserts it into a local MySQL database.
What I've tried so far:
Setting up the websocket works fine; I then process the data in each new on_message call and insert it directly into the database.
--> The problem was that with 33 pairs the websocket kept stacking up market data in its buffer, and after a few minutes the database would lag by at least 10 seconds.
Then I tried processing the data through a thread: the on_message function would start a thread that simply puts the market data into an array, like below
datas = []

def add_queue(symbol, t, a, b, r_n):
    global datas
    datas.append([symbol, t, a, b, r_n])

if json_msg['ev'] == "C":
    symbol = json_msg['p'].replace("/", "-")
    round_number = pairs_dict_new[symbol]
    t = Thread(target=add_queue, args=(symbol, json_msg['t'], json_msg['a'], json_msg['b'], round_number,))
    t.start()
and then another function, running a loop in its own thread, would pick the data up and insert it into the database
def add_db():
    global datas
    try:
        # db = mysql.connector.connect(
        #     host="104.168.157.164",
        #     user="bvnwurux_noe_dev",
        #     password="Tickprofile333",
        #     database="bvnwurux_tick_values"
        # )
        while True:
            for x in datas:
                database.add_db(x[0], x[1], x[2], x[3], x[4])
                if x in datas:
                    datas.remove(x)
    except KeyboardInterrupt:
        print("program ending..")

t2 = Thread(target=add_db)
t2.start()
This was still giving a delay, and the threaded approach wasn't actually using a lot of CPU but rather RAM, and it just made things even worse.
Instead of using a websocket with a thread, I tried plain web requests to the API, with one thread per symbol: each thread would loop over a web request and send the result to the database. My issues here were that MySQL connections don't like threads (sometimes two threads would make a request on the same connection at the same time and crash), or it would still be delayed by the time needed to process the code, even without a buffer. The code was taking so long to process the answered request that it couldn't keep the delay under 10 s.
Here is a little example of the basic code I used to get the data.
pairs = {'AUDCAD':5,'AUDCHF':5,'AUDJPY':3,'AUDNZD':5,'AUDSGD':2,'AUDUSD':5,'CADCHF':5,'CADJPY':3,'CHFJPY':3,'EURAUD':5,'EURCAD':5,'EURCHF':5,'EURGBP':5,'EURJPY':3,'EURNZD':5,'EURSGD':5,'EURUSD':5,'GBPAUD':5,'GBPCAD':5,'GBPCHF':5,'GBPJPY':3,'GBPNZD':5,'GBPSGD':5,'GBPUSD':5,'NZDCAD':5,'NZDCHF':5,'NZDJPY':3,'NZDUSD':5,'USDCAD':5,'USDCHF':5,'USDJPY':3,'USDSGD':5,'SGDJPY':3}

def on_open(ws):
    print("Opened connection")
    ws.send('{"action":"auth","params":"<API KEY>"}')  # connecting with secret api key

def on_message(ws, message):
    print("msg", message)
    json_msg = json.loads(message)[0]
    if json_msg['status'] == "auth_success":  # successfully authenticated
        r = ws.send('{"action":"subscribe","params":"C.*"}')  # subscribing to currencies
        print("should subscribe to " + str(pairs))
    # once the websocket is connected to all the pairs, process the data
    # --> process json_msg

if __name__ == "__main__":
    # websocket.enableTrace(True)  # just to show all the requests made (debug mode)
    ws = websocket.WebSocketApp("wss://socket.polygon.io/forex",
                                on_open=on_open,
                                on_message=on_message)
    ws.run_forever(dispatcher=rel)  # Set dispatcher to automatic reconnection
    rel.signal(2, rel.abort)  # Keyboard Interrupt
    rel.dispatch()
Another method I tried was multiprocessing, but this was, on the contrary, crashing my server because it would use 100% CPU, and then requests made to the Apache server would not get through or would take a long time to load. It's really a balance problem.
I'm using an Ubuntu server with 32 CPUs, based in London, and the Polygon API is based in NYC.
I also tried with 4 CPUs in Seattle to NYC, but still no luck.
Even with 4 pairs and 32 CPUs, it would eventually reach a 10 s delay. I think this is more of a code structure problem.
Let's say I have a Python gRPC server and a corresponding client.
According to this question, the same gRPC channel can be passed to client stubs, each running in a different thread.
Let's say RPC function foo() is called from thread T1 and the response takes about one second. Can I call foo() from thread T2 in the meantime, while T1 is still waiting for the response, or is the channel somehow locked until the first call returns? In other words: is performance gained by using more threads, because the corresponding server is based on a thread pool and able to handle more requests "in parallel", and if yes, should I use the same channel or a different channel per thread?
EDIT: According to a quick test, parallel requests using the same channel from different threads seem to be possible, and it makes sense to do it this way. However, before closing the question, I would like to see confirmation from an expert that this code is correct:
import time
import threading

import grpc
import helloworld_pb2
import helloworld_pb2_grpc


def run_in_thread1(channel):
    n = 10000
    for i in range(n):
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        response = stub.SayHello(helloworld_pb2.HelloRequest(name='1'))
        print("client 1: " + response.message)


def run_in_thread2(channel):
    n = 10000
    for i in range(n):
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        response = stub.SayHello2(helloworld_pb2.HelloRequest(name='2'))
        print("client 2: " + response.message)


if __name__ == '__main__':
    print("I'm client")
    channel = grpc.insecure_channel('localhost:50051')

    x1 = threading.Thread(target=run_in_thread1, args=(channel,))
    x1.start()
    x2 = threading.Thread(target=run_in_thread2, args=(channel,))
    x2.start()

    x1.join()
    x2.join()
gRPC uses HTTP/2 and can multiplex many requests on one connection, and gRPC client connections should be re-used for the lifetime of the client app.
If you are taking inspiration from what is done when working with databases, I would say you don't need to worry about it, as the connection-opening overhead doesn't apply when working with gRPC.
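In terms of the code above, that mostly means keeping the single shared channel and creating the stub once per thread instead of once per call. A minimal sketch along those lines, reusing the helloworld stubs from the question:

import threading

import grpc
import helloworld_pb2
import helloworld_pb2_grpc

# one channel for the whole client app, shared by both threads
channel = grpc.insecure_channel('localhost:50051')

def worker(name):
    stub = helloworld_pb2_grpc.GreeterStub(channel)   # created once, reused for every call
    for _ in range(10000):
        response = stub.SayHello(helloworld_pb2.HelloRequest(name=name))
        print(f"client {name}: {response.message}")

threads = [threading.Thread(target=worker, args=(str(i),)) for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()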
Problem Outline
I have a python flask server where one of the endpoints has a moderate amount of work to do (the real code reads, resizes and returns an image). I want to optimise the endpoint so that it can be called multiple times in parallel.
The code I currently have (shown below) does not work because it relies on passing a multiprocessing.Event object through a multiprocessing.JoinableQueue which is not allowed and results in the following error:
RuntimeError: Condition objects should only be shared between processes
through inheritance
How can I use a separate process to compute some jobs and notify the main thread when a specific job is complete?
Proof of Concept
Flask can be multithreaded so if one request is waiting on a result other threads can continue to process other requests. I have a basic proof of concept here that shows that parallel requests can be optimised using multiprocessing: https://github.com/alanbacon/flask_multiprocessing
The example code on GitHub spawns a new process for every request, which I understand has considerable overhead. I've also noticed that my proof-of-concept server crashes if there are more than 10 or 20 concurrent requests; I suspect this is because too many processes are being spawned.
Current Attempt
I have tried to create a set of workers that pick jobs off a queue. When a job is complete the result is written to a shared memory area. Each job contains the work to be done and an Event object that can be set when the job is complete to signal the main thread.
Each request thread passes in a job with a newly created Event object and then immediately waits on that event before returning the result. While one server request thread is waiting, the server is able to use other threads to continue to serve other requests.
The problem as mentioned above is that Event objects can not be passed around in this way.
What approach should I take to circumvent this problem?
from flask import Flask, request, Response
import multiprocessing
import uuid

app = Flask(__name__)

# flask config
app.config['PROPAGATE_EXCEPTIONS'] = True
app.config['DEBUG'] = False

def simpleWorker(complexity):
    temp = 0
    for i in range(0, complexity):
        temp += 1

mgr = multiprocessing.Manager()
results = mgr.dict()
joinableQueue = multiprocessing.JoinableQueue()
lock = multiprocessing.Lock()

def mpWorker(joinableQueue, lock, results):
    while True:
        next_task = joinableQueue.get()           # blocking call
        if next_task is None:                     # poison pill to kill worker
            break
        simpleWorker(next_task['complexity'])     # pretend to do heavy work
        result = next_task['val'] * 2             # compute result
        ID = next_task['ID']
        with lock:
            results[ID] = result                  # output result to shared memory
        next_task['event'].set()                  # tell main process result is calculated
        joinableQueue.task_done()                 # remove task from queue

@app.route("/work/<ID>", methods=['GET'])
def work(ID=None):
    if request.method == 'GET':
        # send a task to the consumer and wait for it to finish
        uid = str(uuid.uuid4())
        event = multiprocessing.Event()
        # pass event to job so that job can tell this thread when processing is
        # complete
        joinableQueue.put({
            'val': ID,
            'ID': uid,
            'event': event,
            'complexity': 100000000
        })
        event.wait()                              # wait for result to be calculated
        # get result from shared memory area, and clean up
        with lock:
            result = results[uid]
            del results[uid]
        return Response(str(result), 200)

if __name__ == "__main__":
    num_consumers = multiprocessing.cpu_count() * 2
    consumers = [
        multiprocessing.Process(
            target=mpWorker,
            args=(joinableQueue, lock, results))
        for i in range(num_consumers)
    ]
    for c in consumers:
        c.start()

    host = '127.0.0.1'
    port = 8080
    app.run(host=host, port=port, threaded=True)
I have a Tornado web application; this app can receive GET and POST requests from the client.
The POST requests put the information they receive into a Tornado Queue; I then pop this information from the queue and use it to perform an operation on the database. This operation can be very slow, taking several seconds to complete!
While this database operation is going on I want to be able to receive other POSTs (which put other information into the queue) and GETs. The GETs are instead very fast and must return their result to the client immediately.
The problem is that when I pop from the queue and the slow operation begins, the server doesn't accept other requests from the client. How can I resolve this?
This is the simplified code I have written so far (imports are omitted to avoid a wall of text):
# URLs are defined in a config file
application = tornado.web.Application([
    (BASE_URL, Variazioni),
    (ARTICLE_URL, Variazioni),
    (PROMO_URL, Variazioni),
    (GET_FEEDBACK_URL, Feedback)
])

class Server:
    def __init__(self):
        http_server = tornado.httpserver.HTTPServer(application, decompress_request=True)
        http_server.bind(8889)
        http_server.start(0)
        transactions = TransactionsQueue()  # contains the queue and the functions which interact with it
        IOLoop.instance().add_callback(transactions.process)

    def start(self):
        try:
            IOLoop.instance().start()
        except KeyboardInterrupt:
            IOLoop.instance().stop()

if __name__ == "__main__":
    server = Server()
    server.start()


class Variazioni(tornado.web.RequestHandler):
    ''' Handle the POST request. Put the data received in the queue '''

    @gen.coroutine
    def post(self):
        TransactionsQueue.put(self.request.body)
        self.set_header("Location", FEEDBACK_URL)


class TransactionsQueue:
    ''' Handle the queue that contains the data.
        When a new request arrives, the generated uuid is put in the queue.
        When the data is popped out, the operation on the database begins.
    '''
    queue = Queue(maxsize=3)

    @staticmethod
    def put(request_uuid):
        ''' Insert in the queue the uuid in postgres format '''
        TransactionsQueue.queue.put(request_uuid)

    @gen.coroutine
    def process(self):
        ''' Loop over the queue and load the data in the database '''
        while True:
            # request_uuid is in postgres format
            transaction = yield TransactionsQueue.queue.get()
            try:
                # this is the slow operation on the database
                yield self._load_json_in_db(transaction)
            finally:
                TransactionsQueue.queue.task_done()
Moreover, I don't understand why, if I do 5 POSTs in a row, all five items end up in the queue even though the maximum size is 3.
I'm going to guess that you use a synchronous database driver, so _load_json_in_db, although it is a coroutine, is not actually async. Therefore it blocks the entire event loop until the long operation completes. That's why the server doesn't accept more requests until the operation is finished.
Since _load_json_in_db blocks the event loop, Tornado can't accept more requests while it's running, so your queue never grows to its max size.
You need two fixes.
First, use an async database driver written specifically for Tornado, or run the database operations on threads using a concurrent.futures.ThreadPoolExecutor.
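A minimal sketch of the thread-pool option, keeping the coroutine style from the question and assuming _load_json_in_db is a plain blocking function:

from concurrent.futures import ThreadPoolExecutor

from tornado import gen
from tornado.queues import Queue

executor = ThreadPoolExecutor(max_workers=4)

class TransactionsQueue:
    queue = Queue(maxsize=3)

    @gen.coroutine
    def process(self):
        while True:
            transaction = yield TransactionsQueue.queue.get()
            try:
                # run the blocking database call on a worker thread so the
                # IOLoop stays free to accept new requests
                yield executor.submit(self._load_json_in_db, transaction)
            finally:
                TransactionsQueue.queue.task_done()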
Once that's done your application will be able to fill the queue, so second, TransactionsQueue.put must do:
TransactionsQueue.queue.put_nowait(request_uuid)
This throws an exception if there are already 3 items in the queue, which I think is what you intend.
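If you want the handler to report that condition to the client instead of raising, something along these lines works; this is a sketch reusing the names from the question:

import tornado.web
from tornado.queues import QueueFull

class Variazioni(tornado.web.RequestHandler):
    def post(self):
        try:
            TransactionsQueue.queue.put_nowait(self.request.body)
        except QueueFull:
            # three transactions are already pending: ask the client to retry later
            self.set_status(503)
        else:
            self.set_header("Location", FEEDBACK_URL)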
I need a webserver which routes the incoming requests to back-end workers by batching them every 0.5 seconds or when it has 50 HTTP requests, whichever happens earlier. What would be a good way to implement this in Python/Tornado or any other language?
What I am thinking is to publish the incoming requests to a RabbitMQ queue and then somehow batch them together before sending them to the back-end servers. What I can't figure out is how to pick multiple requests from the RabbitMQ queue. Could someone point me in the right direction or suggest an alternative approach?
I would suggest using a simple Python micro web framework such as Bottle. You would then send the requests to a background process via a queue (thus allowing the connection to end).
The background process would then have a continuous loop that would check your conditions (time and number), and do the job once the condition is met.
Edit:
Here is an example webserver that batches the items before sending them to any queuing system you want to use (RabbitMQ has always seemed overcomplicated to me from Python; I have used Celery and other simpler queuing systems before). That way the backend simply grabs a single 'item' from the queue, and that item will contain all 50 required requests.
import bottle
import threading
import queue

app = bottle.Bottle()
app.queue = queue.Queue()

def send_to_rabbitMQ(items):
    """Custom code to send to rabbitMQ system"""
    print("50 items gathered, sending to rabbitMQ")

def batcher(q):
    """Background thread that gathers incoming requests"""
    while True:
        batcher_loop(q)

def batcher_loop(q):
    """Loop that will run until it gathers 50 items,
    then will call the function 'send_to_rabbitMQ'"""
    count = 0
    items = []
    while count < 50:
        try:
            next_item = q.get(timeout=.5)
        except queue.Empty:
            pass
        else:
            items.append(next_item)
            count += 1
    send_to_rabbitMQ(items)

@app.route("/add_request", method=["PUT", "POST"])
def add_request():
    """Simple bottle request that grabs JSON and puts it in the queue"""
    request = bottle.request.json['request']
    app.queue.put(request)

if __name__ == '__main__':
    t = threading.Thread(target=batcher, args=(app.queue, ))
    t.daemon = True  # Make sure the background thread quits when the program ends
    t.start()
    bottle.run(app)
Code used to test it:
import requests
import json

for i in range(101):
    req = requests.post("http://localhost:8080/add_request",
                        data=json.dumps({"request": 1}),
                        headers={"Content-type": "application/json"})