gRPC Python client server streaming not working as expected - python

A simple gRPC server and client: the client sends an int and the server streams ints back.
The client reads the messages one by one, but the server runs the generator function for all responses immediately.
server code:
import test_pb2_grpc as pb_grpc
import test_pb2 as pb2
import time
import grpc
from concurrent import futures

class test_servcie(pb_grpc.TestServicer):

    def Produce(self, request, context):
        for i in range(request.val):
            print("request came")
            rs = pb2.Rs()
            rs.st = i + 1
            yield rs

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    pb_grpc.add_TestServicer_to_server(test_servcie(), server)
    server.add_insecure_port('[::]:50051')
    print("service started")
    server.start()
    try:
        while True:
            time.sleep(3600)
    except KeyboardInterrupt:
        server.stop(0)

if __name__ == '__main__':
    serve()
client code:
import grpc
import time
import test_pb2_grpc as pb_grpc
import test_pb2 as pb

def test():
    channel = grpc.insecure_channel(
        '{host}:{port}'.format(host="localhost", port=50051))
    stub = pb_grpc.TestStub(channel=channel)
    req = pb.Rq()
    req.val = 20
    for s in stub.Produce(req):
        print(s.st)
        time.sleep(10)

test()
proto file:
syntax = "proto3";
service Test {
rpc Produce (Rq) returns (stream Rs);
}
message Rq{
int32 val = 1;
}
message Rs{
int32 st = 1;
}
After starting the server, when I run the client, the server-side generator starts running and completes immediately: it loops through the whole range.
What I expected is that it would yield one value at a time as the client asks for the next message, but that is not the case.
Is this expected behaviour? My client is still printing the values, but the server has already completed the function.

Yes, this behavior is expected. gRPC features flow control between the two sides of an RPC (so that generating messages too fast on one side won't exhaust memory on the other side) but there's also an allowance for a small amount of buffering (so that a reasonably small amount of data may be sent by one side before the other side explicitly asks for it). In your case the twenty messages sent from server to client all fit within this small allowance. The service-side gRPC Python runtime is calling your service-side Produce method, consuming its entire output of twenty messages, and sending all those messages across the network to your client, where they are locally held by the invocation-side gRPC Python runtime until your invocation-side test function asks for them.
If you want to see the effects of flow control in action, try using huge messages (one megabyte in size or so) or altering the size of the allowance (I think this is done with a channel argument but those are an advanced and relatively-unsupported feature so this is left as an exercise).
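For example, here is a minimal sketch of the "huge messages" experiment, under the assumption that you add a hypothetical bytes payload = 2; field to the Rs message and regenerate the stubs (the field name and the 1 MB size are illustrative, not part of the original code):

import test_pb2_grpc as pb_grpc
import test_pb2 as pb2

class test_servcie(pb_grpc.TestServicer):  # same servicer as in the question
    def Produce(self, request, context):
        for i in range(request.val):
            print("producing message", i + 1)  # shows when the generator actually runs
            rs = pb2.Rs()
            rs.st = i + 1
            # Assumes a hypothetical `bytes payload = 2;` field was added to Rs.
            # Padding each response to ~1 MB should exceed the small buffering
            # allowance after only a few messages.
            rs.payload = b"\x00" * (1024 * 1024)
            yield rs

With messages this large, the server-side prints should no longer all appear at once; the generator should be suspended once the flow-control window fills and resumed as the client consumes messages.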

Related

Python: run http server in parallel with a script

I'm running a Python transformation pipeline by pulling messages from GCP Pub/Sub, transforming them and publishing them back to Pub/Sub.
I need to set up an HTTP health check. A healthy server is supposed to return code 200. If no results have been produced for 30 seconds, the server is supposed to return code 500.
How do I run an HTTP server in parallel with my pipeline?
Since Python functions are executed synchronously, my code doesn't behave the way I want.
from http.server import HTTPServer, BaseHTTPRequestHandler
import time

lastItemFinishedAt = time.time()

def start():
    while True:
        # pull message from pub/sub
        # if got a message, then transform it and do this:
        lastItemFinishedAt = time.time()
        # if no messages found, then break the loop
    print('🏁 No more messages in the queue')

start()

class Serv(BaseHTTPRequestHandler):
    def do_GET(self):
        if time.time() - lastItemFinishedAt > 30:
            self.send_response(500)
        else:
            self.send_response(200)

httpd = HTTPServer(('localhost', 8080), Serv)
httpd.serve_forever()
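One common approach, shown here only as a minimal sketch (the daemon thread is an assumption, not part of the original post), is to run the health-check server in a background thread while the pipeline loop keeps the main thread:

import threading
import time
from http.server import HTTPServer, BaseHTTPRequestHandler

lastItemFinishedAt = time.time()

class Serv(BaseHTTPRequestHandler):
    def do_GET(self):
        if time.time() - lastItemFinishedAt > 30:
            self.send_response(500)
        else:
            self.send_response(200)
        self.end_headers()

def run_health_server():
    # Serves requests concurrently with the pipeline; daemon=True lets the
    # process exit when the main thread finishes.
    HTTPServer(('localhost', 8080), Serv).serve_forever()

threading.Thread(target=run_health_server, daemon=True).start()

def start():
    global lastItemFinishedAt  # update the module-level timestamp, not a local
    while True:
        # pull a message from pub/sub, transform it, then record the time
        lastItemFinishedAt = time.time()

start()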

Dask: can't submit tasks to global client from separate multiprocessing.Process

I have 2 processes: in the first one I create the global distributed client; the second process is a web scraper that should get the global client, submit tasks to it, and, when everything is done, send a message to another process to tell it that it can proceed.
from dask.distributed import Client, as_completed, get_client
from multiprocessing import Process
from time import sleep
import zmq

def get(url) -> dict:
    # downloads data from url
    sleep(3)
    return data

def save(data) -> None:
    # saves data locally
    sleep(3)
    return None

def scraper(urls):
    # global client
    client = get_client()
    # zeromq socket
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.bind('tcp://*:port')
    while True:
        for future, result in as_completed([client.submit(get, url=url) for url in urls], with_results=True):
            save(data=result)
        socket.send_string('All job is done for this minute, proceed.')
        sleep(60)

if __name__ == '__main__':
    client = Client()
    s = Process(target=scraper, *args, **kwargs)
    s.start()
The problem is that from the scraper function I can get the global client (I see it correctly if I print it), but I can't submit any kind of task to it. The console doesn't print any error; it's just stuck without doing anything. I think the cause is that the scraper function is running in a separate multiprocessing.Process.
Any solution or workaround? Thank you.
The Dask client holds open connections to the scheduler. Depending on how your system creates new processes, you may get copies of the connections which point to nothing useful in the new process, or fail to transfer the client completely (it is not pickleable).
Instead, you should send the connection information to the child process:

addr = client.scheduler_info()['address']

and in the target function do

client = Client(addr)
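A minimal sketch of that suggestion, adapted to the question's structure (the placeholder tasks and URLs are illustrative only, not part of the original code):

from dask.distributed import Client
from multiprocessing import Process

def scraper(scheduler_addr, urls):
    # Reconnect inside the child process using the address string, instead of
    # trying to share the parent's Client object.
    client = Client(scheduler_addr)
    futures = [client.submit(len, url) for url in urls]  # placeholder tasks
    print(client.gather(futures))

if __name__ == '__main__':
    client = Client()
    addr = client.scheduler_info()['address']
    p = Process(target=scraper, args=(addr, ['http://a', 'http://b']))
    p.start()
    p.join()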

Twisted: Using connectProtocol to connect endpoint cause memory leak?

I was trying to build a server. Besides accepting connections from clients as normal servers do, my server also connects to another server as a client.
I've set up the protocol and endpoint like below:

p = FooProtocol()
client = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080)  # without ClientFactory

Then, after calling reactor.run(), the server will listen for and accept new socket connections. When new socket connections are made (in connectionMade), the server will call connectProtocol(client, p), which acts like the pseudocode below:
while server accepts a new socket:
    connectProtocol(client, p)
    # client.connect(foo_client_factory) --> connecting in this way won't
    # cause a memory leak
As the outgoing client connections are made, memory is gradually consumed (explicitly calling gc doesn't help).
Am I using Twisted in the wrong way?
-----UPDATE-----
My test program: the server waits for clients to connect. When a connection from a client is made, the server creates 50 connections to the other server.
Here is the code:
#! /usr/bin/env python
import sys
import gc
from twisted.internet import protocol, reactor, defer, endpoints
from twisted.internet.endpoints import TCP4ClientEndpoint, connectProtocol

class MyClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.transport.loseConnection()

class MyClientFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        p = MyClientProtocol()
        return p

class ServerFactory(protocol.Factory):
    def buildProtocol(self, addr):
        p = ServerProtocol()
        return p

client_factory = MyClientFactory()  # global
client_endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080)  # global
times = 0

class ServerProtocol(protocol.Protocol):
    def connectionMade(self):
        global client_factory
        global client_endpoint
        global times
        for i in range(50):
            # 1)
            p = MyClientProtocol()
            connectProtocol(client_endpoint, p)  # cause memleak
            # 2)
            #client_endpoint.connect(client_factory)  # no memleak
        times += 1
        if times % 10 == 9:
            print 'gc'
            gc.collect()  # doesn't work
        self.transport.loseConnection()

if __name__ == '__main__':
    server_factory = ServerFactory()
    serverEndpoint = endpoints.serverFromString(reactor, "tcp:8888")
    serverEndpoint.listen(server_factory)
    reactor.run()
This program doesn't do any Twisted log initialization. This means it runs with the "log beginner" for its entire run. The log beginner records all log events it observes in a LimitedHistoryLogObserver (up to a configurable maximum).
The log beginner keeps 2 ** 16 (_DEFAULT_BUFFER_MAXIMUM) events and then begins throwing out old ones, presumably to avoid consuming all available memory if a program never configures another observer.
If you hack the Twisted source to set _DEFAULT_BUFFER_MAXIMUM to a smaller value - eg, 10 - then the program no longer "leaks". Of course, it's really just an object leak and not a memory leak and it's bounded by the 2 ** 16 limit Twisted imposes.
However, connectProtocol creates a new factory each time it is called. When each new factory is created, it logs a message. And the application code generates a new Logger for each log message. And the logging code puts the new Logger into the log message. This means the memory cost of keeping those log messages around is quite noticable (compared to just leaking a short blob of text or even a dict with a few simple objects in it).
I'd say the code in Twisted is behaving just as intended ... but perhaps someone didn't think through the consequences of that behavior completely.
And, of course, if you configure your own log observer then the "log beginner" is taken out of the picture and there is no problem. It does seem reasonable to expect that all serious programs will enable logging rather quickly and avoid this issue. However, lots of short throw-away or example programs often don't ever initialize logging and rely on print instead, making them subject to this behavior.
Note: this problem was reported in #8164 and fixed in 4acde626, so Twisted 17 will not have this behavior.
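As noted above, configuring your own log observer takes the log beginner and its in-memory buffer out of the picture. A minimal sketch of doing that with twisted.logger (the choice of stdout as the destination is arbitrary):

import sys
from twisted.logger import globalLogBeginner, textFileLogObserver

# Route log events to stdout so the default buffering observer is replaced.
globalLogBeginner.beginLoggingTo([textFileLogObserver(sys.stdout)])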

How to avoid high cpu usage?

I created a zmq_forwarder.py that is run separately and passes messages from the app to a SockJS connection. I'm currently working on how a Flask app could receive a message from SockJS via ZMQ. I'm pasting the contents of my zmq_forwarder.py below. I'm new to ZMQ and I don't know why, every time I run it, it uses 100% CPU.
import zmq

# Prepare our context and sockets
context = zmq.Context()
receiver_from_server = context.socket(zmq.PULL)
receiver_from_server.bind("tcp://*:5561")
forwarder_to_server = context.socket(zmq.PUSH)
forwarder_to_server.bind("tcp://*:5562")
receiver_from_websocket = context.socket(zmq.PULL)
receiver_from_websocket.bind("tcp://*:5563")
forwarder_to_websocket = context.socket(zmq.PUSH)
forwarder_to_websocket.bind("tcp://*:5564")

# Process messages from both sockets
# We prioritize traffic from the server
while True:
    # forward messages from the server
    while True:
        try:
            message = receiver_from_server.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print "Received from server: ", message
        forwarder_to_websocket.send_string(message)
    # forward messages from the websocket
    while True:
        try:
            message = receiver_from_websocket.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print "Received from websocket: ", message
        forwarder_to_server.send_string(message)
As you can see, I've set up 4 sockets. The app connects to port 5561 to push data to ZMQ, and to port 5562 to receive from ZMQ (although I'm still figuring out how to actually set it up to listen for messages sent by ZMQ). On the other hand, SockJS receives data from ZMQ on port 5564 and sends data to it on port 5563.
I've read that zmq.DONTWAIT makes receiving a message asynchronous and non-blocking, so I added it.
Is there a way to improve the code so that I don't overload the CPU? The goal is to be able to pass messages between the Flask app and the websocket using ZMQ.
You are polling your two receiver sockets in a tight loop, without any blocking (zmq.DONTWAIT), which will inevitably max out the CPU.
Note that there is some support in ZMQ for polling multiple sockets in a single thread - see this answer. I think you can adjust the timeout in poller.poll(millis) so that your code only uses lots of CPU if there are lots of incoming messages, and idles otherwise.
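A minimal sketch of applying that polling idea to the forwarder above (the port numbers are taken from the question; blocking in poll() lets the loop idle when there is no traffic):

import zmq

context = zmq.Context()
receiver_from_server = context.socket(zmq.PULL)
receiver_from_server.bind("tcp://*:5561")
forwarder_to_server = context.socket(zmq.PUSH)
forwarder_to_server.bind("tcp://*:5562")
receiver_from_websocket = context.socket(zmq.PULL)
receiver_from_websocket.bind("tcp://*:5563")
forwarder_to_websocket = context.socket(zmq.PUSH)
forwarder_to_websocket.bind("tcp://*:5564")

# Register both receivers with a single poller.
poller = zmq.Poller()
poller.register(receiver_from_server, zmq.POLLIN)
poller.register(receiver_from_websocket, zmq.POLLIN)

while True:
    ready = dict(poller.poll())  # blocks until at least one socket has data
    if receiver_from_server in ready:
        forwarder_to_websocket.send_string(receiver_from_server.recv_string())
    if receiver_from_websocket in ready:
        forwarder_to_server.send_string(receiver_from_websocket.recv_string())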
Your other option is to use the ZMQ event loop to respond to incoming messages asynchronously, using callbacks. See the PyZMQ documentation on this topic, from which the following "echo" example is adapted:
import zmq
from zmq.eventloop import ioloop
from zmq.eventloop.zmqstream import ZMQStream

ctx = zmq.Context()

# set up the socket, and a stream wrapped around the socket
s = ctx.socket(zmq.REP)
s.bind('tcp://localhost:12345')
stream = ZMQStream(s)

# Define a callback to handle incoming messages
def echo(msg):
    # in this case, just echo the message back again
    stream.send_multipart(msg)

# register the callback
stream.on_recv(echo)

# start the ioloop to start waiting for messages
ioloop.IOLoop.instance().start()

ZeroMQ: HWM on PUSH does not work

I am trying to write a server/client script with a server that vents the tasks, and multiple workers that execute them.
The problem is that my ventilator has so many tasks that it would fill up the memory in a heartbeat.
I tried to set the HWM before it binds, but with no success. It just keeps on sending messages as soon as a worker connects, completely disregarding the HWM that was set. I also have a sink that keeps a record of the tasks that were done.
server.py
import zmq

def ventilate():
    context = zmq.Context()
    # Socket to send messages on
    sender = context.socket(zmq.PUSH)
    sender.setsockopt(zmq.SNDHWM, 30)  # Big messages, so I don't want to keep too many in queue
    sender.bind("tcp://*:5557")
    # Socket with direct access to the sink: used to synchronize start of batch
    sink = context.socket(zmq.PUSH)
    sink.connect("tcp://localhost:5558")
    print "Sending tasks to workers…"
    # The first message is "0" and signals start of batch
    sink.send('0')
    print "Sent starting signal"
    while True:
        sender.send("Message")

if __name__ == "__main__":
    ventilate()
worker.py
import zmq
from multiprocessing import Process

def work():
    context = zmq.Context()
    # Socket to receive messages on
    receiver = context.socket(zmq.PULL)
    receiver.connect("tcp://localhost:5557")
    # Socket to send messages to
    sender = context.socket(zmq.PUSH)
    sender.connect("tcp://localhost:5558")
    # Process tasks forever
    while True:
        msg = receiver.recv()
        print "Doing sth with msg %s" % (msg)
        sender.send("Message %s done" % (msg))

if __name__ == "__main__":
    for worker in range(10):
        Process(target=work).start()
sink.py
import zmq

def sink():
    context = zmq.Context()
    # Socket to receive messages on
    receiver = context.socket(zmq.PULL)
    receiver.bind("tcp://*:5558")
    # Wait for start of batch
    s = receiver.recv()
    print "Received start signal"
    while True:
        msg = receiver.recv()
        print msg

if __name__ == "__main__":
    sink()
OK, I had a play around. I don't think the issue is with the PUSH HWM, but rather that you can't set an HWM for PULL. If you look at this documentation, you can see it says N/A for action on HWM.
The PULL sockets seem to be taking hundreds of messages each (and I did try setting a HWM just in case it did anything on the PULL socket. It didn't.). I evidenced this by changing the ventilator to send messages with an incrementing integer, and changing each worker in the pool to wait 2 seconds between calls to recv(). The workers print out that they are processing messages with vastly different integers. For instance, one worker will be working on message 10, while the next is working on message 400. As time goes on, you see the worker who was processing message 10, is now processing message 11, 12, 13, etc. while the other is processing 401, 402, etc.
This indicates to me that the ZMQ_PULL socket is buffering the messages somewhere. So while the ZMQ_PUSH socket does have a HWM, the PULL socket is requesting messages quickly, despite them not actually being accessed by a call to recv(). So that results in the PUSH HWM effectively being ignored if a PULL socket is connected. As far as I can see, you can't control the length of the buffer of the PULL socket (I would expect the RCVHWM socket option to control this but it doesn't appear to).
This behaviour of course begs the question: what is the point of the ZMQ_PUSH HWM option, which only makes sense to have if you can also control the receiving socket's HWM?
At this point, I'd start asking the 0MQ people whether you are missing something obvious, or if this is considered a bug.
Sorry I couldn't be more help!
ZeroMQ has buffers on both sending and receiving ends of a socket, hence you need to set high water marks on both the PUSH and the PULL socket in your code (and indeed before a bind() or connect()).
In the Python bindings this is now conveniently done via socket.hwm = 1 which will set both ZMQ_SNDHWM and ZMQ_RCVHWM in one go.
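A minimal sketch of that suggestion applied to the question's sockets (the value 30 and the port are taken from the question; both ends must be configured before bind()/connect()):

import zmq

context = zmq.Context()

# Ventilator side: limit how many outgoing messages are queued.
sender = context.socket(zmq.PUSH)
sender.hwm = 30                 # sets ZMQ_SNDHWM and ZMQ_RCVHWM together
sender.bind("tcp://*:5557")

# Worker side: without this, the PULL socket keeps buffering messages locally.
receiver = context.socket(zmq.PULL)
receiver.hwm = 30
receiver.connect("tcp://localhost:5557")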
