I'm running a Python transformation pipeline that pulls messages from GCP Pub/Sub, transforms them, and publishes them back to Pub/Sub.
I need to set up an HTTP health check. A healthy server is supposed to return code 200; if no results have been produced for 30 seconds, the server is supposed to return code 500.
How do I run an HTTP server in parallel with my pipeline?
Since my code runs synchronously, it doesn't behave the way I want: start() blocks forever, so the server never gets a chance to start.
from http.server import HTTPServer, BaseHTTPRequestHandler
import time

lastItemFinishedAt = time.time()

def start():
    global lastItemFinishedAt
    while True:
        # pull message from pub/sub
        # if got a message, then transform it and do this:
        lastItemFinishedAt = time.time()
        # if no messages found, then break the loop
    print('🏁 No more messages in the queue')

start()

class Serv(BaseHTTPRequestHandler):
    def do_GET(self):
        if time.time() - lastItemFinishedAt > 30:
            self.send_response(500)
        else:
            self.send_response(200)
        self.end_headers()

# never reached while start() is still looping
httpd = HTTPServer(('localhost', 8080), Serv)
httpd.serve_forever()
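One way to get the behaviour you describe, sketched as an assumption rather than a tested solution: run the health-check server in a daemon thread so it answers requests while start() keeps pulling messages. serve_health is a hypothetical helper name; Serv and start() are taken from the snippet above.

import threading

def serve_health():
    # hypothetical helper: serve the health check in the background
    httpd = HTTPServer(('localhost', 8080), Serv)
    httpd.serve_forever()

# daemon=True lets the process exit once the pipeline is done
threading.Thread(target=serve_health, daemon=True).start()
start()

Reading lastItemFinishedAt from the handler thread is safe enough here, since a single float assignment is atomic in CPython.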
I have two processes: the first creates the global distributed client; the second is a web scraper that should get the global client, submit tasks to it, and, when everything is done, send a message to another process telling it that it can proceed.
from dask.distributed import Client, as_completed, get_client
from multiprocessing import Process
from time import sleep
import zmq

def get(url) -> dict:
    # downloads data from url
    sleep(3)
    return data

def save(data) -> None:
    # saves data locally
    sleep(3)
    return None

def scraper(urls):
    # global client
    client = get_client()
    # zeromq socket
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.bind('tcp://*:port')
    while True:
        for future, result in as_completed([client.submit(get, url=url) for url in urls], with_results=True):
            save(data=result)
        socket.send_string('All job is done for this minute, proceed.')
        sleep(60)

if __name__ == '__main__':
    client = Client()
    s = Process(target=scraper, args=(urls,))
    s.start()
The problem is that from the scraper function I can get the global client (I see it correctly if I print it), but I can't submit any kind of task to it. The console doesn't print any error; it's just stuck, doing nothing. I think the cause is that the scraper function is running in a separate multiprocessing.Process.
Any solution or workaround? Thank you.
The dask client holds open connections to the scheduler. Depending on how your system creates new processes, you may get copies of the connections which point to nothing useful in the new process, or fail to transfer the client completely (it is not pickleable).
Instead, you should send the connection information to the child process
addr = client.scheduler_info()['address']
and in the target function do
client = Client(addr)
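Put together, a minimal sketch of the question's main block with that fix applied, assuming scraper is changed to take the scheduler address as an extra argument:

from dask.distributed import Client
from multiprocessing import Process

def scraper(urls, scheduler_addr):
    # connect to the existing scheduler by address instead of get_client()
    client = Client(scheduler_addr)
    # ... submit tasks exactly as before ...

if __name__ == '__main__':
    client = Client()
    addr = client.scheduler_info()['address']
    Process(target=scraper, args=(urls, addr)).start()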
I'm implementing a bi-directional ping-pong demo app between an Electron app and a Python backend.
This is the code for the Python part, which causes the problem:
import sys
import zerorpc
import time
from multiprocessing import Process

def ping_response():
    print("Sleeping")
    time.sleep(5)
    c = zerorpc.Client()
    c.connect("tcp://127.0.0.1:4243")
    print("sending pong")
    c.pong()

class Api(object):
    def echo(self, text):
        """echo any text"""
        return text

    def ping(self):
        p = Process(target=ping_response, args=())
        p.start()
        print("got ping")
        return

def parse_port():
    port = 4242
    try:
        port = int(sys.argv[1])
    except Exception:
        pass
    return '{}'.format(port)

def main():
    addr = 'tcp://127.0.0.1:' + parse_port()
    s = zerorpc.Server(Api())
    s.bind(addr)
    print('start running on {}'.format(addr))
    s.run()

if __name__ == '__main__':
    main()
Each time ping() is called from the JavaScript side, it starts a new process that simulates some work (sleeping for 5 seconds) and replies by calling pong on the Node.js server to indicate the work is done.
The issue is that the pong() request never reaches the JavaScript side. If instead of spawning a new process I create a new thread using _thread and execute the same code in ping_response(), the pong request arrives on the JavaScript side. Also, if I manually run the bash command zerorpc tcp://localhost:4243 pong, I can see that the pong request is received by the Node.js script, so the server on the JavaScript side works fine.
What happens to the zerorpc client when I create a new process, such that it never manages to send the request?
Thank you.
EDIT: It seems to get stuck in c.pong().
Try using gipc.start_process() from the gipc module (available via pip) instead of multiprocessing.Process(). It creates a fresh gevent context for the child, whereas plain multiprocessing would accidentally inherit the parent's gevent state.
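A minimal sketch of that substitution in the question's ping() method; gipc.start_process mirrors the Process constructor arguments and starts the child immediately, so no separate start() call is needed:

import gipc

class Api(object):
    def ping(self):
        # gipc gives the child a fresh gevent context; a plain
        # multiprocessing.Process would inherit the parent's gevent state
        gipc.start_process(target=ping_response, args=())
        print("got ping")
        return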
A simple gRPC server and client: the client sends an int and the server streams ints back.
The client reads the messages one by one, but the server runs the generator function immediately for all responses.
server code:
import test_pb2_grpc as pb_grpc
import test_pb2 as pb2
import time
import grpc
from concurrent import futures

class TestService(pb_grpc.TestServicer):

    def Produce(self, request, context):
        for i in range(request.val):
            print("request came")
            rs = pb2.Rs()
            rs.st = i + 1
            yield rs

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    pb_grpc.add_TestServicer_to_server(TestService(), server)
    server.add_insecure_port('[::]:50051')
    print("service started")
    server.start()
    try:
        while True:
            time.sleep(3600)
    except KeyboardInterrupt:
        server.stop(0)

if __name__ == '__main__':
    serve()
client code:
import grpc
import time
import test_pb2_grpc as pb_grpc
import test_pb2 as pb

def test():
    channel = grpc.insecure_channel(
        '{host}:{port}'.format(host="localhost", port=50051))
    stub = pb_grpc.TestStub(channel=channel)
    req = pb.Rq()
    req.val = 20
    for s in stub.Produce(req):
        print(s.st)
        time.sleep(10)

test()
proto file:
syntax = "proto3";
service Test {
rpc Produce (Rq) returns (stream Rs);
}
message Rq{
int32 val = 1;
}
message Rs{
int32 st = 1;
}
After starting the server, when I run the client, the server-side generator starts running and completes immediately, looping through the whole range.
What I expected is that it would run one step at a time as the client consumes messages, but that is not the case.
Is this expected behaviour? My client is still printing the values, but the server has already completed the function.
Yes, this behavior is expected. gRPC features flow control between the two sides of an RPC (so that generating messages too fast on one side won't exhaust memory on the other side) but there's also an allowance for a small amount of buffering (so that a reasonably small amount of data may be sent by one side before the other side explicitly asks for it). In your case the twenty messages sent from server to client all fit within this small allowance. The service-side gRPC Python runtime is calling your service-side Produce method, consuming its entire output of twenty messages, and sending all those messages across the network to your client, where they are locally held by the invocation-side gRPC Python runtime until your invocation-side test function asks for them.
If you want to see the effects of flow control in action, try using huge messages (one megabyte in size or so) or altering the size of the allowance (I think this is done with a channel argument but those are an advanced and relatively-unsupported feature so this is left as an exercise).
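For the huge-message experiment, a minimal sketch of a modified Produce, assuming a hypothetical bytes payload = 2; field is added to the Rs message and the stubs are regenerated:

class TestService(pb_grpc.TestServicer):

    def Produce(self, request, context):
        for i in range(request.val):
            rs = pb2.Rs()
            rs.st = i + 1
            # hypothetical field: ~1 MB per message, enough to exceed
            # the buffering allowance and make flow control visible
            rs.payload = b'\x00' * (1024 * 1024)
            print("yielding message", i + 1)
            yield rs

With the client still sleeping between reads, the server's prints should then appear gradually instead of all at once.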
I use this Python HTTP web server script:
class PiFaceWebHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        [....]

if __name__ == "__main__":
    # get the port
    if len(sys.argv) > 1:
        port = int(sys.argv[1])
    else:
        port = DEFAULT_PORT
    # set up PiFace Digital
    PiFaceWebHandler.pifacedigital = pifacedigitalio.PiFaceDigital()
    print("Starting simple PiFace web control at:\n\n"
          "\thttp://{addr}:{port}\n\n"
          "Change the output_port with:\n\n"
          "\thttp://{addr}:{port}?output_port=0xAA\n"
          .format(addr=get_my_ip(), port=port))
    # run the server
    server_address = ('', port)
    try:
        httpd = http.server.HTTPServer(server_address, PiFaceWebHandler)
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C received, shutting down server')
        httpd.socket.close()
It's working fine, but I want the script (or another one) to check some I/O continuously, in a while loop.
Currently the I/O changes state on an HTTP request, but I can't find a way to change it on an external trigger as well (another input, for example).
How can I do that? Where should I put the polling loop?
Am I making myself clear?
Thanks,
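One common approach, sketched here as an untested assumption: run the polling loop in a daemon thread next to serve_forever(), sharing state through the handler class the same way the script already does for pifacedigital. poll_inputs is a hypothetical helper name.

import threading
import time

def poll_inputs():
    pifacedigital = PiFaceWebHandler.pifacedigital
    while True:
        # hypothetical polling body: read the inputs here and change
        # outputs when an external trigger is detected
        time.sleep(0.1)

threading.Thread(target=poll_inputs, daemon=True).start()
httpd.serve_forever()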
Is there a way to trigger the websocket send() command based on an external event? I am trying to push to the client every time a database is updated. I've tried using SQL notify, a uwsgi file-monitor decorator, etc. The basic code is
import json

from flask import Flask
from flask.ext.uwsgi_websocket import GeventWebSocket
from uwsgidecorators import *

app = Flask(__name__)
ws = GeventWebSocket(app)

@ws.route('/feed')
def socket(ws):
    ws_readystate = ws.receive()
    if ws_readystate == '1':
        ws.send(json.dumps('this message is received just fine'))
        # client is ready, start something to trigger sending a message here
        @filemon("../mydb.sqlite")
        def db_changed(x):
            print('DB changed')
            ws.send(json.dumps('db changed'))
This prints "DB changed" in the output, but the client won't receive the 'db changed' message. I'm running the app as
uwsgi --master --http :5000 --http-websockets --gevent 2 --wsgi my_app_name:app
gevent queues are a great way to manage such patterns.
Here is an example you can adapt to your situation:
from uwsgidecorators import *
from gevent.queue import Queue

channels = []

@filemon('/tmp', target='workers')
def trigger_event(signum):
    for channel in channels:
        try:
            channel.put_nowait(True)
        except:
            pass

def application(e, sr):
    sr('200 OK', [('Content-Type', 'text/html')])
    yield "Hello and wait..."
    q = Queue()
    channels.append(q)
    # block this greenlet until the file-monitor signal fires
    q.get()
    yield "event received, goodbye"
    channels.remove(q)
If you do not plan to use multiple processes, feel free to remove target='workers' from the filemon decorator (this special target raises the uwsgi signal in all of the workers instead of only the first available one).
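A hedged adaptation of this queue pattern back to the question's GeventWebSocket handler could look like the following untested sketch (ws, app, and json are assumed to be set up as in the question):

from gevent.queue import Queue
from uwsgidecorators import filemon

channels = []

@filemon("../mydb.sqlite", target='workers')
def db_changed(signum):
    # runs on the uwsgi signal: just wake up any waiting handlers
    for channel in channels:
        try:
            channel.put_nowait(True)
        except Exception:
            pass

@ws.route('/feed')
def socket(ws):
    if ws.receive() == '1':
        ws.send(json.dumps('this message is received just fine'))
        q = Queue()
        channels.append(q)
        try:
            q.get()  # block this greenlet until the file monitor fires
            ws.send(json.dumps('db changed'))
        finally:
            channels.remove(q)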