Distributed lock manager for Python

I have a bunch of servers with multiple instances accessing a resource that has a hard limit on requests per second.
I need a mechanism to lock the access on this resource for all servers and instances that are running.
There is a restful distributed lock manager I found on github: https://github.com/thefab/restful-distributed-lock-manager
Unfortunately there seems to be a minimum lock time of 1 second, and it's relatively unreliable: in several tests it took between 1 and 3 seconds to unlock a 1-second lock.
Is there something well tested with a python interface I can use for this purpose?
Edit: I need something that auto unlocks in under 1 second. The lock will never be released in my code.

My first idea was to use Redis. But there are other great tools, and some are even lighter, so my solution builds on zmq. As a result you do not have to run Redis; it is enough to run a small Python script.
Requirements Review
Let me review your requirements before describing a solution.
limit the number of requests to some resource to a given number of requests within a fixed period of time
auto unlocking
(auto) unlocking of the resource shall happen in less than 1 second
it shall be distributed. I will assume you mean that multiple distributed servers consuming some resource shall be served, and that it is fine to have just one locker service (more on this under Conclusions)
Concept
Limit the number of requests within a timeslot
A timeslot can be a second, several seconds, or a shorter time. The only limitation is the precision of time measurement in Python.
If your resource has a hard limit defined per second, you shall use a timeslot of 1.0.
Monitor the number of requests per timeslot until the next one starts
With the first request for access to your resource, set up the start time of the next timeslot and initialize the request counter.
With each request, increase the request counter (for the current timeslot) and allow the request unless you have reached the maximum number of allowed requests in the current timeslot.
Serve using zmq with REQ/REP
Your consuming servers could be spread across multiple computers. To provide access to the LockerServer, you will use zmq.
Sample code
zmqlocker.py:
import time
import zmq


class Locker():

    def __init__(self, max_requests=1, in_seconds=1.0):
        self.max_requests = max_requests
        self.in_seconds = in_seconds
        self.requests = 0
        now = time.time()
        self.next_slot = now + in_seconds

    def __iter__(self):
        return self

    def next(self):
        # answer "go" while the current timeslot still has capacity, otherwise
        # "sorry"; start a new timeslot as soon as the old one has expired
        now = time.time()
        if now > self.next_slot:
            self.requests = 0
            self.next_slot = now + self.in_seconds
        if self.requests < self.max_requests:
            self.requests += 1
            return "go"
        else:
            return "sorry"


class LockerServer():

    def __init__(self, max_requests=1, in_seconds=1.0, url="tcp://*:7777"):
        locker = Locker(max_requests, in_seconds)
        cnt = zmq.Context()
        sck = cnt.socket(zmq.REP)
        sck.bind(url)
        while True:  # serve forever (the constructor never returns)
            msg = sck.recv()
            sck.send(locker.next())


class LockerClient():

    def __init__(self, url="tcp://localhost:7777"):
        cnt = zmq.Context()
        self.sck = cnt.socket(zmq.REQ)
        self.sck.connect(url)

    def next(self):
        self.sck.send("let me go")
        return self.sck.recv()
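Note that the snippets above are Python 2. Under Python 3, pyzmq sockets exchange bytes and the iterator hook is __next__, so a Python 3 client would look roughly like this (a sketch only, using pyzmq's send_string/recv_string helpers):

# Python 3 variant of LockerClient (sketch)
import zmq

class LockerClient:
    def __init__(self, url="tcp://localhost:7777"):
        ctx = zmq.Context()
        self.sck = ctx.socket(zmq.REQ)
        self.sck.connect(url)

    def next(self):
        self.sck.send_string("let me go")
        return self.sck.recv_string()   # "go" or "sorry"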
Run your server:
run_server.py:
from zmqlocker import LockerServer
svr = LockerServer(max_requests=5, in_seconds=0.8)
From command line:
$ python run_server.py
This will start serving the locker service on the default port 7777 on localhost.
Run your clients
run_client.py:
from zmqlocker import LockerClient
import time

locker_cli = LockerClient()
for i in xrange(100):
    print time.time(), locker_cli.next()
    time.sleep(0.1)
From command line:
$ python run_client.py
You should see "go", "go", "sorry", ... responses printed.
Try running more clients.
A bit of stress testing
You may start the clients first and the server later on. The clients will block until the server is up, and will then happily run.
Conclusions
the described requirements are fulfilled:
the number of requests is limited
there is no need to unlock; more requests are allowed as soon as the next timeslot starts
the LockerServer is available over the network or local sockets
it shall be reliable; zmq is a mature solution and the Python code is rather simple
it does not require time synchronization across all participants
performance will be very good
On the other hand, you may find that the limits of your resource are not as predictable as you assume, so be prepared to play with the parameters to find the proper balance, and always be prepared for exceptions from that side.
There is also some room for optimizing how "locks" are provided - e.g. if the locker runs out of allowed requests but the current timeslot is almost over, you might consider holding the "sorry" for a moment and, after a fraction of a second, providing a "go".
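A rough sketch of that optimization (my own variation on the Locker class above; the max_wait parameter is something I introduce here, not part of the original code):

import time

class WaitingLocker(Locker):
    """Locker variant that waits out an almost-finished timeslot instead of refusing."""

    def __init__(self, max_requests=1, in_seconds=1.0, max_wait=0.1):
        Locker.__init__(self, max_requests, in_seconds)
        self.max_wait = max_wait        # how long we are willing to stall a "sorry"

    def next(self):
        answer = Locker.next(self)
        remaining = self.next_slot - time.time()
        if answer == "sorry" and 0 < remaining <= self.max_wait:
            time.sleep(remaining + 0.001)   # let the current timeslot expire
            answer = Locker.next(self)      # retry in the fresh slot
        return answer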
Extending it to a real distributed lock manager
By "distributed" we might also understand multiple locker servers running together. This is more difficult to do, but is also possible. zmq allows very easy connection to multiple urls, so clients could really easily connect to multiple locker servers. There is a question, how to coordinate locker servers not to allow too many request to your resource. zmq allows inter-server communication. One model could be, that each locker server would publish each provided "go" on PUB/SUB. All other locker servers would be subscribed, and used each "go" to increase their local request counter (with a bit modified logic).

The lowest-effort way to implement this is to use lockable.
It offers low-level lock semantics and it comes with a Python client. Importantly, you don't need to set up any database or server; it works by storing the lock on the lockable servers.
Locks have variable TTLs, but you can also release them early:
$ pip install lockable-dev
from lockable import Lock
my_lock = Lock('my-lock-name')
# acquire the lock
my_lock.acquire()
# release the lock
my_lock.release()

For my cluster I'm using ZooKeeper with the python-kazoo library for queues and locks.
Modified example from the kazoo API documentation for your purpose:
http://kazoo.readthedocs.org/en/latest/api/recipe/lock.html
from kazoo.client import KazooClient

zk = KazooClient()
zk.start()

lock = zk.Lock("/lockpath", "my-identifier")
if lock.acquire(timeout=1):
    # code here
    lock.release()
But as I remember, you need at least three nodes for ZooKeeper.
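If I read the kazoo docs correctly, the same Lock recipe can also be used as a context manager, which blocks until the lock is acquired and releases it even if the block raises; a brief sketch:

from kazoo.client import KazooClient

zk = KazooClient()
zk.start()

# blocks until the lock is acquired, releases it when the block exits
with zk.Lock("/lockpath", "my-identifier"):
    pass  # access the rate-limited resource here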

Your requirements seem very specific. I'd consider writing a simple lock server, then implementing the locks client-side with a class that acquires a lock when it is created and deletes the lock when it goes out of scope.
class Lock(object):
    def __init__(self, resource):
        print "Lock acquired for", resource
        # Connect to lock server and acquire resource

    def __del__(self):
        print "Lock released"
        # Connect to lock server and unlock resource if locked


def callWithLock(resource, call, *args, **kwargs):
    lock = Lock(resource)
    return call(*args, **kwargs)


def test(asdf, something="Else"):
    return asdf + " " + something


if __name__ == "__main__":
    import sys
    print "Calling test:", callWithLock("resource.test", test, sys.argv[0])
Sample output
$ python locktest.py
Calling test: Lock acquired for resource.test
Lock released
locktest.py Else
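Relying on __del__ works in CPython because of reference counting, but it is not guaranteed on other interpreters. A context manager makes the release explicit and exception-safe; a minimal sketch of that variation (my own, with the lock-server calls still left as comments):

class ScopedLock(object):
    def __init__(self, resource):
        self.resource = resource

    def __enter__(self):
        print "Lock acquired for", self.resource
        # Connect to lock server and acquire resource
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print "Lock released"
        # Connect to lock server and unlock resource
        return False  # do not swallow exceptions


def callWithLock(resource, call, *args, **kwargs):
    with ScopedLock(resource):
        return call(*args, **kwargs)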

The distributed lock manager Taooka http://taooka.com has TTL accuracy down to nanoseconds. But it only has a Golang client library.

Related

Python AIOHTTP.web server multiprocessing load-balancer?

I am currently developing a web app using the aiohttp module. I'm using:
aiohttp.web, asyncio, uvloop, aiohttp_session, aiohttp_security, aiomysql, and aioredis
I have run some benchmarks against it and, while they're pretty good, I can't help but want more. I know that Python is, by nature, single-threaded. AIOHTTP uses async so as to be non-blocking, but am I correct in assuming that it is not utilizing all CPU cores?
My idea: Run multiple instances of my aiohttp.web code via concurrent.futures in multiprocessing mode. Each process would serve the site on a different port. I would then put a load balancer in front of them. MySQL and Redis can be used to share state where necessary such as for sessions.
Question: Given a server with several CPU cores, will this result in the desired performance increase? If so, is there any specific pattern to pursue in order to avert problems? I can't think of anything these aio modules are doing that would require a single thread, though I could be wrong.
Note: this is not a subjective question as I've posed it. Either the module is currently bound to one thread/process or it isn't - it either can benefit from a multiprocessing setup + load balancer or it can't.
You're right, asyncio uses only one CPU (one event loop uses one thread and thus one CPU).
Whether your whole project is network or CPU bound is something I can't say.
You have to try.
You could use nginx or haproxy as load balancer.
You might even try to use no load balancer at all. I never tried this feature for load balancing, just as proof of concept for a fail-over system.
With newer kernels, multiple processes can listen on the same port (when using the SO_REUSEPORT option), and I guess it's the kernel that does the round robin.
Here is a link to an article comparing the performance of a typical nginx configuration with an nginx setup using the SO_REUSEPORT feature:
https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/
It seems SO_REUSEPORT distributes the CPU load rather evenly, but might increase the variation in response times. Not sure this is relevant in your setup, but I thought I'd let you know.
Added 2020-02-04:
My solution added 2019-12-09 works, but triggers a deprecation warning.
When I have more time to test it myself, I will post the improved solution here. For the time being you can find it at AIOHTTP - Application.make_handler(...) is deprecated - Adding Multiprocessing.
Added 2019-12-09:
Here is a small example of an HTTP server that can be started multiple times, listening on the same socket.
The kernel distributes the tasks between the processes. I never checked whether this is efficient or not, though.
reuseport.py:
import asyncio
import os
import socket
import time

from aiohttp import web


def mk_socket(host="127.0.0.1", port=8000, reuseport=False):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuseport:
        SO_REUSEPORT = 15  # value of SO_REUSEPORT on Linux
        sock.setsockopt(socket.SOL_SOCKET, SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


async def handle(request):
    name = request.match_info.get('name', "Anonymous")
    pid = os.getpid()
    text = "{:.2f}: Hello {}! Process {} is treating you\n".format(
        time.time(), name, pid)
    time.sleep(0.5)  # intentionally blocking sleep to simulate CPU load
    return web.Response(text=text)


if __name__ == '__main__':
    host = "127.0.0.1"
    port = 8000
    reuseport = True
    app = web.Application()
    sock = mk_socket(host, port, reuseport=reuseport)
    app.add_routes([web.get('/', handle),
                    web.get('/{name}', handle)])

    loop = asyncio.get_event_loop()
    coro = loop.create_server(
        protocol_factory=app.make_handler(),
        sock=sock,
    )
    srv = loop.run_until_complete(coro)
    loop.run_forever()
And one way to test it:
./reuseport.py & ./reuseport.py &
sleep 2 # sleep a little so servers are up
for n in 1 2 3 4 5 6 7 8 ; do wget -q http://localhost:8000/$n -O - & done
The output might look like:
1575887410.91: Hello 1! Process 12635 is treating you
1575887410.91: Hello 2! Process 12633 is treating you
1575887411.42: Hello 5! Process 12633 is treating you
1575887410.92: Hello 7! Process 12634 is treating you
1575887411.42: Hello 6! Process 12634 is treating you
1575887411.92: Hello 4! Process 12634 is treating you
1575887412.42: Hello 3! Process 12634 is treating you
1575887412.92: Hello 8! Process 12634 is treating you
I think it is better not to reinvent the wheel and to use one of the solutions proposed in the documentation:
https://docs.aiohttp.org/en/stable/deployment.html#nginx-supervisord

What is the most efficient way to run independent processes from the same application in Python

I have a script that ultimately executes two functions. It polls for data on a time interval (it runs as a daemon, and the data is retrieved from a shell command run on the local system) and, once it receives this data, will: 1) function 1 - write this data to a log file, and 2) function 2 - inspect the data and send an email IF that data meets certain criteria.
The logging will happen every time, but the alert may not. The issue is that, in cases where an alert needs to be sent, if the email connection stalls or takes a long time to connect to the server, it obviously causes the next polling of the data to stall (for an indeterminate amount of time, depending on the server), and in my case it is very important that the polling interval remains consistent (for analytics purposes).
What is the most efficient way, if any, to keep the email process working independently of the logging process while still operating within the same application and depending on the same data? I was considering creating a separate thread for the mailer, but that kind of seems like overkill in this case.
I'd rather not set a short timeout on the email connection, because I want to give the process some chance to connect to the server, while still allowing the logging to be written consistently on the given interval. Some code:
def send(self, msg_):
    """
    Send the alert message
    :param str msg_: the message to send
    """
    self.msg_ = msg_
    ar = alert.Alert()
    ar.send_message(msg_)

def monitor(self):
    """
    Post to the log file and
    send the alert message when
    applicable
    """
    read = r.SensorReading()
    msg_ = read.get_message()  # the data
    if msg_:  # if there is data in general...
        x = read.get_failed()  # store bad data
        msg_ += self.write_avg(read)
        msg_ += "==============================================="
        self.ctlog.update_templog(msg_)  # write general data to log
        if x:
            self.send(x)  # if bad data, send...
This is exactly the kind of case you want to use threading/subprocesses for. Fork off a thread for the email, which times out after a while, and keep your daemon running normally.
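A rough sketch of that idea (my own illustration; send_alert and poll_once are hypothetical stand-ins for the question's alert / SensorReading objects):

import threading
import time

def send_alert(msg):
    time.sleep(5)                       # stand-in for a slow/stalled SMTP connection
    print("alert sent: %s" % msg)

def poll_once():
    msg = "sensor reading"              # stand-in for r.SensorReading().get_message()
    print("%.2f logged: %s" % (time.time(), msg))   # logging happens on every poll
    bad = msg                           # pretend the data failed a check
    if bad:
        # hand the (possibly slow) email off to a daemon thread so the
        # next poll is not delayed by a stalled connection
        t = threading.Thread(target=send_alert, args=(bad,))
        t.daemon = True
        t.start()

if __name__ == "__main__":
    for _ in range(3):
        poll_once()
        time.sleep(1)                   # the polling interval stays consistent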
Possible approaches that come to mind:
Multiprocessing
Multithreading
Parallel Python
My personal choice would be multiprocessing as you clearly mentioned independent processes; you wouldn't want a crashing thread to interrupt the other function.
You may also refer to this before making your design choice: Multiprocessing vs Threading Python
Thanks everyone for the responses. It helped very much. I went with threading, but also updated the code to make sure it handled failing threads. I ran some regressions and found that the subsequent processes were no longer being interrupted by stalled connections and the log was being updated on a consistent schedule. Thanks again!!

Hanging threads using async requests with connectionpool

For a project we need to request data through an API (HTTP/1.1). Depending on what you find, you can then send a request.post with instructions, after which the API sends back a response. I made the program multithreaded so that the main program keeps requesting data, while I spawn a thread whenever I want to post an instruction (requesting data takes only 1 second, whereas posting an instruction and getting the reply can take up to 3 seconds).
However, the problem I am running into is that sometimes one of my threads hangs and only finishes if I issue thread.join(). I can see that it hangs because the data I get in the main thread should reflect my previous instructions (sent by the threads); I allow a 5-second period for the server to reflect the instructions I sent earlier, so it is not the case that the server is not yet updated. If I then send the same instructions again, I find that both sets of instructions make it to the server (the hanging one and the newly issued one). So somehow sending the new instructions has the side effect that the previous instructions also get sent.
The problem looks related to threading, as my code doesn't hang when executed serially. Looking at posts like this didn't help, as I do not know in advance what my instructions need to be for my asynchronous requests. It is important that I make use of persistent connections and reuse them, as that saves a lot of time on handshakes etc.
My questions:
What is a proper way of handling a connection pool of persistent connections in a multithreaded way (so it doesn't hang)?
How can I debug/troubleshoot the thread to find out where it hangs?
Requests is often recommended as a package, but maybe there are others better suited for this kind of application?
Example Code:
import requests
from threading import Thread

req = requests.Session()
adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10)
req.mount('', adapter)
url = 'http://python-requests.org'


def main(url=''):
    thread_list = []
    counter = 0
    while True:
        resp = req.get(url)
        interesting = 1
        if interesting:
            instructions = {}
            a = Thread(target=send_instructions,
                       kwargs=dict(url=url, instructions=instructions))
            a.start()
            thread_list.append(a)
        tmp = []
        for x in thread_list:
            if x.isAlive():
                tmp.append(x)
        thread_list = tmp
        if counter > 10:
            break
        counter += 1


def send_instructions(url='', instructions=''):
    resp = req.post(url, headers=instructions)
    print(resp)


main(url)

How to abort context.socket.recv() the right way in ZeroMQ?

I have a small piece of software in which a separate thread waits for ZeroMQ messages. I am using the PUB/SUB communication pattern of ZeroMQ.
Currently I am aborting that thread by setting a variable "cont_loop" to False.
But I discovered that when no messages arrive at the ZeroMQ subscriber, I cannot exit the thread (without taking down the whole program).
import zmq
from threading import Thread


class SubscriberThread(Thread):  # class name assumed; not shown in the question

    def __init__(self):
        Thread.__init__(self)
        self.cont_loop = True

    def abort(self):
        self.cont_loop = False

    def run(self):
        zmq_context = zmq.Context()
        zmq_socket = zmq_context.socket(zmq.SUB)
        zmq_socket.bind("tcp://*:%s" % 5556)
        zmq_socket.setsockopt(zmq.SUBSCRIBE, "")
        while self.cont_loop:
            data = zmq_socket.recv()
            print "Message: " + data
        zmq_socket.close()
        zmq_context.term()
        print "exit"
I tried moving socket.close() and context.term() into the abort() method, so that it shuts down the subscriber, but this killed the whole program.
What is the correct way to shut down the above program?
Q: What is the correct way to ... ?
A: There are many ways to achieve the set goal. Let me pick just one, as a mock-up example on how to handle distributed process-to-process messaging.
First. Assume, there are more priorities in typical software design task. Some higher, some lower, some even so low, that one can defer an execution of these low-priority sub-tasks, so that there remains more time in the scheduler, to execute those sub-tasks, that cannot handle waiting.
This said, let's view your code. The SUB-side .recv() instruction, as used here, does two things. One is visible - it performs a RECEIVE operation on a ZeroMQ socket with SUB behaviour. The second, less visible one is that it remains hanging until it gets something "compatible" with the current state of the SUB behaviour ( more on setting this later ).
This means it also BLOCKS from the moment of such a .recv() method call UNTIL some unknown, locally uncontrollable coincidence of states/events delivers a ZeroMQ message whose content is "compatible" with the locally pre-set state of this (still blocking) SUB-behaviour instance.
That may take ages.
This is exactly why .recv() is rather used inside a control-loop, where external handling gets both the chance & the responsibility to do what you want ( including abort-related operations & a fair / graceful termination with proper resources' release(s) ).
The receive step becomes .recv( flags = zmq.NOBLOCK ) inside a try: except: block. This way your local process does not lose its control over the stream of events ( incl. the NOP being one such ).
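A minimal sketch of that control loop (my own illustration of the advice above; the 5-second deadline stands in for checking the real abort flag):

import time
import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5556")
socket.setsockopt_string(zmq.SUBSCRIBE, u"")

deadline = time.time() + 5          # stand-in for checking self.cont_loop
while time.time() < deadline:
    try:
        data = socket.recv(flags=zmq.NOBLOCK)   # returns immediately
        print("Message: %r" % data)
    except zmq.Again:
        time.sleep(0.01)            # nothing arrived; yield, then re-check the flag

socket.close()
context.term()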
The best next step?
Take your time and get through a great book of gems, "Code Connected, Volume 1", which Pieter HINTJENS, co-father of ZeroMQ, has published ( also as a PDF ).
The many thoughts & errors-to-be-avoided he has shared with us are indeed worth your time.
Enjoy the powers of ZeroMQ. It's very powerful & worth getting mastered top-down.

Call .exe from windows system service python?

I have a Windows system service that I am trying to write. I'm trying to write an interface for a POS machine, so ideally I would like to include this code inside the system service. However, some experimentation has led me to believe that the Windows system service will only execute basic tasks and not additional iterations.
I have another function that I need to call every x seconds; this additional function is a while loop, but I cannot get my function and the win32 wait-for-stop loop to play nicely together. I go into greater detail in my code below.
import win32service
import win32serviceutil
import win32event


class PySvc(win32serviceutil.ServiceFramework):
    # net name
    _svc_name_ = "test"
    _svc_display_name_ = "test"
    _svc_description_ = "Protects your computer."

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        # create an event to listen for stop requests on
        self.hWaitStop = win32event.CreateEvent(None, 0, 0, None)

    # core logic of the service
    def SvcDoRun(self):
        rc = None
        # if the stop event hasn't been fired keep looping
        while rc != win32event.WAIT_OBJECT_0:
            # block for 60 seconds and listen for a stop event
            rc = win32event.WaitForSingleObject(self.hWaitStop, 60000)
            ## I want to put an additional function that uses a while loop here.
            ## The service will not work correctly with additional iterations, inside or
            ## the above api calls.
            ## Due to the nature of the service and the api call above,
            ## this leads me to have to compile an additional .exe and somehow call that
            ## from the service.

    # called when we're being shut down
    def SvcStop(self):
        # tell the SCM we're shutting down
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        # fire the stop event
        win32event.SetEvent(self.hWaitStop)


if __name__ == '__main__':
    win32serviceutil.HandleCommandLine(PySvc)
My research has shown me that I need to somehow call a .exe from a Windows system service. Does anyone know how to do this? I have tried using os.system and various calls to the subprocess module to no avail; it seems that Windows simply ignores them. Any ideas?
EDIT: revert to original question
Can't say I'm familiar with Windows development, but on *nix I've found sockets very useful in situations where two things shouldn't be able to talk by definition but you need them to anyway, e.g. making web browsers launch desktop apps, making the clipboard interact with the browser, etc.
In most cases UDP sockets are all you need for a little IPC, and they are trivial to code for in Python. You do have to be extra careful, though; often restrictions are there for a good reason, and you need to really understand a rule before you go breaking it... Bear in mind anyone can send a UDP packet, so make sure the receiving app only accepts packets from localhost, and make sure you sanity-check all incoming packets to protect against local hackers/malware. If the data transmitted is particularly sensitive or the action initiated is powerful, it may not be a good idea at all; only you know your app well enough to say.
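A minimal sketch of that kind of localhost-only UDP IPC (my own illustration; the port number and the "run-task" payload are arbitrary):

import socket

PORT = 50007  # arbitrary localhost-only port

def serve():
    """Receiver side, e.g. inside the long-running service."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", PORT))         # listen on localhost only
    while True:
        data, (host, _) = sock.recvfrom(1024)
        if host != "127.0.0.1":
            continue                       # ignore anything not from localhost
        if data == b"run-task":            # sanity-check the payload
            print("would launch the helper .exe here")

def notify():
    """Sender side, e.g. the helper .exe or another local process."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"run-task", ("127.0.0.1", PORT))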
