Workaround for exiting threads in Python

I'm trying to write a server program and I have a thread for listening for new clients:
from threading import Thread

class ClientFinder(Thread):
    def __init__(self, port):
        Thread.__init__(self)
        self._continue = True
        self._port = port
        # try to create socket

    def run(self):
        # listen for new clients
        while self._continue:
            # add new clients
            pass

    def stop(self):
        # stop client
        self._continue = False

client_finder = ClientFinder(8000)
client_finder.start()
client_finder.stop()
client_finder.join()
I can't join client_finder because it never ends. Calling stop() only takes effect after the next client is accepted, so if no client ever connects, the program hangs forever.
1) Is it okay for my program to just end even if I haven't joined all my threads (such as by removing the join)? Or is this lazy/bad practice?
2) If it is a problem, what's the solution/best practice to avoid this? From what I've found so far, there's no way to force a thread to stop.

Whether waiting for the current clients to finish is a problem is really your choice. It may be a good idea, or you may prefer to kill connections.
Waiting for a new client is probably worse, since a new client may never arrive. An easy solution is a reasonable timeout on the listening socket: say, if nobody connects within 5 s, you go back to the loop and check the flag. That's short enough for a typical shutdown, but long enough that the rechecking shouldn't affect your CPU usage.
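For instance, a minimal sketch of that approach; the socket setup in __init__ is an assumption, since the original code elides it:

import socket
from threading import Thread

class ClientFinder(Thread):
    def __init__(self, port):
        Thread.__init__(self)
        self._continue = True
        self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._socket.bind(('', port))
        self._socket.listen(5)
        self._socket.settimeout(5)  # accept() gives up after 5 s instead of blocking forever

    def run(self):
        while self._continue:
            try:
                client, address = self._socket.accept()
            except socket.timeout:
                continue  # nobody connected; loop around and recheck the flag
            print('new client:', address)  # add the new client here

    def stop(self):
        self._continue = False

With this, stop() followed by join() returns within roughly one timeout interval.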
If you don't want to wait for a short timeout, you can add a pipe/socket between the thread doing shutdown and your ClientFinder and send a notification to shutdown. Instead of only waiting for a new client, you'd need to wait on both fds (I'm assuming ClientFinder uses sockets) and check which of them got a message.
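A sketch of that wakeup-pipe idea, assuming a plain TCP listening socket (the port is illustrative); select() returns as soon as either descriptor becomes readable:

import os
import select
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('', 8000))
listener.listen(5)

wake_r, wake_w = os.pipe()  # the thread doing shutdown keeps wake_w

def listen_loop():
    while True:
        readable, _, _ = select.select([listener, wake_r], [], [])
        if wake_r in readable:
            os.read(wake_r, 1)  # drain the wakeup byte
            break               # shutdown was requested
        if listener in readable:
            client, address = listener.accept()  # readable, so this won't block
            print('new client:', address)        # add the new client here

# the shutdown side wakes the listener with: os.write(wake_w, b'x')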

How to sniff a network interface with Twisted?

I need to receive raw packets from a network interface within Twisted code. The packets will not have the correct IP or MAC address, nor valid headers, so I need the raw thing.
I have tried looking into twisted.pair, but I was not able to figure out how to use it to get at the raw interface.
Normally, I would use scapy.all.sniff. However, that is blocking, so I can't just use it with Twisted. (I also cannot use scapy.all.sniff with a timeout and busy-loop, because I don't want to lose packets.)
A possible solution would be to run scapy.all.sniff in a thread and somehow call back into Twisted when I get a packet. This seems a bit inelegant (and also, I don't know how to do it because I am a Twisted beginner), but I might settle for that if I don't find anything better.
You could run a distributed system and pass the data through a central queuing system. Take the Unix philosophy: build small applications that each do one task and do it well. Create one application that sniffs the packets (you can use scapy here, since it won't matter if you block) and sends them to a queue (RabbitMQ, Redis, SQS, etc.), and have another application process the packets from the queue. This method should give you the least amount of headache.
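As a rough sketch of that split, assuming scapy and redis-py are installed (the Redis host and the list name 'sniffed-packets' are made up), the sniffer application could be as small as:

from scapy.all import sniff
import redis

queue = redis.Redis(host='localhost')

def enqueue(pkt):
    # push the raw packet bytes onto a Redis list acting as the queue
    queue.rpush('sniffed-packets', bytes(pkt))

# blocking here is fine: this process does nothing but sniff
sniff(prn=enqueue, store=False)

A second application then pops from the same list (for example with blpop) and does the actual processing.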
If you need to run everything in a single application, then threads/multiprocessing is the only option. But there are some design patterns you'll want to follow. You can also break up the following code into separate functions and use a dedicated queuing system.
from threading import Thread
from time import sleep

from twisted.internet import defer, reactor

class Sniffer(Thread):
    def __init__(self, _reactor, shared_queue):
        super().__init__()
        self.reactor = _reactor
        self.shared_queue = shared_queue

    def run(self):
        """
        Sniffer logic here
        """
        while True:
            self.reactor.callFromThread(self.shared_queue.put, 'hello world')
            sleep(5)

@defer.inlineCallbacks
def consume_from_queue(_id, _reactor, shared_queue):
    item = yield shared_queue.get()
    print(str(_id), item)
    _reactor.callLater(0, consume_from_queue, _id, _reactor, shared_queue)

def main():
    shared_queue = defer.DeferredQueue()

    sniffer = Sniffer(reactor, shared_queue)
    sniffer.daemon = True
    sniffer.start()

    workers = 4
    for i in range(workers):
        consume_from_queue(i+1, reactor, shared_queue)

    reactor.run()

main()
The Sniffer class starts outside of Twisted's control. Notice the sniffer.daemon = True: this is so the thread stops when the main thread stops. If it were set to False (the default), the application would exit only once all the threads had come to an end. Depending on the task at hand, that may not always be possible. If you can take breaks from sniffing to check a thread event, then you might be able to stop the thread in a safer way.
self.reactor.callFromThread(self.shared_queue.put, 'hello world') is necessary so that the item is put into the queue in the main reactor thread, as opposed to the thread the Sniffer executes in. The main benefit is that the messages coming from the threads get some form of synchronization (assuming you plan to scale up to sniffing multiple interfaces). Also, I wasn't sure whether DeferredQueue objects are thread safe :) so I treated them as if they were not.
Since Twisted isn't managing the threads in this case, it's vital that the developer does. Notice the worker loop and consume_from_queue(i+1, reactor, shared_queue): this loop ensures only the desired number of workers are handling tasks. Inside consume_from_queue(), shared_queue.get() waits (without blocking the reactor) until an item is put into the queue, prints the item, then schedules another consume_from_queue().

kill socket.accept() call on closed unix socket

socket.close() does not stop any blocking socket.accept() calls that are already running on that socket.
I have several threads in my Python program that do nothing but run a blocking socket.accept() call on a Unix domain socket that has already been closed.
I want to kill these threads by making the socket.accept() calls stop or raise an exception.
I am trying to do this by loading new code into the program, without stopping the program. Therefore, changing the code that spawned these threads or that closed the sockets is not an option.
Is there any way to do this?
This is similar to https://stackoverflow.com/a/10090348/3084431, but those solutions won't work for my code:
1. This point is not true: closing won't raise an exception on the accept. shutdown() does, but that can no longer be called once the socket is closed.
2. I can not connect to this socket anymore; the socket is closed.
3. The threads with the accept calls are already running, so I can't change them.
4. Same as 3.
For clarification, I have written some example code that has this problem.
This code works in both Python 2 and Python 3.
import socket
import threading
import time

address = "./socket.sock"

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(address)
sock.listen(5)

def accept():
    print(sock.accept())

t = threading.Thread(target=accept, name="acceptorthread")
t.start()

sock.close()
time.sleep(0.5)  # give the thread some time to register the closing
print(threading.enumerate())  # the acceptorthread will still be running
What I need is something that I can run after this code has finished that can stop the acceptor thread somehow.
There is no mechanism in the kernel to notify every listener that a socket is closed. You have to write something yourself. A simple solution is to use a timeout on the socket:
sock.settimeout(1)

def accept():
    while True:
        try:
            print(sock.accept())
        except socket.timeout:
            continue
        break
Now, when you close the socket, the next call to .accept() (after the pending timeout expires) will throw a "bad descriptor" exception.
Also remember that the sockets API in Python is not thread safe; wrapping every socket call with a lock (or another synchronization method) is advised in a multi-threaded environment.
A more advanced (and efficient) approach would be to wrap your socket in a select call. Note that the socket does not have to be in non-blocking mode in order to use select.
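For example, a sketch of the select variant (the 1-second timeout is illustrative; exactly which exception select raises once the descriptor disappears depends on how the socket was closed, so both are caught here):

import select
import socket

def accept_with_select(sock):
    while True:
        try:
            # wait up to 1 s for the socket to become readable
            readable, _, _ = select.select([sock], [], [], 1)
        except (OSError, ValueError):
            return None  # the socket was closed out from under us
        if readable:
            return sock.accept()  # readable, so this won't block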
"Therefore, changing the code that spawned these threads or that closed the sockets is not an option."
If that's the case, then you are doomed: without changing the code running in those threads, it is impossible to achieve. It's like asking "how can I fix my broken car without modifying the car". Won't happen, mate.
You should only call .accept() on a socket that a selector has reported as readable; then accept doesn't need to be interrupted. But to cope with spurious wakeups, you should have the listening socket in O_NONBLOCK mode anyway.
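A sketch of that pattern with the selectors module; the shutdown signalling is left out, the point is only that accept() runs when the selector reports the socket readable, and non-blocking mode turns a spurious wakeup into a BlockingIOError instead of a hang:

import selectors
import socket

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind("./socket.sock")
sock.listen(5)
sock.setblocking(False)  # the equivalent of O_NONBLOCK

sel = selectors.DefaultSelector()
sel.register(sock, selectors.EVENT_READ)

while True:
    for key, _ in sel.select(timeout=1):
        try:
            conn, address = key.fileobj.accept()
        except BlockingIOError:
            continue  # spurious wakeup; go around again
        print(conn, address)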

How to use threads for functional tests of client server application?

I have a client and a server module; each one can be started by a function. I just need to find a way to run both in parallel which:
- in case of an exception in the client/server, would stop the other so the test runner would not stay stuck
- in case of an exception in the client/server, would print the exception or propagate it to the runner so I could see it and debug the client/server using the test suite
- would preferably use threads, for performance reasons
My first attempt, with simple threads, ended with an ugly os._exit(1) when catching an exception in the run method of the thread (which kills the test runner...). Edit: that was with the threading package.
The second attempt (to avoid os._exit()) was with concurrent.futures.ThreadPoolExecutor. It lets me get the exception out of the thread, but I still can't find a way to abort the other thread.
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    server_future = executor.submit(server)
    client_future = executor.submit(client)
    concurrent.futures.wait([server_future, client_future],
                            return_when=concurrent.futures.FIRST_EXCEPTION)
    if client_future.done() and client_future.exception():
        # we can handle the client exception here,
        # but how to stop the server from waiting for the client?
        # also, raise is blocking
        pass
    if server_future.done() and server_future.exception():
        # same here
        pass
Is there a way to achieve this with threads?
If not with threads, is there a simple way to test a client server app at all? (I think the two first requirements are enough to have a usable solution)
Edit: the client or the server would be blocked on an accept() or a receive() call, so I can't periodically poll a flag and decide to exit (one of the classic methods to stop a thread).
You can use the threading package. Be aware, though, that force-killing a thread is not a good idea, as discussed here. There seems to be no official way to kill a Thread in Python, but you can follow one of the examples given in the linked post.
Now, you need to wait for one thread to exit before stopping the other one, so that your test runner doesn't get stuck. You can wrap your server/client launch in Threads and have your main thread wait for either the client or the server Thread to exit before killing the other one.
You can define your client/server Thread like this:
# Server thread (replace startServer() with startClient() for the client thread)
class testServerThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        # Do stuff if required

    def run(self):
        try:
            startServer()  # Or startClient() for your client thread
        except Exception as e:
            print(e)  # Print your exception here, so you can debug
Then, start both the client and the server thread, and wait for one of them to exit. Once one of them is no longer alive, you can kill the other and continue testing.
# Create and start client/server
serverThread = testServerThread()
clientThread = testClientThread()
serverThread.start()
clientThread.start()

# Wait at most 5 seconds for them to exit, and loop if they're still both alive
while serverThread.is_alive() and clientThread.is_alive():
    serverThread.join(5)
    clientThread.join(5)

# Either client or server exited. Kill the other one.
# Note: the kill function you'll have to define yourself, as said above
if serverThread.is_alive():
    serverThread.kill()
if clientThread.is_alive():
    clientThread.kill()

# Done! Your test runner can continue its work
The central piece of code is the join() function:
Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception –, or until the optional timeout occurs.
So in our case it will wait 5 seconds for the client and 5 seconds for the server, and if both of them are still alive afterward, it will loop again. Whenever one of them exits, the loop stops and the remaining thread is killed.

Stopping SocketServer with blocking handle

I'm using SocketServer.ThreadingMixIn, pretty much as in the docs.
Other than having extracted the clients to run in their own script, I've also redefined the handle method, as I want the connection to the client to stay alive and receive more messages:
def handle(self):
    try:
        while True:
            data = self.request.recv(1024)
            if not data:
                break  # Quits the thread if the client was disconnected
            else:
                cur_thread = threading.current_thread()
                print(cur_thread.name)
                self.request.send(data)
    except:
        pass
The problem is that even when I try to terminate the server with server.shutdown() or with a KeyboardInterrupt, it will still be blocked in handle() as long as a client keeps its socket open.
So how can I effectively stop the server even if there are still connected clients?
The best solution I found was to use SocketServer.ForkingMixIn instead of SocketServer.ThreadingMixIn.
This way the daemon behaviour actually works, even though using processes instead of threads was not exactly what I wanted.
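A minimal sketch of that swap, using the Python 2 module name from the question (socketserver in Python 3); the handler and port are illustrative:

import SocketServer  # 'socketserver' in Python 3

class EchoHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:
                break
            self.request.send(data)

# each client is handled in a forked child process instead of a thread
class ForkingServer(SocketServer.ForkingMixIn, SocketServer.TCPServer):
    pass

server = ForkingServer(('localhost', 9999), EchoHandler)
server.serve_forever()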

Make epoll return an fd once, without writing

I have a master and a worker thread. The master thread accepts incoming connections and reads from them once. It then calls epoll.register(sock). The worker does epoll.poll() and does further reading and processing of the incoming data.
The thing is: if the incoming data is very short, so that no more data arrives on the fd after the first read done in the master thread, the worker thread blocks forever in epoll.poll(). What it should do instead is wake up at least once and return the newly added file descriptor.
How can I do this?
My current approach:
Master:
worker.epoll.register(sock.fileno())
worker.forced_fds_to_handle.add(sock.fileno())
Worker:
while True:
    for fileno, event in self.epoll.poll(1):
        self.forced_fds_to_handle.discard(fileno)
        self._process(fileno, event)
    while self.forced_fds_to_handle:
        fileno = self.forced_fds_to_handle.pop()
        self._process(fileno, select.EPOLLIN)
What I don't like about this approach: in the worst case, the worker ignores the incoming fd for a whole second, which means delay for my clients. Of course I could make the timeout smaller, but then it would waste resources.
I'd really appreciate it if somebody knew something better.
I already tried:
In the master:
sock.write('')
... to trigger an EPOLLIN, but that didn't work.
The default timeout for epoll.poll() is -1 (block indefinitely), and as far as I remember, if you pass 0 it will not wait at all: it just checks whether there is some input or not.
That would solve your problem.
However, what I would worry about is a very CPU-consuming while True loop with no waiting/blocking/sleeping to relieve the pressure on the processor. If you find yourself in this situation, consider waiting/blocking/sleeping for 0.1 seconds per iteration; nobody would even notice! ;-) I promise... :-)
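A sketch of that suggestion; poll(0) never blocks, and the 0.1 s nap is the figure from the answer (process() stands in for whatever handler you have):

import select
import time

epoll = select.epoll()  # fds get registered elsewhere, as in the question

while True:
    for fileno, event in epoll.poll(0):  # timeout 0: check and return immediately
        process(fileno, event)           # hypothetical handler
    time.sleep(0.1)  # brief nap so the loop doesn't pin a core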
