I have a program that maintains a connection to a server with a periodic heartbeat. Every once in a while, the server stops responding to heartbeats and I have to reconnect. I implemented this with a timer that, if no response is heard after n seconds, will call reconnect. Every time this happens, I leak a thread and over time I eventually run out of threads.
Now, simplifying massively for an easy repro, this illustrates how reconnecting after a delay always causes an increase in threads. How can I kill the old threads/sockets/selects (which may be waiting on a recv)?
import socket
import select
import threading

class Connection():
    def tick(self):
        print(threading.active_count())  # this increases every 1s!
        # ... certain conditions not met / it's been too long, then:
        self.reconnect()

    def reconnect(self):
        self.socket.shutdown(socket.SHUT_WR)
        self.socket.close()
        self.timer.cancel()
        self.connect()

    def connect(self):
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((IP, TCP_PORT))
        self.timer = threading.Timer(1, self.tick)
        self.timer.start()
        r, _, _ = select.select([self.socket], [], [])

if __name__ == '__main__':
    Connection().connect()
I'm pretty sure it's not select() that leaks any threads. Let's assume the select() doesn't return, i.e. it blocks forever.
In that case
.tick() is called from the timer thread.
.tick() calls .reconnect() within the timer thread.
.reconnect() closes the existing socket. This causes the active select() call to fail with IOError "Bad file descriptor" (which is also why you should really fix your code).
.reconnect() tries to cancel the current timer.
This does nothing, since the timer already triggered (we are currently inside the timer function!).
.reconnect() calls .connect() and that one establishes a new timer and here we go again.
So the question is: where does this mode of operation hang on to the old timer object? Well, all your timer threads get terminated by an IOError from the select() call, and Python stores a per-thread reference to that exception.
My guess is that this prevents CPython's reference-counted cleanup from triggering, so the timer thread will only be cleaned up during garbage collection. This is unreliable, since there is no guarantee that the timer thread is ever cleaned up in time.
If you add import gc; gc.collect() at the start of .connect(), the problem seems to go away. But yeah, that's a non-solution.
Why don't you use the timeout parameter to select() to achieve a similar result without having to use a timer thread?
r = []
while not r:
    if self.socket:
        self.socket.shutdown(socket.SHUT_WR)
        self.socket.close()
    self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    self.socket.connect((IP, TCP_PORT))
    # select returns empty lists on timeout
    r, _, _ = select.select([self.socket], [], [], 1)
Don't forget to set self.socket = None in Connection.__init__() for this to work.
I have a simple code in python 3 using schedule and socket:
import schedule
import socket
from time import sleep

def readDataFromFile():
    data = []
    with open("/tmp/tmp.txt", "r") as f:
        for singleLine in f.readlines():
            data.append(str(singleLine))
    if(len(data)>0):
        writeToBuffer(data)

def readDataFromUDP():
    udpData = []
    rcvData, addr = sock.recvfrom(256)
    udpData.append(rcvData.decode('ascii'))
    if(len(udpData)>0):
        writeToBuffer(udpData)

.
.
.

def main():
    schedule.every().second.do(readDataFromFile)
    schedule.every().second.do(readDataFromUDP)
    while(1):
        schedule.run_pending()
        sleep(1)

UDP_IP = "192.xxx.xxx.xxx"
UDP_PORT = xxxx
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((UDP_IP, UDP_PORT))
main()
The problem is that the script hangs on the sock.recvfrom() instruction and waits until data comes.
How can I force Python to run this job independently? Would it be a better idea to run this in threads?
You can use threads here, and it'll work fine, but it will require a few changes. First, the scheduler on your background thread is going to try to kick off a new recvfrom every second, no matter how long the last one took. Second, since both threads are apparently trying to call the same writeToBuffer function, you're probably going to need a Lock or something else to synchronize them.
Rewriting the whole program around an asynchronous event loop is almost certainly overkill here.
Just changing the socket to be nonblocking and doing a hybrid is probably the simplest change, e.g., by using settimeout:
# wherever you create your socket
sock.settimeout(0.8)

# ...

def readDataFromUDP():
    udpData = []
    try:
        rcvData, addr = sock.recvfrom(256)
    except socket.timeout:
        return
    udpData.append(rcvData.decode('ascii'))
    if(len(udpData)>0):
        writeToBuffer(udpData)
Now, every time you call recvfrom, if there's data available, you'll handle it immediately; if not, it'll wait up to 0.8 seconds, and then raise an exception, which means you have no data to process, so go back and wait for the next loop. (There's nothing magical about that 0.8; I just figured something a little less than 1 second would be a good idea, so there's time left to do all the other work before the next schedule time hits.)
Under the covers, this works by setting the OS-level socket to non-blocking mode and doing some implementation-specific thing to wait with a timeout. You could do the same yourself by using setblocking(False) and using the select or selectors module to wait up to 0.8 seconds for the socket to be ready, but it's easier to just let Python take care of that for you.
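If you do want to do it yourself, a minimal Python 3 sketch of that selectors-based alternative might look like this. The helper name and the locally bound demo socket are my own assumptions, not part of the question:

```python
import selectors
import socket

# Non-blocking UDP socket polled with selectors instead of settimeout().
# Bound to an arbitrary free local port purely for demonstration.
sel = selectors.DefaultSelector()
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))
sock.setblocking(False)
sel.register(sock, selectors.EVENT_READ)

def read_datagram(timeout=0.8):
    """Return a decoded datagram, or None if nothing arrived in time."""
    if sel.select(timeout):   # returns an empty list on timeout
        rcvData, addr = sock.recvfrom(256)
        return rcvData.decode('ascii')
    return None

print(read_datagram(0.1))  # nobody sends in this demo, so prints None
```

The effect is the same as settimeout(0.8), just with the wait made explicit.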
I have this thread running :
def run(self):
    while 1:
        msg = self.connection.recv(1024).decode()
I wish I could end this thread when I close the Tkinter Window like this :
self.window.protocol('WM_DELETE_WINDOW', self.closeThreads)
def closeThreads(self):
    self.game.destroy()
    # End the thread
Can't use thread._close() because it is deprecated and python 3.4 does not allow it.
The only really satisfactory solution I've seen for this problem is not to allow your thread to block inside recv(). Instead, set the socket to non-blocking and have the thread block inside select() instead. The advantage of blocking inside select() is that you can tell select() to return when any one of several sockets becomes ready-for-read, which brings us to the next part: as part of setting up your thread, create a second socket (either a locally-connected TCP socket e.g. as provided by socketpair, or a UDP socket listening on a port for packets from localhost). When your main thread wants your networking thread to go away, your main thread should send a byte to that socket (or in the TCP case, the main thread could just close its end of the socket-pair). That will cause select() to return ready-for-read on that socket, and when your network thread realizes that the socket is marked ready-for-read, it should respond by exiting immediately.
The advantages of doing it that way are that it works well on all OS's, always reacts immediately (unlike a polling/timeout solution), takes up zero extra CPU cycles when the network is idle, and doesn't have any nasty side effects in multithreaded environments. The downside is that it uses up a couple of extra sockets, but that's usually not a big deal.
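The pattern described above can be sketched as follows. The class name is illustrative, and a second socketpair stands in for the real server connection so the example is self-contained:

```python
import socket
import select
import threading

class NetworkThread(threading.Thread):
    def __init__(self, data_sock, wake_sock):
        super().__init__()
        self.data_sock = data_sock
        self.wake_sock = wake_sock
        self.stopped = False

    def run(self):
        while True:
            # block until either socket is ready-for-read
            readable, _, _ = select.select(
                [self.data_sock, self.wake_sock], [], [])
            if self.wake_sock in readable:
                # main thread closed its end of the pair: exit now
                self.stopped = True
                return
            # otherwise: self.data_sock.recv(...) and handle the data

# data_a/data_b stand in for the real server connection
data_a, data_b = socket.socketpair()
wake_r, wake_w = socket.socketpair()

t = NetworkThread(data_a, wake_r)
t.start()
wake_w.close()     # signal the network thread to exit
t.join(timeout=5)
print(t.stopped)   # True
```

Closing wake_w makes wake_r readable (EOF), so select() returns immediately and the thread exits without ever polling.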
Two solutions:
1) Don't stop the thread, just allow it to die when the process exits with sys.exit()
2) Start the thread with a "die now" flag. The Event class is specifically designed to signal one thread from another.
The following example starts a thread which connects to a server. Any data is handled, and if the parent signals the thread to exit, it will. As an additional safety feature we have an alarm signal to kill everything, just in case something gets out of hand.
source
import signal, socket, threading

class MyThread(threading.Thread):
    def __init__(self, conn, event):
        super(MyThread,self).__init__()
        self.conn = conn
        self.event = event

    def handle_data(self):
        "process data if any"
        try:
            data = self.conn.recv(4096)
            if data:
                print 'data:',data,len(data)
        except socket.timeout:
            print '(timeout)'

    def run(self):
        self.conn.settimeout(1.0)
        # exit on signal from caller
        while not self.event.is_set():
            # handle any data; continue loop after 1 second
            self.handle_data()
        print 'got event; returning to caller'

sock = socket.create_connection( ('example.com', 80) )
event = threading.Event()

# connect to server and start connection handler
th = MyThread(conn=sock, event=event)
# watchdog: kill everything in 3 seconds
signal.alarm(3)
# after 2 seconds, tell data thread to exit
threading.Timer(2.0, event.set).start()
# start data thread and wait for it
th.start()
th.join()
output
(timeout)
(timeout)
got event; returning to caller
Level beginner. I have some confusion regarding the thread creation methods in Python. To be specific, is there any difference between the following two approaches?
In the first approach I import the thread module and later create a thread with thread.start_new_thread(myfunction, ()), as myfunction() doesn't take any args.
In the second approach I use from threading import Thread and later create threads by doing something like this: t = Thread(target=myfunction), then t.start().
The reason why I am asking is that my programme works fine with the second approach, but when I use the first approach it doesn't work as intended. I am working on a client-server programme. Thanks
The code is as below:
#!/usr/bin/env python
import socket
from threading import Thread
import thread

data = 'default'
tcpSocket = ''

def start_server():
    global tcpSocket
    tcpSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcpSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    tcpSocket.bind(('',1520))
    tcpSocket.listen(3)
    print "Server is up...."

def service():
    global tcpSocket
    (clientSocket,address) = tcpSocket.accept()
    print "Client connected with: ", address
    # data = 'default'
    send_data(clientSocket,"Server: This is server\n")
    global data
    while len(data):
        data = receive_data(clientSocket)
        send_data(clientSocket,"Client: "+data)
    print "Client exited....\nShutting the server"
    clientSocket.close()
    tcpSocket.close()

def send_data(socket,data):
    socket.send(data)

def receive_data(socket):
    global data
    data = socket.recv(2048)
    return data

start_server()
for i in range(2):
    t = Thread(target=service)
    t.start()
    #thread.start_new_thread(service,())
@immortal can you explain a bit more please? I didn't get it, sorry. How can the main thread die? It should start service() in my code, then the server waits for a client. I guess it should wait rather than die.
Your main thread calls:
start_server()
and that returns. Then your main thread executes this:
for i in range(2):
    t = Thread(target=service)
    t.start()
    #thread.start_new_thread(service,())
Those also complete almost instantly, and then your main thread ends.
At that point, the main thread is done. Python enters its interpreter shutdown code.
Part of the shutdown code is waiting to .join() all (non-daemon) threads created by the threading module. That's one of the reasons it's far better not to use thread unless you know exactly what you're doing. For example, if you're me ;-) But the only times I've ever used thread are in the implementation of threading, and to write test code for the thread module.
You're entirely on your own to manage all aspects of a thread module thread's life. Python's shutdown code doesn't wait for those threads. The interpreter simply exits, ignoring them completely, and the OS kills them off (well, that's really up to the OS, but on all major platforms I know of the OS does just kill them ungracefully in midstream).
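The difference the two paragraphs above describe can be made concrete with a small Python 3 sketch; a daemon thread is the closest threading-module analogue of a thread-module thread, in that the interpreter won't wait for it either (names here are arbitrary):

```python
import threading
import time

def worker():
    time.sleep(0.1)   # stand-in for real work

# threading.Thread is non-daemonic by default: Python's shutdown code
# join()s it, so its work finishes even if the main thread ends first.
t = threading.Thread(target=worker)

# daemon=True approximates a `thread` module thread: the interpreter
# exits without waiting for it, and the OS tears it down mid-stream.
d = threading.Thread(target=worker, daemon=True)

print(t.daemon, d.daemon)  # False True
```

So with thread.start_new_thread the service threads are abandoned at interpreter exit, exactly as if they were daemon threads.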
I want to connect to multiple telnet hosts using threading in python, but I stumbled about an issue I'm not able to solve.
Using the following code on MAC OS X Lion / Python 2.7
import threading,telnetlib,socket

class ReaderThread(threading.Thread):
    def __init__(self, ip, port):
        threading.Thread.__init__(self)
        self.ip = ip
        self.port = port
        self.telnet_con = telnetlib.Telnet()

    def run(self):
        try:
            print 'Start %s' % self.ip
            self.telnet_con.open(self.ip,self.port,30)
            print 'Done %s' % self.ip
        except socket.timeout:
            print 'Timeout in %s' % self.ip

    def join(self):
        self.telnet_con.close()

ta = []
t1 = ReaderThread('10.0.1.162',9999)
ta.append(t1)
t2 = ReaderThread('10.0.1.163',9999)
ta.append(t2)

for t in ta:
    t.start()

print 'Threads started\n'
In general it works, but one of the threads (not always the same one) takes a long time to connect (about 20 seconds, and sometimes it even runs into a timeout). During that awfully long connection time (on an all-local network), CPU load also goes up to 100%.
Even more strange is the fact that if I'm using only one thread in the array it always works flawlessly. So it must have something to do with the use of multiple threads.
I already added hostname entries for all IP addresses to avoid a DNS lookup issue. This didn't make a difference.
Thanks in advance for your help.
Best regards
senexi
OK, you have overridden join(), and you are not supposed to do that. The main thread calls join() on each thread when the main thread finishes, which is right after the last line in your code. Since your join() method returns before your telnet thread actually exits, Python gets confused and tries to call join() again, and this is what causes the 100% CPU usage. Try putting a 'print' statement in your join() method.
Your implementation of join() also tries to close the socket (probably while the other thread is still trying to open a connection), and this might be what is causing your telnet threads to never finish.
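A sketch of that fix: leave join() alone and give the thread a separate close() method the main thread calls after joining. To keep the example self-contained and runnable it uses a plain socket against a deliberately closed local port instead of telnetlib; the names and the error-capturing attribute are illustrative:

```python
import socket
import threading

class ReaderThread(threading.Thread):
    def __init__(self, ip, port):
        super(ReaderThread, self).__init__()
        self.ip = ip
        self.port = port
        self.sock = None
        self.error = None

    def run(self):
        try:
            # stand-in for telnet_con.open(); short timeout for the demo
            self.sock = socket.create_connection((self.ip, self.port),
                                                 timeout=1)
        except OSError as e:       # covers refusals and socket.timeout
            self.error = e

    def close(self):
        # called by the main thread *after* join(), never instead of it
        if self.sock is not None:
            self.sock.close()

t = ReaderThread("127.0.0.1", 9)   # nothing should be listening here
t.start()
t.join()    # the default Thread.join(), not an override
t.close()
```

Because join() is no longer overridden, the interpreter's implicit join at shutdown behaves normally and nothing races against the connection attempt.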
I have a queue that always needs to be ready to process items when they are added to it. The function that runs on each item in the queue creates and starts thread to execute the operation in the background so the program can go do other things.
However, the function I am calling on each item in the queue simply starts the thread and then completes execution, regardless of whether or not the thread it started completed. Because of this, the loop will move on to the next item in the queue before the program is done processing the last item.
Here is code to better demonstrate what I am trying to do:
queue = Queue.Queue()

def addTask():
    queue.put(SomeObject())

def worker():
    while True:
        try:
            # If an item is put onto the queue, immediately execute it (unless
            # an item on the queue is still being processed, in which case wait
            # for it to complete before moving on to the next item in the queue)
            item = queue.get()
            runTests(item)
            # I want to wait for 'runTests' to complete before moving past this point
        except Queue.Empty, err:
            # If the queue is empty, just keep running the loop until something
            # is put on top of it.
            pass

def runTests(args):
    op_thread = SomeThread(args)
    op_thread.start()
    # My problem is once this last line 'op_thread.start()' starts the thread,
    # the 'runTests' function completes operation, but the operation executed
    # by the thread is not yet done executing because it is still running in
    # the background. I do not want the 'runTests' function to actually complete
    # execution until the operation in op_thread is done executing.
    """op_thread.join()"""
    # I tried putting this line after 'op_thread.start()', but that did not solve anything.
    # I have commented it out because it is not necessary to demonstrate what
    # I am trying to do, but I just wanted to show that I tried it.

t = threading.Thread(target=worker)
t.start()
Some notes:
This is all running in a PyGTK application. Once the 'SomeThread' operation is complete, it sends a callback to the GUI to display the results of the operation.
I do not know how much this affects the issue I am having, but I thought it might be important.
A fundamental issue with Python threads is that you can't just kill them - they have to agree to die.
What you should do is:
Implement the thread as a class
Add a threading.Event member which the join method clears and the thread's main loop occasionally checks. If it sees it's cleared, it returns. For this override threading.Thread.join to check the event and then call Thread.join on itself
To allow (2), make the read from Queue block with some small timeout. This way your thread's "response time" to the kill request will be the timeout, and OTOH no CPU choking is done
Here's some code from a socket client thread I have that has the same issue with blocking on a queue:
class SocketClientThread(threading.Thread):
    """ Implements the threading.Thread interface (start, join, etc.) and
        can be controlled via the cmd_q Queue attribute. Replies are placed in
        the reply_q Queue attribute.
    """
    def __init__(self, cmd_q=Queue.Queue(), reply_q=Queue.Queue()):
        super(SocketClientThread, self).__init__()
        self.cmd_q = cmd_q
        self.reply_q = reply_q
        self.alive = threading.Event()
        self.alive.set()
        self.socket = None
        self.handlers = {
            ClientCommand.CONNECT: self._handle_CONNECT,
            ClientCommand.CLOSE: self._handle_CLOSE,
            ClientCommand.SEND: self._handle_SEND,
            ClientCommand.RECEIVE: self._handle_RECEIVE,
        }

    def run(self):
        while self.alive.isSet():
            try:
                # Queue.get with timeout to allow checking self.alive
                cmd = self.cmd_q.get(True, 0.1)
                self.handlers[cmd.type](cmd)
            except Queue.Empty as e:
                continue

    def join(self, timeout=None):
        self.alive.clear()
        threading.Thread.join(self, timeout)
Note self.alive and the loop in run.
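The same pattern can be exercised end to end with a stripped-down, socket-free stand-in (Python 3; all names here are illustrative):

```python
import queue
import threading
import time

# A runnable distillation of the pattern above: an Event checked
# between short queue timeouts, cleared by an overriding join().
class StoppableWorker(threading.Thread):
    def __init__(self):
        super().__init__()
        self.cmd_q = queue.Queue()
        self.alive = threading.Event()
        self.alive.set()
        self.results = []

    def run(self):
        while self.alive.is_set():
            try:
                # block at most 0.1s so self.alive is re-checked often
                cmd = self.cmd_q.get(True, 0.1)
                self.results.append(cmd * 2)
            except queue.Empty:
                continue

    def join(self, timeout=None):
        self.alive.clear()
        threading.Thread.join(self, timeout)

w = StoppableWorker()
w.start()
w.cmd_q.put(21)
time.sleep(0.3)     # give the worker time to process the item
w.join(timeout=2)
print(w.results)  # [42]
```

The thread's "response time" to the kill request is bounded by the 0.1-second queue timeout, and no CPU is wasted spinning while the queue is empty.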