I have a Tkinter app that connects to a proprietary sensor over serial and controls it with Python. I am trying to run a background thread that detects when the connection dies, shows an alert, and lets the user press a button to re-establish the serial connection (the reconnect also runs in its own thread). The COM monitor thread works fine: when I unplug the serial cable, it shows a popup and enables the reconnect button as expected. But after I reconnect once or twice, the reconnect thread stops releasing the lock. I've confirmed this with print statements.
Here is the code which spawns the thread that does the actual reconnect over serial. This thread is spawned when the reconnect button is pressed.
def spawn_reconnect_to_mote_thread(self):
    self.reconnect_thread = threading.Thread(target=self.reconnect_to_mote)
    self.reconnect_thread.daemon = True
    self.reconnect_thread.name = 'reconnect'
    self.reconnect_thread.start()
Here is the reconnect function. We also use proprietary control software written in Python, which is why the syntax may look odd. That software spawns its own thread when a connection is made. MOTE_MODEL_MAP maps each model ID to the correct control functions, which makes it easy to extend this to different hardware models.
def reconnect_to_mote(self):
    with self.lock:
        self.connection.close()
        self.connection = MOTE_MODEL_MAP[self.mote_model_str]['mote_transport'].Connection({'serial': [self.com_port]})
        self.com_model_label['text'] = MOTE_MODEL_MAP[self.mote_model_str]['name']
        self.reconnect_button['state'] = 'disabled'
        self.scan_button['state'] = 'active'
        self.monitor_com = True
This spawns the thread to monitor the COM port. This thread is spawned pretty early on in the app's __init__ once the initial connection is made and the hardware is prepped with some commands.
def spawn_com_monitor_thread(self):
    self.com_monitor_thread = threading.Thread(target=self.com_monitor)
    self.com_monitor_thread.daemon = True
    self.com_monitor_thread.name = 'com monitor'
    self.com_monitor_thread.start()
Lastly, this is the actual function running in the thread which monitors the COM port. It just constantly grabs open COM ports and checks if the one we're using exists. If not, it should show a popup, set a flag, and allow the user to click the reconnect button. The reconnect function should then set the flag again. This is to prevent infinite popups while a user reconnects.
def com_monitor(self):
    while True:
        ports = [port[0] for port in list(list_ports.comports())]
        if (self.com_port not in ports) and (self.monitor_com == True):
            with self.lock:
                self.monitor_com = False
                self.scan_button['state'] = 'disabled'
                msg.showerror(title='Serial connection died', message='The serial cable has become disconnected. Scanning has been stopped. Please reconnect the cable and reconnect to the mote.')
                self.reconnect_button['state'] = 'active'
                self.monitor_com = False
I've been banging my head on this for a couple days now with no solution. I haven't done much threading before and would like to do this in a threaded way to allow for maximum responsiveness.
I am running Python 3.6.13.
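For illustration only, the monitor-and-flag logic described above can be sketched so that the worker thread never touches widgets or holds a lock while a dialog is open. This is a minimal sketch under stated assumptions, not a diagnosis of the code above: `ComMonitor`, `rearm`, and the injected `list_ports` callable are hypothetical names, and the GUI thread is assumed to drain the queue itself.

```python
import queue
import threading


class ComMonitor:
    """Watches for a port to disappear and reports it once via a queue."""

    def __init__(self, com_port, list_ports, events, poll_interval=0.05):
        self.com_port = com_port
        self.list_ports = list_ports        # hypothetical callable returning port names
        self.events = events                # queue.Queue drained by the GUI thread
        self.poll_interval = poll_interval
        self.armed = threading.Event()      # plays the role of self.monitor_com
        self.armed.set()
        self.stopped = threading.Event()

    def run(self):
        while not self.stopped.is_set():
            if self.armed.is_set() and self.com_port not in self.list_ports():
                self.armed.clear()          # suppress repeat alerts until re-armed
                self.events.put("disconnected")
            # Sleep between polls instead of spinning; wakes early on stop.
            self.stopped.wait(self.poll_interval)

    def rearm(self):
        """Call after a successful reconnect to start watching again."""
        self.armed.set()
```

In a Tkinter app, the main thread could poll the queue from a callback scheduled with `root.after(...)` and show the popup and toggle the buttons there, so all widget access stays on one thread.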
This is my code:
def _poll_for_messages(self, poller: Poller):
    sockets = dict(poller.poll(3000))
    if not sockets:
        self._reconnect_if_necessary(poller)
        return
    if self._command_handler.command_socket in sockets:
        encoded_message = self._command_handler.command_socket.recv_multipart()
This should communicate with my service bus and reconnect if the bus gets restarted. When the bus is shut down, the last line is sometimes still reached, but the socket cannot receive a message and waits for one indefinitely.
For normal receives there is zmq.DONTWAIT but this does not work for multipart messages as far as I'm aware. Is there an easy way around this or am I polling for messages the wrong way in general?
If anyone stumbles over this and has the same problem, mine got fixed by adding the zmq.POLLIN flag when registering a socket to my poller:
poller.register(self._command_handler._command_socket, zmq.POLLIN)
Suppose I've got a simple Tornado web server, which starts like this:
app = ... # create an Application
srv = tornado.httpserver.HTTPServer(app)
srv.bind(port)
srv.start()
tornado.ioloop.IOLoop.instance().start()
I am writing an "end-to-end" test, which starts the server in a separate process with subprocess.Popen and then calls the server over HTTP. I need to make sure the server did not fail to start (e.g. because the port is busy) and then wait until the server is ready.
I wrote a function to wait until the server is ready:
def wait_till_ready(port, n=10, time_out=0.5):
    for i in range(n):
        try:
            requests.get("http://localhost:" + str(port))
            return
        except requests.exceptions.ConnectionError:
            time.sleep(time_out)
    raise Exception("failed to connect to the server")
Is there a better way?
How can the parent process, which forks and execs the server, make sure that the server didn't fail, for example because the server port is busy? (I can change the server code if needed.)
You could approach it in two ways:
Make a pipe / queue before you fork. Then, just before you start the io loop, notify the parent that everything went fine and you're ready for the request.
Open the port and bind to it before forking. You should make sure you close that socket on the parent side. But otherwise, the only thing which needs to run in the child is the io loop. You can handle all the other errors before the fork.
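The first approach can be sketched with subprocess.Popen, using the child's stdout as the pipe. This is a rough illustration under assumptions: `CHILD_CODE` is a hypothetical stand-in for the real server script, the `READY`/`ERROR` lines are an invented convention, and the blocking read on stdin merely takes the place of starting the IO loop.

```python
import socket
import subprocess
import sys

# Hypothetical stand-in for the real server script: bind the port, report
# the outcome on stdout, then block where the IO loop would normally run.
CHILD_CODE = r"""
import socket, sys
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    srv.bind(("127.0.0.1", int(sys.argv[1])))
    srv.listen(5)
except OSError as exc:
    print("ERROR %s" % exc, flush=True)
    sys.exit(1)
print("READY %d" % srv.getsockname()[1], flush=True)
sys.stdin.readline()  # placeholder for the IO loop; returns when stdin closes
"""


def stop_server(proc):
    """Close the child's stdin so it exits, then release the pipes."""
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()


def start_server(port):
    """Start the child and block until it reports READY or fails."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(port)],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    status = proc.stdout.readline().strip()
    if not status.startswith("READY"):
        stop_server(proc)
        raise RuntimeError("server failed to start: " + (status or "no output"))
    return proc, int(status.split()[1])
```

The parent never has to guess with retries: it either reads the ready line or learns immediately that startup failed, including the busy-port case.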
I'm using a library called BACpypes to communicate over the network with a PLC. The short version is that I need to start a BACpypes application in its own thread and then perform reads/writes to the PLC in this separate thread.
For multiple PLCs, there is a processing loop that creates an application (providing the PLC IP address), performs reads and writes on the PLC using the application, kills the application by calling BACpypes stop(*args) from the core module, calls join on the thread, and then moves on to the next IP address in the list until we start over again. This works for as many IP addresses (PLCs) as we have, but as soon as we are back at the first IP address (PLC) again, I get the error:
socket.error: [Errno 98] Address already in use
Here is the short code for my thread class, which uses the stop() and run() functions from BACpypes core.
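from threading import Thread

from bacpypes.core import run, stop


class BACpypeThread(Thread):
    def __init__(self, name):
        Thread.__init__(self)
        self.name = name  # set on the instance, not on the Thread class

    def run(self):
        run()

    def stop(self):
        stop()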
It seems like I'm not correctly killing the application. So, I know stop(*args) is registered as a signal handler according to BACpypes docs. Here is a snippet I pulled from this link http://bacpypes.sourceforge.net/modules/core.html
core.stop(*args)
Parameters: args – optional signal handler arguments
This function is called to stop a BACpypes application. It resets the running boolean value. This function also installed as a signal handler responding to the TERM signal so you can stop a background (deamon) process:
$ kill -TERM 12345
I feel like I need to provide a kill -term signal to make the ip address available again. I don't know how to do that. Here's my question...
1) In this example, 12345 is the process number I believe. How do I figure out that number for my thread?
2) Once I have the number, how do I actually pass the kill -TERM signal to the stop function? I just don't know how to actually write this line of code. So if someone could explain this that would be great.
Thanks for the help!
Before stopping the core, you need to free the socket.
I use:
try:
    self.this_application.mux.directPort.handle_close()
except:
    self.this_application.mux.broadcastPort.handle_close()
After that I call stop(), then thread.join().
I have a threaded python socket server that opens a new thread for each connection.
The thread is a very simple communication based on question and answer.
Basically, the client sends an initial data transmission, the server passes it to an external app that processes it and returns a reply, the server sends that reply back, and the loop begins again until the client disconnects.
Because the client will be on a mobile phone, and therefore on an unstable connection, I am left with open threads that are no longer connected. Because the loop starts with recv, it is rather difficult to break out of it when connectivity is lost.
I was thinking of adding a send before the recv to test whether the connection is still alive, but this might not help at all if the client disconnects after my failsafe send, as the client only sends a data stream every 5 seconds.
I noticed that recv breaks sometimes but not always, and in those cases I am left with zombie threads using resources.
This could also be a serious vulnerability that lets my system be DoSed.
I have looked through the Python manual and Googled since Thursday trying to find something for this, but most things I find relate to clients and non-blocking mode.
Can anyone point me in the right direction towards a good way on fixing this issue?
Code samples:
Listener:
serversocket = socket(AF_INET, SOCK_STREAM)
serversocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
serversocket.bind(addr)
serversocket.listen(2)
logg("Binded to port: " + str(port))
# Listening Loop
while 1:
    clientsocket, clientaddr = serversocket.accept()
    threading.Thread(target=handler, args=(clientsocket, clientaddr,port,)).start()
# This is useless as it will never get here
serversocket.close()
Handler:
# Socket connection handler (Threaded)
def handler(clientsocket, clientaddr, port):
    clientsocket.settimeout(15)
    # Loop till client closes connection or connection drops
    while 1:
        stream = ''
        while 1:
            ending = stream[-6:] # get stream ending
            if ending == '.$$$$.':
                break
            try:
                data = clientsocket.recv(1)
            except:
                sys.exit()
            if not data:
                sys.exit()
                # this is the usual point where thread is closed when a client closes connection normally
            stream += data
        # Clear the line ending
        stream = base64.b64encode(stream[:-6])
        # Send data to be processed
        re = getreply(stream)
        # Send response to client
        try:
            clientsocket.send(re + str('.$$$$.'))
        except:
            sys.exit()
As you can see, there are three conditions, at least one of which should trigger an exit if the connection fails, but sometimes none of them does.
Sorry, but I don't think threads are a good fit here. You do not need to do much work in these threads (workers?), and most of the time they are blocked waiting on a socket, so I would advise reading about event-driven programming. This pattern is extremely useful with sockets because you can do everything in one thread. You talk to one socket at a time, but the other connections are just waiting for data, so almost nothing is lost. After you send a few bytes, you simply check whether another connection needs servicing. You can read about select and epoll.
In Python there are several libraries that make this pleasant:
libev (c library wrapper) - pyev
tornado
twisted
I have used tornado in some projects and it handles this task very well. Libev is nice too, but it is a C wrapper, so it is a little low-level (though very good for some tasks).
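To make the event-driven idea concrete, here is a minimal single-threaded sketch using the standard library's selectors module (which picks epoll, select, or a similar mechanism for the platform) rather than any of the libraries listed. It is written for Python 3, `get_reply` is a hypothetical stand-in for the question's getreply(), and it assumes replies are small enough to fit in the socket's send buffer.

```python
import selectors
import socket
import time

TERMINATOR = b".$$$$."


def get_reply(payload):
    """Hypothetical stand-in for the question's getreply()."""
    return payload.upper()


def serve(listener, stop, idle_timeout=15.0):
    """Serve every client from one thread until stop() returns True."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    clients = {}  # socket -> [receive buffer, time of last activity]

    def drop(sock):
        sel.unregister(sock)
        sock.close()
        del clients[sock]

    while not stop():
        for key, _events in sel.select(timeout=0.1):
            sock = key.fileobj
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                clients[conn] = [b"", time.monotonic()]
                continue
            try:
                data = sock.recv(4096)
            except BlockingIOError:
                continue                  # spurious wake-up, nothing to read
            except OSError:
                data = b""                # reset or similar: treat as closed
            if not data:
                drop(sock)
                continue
            state = clients[sock]
            state[0] += data
            state[1] = time.monotonic()
            try:
                while TERMINATOR in state[0]:
                    message, state[0] = state[0].split(TERMINATOR, 1)
                    # Sketch only: assumes the reply fits in the send buffer.
                    sock.sendall(get_reply(message) + TERMINATOR)
            except OSError:
                drop(sock)
        # Reap connections that have been silent for too long.
        now = time.monotonic()
        for sock in [s for s, st in clients.items() if now - st[1] > idle_timeout]:
            drop(sock)

    for sock in list(clients):
        drop(sock)
    sel.unregister(listener)
    sel.close()
```

Because dead or silent connections are reaped in the same loop, no per-client thread is left behind when a phone drops off the network.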
So you should use socket.settimeout(float) on the clientsocket, as one of the comments suggested.
The reason you don't see any difference is that when you call socket.recv(bufsize[, flags]) and the timeout runs out, a socket.timeout exception is raised, and your bare except catches it and exits.
try:
    data = clientsocket.recv(1)
except:
    sys.exit()
should be something like:
try:
    data = clientsocket.recv(1)
except timeout:
    #timeout occurred
    #handle it
    clientsocket.close()
    sys.exit()
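A fuller version of the same idea, as a hedged Python 3 sketch: the read loop treats a timeout, a socket error, and an orderly close as the same "client is gone" outcome, and the handler always closes the socket on the way out. The names `read_message` and `TERMINATOR` are illustrative, and echoing the message back stands in for the call to getreply().

```python
import socket

TERMINATOR = b".$$$$."


def read_message(conn, timeout=15.0):
    """Return one message without its terminator, or None if the client is gone."""
    conn.settimeout(timeout)
    stream = b""
    while not stream.endswith(TERMINATOR):
        try:
            data = conn.recv(1)
        except socket.timeout:
            return None      # silent for too long: treat as a dead connection
        except OSError:
            return None      # connection reset or another socket error
        if not data:
            return None      # orderly shutdown by the client
        stream += data
    return stream[:-len(TERMINATOR)]


def handler(conn, timeout=15.0):
    """Per-connection thread body that always closes the socket on exit."""
    try:
        while True:
            message = read_message(conn, timeout)
            if message is None:
                break
            conn.sendall(message + TERMINATOR)  # echo stands in for getreply()
    except OSError:
        pass                 # send failed: the client went away mid-reply
    finally:
        conn.close()
```

Returning from the thread function ends the thread, so there is no need for sys.exit() inside the handler.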
If I connect to a nonexistent socket with pyzmq, I need to hit Ctrl+C to stop the program. Could someone explain why this happens?
import zmq
INVALID_ADDR = 'ipc:///tmp/idontexist.socket'
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect(INVALID_ADDR)
socket.send('hello')
poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)
conn = dict(poller.poll(1000))
if conn:
    if conn.get(socket) == zmq.POLLIN:
        print "got result: ", socket.recv(zmq.NOBLOCK)
else:
    print 'got no result'
This question was also posted as a pyzmq Issue on GitHub. I will paraphrase my explanation here (I hope that is appropriate, I am fairly new to SO):
A general rule: When in doubt, hangs at the end of your zeromq program are due to LINGER.
The hang here is caused by the LINGER socket option, and happens in the context.term() method called during garbage collection at the very end of the script. The LINGER behavior is described in the zeromq docs, but to put it simply, it is a timeout (in milliseconds) to wait for any pending messages in the queue to be handled after closing the socket before dropping the messages. The default behavior is LINGER=-1, which means to wait forever.
In this case, since no peer was ever started, the 'hello' message that you tried to send is still waiting in the send queue when the socket tries to close. With LINGER=-1, ZeroMQ will wait until a peer is ready to receive that message before shutting down. If you bind a REP socket to 'ipc:///tmp/idontexist.socket' while this script is apparently hanging, the message will be delivered and the script will finish exiting cleanly.
If you do not want your script to wait (as indicated by your print statements that you have already given up on getting a reply), set LINGER to any non-negative value (e.g. socket.linger = 0), and context.term() will return after waiting the specified number of milliseconds.
I should note that the INVALID_ADDR variable name suggests an understanding that connection to an interface that does not yet have a listener is not valid - this is incorrect. zeromq allows bind/connect events to happen in any order, as illustrated by the behavior described above, of binding a REP socket to the interface while the sending script is blocking on term().
In most cases, you can bind and connect ZMQ sockets in either order, so your connect()/send() is simply waiting for the corresponding bind() at the other end, which never comes, so the program appears to hang. Check where the program is hanging by printing out some logging statements...