I am working on a Python-based TestSuite app. In this application I have a report writer running in a separate multiprocessing process, which generates an HTML report of all the test cases executed by the main program.
The main program itself is also multithreaded, but it uses the normal threading module.
The report writer process acts like a server, and each test case thread started by the main application can ask for an interface to this writer (IFW).
Communication from the IFW to the writer uses a single Queue(). However, each IFW can also request a status from the writer, and this is where it gets tricky, because that data is specific to the IFW's ID.
It is not possible to just use Queues because of the IFWs' dynamic behavior, so I used a manager to create a proxy Queue; however, this caused a lot of problems for me (it created a new manager instance for each interface). So now I'm trying the Manager().dict() method, but I cannot figure out how to synchronize between the two sides. Here is my code from the IFW:
def getCurrentTestInfo(self):
    self._UpdateQueue.put_nowait(WriterCtrlCmd(self.WriterIfId, 'get_current_test_info'))
    while self.WriterIfId not in self._RequestDict:
        pass
    res = self._RequestDict[self.WriterIfId]
    del self._RequestDict[self.WriterIfId]
    return res
What happens here is that the IFW sends a request cmd to the writer, and the writer then returns the test information. The request cmd carries a specific IFW ID, and this ID is unique.
I know at this point that the entry does not exist in the dict(), so I wait for the entry to show up using a "poll" :) and then I read the data. However, everyone can see the potential problem in this code.
Is there a way to wait for the dict to be updated by means of the manager? Maybe an event or condition? Although I would like something linked to the dict() method itself.
NOTE: there is a period of processing time from the put command until the dict has been updated, so I cannot use Queue.put() instead of Queue.put_nowait().
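For what it's worth, one pattern that avoids the busy-wait is to pair the shared dict with a Condition created by the same manager and have the writer notify after every update. This is only a sketch built on the question's names (WriterCtrlCmd, _UpdateQueue, _RequestDict); the condition object and publish_reply helper are assumptions, not part of the original design:

import multiprocessing

manager = multiprocessing.Manager()
request_dict = manager.dict()      # shared IFW-ID -> reply mapping
update_cond = manager.Condition()  # hypothetical: shared by writer and all IFWs

# IFW side: sleep on the condition instead of spinning
def getCurrentTestInfo(self):
    self._UpdateQueue.put_nowait(WriterCtrlCmd(self.WriterIfId, 'get_current_test_info'))
    with update_cond:
        while self.WriterIfId not in self._RequestDict:
            update_cond.wait()  # lock is released while waiting
        res = self._RequestDict[self.WriterIfId]
        del self._RequestDict[self.WriterIfId]
    return res

# writer side: publish the reply, then wake every waiting IFW
def publish_reply(if_id, info):
    with update_cond:
        request_dict[if_id] = info
        update_cond.notify_all()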
Related
I have a script that, in the end, executes two functions. It polls for data on a time interval (it runs as a daemon, and the data is retrieved from a shell command run on the local system), and once it receives this data it will: 1) function 1 - first write this data to a log file, and 2) function 2 - observe the data and then send an email IF that data meets certain criteria.
The logging happens every time, but the alert may not. The issue is that, in cases where an alert needs to be sent, if the email connection stalls or takes a long time to connect to the server, it obviously causes the next polling of the data to stall (for an unpredictable amount of time, depending on the server), and in my case it is very important that the polling interval remains consistent (for analytics purposes).
What is the most efficient way, if any, to keep the email process working independently of the logging process while still operating within the same application and depending on the same data? I was considering creating a separate thread for the mailer, but that kind of seems like overkill in this case.
I'd rather not set a short timeout on the email connection, because I want to give the process some chance to connect to the server, while still allowing the logging to be written consistently on the given interval. Some code:
def send(self, msg_):
    """
    Send the alert message

    :param str msg_: the message to send
    """
    self.msg_ = msg_
    ar = alert.Alert()
    ar.send_message(msg_)

def monitor(self):
    """
    Post to the log file and
    send the alert message when
    applicable
    """
    read = r.SensorReading()
    msg_ = read.get_message()  # the data
    if msg_:  # if there is data in general...
        x = read.get_failed()  # store bad data
        msg_ += self.write_avg(read)
        msg_ += "==============================================="
        self.ctlog.update_templog(msg_)  # write general data to log
        if x:
            self.send(x)  # if bad data, send...
This is exactly the kind of case you want to use threading/subprocesses for. Fork off a thread for the email, which times out after a while, and keep your daemon running normally.
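A minimal sketch of that suggestion, reusing the monitor/send names from the question; everything else here is an assumption:

import threading

def monitor(self):
    read = r.SensorReading()
    msg_ = read.get_message()
    if msg_:
        x = read.get_failed()
        self.ctlog.update_templog(msg_)  # logging stays on the polling thread
        if x:
            # fire-and-forget daemon thread: a stalled SMTP connection
            # no longer delays the next polling interval
            threading.Thread(target=self.send, args=(x,), daemon=True).start()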
Possible approaches that come to mind:
Multiprocessing
Multithreading
Parallel Python
My personal choice would be multiprocessing as you clearly mentioned independent processes; you wouldn't want a crashing thread to interrupt the other function.
You may also refer to this before making your design choice: Multiprocessing vs Threading Python
Thanks everyone for the responses; it helped very much. I went with threading, but also updated the code to be sure it handled failing threads. I ran some regressions and found that the subsequent processes were no longer being interrupted by stalled connections, and the log was being updated on a consistent schedule. Thanks again!!
I currently run a daemon thread that grabs all cell values, calculates if there's a change, and then writes out dependent cells in a loop, i.e.:
from threading import Event, Thread

event = Event()

def f():
    while not event.is_set():
        update()
        event.wait(15)

Thread(target=f).start()
This works, but the looped get-all calls are significant I/O.
Rather than doing this, it would be much cleaner if the thread was notified of changes by Google Sheets. Is there a way to do this?
I rephrased my comment from gspread's GitHub Issues:
Getting a change notification from Google Sheets is possible with the help of installable triggers in Apps Script. You set up a custom function in the Script editor and assign a trigger event to this function. In this function you can fetch an external URL with UrlFetchApp.fetch.
On the listening end (your web server) you'll have a handler for this URL. This handler will do the job. Depending on the server configuration (many threads or processes), make sure to avoid a possible race condition.
Also, I haven't tested non-browser-triggered updates. If Sheets triggers the same event for this type of update, there could be a case for infinite loops.
I was able to get this working by triggering an HTTP request whenever Google Sheets detected a change.
On Google Sheets:
function onEdit(e) {
  UrlFetchApp.fetch("http://myaddress.com");
}
Python-side (w/ Tornado)
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        on_edit()
        self.write('Updating.')

def on_edit():
    # Code here
    pass

app = tornado.web.Application([(r'/', MainHandler)])
app.listen(8888)  # put your port here
tornado.ioloop.IOLoop.current().start()
I don't think this sort of functionality should be within the scope of gspread, but I hope the documentation helps others.
I'm working with django.
I have a thread whose purpose is to take a queued list of database items and modify them.
Here is my model:
class MyModel(models.Model):
    boolean = models.BooleanField(editable=False)
and the problematic code :
from queue import Queue
from threading import Thread

def unqueue():
    while not queue.empty():
        myList = queue.get()
        for i in myList:
            if not i.boolean:
                break
            i.boolean = False
            i.save()  # <== error because database table is locked
        queue.task_done()

queue = Queue()
thread = Thread(target=unqueue)
thread.start()

def addToQueue(pk_list):  # can be called multiple times simultaneously
    items = []
    for pk in pk_list:
        items.append(MyModel.objects.get(pk=pk))
    queue.put(items)
I know the code is missing a lot of checks etc.; I simplified it here to make it clearer.
What can I do to be able to save to my db while inside the thread?
EDIT: I need to be synchronous because i.boolean (and other properties in my real code) mustn't be overwritten.
I tried to create a dedicated table in the database, but it didn't work; I still have the same issue.
EDIT 2: I should mention that I'm using SQLite3. I tried to see if I could lock/unlock specific tables in SQLite, and it seems that locking applies to the entire db only. That is probably why using a dedicated table wasn't helpful.
That is bad for me, because I need to access different tables simultaneously from different threads. Is that possible?
EDIT 3: It seems that my problem is the one listed here:
https://docs.djangoproject.com/en/1.8/ref/databases/#database-is-locked-errors
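For reference, the page linked above suggests raising SQLite's lock timeout in settings.py before making bigger changes; a sketch (the database name and the 20-second value are just examples):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': 'db.sqlite3',
        'OPTIONS': {
            # wait this many seconds for the lock before raising
            # "database is locked"
            'timeout': 20,
        },
    }
}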
Are you sure you need a synchronized queue? Maybe an asynchronous solution will solve your problem? See: Need a thread-safe asynchronous message queue
The solution I found was to change databases;
SQLite doesn't allow concurrent access like that.
I switched to MySQL and it works now.
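For completeness, a sketch of what the corresponding settings.py change might look like (all names and credentials are placeholders):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydatabase',      # placeholder
        'USER': 'myuser',          # placeholder
        'PASSWORD': 'mypassword',  # placeholder
        'HOST': 'localhost',
        'PORT': '3306',
    }
}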
I'm trying to write a simple Python IRC client. So far I can read data, and I can send data back to the server if it's automated. I'm reading the data in a while True loop, which means that I cannot enter text while at the same time reading data. How can I enter text in the console that only gets sent when I press enter, while an infinite loop is running at the same time?
Basic code structure:
while True:
    # read data
    # here is where I want to write data only if it contains '\r' in it
Another way to do it involves threads.
import threading

# define a thread which takes input
class InputThread(threading.Thread):
    def __init__(self):
        super(InputThread, self).__init__()
        self.daemon = True
        self.last_user_input = None

    def run(self):
        while True:
            self.last_user_input = input('input something: ')
            # do something based on the user input here
            # alternatively, let main do something with
            # self.last_user_input

# main
it = InputThread()
it.start()
while True:
    # do something
    # do something with it.last_user_input if you feel like it
    pass
What you need is an event loop of some kind.
In Python you have a few options for that; pick one you like:
Twisted https://twistedmatrix.com/trac/
Asyncio https://docs.python.org/3/library/asyncio.html
gevent http://www.gevent.org/
and so on; there are many frameworks for this, and you could also use any of the GUI frameworks like tkinter or PyQt to get a main event loop.
As comments have said above, you can use threads and a few queues to handle this, or an event based loop, or coroutines or a bunch of other architectures. Depending on your target platforms one or the other might be best. For example on windows the console API is totally different to unix ptys. Especially if you later need stuff like colour output and so on, you might want to ask more specific questions.
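For illustration, a hedged sketch of the thread-plus-queue approach mentioned above; the queue hands completed lines to the main loop without sharing a bare attribute between threads:

import queue
import threading

user_lines = queue.Queue()

def read_input():
    while True:
        user_lines.put(input())  # blocks only this thread, delivers on enter

threading.Thread(target=read_input, daemon=True).start()

while True:
    # read data from the IRC connection here...
    try:
        line = user_lines.get_nowait()  # non-blocking check for typed input
    except queue.Empty:
        pass
    else:
        print('would send:', line)  # placeholder for the actual send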
You can use an async library (see schlenk's answer) or use the select module: https://docs.python.org/2/library/select.html
This module provides access to the select() and poll() functions
available in most operating systems, epoll() available on Linux 2.5+
and kqueue() available on most BSD. Note that on Windows, it only
works for sockets; on other operating systems, it also works for other
file types (in particular, on Unix, it works on pipes). It cannot be
used on regular files to determine whether a file has grown since it
was last read.
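On Unix, where the quote above says select() also works on pipes and other file types, sys.stdin can be watched alongside the IRC socket. A sketch, with the server address as a placeholder:

import select
import socket
import sys

sock = socket.create_connection(('irc.example.net', 6667))  # placeholder server

while True:
    # block until the socket has data or the user pressed enter
    readable, _, _ = select.select([sock, sys.stdin], [], [])
    for src in readable:
        if src is sock:
            data = sock.recv(4096)
            print(data.decode(errors='replace'), end='')
        else:
            line = sys.stdin.readline()  # only complete lines arrive here
            sock.sendall(line.encode())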
Background: I have a python program that imports and uses the readline module to build a homemade command line interface. I have a second python program (built around bottle, a web micro-framework) that acts as a front-end for that CLI. The second python program opens a pipe-like interface to the first, essentially passing user input and CLI output back and forth between the two.
Problem: In the outer wrapper program (the web interface), whenever the end-user presses the TAB key (or any other key to which I bind the readline completer function), that key is inserted into the CLI's stdin without firing the readline completer function. I need this to trigger readline's command completion instead, as normally occurs during an interactive CLI session.
Possible Solution #1: Is there some way to send the TAB key to a subprocess' stdin, so that a batch usage works the same as an interactive usage?
Possible Solution #2: Or, if there was some way to trigger the entire completion process manually (including matches generation and display), I could insert and scan for a special text sequence, like "<TAB_KEY_HERE>", firing the possible completion matches display function manually. (I wrote the completer function, which generates the possible matches, so all I really need is access to readline's function to display the possible matches.)
Possible Solution #3: I guess, if I cannot access readline's matches-display function, the last option is to rewrite readline's built-in display-completion function, so I can call it directly. :(
Is there a better solution? Any suggestions on following the paths presented by any of the above solutions? I am stuck on #1 and #2, and I'm trying to avoid #3.
Thanks!
Solution #1 proved to be a workable approach. The key was not to connect the web socket directly to the CLI app. Apparently, readline was falling back into some simpler mode, which filtered out all TABs, since it was not connected to a real PTY/TTY. (I may not be remembering this exactly right. Many cobwebs have formed.) Instead, a PTY/TTY pair needed to be opened and inserted between the CLI app and the web-sockets app, which tricked the CLI app into thinking it was connected to a real keyboard-based terminal, like so:
import os
import pty
import subprocess

masterPTY, slaveTTY = pty.openpty()
appHandle = subprocess.Popen(
    ['/bin/python', 'myapp.py'],
    shell=False,
    stdin=slaveTTY,
    stdout=slaveTTY,
    stderr=slaveTTY,
)
...
while True:
    # read output from CLI app
    output = os.read(masterPTY, 1024)
    ...
    # write input to CLI app
    while input_data:
        chars_written = os.write(masterPTY, input_data)
        input_data = input_data[chars_written:]
...
appHandle.terminate()
os.close(masterPTY)
os.close(slaveTTY)
HTH someone else. :)
See this answer to a related question for more background:
https://stackoverflow.com/a/14565848/538418