I have already read
http://wiki.wxpython.org/LongRunningTasks
http://wiki.wxpython.org/CallAfter
and searched a lot in Google but found no answer to my problem. Because in my opinion it would be to much code and it is more a theoretical problem, I hope it is ok without code.
What I want to do with an example: I have a grid (wx.grid) with check boxes in the main thread. Then I start a new thread (thread.start_new_thread) where I go through all rows (1 second per row) and check if the checkbox is set. If it is set, some job is done.
This is working, if I read out all rows before I start the thread. But I need to read it out while the thread is running, because the user should have the ability to uncheck or check another checkbox! But if I read it out in the new thread sometimes a "NonType Object is not callable" error is raised. I think because wx.CallAfter should be used to interact with the grid in the other thread. But CallAfter I can not use to get the return value.
I have no idea how to solve this issue. Perhaps some people with more thread experience have some idea? If you need additional data please ask, but I think that my example contains all necessary information.
A common approach to this type of thing is to use a Queue.Queue object to pass commands to one or more worker threads. The worker thread(s) will wait on a pull from the queue until there are items in the queue ready to be pulled. Part of the command object could be a target in the GUI thread to send a message to (in a thread-safe way, like with wx.CallAfter) when the command is completed.
You should also take a look at the wx.lib.delayedresult module. It is similar to the above but a little more capable and robust.
Related
I'm building a Python IDE, which needs to highlight all occurrences of the name under cursor (using Jedi library). The process of finding the occurrences can be quite slow.
In order to avoid freezing the GUI, I could run the search in another thread, but when the user moves quickly over several words, the background threads could pile up while working on now obsolete tasks. I would like to cancel the search for previous occurrences when user moves to new name.
Looks like killing a thread is complicated in Python. What are the other options for creating an easily cancellable background tasks in Python 3.4+?
I think concurrent.futures is the answer.
You can create a Thread / Process pool, submit any callable, receive a Future, which you can cancel if needed.
Reference: https://docs.python.org/3/library/concurrent.futures.html
A thread cannot be stopped by another one. This is a OS limitation rather than a Python one. Only thing you can do is periodically inspect a variable and, if set, stop the thread itself (just return).
Moreover, threads in Python suffer from the GIL. This means that CPU intensive operations, when carried out in a separate thread, will still affect your main loop as only one thread per process can run at a time.
I'd recommend you to run the search in a separate process which you can easily cancel whenever you want.
What the guys of YouCompleteMe are doing for example is wrapping Jedi in a HTTP server which they can query in the background. If the user moves the cursor before the completion comes back, the IDE can simply drop the request.
Well, my personal favorites are work queues. If it's a one-time application you should take a look at python rq. Extremely easy and fun to use. If you want to build something more "professional-grade" take a look at something like celery.
You might also want to look at multiprocessing
Seen this line of code but could not find documentation
self.conn.setblocking(0)
The question is, how do you poll a pool of pipes without blocking?
Got a parent process that needs to communicate with some unstable child processes and wish to poll and check periodically if they've something to say. Do not wish to block if they decide they need more time before they have something new to say. Will this magically do this?
Creating a pipe will return two connection objects. A connection object offers the polling functionality, where you can check if there is anything to read. Polling functionality allows you to specify a timeout to wait for.
If you have a group of connection objects that you are waiting on, then you can use multiprocessing.connection.wait(), or the non-multiprocessing version of it.
For details , see
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.connection.Connection
which will show you the connection object details. Look at the poll function
This is most likely what you were looking at: https://docs.python.org/2/library/socket.html#socket.socket.setblocking
You don't give much detail so I'm not exactly sure what you are trying to do, but usually when you have a number of sockets that you want to poll, you will use select (see these examples from PyMOTW).
you can check p.poll(0) then if the result was True then the pipe is not empty and you can receive the data without blocking .
My apologies beforehand for the length of the question, I didn't want to leave anything out.
Some background information
I'm trying to automate a data entry process by writing a Python application that uses the Windows API to simulate keystrokes, mouse movement and window/control manipulation. I have to resort to this method because I do not (yet) have the security clearance required to access the datastore/database directly (e.g. using SQL) or indirectly through a better suited API. Bureaucracy, it's a pain ;-)
The data entry process involves the correction of sales orders due to changes in article availability. The unavailable articles are either removed from the order or replaced by another suitable article.
Initially I want a human to be able to monitor the automatic data entry process to make sure everything goes right. To achieve this I slow down the actions on the one hand but also inform the user of what is currently going on through a pinned window.
The actual question
To allow the user to halt the automation process I'm registering the Pause/Break key as a hotkey and in the handler I want to pause the automation functionality. However, I'm currently struggling to figure out a way to properly pause the execution of the automation functionality. When the pause function is invoked I want the automation process to stop dead in its tracks, no matter what it is doing. I don't want it to even execute another keystroke.
UPDATE [23/01]: I actually want to do more than just pause, I want to be able to communicate with the automation process while it is running and request it to pause, skip the current sales order, give up completely and perhaps even more.
Can anybody show me The Right Way (TM) to achieve what I want?
Some more information
Here's an example of how the automation works (I'm using the pywinauto library):
from pywinauto import application
app = application.Application()
app.start_("notepad")
app.Notepad.TypeKeys("abcdef")
UPDATE [25/01]: After a few days of working on my application I've noticed I don't really use pywinauto that much, right now I'm only using it for finding window and then I directly use SendKeysCtypes.SendKeys to simulate keyboard input and win32api functions to simulate mouse input.
What I've found out so far
Here are a few methods I've come across so far in my search for an answer:
I could separate the automation functionality and the interface + hotkey listener in two separate processes. Let's refer to the former as "automator" and the latter as "manager". The manager can then pause the execution of the automator by sending the process a SIGSTOP signal and unpause it using the SIGCONT signal (or the Windows equivalents through SuspendThread/ResumeThread).
To be able to update the user interface the automator will need to inform the manager of its progression through some sort of an IPC mechanism.
Cons:
Would using SIGSTOP not be a little harsh? Would it even work properly? Lots of people seem to be advising against it and even calling it "dangerous".
I am worried that implementing the IPC mechanism is going to be a bit complicated. On the other hand, I have worked with DBus which wouldn't be too hard to implement.
The second method and one that lots of people seem to be suggesting involves using threads and essentially boils down to the following (simplified):
while True:
if self.pause: # pause
# Do the work...
However, doing it this way it seems it will only pause after there is no more work to do. The only way I see this method would work would be to divide the work (the entire automation process) into smaller work segments (i.e. tasks). Before starting on a new task the worker thread would check if it should pause and wait.
Cons:
Seems like an implementation to divide the work into smaller segments, such as the one above, would be very ugly code wise (aesthetically).
The way I imagine it, all statements would be transformed to look something like: queue.put((function, args)) (e.g. queue.put((app.Notepad.TypeKeys, "abcdef"))) and you'd have the automating process thread running through the tasks and continuously checking for the pause state before starting a task. That just can't be right...
The program would not actually stop dead in its tracks, but would first finish a task (however small) before actually pausing.
Progress made
UPDATE [23/01]: I've implemented a version of my application using the first method through the mentioned SuspendThread/ResumeThread functionality. So far this seems to work very nicely and also allows me to write the automation stuff just like you'd write any other script. The only quirk I've come across is that keyboard modifiers (CTRL, ALT, SHIFT) get "stuck" while paused. Something I can probably easily work around.
I've also written a test using the second method (threads and signals/message passing) and implemented the pause functionality. However, it looks really ugly (both checking for the pause flag and everything related to the "doing the work"). So if anybody can show me a proper example of something similar to the second method I'd appreciate it.
Related questions
Pausing a process?
Pausing a thread using threading class
Alex Martelli posted an answer saying:
There is no method for other threads to forcibly pause a thread (any more than there is for other threads to kill that thread) -- the target thread must cooperate by occasionally checking appropriate "flags" (a threading.Condition might be appropriate for the pause/unpause case).
He then referred to the multiprocessing module and SIGSTOP/SIGCONT.
Is there a way to indefinitely pause a thread?
Pausing a process in Windows
An answer to this question quotes the MSDN documentation regarding SuspendThread:
This function is primarily designed for use by debuggers. It is not intended to be used for thread synchronization. Calling SuspendThread on a thread that owns a synchronization object, such as a mutex or critical section, can lead to a deadlock if the calling thread tries to obtain a synchronization object owned by a suspended thread. To avoid this situation, a thread within an application that is not a debugger should signal the other thread to suspend itself. The target thread must be designed to watch for this signal and respond appropriately.
Is there any way to kill a Thread in Python?
How do I pass an exception between threads in python
Keep in mind that although in your level of abstraction, "executing a keystroke" is a single atomic operation, it's implemented on the machine as a rather complicated sequence of machine instructions. So, pausing a thread at arbitrary points could lead to things being in an indeterminate state. Sending SIGSTOP is the same level of dangerous as pausing a thread at an arbitrary point. Depending on where you are in a particular step, though, your automation could potentially be broken. For example, if you pause in the middle of a timing-dependent step.
It seems to me that this problem would be best solved at the level of the automation library. I'm not very familiar with the automation library that you're using. It might be worth contacting the developers of the library to see if they have any suggestions for pausing the execution of automation steps at safe sub-step levels.
I don't know pywinauto. But I'll assume that you have something like an Application class which you obtain and have methods like SendKeys/SendMouseEvent/etc to do things.
Create your own MyApplication class which holds a reference to pywinauto's application class. Provide the same methods but before each method check whether a pause event has occurred. If it has, you can jump into code which handles the pause event. That way you are checking for a pause every time you cause an event, but this all is handled by the one class without putting pause all over your code.
Once you've detected the pause you can handle it any way you like. For example, you can throw an exception to force giving up on the current task.
Separating the functionality and the interface thread/process is definately the best option imho, the second solution is quicker and easier but definately not better.
Perhaps using multiple threads and an exception would be a better idea than using multiple processes. But if you're using multiple processes than SIGSTOP might be your only way to get it to work.
Is there anything against using 2 threads for this?
1 thread for actually executing
1 thread for reading the user input
I use Python but not pywinauto; for this sort of tasks I use AutoHotKey . One way to implement a simple pause in an AutoHotkey script may be using a "toggle" key like ScrollLock and testing the key state in the script. Also, the script can restore the key state after switching the internal pause setting on / off.
I have a code-base that I'm looking to split up and add to by using threading, however I'm relatively new on how to handle it. Please before reading further respect my wish of NOT just re-writing this code and tossing it back at me with the problem solved. I would much rather work the problem out by someone pointing me in the right direction, than someone solving it FOR me; I don't learn well that way.
The fully functioning code-base is here -- It requires the mechanize and beautifulsoup libraries which can be installed via easy_install.
I've separated out all of my functions, and tried to keep the code as clean as possible (I'm sure there are some optimizations in there that I'll get reamed for, but the main problem is how to thread this.
My ultimate goal is to pack this into a thread, and then share cookies between other initialized browser objects in order to do other things while my original code is running 'backgrounded'.
I've tried thus:
class Recon(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
#Packed the stuff above my original while loop in here, minus functions.
def run(self):
#Packed my code past the while loop in here.
somevar = Recon()
somevar.start()
Problem I'm having is that, once I run the program it will run the things in init, but afterwards it just sits there and freezes on me. No traceback, no errors, just doesn't do anything, doesn't even return my command prompt back to my control.
Could I just get some tips, or a general flow of how to convert this? I got overwhelmed and deleted the code I was trying with so I don't have that example, but do I need to be prepending 'self.' to all of my variables? Do I need to just define my vars as global?
Here is a reproduction of what I'm having trouble with after having tried to convert the script to use threading.
As long as you have a single thread (as in the above snippet, where you instantiate Recon just once), it shouldn't matter much what you do where; but of course I imagine the reason you're introducing threading is to eventually move to having multiple threads active.
If that's the case, then the first key issue is to ensure that you never have two or more threads simultaneously trying to use the same shared system/resource -- for example, multiple threads writing at the same time to ReconFile, in the case of the code at the pastebin URL you mention.
The classic way to avoid such issues is to use locking, but my favorite way is quite different: make sure any such resource is accessed by only one dedicated thread, and use a Queue.Queue instance (intrinsically threadsafe) to have other threads post work-request to the dedicated thread (so instead of writing to ReconFile directly each other thread would make a list of lines to be written contiguously, then .put the list on the queue where the "recon file writing" worker thread is waiting via .get).
When you need to get results back from such actions (not the case here), the requesting thread would place its own personal "queue on which to return results" as part of the "work request packet" it puts to the worker thread's queue. I've presented much more detail about this recommended architecture in the threading chapter of "Python in a Nutshell" 2nd edition (and why, as the book's author, I would of course never recommend you perform an illegal download of a free pirate copy of my book, I can however mention there's plenty of sites offering such pirate copies for download -- the legal way to read my book for free is to sign up for a trial offer to O'Reilly's "safari" online books website).
This does not address the specific problem you're observing, since it's happening when you only have one thread around. I notice that thread is trying to perform lots of I/O on standard input and standard output, which is possibly problematic from a thread -- consider doing the input for a thread before you start it (in the main thread) and for needed output use Python's standard logging module, which is guaranteed to be thread-safe. Do you still observe problems then? If that's the case, then the next step is to pepper your code with logging.info calls so that you can pinpoint exactly where it's stalling -- and tell us about that, so we can try to help from there!
I've seen a few threaded downloaders online, and even a few multi-part downloaders (HTTP).
I haven't seen them together as a class/function.
If any of you have a class/function lying around, that I can just drop into any of my applications where I need to grab multiple files, I'd be much obliged.
If there is there a library/framework (or a program's back-end) that does this, please direct me towards it?
Threadpool by Christopher Arndt may be what you're looking for. I've used this "easy to use object-oriented thread pool framework" for the exact purpose you describe and it works great. See the usage examples at the bottom on the linked page. And it really is easy to use: just define three functions (one of which is an optional exception handler in place of the default handler) and you are on your way.
from http://www.chrisarndt.de/projects/threadpool/:
Object-oriented, reusable design
Provides callback mechanism to process results as they are returned from the worker threads.
WorkRequest objects wrap the tasks assigned to the worker threads and allow for easy passing of arbitrary data to the callbacks.
The use of the Queue class solves most locking issues.
All worker threads are daemonic, so they exit when the main program exits, no need for joining.
Threads start running as soon as you create them. No need to start or stop them. You can increase or decrease the pool size at any time, superfluous threads will just exit when they finish their current task.
You don't need to keep a reference to a thread after you have assigned the last task to it. You just tell it: "don't come back looking for work, when you're done!"
Threads don't eat up cycles while waiting to be assigned a task, they just block when the task queue is empty (though they wake up every few seconds to check whether they are dismissed).
Also available at http://pypi.python.org/pypi/threadpool, easy_install, or as a subversion checkout (see project homepage).