Python multiprocessing/threading blocking main thread


I’m trying to write a program in Python. What I want to write is a script which immediately returns a friendly message to the user, but spawns a long subprocess in the background that takes several different files and writes them to a granddaddy file. I’ve done several tutorials on threading and processing, but what I’m running into is that no matter what I try, the program waits and waits until the subprocess is done before it displays the aforementioned friendly message to the user. Here’s what I’ve tried:
Threading example:
#!/usr/local/bin/python
import cgi, cgitb
import time
import threading
class TestThread(threading.Thread):
    def __init__(self):
        super(TestThread, self).__init__()
    def run(self):
        time.sleep(5)
        fileHand = open('../Documents/writable/output.txt', 'w')
        fileHand.write('Big String Goes Here.')
        fileHand.close()
print 'Starting Program'
thread1 = TestThread()
#thread1.daemon = True
thread1.start()
I’ve read these SO posts on multithreading
How to use threading in Python?
running multiple threads in python, simultaneously - is it possible?
How do threads work in Python, and what are common Python-threading specific pitfalls?
The last of these says that running threads concurrently in Python is actually not possible. Fair enough. Most of those posts also mention the multiprocessing module, so I’ve read up on that, and it seems fairly straightforward. Here are some of the resources I’ve found:
How to run two functions simultaneously
Python Multiprocessing Documentation Example
https://docs.python.org/2/library/multiprocessing.html
So here’s the same example translated to multiprocessing:
#!/usr/local/bin/python
import time
from multiprocessing import Process, Pipe
def f():
    time.sleep(5)
    fileHand = open('../Documents/writable/output.txt', 'w')
    fileHand.write('Big String Goes Here.')
    fileHand.close()
if __name__ == '__main__':
    print 'Starting Program'
    p = Process(target=f)
    p.start()
What I want is for these programs to immediately print ‘Starting Program’ (in the web-browser) and then a few seconds later a text file shows up in a directory to which I’ve given write privileges. However, what actually happens is that they’re both unresponsive for 5 seconds and then they print ‘Starting Program’ and create the text file at the same time. I know that my goal is possible because I’ve done it in PHP, using this trick:
//PHP
exec("php child_script.php > /dev/null &");
And I figured it would be possible in Python. Please let me know if I’m missing something obvious or if I’m thinking about this in the completely wrong way. Thanks for your time!
(System information: Python 2.7.6, Mac OSX Mavericks. Python installed with homebrew. My Python scripts are running as CGI executables in Apache 2.2.26)

OK, I think I found the answer. Part of it was my own misunderstanding. A Python script can't simply return a message to a client-side (Ajax) program while still executing a big process. The very act of responding to the client means that the program has finished, threads and all. The solution, then, is to use the Python version of this PHP trick:
//PHP
exec("php child_script.php > /dev/null &");
And in Python:
#Python
import subprocess
subprocess.call("python worker.py > /dev/null &", shell=True)
It starts an entirely new process outside the current one, and it will continue after the current one has ended. I'm going to stick with Python because at least we're using a civilized api function to start the worker script instead of the exec function, which always made me uncomfortable.
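For readers on Python 3, here is a minimal sketch of the same idea that avoids the shell. The helper name spawn_detached and the placeholder worker command are illustrative and not from the original post. It assumes that sending the child's standard streams to the null device is what lets the parent respond without waiting, as in the shell version above. The start_new_session flag only has an effect on POSIX systems.

```python
import subprocess
import sys
import time


def spawn_detached(argv):
    """Start argv in the background and return at once without waiting."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # same role as "> /dev/null"
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # POSIX: detach from the parent's session
    )


if __name__ == "__main__":
    started = time.time()
    # Placeholder worker: a real script would pass its own worker command here.
    worker = spawn_detached([sys.executable, "-c", "import time; time.sleep(2)"])
    print("Starting Program (worker pid %d)" % worker.pid)
    print("returned after %.2f s" % (time.time() - started))
```

In a CGI script the placeholder command would be replaced by something like [sys.executable, "worker.py"], with the friendly message printed right after the call.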

Related

Import Python library in terminal

I need to run a Python script in a terminal, several times. This script requires me to import some libraries. So every time I call the script in the terminal, the libraries are loaded again, which results in a loss of time. Is there any way I can import the libraries once and for all at the beginning?
(If I try the "naive" way, calling first a script just to import libraries then running my code, it doesn't work).
EDIT: I need to run the script in a terminal because it is actually meant to serve another program developed in Java. The Java code calls the Python script in the terminal, reads its result and processes it, then calls it again.

One solution is to leave the Python script running and use a named pipe to communicate between the processes, as in the code below, taken from this answer.
import os, time
pipe_path = "/tmp/mypipe"
if not os.path.exists(pipe_path):
    os.mkfifo(pipe_path)
# Open the fifo. We need to open in non-blocking mode or it will stall until
# someone opens it for writing
pipe_fd = os.open(pipe_path, os.O_RDONLY | os.O_NONBLOCK)
with os.fdopen(pipe_fd) as pipe:
    while True:
        message = pipe.read()
        if message:
            print("Received: '%s'" % message)
        print("Doing other stuff")
        time.sleep(0.5)

The libraries will be unloaded once the script finishes, so the best way you can handle this is to write the script so it can iterate however many times you want, rather than running the whole script multiple times. I would likely use input() (or raw_input() if you're running Python2) to read in however many times you want to iterate over it, or use a library like click to create a command line argument for it.
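A minimal sketch of that "iterate inside one process" idea in Python 3, assuming the caller can send one request per line and read one reply per line. The names handle and serve are hypothetical, and json merely stands in for whatever expensive library the real script imports.

```python
import io
import json  # stands in for an expensive library; imported once at startup
import sys


def handle(request):
    """Hypothetical per-request work: here, just report the request's length."""
    return json.dumps({"request": request, "length": len(request)})


def serve(infile, outfile):
    """Answer one request per input line until the input is closed."""
    for line in infile:
        request = line.strip()
        if not request:
            continue  # ignore blank lines
        outfile.write(handle(request) + "\n")
        outfile.flush()  # push each reply out so the caller is not left waiting


if __name__ == "__main__":
    # Demo input; a long-running worker would call serve(sys.stdin, sys.stdout).
    serve(io.StringIO("first\nsecond\n"), sys.stdout)
```

With serve(sys.stdin, sys.stdout), the Java program would start the script once, keep its stdin and stdout open, and exchange one line per call, so the imports are paid for only once.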

GDB not stopping with "interrupt" command from python script

I've been ripping my hair out over this. I've searched the internet and can't seem to find a solution to my problem. I'm trying to auto test some code using the gdb module from Python. I can run basic commands and things are working, except for stopping a process that's running in the background. Currently I continue my program in the background after a breakpoint with this:
gdb.execute("c&")
I then interact with the running program reading different constant values and getting responses from the program.
Next I need to get a chunk of memory so I run these commands:
gdb.execute("interrupt") #Pause execution
gdb.execute("dump binary memory montiormem.bin 0x0 (&__etext + 4)") #dump memory to file
But when I run the memory dump I get an error saying the command can't be run while the target is running. Only after the error does the interrupt command run and pause the target; then I can run the memory dump from the gdb console window.
I found a similar issue from a while ago that seems to not be answered here.
I'm using python2.7.
I also found this link which seems to be the issue but no indication if it's in my build of gdb (which seems unlikely).

I had the same problem. From what I can tell from googling, it is a current limitation of gdb: interrupt simply doesn't work in batch mode (when specifying commands with --ex or -x file, on stdin, or by sourcing a file). gdb runs the following commands before actually stopping the execution, and inserting a delay doesn't help. Building on @dwjbosman's solution, here's a compact version suitable for feeding to gdb with --ex arguments, for example:
python import threading, gdb
python threading.Timer(1.0, lambda: gdb.post_event(lambda: gdb.execute("interrupt"))).start()
cont
thread apply all bt full # or whatever you wanted to do
It schedules an interrupt after 1 second and resumes the program, then you can do whatever you wanted to do after the pause right in the main script.

I had the same problem, but found that none of the other answers here really work if you are trying to script everything from python. The issue that I ran into was that when I called gdb.execute('continue'), no code in any other python thread would execute. This appears to be because gdb does not release the python GIL while the continue command is waiting for the program to be interrupted.
What I found that actually worked for me was this:
import time
import gdb  # only available inside gdb's embedded Python
def delayed_interrupt():
    time.sleep(1)
    gdb.execute('interrupt')
gdb.post_event(delayed_interrupt)
gdb.execute('continue')

I just ran into this same issue while writing some automated testing scripts. What I've noticed is that the 'interrupt' command doesn't stop the application until after the current script has exited.
Unfortunately, this means that you would need to segment your scripts anytime you are causing an interrupt.
Script 1:
gdb.execute('c&')
gdb.execute('interrupt')
Script 2:
gdb.execute("dump binary memory montiormem.bin 0x0 (&__etext + 4)")

I used multithreading to get around this issue:
import threading
import time
import gdb  # only available inside gdb's embedded Python
def post(cmd):
    def _callable():
        print("exec " + cmd, flush=True)
        gdb.execute(cmd)
    print("schedule " + cmd, flush=True)
    gdb.post_event(_callable)
class ScriptThread(threading.Thread):
    def run(self):
        while True:
            post("echo hello\n")
            time.sleep(1)
x = ScriptThread()
x.start()
Save this as "test_script.py"
Use the script as follows:
gdb
> source test_script.py
Note that you can also pipe "source test_script.py" into gdb, but you need to keep the pipe open.
Once the thread is started GDB will wait for the thread to end and will process any commands you send to it via the "post_event" function. Even "interrupt"!

Control executed program with Python

I want to terminate a test run started via bash if the test takes too much time. So far, I found some good solutions here. But since the kill command does not work properly (even when I use it correctly, it reports incorrect usage), I decided to solve this problem using Python. This is the execution call I want to monitor:
EXE="C:/program.exe"
FILE="file.tpt"
HOME_DIR="C:/Home"
"$EXE" -vm-Xmx4096M --run build "$HOME_DIR/test/$FILE" "Auslieferung (ML) Execute"
(The opened *.exe starts a test run that includes some Simulink simulation runs. Sometimes there are Simulink errors; in that case the tests take too long and I want to restart the entire process.)
First, I came up with the idea, calling a shell script containing these lines within a subprocess from python:
import subprocess
import time
process = subprocess.Popen('subprocess.sh', shell = True)
time.sleep(10)
process.terminate()
But when I use this, neither *.terminate() nor *.kill() closes the program I started with the subprocess call.
That's why I am now trying to implement the entire call in Python. I got the following so far:
import subprocess
file = "somePath/file.tpt"
p = subprocess.Popen(["C:/program.exe", file])
Now I need to know how to pass the second argument, "Auslieferung (ML) Execute", from the bash call. This argument starts an internal test run named "Auslieferung (ML) Execute". Any ideas? Or is it better to choose one of the other ways? Or can I get the "kill" option for bash somewhere, somehow?
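One possible direction, sketched in Python 3 under stated assumptions: when Popen is given a list, each element is passed as one argument, so the bash call above would presumably become ["C:/program.exe", "-vm-Xmx4096M", "--run", "build", file, "Auslieferung (ML) Execute"]. The helper run_with_timeout is hypothetical, and the demo uses a placeholder command rather than the real program. Starting the program directly instead of through a shell script may also be why kill() then reaches it: with shell=True, the Popen object refers to the shell, not to the program the shell launched.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; return its exit code, or None if it was killed for overrunning."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()   # stops the process we started directly
        proc.wait()   # reap it so no zombie is left behind
        return None


if __name__ == "__main__":
    # Placeholder command standing in for the real program and its arguments.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow, 1))
```

A caller could loop on this helper and restart the run whenever it returns None.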

Multiprocessing launching too many instances of Python VM

I am writing some multiprocessing code (Python 2.6.4, WinXP) that spawns processes to run background tasks. In playing around with some trivial examples, I am running into an issue where my code just continuously spawns new processes, even though I only tell it to spawn a fixed number.
The program itself runs fine, but if I look in Windows TaskManager, I keep seeing new 'python.exe' processes appear. They just keep spawning more and more as the program runs (eventually starving my machine).
For example, I would expect the code below to launch 2 python.exe processes: the first being the program itself, and the second being the child process it spawns. Any idea what I am doing wrong?
import time
import multiprocessing
class Agent(multiprocessing.Process):
    def __init__(self, i):
        multiprocessing.Process.__init__(self)
        self.i = i
    def run(self):
        while True:
            print 'hello from %i' % self.i
            time.sleep(1)
agent = Agent(1)
agent.start()

It looks like you didn't carefully follow the guidelines in the documentation, specifically this section where it talks about "Safe importing of main module".
You need to protect your launch code with an if __name__ == '__main__': block or you'll get what you're getting, I believe.
I believe it comes down to the multiprocessing module not being able to use os.fork() on Windows as it does on Linux, where an already-running process is basically cloned in memory. Windows has no such fork(), so multiprocessing must run a new Python interpreter and tell it to import your main module and then execute the start/run method once that's done. If you have code at "module level", unprotected by the name check, then during the import it starts the whole sequence over again, ad infinitum.
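A sketch of the question's Agent with the guard in place, written for Python 3. Two things are changed for illustration only and are not part of the original code: the loop is finite (a hypothetical repeats parameter) so the example terminates, and the greeting is built in a small message() helper.

```python
import multiprocessing
import time


class Agent(multiprocessing.Process):
    def __init__(self, i, repeats=3):
        multiprocessing.Process.__init__(self)
        self.i = i
        self.repeats = repeats  # finite here so the sketch terminates

    def message(self):
        return "hello from %i" % self.i

    def run(self):
        for _ in range(self.repeats):
            print(self.message())
            time.sleep(0.1)


if __name__ == "__main__":
    # Only the original parent gets here. A child that re-imports this
    # module on Windows skips the block and so cannot spawn again.
    agent = Agent(1)
    agent.start()
    agent.join()
```

The class definition stays at module level so the child can find it on import; only the code that creates and starts processes moves under the guard.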

When I run this in Linux with python2.6, I see a maximum of 4 python2.6 processes and I can't guarantee that they're all from this process. They're definitely not filling up the machine.
Need new python version? Linux/Windows difference?

I don't see anything wrong with that. Works fine on Ubuntu 9.10 (Python 2.6.4).
Are you sure you don't have cron or something starting multiple copies of your script? Or that the spawned script is not calling anything that would start a new instance, for example as a side effect of import if your code runs directly on import?

Python - simple reading lines from a pipe

I'm trying to read lines from a pipe and process them, but I'm doing something silly and I can't figure out what. The producer is going to keep producing lines indefinitely, like this:
producer.py
import time
while True:
    print 'Data'
    time.sleep(1)
The consumer just needs to check for lines periodically:
consumer.py
import sys, time
while True:
    line = sys.stdin.readline()
    if line:
        print 'Got data:', line
    else:
        time.sleep(1)
When I run this in the Windows shell as python producer.py | python consumer.py, it just sleeps forever (never seems to get data?) It seems that maybe the problem is that the producer never terminates, since if I send a finite amount of data then it works fine.
How can I get the data to be received and show up for the consumer? In the real application, the producer is a C++ program I have no control over.

Some old versions of Windows simulated pipes through files (so they were prone to such problems), but that hasn't been a problem in 10+ years. Try adding a
sys.stdout.flush()
to the producer after the print, and also try to make the producer's stdout unbuffered (by using python -u).
Of course this doesn't help if you have no control over the producer -- if it buffers too much of its output you're still going to wait a long time.
Unfortunately, while there are many approaches to solve that problem on Unix-like operating systems, such as pyexpect, pexpect, exscript, and paramiko, I doubt any of them works on Windows. If that's indeed the case, I'd try Cygwin, which puts enough of a Linux-like veneer on Windows to often enable the use of Linux-like approaches on a Windows box.

The cause is that Python buffers I/O by default. Pass the -u option to the interpreter to disable this behavior:
python -u producer.py | python consumer.py
It fixes the problem for me.
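The flushing fix can also be sketched in Python 3, where print accepts flush=True in place of a separate sys.stdout.flush() call. In this illustration the producer is a short inline script held in the hypothetical constant PRODUCER, and read_lines is a hypothetical helper that plays the consumer; neither name comes from the original posts.

```python
import subprocess
import sys

# Inline stand-in for producer.py: it flushes after every line it prints.
PRODUCER = (
    "import time\n"
    "for i in range(3):\n"
    "    print('Data', i, flush=True)\n"
    "    time.sleep(0.2)\n"
)


def read_lines(argv):
    """Yield the child's output lines as they arrive on the pipe."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    for line in read_lines([sys.executable, "-c", PRODUCER]):
        print("Got data:", line)
```

Without the flush (or -u), a producer writing to a pipe may hold its output in a buffer, which matches the behavior described in the question.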
