Why doesn't my process join or run? - Python

I have a simple problem to solve (more or less).
If I watch Python multiprocessing tutorials, I see that a process should be started more or less like this:
from multiprocessing import *

def u(m):
    print(m)
    return

A = Process(target=u, args=(0,))
A.start()
A.join()
It should print a 0 but nothing gets printed. Instead it hangs forever at the A.join().
If I manually start the function u by doing this
A.run()
it actually prints 0 in the shell, but it doesn't run concurrently.
For example, the output of the following code:
from multiprocessing import *
from time import sleep

def u(m):
    sleep(1)
    print(m)
    return

A = Process(target=u, args=(1,))
A.start()
print(0)
should be
0
1
but actually is
0
and if I add this before the last line:
A.run()
then the output becomes
1
0
This seems confusing to me... and if I try to join the process, it waits forever.
However, in case it helps in giving an answer:
my OS is Mac OS X 10.6.8,
the Python versions used are 3.1 and 3.3,
and my computer has one Intel Core i3 processor.
--Update--
I have noticed that this strange behaviour is present only when launching the program from IDLE; if I run the program from the terminal everything works as it is supposed to, so this problem must be connected to some IDLE bug.
But running programs from the terminal is even weirder: using something like range(100000000) eats up all my computer's RAM until the end of the program; if I remember correctly this shouldn't happen in Python 3, only in older Python versions.
I hope this new information will help you give an answer.
--Update 2--
The bug occurs even if the process performs no output: setting
def u():
    return
as the target of the process and then starting it, if I try to join the process, IDLE waits forever.

As suggested here and here, the problem is that IDLE overrides sys.stdin and sys.stdout in some weird ways that do not propagate cleanly to processes spawned from it (they are not real file handles).
The first link also indicates it's unlikely to be fixed any time soon ("may be a 'cannot fix' issue", they say).
So unfortunately the only solution I can suggest is not to use IDLE for this script...

Have you tried adding A.join() to your program? I am guessing that your main process is exiting before the child process prints, which causes the output to be hidden. If you tell the main process to wait for the child process (A.join()), I bet you'll see the output you expect.
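A minimal sketch of the question's second snippet with the join added (the __main__ guard is my addition, a common precaution on platforms that spawn rather than fork):

from multiprocessing import Process
from time import sleep

def u(m):
    sleep(1)
    print(m)

if __name__ == '__main__':
    A = Process(target=u, args=(1,))
    A.start()
    print(0)
    A.join()  # wait for the child before the main process exits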

Given that it only happens with IDLE, I suspect the problem has to do with the stdout used by both processes. Perhaps it's some file-like object that's not safe to use from two different processes.
If you don't have the child process write to stdout, I suspect it will complete and join properly. For example, you could have it write to a file instead. Or you could set up a pipe between the parent and child, as sketched below.
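A minimal sketch of the pipe variant, keeping the question's setup; the child sends its value back instead of printing, and the parent does the printing:

from multiprocessing import Process, Pipe

def u(m, conn):
    conn.send(m)   # send the result back instead of printing in the child
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    A = Process(target=u, args=(0, child_conn))
    A.start()
    print(parent_conn.recv())  # the parent owns the real stdout
    A.join()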

Have you tried unbuffered output? Try importing the sys module and changing the print call to write to stderr:
print(m, file=sys.stderr)
How does this affect the behavior? I'm with the others who suspect that IDLE is mucking with the stdio . . .


Stop KeyboardInterrupt reaching batch file processor

I'm writing some code in Python 3 on Windows that looks like this:
try:
do something that takes a long time
(training a neural network in TensorFlow, as it happens)
except KeyboardInterrupt:
print('^C')
print a summary of results
still useful even if the training was cut short early
This works perfectly if run directly from the console with python foo.py.
However, if the call to Python was within a batch file, it ends up doing all of the above but then still spams the console with the 'terminate batch job' prompt.
Is there a way to stop that happening, by fully eating the ^C within Python, by jumping all the way out of the batch file, or otherwise?
Use the break command (more info here) in the batch file, which will disable CTRL+C from halting the file.
EDIT: According to this site about the break command:
Newer versions of Windows (Windows ME, Windows 2000, Windows XP, and higher) only include this command for backward compatibility, and turning the break off has no effect.
I personally tested this and can confirm; I will edit when I find a workaround.
EDIT #2: You could have a second batch script that runs start "" /b /wait cmd /c "yourfile.bat", although this is known to cause glitches with other nested batch files.
The flag to disable Ctrl+C is inherited by child processes, so Python will no longer raise a KeyboardInterrupt. On top of that, Python still has bugs when a read from the console is interrupted by Ctrl+C without a SIGINT being delivered from the CRT. The Python script should manually re-enable Ctrl+C via ctypes, as shown in the next edit.
EDIT #3: As pointed out by Eryksyn (in the comments), you can use ctypes to enable it:
import ctypes

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
# Passing (None, False) removes the inherited "ignore Ctrl+C" flag,
# so KeyboardInterrupt can be raised again.
success = kernel32.SetConsoleCtrlHandler(None, False)
EDIT #4: I think I found it; try this (although it may not work). Can you use the threading module? (The original snippet was Python 2; since the question is Python 3, here it is with range and print():)
import time
from threading import Thread

def noInterrupt():
    for i in range(4):
        print(i)
        time.sleep(1)

a = Thread(target=noInterrupt)
a.start()
a.join()
print("done")

Why is sleep running before print here? [duplicate]

I was writing a simple program in Python 3.1 and I stumbled upon this:
If I run this in IDLE it works as intended: it prints "Initializing." and then adds two dots, one after each second, and waits for input.
from time import sleep

def initialize():
    print('Initializing.', end='')
    sleep(1)
    print(" .", end='')
    sleep(1)
    print(" .", end='')
    input()

initialize()
The problem is that when I double-click the .py file to execute it, it runs on python.exe instead of pythonw.exe, and strange things happen: it joins all the sleep() times, i.e. it makes me wait for two seconds and then prints the whole string Initializing. . . at once. Why does this happen, and is there a way to avoid it in the terminal? It works fine if I use IDLE on both Windows and Linux.
This is because the output is being buffered.
You should add a sys.stdout.flush() after each write.
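A minimal sketch of that fix applied to the question's initialize() (note that print's flush=True keyword only arrived in Python 3.3, so an explicit sys.stdout.flush() is the safe choice on 3.1):

import sys
from time import sleep

def initialize():
    print('Initializing.', end='')
    sys.stdout.flush()  # push the buffered text to the screen immediately
    sleep(1)
    print(" .", end='')
    sys.stdout.flush()
    sleep(1)
    print(" .", end='')
    sys.stdout.flush()
    input()

initialize()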
It sounds like the difference is that stdout is automatically being flushed in IDLE. For efficiency, programming languages often save up a bunch of print calls before writing to the screen, which is a slow process.
Here's another question that has the answer you need:
How to flush output of Python print?

Python Multiprocessing - sending inputs to child processes

I am using the multiprocessing module in Python to launch a few processes in parallel. These processes are independent of each other. They generate their own output and write out the results to different files. Each process calls an external tool using the subprocess.call method.
It was working fine until I discovered an issue in the external tool where, due to some error condition, it goes into a 'prompt' mode and waits for user input. Now in my Python script I use the join method to wait till all the processes finish their tasks. This causes the whole thing to wait for this erroneous subprocess call. I could put a timeout on each of the processes, but I do not know in advance how long each one is going to run, and hence this option is ruled out.
How do I figure out if any child process is waiting for user input, and how do I send an 'exit' command to it? Any pointers or suggestions to relevant modules in Python will be really appreciated.
My code here:
import subprocess
import sys
import os
import multiprocessing

def write_script(fname, e):
    f = open(fname, 'w')
    f.write("Some useful command calling external tool")
    f.close()
    subprocess.call(['chmod', '+x', os.path.abspath(fname)])
    return os.path.abspath(fname)

def run_use(mname, script):
    print "ssh " + mname + " " + script
    subprocess.call(['ssh', mname, script])

if __name__ == '__main__':
    dict1 = {}
    dict1['mod1'] = ['pp1', 'ext2', 'les3', 'pw4']
    dict1['mod2'] = ['aaa', 'bbb', 'ccc', 'ddd']
    machines = ['machine1', 'machine2', 'machine3', 'machine4']
    log_file.write(str(dict1.keys()))  # log_file is assumed to be opened elsewhere
    for key in dict1.keys():
        arr = []
        for mod in dict1[key]:
            d = {}
            arr.append(mod)
            if (mod == dict1[key][-1]) or (len(arr) % 4 == 0):
                for i in range(0, len(arr)):
                    e = arr.pop()
                    script = write_script(e + "_temp.sh", e)
                    d[i] = multiprocessing.Process(target=run_use, args=(machines[i], script,))
                    d[i].daemon = True
                for pp in d:
                    d[pp].start()
                for pp in d:
                    d[pp].join()
Since you're writing a shell script to run your subcommands, can you simply tell them to read input from /dev/null?
#!/bin/bash
# ...
my_other_command -a -b arg1 arg2 < /dev/null
# ...
This may stop them blocking on input and is a really simple solution. If this doesn't work for you, read on for some other options.
The subprocess.call() function is simply shorthand for constructing a subprocess.Popen instance and then calling the wait() method on it. So, your spawned processes could instead create their own subprocess.Popen instances and poll them with the poll() method (in a loop with a suitable delay) instead of calling wait(). This leaves them free to remain in communication with the main process so that, for example, the main process can tell a child to terminate the Popen instance with the terminate() or kill() methods and then exit itself. A sketch of such a polling loop follows.
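Here is a minimal sketch of that polling loop; should_stop is a hypothetical callable (e.g. the is_set method of a multiprocessing.Event) through which the main process signals this worker:

import subprocess
import time

def run_with_poll(cmd, should_stop, poll_delay=1.0):
    # should_stop is a placeholder for whatever signalling mechanism
    # the main process uses, e.g. a multiprocessing.Event.
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:   # None means the subprocess is still running
        if should_stop():
            proc.terminate()     # ask the subprocess to exit
            proc.wait()
            break
        time.sleep(poll_delay)
    return proc.returncode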
So, the question is how the child process tells whether the subprocess is awaiting user input, and that's the trickier part. I would say perhaps the easiest approach is to monitor the output of the subprocess and search for the user input prompt, assuming that it always uses some string that you can look for. Alternatively, if the subprocess is expected to generate output continually, then you could simply look for any output, and if a configured amount of time passes without any, declare the process dead and terminate it as detailed above.
Since you're reading the output, you actually don't need poll() or wait(): the process closing its output file descriptor is good enough to know that it has terminated in this case.
Here's an example of a modified run_use() method which watches the output of the subprocess:
def run_use(mname, script):
    print "ssh " + mname + " " + script
    proc = subprocess.Popen(['ssh', mname, script], stdout=subprocess.PIPE)
    for line in proc.stdout:
        if "UserPrompt>>>" in line:
            proc.terminate()
            break
In this example we assume that the process either gets hung up on UserPrompt>>> (replace it with the appropriate string) or terminates naturally. If it were to get stuck in an infinite loop, for example, then your script would still not terminate; you can only really address that with an overall timeout (an inactivity-timeout variant is sketched below), but you didn't seem keen to do that. Hopefully your subprocess won't misbehave in that way, however.
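For completeness, a minimal sketch of the inactivity-timeout idea, assuming a Unix platform (select works on pipes there) and a hypothetical 60-second limit:

import select
import subprocess

def run_with_inactivity_timeout(cmd, timeout=60.0):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    while True:
        # Wait up to `timeout` seconds for output to become readable.
        readable, _, _ = select.select([proc.stdout], [], [], timeout)
        if not readable:
            proc.terminate()   # no output for too long: declare it stuck
            break
        line = proc.stdout.readline()
        if not line:           # EOF: the subprocess closed its stdout
            break
    proc.wait()
    return proc.returncode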
Finally, if you don't know in advance the prompt that will be given by your process, then your job is rather harder. Effectively what you're asking to do is monitor an external process and know when it's blocked reading on a file descriptor, and I don't believe there's a particularly clean solution to this. You could consider running the process under strace or similar, but that's quite an awful hack and I really wouldn't recommend it. Things like strace are great for manual diagnostics, but they really shouldn't be part of a production setup.

Python: first instance of subprocess.Popen() is very slow

I'm sure I'm missing something simple, but when using the subprocess module there is a very significant wait (> 10 seconds) before the first subprocess starts. The second one starts shortly after the first. Is there any way to fix this? Code below:
EDIT: To add: HWAccess (in proc.py) links a DLL. Could this have anything to do with it?
EDIT 2: I've boiled the test down to starting a SINGLE subprocess, and it takes significantly longer to import HWAccess than if I just run proc.py directly from the cmd prompt. I don't see how this has anything to do with the DLL specifically if it loads fast from cmd but not as a subprocess through test.py.
test.py:
import subprocess
import os
import time

print 'STARTING'
proc0 = subprocess.Popen(['python', 'proc.py', '0'])
proc1 = subprocess.Popen(['python', 'proc.py', '1'])
while True:
    try:
        pass
    except KeyboardInterrupt:
        os._exit(0)
    except ValueError:
        pass
proc.py:
print 'Process starting...'
import HWAccess
print 'HWAccess imported...'
import sys
print 'sys imported...'
import time
print 'time imported...'
print 'hi from ',sys.argv[1]
Edit: After putting the prints in, it takes around 5 seconds to reach the first 'Process starting...', and the second process prints 'Process starting...' immediately afterwards. Then there is a ~30 second pause to import HWAccess (which takes a matter of seconds when run as an individual process); the second process then immediately prints that it too has imported HWAccess. From then on execution is fast. HWAccess links a .dll, so I'm wondering if two processes trying to import HWAccess result in some sort of race condition that takes a while to negotiate.
I am not sure if this is the right track, but I remember seeing such delays when starting a process way back (and not at all Python related), and it turned out they were related to some badly configured network settings on my computer. Upon subprocess start-up, it has to set up interprocess communication, and those settings might interfere.
I remember my problems were related to using a false hostname for the machine, which was not properly configured on the network; can you check whether that is your case? If it is not a production machine, try not setting a hostname at all, leaving it as "localhost". A quick check is sketched below.
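A quick way to test this hypothesis (my own suggestion, not from the original answer): time how long resolving your own hostname takes; milliseconds are normal, whole seconds point at the network configuration:

import socket
import time

start = time.time()
name = socket.gethostname()
socket.gethostbyname(name)  # resolve our own hostname
print('resolving %s took %.3f seconds' % (name, time.time() - start))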

How to kill headless X server started via Python?

I want to get screenshots of a webpage in Python. For this I am using http://github.com/AdamN/python-webkit2png/.
newArgs = ["xvfb-run", "--server-args=-screen 0, 640x480x24", sys.argv[0]]
for i in range(1, len(sys.argv)):
    if sys.argv[i] not in ["-x", "--xvfb"]:
        newArgs.append(sys.argv[i])
logging.debug("Executing %s" % " ".join(newArgs))
os.execvp(newArgs[0], newArgs)
This basically calls xvfb-run with the correct args. But man xvfb says:
Note that the demo X clients used in the above examples will not exit on their own, so they will have to be killed before xvfb-run will exit.
So that means that this script will <????> if this whole thing is in a loop (to get multiple screenshots), unless the X server is killed. How can I do that?
The documentation for os.execvp states:
These functions all execute a new program, replacing the current process; they do not return. [..]
So after calling os.execvp no other statement in the program will be executed. You may want to use subprocess.Popen instead:
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several other, older modules and functions, such as: [...]
Using subprocess.Popen, the code to run xlogo in the virtual framebuffer X server becomes:
import subprocess
xvfb_args = ['xvfb-run', '--server-args=-screen 0, 640x480x24', 'xlogo']
process = subprocess.Popen(xvfb_args)
Now the problem is that xvfb-run launches Xvfb in a background process. Calling process.kill() will not kill Xvfb (at least not on my machine...). I have been fiddling around with this a bit, and so far the only thing that works for me is:
import os
import signal
import subprocess
SERVER_NUM = 99 # 99 is the default used by xvfb-run; you can leave this out.
xvfb_args = ['xvfb-run', '--server-num=%d' % SERVER_NUM,
             '--server-args=-screen 0, 640x480x24', 'xlogo']
subprocess.Popen(xvfb_args)
# ... do whatever you want to do here...
pid = int(open('/tmp/.X%s-lock' % SERVER_NUM).read().strip())
os.kill(pid, signal.SIGINT)
So this code reads the process ID of Xvfb from /tmp/.X99-lock and sends the process an interrupt. It works, but does yield an error message every now and then (I suppose you can ignore it, though). Hopefully somebody else can provide a more elegant solution. Cheers.
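One possibly more elegant alternative (a sketch under my own assumptions, not from the original answer): skip xvfb-run and start Xvfb directly, so the Popen handle owns the server process and terminate() suffices. The display number and screen geometry below are placeholders:

import os
import subprocess
import time

DISPLAY_NUM = 99  # assumed-free display number

# Start Xvfb directly; the Popen object then owns the server process.
xvfb = subprocess.Popen(['Xvfb', ':%d' % DISPLAY_NUM,
                         '-screen', '0', '640x480x24'])
time.sleep(1)  # crude wait for the server to start accepting connections

# Point the client at the virtual display.
env = dict(os.environ, DISPLAY=':%d' % DISPLAY_NUM)
client = subprocess.Popen(['xlogo'], env=env)

# ... take screenshots etc. here ...

client.terminate()
xvfb.terminate()  # this kills Xvfb itself; no lock file needed
xvfb.wait()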
