Python v3.5, Windows 10
I'm using multiple processes and trying to captures user input. Searching everything I see there are odd things that happen when using input() with multiple processes. After 8 hours+ of trying, nothing I implement worked, I'm positive I am doing it wrong but I can't for the life of me figure it out.
The following is a very stripped down program that demonstrates the issue. Now it works fine when I run this program within PyCharm, but when I use pyinstaller to create a single executable it fails. The program constantly is stuck in a loop asking the user to enter something as shown below:.
I am pretty sure it has to do with how Windows takes in standard input from things I've read. I've also tried passing the user input variables as Queue() items to the functions but the same issue. I read you should put input() in the main python process so I did that under if __name__ = '__main__':
from multiprocessing import Process
import time
def func_1(duration_1):
while duration_1 >= 0:
time.sleep(1)
print('Duration_1: %d %s' % (duration_1, 's'))
duration_1 -= 1
def func_2(duration_2):
while duration_2 >= 0:
time.sleep(1)
print('Duration_2: %d %s' % (duration_2, 's'))
duration_2 -= 1
if __name__ == '__main__':
# func_1 user input
while True:
duration_1 = input('Enter a positive integer.')
if duration_1.isdigit():
duration_1 = int(duration_1)
break
else:
print('**Only positive integers accepted**')
continue
# func_2 user input
while True:
duration_2 = input('Enter a positive integer.')
if duration_2.isdigit():
duration_2 = int(duration_2)
break
else:
print('**Only positive integers accepted**')
continue
p1 = Process(target=func_1, args=(duration_1,))
p2 = Process(target=func_2, args=(duration_2,))
p1.start()
p2.start()
p1.join()
p2.join()
You need to use multiprocessing.freeze_support() when you produce a Windows executable with PyInstaller.
Straight out from the docs:
multiprocessing.freeze_support()
Add support for when a program which uses multiprocessing has been frozen to produce a Windows executable. (Has been tested with py2exe, PyInstaller and cx_Freeze.)
One needs to call this function straight after the if name == 'main' line of the main module. For example:
from multiprocessing import Process, freeze_support
def f():
print('hello world!')
if __name__ == '__main__':
freeze_support()
Process(target=f).start()
If the freeze_support() line is omitted then trying to run the frozen executable will raise RuntimeError.
Calling freeze_support() has no effect when invoked on any operating system other than Windows. In addition, if the module is being run normally by the Python interpreter on Windows (the program has not been frozen), then freeze_support() has no effect.
In your example you also have unnecessary code duplication you should tackle.
Related
I have a command-line program which runs on single core. It takes an input file, does some calculations, and returns several files which I need to parse to store the produced output.
I have to call the program several times changing the input file. To speed up the things I was thinking parallelization would be useful.
Until now I have performed this task calling every run separately within a loop with the subprocess module.
I wrote a script which creates a new working folder on every run and than calls the execution of the program whose output is directed to that folder and returns some data which I need to store. My question is, how can I adapt the following code, found here, to execute my script always using the indicated amount of CPUs, and storing the output.
Note that each run has a unique running time.
Here the mentioned code:
import subprocess
import multiprocessing as mp
from tqdm import tqdm
NUMBER_OF_TASKS = 4
progress_bar = tqdm(total=NUMBER_OF_TASKS)
def work(sec_sleep):
command = ['python', 'worker.py', sec_sleep]
subprocess.call(command)
def update_progress_bar(_):
progress_bar.update()
if __name__ == '__main__':
pool = mp.Pool(NUMBER_OF_TASKS)
for seconds in [str(x) for x in range(1, NUMBER_OF_TASKS + 1)]:
pool.apply_async(work, (seconds,), callback=update_progress_bar)
pool.close()
pool.join()
I am not entirely clear what your issue is. I have some recommendations for improvement below, but you seem to claim on the page that you link to that everything works as expected and I don't see anything very wrong with the code as long as you are running on Linux.
Since the subprocess.call method is already creating a new process, you should just be using multithreading to invoke your worker function, work. But had you been using multiprocessing and your platform was one that used the spawn method to create new processes (such as Windows), then having the creation of the progress bar outside of the if __name__ = '__main__': block would have resulted in the creation of 4 additional progress bars that did nothing. Not good! So for portability it would have been best to move its creation to inside the if __name__ = '__main__': block.
import subprocess
from multiprocessing.pool import ThreadPool
from tqdm import tqdm
def work(sec_sleep):
command = ['python', 'worker.py', sec_sleep]
subprocess.call(command)
def update_progress_bar(_):
progress_bar.update()
if __name__ == '__main__':
NUMBER_OF_TASKS = 4
progress_bar = tqdm(total=NUMBER_OF_TASKS)
pool = ThreadPool(NUMBER_OF_TASKS)
for seconds in [str(x) for x in range(1, NUMBER_OF_TASKS + 1)]:
pool.apply_async(work, (seconds,), callback=update_progress_bar)
pool.close()
pool.join()
Note: If your worker.py program prints to the console, it will mess up the progress bar (the progress bar will be re-written repeatedly on multiple lines).
Have you considered instead importing worker.py (some refactoring of that code might be necessary) instead of invoking a new Python interpreter to execute it (in this case you would want to be explicitly using multiprocessing). On Windows this might not save you anything since a new Python interpreter would be executed for each new process anyway, but this could save you on Linux:
import subprocess
from multiprocessing.pool import Pool
from worker import do_work
from tqdm import tqdm
def update_progress_bar(_):
progress_bar.update()
if __name__ == '__main__':
NUMBER_OF_TASKS = 4
progress_bar = tqdm(total=NUMBER_OF_TASKS)
pool = Pool(NUMBER_OF_TASKS)
for seconds in [str(x) for x in range(1, NUMBER_OF_TASKS + 1)]:
pool.apply_async(do_work, (seconds,), callback=update_progress_bar)
pool.close()
pool.join()
On windows, I need to stop a python method execution after a specific time using Python 3.7 version. I tried with multiprocessing.Process but it is not able to call the method.
from multiprocessing import Process
import time
def func1(name):
print("Hi from func1")
i = 0
while True:
i += 1
print('hello', name)
time.sleep(1)
if __name__ == '__main__':
p = Process(target=func1, args=('bob',))
p.start()
time.sleep(5)
p.terminate()
In the above code, it is not able to invoke the func1 method. Please let me know if I am missing something or is there any better way to achieve this.
What I'd like to do is the following program to print out:
Running Main
Running Second
Running Main
Running Second
[...]
Code:
from multiprocessing import Process
import time
def main():
while True:
print('Running Main')
time.sleep(1)
def second():
while True:
print('Running Second')
time.sleep(1)
p1 = Process(main())
p2 = Process(second())
p1.start()
p2.start()
But it doesn't have the desired behavior. Instead it just prints out:
Running Main
Running Main
[...]
I suspect my program doesn't work because of the while statement?
Is there any way I can overcome this problem and have my program print out what I mentioned no matter what I execute in my function?
The issue here seems to be when you make the process vars. I suspect the reason for why the process inclusively runs the first function is because of syntax. My interpretation is that instead of creating a process out of a function you are making a process that executes a function exclusively.
When you want to create Process object you want to avoid using this
p1 = Process(target=main())
and rather write
p1 = Process(target=main)
That also means if you want to include any input for the function you will have to
p1 = Process(target=main, args=('hi',))
Python v3.5, Windows 10
I'm using multiple processes and trying to captures user input. Searching everything I see there are odd things that happen when using input() with multiple processes. After 8 hours+ of trying, nothing I implement worked, I'm positive I am doing it wrong but I can't for the life of me figure it out.
The following is a very stripped down program that demonstrates the issue. Now it works fine when I run this program within PyCharm, but when I use pyinstaller to create a single executable it fails. The program constantly is stuck in a loop asking the user to enter something as shown below:.
I am pretty sure it has to do with how Windows takes in standard input from things I've read. I've also tried passing the user input variables as Queue() items to the functions but the same issue. I read you should put input() in the main python process so I did that under if __name__ = '__main__':
from multiprocessing import Process
import time
def func_1(duration_1):
while duration_1 >= 0:
time.sleep(1)
print('Duration_1: %d %s' % (duration_1, 's'))
duration_1 -= 1
def func_2(duration_2):
while duration_2 >= 0:
time.sleep(1)
print('Duration_2: %d %s' % (duration_2, 's'))
duration_2 -= 1
if __name__ == '__main__':
# func_1 user input
while True:
duration_1 = input('Enter a positive integer.')
if duration_1.isdigit():
duration_1 = int(duration_1)
break
else:
print('**Only positive integers accepted**')
continue
# func_2 user input
while True:
duration_2 = input('Enter a positive integer.')
if duration_2.isdigit():
duration_2 = int(duration_2)
break
else:
print('**Only positive integers accepted**')
continue
p1 = Process(target=func_1, args=(duration_1,))
p2 = Process(target=func_2, args=(duration_2,))
p1.start()
p2.start()
p1.join()
p2.join()
You need to use multiprocessing.freeze_support() when you produce a Windows executable with PyInstaller.
Straight out from the docs:
multiprocessing.freeze_support()
Add support for when a program which uses multiprocessing has been frozen to produce a Windows executable. (Has been tested with py2exe, PyInstaller and cx_Freeze.)
One needs to call this function straight after the if name == 'main' line of the main module. For example:
from multiprocessing import Process, freeze_support
def f():
print('hello world!')
if __name__ == '__main__':
freeze_support()
Process(target=f).start()
If the freeze_support() line is omitted then trying to run the frozen executable will raise RuntimeError.
Calling freeze_support() has no effect when invoked on any operating system other than Windows. In addition, if the module is being run normally by the Python interpreter on Windows (the program has not been frozen), then freeze_support() has no effect.
In your example you also have unnecessary code duplication you should tackle.
It seems multiprocessing swaps between threads faster so I started working on swapping over but I'm getting some unexpected results. It causes my entire script to loop several times when a thread didn't before.
Snippet example:
stuff_needs_done = true
more_stuff_needs_done = true
print "Doing stuff"
def ThreadStuff():
while 1 == 1:
#do stuff here
def OtherThreadStuff():
while 1 == 1:
#do other stuff here
if stuff_needs_done == true:
Thread(target=ThreadStuff).start()
if more_stuff_needs_done == true:
Thread(target=OtherThreadStuff).start()
This works as I'd expect. The threads start and run until stopped. But when running a lot of these the overhead is higher (so I'm told) so I tried swapping to multiprocessing.
Snippet example:
stuff_needs_done = true
more_stuff_needs_done = true
print "Doing stuff"
def ThreadStuff():
while 1 == 1:
#do stuff here
def OtherThreadStuff():
while 1 == 1:
#do other stuff here
if stuff_needs_done == true:
stuffproc1= Process(target=ThreadStuff).start()
if more_stuff_needs_done == true:
stuffproc1= Process(target=OtherThreadStuff).start()
But what seems to happen is the whole thing starts a couple of times so the "Doing stuff" output comes up and a couple of the threads run.
I could put some .join()s in but there is no loop which should cause the print output to run again which means there is nowhere for it to wait.
My hope is this is just a syntax thing but I'm stumped trying to find out why the whole script loops. I'd really appreciate any pointers in the right direction.
This is mentioned in the docs:
Safe importing of main module
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a
starting a new process).
For example, under Windows running the following module would fail with a RuntimeError:
from multiprocessing import Process
def foo():
print 'hello'
p = Process(target=foo)
p.start()
Instead one should protect the “entry point” of the program by using if __name__ == '__main__': as follows:
from multiprocessing import Process, freeze_support
def foo():
print 'hello'
if __name__ == '__main__':
freeze_support()
p = Process(target=foo)
p.start()
This allows the newly spawned Python interpreter to safely import the module and then run the module’s foo() function.