Why does my Python script continue while if __name__ == '__main__' runs?

Below is a simplified version of a problem I'm facing. When I run the code below, why does the script run the code underneath the if __name__ == '__main__' section while the function called inside the if statement is still running? I thought the p_1.join() call should block the script from continuing until the separate process has finished. In the output below I'm expecting the word "Finished" to be printed only once everything else has concluded, but instead it is printed second and then last.
In the past I have used ProcessPoolExecutor for similar problems, but in this project I need to start each process individually so that I can assign a separate independent function to each process.
import time
from multiprocessing import Process, Queue

def a(x, q):
    time.sleep(3)
    q.put(x*x)

q = Queue()

def main():
    print("Main Function Starts")
    p_1 = Process(target=a, args=(5, q))
    p_1.start()
    p_1.join()
    b = q.get()
    print(b)
    print("Main Function Ends")

if __name__ == '__main__':
    main()

print("Finished")
Output:
Main Function Starts
Finished
25
Main Function Ends
Finished

You were supposed to put that code in the if __name__ == '__main__' guard. Preventing this kind of thing is the whole point of if __name__ == '__main__'.
You're on Windows. When you start p_1, multiprocessing launches a separate Python process, and one of the first things that process does is import your file as a module. When it does that, the module's __name__ isn't '__main__', so anything inside the if __name__ == '__main__' guard doesn't run, but print("Finished") is outside the guard.
Your program isn't somehow continuing past main() while main() is still running. The worker process is performing the unwanted print.
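A minimal sketch of the fix (the same script, with the queue created in the parent and the final print moved inside the guard):

import time
from multiprocessing import Process, Queue

def a(x, q):
    time.sleep(3)
    q.put(x * x)

def main():
    print("Main Function Starts")
    q = Queue()                       # created only in the parent
    p_1 = Process(target=a, args=(5, q))
    p_1.start()
    p_1.join()                        # blocks until the worker finishes
    print(q.get())
    print("Main Function Ends")

if __name__ == '__main__':
    main()
    print("Finished")                 # guarded: never re-run by the child's import

With this layout, the child's import of the file only defines a() and main(); nothing outside the guard has side effects.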

How do you run your script? When I ran your script on the command line, 'Finished' was printed once, as below.
$ python test.py
Main Function Starts
25
Main Function Ends
Finished
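That difference comes from the start method: on Linux the default is fork, so the child inherits the parent's memory and never re-imports the file, whereas Windows uses spawn. A sketch that reproduces the Windows behaviour on Linux by forcing spawn (set_start_method has existed since Python 3.4):

import time
from multiprocessing import Process, Queue, set_start_method

def a(x, q):
    time.sleep(3)
    q.put(x * x)

def main():
    q = Queue()
    p_1 = Process(target=a, args=(5, q))
    p_1.start()
    p_1.join()
    print(q.get())

if __name__ == '__main__':
    set_start_method('spawn')   # emulate Windows-style process creation
    main()

print("Finished")               # under spawn this also prints in the child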

Related

Kill a python process when busy in a loop

I'm running a script on my Raspberry Pi. Sometimes the program freezes, so I have to close the terminal and re-run the .py.
So I wanted to "multiprocess" this program. I made two functions: the first one does the work, and the second one's job is to check the time and kill the first function's process when a condition is true.
Here is what I tried:
def AntiFreeze():
    print("AntiFreeze partito\n")
    global stop
    global endtime
    global freq
    proc_SPN = multiprocessing.Process(target=SPN(), args=())
    proc_SPN.start()
    time.sleep(2)
    proc_SPN.terminate()
    proc_SPN.join()

if __name__ == '__main__':
    proc_AF = multiprocessing.Process(target=AntiFreeze(), args=())
    proc_AF.start()
The main function starts the AntiFreeze function in a process, and that one creates another process to run the function that does the job I want.
THE PROBLEM (I think):
The function "SPN()" (that is the one that does the job) is busy in a very long while loop that calls function in another .py file.
So when I use proc_SPN.terminate() or proc_SPN.kill() nothing happens... why?
There is another way to force a process to kill? maybe I've to do two different programs?
Thanks in advance for help
You are calling your function at process creation, so most likely the process is never correctly spawned. Your code should be changed to:
def AntiFreeze():
    print("AntiFreeze partito\n")
    global stop
    global endtime
    global freq
    proc_SPN = multiprocessing.Process(target=SPN, args=())
    proc_SPN.start()
    time.sleep(2)
    proc_SPN.terminate()
    proc_SPN.join()

if __name__ == '__main__':
    proc_AF = multiprocessing.Process(target=AntiFreeze, args=())
    proc_AF.start()
Furthermore, you shouldn't use globals (unless strictly necessary). You could pass the needed arguments to the AntiFreeze function instead.
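For example (a sketch with made-up values for stop, endtime and freq, and a placeholder SPN standing in for the real job):

import multiprocessing
import time

def SPN():
    while True:              # placeholder for the real long-running work
        time.sleep(0.5)

def AntiFreeze(stop, endtime, freq):
    print("AntiFreeze partito\n")
    proc_SPN = multiprocessing.Process(target=SPN, args=())
    proc_SPN.start()
    time.sleep(2)
    proc_SPN.terminate()     # works once SPN actually runs in its own process
    proc_SPN.join()

if __name__ == '__main__':
    # hypothetical values, just to show arguments replacing the globals
    proc_AF = multiprocessing.Process(target=AntiFreeze, args=(False, 60, 1))
    proc_AF.start()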

Python Multiprocessing Causing Entire Script to Loop

It seems multiprocessing swaps between threads faster, so I started working on swapping over, but I'm getting some unexpected results: it causes my entire script to loop several times, where a thread didn't before.
Snippet example:
from threading import Thread

stuff_needs_done = True
more_stuff_needs_done = True
print "Doing stuff"

def ThreadStuff():
    while 1 == 1:
        pass  # do stuff here

def OtherThreadStuff():
    while 1 == 1:
        pass  # do other stuff here

if stuff_needs_done == True:
    Thread(target=ThreadStuff).start()

if more_stuff_needs_done == True:
    Thread(target=OtherThreadStuff).start()
This works as I'd expect. The threads start and run until stopped. But when running a lot of these the overhead is higher (so I'm told) so I tried swapping to multiprocessing.
Snippet example:
stuff_needs_done = true
more_stuff_needs_done = true
print "Doing stuff"
def ThreadStuff():
while 1 == 1:
#do stuff here
def OtherThreadStuff():
while 1 == 1:
#do other stuff here
if stuff_needs_done == true:
stuffproc1= Process(target=ThreadStuff).start()
if more_stuff_needs_done == true:
stuffproc1= Process(target=OtherThreadStuff).start()
But what seems to happen is that the whole thing starts a couple of times, so the "Doing stuff" output comes up again and a couple of the threads run.
I could put some .join()s in, but there is no loop that should cause the print output to run again, so there is nothing for them to wait on.
My hope is this is just a syntax thing, but I'm stumped trying to find out why the whole script loops. I'd really appreciate any pointers in the right direction.
This is mentioned in the docs:
Safe importing of main module
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
For example, under Windows running the following module would fail with a RuntimeError:
from multiprocessing import Process

def foo():
    print 'hello'

p = Process(target=foo)
p.start()
Instead one should protect the “entry point” of the program by using if __name__ == '__main__': as follows:
from multiprocessing import Process, freeze_support

def foo():
    print 'hello'

if __name__ == '__main__':
    freeze_support()
    p = Process(target=foo)
    p.start()
This allows the newly spawned Python interpreter to safely import the module and then run the module’s foo() function.
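Applied to the snippet above, a guarded version might look like this (a sketch; the print() syntax works in both Python 2 and 3):

from multiprocessing import Process

stuff_needs_done = True
more_stuff_needs_done = True

def ThreadStuff():
    while True:
        pass  # do stuff here

def OtherThreadStuff():
    while True:
        pass  # do other stuff here

if __name__ == '__main__':
    print("Doing stuff")  # now printed once, only in the parent
    if stuff_needs_done:
        Process(target=ThreadStuff).start()
    if more_stuff_needs_done:
        Process(target=OtherThreadStuff).start()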

Python variables not defined after if __name__ == '__main__'

I'm trying to divvy up the task of looking up historical stock price data for a list of symbols by using Pool from the multiprocessing library.
This works great until I try to use the data I get back. I have my hist_price function defined, and it outputs a list of dicts, pcl. I can print(pcl) and it has been flawless, but if I try to print(pcl) after the if __name__ == '__main__': block, it blows up saying pcl is undefined. I've tried declaring global pcl in a couple of places but it doesn't make a difference.
from multiprocessing import Pool

syms = ['List', 'of', 'symbols']

def hist_price(sym):
    #... lots of code looking up data, calculations, building dicts...
    stlh = {"Sym": sym, "10D Max": pcmax, "10D Min": pcmin}  # simplified
    return stlh

#global pcl

if __name__ == '__main__':
    pool = Pool(4)
    #global pcl
    pcl = pool.map(hist_price, syms)
    print(pcl)  # this works
    pool.close()
    pool.join()

print(pcl)  # says pcl is undefined

#...rest of my code, dependent on pcl...
I've also tried removing the if __name__ == '__main__': block, but it gives me a RuntimeError telling me specifically to put it back. Is there some other way to make variables available outside of the if block?
I think there are two parts to your issue. The first is "what's wrong with pcl in the current code?", and the second is "why do I need the if __name__ == "__main__" guard block at all?".
Let's address them in order. The problem with the pcl variable is that it is only defined in the if block, so if the module gets loaded without being run as a script (which is what sets __name__ == "__main__"), it will not be defined when the later code runs.
To fix this, you can change how your code is structured. The simplest fix would be to guard the other bits of the code that use pcl within an if __name__ == "__main__" block too (e.g. indent them all under the current block, perhaps). An alternative fix would be to put the code that uses pcl into functions (which can be declared outside the guard block), then call the functions from within an if __name__ == "__main__" block. That would look something like this:
def do_stuff_with_pcl(pcl):
    print(pcl)

if __name__ == "__main__":
    # multiprocessing code, etc
    pcl = ...
    do_stuff_with_pcl(pcl)
As for why the issue came up in the first place, the ultimate cause is using the multiprocessing module on Windows. You can read about the issue in the documentation.
When multiprocessing creates a new process for its Pool, it needs to initialize that process with a copy of the current module's state. Because Windows doesn't have fork (which copies the parent process's memory into a child process automatically), Python needs to set everything up from scratch. In each child process, it loads the module from its file, and if the module's top-level code tried to create a new Pool, you'd have a recursive situation where each of the child processes would start spawning a whole new set of child processes of its own.
The multiprocessing code has some guards against that, I think (so you won't fork bomb yourself out of simple carelessness), but you still need to do some of the work yourself too, by using if __name__ == "__main__" to guard any code that shouldn't be run in the child processes.
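To watch the re-import happen, here is a small sketch (assuming a spawn-based start method, as on Windows): the module-level print runs once in the parent and once in every worker, while the guarded block runs only in the parent.

import os
from multiprocessing import Pool

print("module imported in pid", os.getpid())  # runs in the parent AND each worker

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))    # runs only in the parent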

multi processing in python getting stuck in a while True loop

For some reason the code does not get to the main function. I am using Cloud9 to run the code, so that might be the issue.
from multiprocessing import Process, Value
import time

def main():
    print "main function"

def market_price_thread():
    while True:
        market_price()
        time.sleep(5)

def market_price():
    #do something
    print "end"

def start_threads():
    thread = Process(target=market_price_thread())
    thread.start()
    time.sleep(5)

if __name__ == '__main__':
    start_threads()
    main() #does not seem to get to this
You've asked Python to call market_price_thread:
thread = Process(target=market_price_thread())
and then use whatever it returns, as the target value. So, before calling Process, we'll have to wait for market_price_thread to return. What value does it return, and when?
(Compare with Process(target=market_price_thread), which does not call market_price_thread yet, but rather, passes the function to Process so that Process can call it.)
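A sketch of the corrected start_threads from the script above, passing the function object so Process can call it in the child:

def start_threads():
    # no parentheses after market_price_thread: pass the function, don't call it
    thread = Process(target=market_price_thread)
    thread.start()
    time.sleep(5)

With that one change, start_threads returns after starting the worker, and main() is reached.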

Windows multiprocessing

As I have discovered, Windows is a bit of a pig when it comes to multiprocessing, and I have a question about it.
The pydoc states you should protect the entry point of a Windows application when using multiprocessing.
Does this mean only the code which creates the new process?
For example
Script 1
import multiprocessing

def somemethod():
    while True:
        print 'do stuff'

# this will need protecting
p = multiprocessing.Process(target=somemethod).start()

# this won't
if __name__ == '__main__':
    p = multiprocessing.Process(target=somemethod).start()
In this script you need to wrap the first Process call in if __name__ == '__main__', because that line spawns the new process.
But what about the following?
Script 2
file1.py
import file2

if __name__ == '__main__':
    p = Aclass().start()
file2.py
import multiprocessing

ITEM = 0

def method1():
    print 'method1'

method1()

class Aclass(multiprocessing.Process):
    def __init__(self):
        print 'Aclass'
        super(Aclass, self).__init__()

    def run(self):
        print 'stuff'
What would need to be protected in this instance?
What would happen if there were an if __name__ == '__main__' in file2? Would the code inside it get executed if a process was being created?
NOTE: I know the code will not compile. It's just an example.
The pydoc states you should protect the entry point of a Windows application when using multiprocessing.
My interpretation differs: the documentation states
the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
So importing your module (import mymodule) should not create new processes. That is, you can avoid starting processes by protecting your process-creating code with an
if __name__ == '__main__':
    ...
because the code in the ... will only run when your program is run as main program, that is, when you do
python mymodule.py
or when you run it as an executable, but not when you import the file.
So, to answer your question about file2: no, you do not need protection, because no process is started when import file2 runs.
Also, if you put an if __name__ == '__main__' in file2.py, the code under it would not run, because file2 is imported, not executed as the main program.
edit: here is an example of what can happen when you do not protect your process-creating code: it might just loop and create a ton of processes.
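A minimal sketch of that unprotected pattern: under spawn, each child re-imports the module and reaches the Process line again; recent Python versions detect this and raise a RuntimeError rather than recursing indefinitely.

from multiprocessing import Process

def work():
    print('working')

# Unprotected: a spawned child re-imports this module and hits this line again.
p = Process(target=work)
p.start()
p.join()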
