I have been trying pool.map() from multiprocessing with Python 3, and no matter how much I simplify my function, it hangs: the code appears to keep running but gives no results.
I am using Windows. Here is my code:
import multiprocessing as mp
def f(x):
    return x + 1
if __name__ == '__main__':
    with mp.Pool() as pool:
        print(pool.map(f, range(10)))
Can anyone tell me how I can solve this problem?
Thanks!
Related
I encountered a problem while writing Python code that uses the multiprocessing map function. A minimal example that reproduces the problem is:
import multiprocessing as mp
if __name__ == '__main__':
    def f(x):
        return x*x
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
When I run this piece of code, I get the error message
AttributeError: Can't get attribute 'f' on <module '__mp_main__' from 'main.py'>
However, if I move the function f outside the if __name__ == '__main__': block, i.e.
import multiprocessing as mp
def f(x):
    return x*x
if __name__ == '__main__':
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
This time it works. I am wondering what the difference between the two versions is and why I get an error in the first one. Thanks in advance.
Depending on your operating system and Python version, sub-processes will either be forked or spawned. Windows always spawns, and macOS has spawned by default since Python 3.8, whereas Linux has traditionally forked.
On macOS and Linux you can enforce forking (Windows does not support it), but you need to fully understand the implications of doing so.
For this specific question, a workaround could be implemented thus:
import multiprocessing as mp
from multiprocessing import set_start_method
if __name__ == '__main__':
    def f(x):
        return x*x
    set_start_method('fork')
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
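If you would rather not change the start method globally, a context object lets you choose one for a single pool. The following is only a minimal sketch: run_with is a name made up for illustration, and f is kept at module level so that it can also be found under 'spawn', the only method Windows offers.

```python
import multiprocessing as mp


def f(x):
    # Defined at module level so that spawned workers can import it.
    return x * x


def run_with(method, values):
    """Map f over values using a pool created with the given start method.

    run_with is a hypothetical helper, not part of multiprocessing.
    """
    ctx = mp.get_context(method)  # affects only pools created from ctx
    with ctx.Pool(2) as pool:
        return pool.map(f, values)


if __name__ == '__main__':
    # Show what this platform uses by default and what it supports.
    print('default start method:', mp.get_start_method())
    print('available start methods:', mp.get_all_start_methods())
    print(run_with('spawn', [1, 2, 3]))
```

Passing 'fork' instead of 'spawn' should behave the same on Linux and macOS, but it raises a ValueError on Windows, where that method does not exist.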
This will vary between operating systems, but the basic reason is that this line of code
if __name__ == '__main__':
tells the Python interpreter to run the code in this block only in the main process, when the file is executed as a script. The block is skipped when the file is imported as a module, and also in any spawned sub-process, because a spawned sub-process re-imports the file under the name __mp_main__ rather than __main__. So when you do this
import multiprocessing as mp
if __name__ == '__main__':
    def f(x):
        return x*x
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
any sub-processes created for the pool will not have the definition of the function f, so p.map cannot look it up in them.
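You can see the effect of the guard without starting any processes at all. The sketch below roughly imitates what a spawned child does, on the assumption that it re-runs the main file under the module name __mp_main__ (the name that appears in the AttributeError above). SCRIPT, names_seen_by, top_level and guarded are all names invented for this illustration.

```python
import os
import runpy
import tempfile

# Hypothetical script: one function at module level, one inside the guard.
SCRIPT = '''
def top_level(x):
    return x * x

if __name__ == '__main__':
    def guarded(x):
        return x * x
'''


def names_seen_by(run_name):
    """Run SCRIPT under the given module name and return the names it defines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'main.py')
        with open(path, 'w') as fh:
            fh.write(SCRIPT)
        namespace = runpy.run_path(path, run_name=run_name)
    # Drop dunder entries such as __name__ and __file__.
    return {name for name in namespace if not name.startswith('__')}


as_script = names_seen_by('__main__')    # what the parent process sees
as_child = names_seen_by('__mp_main__')  # what a spawned child would see

print('run as script:', sorted(as_script))
print('run as child :', sorted(as_child))
```

The guarded function only exists in the first namespace, which is why a worker that needs it fails to find it.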
I am trying to use multiprocessing to scrape web pages, but when I launch my program, nothing happens. So I tried this (very) simple example, and still nothing happens, so the problem seems to be with multiprocessing:
from multiprocessing import Pool
def f(x):
    return x*x
if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
Any ideas?
I am on a MacBook Pro with JupyterLab 2.2.6.
Thx
I'm playing with Python multiprocessing, but it won't work on my system.
I ran the example code I found on the multiprocessing documentation page, but it just hangs and the CPU usage is 0%. What do I do to make it work? Thanks a lot!
https://docs.python.org/2/library/multiprocessing.html
from multiprocessing import Pool
def f(x):
    return x*x
if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))
Update: I just tried running the same code from the command line and got the following error message.
[Image: error message]
I am trying to use multiprocessing on a different problem but I can't get it to work. To make sure I'm using the Pool class correctly, I made the following simpler problem but even that won't work. What am I doing wrong here?
from multiprocessing import Pool
def square(x):
    sq = x**2
    return sq
def main():
    x1 = [1,2,3,4]
    pool = Pool()
    result = pool.map( square, x1 )
    print(result)
if __name__ == '__main__':
    main()
The computer just seems to run forever and I need to close and restart the IPython shell before I can do anything.
I figured out what was wrong. I had named the script "multiprocessing.py", which is also the name of the module being imported. As a result, the script tried to import itself instead of the actual module.
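A quick way to check for this kind of name clash is to look at where the imported module was actually loaded from. This is just a sketch: module_origin and shadowed_by_local_file are helper names made up here, and the check assumes the clashing script sits directly in the directory you pass in.

```python
import multiprocessing
import os


def module_origin(module):
    """Return the absolute path of the file a module was loaded from."""
    return os.path.abspath(module.__file__)


def shadowed_by_local_file(module, directory):
    """True if the module was loaded from a file directly inside `directory`.

    The real multiprocessing package lives in its own sub-directory of the
    standard library, so a match with your working directory is suspicious.
    """
    return os.path.dirname(module_origin(module)) == os.path.abspath(directory)


print('multiprocessing loaded from:', module_origin(multiprocessing))
print('shadowed by a local file:',
      shadowed_by_local_file(multiprocessing, os.getcwd()))
```

If the printed path points at your own script, rename the script (and delete any leftover multiprocessing.pyc or __pycache__ entry next to it).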
When I tried to run this code:
import multiprocessing
def worker():
    """worker function"""
    print 'Worker'
    return
if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
The output is blank: the script simply executes without printing "Worker". How can I print the required output with multiprocessing?
What actually happens when using multiprocessing?
What is the maximum number of cores we can use for multiprocessing?
I've tried your code on Windows 7, Cygwin, and Ubuntu. For me, all the processes finish before the loop comes to an end, so all the prints show up, but calling join() guarantees that the parent waits until every process has finished.
import multiprocessing
def worker():
    """worker function"""
    # Parenthesized form works on both Python 2 and Python 3.
    print('Worker')
    return
if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
    # Wait for every child process to finish before the script exits.
    for p in jobs:
        p.join()
As far as how multiprocessing works in the backend, I'm going to let someone more experienced than myself answer that one :) I'll probably just make a fool of myself.
I get "Worker" printed 5 times on my side. Are you on Python 3? If so, you must use print("Worker"). From my experiments, I think multithreading doesn't necessarily mean using multiple cores; it just runs the different threads alternately to give the appearance of parallelism. Multiprocessing, by contrast, starts separate processes, which the operating system can schedule on different cores. Try reading the multiprocessing library documentation for more info.
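On the question about cores: multiprocessing.cpu_count() reports how many CPUs the operating system sees, and you can start more processes than that, since the OS simply schedules them in turn. The sketch below (Python 3) also shows that each worker really is a separate process with its own PID. collect_worker_pids is a helper name made up for this example.

```python
import multiprocessing as mp
import os


def worker(queue):
    """Report this worker's process ID back to the parent."""
    queue.put(os.getpid())


def collect_worker_pids(count):
    """Start `count` processes and return the PID each one reported.

    collect_worker_pids is a hypothetical helper for illustration only.
    """
    queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(queue,)) for _ in range(count)]
    for p in procs:
        p.start()
    # Read the results before joining so no child blocks on a full pipe.
    pids = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return pids


if __name__ == '__main__':
    print('CPUs reported by the OS:', mp.cpu_count())
    print('parent PID:', os.getpid())
    print('worker PIDs:', collect_worker_pids(3))
```

The worker PIDs should all differ from the parent's, which is the main difference from threads, where everything runs inside one process.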