I have a function returned by theano.function(), and I want to use it with multiprocessing for a speedup. The following simplified demo script shows where I run into the problem:
import numpy as np
from multiprocessing import Pool
from functools import partial
import theano
from theano import tensor
def get_theano_func():
    x = tensor.dscalar()
    y = x + 0.1
    f = theano.function([x], [y])
    return f

def func1(func, x):
    return func(x)

def MPjob(xlist):
    f = get_theano_func()
    fp = partial(func1, func=f)
    pool = Pool(processes=5)
    Results = pool.imap(fp, xlist)
    Y = []
    for y in Results:
        Y.append(y[0])
    pool.close()
    return Y

if __name__ == '__main__':
    xlist = np.arange(0, 5, 1)
    Y = MPjob(xlist)
    print(Y)
In the code above, the theano function 'f' is passed to func1() as an input argument. If MPjob() ran correctly, it would return [0.1, 1.1, 2.1, 3.1, 4.1]. Instead, the exception "TypeError: func1() got multiple values for argument 'func'" is raised.
The full traceback is as follows:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Python35\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
TypeError: func1() got multiple values for argument 'func'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "F:/DaweiLeng/Code/workspace/Python/General/theano_multiprocess_debug.py", line 36, in <module>
    Y = MPjob(xlist)
  File "F:/DaweiLeng/Code/workspace/Python/General/theano_multiprocess_debug.py", line 29, in MPjob
    for y in Results:
  File "C:\Python35\lib\multiprocessing\pool.py", line 695, in next
    raise value
TypeError: func1() got multiple values for argument 'func'
Anyone got a hint?
It turns out to be related to the partial() function. The full explanation is here: https://github.com/Theano/Theano/issues/4720#issuecomment-232029702
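In short: partial(func1, func=f) binds the parameter func by keyword, but pool.imap then passes each item of xlist as the first positional argument, which is also func, hence the "multiple values" error. A minimal sketch of the fix, using a plain Python function (add_point_one, a made-up stand-in for the compiled theano function, so this runs without theano): bind the function positionally instead.

```python
from functools import partial

def func1(func, x):
    return func(x)

def add_point_one(x):
    # hypothetical stand-in for the compiled theano function
    return x + 0.1

# partial(func1, func=add_point_one) fails under imap: fp(3) becomes
# func1(3, func=add_point_one), so 'func' gets two values.
fp = partial(func1, add_point_one)  # bind the first argument positionally
print(fp(3))  # -> 3.1
```

Equivalently, defining func1 with the order func1(x, func) would let the original keyword binding work.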
I am stuck on this problem. I tried to run it in VS Code and in the HackerRank IDE, and both give errors, even though the solutions I found on the web are the same as mine.
import math
import os
import random
import re
import sys

#
# Complete the 'plusMinus' function below.
#
# The function accepts INTEGER_ARRAY arr as parameter.
#

def plusMinus(arr):
    # Write your code here
    neg,pos,zero=0
    for i in range(0,len(arr)):
        if(arr[i]<0):
            neg+=0
        elif(arr[i]>0):
            pos+=0
        else:
            zero+=0
    print(pos/len(arr))
    print(neg/len(arr))
    print(zero/len(arr))
    return 0

if __name__ == '__main__':
    n = int(input().strip())
    arr = list(map(int, input().rstrip().split()))
    plusMinus(arr)
Traceback (most recent call last):
  File "/tmp/submission/20211128/06/29/hackerrank-a7793862d075fcff390bb368bc113c47/code/Solution.py", line 35, in <module>
    plusMinus(arr)
  File "/tmp/submission/20211128/06/29/hackerrank-a7793862d075fcff390bb368bc113c47/code/Solution.py", line 17, in plusMinus
    neg,pos,zero=0
TypeError: cannot unpack non-iterable int object
Reading the traceback reveals the cause of the error you're getting:
Traceback (most recent call last):
  File "/tmp/submission/20211128/06/29/hackerrank-a7793862d075fcff390bb368bc113c47/code/Solution.py", line 35, in <module>
    plusMinus(arr)
  File "/tmp/submission/20211128/06/29/hackerrank-a7793862d075fcff390bb368bc113c47/code/Solution.py", line 17, in plusMinus
    neg,pos,zero=0
TypeError: cannot unpack non-iterable int object
The correct syntax would be either
# map the elements of the iterable on the right-hand side to the
# declared variable names
neg, pos, zero = 0, 0, 0
or
# assign the same value to all declared variables
neg = pos = zero = 0
As written, it tries to unpack the integer 0 into the three names neg, pos, zero. Since 0 is not an iterable like a tuple (as, for example, 0, 0, 0 is), it cannot be unpacked into multiple values, so Python raises a TypeError.
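For completeness, a corrected sketch of the function: besides the unpacking fix, note the original increments with += 0, which never counts anything; += 1 is presumably intended. This version returns the three ratios instead of printing them so it is easier to check:

```python
def plus_minus(arr):
    # Corrected sketch: tuple-unpack three zeros, then increment by 1
    # (the original used `+= 0`, which leaves every counter at zero).
    neg, pos, zero = 0, 0, 0
    for v in arr:
        if v < 0:
            neg += 1
        elif v > 0:
            pos += 1
        else:
            zero += 1
    n = len(arr)
    return pos / n, neg / n, zero / n

print(plus_minus([-4, 3, -9, 0, 4, 1]))
```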
from sympy import symbols, lambdify

x = symbols('x')
ch = 'exp(cos(cos(exp((sin(-0.06792841536110628))**(-6.045461643745118)))))'
f = lambdify(x, ch, "numpy")
print(float(f(2)))
It does not work: the program runs forever and never finishes (no error is raised).
My goal is to avoid this kind of case (among others) with a try/except, but I can't, since no error is raised.
Why is no error raised?
How can I avoid these cases?
Thanks for your help!
In general, I'm not sure you can. SymPy or NumPy will keep trying to compute the number until precision is exhausted. But you can create a function that will raise an error if numbers are out of the bounds of interest:
>>> from sympy import cos as _cos, sin, I, exp
>>> def cos(x):
...     if abs(x) > 10**20: raise ValueError
...     return _cos(x)
>>> exp(cos(cos(exp(5*(1+I)))))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 2, in cos
ValueError
>>> f = lambda x: exp(cos(cos(exp(x))))
>>> f(sin(-0.06792841536110628)**-6.045461643745118)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in <lambda>
  File "<string>", line 2, in cos
ValueError
But you have to think carefully about when you want such an error to be raised. For example, SymPy has no trouble computing f(100) or f(100*I) if the non-error-catching cos is used. So think about when you actually want the error to be raised.
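If the aim is just to bound wall-clock time rather than the size of intermediate values, another sketch (Unix-only; with_timeout is a made-up helper built on the standard signal module, not part of SymPy) interrupts the evaluation after a deadline:

```python
import signal
import time

# Unix-only sketch: SIGALRM fires after `seconds` and the handler raises,
# aborting whatever computation is in flight.
def with_timeout(fn, args=(), seconds=5):
    def handler(signum, frame):
        raise TimeoutError("evaluation took too long")
    old = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)
    try:
        return fn(*args)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)

print(with_timeout(lambda: 1 + 1))  # fast calls return normally
```

Wrapping the lambdified f in with_timeout would then turn a hang into a catchable TimeoutError.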
lambdify is a lexical translator: it converts a SymPy expression into a Python/NumPy function.
Make a string with a symbol:
In [27]: ch = 'exp(cos(cos(exp((sin(x))**(-6.045461643745118)))))'
sympify(ch) has no problem, because it doesn't need to do any numeric calculation. So lambdify also works:
In [28]: f=lambdify(x,ch)
In [29]: f?
Signature: f(x)
Docstring:
Created with lambdify. Signature:
func(x)
Expression:
exp(cos(cos(exp((sin(x))**(-6.045461643745118)))))
Source code:
def _lambdifygenerated(x):
    return (exp(cos(cos(exp(sin(x)**(-6.045461643745118))))))
The equivalent mpmath:
def _lambdifygenerated(x):
    return (exp(cos(cos(exp(sin(x)**(mpf((1, 54452677612106279, -53, 56))))))))
And a working numeric evaluation:
In [33]: f(0j)
Out[33]: mpc(real='nan', imag='0.0')
I am getting an overflow error
import numpy as np
pi = np.pi
from scipy.integrate import quad
from math import exp

hbar = 1.055e-34
boltz = 1.381e-23
c = 2.998e8

def z(x):
    return (x**3)/(exp(x)-1)

B = quad(z, 0, np.inf)
A = ((boltz**4)*B)/(4*(pi**2)*(c**2)*(hbar**3))
print(A)
It gives me an overflow error on line 11, i.e. return (x**3)/(exp(x)-1).
You're exceeding the floating-point range and Python is freaking out.
>>> def z(x):
...     return (x**3)/(exp(x)-1)
...
>>> z(709)
4.336616682334302e-300
>>> z(710)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in z
OverflowError: math range error
Just integrate up to ~700 and you'll be fine.
You can use np.exp instead of math.exp: for large arguments it raises a RuntimeWarning and returns np.inf (so 1/np.inf = 0) instead of raising an OverflowError.
def z(x):
    return (x**3)/(np.exp(x)-1)  # replace math.exp with np.exp

B, err = quad(z, 0, np.inf)  # also unpack the integration error, or use B = quad(...)[0]
A = ((boltz**4)*B)/(4*(pi**2)*(c**2)*(hbar**3))
print(A)
# 5.668949306250541e-08
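As a sanity check on B itself: this is a standard integral with the closed form ∫0→∞ x³/(eˣ − 1) dx = π⁴/15 ≈ 6.4939 (which is what makes A come out as the Stefan-Boltzmann constant), so the numeric result can be verified directly. A sketch assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.integrate import quad

def z(x):
    # np.exp overflows to inf for large x, so the integrand goes to 0
    # instead of raising OverflowError
    return x**3 / (np.exp(x) - 1)

with np.errstate(over='ignore'):  # silence the overflow warning
    B, err = quad(z, 0, np.inf)

print(B, np.pi**4 / 15)  # the two values agree closely
```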
I am new to multiprocessing and I need your help.
I have four variables, each of which can take up to 4 values (integers or floats), and I store them in a list called par = [A, B, C, D] (see below).
I create a list of all possible combinations with par = itertools.product(*par).
Then I call a function func1 that takes these arguments (plus some more) and calculates stuff. With the results of func1, I call another function that calculates more stuff and then writes to a file.
I want to run the whole thing in parallel with multiprocessing.Pool, so I thought to embed func1 and func2 in another function, func_run, and map it over the list par created above.
To summarize, my code looks like:
# values that I will use for func1
r = np.logspace(np.log10(5), np.log10(300), 300)
T = 200*r

# Parameters for the sim
A = [0.1, 0.05, 0.001, 0.005]
B = [0.005, 0.025, 0.05, 0.1]
C = [20, 60, 100, 200]
D = [10, 20, 40, 80]

# Store them in a list
par = [A, B, C, D]

# Create a list with all combinations
par = list(itertools.product(*par))

def func_run(param):
    for i in range(len(param)):
        # Call func1
        values = func1(param[i][0], param[i][1], param[i][2], param[i][3], r, T)
        x = values[0]
        y = values[1]
        # and so on
        # Call func2
        results = func2(x, y, ...)
        z = results[0]
        w = results[1]
        # and so on
        data_dict = {'result 1': [param[i][0]], 'result 2': [param[i][1]]}
        df = pd.DataFrame(data=data_dict)
        with open(filename, 'a') as f:
            df.to_csv(f, header=False)
    return
Then I call func_run with multiprocessing:
from multiprocessing import Pool
pool = Pool(processes=4)
results = pool.map(func_run, par)
As a result, I get a TypeError with this traceback:
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/user/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/anaconda3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<ipython-input-14-5ce94acfd95e>", line 5, in run
    values = calc_val(param[i][0],param[i][1],param[i][2], param[i][3], r, T)
TypeError: 'float' object is not subscriptable
"""
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-15-f45146f68f66> in <module>()
1 pool = Pool(processes=4)
----> 2 test = pool.map(run,par)
~/anaconda3/lib/python3.6/multiprocessing/pool.py in map(self, func, iterable, chunksize)
264 in a list that is returned.
265 '''
--> 266 return self._map_async(func, iterable, mapstar, chunksize).get()
267
268 def starmap(self, func, iterable, chunksize=None):
~/anaconda3/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
642 return self._value
643 else:
--> 644 raise self._value
645
646 def _set(self, i, obj):
TypeError: 'float' object is not subscriptable
Unfortunately, it is impossible to include the whole functions here because they are hundreds of lines long, but I hope you get the idea even though you cannot really reproduce it yourselves.
Is it possible to run something like this with multiprocessing, or do I need a different approach?
It would be great if anyone could help me understand the error and make it run.
The result of
par = list(itertools.product(*par))
is a list of tuples of floats (and ints). Pool.map() takes an iterable as its 2nd argument and maps over its items, passing them one at a time to the given func. In other words, inside func_run(param), param is not the list of tuples but a single tuple of numbers, and so
param[i][0]
tries to access the 0th item of the i-th float object, which of course makes no sense, hence the exception. You should probably remove the for-loop in func_run():
def func_run(param):
    values = func1(param[0], param[1], param[2], param[3], r, T)
    ...
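A minimal runnable sketch of the corrected shape (func_run here is a hypothetical stand-in that just combines the four numbers, since the real func1/func2 are not shown):

```python
import itertools
from multiprocessing import Pool

# Stand-in for the real func1/func2 pipeline: each call receives
# ONE tuple of four parameters -- unpack it directly, no loop needed.
def func_run(param):
    a, b, c, d = param
    return a * 1000 + b * 100 + c * 10 + d

if __name__ == '__main__':
    par = list(itertools.product([0, 1], [2], [3], [4]))
    with Pool(processes=2) as pool:
        results = pool.map(func_run, par)
    print(results)  # one result per parameter combination
```

Pool.starmap(func1, par) is another option when the worker should receive the tuple elements as separate arguments.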
I have a simple multiprocessing example that I'm trying to create. The ordinary map() version works, but when I change to Pool.map, I get a strange error:
from multiprocessing import Pool
from functools import partial
x = [1,2,3]
y = 10
f = lambda x,y: x**2+y
# ordinary map works:
map(partial(f,y=y),x)
# [11, 14, 19]
# multiprocessing map does not
p = Pool(4)
p.map(partial(f, y=y), x)
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
Pickling error? What is this exactly?
The arguments to Pool.map must be picklable. Module-level functions defined with def are picklable, but partial(f, y=y) here wraps a lambda, which has no module-level name that pickle can look up, so it cannot be pickled.
There is a simple workaround:
def g(x, y=y):
    return f(x, y)

p.map(g, x)
Functions made with functools.partial used to be unpicklable.
However, with Python 2.7 or better, you can also define g (at the module level) using functools.partial:
import multiprocessing as mp
import functools

def f(x, y):
    return x**2 + y

x = [1,2,3]
y = 10
g = functools.partial(f, y=y)

if __name__ == '__main__':
    p = mp.Pool()
    print(p.map(g, x))
yields [11, 14, 19]. But note that to get this result, f had to be defined with def rather than lambda. I think this is because pickle relies on "fully qualified" name references to look up function objects.
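The pickling constraint is easy to verify directly with the pickle module (a sketch; the names f, g, h are illustrative):

```python
import functools
import pickle

def f(x, y):
    return x**2 + y

g = functools.partial(f, y=10)

# A partial over a module-level def round-trips through pickle fine...
g2 = pickle.loads(pickle.dumps(g))
print(g2(3))  # -> 19

# ...but a partial over a lambda does not: pickle stores functions by
# qualified name, and a lambda has no name it can look up.
h = functools.partial(lambda x, y: x**2 + y, y=10)
try:
    pickle.dumps(h)
except Exception as e:
    print(type(e).__name__)  # pickling the wrapped lambda fails
```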