In the following code, which is an example of my main code, I have tried to use pathos.multiprocessing to speed up the iterations of a loop. The output of each iteration, which is implemented with multiprocessing, is a 2-D array. I used pathos.multiprocessing instead of multiprocessing since I wanted to use it in my class method. I have used the apipe method of pathos.multiprocessing to collect the output in a list, but it returns an empty list. I have no idea why it fails.
import numpy as np
import random
import pathos.multiprocessing as mp

class Testsystematics(object):
    def __init__(self, x, y, NTH=None, THMIN=None, THMAX=None, NRESAMPLE=None):
        self.x = x
        self.y = y
        self.nbins = NTH
        self.bmin = THMIN
        self.bmax = THMAX
        self.nresample = NRESAMPLE
        self.bins = np.linspace(self.bmin, self.bmax, self.nbins+1, True).astype(np.float)
        self.sample = np.array([[random.choice(range(len(self.y))) for _ in xrange(len(self.y))] for i in range(self.nresample)])
        self.result_list = []

    def log_result(self, result):
        self.result_list.append(result)

    def bootstrapping(self, k):
        xi_p = np.zeros(self.nbins, float)
        xi_m = np.zeros(self.nbins, float)
        nind = np.zeros(self.nbins, float)
        for i in range(len(self.x)):
            for j in range(len(self.x)):
                if (i != j):
                    sep = np.sqrt(self.x[i]**2 + self.x[j]**2)
                    index = np.searchsorted(self.bins, sep, side='right') - 1
                    sind = np.sin(sep)
                    if ((sep < self.bins[-1]) and (sep >= self.bins[0])):
                        xi_p[index] += sind*(np.mean(self.y) - np.median(self.y))
                        xi_m[index] += sind*np.std(self.y)
                        nind[index] += 1.0
        for i in range(self.nbins):
            xi_p[i] = xi_p[i]/nind[i]
            xi_m[i] = xi_m[i]/nind[i]
        return np.vstack((xi_p, xi_m))

    def twopcf(self):
        if (self.sys_type == 1):
            pool = mp.ProcessingPool(16)
            for n in range(self.nresample):
                pool.apipe(self.bootstrapping, args=(n,), callback=self.log_result)

shape, scale = 0.5, 0.6
x = np.random.gamma(shape, scale, 10000)
mu1, sigma1 = 0, 0.5    # mean and standard deviation
mu2, sigma2 = 0.1, 0.7  # mean and standard deviation
y = np.random.normal(mu1, sigma1, 1000) + np.random.normal(mu2, sigma2, 1000)

sysTest = Testsystematics(x, y, NTH=10, THMIN=0, THMAX=5, NRESAMPLE=100)
Any suggestions?
I'm the pathos author. I tried your code; it runs, produces no error, but also produces no result in result_list. I believe that is because you are using apipe incorrectly. The correct use of apipe is as follows:
>>> import pathos
>>> def squared(x):
... return x**2
...
>>> pool = pathos.multiprocessing.ProcessingPool()
>>> res = pool.apipe(squared, 5)
>>> res.get()
25
self.bootstrapping takes self and k, so you have to provide a k in the pipe call when calling it as an instance method. There is no callback -- if you want a callback, you'd need to add one to your function.
Note that the return value is retrieved by (1) getting a return object, and (2) calling get on the return object.
From your use of apipe within a for loop, I'd suggest you use pool.amap (or pool.imap) instead -- then you can do the for loop in parallel.
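For instance, a minimal sketch of the amap route inside twopcf (my adaptation, not tested against your full class; amap returns an async result whose get() yields the outputs in order):

pool = mp.ProcessingPool(16)
res = pool.amap(self.bootstrapping, range(self.nresample))
self.result_list = res.get()  # one (2, nbins) array per resample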
I am trying to minimize the quadratic weighted kappa function using scipy's fmin_powell minimizer.
The two functions digitize_train and digitize_train2 give 100% EXACTLY the same results.
However, when I try to use these functions with scipy minimize, the second method fails.
I have been trying to debug the problem for hours; to my surprise, despite the two functions being exactly the same, the numpy digitize version fails under fmin_powell minimization.
How can I fix the error?
Question
How to use numpy.digitize in scipy fmin_powell?
SETUP
# imports
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.optimize import fmin_powell
from sklearn import metrics
# data
train_labels = [1,1,8,7,6,5,3,2,4,4]
train_preds = [0.1,1.2,8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]
guess_lst = (1.5,2.9,3.1,4.5,5.5,6.1,7.1)
# functions
# here I am trying the convert real numbers -inf to +inf to integers 1 to 8
def digitize_train(train_preds, guess_lst):
    (x1, x2, x3, x4, x5, x6, x7) = list(guess_lst)
    res = []
    for y in list(train_preds):
        if y < x1:
            res.append(1)
        elif y < x2:
            res.append(2)
        elif y < x3:
            res.append(3)
        elif y < x4:
            res.append(4)
        elif y < x5:
            res.append(5)
        elif y < x6:
            res.append(6)
        elif y < x7:
            res.append(7)
        else:
            res.append(8)
    return res

def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds, guess_lst) + 1
# compare two functions
df = pd.DataFrame({'train_labels': train_labels,
                   'train_preds': train_preds,
                   'method_1': digitize_train(train_preds, guess_lst),
                   'method_2': digitize_train2(train_preds, guess_lst)})
df
**Note: the two functions give exactly the same results.**
Method 1: without numpy digitize runs fine
# using fmin_powell for method 1
def get_offsets_minimizing_train_preds_kappa(guess_lst):
    res = digitize_train(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa, guess_lst, disp=True)
print(offsets)
Method 2: using numpy digitize fails
# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp=True)
print(offsets)
How can I use the numpy.digitize method here?
Update
As per suggestions, I tried pandas.cut, but it still gives an error:
ValueError: bins must increase monotonically.
# using fmin_powell for method 3
def get_offsets_minimizing_train_preds_kappa3(guess_lst):
    res = pd.cut(train_preds, bins=[-np.inf] + list(guess_lst) + [np.inf],
                 right=False)
    res = pd.Series(res).cat.codes + 1
    res = res.to_numpy()
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa3, guess_lst, disp=True)
print(offsets)
It seems that during the minimization process, the values in guess_lst are no longer monotonically increasing. One workaround is to pass sorted(guess_lst) to digitize, like:
def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds, sorted(guess_lst)) + 1
and you get
# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp=True)
print(offsets)
Optimization terminated successfully.
Current function value: -0.990792
Iterations: 2
Function evaluations: 400
[1.5 2.7015062 3.1 4.50379942 4.72643334 8.12463415
7.13652301]
A device function I have written always throws a nopython exception, and I do not understand why or where my error is.
Here is a small example that represents my problem.
I have the following device function that I call from a kernel:
@cuda.jit(device=True)
def sub_stuff(vec_a, vec_b):
    x0 = vec_a[0] - vec_b[0]
    x1 = vec_a[1] - vec_b[1]
    x2 = vec_a[2] - vec_b[2]
    return [x0, x1, x2]
The kernel that calls this function looks like this:
@cuda.jit
def kernel_via_polygon(vectors_a, vectors_b, result_array):
    pos = cuda.grid(1)
    if pos < vectors_a.size and pos < result_array.size:
        result_array[pos] = sub_stuff(vectors_a[pos], vectors_b[pos])
The three input arrays are the following:
vectors_a = np.arange(1, 10).reshape((3, 3))
vectors_b = np.arange(1, 10).reshape((3, 3))
result = np.zeros_like(vectors_a)
When I now call the function via trace_via_polygon(vectors_a, vectors_b, result), a nopython error is thrown. When the device function returns only an integer value, the error is prevented.
Can someone explain to me where my mistake is?
Edit: FYI, as answered by talonmies, list construction isn't supported in device code. An alternative that helped me is using tuples, which are supported.
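For reference, a minimal sketch of that tuple-based variant (my adaptation, not from the answer below; it assumes fixed-length 3-vectors and a 2-D result array):

@cuda.jit(device=True)
def sub_stuff(vec_a, vec_b):
    # a tuple, unlike a list, can be constructed in device code
    return (vec_a[0] - vec_b[0], vec_a[1] - vec_b[1], vec_a[2] - vec_b[2])

@cuda.jit
def kernel_via_polygon(vectors_a, vectors_b, result_array):
    pos = cuda.grid(1)
    if pos < vectors_a.shape[0] and pos < result_array.shape[0]:
        x0, x1, x2 = sub_stuff(vectors_a[pos], vectors_b[pos])
        result_array[pos, 0] = x0
        result_array[pos, 1] = x1
        result_array[pos, 2] = x2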
The source of your error is that the device function sub_stuff is attempting to create a list in GPU code, and that isn't supported.
About the best you can do would be something like this:
from numba import cuda
import numpy as np

@cuda.jit(device=True)
def sub_stuff(vec_a, vec_b, result):
    for i in range(vec_a.shape[0]):
        result[i] = vec_a[i] - vec_b[i]

@cuda.jit
def kernel_via_polygon(vectors_a, vectors_b, result_array):
    pos = cuda.grid(1)
    # guard on the number of rows, so out-of-range threads do nothing
    if pos < vectors_a.shape[0] and pos < result_array.shape[0]:
        sub_stuff(vectors_a[pos], vectors_b[pos], result_array[pos])

vectors_a = 100 + np.arange(1, 10).reshape((3, 3))
vectors_b = np.arange(1, 10).reshape((3, 3))
result = np.zeros_like(vectors_a)

kernel_via_polygon[1, 10](vectors_a, vectors_b, result)
print(result)
which uses a loop to iterate over the individual array slices and performs the subtraction element by element.
I want to parallelize a function using numba.vectorize, but my function doesn't take any input. Currently, I use a dummy array and dummy input for my function that is never used.
Is there a more elegant/fast way (possibly without using numba.vectorize)?
Code example (not my actual code, only a demonstration of how I discard the input):
import numpy as np
from numba import vectorize

@vectorize(["int32(int32)"], nopython=True)
def particle_path(discard_me):
    x = 0
    for _ in range(10):
        x += np.random.uniform(0, 1)
    return np.int32(x)

arr = particle_path(np.empty(1024, dtype=np.int32))
print(arr)
There doesn't seem to be any reason to use vectorize here; you can achieve the goal with jit, although you do have to write the loop over the array elements explicitly. The array must be instantiated outside the function. If your array will always be 1D, you can use:
import numpy as np
from numba import jit

@jit(nopython=True)
def particle_path(out):
    for i in range(len(out)):
        x = 0
        for _ in range(10):
            x += np.random.uniform(0, 1)
        out[i] = x

arr = np.empty(1024, dtype=np.int32)
particle_path(arr)
You can similarly deal with arrays of any dimension using the flat attribute (and make sure to use .size to get the total number of elements in the array):
@jit(nopython=True)
def particle_path(out):
    for i in range(out.size):
        x = 0
        for _ in range(10):
            x += np.random.uniform(0, 1)
        out.flat[i] = x

arr = np.empty(1024, dtype=np.int32)
particle_path(arr)
Finally, you can create the array inside your function if you need a new array each time you call it (use the versions above if you'll be calling the function repeatedly and want to overwrite the same array, saving the time to re-allocate it over and over again):
@jit(nopython=True)
def particle_path(num):
    out = np.empty(shape=num, dtype=np.int32)
    for i in range(num):
        x = 0
        for _ in range(10):
            x += np.random.uniform(0, 1)
        out[i] = x
    return out

arr = particle_path(1024)
I need some help because I have been trying for two days and I don't know how I can do this. I have a function compute_desc that takes multiple arguments (5 to be exact), and I would like to run it in parallel.
I have this for now:
def compute_desc(coord, radius, coords, feat, verbose):
    # Compute here my descriptors
    return my_desc  # numpy array (1x10 dimensions)

def main():
    points = np.random.rand(1000000, 4)
    coords = points[:, 0:3]
    feat = points[:, 3]
    all_features = np.empty((1000000, 10))
    all_features[:] = np.NAN
    scales = [0.5, 1, 2]
    for radius in scales:
        for index, coord in enumerate(coords):
            all_features[index, :] = compute_desc(coord,
                                                  radius,
                                                  coords,
                                                  feat,
                                                  False)
I would like to parallelize this. I saw several solutions with a Pool, but I don't understand how it works.
I tried with pool.map(), but I can only send one argument to the function.
Here is my solution (it doesn't work):
all_features = [pool.map(compute_desc, zip(point, repeat([radius,
                                                          coords,
                                                          feat,
                                                          False])))]
but I doubt it can work with a numpy array.
EDIT
This is my minimum code with a pool (it works now):
import numpy as np
from multiprocessing import Pool
from itertools import repeat

def compute_desc(coord, radius, coords, feat, verbose):
    # Compute here my descriptors
    my_desc = np.random.rand(1, 10)
    return my_desc

def compute_desc_pool(args):
    coord, radius, coords, feat, verbose = args
    return compute_desc(coord, radius, coords, feat, verbose)

def main():
    points = np.random.rand(1000000, 4)
    coords = points[:, 0:3]
    feat = points[:, 3]
    scales = [0.5, 1, 2]
    for radius in scales:
        with Pool() as pool:
            args = zip(points, repeat(radius),
                       repeat(coords),
                       repeat(feat),
                       repeat(False))
            feat_one_scale = pool.map(compute_desc_pool, args)
        feat_one_scale = np.array(feat_one_scale)
        if radius == scales[0]:
            all_features = feat_one_scale
        else:
            all_features = np.hstack([all_features, feat_one_scale])
    # Other stuff
The generic solution is to pass to Pool.map a sequence of tuples, each tuple holding one set of arguments for your worker function, and then to unpack the tuple in the worker function.
So, just change your function to accept only one argument, a tuple of your arguments, which you already prepared with zip and passed to Pool.map. Then simply unpack args to variables:
def compute_desc(args):
    coord, radius, coords, feat, verbose = args
    # Compute here my descriptors
Also, Pool.map should work with numpy types too, since after all, they are valid Python types.
Just be sure to properly zip 5 sequences, so your function receives a 5-tuple. You don't need to iterate over point in coords, zip will do that for you:
args = zip(coords, repeat(radius), repeat(coords), repeat(feat), repeat(False))
# args is a list of [(coords[0], radius, coords, feat, False), (coords[1], ... )]
(If you do pass point as the first sequence to zip, zip will iterate over that point, which in this case is a 3-element array.)
Your Pool.map line should look like:
for radius in scales:
    args = zip(coords, repeat(radius), repeat(coords), repeat(feat), repeat(False))
    feat_one_scale = pool.map(compute_desc_pool, args)
    # other stuff
A solution specific to your case, where all arguments except one are fixed could be to use functools.partial (as the other answer suggests). Furthermore, you don't even need to unpack coords in the first argument, just pass the index [0..n] in coords, since each invocation of your worker function already receives the complete coords array.
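For example, a rough sketch of that index-based variant (compute_desc_idx is a hypothetical helper, not from the question; binding the fixed arguments by keyword keeps the index as the sole positional argument):

from functools import partial

def compute_desc_idx(i, radius, coords, feat, verbose):
    # fetch this worker's coordinate from the shared coords array
    return compute_desc(coords[i], radius, coords, feat, verbose)

feat_one_scale = pool.map(partial(compute_desc_idx, radius=radius, coords=coords,
                                  feat=feat, verbose=False),
                          range(len(coords)))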
I assume from your example that four of those five arguments would be constant to all calls to compute_desc_pool. If so, then you can use partial to do this.
from functools import partial
....

def compute_desc_pool(coord, radius, coords, feat, verbose):
    return compute_desc(coord, radius, coords, feat, verbose)

def main():
    points = np.random.rand(1000000, 4)
    coords = points[:, 0:3]
    feat = points[:, 3]
    feat_one_scale = np.empty((1000000, 10))
    feat_one_scale[:] = np.NAN
    scales = [0.5, 1, 2]
    pool = Pool()
    for radius in scales:
        # bind the fixed arguments by keyword so each coords row still lands
        # in the coord parameter (positional partial args would shift it)
        feat_one_scale = pool.map(partial(compute_desc_pool, radius=radius,
                                          coords=coords, feat=feat,
                                          verbose=False), coords)
I have tried to use the emcee library to implement Markov Chain Monte Carlo inside a class, and also to make the multiprocessing module work, but after running the following test code:
import numpy as np
import emcee
import scipy.optimize as op
# Choose the "true" parameters.
m_true = -0.9594
b_true = 4.294
f_true = 0.534
# Generate some synthetic data from the model.
N = 50
x = np.sort(10*np.random.rand(N))
yerr = 0.1+0.5*np.random.rand(N)
y = m_true*x+b_true
y += np.abs(f_true*y) * np.random.randn(N)
y += yerr * np.random.randn(N)
class modelfit():
    def __init__(self):
        self.x = x
        self.y = y
        self.yerr = yerr
        self.m = -0.6
        self.b = 2.0
        self.f = 0.9

    def get_results(self):
        def func(a):
            model = a[0]*self.x + a[1]
            inv_sigma2 = 1.0/(self.yerr**2 + model**2*np.exp(2*a[2]))
            return 0.5*(np.sum((self.y-model)**2*inv_sigma2 + np.log(inv_sigma2)))
        result = op.minimize(func, [self.m, self.b, np.log(self.f)], options={'gtol': 1e-6, 'disp': True})
        m_ml, b_ml, lnf_ml = result["x"]
        return result["x"]

    def lnprior(self, theta):
        m, b, lnf = theta
        if -5.0 < m < 0.5 and 0.0 < b < 10.0 and -10.0 < lnf < 1.0:
            return 0.0
        return -np.inf

    def lnprob(self, theta):
        lp = self.lnprior(theta)
        likelihood = self.lnlike(theta)
        if not np.isfinite(lp):
            return -np.inf
        return lp + likelihood

    def lnlike(self, theta):
        m, b, lnf = theta
        model = m * self.x + b
        inv_sigma2 = 1.0/(self.yerr**2 + model**2*np.exp(2*lnf))
        return -0.5*(np.sum((self.y-model)**2*inv_sigma2 - np.log(inv_sigma2)))

    def run_mcmc(self, nstep):
        ndim, nwalkers = 3, 100
        pos = [self.get_results() + 1e-4*np.random.randn(ndim) for i in range(nwalkers)]
        self.sampler = emcee.EnsembleSampler(nwalkers, ndim, self.lnprob, threads=10)
        self.sampler.run_mcmc(pos, nstep)

test = modelfit()
test.x = x
test.y = y
test.yerr = yerr
test.get_results()
test.run_mcmc(5000)
I got this error message:
File "MCMC_model.py", line 157, in run_mcmc
self.sampler.run_mcmc(theta0, nstep)
File "build/bdist.linux-x86_64/egg/emcee/sampler.py", line 157, in run_mcmc
File "build/bdist.linux-x86_64/egg/emcee/ensemble.py", line 198, in sample
File "build/bdist.linux-x86_64/egg/emcee/ensemble.py", line 382, in _get_lnprob
File "build/bdist.linux-x86_64/egg/emcee/interruptible_pool.py", line 94, in map
File "/vol/aibn84/data2/zahra/anaconda/lib/python2.7/multiprocessing/pool.py", line 558, in get
raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
I reckon it has something to do with how I have used multiprocessing in the class, but I could not figure out how to keep the structure of my class the way it is and still use multiprocessing.
I would appreciate any tips.
P.S. I should mention that the code works perfectly if I remove threads=10 from the last function.
There are a number of SO questions that discuss what's going on:
https://stackoverflow.com/a/21345273/2379433
https://stackoverflow.com/a/28887474/2379433
https://stackoverflow.com/a/21345308/2379433
https://stackoverflow.com/a/29129084/2379433
…including this one, which seems to be your own response to nearly the same question:
https://stackoverflow.com/a/25388586/2379433
However, the difference here is that you are not using multiprocessing directly -- but emcee is. Therefore, the pathos.multiprocessing solution (from the links above) is not available to you. Since emcee uses cPickle, you'll have to stick to things that pickle knows how to serialize, and you are out of luck for class instances. Typical workarounds are either to use copy_reg to register the type of object you want to serialize, or to add a __reduce__ method to tell python how to serialize it. Several of the answers in the above links suggest similar things… but none of them lets you keep the class the way you have written it.
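For illustration, a rough sketch of the copy_reg route for bound methods (my sketch, not from the original answer; Python 2, to match your traceback):

import copy_reg
import types

def _pickle_method(method):
    # handles bound methods only: rebuild as getattr(instance, method_name)
    return getattr, (method.im_self, method.im_func.func_name)

copy_reg.pickle(types.MethodType, _pickle_method)

Registered before run_mcmc is called, this lets cPickle serialize self.lnprob by pickling the instance together with the method name (the instance itself must still be picklable).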
For the record, you can now create a pathos.multiprocessing pool and pass it to emcee using the pool argument. However, be aware that the overhead of multiprocessing can actually slow things down unless your likelihood is particularly time-consuming to compute.
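A minimal sketch of that route (assuming an emcee version whose EnsembleSampler accepts a pool object in place of threads):

import pathos.multiprocessing as mp

pool = mp.ProcessingPool(10)
sampler = emcee.EnsembleSampler(nwalkers, ndim, self.lnprob, pool=pool)
sampler.run_mcmc(pos, nstep)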