Faster exhaustive search using numpy - python

I'm trying to maximize the minimum of two functions using an exhaustive search. The solution below works, but the Python loops consume a lot of computing time. Is there an efficient way to use numpy (meshgrid or vectorize) to solve this problem?
Code:
The functions below are used in the exhaustive search method:
import numpy as np

def F1(x):
    return (x/11)**10

def F2(x, y, z):
    return z + x/y

def F3(x, y, z, a, b, c):
    return ((x+y)**z) / ((a-b)**c)
The exhaustive search method takes 6 parameters (scalar or 1D array). For the moment I just want the code to work on scalars; I can then use another function to iterate over those parameters if they are 1D arrays.
def B_F(P1, P2, P3, P4, P5, P6):
    # initializing my optimal parameters
    a_Opt, b_opt, c_opt, obj_opt = 0, 0, 0, 0
    # feasible set
    a = np.linspace(0.0, 1.0, 10)
    b = np.linspace(0.0, 100.0, 100)
    c = np.linspace(0.0, 100.0, 100)
    for i in a:
        for j in b:
            for k in c:
                # if the constraint is respected
                if P1*k + P2*j + 2*(i*k*j) <= F1(P3):
                    # calculate the min of the two functions
                    f_1 = F2(i, k, P6)
                    f_2 = F3(i, k, j, 10, P4, P5)
                    min_f = np.minimum(f_1, f_2)
                    # extract optimal parameters and objective function
                    if obj_opt <= min_f:
                        a_Opt = i
                        b_opt = j
                        c_opt = k
                        obj_opt = min_f
    exhaustive_research = np.array([[obj_opt, a_Opt, b_opt, c_opt]])
    return exhaustive_research

You can do it this way:
A,B,C = np.meshgrid(a,b,c)
mask = P1*C+P2*B+2*(A*B*C) <= F1(P3)
A = A[mask]
B = B[mask]
C = C[mask]
f_1 = F2(A,C,P6)
f_2 = F3(A,C,B,10,P4,P5)
min_f = np.minimum(f_1, f_2)
ind = np.argmax(min_f)
obj_opt, a_Opt, b_opt, c_opt = min_f[ind], A[ind], B[ind], C[ind]
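For completeness, here is that idea wrapped into a drop-in replacement for B_F (a minimal sketch assuming the F1, F2 and F3 definitions above; note that, exactly as in the loop version, combinations with k == 0 make F2 divide by zero, so numpy may emit a runtime warning for those entries):

def B_F_vectorized(P1, P2, P3, P4, P5, P6):
    a = np.linspace(0.0, 1.0, 10)
    b = np.linspace(0.0, 100.0, 100)
    c = np.linspace(0.0, 100.0, 100)
    # all (i, j, k) combinations at once instead of three nested loops
    A, B, C = np.meshgrid(a, b, c)
    # keep only the combinations that satisfy the constraint
    mask = P1*C + P2*B + 2*(A*B*C) <= F1(P3)
    A, B, C = A[mask], B[mask], C[mask]
    f_1 = F2(A, C, P6)
    f_2 = F3(A, C, B, 10, P4, P5)
    min_f = np.minimum(f_1, f_2)
    ind = np.argmax(min_f)  # index of the largest of the pointwise minima
    return np.array([[min_f[ind], A[ind], B[ind], C[ind]]])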


I'm getting a ValueError when writing a method plot for Newton's method

I have an assignment for school. First of all, can you help me confirm that I have interpreted the question correctly? And does the code seem somewhat OK? There were other tasks before this one, like creating the class with a two-dimensional function, writing the Newton method, and so on. Now this question. I'm not finished programming it, but I'm a bit stuck and I don't know exactly what to do. What do I run my Newton method on? On the point P? Do I create it like I have done in the plot method?
This is the question:
Write a method plot that checks the dependence of Newton's method on
several initial vectors x0. This method should plot what is described
in the following steps:
• Use the meshgrid command to set up a grid of
N² points in the set G = [a, b] × [c, d] (the parameters N, a, b, c and
d are parameters of the method). You obtain two matrices X and Y
where a specific grid point is defined as p_ij = (X_ij, Y_ij)
class fractals2D(object):
    Allzeroes = []  # a list to add all stored values from each run of Newton's method
    def __init__(self, f, x):
        self.f = f
        f0 = self.f(x)    # giving a variable name with the function to use in class
        n = len(x)        # for size of matrix
        jac = zeros([n])  # creates an array to use for the Jacobian matrix
        h = 1.e-8         # to set h for the derivative
        self.jac = jac
        for i in range(n):  # loop to take partial derivatives of x and y from x in f
            temp = x[i]
            #print(x[i])
            x[i] = temp + h  # why setting x[i] two times?
            #print(x[i])
            f1 = f(x)
            x[i] = temp
            #print(x[i])
            jac[:,i] = (f1 - f0)/h
    def Newtons_method(self, guess):
        f_val = f(guess)
        self.guess = guess
        for i in range(40):
            delta = solve(self.jac, -f_val)
            guess = guess + delta
            if norm((delta), ord=2) < 1.e-9:
                return guess  # a list for storing zeroes from one run
    def ZeroesMethod(self, point):
        point = self.guess
        self.Newtons_method(point)
        # adds zeroes from the run of Newton's method to a list to store them all
        self.Allzeroes.append(self.guess)
        return (len(self.Allzeroes))  # returns how many zeroes are found
    def plot(self, N, a, b, c, d):
        x = np.linspace(a, b, N)
        y = np.linspace(c, d, N)
        P = [X, Y] = np.meshgrid(x, y)
        return P  # calling ZeroesMethod with our newly meshed point of several arrays

x0 = array([2.0, 1.0])  # creates an x and y value?
x1 = array([1, -5])
a = array([2, 8])
b = array([-2, -6])

def f(x):
    f = np.array(
        [x[0]**2 - x[1] + x[0]*cos(pi*x[0]),
         x[0]*x[1] + exp(-x[1]) - x[0]**(-1)])
This is the error message I'm receiving:
delta = solve(self.jac,-f_val)
TypeError: bad operand type for unary -: 'NoneType'
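One likely cause, judging from the traceback (an editorial note, not part of the original question): f builds an array but never returns it, so f(guess) evaluates to None and the unary minus in -f_val fails. A minimal fix, keeping the imports the snippet already assumes, would be:

def f(x):
    # return the array instead of binding it to a local name and discarding it
    return np.array(
        [x[0]**2 - x[1] + x[0]*cos(pi*x[0]),
         x[0]*x[1] + exp(-x[1]) - x[0]**(-1)])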

Find optimal vector that minimizes function

I am trying to find a vector w that minimizes the residual sum of squares when a matrix is multiplied by it.
I know of scipy's optimize package (which has a minimize function). However, there are extra constraints for my code: the sum of all entries of w (see the function below) must equal 1, and no entry of w can be less than 0. Is there a package that does this for me? If not, how can I do it?
The function to minimize over w:
def w_rss(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = np.dot(errors.transpose(), errors).item(0)
    return rss

X0 = np.array([[3,4,5,3],
               [1,2,2,4],
               [6,5,3,7],
               [1,0,5,2]])

X1 = np.array([[4],
               [2],
               [4],
               [2]])

W = np.array([[.0],
              [.5],
              [.5],
              [.0]])

print w_rss(W, X0, X1)
So far this is my best attempt at looping through possible values of w, but it's not working properly.
def get_w(x0, x1):
    J = x0.shape[1]
    W0 = np.matrix([[1.0/J]*J]).transpose()
    rss0 = w_rss(W0, x0, x1)
    loop = range(J)
    for i in loop:
        W1 = W0
        rss1 = rss0
        while rss0 == rss1:
            den = len(loop) - 1
            W1[i][0] += 0.01
            for j in loop:
                if i == j:
                    continue
                W1[j][0] -= 0.01/den
                if W1[j][0] <= 0:
                    loop.remove(j)
            rss1 = w_rss(W1, x0, x1)
            if rss1 < rss0:
                #print W1
                W0 = W1
                rss0 = rss1
    print '--'
    print rss0
    print W0
    return W0, rss0
The SLSQP code in scipy can do this. You can use scipy.optimize.minimize with method='SLSQP', or you can use the function fmin_slsqp directly. In the following, I use fmin_slsqp.
The scipy solvers generally pass a one-dimensional array to the objective function, so to be consistent, I'll change W and X1 to be 1-d arrays, and I'll write the objective function (now called w_rss1) to expect a 1-d argument w.
The condition that all the elements in w must be between 0 and 1 is specified using the bounds argument, and the condition that the sum must be 1 is specified using the f_eqcons argument. The constraint function returns np.sum(w) - 1, so it is 0 when the sum of the elements is 1.
Here's the code:
import numpy as np
from scipy.optimize import fmin_slsqp

def w_rss1(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = (errors**2).sum()
    return rss

def sum1constraint(w, x0, x1):
    return np.sum(w) - 1

X0 = np.array([[3,4,5,3],
               [1,2,2,4],
               [6,5,3,7],
               [1,0,5,2]])
X1 = np.array([4, 2, 4, 2])
W = np.array([.0, .5, .5, .0])

result = fmin_slsqp(w_rss1, W, f_eqcons=sum1constraint, bounds=[(0.0, 1.0)]*len(W),
                    args=(X0, X1), disp=False, full_output=True)
Wopt, fW, its, imode, smode = result
if imode != 0:
    print("Optimization failed: " + smode)
else:
    print(Wopt)
When I run this, the output is
[ 0.05172414 0.55172414 0.39655172 0. ]
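For reference, here is the same setup through the scipy.optimize.minimize interface mentioned above (a sketch reusing w_rss1, X0, X1 and W from the answer; the equality constraint is passed as a dict instead of f_eqcons):

from scipy.optimize import minimize

res = minimize(w_rss1, W, args=(X0, X1), method='SLSQP',
               bounds=[(0.0, 1.0)]*len(W),
               constraints={'type': 'eq', 'fun': lambda w: np.sum(w) - 1})
print(res.x)  # same optimum as the fmin_slsqp call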

MATLAB fftfilt equivalent for Python

I am trying to translate the following function written in MATLAB into Python:
function output_phase = fix_phasedata180(phase_data_degrees, averaging_length)
    x = exp(sqrt(-1)*phase_data_degrees*2/180*pi);
    N = averaging_length;
    b = 1/sqrt(N)*ones(1,N);
    y = fftfilt(b,x); y = fftfilt(b,y(end:-1:1)); y = y(end:-1:1); % a quick implementation of filtfilt using fftfilt instead of filter
    output_phase = (phase_data_degrees-(round(mod(phase_data_degrees/180*pi-unwrap(angle(y))/2,2*pi)*180/pi/180)*180));
    temp = mod(output_phase(1),90);
    output_phase = output_phase-output_phase(1)+temp;
    output_phase = mod(output_phase,360);
    s = find(output_phase >= 180);
    output_phase(s) = output_phase(s)-360;
So I tried to implement this MATLAB function in Python:
def fix_phasedata180(data_phase, averaging_length):
    x = np.exp(1j*data_phase*2./180.*np.pi)
    N = averaging_length
    b = 1./np.sqrt(N)*np.ones(N)
    y = fftfilt(b,x)
    y = fftfilt(b,y[::-1])
    y = y[::-1]
    output_phase = data_phase - np.array(map(round,((data_phase/180.*np.pi-np.unwrap(np.angle(y))/2.)%(2.*np.pi))*180./np.pi/180.))*180
    temp = output_phase[0]%90
    output_phase = output_phase-output_phase[0]+temp
    s = output_phase[output_phase >= 180]
    for s in range(len(output_phase)):
        output_phase[s] = output_phase[s]-360
    return output_phase
I was thinking that this fftfilt function was a clone of MATLAB's fftfilt, but when I run it I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-40-eb6944fd1053> in <module>()
4 N = averaging_length
5 b = 1./np.sqrt(N)*np.ones(N)
----> 6 y = fftfilt(b,x)
D:/folder/fftfilt.pyc in fftfilt(b, x, *n)
66 k = min([i+N_fft,N_x])
67 yt = ifft(fft(x[i:il],N_fft)*H,N_fft) # Overlap..
---> 68 y[i:k] = y[i:k] + yt[:k-i] # and add
69 i += L
70 return y
ValueError: could not broadcast input array from shape (0,0) into shape (0)
So, my question is: is there any equivalent of MATLAB's fftfilt in Python? The aim of my function output_phase is to correct the fast variations in a phase signal and then correct n*90-degree shifts, as shown below.
The function you linked to is a Python equivalent to the Matlab function. It just happens to be broken.
Anyway, MNE also has an implementation of the overlap-add method used by the fftfilt function. It's a private function of the library, and I'm not sure it's exactly equivalent to the Matlab function, but it may be useful. You can find the source code here: https://github.com/mne-tools/mne-python/blob/master/mne/filter.py#L41.
Finally, I got an improvement in my code. I replaced the fftfilt (applied twice) with scipy.signal.filtfilt (which is basically the same thing). So my code translated into Python is:
import numpy as np
import scipy.signal as sg

AveragingLengthAmp = 10
AveragingLengthPhase = 10
PhaseFixLength = 60
channel_sampling_freq1 = 1  # 1/sec (added here for completeness; stated below in the text)
averaging_length = channel_sampling_freq1*PhaseFixLength

def fix_phasedata180(data_phase, averaging_length):
    data_phase = np.reshape(data_phase, len(data_phase))
    x = np.exp(1j*data_phase*2./180.*np.pi)
    N = float(averaging_length)
    b, a = sg.butter(10, 1./np.sqrt(N))
    y = sg.filtfilt(b, a, x)
    output_phase = data_phase - np.array(map(round,((data_phase/180*np.pi-np.unwrap(np.angle(y))/2)%(2*np.pi))*180/np.pi/180))*180
    temp = output_phase[0]%90
    output_phase = output_phase-output_phase[0]+temp
    # vectorized equivalent of the MATLAB find/assign (the original post's
    # for-loop subtracted 360 from every element, which was a bug)
    output_phase[output_phase >= 180] -= 360
    return output_phase

out1 = fix_phasedata180(data_phase, averaging_length)

def fix_phasedata90(data_phase, averaging_length):
    data_phase = np.reshape(data_phase, len(data_phase))
    x = np.exp(1j*data_phase*4./180.*np.pi)
    N = float(averaging_length)
    b, a = sg.butter(10, 1./np.sqrt(N))
    y = sg.filtfilt(b, a, x)
    output_phase = data_phase - np.array(map(round,((data_phase/180*np.pi-np.unwrap(np.angle(y))/4)%(2*np.pi))*180/np.pi/90))*90
    temp = output_phase[0]%90
    output_phase = output_phase-output_phase[0]+temp
    output_phase = output_phase%360
    output_phase[output_phase >= 180] -= 360
    return output_phase

out2 = fix_phasedata90(out1, averaging_length)  # (added: out2 is used below but was not defined in the post)

offset = 0
data_phase_unwrapped = np.zeros(len(out2))
data_phase_unwrapped[0] = out2[0]
for jj in range(1, len(out2)):
    if out2[jj]-out2[jj-1] > 180:
        offset = offset + 360
    elif out2[jj]-out2[jj-1] < -180:
        offset = offset - 360
    data_phase_unwrapped[jj] = out2[jj] - offset
Here fix_phasedata180 fixes the 180-degree shifts, and similarly fix_phasedata90 fixes the 90-degree shifts. channel_sampling_freq1 is 1/sec.
The result (shown as a plot in the original post) is mostly right. I only have some questions about understanding scipy.signal.butter and scipy.signal.filtfilt. As you see, I chose:
b, a = sg.butter(10, 1./np.sqrt(N))
Here the order of the filter (N) is 10 and the critical frequency (Wn) is 1/sqrt(60). My question is: how can I choose the appropriate order of the filter? I tried orders from 1 to 21; for orders larger than 21 the resulting data_phase_unwrapped is all NaN. I also tried giving values for padlen in filtfilt, but I didn't understand it well.
This is a bit late but I found the answer to this while translating some matlab code of my own.
TLDR: Use mode="full" for any of the convolve functions in scipy.signal
I leaned on scipy's recipes to guide me through this. The rest of my answer is effectively a summary of that page. MATLAB's fftfilt function can be replaced with any of the convolve functions mentioned in the cookbook (np.convolve, scipy.signal.convolve, .oaconvolve, .fftconvolve), if you pass mode='full'.
import numpy as np
from numpy import convolve as np_convolve
from scipy.signal import fftconvolve, lfilter, firwin
from scipy.signal import convolve as sig_convolve

# Create the m by n data to be filtered.
m = 1
n = 2 ** 18
x = np.random.random(size=(m, n))

ntaps_list = 2 ** np.arange(2, 14)
for ntaps in ntaps_list:
    # Create a FIR filter.
    b = firwin(ntaps, [0.05, 0.95], width=0.05, pass_zero=False)
    conv_result = sig_convolve(x, b[np.newaxis, :], mode='full')
Happy filtering!
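As a quick sanity check (an editorial sketch, not from the original answer): filtering with an FIR filter b via lfilter(b, 1.0, x) returns the first len(x) samples of the full convolution, which is what MATLAB's fftfilt computes, so the two can be compared directly (here using the b from the last loop iteration above):

y_full = np_convolve(x[0], b, mode='full')[:n]  # full convolution, truncated
y_lfilter = lfilter(b, 1.0, x[0])               # FIR filtering
print(np.allclose(y_full, y_lfilter))           # expected: True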
I also had issues when converting a MATLAB code. I went from this MATLAB code:
signal_weighted = fftfilt( weight, signal.^2 ) / Ntau;
to this python code:
from scipy.signal import convolve

signal_weighted = convolve(signal**2, weightData, 'full', 'direct') / Ntau
signal_weighted = signal_weighted[:len(signal)]
If you want something faster than convolve, see this overlap and add fft implementation
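For what it's worth, SciPy now ships an overlap-add implementation directly: scipy.signal.oaconvolve (available since SciPy 1.4). A sketch using the same hypothetical signal, weightData and Ntau names as above:

from scipy.signal import oaconvolve

signal_weighted = oaconvolve(signal**2, weightData, mode='full') / Ntau
signal_weighted = signal_weighted[:len(signal)]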

Reducing computation time for a nested loop

I would like to reduce the computation time of the code posted below. In essence, the code calculates the array Tf via the following nested loop:
Af = lambda x: Approximationf(f, x)
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0])*z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y*np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
The objects posterior, integrate and Af are functions that are called repeatedly while iterating over the loop. The function posterior calculates a scalar posterior weight. The function Af approximates the function f at the sample points x and passes the result on to the function integrate, which calculates the conditional expectation of f.
The code posted below is a simplification of a more difficult problem. Instead of running the nested loop once, I have to run it multiple times to solve a fixed-point problem. The problem is initialized with an arbitrary function f, and an array Tf is created. This array is then used in the next iteration over the nested loop to calculate another Tf. The process continues until convergence.
I decided not to report results from the cProfile module: when the nested loop is run only once, a lot of internal Python calls take a relatively long time, but when iterating until convergence these calls lose their relative importance and drop to lower positions in the cProfile output.
I tried to mimic different suggestions for lowering the computation time of loops that I found online for slightly modified problems. Unfortunately, I couldn't make them work and could not figure out a common approach to tackle these problems. Does somebody have an idea how to lower the computation time of this loop? I am grateful for any help!
import numpy as np
from scipy import interpolate
from scipy.stats import lognorm
from scipy.integrate import fixed_quad
# == The following lines define the parameters for the problem == #
gamma, beta, sigma, mu = 2, 0.95, 0.0255, np.array([0.0113, -0.0016])
grid_y, grid_prior = np.linspace(7, 10, 15), np.linspace(0, 1, 5)
int_min, int_max = np.exp(- 7 * sigma), np.exp(+ 7 * sigma)
phi = lognorm(sigma)
f = np.array([[ 1.29824564, 1.29161017, 1.28379398, 1.2676886,  1.15320819],
              [ 1.26290108, 1.26147364, 1.24755837, 1.23819851, 1.11912802],
              [ 1.22847276, 1.23013194, 1.22128198, 1.20996971, 1.0864706 ],
              [ 1.19528104, 1.19645792, 1.19056084, 1.17980572, 1.05532966],
              [ 1.16344832, 1.16279841, 1.15997191, 1.15169942, 1.02564429],
              [ 1.13301675, 1.13109952, 1.12883038, 1.1236645,  0.99730795],
              [ 1.10398195, 1.10125013, 1.0988554,  1.09612933, 0.97019688],
              [ 1.07630046, 1.07356297, 1.07126087, 1.06878758, 0.94417658],
              [ 1.04989686, 1.04728542, 1.04514962, 1.04289665, 0.91910765],
              [ 1.02467087, 1.0221532,  1.02011384, 1.01797238, 0.89485162],
              [ 1.00050447, 0.99795025, 0.99576917, 0.99330549, 0.87127677],
              [ 0.97726849, 0.97443288, 0.97190614, 0.96861352, 0.84826362],
              [ 0.95482612, 0.94783816, 0.94340077, 0.93753641, 0.82569922],
              [ 0.93302433, 0.91985497, 0.9059118,  0.88895196, 0.80348449],
              [ 0.91165997, 0.88253486, 0.86126688, 0.84769975, 0.78147382]])
# == Calculate the function h, used in the loop below == #
E0 = np.exp((1-gamma)*mu + (1-gamma)**2*sigma**2/2)
h = np.outer(beta*grid_y**(1-gamma), grid_prior*E0[0] + (1-grid_prior)*E0[1])
def integrate(g):
    """
    This function is repeatedly called in the loop below.
    """
    integrand = lambda z: g(z) * phi.pdf(z)
    result = fixed_quad(integrand, int_min, int_max, n=15)[0]
    return result
def Approximationf(f, x):
    """
    This function approximates the function f and is repeatedly called in
    the loop.
    """
    # == simplify notation == #
    fApprox = np.empty((x.shape[1]))
    lower, middle = (x[0] < grid_y[0]), (x[0] >= grid_y[0]) & (x[0] <= grid_y[-1])
    upper = (x[0] > grid_y[-1])
    # == Calculate Polynomial == #
    y_tile = np.tile(grid_y, len(grid_prior))
    prior_repeat = np.repeat(grid_prior, len(grid_y))
    s = interpolate.SmoothBivariateSpline(y_tile, prior_repeat,
                                          f.T.flatten(), kx=5, ky=5)
    # == interpolation == #
    fApprox[middle] = s(x[0, middle], x[1, middle])[:, 0]
    # == Extrapolation == #
    if any(lower):
        s0 = s(lower[lower]*grid_y[0], x[1, lower])[:, 0]
        s1 = s(lower[lower]*grid_y[1], x[1, lower])[:, 0]
        slope_lower = (s0 - s1)/(grid_y[0] - grid_y[1])
        fApprox[lower] = s0 + slope_lower*(x[0, lower] - grid_y[0])
    if any(upper):
        sM1 = s(upper[upper]*grid_y[-1], x[1, upper])[:, 0]
        sM2 = s(upper[upper]*grid_y[-2], x[1, upper])[:, 0]
        slope_upper = (sM1 - sM2)/(grid_y[-1] - grid_y[-2])
        fApprox[upper] = sM1 + slope_upper*(x[0, upper] - grid_y[-1])
    return fApprox

def updated_posterior(prior, y, yPrime):
    """
    This function calculates the posterior weights put on each distribution.
    It is the third function repeatedly called in the loop below.
    """
    z_0 = yPrime/(y * np.exp(mu[0]))
    z_1 = yPrime/(y * np.exp(mu[1]))
    l0, l1 = phi.pdf(z_0), phi.pdf(z_1)
    posterior = l0*prior / (l0*prior + l1*(1-prior))
    return posterior
Tf = np.empty_like(f)
Af = lambda x: Approximationf(f, x)
# == Apply the T operator to f == #
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0])*z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y*np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
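One editorial observation on the code as posted, independent of parallelization: Approximationf rebuilds the SmoothBivariateSpline on every call, even though the spline depends only on f and the fixed grids, not on x. Hoisting the construction out of the loop should remove a large share of the per-call cost. A minimal sketch of the idea (only the interior interpolation is shown; the two extrapolation branches of Approximationf carry over unchanged):

# Build the spline once per f (it does not depend on x) ...
y_tile = np.tile(grid_y, len(grid_prior))
prior_repeat = np.repeat(grid_prior, len(grid_y))
s = interpolate.SmoothBivariateSpline(y_tile, prior_repeat,
                                      f.T.flatten(), kx=5, ky=5)
# ... so each call to Af only evaluates it (grid=False evaluates pointwise).
Af = lambda x: s(x[0], x[1], grid=False)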
Some experience with multiprocessing

Following reptilicus' comment, I decided to investigate how to use the multiprocessing module. My idea was to begin by parallelizing the computation of the integrateL array. To do so, I fixed the outer loop to prior = 0.5 and iterated over the inner loop, grid_y. However, I still have to take into account that integrateL is a lambda function in z. I tried to follow the advice of the Stack Overflow question "How to let Pool.map take a lambda function" and wrote the following code:
prior = 0.5
Af = lambda x: Approximationf(f, x)

class Iteration(object):
    def __init__(self, state):
        self.y = state
    def __call__(self, z):
        Af(np.array([self.y*np.exp(mu[0])*z,
                     updated_posterior(prior,
                                       self.y, self.y*np.exp(mu[0])*z)]))

with Pool(processes=4) as pool:
    out = pool.map(Iteration(y), np.nditer(grid_y))
Unfortunately, Python returns the following upon running the program:
IndexError: tuple index out of range
At first sight this smells like a trivial error, but I cannot remedy it. Does somebody have an idea how to tackle the problem? Again, I'm grateful for any advice!
I would target that nested loop, something like this. This is pseudo-code, but it should get you started.
def do_calc(idxp, idxy, y, prior):
    posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
    integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                  posterior(y*np.exp(mu[0])*z)])))
    integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                  posterior(y*np.exp(mu[1])*z)])))
    # return prior too, so the final reduction does not rely on a stale loop variable
    return (idxp, idxy, prior, integrateL, integrateH)

pool = multiprocessing.Pool(8)  # or however many cores you have
results = []
# This is the part to parallelize
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        results.append(pool.apply_async(do_calc, args=(idxp, idxy, y, prior)))
pool.close()
pool.join()
results = [r.get() for r in results]
for idxp, idxy, prior, integrateL, integrateH in results:
    Tf[idxy, idxp] = (h[idxy, idxp] +
                      beta * ((prior * integrateL) +
                              (1-prior)*integrateH))
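One caveat worth adding (an editorial note, not from the original answer): on platforms that spawn rather than fork worker processes, such as Windows, the Pool creation and the job-submitting loop must live under an if __name__ == '__main__': guard; otherwise each worker re-imports the module and re-executes that code.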

3D distance vectorization

I need help vectorizing this code. Right now, with N=100, it takes a minute or so to run. I would like to speed that up. I have done something like this for a double loop, but never with a 3D loop, and I am having difficulties.
import numpy as np

N = 100
n = 12
r = np.sqrt(2)

x = np.arange(-N, N+1)
y = np.arange(-N, N+1)
z = np.arange(-N, N+1)

C = 0
for i in x:
    for j in y:
        for k in z:
            if (i+j+k)%2 == 0 and (i*i+j*j+k*k != 0):
                p = np.sqrt(i*i+j*j+k*k)
                p = p/r
                q = (1/p)**n
                C += q
print '\n'
print C
The meshgrid/where/indexing solution is already extremely fast. I made it about 65% faster. This is not much, but I will explain it anyway, step by step:
It was easiest for me to approach this problem with all 3D vectors in the grid being columns in one large 2D 3 x M array. meshgrid is the right tool for creating all the combinations (note that numpy version >= 1.7 is required for a 3D meshgrid), and vstack + reshape bring the data into the desired form. Example:
>>> np.vstack(np.meshgrid(*[np.arange(0, 2)]*3)).reshape(3,-1)
array([[0, 0, 1, 1, 0, 0, 1, 1],
       [0, 0, 0, 0, 1, 1, 1, 1],
       [0, 1, 0, 1, 0, 1, 0, 1]])
Each column is one 3D vector. Each of these eight vectors represents one corner of a 1x1x1 cube (a 3D grid with step size 1 and length 1 in all dimensions).
Let's call this array vectors (it contains all 3D vectors representing all points in the grid). Then, prepare a bool mask for selecting those vectors fulfilling your mod2 criterion:
mod2bool = np.sum(vectors, axis=0) % 2 == 0
np.sum(vectors, axis=0) creates a 1 x M array containing the element sum for each column vector. Hence, mod2bool is a 1 x M array with a bool value for each column vector. Now use this bool mask:
vectorsubset = vectors[:,mod2bool]
This selects all rows (:) and uses boolean indexing for filtering the columns, both are fast operations in numpy. Calculate the lengths of the remaining vectors, using the native numpy approach:
lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))
This is quite fast -- however, scipy.stats.ss and bottleneck.ss can perform the squared sum calculation even faster than this.
Transform the lengths using your instructions:
with np.errstate(divide='ignore'):
    p = (r/lengths)**n
This involves division of finite numbers by zero, resulting in Infs in the output array. This is entirely fine. We use numpy's errstate context manager to make sure that these zero divisions do not throw an exception or a runtime warning.
Now sum up the finite elements (ignore the infs) and return the sum:
return np.sum(p[np.isfinite(p)])
I have implemented this method twice below: once exactly as just explained, and once using bottleneck's ss and nansum functions. I have also added your method for comparison, and a modified version of your method that skips the np.where((x*x+y*y+z*z)!=0) indexing and instead creates Infs, finally summing the isfinite way.
import sys
import numpy as np
import bottleneck as bn

N = 100
n = 12
r = np.sqrt(2)

x,y,z = np.meshgrid(*[np.arange(-N, N+1)]*3)
gridvectors = np.vstack((x,y,z)).reshape(3, -1)

def measure_time(func):
    import time
    def modified_func(*args, **kwargs):
        t0 = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - t0
        print("%s duration: %.3f s" % (func.__name__, duration))
        return result
    return modified_func

@measure_time
def method_columnvecs(vectors):
    mod2bool = np.sum(vectors, axis=0) % 2 == 0
    vectorsubset = vectors[:,mod2bool]
    lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return np.sum(p[np.isfinite(p)])

@measure_time
def method_columnvecs_opt(vectors):
    # On my system, bn.nansum is even slightly faster than np.sum.
    mod2bool = bn.nansum(vectors, axis=0) % 2 == 0
    # Use ss from bottleneck or scipy.stats (axis=0 is default).
    lengths = np.sqrt(bn.ss(vectors[:,mod2bool]))
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return bn.nansum(p[np.isfinite(p)])

@measure_time
def method_original(x,y,z):
    ind = np.where((x+y+z)%2==0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    ind = np.where((x*x+y*y+z*z)!=0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    p = np.sqrt(x*x+y*y+z*z)/r
    return np.sum((1/p)**n)

@measure_time
def method_original_finitesum(x,y,z):
    ind = np.where((x+y+z)%2==0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    lengths = np.sqrt(x*x+y*y+z*z)
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return np.sum(p[np.isfinite(p)])

print method_columnvecs(gridvectors)
print method_columnvecs_opt(gridvectors)
print method_original(x,y,z)
print method_original_finitesum(x,y,z)
This is the output:
$ python test.py
method_columnvecs duration: 1.295 s
12.1318801965
method_columnvecs_opt duration: 1.162 s
12.1318801965
method_original duration: 1.936 s
12.1318801965
method_original_finitesum duration: 1.714 s
12.1318801965
All methods produce the same result. Your method becomes a bit faster when doing the isfinite style sum. My methods are faster, but I would say that this is an exercise of academic nature rather than an important improvement :-)
I have one question left: you were saying that for N=3, the calculation should produce a 12. Even yours doesn't do this. All methods above produce 12.1317530867 for N=3. Is this expected?
Thanks to @Bill, I was able to get this to work. It is very fast now. Perhaps it could be done better, especially the two masks used to get rid of the two conditions that I originally had for loops for.
from __future__ import division
import numpy as np
N = 100
n = 12
r = np.sqrt(2)
x, y, z = np.meshgrid(*[np.arange(-N, N+1)]*3)
ind = np.where((x+y+z)%2==0)
x = x[ind]
y = y[ind]
z = z[ind]
ind = np.where((x*x+y*y+z*z)!=0)
x = x[ind]
y = y[ind]
z = z[ind]
p=np.sqrt(x*x+y*y+z*z)/r
ans = (1/p)**n
ans = np.sum(ans)
print 'ans'
print ans
