I'm trying to sum a two-dimensional function using arrays, but my for loop is not producing the correct answer. I want to find (in LaTeX) $$\sum_{i=1}^{M}\sum_{j=1}^{M_2}\cos(i)\cos(j)$$ where, according to Mathematica, the answer for M=M_2=5 is 1.52725. According to the for loop:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
            return s1
print(f(4))
is 0.291927.
I have thus been trying to use some code of the form:
def f1(N):
    mat3=np.zeros((N,N),np.complex)
    for i in range(0,len(mat3)):
        for j in range(0,len(mat3)):
            mat3[i][j]=np.cos(i+1)*np.cos(j+1)
            return sum(mat3)
which again
print(f1(4))
outputs 0.291927. Looking at the array we should find for each value of i and j a matrix of the form
mat3=[[np.cos(1)*np.cos(1),np.cos(2)*np.cos(1),...],[np.cos(2)*np.cos(1),...]...[np.cos(N+1)*np.cos(N+1)]]
so for N=4 we should have
mat3=[[np.cos(1)*np.cos(1) np.cos(2)*np.cos(1) ...] [np.cos(2)*np.cos(1) ...]...[... np.cos(5)*np.cos(5)]]
but what I actually get is the following
mat3=[[0.29192658+0.j 0.+0.j 0.+0.j ... 0.+0.j] ... [... 0.+0.j]]
or a matrix of all zeros apart from the mat3[0][0] element.
Does anybody know a correct way to do this and get the correct answer? I chose this as an example because the problem I'm trying to solve involves plotting a function which has been summed over two indices and the function that python outputs is not the same as Mathematica (i.e., a function of the form $$f(E)=\sum_{i=1}^{M}\sum_{j=1}^{M_2}F(i,j,E)$$).
The return statement is not indented correctly in your sample code: it sits inside the inner loop, so the function returns during the very first iteration. Indent it at the level of the function body instead, so that both for loops finish:
def f(N):
    s1 = 0
    for p1 in range(N):
        for p2 in range(N):
            s1 += np.cos(p1+1)*np.cos(p2+1)
    return s1
>>> print(f(5))
1.527247272700347
I have moved your code to a more numpy-ish version:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
x = x.reshape((-1, 1))
y = y.reshape((1, -1))
mat = np.cos(x) * np.cos(y)
print(mat.sum()) # 1.5272472727003474
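The trick here is to reshape x into a column vector and y into a row vector. When you multiply them, broadcasting pairs every element of x with every element of y, just like your nested loop does.
This should be faster, since np.cos() is called only twice, on arrays of N elements each, so 2*N cosines are evaluated instead of 2*N**2. It also avoids explicit loops, which are slow in Python.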
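Because the summand here factors into a function of i times a function of j, the double sum also factors into a product of two one-dimensional sums. Below is a minimal sketch of that shortcut, checked against the broadcast version. The helper names `double_sum_broadcast` and `double_sum_factored` are made up for this illustration and are not part of the original answer.

```python
import numpy as np

def double_sum_broadcast(M, M2):
    """Sum cos(i)*cos(j) over i=1..M, j=1..M2 via an (M, M2) broadcast array."""
    i = np.arange(1, M + 1).reshape(-1, 1)   # column vector
    j = np.arange(1, M2 + 1).reshape(1, -1)  # row vector
    return (np.cos(i) * np.cos(j)).sum()

def double_sum_factored(M, M2):
    """Same sum, using sum_i sum_j a_i*b_j == (sum_i a_i) * (sum_j b_j)."""
    return np.cos(np.arange(1, M + 1)).sum() * np.cos(np.arange(1, M2 + 1)).sum()

print(double_sum_broadcast(5, 5))
print(double_sum_factored(5, 5))
```

The factored form only applies when the summand is separable. A general F(i, j, E) still needs the broadcast array.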
UPDATE (regarding your comment):
This pattern can be extended to any number of dimensions. Basically, you get something like an outer product (not a vector cross product), where every value of x is matched up with every value of y, z, u, k, ... along the corresponding dimensions.
It's a bit confusing to describe, so here is some more code:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
z = np.arange(N) + 1
x = x.reshape((-1, 1, 1))
y = y.reshape((1, -1, 1))
z = z.reshape((1, 1, -1))
mat = z**2 * np.cos(x) * np.cos(y)
# x along first axis
# y along second, z along third
# mat[0, 0, 0] == 1**2 * np.cos(1) * np.cos(1)
# mat[0, 4, 2] == 3**2 * np.cos(1) * np.cos(5)
If you use this for many dimensions, and big values for N, you will run into memory problems, though.
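If only the total is needed, one way to limit memory is to loop over a single axis and broadcast the rest, so that only an (N, N) slice exists at any time instead of all N**3 values. This is a rough sketch under that assumption, and the function names `sum_full` and `sum_sliced` are invented for the example.

```python
import numpy as np

def sum_full(N):
    """Reference: build the full (N, N, N) array, then sum it."""
    x = np.arange(1, N + 1).reshape(-1, 1, 1)
    y = np.arange(1, N + 1).reshape(1, -1, 1)
    z = np.arange(1, N + 1).reshape(1, 1, -1)
    return (z**2 * np.cos(x) * np.cos(y)).sum()

def sum_sliced(N):
    """Loop over x only; each step holds an (N, N) slice instead of N**3 values."""
    y = np.arange(1, N + 1).reshape(-1, 1)
    z = np.arange(1, N + 1).reshape(1, -1)
    yz = z**2 * np.cos(y)  # (N, N) part that does not depend on x
    total = 0.0
    for xi in range(1, N + 1):
        total += (np.cos(xi) * yz).sum()
    return total

print(sum_full(5), sum_sliced(5))
```

This trades one Python-level loop of length N for a factor of N in memory, which is usually a reasonable deal when N is large.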
Related
The following function is written in MATLAB. I need to write an equivalent Python function that produces the same output. Can you help me write the code, please?
function CORR=function_AutoCorr(tau,y)
% This function will generate a matrix, Where on-diagonal elements are autocorrelation and
% off-diagonal elements are cross-correlations
% y is the data set. e.g., a 10 by 9 Matrix.
% tau is the lag value. e.g. tau=1
    Size=size(y);
    N=Size(1,2); % Number of columns
    T=Size(1,1); % length of the rows
    for i=1:N
        for j=1:N
            temp1=0;
            for t=1:T-tau
                G=0.5*((y(t+tau,i)*y(t,j))+(y(t+tau,j)*y(t,i)));
                temp1=temp1+G;
            end
            CORR(i,j)=temp1/(T-tau);
        end
    end
end
Assuming that y is a NumPy array, it would be something close to this (although I have not tested it):
import numpy as np
def function_AutoCorr(tau, y):
    Size = y.shape
    N = Size[1]  # number of columns
    T = Size[0]  # number of rows
    CORR = np.zeros(shape=(N,N))
    for i in range(N):
        for j in range(N):
            temp1 = 0
            for t in range(T - tau):
                G=0.5*((y[t+tau,i]*y[t,j])+(y[t+tau,j]*y[t,i]))
                temp1 = temp1 + G
            CORR[i, j] = temp1/(T - tau)
    return CORR
y = np.array([[1,2,3], [4,5,6], [6,7,8], [13,14,15]])
print(y)
result = function_AutoCorr(1, y)
print(result)
The resulting CORR matrix for this example is:
If you want to run the function for several tau values, keep every tau smaller than the number of rows of y (otherwise no terms are left to average). In Python you could do:
result = [function_AutoCorr(tau, y) for tau in range(1, y.shape[0])]
The result will be a list of autocorrelation matrices, which are numpy arrays. This syntax is called a list comprehension.
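The three nested loops can also be collapsed into one matrix product, since the sum over t of y[t+tau, i] * y[t, j] is entry (i, j) of the lagged rows transposed times the unlagged rows. Here is a minimal sketch of that idea. The name `auto_corr_vectorized` and the range check on tau are additions for this illustration, not part of the original function.

```python
import numpy as np

def auto_corr_vectorized(tau, y):
    """Symmetrised lagged correlation matrix without explicit Python loops."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    if not 0 <= tau < T:
        raise ValueError("tau must satisfy 0 <= tau < number of rows")
    lagged = y[tau:]         # rows t + tau
    base = y[:T - tau]       # rows t
    cross = lagged.T @ base  # cross[i, j] = sum_t y[t+tau, i] * y[t, j]
    return 0.5 * (cross + cross.T) / (T - tau)

y = np.array([[1, 2, 3], [4, 5, 6], [6, 7, 8], [13, 14, 15]])
corr = auto_corr_vectorized(1, y)
print(corr)
```

Averaging `cross` with its transpose reproduces the 0.5 * (... + ...) term of the loop version, so the result is symmetric by construction.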
You'll probably want to use NumPy. They even have a guide for Matlab users.
Here are some useful tips.
Defining a function
def auto_corr(tau, y):
    """Generate matrix of correlations"""
    # Do calculations
    return corr
Get the size of a numpy array
n_rows, n_cols = y.shape
Indexing
Indexing is 0-based and uses brackets ([]) instead of parentheses.
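As a quick illustration of the indexing difference, here is a small sketch with the MATLAB equivalents in the comments. The array `y` is an arbitrary example.

```python
import numpy as np

y = np.arange(1, 13).reshape(3, 4)  # 3 rows, 4 columns, values 1..12

n_rows, n_cols = y.shape            # MATLAB: size(y)
first = y[0, 0]                     # MATLAB: y(1, 1)
last = y[n_rows - 1, n_cols - 1]    # MATLAB: y(end, end); y[-1, -1] also works
second_col = y[:, 1]                # MATLAB: y(:, 2)
first_two_rows = y[0:2, :]          # MATLAB: y(1:2, :); the stop index is excluded

print(first, last, second_col, first_two_rows.shape)
```

Note that a MATLAB loop `for t=1:T-tau` becomes `for t in range(T - tau)`, because `range` starts at 0 and excludes its end point.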
How can I write a recursive function to generate a vector X of size (1,n) as follows, where X_i is the i-th entry:
X_1 = Z_1 * E_1
X_i = max{B_(1,i) * X_1, ... , B_((i-1),i) * X_(i-1), Z_i} * E_i, i = 2,...,n,
where
Z = np.random.normal(0, 1,size = n)
E = np.random.lognormal(0, 1, size = n)
B = np.random.uniform(0,1,(n,n))
I do not have any experience with recursive functions, that is why I can not present any code with which I tried to solve this.
If you're working with numpy, then use all the power of numpy, not just the random module ;)
And if you work with vectors, then forget about recursion and use numpy's vectorised operations. For example, np.max gives you the maximum over an axis, and the * operator (or np.multiply) gives you element-wise multiplication, whereas np.dot computes a dot product or matrix product. You also have np.prod for the product of array elements over a given axis... Those are just examples that might fit your problem well. For the full documentation, see https://docs.scipy.org/doc/numpy/
I got it: one does not need recursion, as @meowgoesthedog stated in the first comment.
import numpy as np
s=1000 # sample size
n=5
Z = np.random.normal(0, 1,size = (s,n))
B = np.random.uniform(0,1,(n,n))
E = np.random.lognormal(0, 1, size = (s,n))
X = np.zeros((s,n))
X[:,0] = Z[:,0]*E[:,0]
for k in range(s):
    for l in range(1,n):
        X[k,l] = max(np.max(X[k,:(l)] * B[:(l),l]), Z[k,l]) * E[k,l]
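The loop over the samples k can be vectorised, because the rows of X do not depend on each other. Only the loop over l has to stay sequential. A minimal sketch follows, with the double loop kept as a reference for comparison. The function names `fill_loop` and `fill_columns` and the seeded generator are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
s, n = 1000, 5
Z = rng.normal(0, 1, size=(s, n))
B = rng.uniform(0, 1, size=(n, n))
E = rng.lognormal(0, 1, size=(s, n))

def fill_loop(Z, B, E):
    """Reference: loop over every sample k and every entry l."""
    s, n = Z.shape
    X = np.zeros((s, n))
    X[:, 0] = Z[:, 0] * E[:, 0]
    for k in range(s):
        for l in range(1, n):
            X[k, l] = max(np.max(X[k, :l] * B[:l, l]), Z[k, l]) * E[k, l]
    return X

def fill_columns(Z, B, E):
    """Loop over l only; each column is computed for all samples at once."""
    s, n = Z.shape
    X = np.zeros((s, n))
    X[:, 0] = Z[:, 0] * E[:, 0]
    for l in range(1, n):
        # (s, l) * (l,) broadcasts column l of B over every sample
        best = np.max(X[:, :l] * B[:l, l], axis=1)
        X[:, l] = np.maximum(best, Z[:, l]) * E[:, l]
    return X

print(np.allclose(fill_loop(Z, B, E), fill_columns(Z, B, E)))
```

Here `np.maximum` is the element-wise maximum of two arrays, while `np.max` reduces one array along an axis.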
I have written the following code for creating a 2D array and filling the first element of each row. I am new to numpy. Is there a better way to do this?
y=np.zeros(N*T1).reshape(N,T1)
x = np.linspace(0,L,num = N)
for k in range(0,N):
    y[k][0] = np.sin(PI*x[k]/L)
Yes, since numpy vectorizes operations, you can just do:
y[:,0] = np.sin(np.pi * x / L)
Note that y[:,0] grabs the first column of y (the : in the first coordinate essentially means "grab all rows", and the 0 in the second coordinate means "from the column at index 0" (ie the first column)). Since np.sin(np.pi * x / L) is also an array, you can assign the latter to the former directly.
This question belongs more on Code Review Stack Exchange, but this snippet works!
import numpy as np
N = 1000 # arbitrary
T1 = 1000 # arbitrary
L = 10 # arbitrary
x = np.linspace(0,L,num = N)
# you don't need reshape here, give the size as a tuple!
y = np.zeros((N,T1))
# use a vectorized call here:
y[:,0] = np.sin(np.pi*x/L)
I started with this code to calculate a simple matrix multiplication. It runs with %timeit in around 7.85s on my machine.
To speed this up I tried Cython, which reduced the time to 0.4s. I also want to try the numba JIT compiler to see if I can get a similar speed-up (with less effort). But adding the @jit decorator appears to give exactly the same timings (~7.8s). I know it can't figure out the types of the calculate_z_numpy() call, but I'm not sure what I can do to coerce it. Any ideas?
from numba import jit
import numpy as np
@jit('f8(c8[:],c8[:],uint)')
def calculate_z_numpy(q, z, maxiter):
    """use vector operations to update all zs and qs to create new output array"""
    output = np.resize(np.array(0, dtype=np.int32), q.shape)
    for iteration in range(maxiter):
        z = z*z + q
        done = np.greater(abs(z), 2.0)
        q = np.where(done, 0+0j, q)
        z = np.where(done, 0+0j, z)
        output = np.where(done, iteration, output)
    return output
def calc_test():
    w = h = 1000
    maxiter = 1000
    # make a list of x and y values which will represent q
    # xx and yy are the co-ordinates, for the default configuration they'll look like:
    # if we have a 1000x1000 plot
    # xx = [-2.13, -2.1242,-2.1184000000000003, ..., 0.7526000000000064, 0.7584000000000064, 0.7642000000000064]
    # yy = [1.3, 1.2948, 1.2895999999999999, ..., -1.2844000000000058, -1.2896000000000059, -1.294800000000006]
    x1, x2, y1, y2 = -2.13, 0.77, -1.3, 1.3
    x_step = (float(x2 - x1) / float(w)) * 2
    y_step = (float(y1 - y2) / float(h)) * 2
    y = np.arange(y2,y1-y_step,y_step,dtype=np.complex)
    x = np.arange(x1,x2,x_step)
    q1 = np.empty(y.shape[0],dtype=np.complex)
    q1.real = x
    q1.imag = y
    # Transpose y
    x_y_square_matrix = x+y[:, np.newaxis] # it is np.complex128
    # convert square matrix to a flatted vector using ravel
    q2 = np.ravel(x_y_square_matrix)
    # create z as a 0+0j array of the same length as q
    # note that it defaults to reals (float64) unless told otherwise
    z = np.zeros(q2.shape, np.complex128)
    output = calculate_z_numpy(q2, z, maxiter)
    print(output)
calc_test()
I figured out how to do this with some help from someone else.
@jit('i4[:](c16[:],c16[:],i4,i4[:])',nopython=True)
def calculate_z_numpy(q, z, maxiter,output):
    """update all zs and qs in explicit loops, writing the results into output"""
    for iteration in range(maxiter):
        for i in range(len(z)):
            z[i] = z[i]*z[i] + q[i]
            # complex numbers cannot be ordered, so compare the magnitude
            if abs(z[i]) > 2.0:
                output[i] = iteration
                z[i] = 0+0j
                q[i] = 0+0j
    return output
What I learnt is to use numpy data structures as inputs (for typing), but to use C-like explicit loops inside the function.
This runs in 402ms, which is a touch faster than the Cython code (0.45s), so for the fairly minimal work of rewriting the loop explicitly we have a Python version that is (just) faster than C.
Is there a way to speed up a double loop that updates its values from the previous iteration?
In code:
def calc(N, m):
    x = 1.0
    y = 2.0
    container = np.zeros((N, 2))
    for i in range(N):
        for j in range(m):
            x=np.random.gamma(3,1.0/(y*y+4))
            y=np.random.normal(1.0/(x+1),1.0/np.sqrt(x+1))
        container[i, 0] = x
        container[i, 1] = y
    return container
calc(10, 5)
As you can see, the inner loop is updating variables x and y while the outer loop starts with a different value of x each time. I don't think this is vectorizable but maybe there are other possible improvements.
Thanks!
I don't think it's going to add up to any significant speed-up, but you can save some function calls if you generate all your gamma and normally distributed random values at once.
The gamma distribution has a scaling property: if you draw a value x from a gamma(k, 1) distribution, then c*x is a value drawn from a gamma(k, c) distribution, where c is the scale parameter. Similarly, with the normal distribution, you can take a value y drawn from a normal(0, 1) distribution and convert it into a value drawn from a normal(m, s) distribution by computing y*s + m. So you can rewrite your function as follows:
def calc(N, m):
    x = 1.0
    y = 2.0
    container = np.zeros((N, 2))
    nm = N*m
    gamma_vals = np.random.gamma(3, 1, size=(nm,))
    norm_vals = np.random.normal(0, 1, size=(nm,))
    for i in range(N):
        for j in range(m):
            ij = i*m + j  # flat index into the pre-generated draws
            x = gamma_vals[ij] / (y*y+4)
            y = norm_vals[ij]/np.sqrt(x+1) + 1/(x+1)
        container[i, 0] = x
        container[i, 1] = y
    return container
If the actual parameters of your distributions had a simpler expression, you may actually be able to use some elaborate form of np.cumprod or the like, and spare yourself the loops. I am not able to figure out a way of doing so...
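The two scaling properties used above can be checked numerically. This is a rough sketch that compares rescaled standard draws with direct draws through their sample means and standard deviations. The parameter values, sample size, and seed are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
k, c = 3.0, 0.5  # gamma shape and scale (arbitrary illustration values)
m, s = 1.5, 2.0  # normal mean and standard deviation (arbitrary)

# Rescale standard draws, and draw directly with the target parameters.
scaled_gamma = c * rng.gamma(k, 1.0, size=n)
direct_gamma = rng.gamma(k, c, size=n)
shifted_normal = s * rng.normal(0.0, 1.0, size=n) + m
direct_normal = rng.normal(m, s, size=n)

# Both gamma samples should have mean near k*c and variance near k*c**2.
print(scaled_gamma.mean(), direct_gamma.mean())
print(scaled_gamma.var(), direct_gamma.var())
# Both normal samples should have mean near m and standard deviation near s.
print(shifted_normal.mean(), direct_normal.mean())
print(shifted_normal.std(), direct_normal.std())
```

Keep in mind that the second argument of `np.random.normal` is the standard deviation, not the variance, which is why the rewritten loop divides by `np.sqrt(x+1)`.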
Does this work?
for i in xrange(N):
    # xrange is an iterator, range makes a new list.
    # You save linear space and `malloc`ing time by doing this
    x += m*y # a simple algebra hack. Compute this line of the loop just once instead of `m` times
    y -= m*x
    y *= -1 # another simple algebra hack. Compute this line of the loop just once instead of `m` times
    container[i,0] = x
    container[i,1] = y
return container