Optimization of loop for circular matrices operations

Optimization of loop for circular matrices operations - python

I have a vector X_k and a matrix Y_{k,j}, where k = (k,...,K) and j = (1, .. J). They are circular, that means that
X_{k+K} = X_k, Y_{k+K,j} = Y_{k,j} and Y_{k,j+J} = Y_{k,j}
I want to compute a Zx vector and Zy matrix according to the following expressions:
Zx = -X_{k-1}*(X_{K-2}-X_{K+1})-X_k
Zy = -Y_{k,j+1}*(Y_{k,j+2}-Y_{k,j-1})-Y_{k,j}+X_k
Currently, I'm doing this through a loop, where first I compute the edge cases (for X: k = 1, 2, K. For Y: j = 1, J, J-1). And for the others, I use the formula.
I'm wondering if this calculus can be vectorized. Here is the example code.
import numpy as np
np.random.seed(10)
K = 20
J = 10
# initial state (equilibrium)
x = np.random.uniform(size=K)
y = np.random.uniform(size=(K*J))
y = y.reshape(K,J)
# zy
zy = np.zeros((K*J))
zy = zy.reshape(K,J)
# Edge case of Y
for k in range(K):
zy[k,0] = -y[k,1]*(y[k,2]-y[k,J-1])-y[k,0]+ x[k]
zy[k,J-1] = -y[k,0]*(y[k,1]-y[k,J-2])-y[k,J-1]+ x[k]
zy[k,J-2] = -y[k,J-1]*(y[k,0]-y[k,J-3])-y[k,J-2]+ x[k]
# General Case of Y
for j in range(1,J-2):
zy[k,j] = -y[k,j+1]*(y[k,j+2]-y[k,j-1])-y[k,j]+ x[k]
# zx
zx = np.zeros(K)
# first the 3 edge cases: k = 1, 2, K
zx[0] = -x[K-1]*(-x[1] + x[K-2]) - x[0]
zx[1] = - x[0]*(-x[2] + x[K-1])- x[1]
zx[K-1] = -x[K-2]*(-x[0] + x[K-3]) - x[K-1]
# then the general case for X
for k in range(2, K-1)
zx[k] = -x[k-1]*(-x[k+1] + x[k-2]) - x[k]
print(zx)
print(zy)
I suspect that is possible to optimize with matrix operations but not sure if it is possible without the loops (at least for the edge cases).

From Code Review I get the correct answer. The way to vectorize this kind of operation is through np.roll().
For example, the Y variable would be:
zy = np.roll(y,-1,axis=1)*(np.roll(y,-2,axis=1)-np.roll(y,1,axis=1))- y +x[:,None]

Related

General minimal residual method with right-preconditioner of SSOR

I am trying to implement the algorithm of GMRES with right-preconditioner P for solving the linear system Ax = b . The code is running without error; however, it pops into unprecise result for me because the error I have is very large. For the GMRES method (without preconditioning matrix - remove P in the algorithm), the error I get is around 1e^{-12} and it converges with the same matrix.
import numpy as np
from scipy import sparse
import matplotlib.pyplot as plt
from scipy.linalg import norm as norm
import scipy.sparse as sp
from scipy.sparse import diags
"""The program is to split the matrix into D-diagonal; L: strictly lower matrix; U strictly upper matrix
satisfying: A = D - L - U """
def splitMat(A):
n,m = A.shape
if (n == m):
diagval = np.diag(A)
D = diags(diagval,0).toarray()
L = (-1)*np.tril(A,-1)
U = (-1)*np.triu(A,1)
else:
print("A needs to be a square matrix")
return (L,D,U)
"""Preconditioned Matrix for symmetric successive over-relaxation (SSOR): """
def P_SSOR(A,w):
## Split up matrix A:
L,D,U = splitMat(A)
Comp1 = (D - w*U)
Comp2 = (D - w*L)
Comp1inv = np.linalg.inv(Comp1)
Comp2inv = np.linalg.inv(Comp2)
P = w*(2-w)*np.matmul(Comp1inv, np.matmul(D,Comp2inv))
return P
"""GMRES_SSOR using right preconditioning P:
A - matrix of linear system Ax = b
x0 - initial guess
tol - tolerance
maxit - maximum iteration """
def myGMRES_SSOR(A,x0, b, tol, maxit):
matrixSize = A.shape[0]
e = np.zeros((maxit+1,1))
rr = 1
rstart = 2
X = x0
w = 1.9 ## in ssor
P = P_SSOR(A,w) ### preconditioned matrix
### Starting the GMRES ####
for rs in range(0,rstart+1):
### first check the residual:
if rr<tol:
break
else:
r0 = (b-A.dot(x0))
rho = norm(r0)
e[0] = rho
H = np.zeros((maxit+1,maxit))
Qcol = np.zeros((matrixSize, maxit+1))
Qcol[:,0:1] = r0/rho
for k in range(1, maxit+1):
### Arnodi procedure ##
Qcol[:,k] =np.matmul(np.matmul(A,P), Qcol[:,k-1]) ### This step applies P here:
for j in range(0,k):
H[j,k-1] = np.dot(np.transpose(Qcol[:,k]),Qcol[:,j])
Qcol[:,k] = Qcol[:,k] - (np.dot(H[j,k-1], Qcol[:,j]))
H[k,k-1] =norm(Qcol[:,k])
Qcol[:,k] = Qcol[:,k]/H[k,k-1]
### QR decomposition step ###
n = k
Q = np.zeros((n+1, n))
R = np.zeros((n, n))
R[0, 0] = norm(H[0:n+2, 0])
Q[:, 0] = H[0:n+1, 0] / R[0,0]
for j in range (0, n+1):
t = H[0:n+1, j-1]
for i in range (0, j-1):
R[i, j-1] = np.dot(Q[:, i], t)
t = t - np.dot(R[i, j-1], Q[:, i])
R[j-1, j-1] = norm(t)
Q[:, j-1] = t / R[j-1, j-1]
g = np.dot(np.transpose(Q), e[0:k+1])
Y = np.dot(np.linalg.inv(R), g)
Res= e[0:n] - np.dot(H[0:n, 0:n], Y[0:n])
rr = norm(Res)
#### second check on the residual ###
if rr < tol:
break
#### Updating the solution with the preconditioned matrix ####
X = X + np.matmul(np.matmul(P,Qcol[:, 0:k]), Y) ### This steps applies P here:
return X
######
A = np.random.rand(100,100)
x = np.random.rand(100,1)
b = np.matmul(A,x)
x0 = np.zeros((100,1))
maxit = 100
tol = 0.00001
x = myGMRES_SSOR(A,x0,b,tol,maxit)
res = b - np.matmul(A,x)
print(norm(res))
print("Solution with gmres\n", np.matmul(A,x))
print("---------------------------------------")
print("b matrix:", b)
I hope anyone could help me figure out this!!!

I'm not sure where you got you "Symmetric_successive_over-relaxation" SSOR code from, but it appears to be wrong. You also seem to be assuming that A is symmetric matrix, but in your random test case it is not.
Following SSOR's Wikipedia entry, I replaced your P_SSOR function with
def P_SSOR(A,w):
L,D,U = splitMat(A)
P = 2/(2-w) * (1/w*D+L)*np.linalg.inv(D)*(1/w*D+L).T
return P
and your test matrix with
A = np.random.rand(100,100)
A = A + A.T
and your code works up to a 12 digit residual error.

How can I code a matrix within a matrix using a loop?

So I have this 3x3 G matrix (not shown here, it's irrelevant to my problem) that I created using the two variables u (a vector, x - y) and the scalar k. x_j = (x_1 (j), x_2 (j), x_3 (j)) and y_j = (y_1 (j), y_2 (j), y_3 (j)). alpha_j is a 3x3 matrix. The A matrix is block diagonal matrix of size 3nx3n. I am having trouble with the W matrix. How do I code a matrix of size 3nx3n, where the (i,j)th block is the 3x3 matrix given by alpha_i*G_[ij]*alpha_j?? I am lost.
My alpha_j matrix also seems to be having some trouble. The loop keeps throwing me the error, "only length-1 arrays can be converted to Python scalars." pls help :/
def W(x, y, k, alpha, A):
u = x - y
n = x.shape[0]
W = np.zeros((3*n, 3*n))
for i in range(0, n-1):
for j in range(0, n-1):
#u = -np.array([[x[i,0] - x[j,0]], [x[i,1] - x[j,1]], [0]]) ??
W[i][j] = (alpha_j(alpha, A) * G(u, k) * alpha_j(alpha, A))
W[i][i] = np.zeros((n, n))
return W
def alpha_j(a, A):
alph = np.array([[0,0,0],[0,0,0],[0,0,0]],complex)
rho = np.random.rand(3,1)
for i in range(0, 2):
for j in range(0, 2):
alph[i][j] = (rho[i] * a * A[i][j])
return alph
#-------------------------------------------------------------------
x1 = np.array([[1], [2], [0]])
y1 = np.array([[4], [5], [0]])
# SYSTEM PARAMETERS
# incoming Wave angle
theta = 0 # can range from [0, 2pi)
# susceptibility
chi = 10 + 1j
# wavelength
lam = 0.5 # microns (values between .4-.7)
# frequency
k = (2 * np.pi)/lam # 1/microns
# volume
V_0 = (0.05)**3 # microns^3
# incoming wave vector
K = k * np.array([[0], [np.sin(theta)], [np.cos(theta)]])
# polarization vector
vecinc = np.array([[1], [0], [0]]) # (can choose any vector perpendicular to K)
# for the fixed alpha case
alpha = (V_0 * 3 * chi)/(chi + 3)
# 3 x 3 matrix
A = np.matlib.identity(3) # could be any symmetric matrix,
#-------------------------------------------------------------------
# TEST FUNCTIONS
test = G((x1-y1), k)
print(test)
w = W(x1, y1, k, alpha, A)
print(w)
Sometimes my W loops throws me the error, "can't set an array element with a sequence." But I need to set each array element in this arbitrary matrix W to the 3x3 matrix created by multiplying alpha by G...

To your question of how to create a new array with a block for each element, the following should do the trick:
G = np.random.random([3,3])
result = np.zeros([9,9])
num_blocks = 3
a = np.random.random([3,3])
b = np.random.random([3,3])
for i in range(G.shape[0]):
for j in range(G.shape[1]):
block_result = a*G[i,j]*b
for k in range(num_blocks):
for l in range(num_blocks):
result[3*i + k, 3*j + l] = block_result[i, j]
You should be able to generalize from there. I hope I've understood correctly.
EDIT: It looks like I haven't understood correctly. I'm leaving it in hopes it spurs you to an answer. The general idea is to generate ranges of indices to operate on, and then just operate on them directly. Slicing might be helpful, too.
Ah, you asked how to create a diagonal filled with blocks. In that case:
num_diagonal_blocks = 3 # for example
for block_dim in range(num_diagonal_blocks)
# do your block calculation...
for k in range(G.shape[0]):
for l in range(G.shape[1]):
result[3*block_dim + k, 3*block_dim + l] = # assign to element of block
I think that's nearly it.

how to do this interpolation formula faster?

I have a function of Everett interpolation and I'd like to make it a little bit faster than it is right now. It works very well b
x and y are the parameters: time and values.
xi is the time which I want to have interpolated value.
def everett(x,y,xi):
'''
function everett
INPUT:
x list float
y list float
xi float
RETURN:
yi float
'''
n = len(x) #interpolation degree
h = x[1]-x[0] #differences between epochs
D = np.zeros([n,n+1])
D[:,0] = x
D[:,1] = y
for j in range(1,n): #loop to each column
for i in range(0,n-j): #loop to cell within a column
D[i,j+1] = D[i+1,j] - D[i,j]
#Finding the value of u
for i in range(0,n):
u = ( xi - x[i] ) / h
if u == 0:
return y[i]
elif( u > 0 and u < 1.0 ):
break
if i == n-1:
return None
z = i
w = 1 - u
i = 0
yi = 0
m1 = u
m2 = w
for j in range(1,n+1,2):
yi += m1 * D[z+1-i,j] + m2 * D[z-i,j]
i = i + 1
m1 *= (u - i) * (u + i) / ((j+1)*(j+2))
m2 *= (w - i) * (w + i) / ((j+1)*(j+2))
if (z-i)<0 or (z+1-i)>(n-j-1):
break #//checks validity of index in the table
return yi
Thx!
EDIT: some modification using numpy
I change this part of code:
#Finding the value of u
for i in range(0,n):
u = ( xi - x[i] ) / h
if u == 0:
return y[i]
elif( u > 0 and u < 1.0 ):
break
if i == n-1:
return None
by this one:
#Finding the value of u
u = (xi - x) /h
u0 = np.where(u == 0)[0]
if u0.size:
return y[u0[0]]
i = np.where((u > 0) & (u < 1.0))[0]
if not i.size:
return None
z = i[0]
u = u[z]
the biggest problem I have right now is how to modify the last loop and the first loop where variable D is filled with values.
Any ideas?

Include numpy, put the data into numpy.array()s and use numpy operations. You'll simplify your code and get, potentially, orders of magnitude better performance. If you're comfortable with Matlab, you'll find numpy easy to learn.
Loops like
for i in range(0,n):
u = ( xi - x[i] ) / h
become simple one liners:
u = (xi - x) / h
(where x is an array, u will be an array and the - and / will do element-wise arithmetic if xi and h are numbers)
This even works for whole arrays. For example, a forward difference can be expressed in 1D as
Dx = X[1:] - x[:1]
The X[1:] means the elements of X excluding the first and X[:1] means the elements of X excluding the last.
You can do the same on N-dimensional arrays, eliminating nested loops.
I wrote this article a long time ago, but it's still relevant. You'll see where I use numpy to speed up a finite difference calculation on a mesh (solving the 2D diffusion equation) while also simplifying the code: http://www.timteatro.net/2010/10/29/performance-python-solving-the-2d-diffusion-equation-with-numpy/
If I get a chance, I will come back and help you work on your specific code. But actually, I think this algorithm is a perfect project to introduce yourself to numpy.
And, if you're more interested in the result than you are the method, SciPy (an extension of numpy) has interpolation functions:
http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

Using Horner's method to find P(x) and P'(x)

My textbook gives the pseudo code for Horner's method as follows:
P(x) = a_n x n + a_n−1 x n−1 + ··· + a_1 x + a_0 = (x−x0)Q(x) + b0
INPUT degree n; coefficients a_0 , a_1 , . . . , a_n ; x0 .
OUTPUT y = P(x 0 ); z = P (x 0 ).
Step 1 Set y = a_n ; (Compute b n for P.)
z = a_n . (Compute b n−1 for Q.)
Step 2 For j = n − 1, n − 2, . . . , 1
set y = x0 * y + a_j ; (Compute b_j for P.)
z = x0 * z + y. (Compute b_j−1 for Q.)
Step 3 Set y = x0 + y + a_0 .
Step 4 OUTPUT (y, z);
STOP.
Now the issue I have here is that the subscript in the pseudo code does not seem to match the subscripts in the formula
I did a python implementation, but the the answers I got seemed wrong, so I changed it slightly
def horner(x0, *a):
'''
Horner's method is an algorithm to calculate a polynomial at
f(x0) and f'(x0)
x0 - The value to avaluate
a - An array of the coefficients
The degree is the polynomial is set equal to the number of coefficients
'''
n = len(a)
y = a[0]
z = a[0]
for j in range(1, n):
y = x0 * y + a[j]
z = x0 * z + y
y = x0 * y + a[-1]
print('P(x0) =', y)
print('P\'(x0) =', z)
It works pretty well, but I need someone with more experience in this regard to give it a once over.
As a very basic test I took the polynomial 2x^4 with a value of -2 for x.
To use it in the method I call it as such
horner(-2, 2, 0, 0 ,0)
The output does indeed look correct for this very simple problem. Since f(x) = 2x^4 then f(-2) = -32, but this is where my implementation gives a different result my answer is positive

I found the problem and I am adding the answer here since it may prove useful to someone in the future
Firstly there is a but in the algorithm, the loop must go up to n-1
def horner(x0, *a):
'''
Horner's method is an algorithm to calculate a polynomial at
f(x0) and f'(x0)
x0 - The value to avaluate
a - An array of the coefficients
The degree is the polynomial is set equal to the number of coefficients
'''
n = len(a)
y = a[0]
z = a[0]
for j in range(1, n - 1):
y = x0 * y + a[j]
z = x0 * z + y
y = x0 * y + a[-1]
print('P(x0) =', y)
print('P\'(x0) =', z)
Second, and this was my biggest mistake, I am not passing enough coefficients to the method
This will calculate 2x^3
horner(-2, 2, 0, 0, 0)
I actually had to call
horner(-2, 2, 0, 0, 0, 0)

Runge Kutta approximation Bessel function, second order differential equation

I'm trying to approximate the Bessel function with Runge-Kutta. I'm using RK4 here cause I don't know what the equations are for RK2 for a system of ODEs.....Basically it's not working, or at least it's not working very well, I don't know why.
Anyway it's written in python. Here is code:
from __future__ import division
from pylab import*
#This literally just makes a matrix
def listoflists(rows,cols):
return [[0]*cols for i in range(rows)]
def f(x,Jm,Zm,m):
def rm(x):
return (m**2 - x**2)/(x**2)
def q(x):
return 1/x
return rm(x)*Jm-q(x)*Zm
n = 100 #No Idea what to set this to really
m = 1 #Bessel function order; computes up to m (i.e. 0,1,2,...m)
interval = [.01, 10] #Interval in x
dx = (interval[1]-interval[0])/n #Step size
x = zeros(n+1)
Z = listoflists(m+1,n+1) #Matrix: Rows are Function order, Columns are integration step (i.e. function value at xn)
J = listoflists(m+1,n+1)
x[0] = interval[0]
x[n] = interval[1]
#This reproduces all the Runge-Kutta relations if you read 'i' as 'm' and 'j' as 'n'
for i in range(m+1):
#Initial Conditions, i is m
if i == 0:
J[i][0] = 1
Z[i][0] = 0
if i == 1:
J[i][0] = 0
Z[i][0] = 1/2
#Generate each Bessel function, j is n
for j in range(n):
x[j] = x[0] + j*dx
K1 = Z[i][j]
L1 = f(x[j],J[i][j],Z[i][j],i)
K2 = Z[i][j] + L1/2
L2 = f(x[j] + dx/2, J[i][j]+K1/2,Z[i][j]+L1/2,i)
K3 = Z[i][j] + L2/2
L3 = f(x[j] +dx/2, J[i][j] + K2/2, Z[i][j] + L2/2,i)
K4 = Z[i][j] + L3
L4 = f(x[j]+dx,J[i][j]+K3, Z[i][j]+L3,i)
J[i][j+1] = J[i][j]+(dx/6)*(K1+2*K2+2*K3+K4)
Z[i][j+1] = Z[i][j]+(dx/6)*(L1+2*L2+2*L3+L4)
plot(x,J[0][:])
show()

(To close this old question with at least a partial answer.) The immediate error is that the RK4 method is incorrectly implemented. For instance instead of
K2 = Z[i][j] + L1/2
L2 = f(x[j] + dx/2, J[i][j]+K1/2, Z[i][j]+L1/2, i)
it should be
K2 = Z[i][j] + L1*dx/2
L2 = f(x[j] + dx/2, J[i][j]+K1*dx/2, Z[i][j]+L1*dx/2, i)
that is, the slopes have to be scaled by the step size to get the correct update for the variables.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Optimization of loop for circular matrices operations - python

From Code Review I get the correct answer. The way to vectorize this kind of operation is through np.roll(). For example, the Y variable would be: zy = np.roll(y,-1,axis=1)*(np.roll(y,-2,axis=1)-np.roll(y,1,axis=1))- y +x[:,None]

Related

General minimal residual method with right-preconditioner of SSOR

How can I code a matrix within a matrix using a loop?

how to do this interpolation formula faster?

Using Horner's method to find P(x) and P'(x)

Runge Kutta approximation Bessel function, second order differential equation

Categories

Resources