I'm trying to construct a (p+1,n) matrix with the code below:
import numpy as np

p = 3
xh = np.linspace(0.5, 1.5, p+1)
n = 7
x = np.linspace(0, 2, n)
M = np.zeros([p+1, n])

l2 = 1
for i in range(len(x)):
    for k in range(len(xh)):
        for j in range(len(xh)):
            if k != j:
                l = (x[i] - xh[j]) / (xh[k] - xh[j])
                l2 *= l
            elif k == j:
                l = 1
                l2 *= l
        M[k][i] = l2
        l2 = 1
print(M)
This method produces the matrix I want but is very slow (6 sec for p=40 and n=2000).
The matrix itself is a matrix of Lagrange polynomials, for approximating some function. The nodal points, xh, are the points used in forming/calculating the interpolation of a function; they have the property that the original function and the interpolant always take the same values there. The number of distinct nodal points (p+1) indicates the degree (p) of the polynomial used for the Lagrange interpolation. The x points are where a function is to be evaluated, which could be either the interpolant or the original function. This is the formula I'm following (the Lagrange basis polynomials, with $x_j$ the nodal points xh and $M_{k,i} = \ell_k(x_i)$):
$\ell_k(x) = \prod_{j \ne k} \frac{x - x_j}{x_k - x_j}$
I don't know a faster way to construct such a matrix in NumPy; other methods keep going wrong when I apply them to the code I've got, and I don't know enough to see why. What faster method can I use here?
Your code can be compiled nicely by decorating a function with @nb.njit from the numba package. Some minor redundant parts were removed.
import numpy as np
import numba as nb

@nb.njit
def test(p, n):
    xh = np.linspace(0.5, 1.5, p+1)
    x = np.linspace(0, 2, n)
    M = np.zeros((p+1, n), dtype=nb.float64)
    l2 = 1
    for k in range(len(x)):
        for i in range(len(xh)):
            for j in range(len(xh)):
                if i != j:
                    l = (x[k] - xh[j]) / (xh[i] - xh[j])
                else:
                    l = 1
                l2 *= l
            M[i][k] = l2
            l2 = 1
    return M
Benchmark for p=40, n=2000 on a 2-core colab instance. Array M was computed with your original code.
a = [0]
%timeit a[0] = test(40,2000)
np.testing.assert_allclose(M, a[0])
Runs in 5.57 ms per loop vs 2.24 s per loop or ~402x speed up.
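If you want to stay in pure NumPy instead of depending on numba, the same matrix can also be built with broadcasting rather than explicit loops. This is only a sketch (the function name lagrange_matrix and the intermediate arrays are mine, not from the original code); it reproduces the loop version by defining the j == k factor as 1:

import numpy as np

def lagrange_matrix(p, n):
    # Vectorized construction of the (p+1, n) matrix of Lagrange basis values
    xh = np.linspace(0.5, 1.5, p + 1)
    x = np.linspace(0, 2, n)
    num = x[None, :] - xh[:, None]              # num[j, i] = x[i] - xh[j]
    den = xh[:, None] - xh[None, :]             # den[k, j] = xh[k] - xh[j]
    np.fill_diagonal(den, 1.0)                  # placeholder to avoid division by zero at j == k
    ratio = num[None, :, :] / den[:, :, None]   # ratio[k, j, i] = (x[i]-xh[j])/(xh[k]-xh[j])
    k = np.arange(p + 1)
    ratio[k, k, :] = 1.0                        # the j == k factor is defined to be 1
    return ratio.prod(axis=1)                   # product over j gives M[k, i]

Note that this builds a (p+1, p+1, n) intermediate array (roughly 27 MB for p=40, n=2000), so for much larger problems the numba version above may still be preferable.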
I'm creating a non-linear response to a series of random values from {-1, +1} using a simple second-order Volterra kernel, i.e. (as implemented in the code below)
$r(k) = \sum_{1 \le i \le j \le M} w_{ij}\, a(k-i)\, a(k-j)$
With a zero mean for the a(k) values I would expect r(k) to have a zero mean as well, for arbitrary w values. However, I get r(k) with an always-positive mean, while the mean of a(k) behaves as expected: it is close to zero and changes sign from run to run.
Why don't I get similar behavior for r(k)? Is it because the a(k) are pseudo-random and two different values from a are not actually independent?
Here is the code I use:
import numpy as np
import matplotlib.pyplot as plt
import itertools

# array of random values {-1, 1}
A = np.random.randint(2, size=10000)
A = [x*2 - 1 for x in A]

# array of random weights
M = 3
w = np.random.rand(int(M*(M+1)/2))

# non-linear response to random values
R = []
for i in range(M, len(A)):
    vals = np.asarray([np.prod(x) for x in itertools.combinations_with_replacement(A[i-M:i], 2)])
    R.append(np.dot(vals, w))

print(np.mean(A), np.var(A))
print(np.mean(R), np.var(R))
Edit:
A check on whether the quadratic form employed by the kernel is positive definite fails (i.e. there are negative principal minors). The code for the check:
import scipy.linalg as lin

wm = np.zeros((M, M))
w_index = 0

# check Sylvester's criterion
# reconstruct weights for quadratic form
for r in range(0, M):
    for c in range(r, M):
        wm[r, c] += w[w_index]/2
        wm[c, r] += w[w_index]/2
        w_index += 1

# check principal minors
for r in range(0, M):
    if lin.det(wm[:r+1, :r+1]) < 0:
        print('found negative principal minor of order', r)
I'm not certain whether this is the case for Volterra kernels, but many kernels are positive definite, and some kernels, such as covariance functions, do not admit values less than zero (e.g. the Squared Exponential/RBF, Rational Quadratic, and Matérn kernels).
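As an aside, an equivalent and perhaps simpler check than the principal-minor loop above is to look at the eigenvalues of the symmetric weight matrix. This is only a sketch, reusing the wm matrix built in the question's Sylvester check:

import numpy as np

# wm is the symmetric weight matrix of the quadratic form (from the question's check)
eigvals = np.linalg.eigvalsh(wm)
print("positive definite:", np.all(eigvals > 0))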
If this is not the case for the Volterra kernel, you can also try seeding the RNG differently to check whether the behavior persists. Here is a looped version of your code that iterates over different random seeds:
import numpy as np
import matplotlib.pyplot as plt
import itertools

# Loop over random seeds
for seed in range(10):
    # Seed the RNG
    np.random.seed(seed)

    # array of random values {-1, 1}
    A = np.random.randint(2, size=10000)
    A = [x*2 - 1 for x in A]

    # array of random weights
    M = 3
    w = np.random.rand(int(M*(M+1)/2))

    # non-linear response to random values
    R = []
    for i in range(M, len(A)):
        vals = np.asarray([np.prod(x) for x in itertools.combinations_with_replacement(A[i-M:i], 2)])
        R.append(np.dot(vals, w))

    # Convert R to a numpy array to allow boolean slicing
    R = np.array(R)

    print("A: ", np.mean(A), np.var(A))
    print("R <= 0: ", R[R <= 0])
    print("R: ", np.mean(R), np.var(R))
Running this, I get the following values:
A: 0.017 0.9997109999999997
R <= 0: []
R: 1.487637375177384 0.14880206863520892
A: -0.0012 0.9999985600000002
R <= 0: []
R: 2.28108226352669 0.5926651729251319
A: 0.0104 0.9998918400000001
R <= 0: []
R: 1.6138015284426408 0.9526360372883802
A: -0.0064 0.9999590399999999
R <= 0: []
R: 0.988332642595828 0.9650456000380685
A: 0.0026 0.9999932399999998
R <= 0: [-0.75835076 -0.75835076 -0.75835076 ... -0.75835076 -0.75835076
-0.75835076]
R: 0.7352258581171865 1.2668744674748733
A: -0.0048 0.9999769599999996
R <= 0: [-0.02201476 -0.29894937 -0.29894937 ... -0.02201476 -0.29894937
-0.02201476]
R: 0.7396699663779303 1.3844391355510492
A: -0.0012 0.9999985600000002
R <= 0: []
R: 2.4343947709617475 1.6377776468054106
A: -0.0052 0.99997296
R <= 0: []
R: 0.8778918601676095 0.07656607914368625
A: 0.0086 0.99992604
R <= 0: []
R: 2.3490174001719937 0.059871902764070624
A: 0.0046 0.9999788399999996
R <= 0: []
R: 1.7699147798471178 1.8049209966313247
So as you can see, R still has some negative values. My guess is that this occurs because your kernel is positive definite.
This question ended up being about math, and not programming. Nevertheless, this is my own answer.
Simply put, when the indices of a(k-i) are equal, the variables in the resulting product are not independent (because they are equal). Such a product does not have a zero mean, hence the mean value of the whole expression is shifted into the positive range.
Formally, the implemented function is a quadratic form, whose mean value is given by
$E\left[A^\top W A\right] = \operatorname{tr}(W\Sigma) + \mu^\top W \mu,$
where $\mu$ and $\Sigma$ are the vector of expected values and the covariance matrix of the vector A, respectively.
With a zero vector $\mu$, only the trace term remains. The resulting estimate can be computed with the following code, and it indeed gives values close to the statistical results in the question.
# Estimate R mean
# sum the weights on the main diagonal of the quadratic form (matrix trace)
w_sum = 0
w_index = 0
for r in range(0, M):
    for c in range(r, M):
        if r == c:
            w_sum += w[w_index]
        w_index += 1

Rmean_est = np.var(A) * w_sum
print(Rmean_est)
This estimate relies on the assumption that elements of a with different indices are independent. Any implicit dependency due to the nature of the pseudo-random generator, if present, probably changes the resulting estimate only slightly.
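For what it's worth, if the symmetric weight matrix wm from the Sylvester check above has already been built, the same estimate is just the trace term $\operatorname{tr}(W\Sigma)$ with $\Sigma = \mathrm{var}(a)\,I$. A one-line sketch, reusing wm and A from the snippets above:

# trace(W * Sigma) with Sigma = var(A) * I for independent a(k)
Rmean_est = np.var(A) * np.trace(wm)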
I'm trying to write a program that can solve the general regression formula:
So I'm trying to implement this matrix equation. Is there any way to do this such that the user decides how big it can be, without me adding more and more if conditions (i.e. one piece of code that collapses to whatever size of matrix the user asks for)?
Code:
#Solving the general matrix for the coefficients
if 3 == n:
    a = np.array([[np.sum(np.multiply(FL[1],FL[1])), np.sum(np.multiply(FL[1],FL[2]))],
                  [np.sum(np.multiply(FL[1],FL[2])), np.sum(np.multiply(FL[2],FL[2]))]])
    b = np.array([np.sum(np.multiply(FL[0],FL[1])), np.sum(np.multiply(FL[0],FL[2]))])
    x = np.linalg.solve(a, b)
if 4 == n:
    a = np.array([[np.sum(np.multiply(FL[1],FL[1])), np.sum(np.multiply(FL[1],FL[2])), np.sum(np.multiply(FL[1],FL[3]))],
                  [np.sum(np.multiply(FL[1],FL[2])), np.sum(np.multiply(FL[2],FL[2])), np.sum(np.multiply(FL[2],FL[3]))],
                  [np.sum(np.multiply(FL[1],FL[3])), np.sum(np.multiply(FL[2],FL[3])), np.sum(np.multiply(FL[3],FL[3]))]])
    b = np.array([np.sum(np.multiply(FL[0],FL[1])), np.sum(np.multiply(FL[0],FL[2])), np.sum(np.multiply(FL[0],FL[3]))])
    x = np.linalg.solve(a, b)
In this code Phi_0 corresponds to FL[1] (the basis functions are stored starting at index i = 1) and FL[0] corresponds to y.
You can make the algorithm independent of the order of the polynomial. The easiest way is to use for loops, although these will be slow (since they don't exploit NumPy's vectorization).
Here is a reproducible example with random data:
import numpy as np

# Order of polynomial
n = 5

# Random seed for reproducibility
np.random.seed(1)

# Input arrays
phi = np.random.random((100, n))
y = np.random.random(100)

# Output arrays
a = np.zeros((n, n))
b = np.zeros(n)

for i in range(n):
    b[i] = np.sum(y * phi[:, i])
    for j in range(i, n):
        # Exploit that the matrix is symmetric
        a[i, j] = a[j, i] = np.sum(phi[:, i] * phi[:, j])

# Coefficients array
x = np.linalg.solve(a, b)
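If the loops become a bottleneck, the same system can be assembled entirely with matrix products. A short sketch using the phi and y arrays defined above (the variable names follow the snippet, not the original answer):

# Normal equations assembled with matrix products:
# a[i, j] = sum_k phi[k, i] * phi[k, j]   and   b[i] = sum_k y[k] * phi[k, i]
a = phi.T @ phi
b = phi.T @ y
x = np.linalg.solve(a, b)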
In Python, I have a matrix K of dimensions (N x N). I want to normalize K by dividing every entry K_ij by sqrt(K_(i,i)*K_(j,j)). What is a fast way to achieve this in Python without iterating through every entry?
My current solution is:
import numpy as np

K = np.random.rand(3, 3)
diag = np.diag(K)
for i in range(np.shape(K)[0]):
    for j in range(np.shape(K)[1]):
        K[i, j] = K[i, j] / np.sqrt(diag[i] * diag[j])
Of course you have to iterate through every entry, at least internally. For square matrices:
K / np.sqrt(np.einsum('ii,jj->ij', K, K))
If the matrix is not square, you first have to define what should replace the "missing" values K[i,i] where i > j etc.
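An equivalent formulation, in case einsum feels opaque, uses an outer product of the diagonal with itself (a sketch, reusing K from the question):

# same result as the einsum expression: divide each K[i, j] by sqrt(K[i, i] * K[j, j])
d = np.diag(K)
K_normalized = K / np.sqrt(np.outer(d, d))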
Alternative: use numba to leave your loop as is, get free speedup, and even avoid intermediate allocation:
from numba import njit

@njit
def normalize(K):
    M = np.empty_like(K)
    m, n = K.shape
    for i in range(m):
        Kii = K[i, i]
        for j in range(n):
            Kjj = K[j, j]
            M[i, j] = K[i, j] / np.sqrt(Kii * Kjj)
    return M
I am trying to solve the following problem via a Finite Difference Approximation in Python using NumPy:
$u_t = k \, u_{xx}$, on $0 < x < L$ and $t > 0$;
$u(0,t) = u(L,t) = 0$;
$u(x,0) = f(x)$.
I take $u(x,0) = f(x) = x^2$ for my problem.
Programming is not my forte so I need help with the implementation of my code. Here is my code (I'm sorry it is a bit messy, but not too bad I hope):
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt

# definition of initial condition function
def f(x):
    return x^2

# parameters
L = 1
T = 10
N = 10
M = 100
s = 0.25

# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N
#x = np.zeros(N+1)
x = np.arange(x_init, x_end, dx)
x[0] = x_init

# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M
#t = np.zeros(M+1)
t = np.arange(t_init, t_end, dt)
t[0] = t_init

# Boundary Conditions
for m in xrange(0, M):
    t[m] = m * dt

# Initial Conditions
for j in xrange(0, N):
    x[j] = j * dx

# definition of solution to u_t = k * u_xx
u = np.zeros((N+1, M+1)) # NxM array to store values of the solution

# finite difference scheme
for j in xrange(0, N-1):
    u[j][0] = x**2 #initial condition

for m in xrange(0, M):
    for j in xrange(1, N-1):
        if j == 1:
            u[j-1][m] = 0 # Boundary condition
        else:
            u[j][m+1] = u[j][m] + s * ( u[j+1][m] - #FDM scheme
                2 * u[j][m] + u[j-1][m] )
    else:
        if j == N-1:
            u[j+1][m] = 0 # Boundary Condition

print u, t, x
#plt.plot(t, u)
#plt.show()
So the first issue I am having is I am trying to create an array/matrix to store values for the solution. I wanted it to be an NxM matrix, but in my code I made the matrix (N+1)x(M+1) because I kept getting an error that the index was going out of bounds. Anyways how can I make such a matrix using numpy.array so as not to needlessly take up memory by creating a (N+1)x(M+1) matrix filled with zeros?
Second, how can I "access" such an array? The real solution u(x,t) is approximated by u(x[j], t[m]) were j is the jth spatial value, and m is the mth time value. The finite difference scheme is given by:
u(x[j],t[m+1]) = u(x[j],t[m]) + s * ( u(x[j+1],t[m]) - 2 * u(x[j],t[m]) + u(x[j-1],t[m]) )
(See here for the formulation)
I want to be able to implement the Initial Condition u(x[j],t[0]) = x**2 for all values of j = 0,...,N-1. I also need to implement the Boundary Conditions u(x[0],t[m]) = 0 = u(x[N],t[m]) for all values of m = 0,...,M. Is the nested loop I created the best way to do this? Originally I tried implementing the I.C. and B.C. inside the two separate for loops that I used to calculate the values of the arrays x and t (in my code I still have comments placed where I tried to do this).
I think I am just not using the right notation, but I cannot find anywhere in the NumPy documentation how to "call" such an array so as to iterate through each value in the proposed scheme. Can anyone shed some light on what I am doing wrong?
Any help is very greatly appreciated. This is not homework but rather to understand how to program FDM for Heat Equation because later I will use similar methods to solve the Black-Scholes PDE.
EDIT: So when I run my code on line 60 (the last "else" that I use) I get an error that says invalid syntax, and on line 51 (u[j][0] = x**2 #initial condition) I get an error that reads "setting an array element with a sequence." What does that mean?
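For reference, the update rule quoted above maps directly onto NumPy slicing. This is only a minimal sketch, not the questioner's code; it assumes s = k*dt/dx**2 and the zero boundary conditions stated in the problem:

import numpy as np

N, M = 10, 100                      # number of space and time steps (values from the question)
L, s = 1.0, 0.25
x = np.linspace(0.0, L, N + 1)

u = np.zeros((N + 1, M + 1))        # u[j, m] approximates u(x[j], t[m])
u[:, 0] = x**2                      # initial condition u(x, 0) = f(x) = x^2
u[0, :] = u[N, :] = 0.0             # boundary conditions u(0, t) = u(L, t) = 0

for m in range(M):                  # march forward in time
    u[1:N, m+1] = u[1:N, m] + s * (u[2:, m] - 2*u[1:N, m] + u[:N-1, m])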
I have a system of equations in the form of A*x = B where [A] is a tridiagonal coefficient matrix. Using the Numpy solver numpy.linalg.solve I can solve the system of equations for x.
See the example below of how I build the tridiagonal [A] matrix and the {B} vector, and then solve for x:
# Solve system of equations with a tridiagonal coefficient matrix
# uses numpy.linalg.solve

# use Python 3 print function
from __future__ import print_function
from __future__ import division

# modules
import numpy as np
import time

ti = time.clock()

#---- Build [A] array and {B} column vector

m = 1000  # size of array, make this 8000 to see time benefits

A = np.zeros((m, m))  # pre-allocate [A] array
B = np.zeros((m, 1))  # pre-allocate {B} column vector

A[0, 0] = 1
A[0, 1] = 2
B[0, 0] = 1

for i in range(1, m-1):
    A[i, i-1] = 7   # node-1
    A[i, i] = 8     # node
    A[i, i+1] = 9   # node+1
    B[i, 0] = 2

A[m-1, m-2] = 3
A[m-1, m-1] = 4
B[m-1, 0] = 3

print('A \n', A)
print('B \n', B)

#---- Solve using numpy.linalg.solve

x = np.linalg.solve(A, B)  # solve A*x = B for x

print('x \n', x)

#---- Elapsed time for each approach

print('NUMPY time', time.clock()-ti, 'seconds')
So my question relates to two sections of the above example:
Since I am dealing with a tridiagonal matrix for [A], also called a banded matrix, is there a more efficient way to solve the system of equations instead of using numpy.linalg.solve?
Also, is there a better way to create the tridiagonal matrix instead of using a for-loop?
The above example runs on Linux in about 0.08 seconds according to the time.clock() function.
The numpy.linalg.solve function works fine, but I'm trying to find an approach that takes advantage of the tridiagonal form of [A] in hopes of speeding up the solution even further and then apply that approach to a more complicated example.
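For reference, the [A] matrix and {B} vector above can also be assembled without the explicit Python loop, for example with scipy.sparse.diags. This is only a sketch reproducing the values from the question, not part of the original post:

import numpy as np
from scipy.sparse import diags

m = 1000
# interior bands 7, 8, 9 with the modified first and last rows from the question
lower = np.full(m - 1, 7.0)
lower[-1] = 3.0
main = np.full(m, 8.0)
main[0] = 1.0
main[-1] = 4.0
upper = np.full(m - 1, 9.0)
upper[0] = 2.0
A = diags([lower, main, upper], offsets=[-1, 0, 1]).toarray()

B = np.full(m, 2.0)
B[0] = 1.0
B[-1] = 3.0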
There are two immediate performance improvements: (1) do not use a loop to build the matrix, and (2) use scipy.linalg.solve_banded().
I would write the code more like this:
import numpy as np
import scipy.linalg as la

m = 1000  # size of the system, as in the question

# Create arrays and set values
ab = np.zeros((3, m))
b = 2 * np.ones(m)
ab[0] = 9   # upper diagonal
ab[1] = 8   # main diagonal
ab[2] = 7   # lower diagonal

# Fix end points
ab[0, 1] = 2
ab[1, 0] = 1
ab[1, -1] = 4
ab[2, -2] = 3
b[0] = 1
b[-1] = 3

x = la.solve_banded((1, 1), ab, b)
There may be more elegant ways to construct the matrix, but this works.
Using %timeit in IPython, the original code took 112 ms for m=1000. This code takes 2.94 ms for m=10,000, an order-of-magnitude larger problem, yet it is still almost two orders of magnitude faster! I did not have the patience to wait on the original code for m=10,000. Most of the time in the original may be spent constructing the array; I did not test this. Regardless, for large arrays it is much more efficient to store only the non-zero values of the matrix.
There is a scipy.sparse matrix type called scipy.sparse.dia_matrix which captures the structure of your matrix well (it will store 3 arrays, in "positions" 0 (diagonal), 1 (above) and -1 (below)). Using this type of matrix you can try scipy.sparse.linalg.lsqr for solving. If your problem has an exact solution, it will be found, otherwise it will find the solution in least squares sense.
from scipy import sparse
from scipy.sparse import linalg as spla

A_sparse = sparse.dia_matrix(A)
ret_values = spla.lsqr(A_sparse, B.ravel())  # B is the right-hand side vector from the question
x = ret_values[0]
However, this may not be completely optimal in terms of exploiting the tridiagonal structure; there may be a theoretical way of making this faster. What this conversion does do for you is cut the matrix multiplication expenses down to the essential: only the 3 bands are used. This, in combination with the iterative solver lsqr, should already yield a speedup.
Note: I am not proposing scipy.sparse.linalg.spsolve, because it converts your matrix to csr format. However, replacing lsqr with spsolve is worth a try, especially because spsolve can bind UMFPACK, see relevant doc on spsolve. Also, it may be of interest to take a look at this stackoverflow question and answer relating to UMFPACK
You could use scipy.linalg.solveh_banded.
EDIT: You CANNOT use the above, as your matrix is not symmetric and I thought it was. However, as was mentioned above in the comments, the Thomas algorithm is great for this:
a = [7] * (m - 2) + [3]
b = [1] + [8] * (m - 2) + [4]
c = [2] + [9] * (m - 2)
d = [1] + [2] * (m - 2) + [3]

# This is taken directly from the Wikipedia page also cited above
# this overwrites b and d
def TDMASolve(a, b, c, d):
    n = len(d)  # n is the number of rows; a and c have length n-1
    # forward elimination
    for i in range(n-1):
        d[i+1] -= 1. * d[i] * a[i] / b[i]
        b[i+1] -= 1. * c[i] * a[i] / b[i]
    # back substitution
    for i in reversed(range(n-1)):
        d[i] -= d[i+1] * c[i] / b[i+1]
    return [d[i] / b[i] for i in range(n)]
This code is not optimized, nor does it use np, but if I (or any of the other fine folks here) have time, I will edit it so that it does. It currently times at ~10 ms for m=10000.
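A quick sanity check against numpy.linalg.solve (not from the original answer), using the same matrix structure as the question with a small m; note that TDMASolve modifies b and d in place, so copies are passed:

import numpy as np

m = 6
a = [7] * (m - 2) + [3]
b = [1] + [8] * (m - 2) + [4]
c = [2] + [9] * (m - 2)
d = [1] + [2] * (m - 2) + [3]

# dense version of the same tridiagonal system
A = np.diag(b) + np.diag(c, 1) + np.diag(a, -1)
x_ref = np.linalg.solve(A, d)

x = TDMASolve(list(a), list(b), list(c), list(d))
print(np.allclose(x, x_ref))  # expected: True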
This will probably help.
There is a function create_tridiagonal which creates a tridiagonal matrix, and another function, diagonal_form, which converts a matrix into the diagonal ordered form required by SciPy's solve_banded function.
import numpy as np

def lu_decomp3(a):
    """
    c,d,e = lu_decomp3(a).
    LU decomposition of tridiagonal matrix a = [c\d\e]. On output
    {c},{d} and {e} are the diagonals of the decomposed matrix a.
    """
    n = np.diagonal(a).size
    assert(np.all(a.shape == (n, n)))  # check if square matrix

    d = np.copy(np.diagonal(a))  # without copy (assignment destination is read-only) error is raised
    e = np.copy(np.diagonal(a, 1))
    c = np.copy(np.diagonal(a, -1))

    for k in range(1, n):
        lam = c[k-1] / d[k-1]
        d[k] = d[k] - lam * e[k-1]
        c[k-1] = lam
    return c, d, e

def lu_solve3(c, d, e, b):
    """
    x = lu_solve(c,d,e,b).
    Solves [c\d\e]{x} = {b}, where {c}, {d} and {e} are the
    vectors returned from lu_decomp3.
    """
    n = len(d)
    y = np.zeros_like(b)

    y[0] = b[0]
    for k in range(1, n):
        y[k] = b[k] - c[k-1] * y[k-1]

    x = np.zeros_like(b)
    x[n-1] = y[n-1] / d[n-1]  # there is no x[n] out of range
    for k in range(n-2, -1, -1):
        x[k] = (y[k] - e[k] * x[k+1]) / d[k]
    return x

from scipy.sparse import diags

def create_tridiagonal(size=4):
    diag = np.random.randn(size) * 100
    diag_pos1 = np.random.randn(size-1) * 10
    diag_neg1 = np.random.randn(size-1) * 10

    a = diags([diag_neg1, diag, diag_pos1], offsets=[-1, 0, 1], shape=(size, size)).todense()
    return a

a = create_tridiagonal(4)
b = np.random.randn(4) * 10

print('matrix a is\n = {} \n\n and vector b is \n {}'.format(a, b))

c, d, e = lu_decomp3(a)
x = lu_solve3(c, d, e, b)

print("x from our function is {}".format(x))
print("check is answer correct ({})".format(np.allclose(np.dot(a, x), b)))
## Test Scipy
from scipy.linalg import solve_banded

def diagonal_form(a, upper=1, lower=1):
    """
    a is a numpy square matrix
    this function converts a square matrix to diagonal ordered form
    returned matrix in ab shape which can be used directly for scipy.linalg.solve_banded
    """
    n = a.shape[1]
    assert(np.all(a.shape == (n, n)))

    ab = np.zeros((2*n - 1, n))

    for i in range(n):
        ab[i, (n-1)-i:] = np.diagonal(a, (n-1)-i)

    for i in range(n-1):
        ab[(2*n-2)-i, :i+1] = np.diagonal(a, i-(n-1))

    mid_row_inx = int(ab.shape[0] / 2)
    upper_rows = [mid_row_inx - i for i in range(1, upper+1)]
    upper_rows.reverse()
    upper_rows.append(mid_row_inx)
    lower_rows = [mid_row_inx + i for i in range(1, lower+1)]
    keep_rows = upper_rows + lower_rows
    ab = ab[keep_rows, :]

    return ab

ab = diagonal_form(a, upper=1, lower=1)  # for a tridiagonal matrix, upper and lower = 1
x_sp = solve_banded((1, 1), ab, b)
print("is our answer the same as scipy answer ({})".format(np.allclose(x, x_sp)))