I have defined the following function:
def laplacian_2D_array(func_2D):
    func_2nd_derv_x_fin2D = np.zeros((N, N))
    for j in range(0, N):
        func_CD2_x_list = []
        for i in range(0, N):
            value = func_2D[i][j]  # func_2D is an N x N matrix
            func_CD2_x_list.append(value)
        func_CD2_x_array = np.array(func_CD2_x_list)[np.newaxis]
        func_2nd_derv_x = matrix_R @ (np.transpose(func_CD2_x_array))
        func_2nd_derv_x_fin2D[j] = np.transpose(np.reshape(func_2nd_derv_x, [N, ]))
    func_2nd_derv_y_fin2D = np.zeros((N, N))
    for i in range(0, N):
        func_CD2_y_list = []
        for j in range(0, N):
            value = func_2D[i][j]
            func_CD2_y_list.append(value)
        func_CD2_y_array = np.array(func_CD2_y_list)[np.newaxis]
        func_2nd_derv_y = matrix_S @ (np.transpose(func_CD2_y_array))
        func_2nd_derv_y_fin2D[i] = np.reshape(func_2nd_derv_y, [N, ])
    return np.add(func_2nd_derv_x_fin2D, func_2nd_derv_y_fin2D)
In the above code, for each j, the jth column of the 2D matrix func_2D is extracted as a column vector, multiplied by matrix_R, and the result is stored in the jth row of func_2nd_derv_x_fin2D; the second block of code does the same along the other axis with matrix_S. Finally, the function returns the sum of func_2nd_derv_x_fin2D and func_2nd_derv_y_fin2D.
Here N = 401, and matrix_S and matrix_R are also N x N matrices. This function is called multiple times inside a while loop, and a single iteration takes a long time to execute. I have tried @njit to make it faster, but I was not successful and got errors. I have also tried using cache.
How can I optimise the use of lists and arrays here, and what are other ways to optimise the defined function?
Below is the code where the function is used:
while (time < timemax):
    # Analytical solution --------------------------------------------------
    exact_time = time/a_sec
    omega_t = np.zeros((N, N))
    psi_t = np.zeros((N, N))
    for i in range(0, N):
        for j in range(0, N):
            psi_t[i][j] = np.sin(x_list[i]) * np.sin(y_list[j]) * np.exp((-2*exact_time)/Re)
            omega_t[i][j] = 2*np.sin(x_list[i]) * np.sin(y_list[j]) * np.exp((-2*exact_time)/Re)
    ...
    # BiCGSTAB algorithm
    x0 = psi_0  # initial guess --> psi of previous time step
    r0 = omega_0 - laplacian_2D_array(x0)  # r0 = b - A x0
    r0_hat = r0
    rho_0 = 1
    alpha = 1
    w0 = 1
    v0 = np.zeros((N, N))
    P_0 = np.zeros((N, N))
    tol = 10 ** (-7)
    iteration = 0
    while ((np.max(np.abs(laplacian_2D_array(x0) - omega_0))) > tol):  # iterate while the residual exceeds the tolerance
        rho_prev = rho_0
        rho = np.dot(np.reshape(r0_hat, (N**2,)), np.reshape(r0, (N**2,)))
        beta = (rho/rho_prev) * (alpha/w0)
        P_0 = r0 + beta * (P_0 - w0 * v0)
        v0 = laplacian_2D_array(P_0)
        alpha = rho / np.dot(np.reshape(r0_hat, (N**2,)), np.reshape(v0, (N**2,)))
        s = r0 - alpha * v0
        t = laplacian_2D_array(s)
Any suggestions to speed up the code are highly appreciated.
It turns out the complicated code of laplacian_2D_array can be simplified to the following implementation:
def laplacian_2D_array(func_2D):
    return (matrix_R @ func_2D + matrix_S @ func_2D.T).T
This is 63 times faster on my machine on random matrices based on your provided inputs. Most of the time is then spent in the matrix multiplications, which NumPy performs very efficiently (and in parallel, provided NumPy is linked against a well-configured BLAS on the target platform).
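As a quick sanity check, the two implementations can be compared on random inputs. A minimal sketch (the random matrix_R and matrix_S here are stand-ins for the actual operator matrices):

import numpy as np

N = 401
rng = np.random.default_rng(0)
matrix_R = rng.standard_normal((N, N))  # stand-in for the real operator
matrix_S = rng.standard_normal((N, N))  # stand-in for the real operator
func_2D = rng.standard_normal((N, N))

def laplacian_fast(func_2D):
    return (matrix_R @ func_2D + matrix_S @ func_2D.T).T

# agrees with the loop-based laplacian_2D_array defined above
assert np.allclose(laplacian_fast(func_2D), laplacian_2D_array(func_2D))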
I'm currently running into a problem solving this.
The objective of the given exercise is to find a polynomial of a certain degree (the degree is given) from a dataset of points (which can be noisy), and to best fit it using the least squares method.
I don't understand the steps that lead to the linear equations to be solved.
What are the steps, or could anyone provide a Python program, that lead to the matrix that I put as an argument in my decomposition program?
Note: I have Python programs for cubic splines and LU decomposition/Gaussian elimination.
Thanks.
I tried to apply Gaussian/LU decomposition straight away on the dataset, but I understand there are more steps to the solution...
I don't understand how cubic splines add to the mix either...
Edit:
Gaussian elimination:
import numpy as np
import math

def swapRows(v, i, j):
    if len(v.shape) == 1:
        v[i], v[j] = v[j], v[i]
    else:
        v[[i, j], :] = v[[j, i], :]

def swapCols(v, i, j):
    v[:, [i, j]] = v[:, [j, i]]

def gaussPivot(a, b, tol=1.0e-12):
    n = len(b)
    # Set up scale factors
    s = np.zeros(n)
    for i in range(n):
        s[i] = max(np.abs(a[i, :]))
    for k in range(0, n-1):
        # Row interchange, if needed
        p = np.argmax(np.abs(a[k:n, k])/s[k:n]) + k
        if abs(a[p, k]) < tol:
            raise ValueError('Matrix is singular')
        if p != k:
            swapRows(b, k, p)
            swapRows(s, k, p)
            swapRows(a, k, p)
        # Elimination
        for i in range(k+1, n):
            if a[i, k] != 0.0:
                lam = a[i, k]/a[k, k]
                a[i, k+1:n] = a[i, k+1:n] - lam*a[k, k+1:n]
                b[i] = b[i] - lam*b[k]
    if abs(a[n-1, n-1]) < tol:
        raise ValueError('Matrix is singular')
    # Back substitution
    b[n-1] = b[n-1]/a[n-1, n-1]
    for k in range(n-2, -1, -1):
        b[k] = (b[k] - np.dot(a[k, k+1:n], b[k+1:n]))/a[k, k]
    return b

def polyFit(xData, yData, m):
    a = np.zeros((m+1, m+1))
    b = np.zeros(m+1)
    s = np.zeros(2*m+1)
    for i in range(len(xData)):
        temp = yData[i]
        for j in range(m+1):
            b[j] = b[j] + temp
            temp = temp*xData[i]
        temp = 1.0
        for j in range(2*m+1):
            s[j] = s[j] + temp
            temp = temp*xData[i]
    for i in range(m+1):
        for j in range(m+1):
            a[i, j] = s[i+j]
    return gaussPivot(a, b)

degree = 10  # can be any degree
polyFit(xData, yData, degree)
I was under the impression that the code above takes a dataset of points and a degree, and outputs the coefficients of a polynomial that fits those points. However, I have a grader that was provided by my professor, and after checking the grading, the returned polynomial has a large error.
After that I tried the following LU decomposition instead:
import numpy as np

def swapRows(v, i, j):
    if len(v.shape) == 1:
        v[i], v[j] = v[j], v[i]
    else:
        v[[i, j], :] = v[[j, i], :]

def swapCols(v, i, j):
    v[:, [i, j]] = v[:, [j, i]]

def LUdecomp(a, tol=1.0e-9):
    n = len(a)
    seq = np.array(range(n))
    # Set up scale factors
    s = np.zeros((n))
    for i in range(n):
        s[i] = max(abs(a[i, :]))
    for k in range(0, n-1):
        # Row interchange, if needed
        p = np.argmax(np.abs(a[k:n, k])/s[k:n]) + k
        if abs(a[p, k]) < tol:
            raise ValueError('Matrix is singular')
        if p != k:
            swapRows(s, k, p)
            swapRows(a, k, p)
            swapRows(seq, k, p)
        # Elimination
        for i in range(k+1, n):
            if a[i, k] != 0.0:
                lam = a[i, k]/a[k, k]
                a[i, k+1:n] = a[i, k+1:n] - lam*a[k, k+1:n]
                a[i, k] = lam
    return a, seq

def LUsolve(a, b, seq):
    n = len(a)
    # Rearrange constant vector; store it in [x]
    x = b.copy()
    for i in range(n):
        x[i] = b[seq[i]]
    # Solution
    for k in range(1, n):
        x[k] = x[k] - np.dot(a[k, 0:k], x[0:k])
    x[n-1] = x[n-1]/a[n-1, n-1]
    for k in range(n-2, -1, -1):
        x[k] = (x[k] - np.dot(a[k, k+1:n], x[k+1:n]))/a[k, k]
    return x
The results were a bit better, but nowhere near what they should be.
Edit 2:
I tried the Chebyshev method suggested in the comments and came up with:
import numpy as np

def chebyshev_transform(x, n):
    """
    Transforms x-coordinates to Chebyshev coordinates
    """
    return np.cos(n * np.arccos(x))

def chebyshev_design_matrix(x, n):
    """
    Constructs the Chebyshev design matrix
    """
    x_cheb = chebyshev_transform(x, n)
    T = np.zeros((len(x), n+1))
    T[:, 0] = 1
    T[:, 1] = x_cheb
    for i in range(2, n+1):
        T[:, i] = 2 * x_cheb * T[:, i-1] - T[:, i-2]
    return T

degree = 10
f = lambda x: np.cos(x)
xdata = np.linspace(-1, 1, num=100)
ydata = np.array([f(i) for i in xdata])
M = chebyshev_design_matrix(xdata, degree)
D_x, D_y = np.linalg.qr(M)
D_x, seq = LUdecomp(D_x)
A = LUsolve(D_x, D_y, seq)
I can't use linalg.qr in my program; it was just for checking how it works. In addition, I didn't get the 'slow way' of the formulas that were in the comments.
The program can't handle an x point that is not between -1 and 1. Is there any way around it, any normalization?
Thanks a lot.
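One possible way to avoid np.linalg.qr, and the [-1, 1] restriction, is to map the data onto [-1, 1] with an affine change of variable and then solve the normal equations of the design matrix with the LU routines above. This is only a sketch of one approach (it reuses chebyshev_design_matrix, LUdecomp and LUsolve as defined earlier):

# map data from [lo, hi] onto [-1, 1] so that np.arccos stays defined
lo, hi = xdata.min(), xdata.max()
xnorm = 2.0 * (xdata - lo) / (hi - lo) - 1.0

T = chebyshev_design_matrix(xnorm, degree)
G = T.T @ T          # (degree+1) x (degree+1) normal-equations matrix
rhs = T.T @ ydata
G_lu, seq = LUdecomp(G)
coeffs = LUsolve(G_lu, rhs, seq)   # least squares coefficients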
Hints:
You are probably asked for an unsophisticated method. If the degree of the polynomial remains low, you can use the straightforward approach below. For the sake of the explanation, I'll use a cubic model.
Assume that you want to fit your data to this polynomial, by observing that it seems to follow a cubic behavior:
ax³ + bx² + cx + d ~ y
[All x and y should be understood with an index i which is omitted for notational convenience.]
If there are more than four data points, you get an overdetermined system of equations, usually with no solution. The trick is to consider the error on the individual equations, e = ax³ + bx² + cx + d - y, and to minimize the total error. As the error is a signed number, negative errors would make minimization impossible. Instead, we minimize the sum of squared errors. (The sum of absolute errors is another option but it unfortunately leads to a much harder problem.)
Min(a, b, c, d) Σ(ax³ + bx² + cx + d - y)²
As the unknown parameters are unconstrained, it suffices to look for a stationary point, i.e. cancel the gradient of the total error. By differentiation on the unknowns a, b, c and d, we obtain
2Σ(ax³x³ + bx²x³ + cxx³ + dx³ - yx³) = 0
2Σ(ax³x² + bx²x² + cxx² + dx² - yx²) = 0
2Σ(ax³x + bx²x + cxx + dx - yx ) = 0
2Σ(ax³ + bx² + cx + d - y ) = 0
As you can recognize, this is a square linear system of equations.
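Concretely, the system above can be assembled and solved in a few lines. A minimal sketch (np.linalg.solve stands in for whatever Gaussian elimination / LU routine you are required to use):

import numpy as np

def cubic_least_squares(x, y):
    # fit y ~ a*x**3 + b*x**2 + c*x + d
    powers = np.array([3, 2, 1, 0])
    # row r comes from differentiating the total squared error by one parameter:
    # M[r, c] = sum(x**(powers[r] + powers[c])), rhs[r] = sum(y * x**powers[r])
    M = np.array([[np.sum(x**(p + q)) for q in powers] for p in powers])
    rhs = np.array([np.sum(y * x**p) for p in powers])
    return np.linalg.solve(M, rhs)   # returns (a, b, c, d)

x = np.linspace(-1, 1, 50)
y = 2*x**3 - x + 0.5 + 0.01*np.random.randn(50)   # noisy cubic test data
print(cubic_least_squares(x, y))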
I have tried to no avail for a week to solve a system of coupled differential equations and reproduce the results shown in the attached image. I seem to be getting weird results, as also shown. I don't know what I might be doing wrong. The set of coupled differential equations was solved using Newman's BAND. Here's a link to the Python implementation: python solution using BAND. And another link to the original image of the problem, in case the attached is not clear enough: here you find a clearer image of the problem. Now, what I am trying to do is to solve the same problem by creating a sparse array directly from the discretized equations using a combination of sympy and numpy, and then solving with scipy's spsolve. Here is my code below. I need some help to figure out what I am doing wrong.
I have represented the variables as c1 = cA, c2 = cB, c3 = cC, c4 = cD in my code. Equation 2 has been linearized and phi10 and phi20 are the trial values of the variables cC and cD.
# import modules
import numpy as np
import sympy
from sympy.core.function import _mexpand
import scipy as sp
import scipy.sparse as ss
import scipy.sparse.linalg as ssl
import matplotlib.pyplot as plt

# define functions
def flatten(t):
    """
    function to flatten lists
    """
    return [item for sublist in t for item in sublist]

def get_coeffs(coeff_dict, func_vars):
    """
    function to extract coefficients from variables
    and form the sparse symbolic array
    """
    c = coeff_dict
    for i in list(c.keys()):
        b, _ = i.as_base_exp()
        if b == i:
            continue
        if b in c:
            c[i] = 0
        if any(k.has(b) for k in c):
            c[i] = 0
    return [coeff_dict[val] for val in func_vars]

# Constants for the problem
I = 0.1        # A/cm2
L = 1.0        # distance (x) in cm
m = 100        # grid spacing
h = L / (m-1)
a = 23300      # 1/cm
io = 2e-7      # A/cm2
n = 1
F = 96500      # C/mol
R = 8.314      # J/mol-K
T = 298        # K
sigma = 20     # S/cm
kappa = 0.06   # S/cm
alpha = 0.5
beta = -(1-alpha)*n*F/R/T
phi10, phi20 = 5, 0.5  # these are just guesses
P = a*io*np.exp(beta*(phi10-phi20))

j = sympy.symbols('j', integer=True)
cA = sympy.IndexedBase('cA')
cB = sympy.IndexedBase('cB')
cC = sympy.IndexedBase('cC')
cD = sympy.IndexedBase('cD')

# write the boundary conditions at x = 0
bc = [cA[1], cB[1],
      (4/3) * cC[2] - (1/3)*cC[3],  # use a three point approximation for cC_prime
      cD[1]
      ]

# form a list of expressions from the boundary conditions and equations
expr = flatten([bc, flatten([[
       -cA[j-1] - cB[j-1] + cA[j+1] + cB[j+1],
       cB[j-1] - 2*h*P*beta*cC[j] + 2*h*P*beta*cD[j] - cB[j+1],
       -sigma*cC[j-1] + 2*h*cA[j] + sigma * cC[j+1],
       -kappa * cD[j-1] + 2*h * cB[j] + kappa * cD[j+1]] for j in range(2, m)])])

vars = [cA[j], cB[j], cC[j], cD[j]]

# flatten the list of variables
unknowns = flatten([[cA[j], cB[j], cC[j], cD[j]] for j in range(1, m)])
var_len = len(unknowns)

# substitute in the boundary conditions at x = L while getting the coefficients
A = sympy.SparseMatrix([get_coeffs(_mexpand(i.subs({cA[m]: I}))\
    .as_coefficients_dict(), unknowns) for i in expr])

# convert to a numpy array
mat_temp = np.array(A).astype(np.float64)

# you can view the sparse array with this
fig = plt.figure(figsize=(6, 6))
ax = fig.add_axes([0, 0, 1, 1])
cmap = plt.cm.binary
plt.spy(mat_temp, cmap=cmap, alpha=0.8)

def solve_sparse(b0, error):
    # create the b column vector
    b = np.copy(b0)
    b[0:4] = np.array([0.0, I, 0.0, 0.0])
    b[var_len-4] = I
    b[var_len-3] = 0
    b[var_len-2] = 0
    b[var_len-1] = 0
    print(b.shape)

    old = np.copy(b0)
    mat = np.copy(mat_temp)
    b_2 = np.copy(b)
    resid = 10
    lss = 0
    while lss < 100:
        mat_2 = np.copy(mat)
        for j in range(3, var_len - 3, 4):
            # update the forcing term of equation 2
            b_2[j+2] = 2*h*(1-beta*old[j+3]+beta*old[j+4])*a*io*np.exp(beta*(old[j+3]-old[j+4]))
            # update the sparse array at every iteration for variables cC and cD in equation 2
            mat_2[j+2, j+3] += 2*h*beta*a*io*np.exp(beta*(old[j+3]-old[j+4]))
            mat_2[j+2, j+4] += 2*h*beta*a*io*np.exp(beta*(old[j+3]-old[j+4]))
        # form the column sparse matrix
        A_s = ss.csc_matrix(mat_2)
        new = ssl.spsolve(A_s, b_2).flatten()
        resid = np.sum((new - old)**2)/var_len
        lss += 1
        old = np.copy(new)
    return new

val0 = np.array([[0.0, 0.0, 0.0, 0.0] for _ in range(m-1)]).flatten()  # form an array of initial values
error = 1e-7

## Run the code
conc = solve_sparse(val0, error).reshape(m-1, len(vars))
conc.shape  # gives (99, 4)

# Plot result for cA:
plt.plot(conc[:, 0], marker='o', linestyle='')
What happens seems pretty clear now, after having seen that the plotted solution indeed oscillates between the upper and lower values. You are using the central Euler method as discretization; for u' = F(u) this reads as
u[j+1] - u[j-1] = 2*h*F(u[j])
This method is only weakly stable and allows the sub-sequences of odd and even indices to evolve rather independently. As an equation, this would mean that the solution might approximate the system ue' = F(uo), uo' = F(ue), with independent functions ue, uo that follow the path of the even or odd sub-sequence.
These even and odd parts are only tied together by the treatment of the boundary points, two or three points deep. So avoiding or reducing the oscillation requires very careful handling of the boundary conditions, and also of the differential equations for the boundary points.
But one can avoid all this unpleasantness by using the trapezoidal method
u[j+1]-u[j] = 0.5*h*(F(u[j+1])+F(u[j]))
This also reduces the band-width of the system matrix.
To implement the implied Newton method correctly (linearizing via Taylor and solving the linearized equation is what the Newton-Kantorovich method does), you need to replace F(u[j]) with F(u_old[j]) + F'(u_old[j])*(u[j] - u_old[j]). This then gives a linear system of equations in u for the iteration step.
For the trapezoidal method this gives
(I-0.5*h*F'(u_old[j+1]))*u[j+1] - (I+0.5*h*F'(u_old[j]))*u[j]
= 0.5*h*(F(u_old[j+1])-F'(u_old[j+1])*u_old[j+1] + F(u_old[j])-F'(u_old[j])*u_old[j])
In general, the derivative values, and thus the system matrix, need not be updated in every step; only the function values must be (else the iteration does not move forward).
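As an illustration, here is a minimal sketch of one such linearized sweep for a scalar equation u' = F(u) with the value at the left boundary held fixed (the names F, dF and the boundary handling are assumptions made for the example, not your actual system):

import numpy as np

def linearized_trapezoidal_sweep(u_old, h, F, dF):
    # solve the linear system obtained by linearizing the trapezoidal rule
    # around the previous iterate u_old (one Newton-Kantorovich step)
    m = len(u_old)
    A = np.zeros((m, m))
    rhs = np.zeros(m)
    A[0, 0] = 1.0
    rhs[0] = u_old[0]                      # left boundary value kept fixed
    for j in range(m - 1):
        A[j+1, j+1] = 1.0 - 0.5*h*dF(u_old[j+1])
        A[j+1, j] = -(1.0 + 0.5*h*dF(u_old[j]))
        rhs[j+1] = 0.5*h*(F(u_old[j+1]) - dF(u_old[j+1])*u_old[j+1]
                          + F(u_old[j]) - dF(u_old[j])*u_old[j])
    return np.linalg.solve(A, rhs)

# usage: iterate sweeps until successive iterates agree
F = lambda u: -u**2                        # hypothetical right-hand side
dF = lambda u: -2*u
u = np.linspace(1.0, 0.5, 11)              # hypothetical initial guess
for _ in range(20):
    u_new = linearized_trapezoidal_sweep(u, 0.1, F, dF)
    if np.max(np.abs(u_new - u)) < 1e-12:
        break
    u = u_new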
I'm using multiprocessing.Pool to perform multiple integrations in parallel.
In this program I integrate an equation of motion for different realizations of noise by generating the dW 3D array. The first part of the program is just the definition of parameters and the generation of the arrays needed for the calculation.
I generate dW outside the function, since I know that otherwise I would have to reseed in each process to avoid obtaining the same random sequence.
The Euler(replica) function is the function that has to be parallelized. It contains a for loop for the numerical integration within a single process. The argument replica is the index of the replica of my system as stored in the "replicas" array, which is the iterable passed to pool.map.
import numpy as np
from multiprocessing import Pool

# parameters
N = 30              # number of sites
T = 1               # total time
dt = 0.1            # time step
l = 0               # initially localized state on site l
e = 0.0             # site energy
v = 1.0             # hopping coefficient
mu, sigma = 0, 1.0  # average and variance of the gaussian distribution
num_replicas = 8    # number of replicas of the system
processes = 2       # number of processes

# identity vector which represents the diagonal of the Hamiltonian
E = np.ones(N) * e

# vector which represents the upper/lower diagonal terms of the Hopping Matrix and the Hamiltonian
V = np.ones(N-1) * v

# definition of the tight-binding Hamiltonian (tridiagonal)
H = np.diag(E) + np.diag(V, k=1) + np.diag(V, k=-1)

# corner elements of the Hamiltonian
H[0, -1] = v
H[-1, 0] = v

# time array
time_array = np.arange(0, T, dt)

# site array
site_array = np.arange(N)

# initial state
psi_0 = np.zeros((N), dtype=complex)
psi_0[l] = 1. + 0.j

# initialization of the state array
Psi = np.zeros((len(time_array), N), dtype=complex)
Psi[0, :] = psi_0

# replicas 1D array
replicas = np.arange(0, num_replicas)

# random 2D array
dW = np.random.normal(mu, 1.0, (len(time_array), num_replicas, N)) * np.sqrt(dt)

def Euler(replica):
    psi_0 = np.zeros((N), dtype=complex)
    psi_0[l] = 1. + 0.j
    psi = psi_0
    for i in np.arange(1, len(time_array)):
        psi += -1.j * (H @ psi) * dt - 1.j * sigma * psi * dW[i, replica, :] - 0.5 * (sigma**2) * psi * dt
        psi /= np.sqrt(psi @ np.conj(psi))
        Psi[i, :] = psi
    return Psi

pool = Pool(processes)
Psi = pool.map(Euler, replicas)
Psi = np.asarray(Psi)
Psi = np.swapaxes(Psi, 0, 1)
print(Psi)
Empirically, I found that if num_replicas > 4 * processes in the pool.map call, two processes seem to take the same argument, as if the same calculation were repeated twice. With num_replicas <= 4 * processes, instead, I get the expected result: each replica differs from the others.
This is not due to the generation of the random matrix dW, since each row is uncorrelated, so I ascribe this behavior to my use of multiprocessing.Pool.
I think you should initialize your psi_0 and Psi inside the Euler function.
I tried to reproduce your results and, indeed, I found that when num_replicas > 4 * processes you get the same results from multiple processes. But I think this is because Psi, in your case, is a global variable.
Modifying the code as follows gives different results for each replica (by the way, why do you use site_array? It is not used anywhere):
import numpy as np
from multiprocessing import Pool

# parameters
N = 3               # number of sites
T = 1               # total time
dt = 0.1            # time step
l = 0               # initially localized state on site l
e = 0.0             # site energy
v = 1.0             # hopping coefficient
mu, sigma = 0, 1.0  # average and variance of the gaussian distribution
num_replicas = 10   # number of replicas of the system
processes = 2       # number of processes

# identity vector which represents the diagonal of the Hamiltonian
E = np.ones(N) * e

# vector which represents the upper/lower diagonal terms of the Hopping Matrix and the Hamiltonian
V = np.ones(N-1) * v

# definition of the tight-binding Hamiltonian (tridiagonal)
H = np.diag(E) + np.diag(V, k=1) + np.diag(V, k=-1)

# corner elements of the Hamiltonian
H[0, -1] = v
H[-1, 0] = v

# time array
time_array = np.arange(0, T, dt)

## site array
#site_array = np.arange(N)

# replicas 1D array
replicas = np.arange(0, num_replicas)

# random 2D array
dW = np.random.normal(mu, 1.0, (len(time_array), num_replicas, N)) * np.sqrt(dt)

def Euler(replica):
    # initial state
    psi_0 = np.zeros((N), dtype=complex)
    psi_0[l] = 1. + 0.j
    # initialization of the state array
    Psi = np.zeros((len(time_array), N), dtype=complex)
    Psi[0, :] = psi_0
    psi = psi_0
    # print(dW[0,replica,0])
    for i in np.arange(1, len(time_array)):
        psi += -1.j * (H @ psi) * dt - 1.j * sigma * psi * dW[i, replica, :] - 0.5 * (sigma**2) * psi * dt
        psi /= np.sqrt(psi @ np.conj(psi))
        Psi[i, :] = psi
    return Psi

pool = Pool(processes)
Psi = pool.map(Euler, replicas)
Psi = np.asarray(Psi)
Psi = np.swapaxes(Psi, 0, 1)
print(Psi)
As @Fabrizio pointed out, Psi is shared between invocations of Euler. This is generally a bad thing to do, and another example of why it's a bad idea to have "global mutable state": it's just too easy for things to break in unexpected ways.
The reason it fails in this case is subtle, and due to the way Pool.map accumulates several results in each process before pickling them and returning them to the parent/controlling process. You can see this by setting the chunksize parameter of map to 1, causing it to return results immediately and hence not overwrite intermediate results.
It's equivalent to the following minimal working example:
from multiprocessing import Pool

arr = [None]

def mutate_global(i):
    arr[0] = i
    return arr

with Pool(2) as pool:
    out = pool.map(mutate_global, range(10), chunksize=5)

print(out)
The last time I ran this I got:
[[4], [4], [4], [4], [4], [9], [9], [9], [9], [9]]
You can change the chunksize parameter to get an idea of what it's doing, or maybe run with the following version:
def mutate_local(i):
    arr = [None]
    arr[0] = i
    return arr
which "just works", and is the equlivelant to doing what #Fabrizio describes where you create Phi inside Euler rather than using a single global variable
I am trying to solve the following problem via a Finite Difference Approximation in Python using NumPy:
$u_t = k \, u_{xx}$, on $0 < x < L$ and $t > 0$;
$u(0,t) = u(L,t) = 0$;
$u(x,0) = f(x)$.
I take $u(x,0) = f(x) = x^2$ for my problem.
Programming is not my forte so I need help with the implementation of my code. Here is my code (I'm sorry it is a bit messy, but not too bad I hope):
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt

# definition of initial condition function
def f(x):
    return x^2

# parameters
L = 1
T = 10
N = 10
M = 100
s = 0.25

# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N

#x = np.zeros(N+1)
x = np.arange(x_init, x_end, dx)
x[0] = x_init

# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M

#t = np.zeros(M+1)
t = np.arange(t_init, t_end, dt)
t[0] = t_init

# Boundary Conditions
for m in xrange(0, M):
    t[m] = m * dt

# Initial Conditions
for j in xrange(0, N):
    x[j] = j * dx

# definition of solution to u_t = k * u_xx
u = np.zeros((N+1, M+1))  # NxM array to store values of the solution

# finite difference scheme
for j in xrange(0, N-1):
    u[j][0] = x**2  # initial condition

for m in xrange(0, M):
    for j in xrange(1, N-1):
        if j == 1:
            u[j-1][m] = 0  # Boundary condition
        else:
            u[j][m+1] = u[j][m] + s * ( u[j+1][m] -  # FDM scheme
                2 * u[j][m] + u[j-1][m] )
    else:
        if j == N-1:
            u[j+1][m] = 0  # Boundary Condition

print u, t, x
#plt.plot(t, u)
#plt.show()
So the first issue I am having is with creating an array/matrix to store values of the solution. I wanted it to be an N x M matrix, but in my code I made it (N+1) x (M+1) because I kept getting an error that the index was going out of bounds. Anyway, how can I make such a matrix using numpy.array so as not to needlessly take up memory by creating an (N+1) x (M+1) matrix filled with zeros?
Second, how can I "access" such an array? The real solution u(x,t) is approximated by u(x[j], t[m]), where j is the jth spatial value and m is the mth time value. The finite difference scheme is given by:
u(x[j],t[m+1]) = u(x[j],t[m]) + s * ( u(x[j+1],t[m]) - 2 * u(x[j],t[m]) + u(x[j-1],t[m]) )
(See here for the formulation)
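With u stored as a NumPy 2D array of shape (N+1, M+1), that update can be written with slice indexing. A sketch, assuming s = k*dt/dx**2 satisfies the stability condition s <= 0.5 and x holds the N+1 grid points:

u = np.zeros((N+1, M+1))
u[:, 0] = x**2                   # initial condition u(x, 0) = x^2
for m in range(M):
    # vectorized interior update: one statement per time step
    u[1:N, m+1] = u[1:N, m] + s * (u[2:N+1, m] - 2*u[1:N, m] + u[0:N-1, m])
    u[0, m+1] = 0.0              # boundary condition u(0, t) = 0
    u[N, m+1] = 0.0              # boundary condition u(L, t) = 0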
I want to be able to implement the initial condition u(x[j], t[0]) = x[j]**2 for all values of j = 0,...,N-1. I also need to implement the boundary conditions u(x[0], t[m]) = 0 = u(x[N], t[m]) for all values of t = 0,...,M. Is the nested loop I created the best way to do this? Originally I tried implementing the I.C. and B.C. under two different for loops, which I used to calculate the values of the arrays x and t (in my code I still have comments placed where I tried to do this).
I think I am just not using the right notation, but I cannot find anywhere in the NumPy documentation how to "call" such an array so as to iterate through each value in the proposed scheme. Can anyone shed some light on what I am doing wrong?
Any help is very greatly appreciated. This is not homework but rather to understand how to program FDM for Heat Equation because later I will use similar methods to solve the Black-Scholes PDE.
EDIT: When I run my code, on line 60 (the last "else" that I use) I get an error that says invalid syntax, and on line 51 (u[j][0] = x**2 # initial condition) I get an error that reads "setting an array element with a sequence." What does that mean?
So I am writing a program that handles gradient descent. I'm using this method to solve equations of the form
Ax = b
where A is a random 10x10 matrix and b is a random 10x1 matrix
Here is my code:
import numpy as np
import math
import random

def steepestDistance(A, b, xO, e):
    xPrev = xO
    dPrev = -((A * xPrev) - b)
    magdPrev = np.linalg.norm(dPrev)
    danger = np.asscalar(((magdPrev * magdPrev)/(np.dot(dPrev.T, A * dPrev))))
    xNext = xPrev + (danger * dPrev)
    step = 1
    while (np.linalg.norm((A * xNext) - b) >= e and np.linalg.norm((A * xNext) - b) < math.pow(10, 4)):
        xPrev = xNext
        dPrev = -((A * xPrev) - b)
        magdPrev = np.linalg.norm(dPrev)
        danger = np.asscalar((math.pow(magdPrev, 2))/(np.dot(dPrev.T, A * dPrev)))
        xNext = xPrev + (danger * dPrev)
        step = step + 1
    return xNext

##print(steepestDistance(np.matrix([[5,2],[2,1]]),np.matrix([[1],[1]]),np.matrix([[0.5],[0]]), math.pow(10,-5)))

def chooseRandMatrix():
    matrix = np.zeros(shape=(10, 10))
    for i in range(10):
        for a in range(10):
            matrix[i][a] = random.randint(0, 100)
    return matrix.T * matrix

def chooseRandColArray():
    arra = np.zeros(shape=(10, 1))
    for i in range(10):
        arra[i][0] = random.randint(0, 100)
    return arra

for i in range(4):
    matrix = np.asmatrix(chooseRandMatrix())
    array = np.asmatrix(chooseRandColArray())
    print(steepestDistance(matrix, array, np.asmatrix(chooseRandColArray()), math.pow(10, -5)))
When I run the method steepestDistance on the random matrix and column, I keep getting an infinite loop. It works fine when simple 2x2 matrices are used for A, but it loops indefinitely for 10x10 matrices. The problem is in np.linalg.norm((A * xNext) - b); it keeps growing indefinitely. That's why I put an upper bound on it; I'm not supposed to do that for the algorithm, however. Can someone tell me what the problem is?
Solving a linear system Ax=b with gradient descent means to minimize the quadratic function
f(x) = 0.5*x^t*A*x - b^t*x.
This only works if the matrix A is symmetric, A=A^t, since the derivative or gradient of f is
f'(x)^t = 0.5*(A+A^t)*x - b,
and additionally A must be positive definite. If there are negative eigenvalues, then the descent will proceed to minus infinity; there is no minimum to be found.
One work-around is to replace b by A^t*b and A by A^t*A, that is, to minimize the function
f(x) = 0.5*||A*x-b||^2
= 0.5*x^t*A^t*A*x - b^t*A*x + 0.5*b^t*b
with gradient
f'(x)^t = A^t*A*x - A^t*b
But for large matrices A this is not recommended, since the condition number of A^t*A is about the square of the condition number of A.
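A minimal sketch of this work-around (the step length is the exact line search for the quadratic, playing the role of the danger variable in the question; the random test problem is just for illustration):

import numpy as np

def descent_normal_equations(A, b, x0, tol=1e-8, max_steps=100000):
    # minimize f(x) = 0.5*||A*x - b||^2, i.e. solve A^t*A*x = A^t*b
    M = A.T @ A                   # symmetric positive (semi)definite
    c = A.T @ b
    x = x0.astype(float)
    for _ in range(max_steps):
        r = c - M @ x             # negative gradient of f
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (M @ r))   # exact line search
        x = x + alpha * r         # convergence is slow if A is ill-conditioned
    return x

rng = np.random.default_rng(0)
A = rng.integers(0, 100, (10, 10)).astype(float)
b = rng.integers(0, 100, 10).astype(float)
x = descent_normal_equations(A, b, np.zeros(10))
print(np.linalg.norm(A @ x - b))  # should be small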