I am solving simply linear problem A*x=b by using conjugate gradient method. I want to find x unknown.
Note that conjGrad calls the function Av that returns the product Av
The code is given below:
Inputs:
A - sparse matrix. 2D array;
b - right hand-side vector. 1D array;
x - initial guess. Here, it is just 1D array with zero values.
Code:
import numpy as np
import math
A = np.array([[ 0.56244579, 0. , 0. , 0. , 0.52936075,
0.59553084, 0. , 0. , 0. , 1.1248915 ,
0. , 0. , 0. , 0.46319065, 0.43672262,
0. ],
[ 0.5 , 1. , 1. , 0.5 , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. ],
[ 0. , 0. , 0. , 0.58009067, 0. ,
0. , 0.75411788, 0.40606347, 0. , 0. ,
0.23203627, 0. , 0. , 0. , 0. ,
0. ]])
x = np.array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0.])
b = np.array([ 3.99464617, 1.81663614, 1.86413003])
def Av(v):
return np.dot(A,v)
def conjGrad(Av, x, b, tol=1.0e-9):
n = len(b)
r = b - Av(x)
s = r.copy()
for i in range(n):
u = Av(s)
alpha = np.dot(s,r)/np.dot(s,u)
x = x + aplha*s
r = b - Av(x)
if(math.sqrt(np.dot(r,r))) < tol:
break
else:
beta = - np.dot(r,u)/np.dot(s,u)
s = r + beta * s
return x,i
if __name__ == '__main__':
x, iter_number = conjGrad(Av, x, b)
Traceback (most recent call last):
File "C:\Python27\Conjugate_Gradient.py", line 59, in <module>
x, iter_number = conjGrad(Av, x, b)
File "C:\Python27\Conjugate_Gradient.py", line 47, in conjGrad
u = Av(s)
File "C:\Python27\Conjugate_Gradient.py", line 40, in Av
return np.dot(A,v)
ValueError: matrices are not aligned
Is there any simple solution to avoid this message? Any answers will be appreciated
You have implemented the CG method wrong. The error message shows you one of the lines where there is a problem.
In particular, your matrix is not square.
The conjugate gradients method solves for Ax=b when A is SPD.
If A is not SPD, like in your case, then you can still use conjugate gradients to find the least squares solution for your problem:
A^t A x = A^t b
The matrix A^t A is SPD and well suited for your method.
Related
I'm currently trying to develop a function that performs matrix multiplication while expanding a differential equation with odeint in Python and am seeing strange results.
I converted the function:
def f(x, t):
return [
-0.1 * x[0] + 2 * x[1],
-2 * x[0] - 0.1 * x[1]
]
to the below so that I can incorporate different matrices.
I have the below matrix of values and function that takes specific values of that matrix:
from scipy.integrate import odeint
x0_train = [2,0]
dt = 0.01
t = np.arange(0, 1000, dt)
matrix_a = np.array([-0.09999975, 1.999999, -1.999999, -0.09999974])
# Function to run odeint with
def f(x, t, a):
return [
a[0] * x[0] + a[1] * x[1],
a[2] * x[0] - a[3] * x[1]
]
odeint(f, x0_train, t, args=(matrix_a,))
>>> array([[ 2. , 0. ],
[ 1.99760115, -0.03999731],
[ 1.99440529, -0.07997867],
...,
[ 1.69090227, 1.15608741],
[ 1.71199436, 1.12319701],
[ 1.73240339, 1.08985846]])
This seems right, but when I create my own function to perform multiplication/regression, I see the results at the bottom of the array are completely different. I have two sparse arrays that provide the same conditions as matrix_a but with zeros around them.
from sklearn.preprocessing import PolynomialFeatures
new_matrix_a = array([[ 0. , -0.09999975, 1.999999 , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. ],
[ 0. , -1.999999 , -0.09999974, 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. ]])
# New function
def f_new(x, t, parameters):
polynomials = PolynomialFeatures(degree=5)
x = np.array(x).reshape(-1,2)
#x0_train_array_reshape = x0_train_array.reshape(1,2)
polynomial_transform = polynomials.fit(x)
polynomial_features = polynomial_transform.fit_transform(x).T
x_ode = np.matmul(parameters[0],polynomial_features)
y_ode = np.matmul(parameters[1],polynomial_features)
return np.concatenate((x_ode, y_ode), axis=None).tolist()
odeint(f_new, x0_train, t, args=(new_matrix_a,))
>>> array([[ 2.00000000e+00, 0.00000000e+00],
[ 1.99760142e+00, -3.99573216e-02],
[ 1.99440742e+00, -7.98188169e-02],
...,
[-3.50784051e-21, -9.99729456e-22],
[-3.50782881e-21, -9.99726119e-22],
[-3.50781711e-21, -9.99722781e-22]])
As you can see, I'm getting completely different values at the end of the array. I've been running through my code and can't seem to find a reason why they would be different. Does anybody have a clear reason why or if I'm doing something wrong with my f_new? Ideally, I'd like to develop a function that can take any values in that matrix_a, which is why I'm trying to create this new function.
Thanks in advance.
You should perhaps use numpy even more in the first version, to avoid sign errors in routine algorithms.
def f(x, t, a):
return a.reshape([2,2]) # x # or use matmul, or a.reshape([2,2]).dot(x)
or, for efficiency, pass the already reshaped a.
I receive arrays of 9 doubles via tcp, split them, convert with np.array() method, store them in list, and finally again convert this list to numpy array, so i could save it, and later load to train keras model.
Each time I re-run code, shape of output is (500, 6) or (500,) randomly,
I don't change anything, just keep running same code and I get different results.
How is that possible?
How to convert (n,) to (n, 6)? I Tried with .reshape()
Edit: my full code:
def tcp_server_get_training_data():
host = "localhost"
port = 5367
msg = ""
mySocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mySocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
mySocket.bind((host, port))
# print(socket.getfqdn())
# print(socket.gethostbyname(socket.getfqdn()))
mySocket.listen(1)
print('waiting for connection...')
conn, addr = mySocket.accept()
print("Connection from: " + str(addr))
x = []
y = []
i = 0
n = 500
while i < n:
data = conn.recv(128)
doubles_sequence = array.array('d', data)
doubles_sequence2 = np.array(doubles_sequence)
x.append(doubles_sequence2[:6])
y.append(doubles_sequence2[-3:])
i += 1
print(str(round((i/n)*100, 2))+"%")
#print(doubles_sequence[:6])
xp = np.array(x)
yp = np.array(y)
print("X shape: "+str(xp.shape))
print("y shape: "+str(yp.shape))
np.save(file='x', arr=xp)
np.save(file='y', arr=yp)
conn.close()
when I print x with shape (500,6) i get:
>>> x.shape
(500, 6)
>>> x
array([[ 0. , 0. , 0. , -0.29219246, 0. , 0. ],
[ 0. , 0. , 0. , 0.34277358, 0. , 0. ],
[ 0. , 0. , 0. , 0.34277358, 0. , 0. ],
when I print x with shape (500,) i get:
>>> x.shape
(500,)
>>> x
array([array([ 0., 0. , 0. , -0.29219246, 0., 0. ]),
array([ 0. , 0. , 0. , 0.34277358, 0., 0.]),
array([ 0. , 0., 0., -0.10241638, , dtype=object)
I'd really appreciate some help, I tried to figure it out on my own, but after few hours just get frustrated.
I'm relatively new in programming and spend more time on C#, in python I'm very confused without type declaration.
i want to solve the following ode
KT + CT' = Q
to given example Data is my code below
import numpy as np
import scipy as sp
# Solve the following ODE
# K*T + C*T' = Q
# T' = C^-1 ( Q - K * T )
T_start=sp.array([ 151.26, 132.18, 131.64, 146.55, 147.87, 137.87])
K = sp.array([[-0.01761969, 0.02704873, 0.00572222, 0. , 0. ,
0. ],
[ 0.02704873, -0.03546941, 0. , 0. , 0.00513177,
0. ],
[ 0.00572222, 0. , 0.03001858, -0.04752982, 0. ,
0.02030505],
[ 0. , 0. , -0.04752982, 0.0444405 , 0.00308932,
0. ],
[ 0. , 0.00513177, 0. , 0.00308932, 0.02629577,
-0.01793915],
[ 0. , 0. , 0.02030505, 0. , -0.01793915,
0.00084506]])
Q = sp.array([ 1.66342077, 0.16187956, 0.65115035, 0.71274755,2.54614269, 0.13680399])
C_invers = sp.array([[ 3.44827586, 0. , 0. , 0. , 0. ,
-0. ],
[ 0. , 1.5625 , 0. , 0. , 0. ,
-0. ],
[ 0. , 0. , 2.63157895, 0. , 0. ,
-0. ],
[ 0. , 0. , 0. , 2.17391304, 0. ,
-0. ],
[ 0. , 0. , 0. , 0. , 1.63934426,
-0. ],
[ 0. , 0. , 0. , 0. , 0. ,
2.38095238]])
time = np.linspace(0, 20, 10000)
#T_real = sp.array([[ 151.26, 132.18, 131.64, 146.55, 147.87, 137.87]])
def deriv(T, t):
return sp.dot( C_invers, Q - np.dot(K, T) )
T_sol = sp.integrate.odeint(deriv, T_start, time)
i know that the result is
sp.array([ 151.26, 132.18, 131.64, 146.55, 147.87, 137.87])
the solution is "stable" if and only if i use this as the T_start condition
but if i change my start condition for example to
T_start=sp.array([ 0, 0, 0, 0, 0, 0])
it won't converge im getting the following result:
where is my fault? Negative values make no sense for my system :/ Can you help me? thanks ;)
The array
array([ 151.26, 132.18, 131.64, 146.55, 147.87, 137.87])
is the equilibrium of your system (approximately). You can find this by setting the right-hand side of your system of equations to 0, which leads to Teq = inv(K)*Q:
In [9]: Teq = np.linalg.solve(K, Q)
In [10]: Teq
Out[10]:
array([ 151.25960795, 132.17972469, 131.6402527 , 146.55025359,
147.87025015, 137.87029892])
That's why your solution appears to be stable when you use these values for the starting point. The solution is very close to the equilibrium, so it doesn't change much.
Long term, however, the solution will eventually diverge away from Teq, because that equilibrium point is unstable. Your system, T' = inv(C)*(Q - K*T), is linear in T, so you can determine the stability by computing the eigenvalues of the coefficient matrix of T. That is, write T = inv(C)*Q - inv(C)*K*T. The coefficient matrix of T is -inv(C)*K. Here's how you can find the eigenvalues of that matrix:
In [11]: A = -C_invers.dot(K)
In [12]: np.linalg.eigvals(A)
Out[12]:
array([-0.2089754 , 0.12257481, -0.06349952, -0.01489581, 0.00146708,
0.05878143])
The coefficent matrix A has three positive eigenvalues. Those correspond to modes that will grow exponentially in time. That is, the equilibrium is unstable, so the growth that you see is to be expected.
First of all, I work with byte array (>= 400x400x1000) bytes.
I wrote a small function which can insert a multidimensional array (or a fraction of) into another one by indicating an offset. This works if the embedded array is smaller than the embedding array (case A). Otherwise the embedded array is truncated (case B).
case A) Inserting a 3x3 into a 5x5 matrix with offset 1,1 would look like this.
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 0. 0. 0. 0.]]
case B) If the offsets are exceeding the dimensions of the embedding matrix, the smaller array is truncated. E.g. a (-1,-1) offset would result in this.
[[ 1. 1. 0. 0. 0.]
[ 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
case C) Now, instead of truncating the embedded array, I want to extend the embedding array (by zeroes) if the embedded array is either bigger than the embedding array or the offsets enforce it (e.g. case B). Is there a smart way with numpy or scipy to solve this?
[[ 1. 1. 1. 0. 0. 0.]
[ 1. 1. 1. 0. 0. 0.]
[ 1. 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]]
Actually I work with 3D array, but for simplicity I wrote an example for 2D arrays. Current source:
import numpy as np
import nibabel as nib
def addAtPos(mat_bigger, mat_smaller, xyz_coor):
size_sm_x, size_sm_y = np.shape(mat_smaller)
size_gr_x, size_gr_y = np.shape(mat_bigger)
start_gr_x, start_gr_y = xyz_coor
start_sm_x, start_sm_y = 0,0
end_x, end_y = (start_gr_x + size_sm_x), (start_gr_y + size_sm_y)
print(size_sm_x, size_sm_y)
print(size_gr_x, size_gr_y)
print(end_x, end_y)
if start_gr_x < 0:
start_sm_x = -start_gr_x
start_gr_x = 0
if start_gr_y < 0:
start_sm_y = -start_gr_y
start_gr_y = 0
if end_x > size_gr_x:
size_sm_x = size_sm_x - (end_x - size_gr_x)
end_x = size_gr_x
if end_y > size_gr_y:
size_sm_y = size_sm_y - (end_y - size_gr_y)
end_y = size_gr_y
# copy all or a chunk (if offset is small/big enough) of the smaller matrix into the bigger matrix
mat_bigger[start_gr_x:end_x, start_gr_y:end_y] = mat_smaller[start_sm_x:size_sm_x, start_sm_y:size_sm_y]
return mat_bigger
a_gr = np.zeros([5,5])
a_sm = np.ones([3,3])
a_res = addAtPos(a_gr, a_sm, [-2,1])
#print (a_gr)
print (a_res)
Actually there is an easier way to do it.
For your first example of a 3x3 array embedded to a 5x5 one you can do it with something like:
A = np.array([[1,1,1], [1,1,1], [1,1,1]])
(N, M) = A.shape
B = np.zeros(shape=(N + 2, M + 2))
B[1:-1:, 1:-1] = A
By playing with slicing you can select a subset of A and insert it anywhere within a continuous subset of B.
Hope it helps! ;-)
I'm very new to GPU programming and pyCUDA and have a pretty fundamental gap in my knowledge. I have spent quite a bit of time searching SO, looking at example code and reading supporting documentation for CUDA/pyCUDA but haven't found much diversity in the explanations and can't get my head around a few things.
I am having trouble correctly defining block and grid dimensions. The code I am currently running is as follows, and aims to do element-wise multiplication of an array a by a float b:
from __future__ import division
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
rows = 256
cols = 10
a = np.ones((rows, cols), dtype=np.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)
b = np.float32(2)
mod = SourceModule("""
__global__ void MatMult(float *a, float b)
{
const int i = threadIdx.x + blockDim.x * blockIdx.x;
const int j = threadIdx.y + blockDim.y * blockIdx.y;
int Idx = i + j*gridDim.x;
a[Idx] *= b;
}
""")
func = mod.get_function("MatMult")
xBlock = np.int32(np.floor(1024/rows))
yBlock = np.int32(cols)
bdim = (xBlock, yBlock, 1)
dx, mx = divmod(rows, bdim[0])
dy, my = divmod(cols, bdim[1])
gdim = ( (dx + (mx>0)) * bdim[0], (dy + (my>0)) * bdim[1])
print "bdim=",bdim, ", gdim=", gdim
func(a_gpu, b, block=bdim, grid=gdim)
a_doubled = np.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print a_doubled - 2*a
The code should print the block dimensions bdim and the grid dimensions gdim, as well as an array of zeroes.
This works for small array sizes, for example, if rows=256 and cols=10 (as in the example above) the output is as follows:
bdim= (4, 10, 1) , gdim= (256, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]]
However, if I increase rows=512, I get the following output:
bdim= (2, 10, 1) , gdim= (512, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 2. 2. 2. ..., 2. 2. 2.]
[ 2. 2. 2. ..., 2. 2. 2.]
[ 2. 2. 2. ..., 2. 2. 2.]]
Indicating that the multiplication is happening twice for some elements of the array.
However, if I force the block dimensions to bdim = (1,1,1), the problem no longer occurs and I get the following (correct) output for the larger array size:
bdim= (1, 1, 1) , gdim= (512, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]]
I don't understand this. What is happening here which means that this method of defining the block and grid dimensions is no longer appropriate as the array size is increased? Also, if block has dimensions (1,1,1) does this mean that the calculation is being performed serially?
Thanks in advance for any pointers and help!
You operate on 2D grid of 2D blocks. In your kernel you seem to assume that gridDim.x would return number of threads in x dimension of a grid.
__global__ void MatMult(float *a, float b)
{
const int i = threadIdx.x + blockDim.x * blockIdx.x;
const int j = threadIdx.y + blockDim.y * blockIdx.y;
int Idx = i + j*gridDim.x;
a[Idx] *= b;
}
The gridDim.x returns number of blocks r x direction of grid, not number of threads. In order to obtain number of threads in given direction you should multiply number of threads in a block with number of blocks in a grid in the same direction:
int Idx = i + j * blockDim.x * gridDim.x