I am given this matrix and am trying to write a function that builds it for any size n. I am told the height of the matrix is n, but I am not sure about the width.
Below are my code and its output. Is this correct? I am slightly confused by the notation of the matrix itself.
import numpy as np

def buildMatrix(n, a):
    matrix = np.zeros([n, n], dtype=float)
    x_diag, y_diag = np.diag_indices_from(matrix)
    for (x,y) in zip(x_diag, y_diag):
        # main diagonal: -2 above the switch row, -(1 + a) on it, -2*a below it
        if x > (n / 2):
            matrix[x][y] = -2*a
        elif x == (n / 2):
            matrix[x][y] = -(1 + a)
        else:
            matrix[x][y] = -2
        # sub- and super-diagonal entries next to this diagonal element
        if x != n - 1:
            matrix[x + 1][y] = a if x >= (n / 2) else 1
            matrix[x][y + 1] = a if x >= (n / 2) else 1
    return matrix
Output with buildMatrix(5, 2)
[[-2. 1. 0. 0. 0.]
[ 1. -2. 1. 0. 0.]
[ 0. 1. -3. 2. 0.]
[ 0. 0. 2. -4. 2.]
[ 0. 0. 0. 2. -4.]]
Can anyone help me out?
To answer your first question, the matrix has to have a width of n for the matrix-vector product to be defined.
The picture of the matrix is ambiguous about where the switch from -2 to -(1+a) to -2a occurs. In your code, you check x == n/2 to set the switch. This is fine in Python 2 but causes problems in Python 3, where n/2 returns 2.5 for n = 5, so the comparison is never true. It is safer to use x == n//2, since n//2 returns an integer in both Python 2 and Python 3.
For generality, I'm going to assume that the switch happens at row m. The matrix can be built more easily using slicing and the np.diag command.
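import numpy as np

def buildmat(n, m, a):
    diag = np.zeros(n)
    offdiag = np.zeros(n-1)
    # off-diagonals: 1 before the switch row m, a from row m onward
    offdiag[0:m] = 1
    offdiag[m:n-1] = a
    # main diagonal: -2 before row m, -(1+a) at row m, -2*a after it
    diag[0:m] = -2
    diag[m] = -(1+a)
    diag[m+1:n] = -2*a
    matrix = np.diag(diag) + np.diag(offdiag, 1) + np.diag(offdiag, -1)
    return matrix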
Running
buildmat(5, 2, 2)
produces
[[-2. 1. 0. 0. 0.]
[ 1. -2. 1. 0. 0.]
[ 0. 1. -3. 2. 0.]
[ 0. 0. 2. -4. 2.]
[ 0. 0. 0. 2. -4.]]
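As a quick check, here is a minimal sketch of the same idea with the switch row defaulting to n // 2. The name build_tridiagonal and the default for m are assumptions made for illustration; the original picture may place the switch elsewhere.

```python
import numpy as np

def build_tridiagonal(n, a, m=None):
    # Hypothetical helper: m is the switch row, assumed to be n // 2 if not given.
    if m is None:
        m = n // 2  # integer division in both Python 2 and Python 3
    diag = np.full(n, -2.0)      # -2 above the switch row
    diag[m] = -(1.0 + a)         # -(1 + a) on the switch row
    diag[m + 1:] = -2.0 * a      # -2a below it
    offdiag = np.ones(n - 1)     # 1 before the switch row
    offdiag[m:] = a              # a from the switch row onward
    return np.diag(diag) + np.diag(offdiag, 1) + np.diag(offdiag, -1)

result = build_tridiagonal(5, 2)
print(result)
```

For n = 5 and a = 2 this should reproduce the matrix shown above, and the result is symmetric because the same off-diagonal vector is used on both sides.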
Related
I am currently trying to create a sparse matrix that will look like this.
[[ 50. -25. 0. 0.]
[-25. 50. -25. 0.]
[ 0. -25. 50. -25.]
[ 0. 0. -25. 50.]]
But when I run it, I keep getting the ValueError
'data array must have rank 2' for my data array.
I am positive it is a problem with my B variable. I have tried several things but nothing is working. Any advice?
import numpy as np
from scipy.sparse import spdiags

def sparse(a,b,N):
    h = (b-a)/(N+1)
    e = np.ones([N,1])/h**2
    B = np.array([e, -2*e, e])
    diags = np.array([-1,0,1])
    A = spdiags(B,diags,N,N).toarray()
    return A
print(sparse(0,1,4))
Just change to this:
import numpy as np
from scipy.sparse import spdiags
def sparse(a, b, N):
    h = (b - a) / (N + 1)
    e = np.ones(N) / h ** 2
    diags = np.array([-1, 0, 1])
    A = spdiags([-1 * e, 2 * e, -1 * e], diags, N, N).toarray()
    return A
print(sparse(0, 1, 4))
Output
[[ 50. -25. 0. 0.]
[-25. 50. -25. 0.]
[ 0. -25. 50. -25.]
[ 0. 0. -25. 50.]]
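The main change is replacing
e = np.ones([N,1])/h**2
with
e = np.ones(N) / h ** 2
With the [N,1] shape, stacking three such columns gives B the shape (3, N, 1), which is rank 3, while spdiags expects a rank-2 data array with one row per diagonal. The signs are also flipped so that the main diagonal is positive, as in your target matrix.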
Note that toarray transforms the sparse matrix into a dense one, from the documentation:
Return a dense ndarray representation of this matrix.
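The shape difference can be seen without scipy. The sketch below uses only NumPy and assumes Python 3 division; the variable names (column, flat, bad, good, dense) are made up for illustration, and the dense matrix is built with np.diag purely as a cross-check.

```python
import numpy as np

N = 4
h = (1 - 0) / (N + 1)

# Original attempt: each diagonal is an (N, 1) column, so the stack is rank 3.
column = np.ones([N, 1]) / h ** 2
bad = np.array([column, -2 * column, column])

# Fixed version: each diagonal is a flat length-N vector, so the stack is rank 2.
flat = np.ones(N) / h ** 2
good = np.array([-flat, 2 * flat, -flat])

print(bad.shape, good.shape)  # (3, 4, 1) (3, 4)

# Dense cross-check of the target matrix, built with np.diag instead of spdiags.
dense = np.diag(2 * flat) - np.diag(flat[:-1], 1) - np.diag(flat[:-1], -1)
print(dense)
```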
I would like to build a locally connected weight matrix that represents a locally connected neural network in pure python/numpy without deep learning frameworks like Torch or TensorFlow.
The weight matrix is a non-square 2D matrix with the dimension (number_input, number_output). (an autoencoder in my case; input>hidden)
So the function I would like to build takes the matrix dimensions and the size of the receptive field (the number of local connections) and returns the associated weight matrix. I have already created a function like this, but for an input size of 8 and an output size of 4 (and RF = 4) my function outputs:
[[ 0.91822845 0. 0. 0. ]
[-0.24264655 -0.54754138 0. 0. ]
[ 0.55617366 0.12832513 -0.28733965 0. ]
[ 0.27993286 -0.33150324 0.06994107 0.61184121]
[ 0. 0.04286912 -0.20974503 -0.37633903]
[ 0. 0. -0.10386762 0.33553009]
[ 0. 0. 0. 0.09562682]
[ 0. 0. 0. 0. ]]
but I would like :
[[ 0.91822845 0. 0. 0. ]
[-0.24264655 -0.54754138 0. 0. ]
[ 0.55617366 0.12832513 0. 0. ]
[ 0 -0.33150324 0.06994107 0 ]
[ 0. 0.04286912 -0.20974503 0. ]
[ 0. 0. -0.10386762 0.33553009]
[ 0. 0. 0.11581854 0.09562682]
[ 0. 0. 0. 0.03448418]]
Here's my python code :
import numpy as np
def local_weight(input_size, output_size, RF):
    input_range = 1.0 / input_size ** (1/2)
    w = np.zeros((input_size, output_size))
    for i in range(0, RF):
        for j in range(0, output_size):
            w[j+i, j] = np.random.normal(loc=0, scale=input_range, size=1)
    return w
print(local_weight(8, 4, 4))
I look forward to your response!
The trick is to add a small pad so that the limits are easier to handle.
Then you must define the step taken along the input for each output unit (roughly the ratio of input size to output size). Once this is done, you just have to fill in each window and then remove the pad.
import math
import numpy as np
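def local_weight(input_size, output_size, RF):
    input_range = 1.0 / input_size ** (1/2)
    # radius of the receptive field, used as padding on both ends
    padding = ((RF - 1) // 2)
    w = np.zeros(shape=(input_size + 2*padding, output_size))
    # distance between the starts of consecutive windows (needs output_size > 1)
    step = float(w.shape[0] - RF) / (output_size - 1)
    for i in range(output_size):
        j = int(math.ceil(i * step))
        j_next = j + RF
        w[j:j_next, i] = np.random.normal(loc=0, scale=input_range, size=(j_next - j))
    # drop the padding rows; an explicit end index also works when padding == 0
    return w[padding:padding + input_size, :]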
I hope that is what you are looking for.
EDIT:
I think my first implementation was misguided, so I reimplemented the function. Step by step:
Calculate the radius of the receptive field (the padding).
Determine the size of W, including the padding.
Calculate the step over the padded height minus the receptive field, so that every window stays inside the array.
Fill in the weights for each window.
Remove the padding.
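To make the placement of the windows visible, the steps above can be sketched with a boolean mask instead of random weights. This is only an illustration: local_mask is a hypothetical name, it assumes output_size > 1, and it slices with an explicit end index so that a padding of 0 also works.

```python
import math
import numpy as np

def local_mask(input_size, output_size, rf):
    # Hypothetical helper: True where a weight would be placed.
    padding = (rf - 1) // 2                      # radius of the receptive field
    padded = np.zeros((input_size + 2 * padding, output_size), dtype=bool)
    # Distance between window starts; assumes output_size > 1.
    step = float(padded.shape[0] - rf) / (output_size - 1)
    for col in range(output_size):
        start = int(math.ceil(col * step))
        padded[start:start + rf, col] = True     # one window per output unit
    return padded[padding:padding + input_size, :]  # remove the padding rows

mask = local_mask(8, 4, 4)
print(mask.astype(int))
```

For an input size of 8, an output size of 4 and RF = 4, the columns cover rows 0-2, 1-4, 3-6 and 5-7, which is the pattern of non-zero entries in the matrix the question asks for.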
First of all, I work with byte arrays (>= 400x400x1000 bytes).
I wrote a small function which can insert a multidimensional array (or a fraction of) into another one by indicating an offset. This works if the embedded array is smaller than the embedding array (case A). Otherwise the embedded array is truncated (case B).
case A) Inserting a 3x3 into a 5x5 matrix with offset 1,1 would look like this.
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 1. 1. 0.]
[ 0. 0. 0. 0. 0.]]
case B) If the offsets exceed the dimensions of the embedding matrix, the smaller array is truncated. E.g. a (-1,-1) offset would result in this.
[[ 1. 1. 0. 0. 0.]
[ 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
case C) Now, instead of truncating the embedded array, I want to extend the embedding array (by zeroes) if the embedded array is either bigger than the embedding array or the offsets enforce it (e.g. case B). Is there a smart way with numpy or scipy to solve this?
[[ 1. 1. 1. 0. 0. 0.]
[ 1. 1. 1. 0. 0. 0.]
[ 1. 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]]
Actually I work with 3D arrays, but for simplicity I wrote an example for 2D arrays. Current source:
import numpy as np
import nibabel as nib
def addAtPos(mat_bigger, mat_smaller, xyz_coor):
    size_sm_x, size_sm_y = np.shape(mat_smaller)
    size_gr_x, size_gr_y = np.shape(mat_bigger)
    start_gr_x, start_gr_y = xyz_coor
    start_sm_x, start_sm_y = 0,0
    end_x, end_y = (start_gr_x + size_sm_x), (start_gr_y + size_sm_y)
    print(size_sm_x, size_sm_y)
    print(size_gr_x, size_gr_y)
    print(end_x, end_y)
    if start_gr_x < 0:
        start_sm_x = -start_gr_x
        start_gr_x = 0
    if start_gr_y < 0:
        start_sm_y = -start_gr_y
        start_gr_y = 0
    if end_x > size_gr_x:
        size_sm_x = size_sm_x - (end_x - size_gr_x)
        end_x = size_gr_x
    if end_y > size_gr_y:
        size_sm_y = size_sm_y - (end_y - size_gr_y)
        end_y = size_gr_y
    # copy all or a chunk (if offset is small/big enough) of the smaller matrix into the bigger matrix
    mat_bigger[start_gr_x:end_x, start_gr_y:end_y] = mat_smaller[start_sm_x:size_sm_x, start_sm_y:size_sm_y]
    return mat_bigger
a_gr = np.zeros([5,5])
a_sm = np.ones([3,3])
a_res = addAtPos(a_gr, a_sm, [-2,1])
#print (a_gr)
print (a_res)
Actually there is an easier way to do it.
For your first example of a 3x3 array embedded in a 5x5 one, you can do it with something like:
A = np.array([[1,1,1], [1,1,1], [1,1,1]])
(N, M) = A.shape
B = np.zeros(shape=(N + 2, M + 2))
B[1:-1, 1:-1] = A
By playing with slicing you can select a subset of A and insert it anywhere within a contiguous block of B.
Hope it helps! ;-)
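For case C from the question (growing the embedding array instead of truncating), one possible approach is to zero-pad the bigger array first and then assign by slice. The sketch below is an assumption about what is wanted, not a drop-in replacement for addAtPos: the name add_at_pos_growing is made up, it overwrites the target region rather than summing, and it returns a new array. It works for any number of dimensions because it only uses shapes and offsets.

```python
import numpy as np

def add_at_pos_growing(big, small, offset):
    # Hypothetical helper: grow `big` with zeros so `small` fits at `offset`.
    offset = np.asarray(offset)
    big_shape = np.array(big.shape)
    small_shape = np.array(small.shape)
    # Zeros needed in front of index 0 and behind the last index, per axis.
    pad_before = np.maximum(-offset, 0)
    pad_after = np.maximum(offset + small_shape - big_shape, 0)
    grown = np.pad(big, list(zip(pad_before, pad_after)), mode="constant")
    # After padding in front, the insert position shifts by pad_before.
    start = offset + pad_before
    region = tuple(slice(s, s + n) for s, n in zip(start, small_shape))
    grown[region] = small
    return grown

result = add_at_pos_growing(np.zeros((5, 5)), np.ones((3, 3)), (-1, -1))
print(result)
```

With a (-1,-1) offset the 5x5 array grows to 6x6 and the full 3x3 block sits in the top-left corner, as in the case C picture. With an offset of (1,1) nothing needs to grow and the result is the same as case A.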
I'm very new to GPU programming and pyCUDA and have a pretty fundamental gap in my knowledge. I have spent quite a bit of time searching SO, looking at example code and reading supporting documentation for CUDA/pyCUDA but haven't found much diversity in the explanations and can't get my head around a few things.
I am having trouble correctly defining block and grid dimensions. The code I am currently running is as follows, and aims to do element-wise multiplication of an array a by a float b:
from __future__ import division
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
rows = 256
cols = 10
a = np.ones((rows, cols), dtype=np.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)
b = np.float32(2)
mod = SourceModule("""
__global__ void MatMult(float *a, float b)
{
    const int i = threadIdx.x + blockDim.x * blockIdx.x;
    const int j = threadIdx.y + blockDim.y * blockIdx.y;
    int Idx = i + j*gridDim.x;
    a[Idx] *= b;
}
""")
func = mod.get_function("MatMult")
xBlock = np.int32(np.floor(1024/rows))
yBlock = np.int32(cols)
bdim = (xBlock, yBlock, 1)
dx, mx = divmod(rows, bdim[0])
dy, my = divmod(cols, bdim[1])
gdim = ( (dx + (mx>0)) * bdim[0], (dy + (my>0)) * bdim[1])
print "bdim=",bdim, ", gdim=", gdim
func(a_gpu, b, block=bdim, grid=gdim)
a_doubled = np.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print a_doubled - 2*a
The code should print the block dimensions bdim and the grid dimensions gdim, as well as an array of zeroes.
This works for small array sizes, for example, if rows=256 and cols=10 (as in the example above) the output is as follows:
bdim= (4, 10, 1) , gdim= (256, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]]
However, if I increase rows=512, I get the following output:
bdim= (2, 10, 1) , gdim= (512, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 2. 2. 2. ..., 2. 2. 2.]
[ 2. 2. 2. ..., 2. 2. 2.]
[ 2. 2. 2. ..., 2. 2. 2.]]
Indicating that the multiplication is happening twice for some elements of the array.
However, if I force the block dimensions to bdim = (1,1,1), the problem no longer occurs and I get the following (correct) output for the larger array size:
bdim= (1, 1, 1) , gdim= (512, 10)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]]
I don't understand this. What is happening here which means that this method of defining the block and grid dimensions is no longer appropriate as the array size is increased? Also, if block has dimensions (1,1,1) does this mean that the calculation is being performed serially?
Thanks in advance for any pointers and help!
You operate on a 2D grid of 2D blocks. In your kernel you seem to assume that gridDim.x returns the number of threads in the x dimension of the grid.
__global__ void MatMult(float *a, float b)
{
    const int i = threadIdx.x + blockDim.x * blockIdx.x;
    const int j = threadIdx.y + blockDim.y * blockIdx.y;
    int Idx = i + j*gridDim.x;
    a[Idx] *= b;
}
gridDim.x returns the number of blocks in the x direction of the grid, not the number of threads. To obtain the number of threads in a given direction, multiply the number of threads in a block by the number of blocks in the grid in the same direction:
int Idx = i + j * blockDim.x * gridDim.x;
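The effect of the row stride can be checked on the host without a GPU. The sketch below simulates the index arithmetic of the kernel with NumPy for a small launch; flat_indices and its parameters are hypothetical names, and the block and grid sizes are arbitrary small values chosen for illustration, with the grid given in blocks, which is how CUDA interprets it.

```python
import numpy as np

def flat_indices(block_dim, grid_dim, use_thread_stride):
    # Hypothetical host-side simulation of the kernel's Idx for every thread.
    bx, by = block_dim
    gx, gy = grid_dim
    i = np.arange(bx * gx)   # threadIdx.x + blockDim.x * blockIdx.x
    j = np.arange(by * gy)   # threadIdx.y + blockDim.y * blockIdx.y
    # Row stride: threads per grid row (corrected) or blocks per grid row (original).
    stride = bx * gx if use_thread_stride else gx
    return (i[None, :] + j[:, None] * stride).ravel()

wrong = flat_indices((2, 10), (4, 1), use_thread_stride=False)
right = flat_indices((2, 10), (4, 1), use_thread_stride=True)

print(wrong.size, np.unique(wrong).size)
print(right.size, np.unique(right).size)
```

With the corrected stride every one of the 80 simulated threads gets its own index. With gridDim.x as the stride several threads share an index, which is consistent with some elements being multiplied more than once.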
I need to write a basic function that computes a 2D convolution between a matrix and a kernel.
I have recently got into Python, so I'm sorry for my mistakes.
My dissertation teacher said that I should write one by myself so I can handle it better and to be able to modify it for future improvements.
I have found an example of this function on a website, but I don't understand how the returned values are obtained.
This is the code (from http://docs.cython.org/src/tutorial/numpy.html )
from __future__ import division
import numpy as np
def naive_convolve(f, g):
    # f is an image and is indexed by (v, w)
    # g is a filter kernel and is indexed by (s, t),
    # it needs odd dimensions
    # h is the output image and is indexed by (x, y),
    # it is not cropped
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    # smid and tmid are number of pixels between the center pixel
    # and the edge, ie for a 5x5 filter they will be 2.
    #
    # The output size is calculated by adding smid, tmid to each
    # side of the dimensions of the input image.
    vmax = f.shape[0]
    wmax = f.shape[1]
    smax = g.shape[0]
    tmax = g.shape[1]
    smid = smax // 2
    tmid = tmax // 2
    xmax = vmax + 2*smid
    ymax = wmax + 2*tmid
    # Allocate result image.
    h = np.zeros([xmax, ymax], dtype=f.dtype)
    # Do convolution
    for x in range(xmax):
        for y in range(ymax):
            # Calculate pixel value for h at (x,y). Sum one component
            # for each pixel (s, t) of the filter g.
            s_from = max(smid - x, -smid)
            s_to = min((xmax - x) - smid, smid + 1)
            t_from = max(tmid - y, -tmid)
            t_to = min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    v = x - smid + s
                    w = y - tmid + t
                    value += g[smid - s, tmid - t] * f[v, w]
            h[x, y] = value
    return h
I don't know if this function does the weighted sum from input and filter, because I see no sum here.
I applied this with
kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
file = np.ones((5,5))
naive_convolve(file, kernel)
I got this matrix:
[[ 1. 2. 1. 1. 1. 0. -1.]
[ 2. 3. 1. 1. 1. -1. -2.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 2. 1. -1. -1. -1. -3. -2.]
[ 1. 0. -1. -1. -1. -2. -1.]]
I tried to do a manual calculation (on paper) for the first full iteration of the function and I got 'h[0,0] = 0', because of the matrix product: 'filter[0, 0] * matrix[0, 0]', but the function returns 1. I am very confused with this.
If anyone can help me understand what is going on here, I would be very grateful. Thanks! :)
Yes, that function computes the convolution correctly. You can check this using scipy.signal.convolve2d:
import numpy as np
from scipy.signal import convolve2d
kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
file = np.ones((5,5))
x = convolve2d(file, kernel)
print(x)
Which gives:
[[ 1. 2. 1. 1. 1. 0. -1.]
[ 2. 3. 1. 1. 1. -1. -2.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 2. 1. -1. -1. -1. -3. -2.]
[ 1. 0. -1. -1. -1. -2. -1.]]
It's impossible to know how to explain all this to you since I don't know where to start, and I don't know how all the other explanations aren't working for you. I think, though, that you are doing all of this as a learning exercise so you can figure this out for yourself. From what I've seen on SO, asking big questions on SO is not a substitute for working it through yourself.
Your specific question, why
h[0,0] = 0
in your calculation does not match this matrix, is a good one. In fact, both are correct. The reason for the mismatch is that the output of the convolution doesn't have the mathematical indices specified; instead they are implied. The center, which is mathematically indicated by the indices [0,0], corresponds to x[3,3] in the matrix above.
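To see where each output value comes from, here is a minimal NumPy-only sketch of a full (uncropped) convolution. It is not the tutorial's code: full_convolve is a hypothetical name, and it uses the usual pad-and-flip formulation, where the kernel is flipped in both directions and slid over a zero-padded image. The sum the question was looking for is the np.sum over each window.

```python
import numpy as np

def full_convolve(f, g):
    # Hypothetical helper: full 2D convolution via zero padding and a flipped kernel.
    kh, kw = g.shape
    padded = np.pad(f, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    flipped = g[::-1, ::-1]  # convolution flips the kernel, correlation does not
    out = np.zeros((f.shape[0] + kh - 1, f.shape[1] + kw - 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # Weighted sum of the window under the flipped kernel.
            out[x, y] = np.sum(padded[x:x + kh, y:y + kw] * flipped)
    return out

kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
image = np.ones((5, 5))
full = full_convolve(image, kernel)
print(full)
print(full[0, 0], full[3, 3])
```

In this layout only one image pixel overlaps the kernel at the top-left corner, so full[0, 0] is kernel[0, 0] * image[0, 0] = 1, while at full[3, 3] the whole kernel lies inside the image of ones and the result is the kernel sum, 0.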