I am studying image processing with NumPy and have run into a problem with filtering by convolution.
I would like to convolve a grayscale image, that is, convolve a 2D array with a smaller 2D array.
Does anyone have an idea for refining my method?
I know that SciPy provides convolve2d, but I want to build a convolve2d using only NumPy.
What I have done
First, I built a 2D array of submatrices.
a = np.arange(25).reshape(5,5) # original matrix
submatrices = np.array([
[a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]],
[a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]],
[a[2:,:-2], a[2:,1:-1], a[2:,2:]]])
The submatrices array looks complicated, but each of its nine entries is just a 3x3 window of the original matrix, shifted to line up with one filter element.
Next, I multiplied each submatrix by the corresponding filter element.
conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
multiplied_subs = np.einsum('ij,ijkl->ijkl',conv_filter,submatrices)
and summed them.
np.sum(np.sum(multiplied_subs, axis = -3), axis = -3)
#array([[ 6, 7, 8],
# [11, 12, 13],
# [16, 17, 18]])
Thus this procedure can be wrapped up as my own convolve2d.
def my_convolve2d(a, conv_filter):
    # nine shifted 3x3 windows of a, one per filter element
    submatrices = np.array([
        [a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]],
        [a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]],
        [a[2:,:-2], a[2:,1:-1], a[2:,2:]]])
    multiplied_subs = np.einsum('ij,ijkl->ijkl',conv_filter,submatrices)
    return np.sum(np.sum(multiplied_subs, axis = -3), axis = -3)
However, I find this my_convolve2d troublesome for three reasons:
1. Generating the submatrices is awkward: the code is hard to read and only works when the filter is 3x3.
2. The submatrices array seems too big, since it is roughly 9 times larger than the original matrix.
3. The summing is a little unintuitive. Simply put, ugly.
Thank you for reading this far.
A kind of update: I wrote a convolve3d for myself. I release it into the public domain.
def convolve3d(img, kernel):
    # calc the size of the array of submatrices
    sub_shape = tuple(np.subtract(img.shape, kernel.shape) + 1)
    # alias for the function
    strd = np.lib.stride_tricks.as_strided
    # make an array of submatrices
    submatrices = strd(img,kernel.shape + sub_shape,img.strides * 2)
    # sum the submatrices and kernel
    convolved_matrix = np.einsum('hij,hijklm->klm', kernel, submatrices)
    return convolved_matrix
You could generate the subarrays using as_strided:
import numpy as np
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
sub_shape = (3,3)
view_shape = tuple(np.subtract(a.shape, sub_shape) + 1) + sub_shape
strides = a.strides + a.strides
sub_matrices = np.lib.stride_tricks.as_strided(a,view_shape,strides)
To get rid of your second "ugly" sum, alter your einsum so that the output array only has k and l. Dropping i and j from the output makes einsum sum over them, which is your second summation.
conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
m = np.einsum('ij,ijkl->kl',conv_filter,sub_matrices)
# [[ 6 7 8]
# [11 12 13]
# [16 17 18]]
Cleaned up using as_strided and @Crispin's einsum trick from above. This builds the filter size into the expanded shape, and should even allow non-square inputs as long as the shapes are compatible.
def conv2d(a, f):
    # shape of the strided view: filter shape followed by output shape
    s = f.shape + tuple(np.subtract(a.shape, f.shape) + 1)
    strd = np.lib.stride_tricks.as_strided
    subM = strd(a, shape = s, strides = a.strides * 2)
    return np.einsum('ij,ijkl->kl', f, subM)
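As a hedged variation on the same idea, the windows can also be built with sliding_window_view (available in NumPy 1.20 and later), which returns a read-only view and so avoids the out-of-bounds risks of a hand-written as_strided call. The name conv2d_windows and the flip parameter are made up for this sketch. Note that the einsum over unflipped windows is strictly a cross-correlation; flipping the kernel gives a true convolution, and the two agree for symmetric filters like the one in the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_windows(a, f, flip=True):
    """'Valid' 2D convolution of a with f (hypothetical helper)."""
    # True convolution flips the kernel; flip=False gives cross-correlation.
    k = f[::-1, ::-1] if flip else f
    # windows[y, x] is the f.shape patch of a whose top-left corner is (y, x)
    windows = sliding_window_view(a, f.shape)
    # multiply every patch by the kernel and sum over the patch axes
    return np.einsum('ij,klij->kl', k, windows)

a = np.arange(25).reshape(5, 5)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
result = conv2d_windows(a, sharpen)
print(result)
```

For the 5x5 example this reproduces the 3x3 result shown in the question.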
You can also use fft (one of the faster methods to perform convolutions)
from numpy.fft import fft2, ifft2
import numpy as np
def fft_convolve2d(x,y):
    """ 2D convolution, using FFT"""
    fr = fft2(x)
    fr2 = fft2(np.flipud(np.fliplr(y)))
    m,n = fr.shape
    cc = np.real(ifft2(fr*fr2))
    # np.roll needs an integer shift, so use floor division
    cc = np.roll(cc, -m//2+1,axis=0)
    cc = np.roll(cc, -n//2+1,axis=1)
    return cc
https://gist.github.com/thearn/5424195
You must pad the filter to the same size as the image (place it in the middle of a zeros_like matrix).
cheers,
Dan
Check out all the convolution methods and their respective performance here:
https://laurentperrinet.github.io/sciblog/posts/2017-09-20-the-fastest-2d-convolution-in-the-world.html
Also, I found the below code snippet to be simpler.
import numpy as np
from numpy.fft import fft2, ifft2
def np_fftconvolve(A, B):
    # B is zero-padded to A's shape, so the result is a circular convolution
    return np.real(ifft2(fft2(A)*fft2(B, s=A.shape)))
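Because the FFT product is a circular convolution, the snippet above wraps around at the image borders. A minimal sketch of a linear ("full") variant, assuming real-valued inputs, pads both arrays to the combined size before transforming. The function names here are invented for illustration, and the direct loop is only there as a cross-check.

```python
import numpy as np

def fft_convolve2d_full(a, b):
    """Linear 'full' 2D convolution via real FFTs (hypothetical helper)."""
    # Padding to the combined size prevents wrap-around.
    s = (a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1)
    return np.fft.irfft2(np.fft.rfft2(a, s=s) * np.fft.rfft2(b, s=s), s=s)

def direct_convolve2d_full(a, b):
    """Reference implementation: one shifted, scaled copy of a per tap of b."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += b[i, j] * a
    return out

a = np.arange(12, dtype=float).reshape(3, 4)
b = np.array([[1.0, 2.0], [3.0, 4.0]])
fast = fft_convolve2d_full(a, b)
slow = direct_convolve2d_full(a, b)
print(np.allclose(fast, slow))
```

Cropping the central region of the full result gives the "same"-sized output if that is what you need.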
Related
I have two 2D arrays and I am trying to convolve them row by row, along axis 1. np.convolve doesn't provide an axis argument. The answer here convolves one 2D array with a 1D array using np.apply_along_axis, but it cannot be directly applied to my use case. The question here doesn't have an answer.
MWE is as follows.
import numpy as np
a = np.random.randint(0, 5, (2, 5))
"""
a=
array([[4, 2, 0, 4, 3],
[2, 2, 2, 3, 1]])
"""
b = np.random.randint(0, 5, (2, 2))
"""
b=
array([[4, 3],
[4, 0]])
"""
# What I want
c = np.convolve(a, b, axis=1) # axis is not supported as an argument
"""
c=
array([[16, 20, 6, 16, 24, 9],
[ 8, 8, 8, 12, 4, 0]])
"""
I know I can do it using np.fft.fft, but it seems like an unnecessary step to get a simple thing done. Is there a simple way to do this? Thanks.
Why not just do a list comprehension with zip?
>>> np.array([np.convolve(x, y) for x, y in zip(a, b)])
array([[16, 20, 6, 16, 24, 9],
[ 8, 8, 8, 12, 4, 0]])
>>>
Or with scipy.signal.convolve2d:
>>> from scipy.signal import convolve2d
>>> convolve2d(a, b)[[0, 2]]
array([[16, 20, 6, 16, 24, 9],
[ 8, 8, 8, 12, 4, 0]])
>>>
One possibility could be to manually go the way to the Fourier spectrum, and back:
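n = a.shape[1] + b.shape[1] - 1  # length of the full linear convolution
# round before casting: astype(int) alone truncates values like 15.999999
np.rint(np.fft.ifft(np.fft.fft(a, n=n) * np.fft.fft(b, n=n)).real).astype(int)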
# array([[16, 20, 6, 16, 24, 9],
# [ 8, 8, 8, 12, 4, 0]])
Would it be considered too ugly to loop over the orthogonal dimension? That would not add much overhead unless the main dimension is very short. Creating the output array ahead of time ensures that no memory needs to be copied about.
def convolvesecond(a, b):
    N1, L1 = a.shape
    N2, L2 = b.shape
    if N1 != N2:
        raise ValueError("Not compatible")
    c = np.zeros((N1, L1 + L2 - 1), dtype=a.dtype)
    for n in range(N1):
        c[n,:] = np.convolve(a[n,:], b[n,:], 'full')
    return c
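When there are many rows but the second array is short, a possible alternative is to loop over the taps of b rather than over the rows, so each iteration is one vectorized update of the whole output. This is only a sketch; convolve_rows is a made-up name, and it assumes both inputs are 2D with the same number of rows.

```python
import numpy as np

def convolve_rows(a, b):
    """Row-wise 'full' convolution, looping over the columns of b."""
    n, la = a.shape
    nb, lb = b.shape
    if n != nb:
        raise ValueError("Not compatible")
    out = np.zeros((n, la + lb - 1), dtype=np.result_type(a, b))
    for j in range(lb):
        # tap j of every row of b scales that row of a, shifted by j
        out[:, j:j + la] += b[:, j:j + 1] * a
    return out

a = np.array([[4, 2, 0, 4, 3],
              [2, 2, 2, 3, 1]])
b = np.array([[4, 3],
              [4, 0]])
c = convolve_rows(a, b)
print(c)
```

With the arrays from the question this gives the same rows as calling np.convolve on each pair.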
For the generic case (convolving along the k-th axis of a pair of multidimensional arrays), I would resort to a pair of helper functions I always keep on hand to convert multidimensional problems to the basic 2d case:
def semiflatten(x, d=0):
    '''SEMIFLATTEN - Permute and reshape an array to convenient matrix form
    y, s = SEMIFLATTEN(x, d) permutes and reshapes the arbitrary array X so
    that input dimension D (default: 0) becomes the second dimension of the
    output, and all other dimensions (if any) are combined into the first
    dimension of the output. The output is always 2-D, even if the input is
    only 1-D.
    If D<0, dimensions are counted from the end.
    Return value S can be used to invert the operation using SEMIUNFLATTEN.
    This is useful to facilitate looping over arrays with unknown shape.'''
    x = np.array(x)
    shp = x.shape
    ndims = x.ndim
    if d<0:
        d = ndims + d
    perm = list(range(ndims))
    perm.pop(d)
    perm.append(d)
    y = np.transpose(x, perm)
    # Y has the original D-th axis last, preceded by the other axes, in order
    rest = np.array(shp, int)[perm[:-1]]
    y = np.reshape(y, [np.prod(rest), y.shape[-1]])
    return y, (d, rest)
def semiunflatten(y, s):
    '''SEMIUNFLATTEN - Reverse the operation of SEMIFLATTEN
    x = SEMIUNFLATTEN(y, s), where Y, S are as returned from SEMIFLATTEN,
    reverses the reshaping and permutation.'''
    d, rest = s
    x = np.reshape(y, np.append(rest, y.shape[-1]))
    perm = list(range(x.ndim))
    perm.pop()
    perm.insert(d, x.ndim-1)
    x = np.transpose(x, perm)
    return x
(Note that transpose returns a view, and reshape returns a view whenever the memory layout allows it and copies otherwise, so these functions are usually very fast.)
With those, the generic form can be written as:
def convolvealong(a, b, axis=-1):
    a, S1 = semiflatten(a, axis)
    b, S2 = semiflatten(b, axis)
    c = convolvesecond(a, b)
    return semiunflatten(c, S1)
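On the remark about copies: whether the helpers avoid copying depends on which axis is moved. A small check with np.shares_memory, using an arbitrary 3-D example of my own choosing, suggests that the transpose is always a view while the following reshape has to copy when the moved axis sits in the middle.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# Move axis 1 to the end, as semiflatten(x, d=1) would do.
moved = np.transpose(x, (0, 2, 1))      # shape (2, 4, 3)
flat = moved.reshape(8, 3)              # merge the two leading axes

transpose_is_view = np.shares_memory(x, moved)
reshape_is_view = np.shares_memory(x, flat)
print(transpose_is_view, reshape_is_view)
```

The copy is still a single fast pass over the data, so this rarely matters in practice, but it does mean writes to the flattened array are not guaranteed to reach the original.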
Indexing a NumPy array with slices creates a view that is lightweight (it doesn't copy data) and allows assigning to elements of the original array. For example:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
a[2:4] = [6, 7]
print(a)
# [1 2 6 7 5]
But what about multiple views: how do I concatenate them to create a bigger view that still assigns to the original array? E.g., for an imaginary function concatenate_views(...):
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
concatenate_views((a[1:3], a[4:6], a[7:9])) = [11, 12, 13, 14, 15, 16]
print(a)
# should print [1 11 12 4 13 14 7 15 16 10]
Of course, I could convert each slice to a list of indexes and concatenate those lists, which gives all the indexes of the concatenated views and lets me build a combined view from them. But this is not what I want. I want NumPy to keep the slice representation, because the slices can be very long and it would be inefficient to convert and store them as indexes. I want NumPy to be aware of the underlying slices of all concatenated views so that internally it simply loops over the slice ranges.
It would also be nice to generalize the problem: not only concatenating views, but also forming an arbitrary tree of slicing/indexing operations, e.g. concatenate views, then apply some slicing, then indexing, then slicing, then concatenate again, including N-dimensional slicing/indexing. In other words, all the fancy things that can be done with a single un-concatenated view.
The main point of concatenated views is efficiency. Of course, we can represent any view or slicing operation by an N-D array of integer indexes (coordinates, like meshgrid) and then use this array to index the source array. But if NumPy could keep the source set of slices instead of an array of integers, then first, it would be lightweight (much less memory consumption), and second, instead of reading indexes from memory, NumPy could iterate through each slice more efficiently in its internal compiled loops.
With a concatenated view, I want to be able to apply any NumPy operation, like np.mean(...), to the combined view efficiently.
Full procedure of concatenating views of N-D slicing based on 2D example is described down below:
1 Step described below:
2D array slicing using 3 slices for each axis
a,b,c - sizes of "slices" along axis 0
d,e,f - sizes of "slices" along axis 1
Each "slice" - is either slice(start, stop, step) or 1D array of integer indexes
d e f
.......
a.0.1.2.
.......
b.3.4.5.
.......
c.6.7.8.
.......
Above 0 1 2 3 4 5 6 7 8 mean not a single integer but some 2D sub-array.
Dots (`.`) also mean some 2D sub-arrays.
Sub-views shapes:
0:(a, d), 1:(a, e), 2:(a, f)
3:(b, d), 4:(b, e), 5:(b, f)
6:(c, d), 7:(c, e), 8:(c, f)
Final aggregated (concatenated) view shape:
((a + b + c), (d + e + f))
containing 2D array
012
345
678
There can be more than one Step; each next Step applies a new sequence of slicing
to the final view obtained in the previous Step. Each Step has a different set of slice
sizes and a different number of slices per dimension.
In general each next Step reduces the total number of elements, except when
slices or indexes overlap, in which case you may get more elements, with duplicates.
You can use np.r_ to concatenate slice objects and assign back to the indexed array:
a[np.r_[1:3, 4:6, 7:9]] = [11, 12, 13, 14, 15, 16]
print(a)
array([ 1, 11, 12, 4, 13, 14, 7, 15, 16, 10])
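It may help to see what np.r_ actually produces here. As far as I can tell it expands the slices into a plain integer index array, so assignment through it reaches the original array, but reading through it returns a copy rather than a view. A short check:

```python
import numpy as np

a = np.arange(1, 11)
idx = np.r_[1:3, 4:6, 7:9]
print(idx)            # the slices are expanded into explicit indexes

# Assignment through an index array writes into a itself.
a[idx] = [11, 12, 13, 14, 15, 16]

# Reading through an index array returns a copy, not a view.
picked = a[idx]
picked[0] = -1        # does not touch a
print(a)
```

So this answers the assignment part of the question, but it does store one integer per selected element, which is the cost the question hoped to avoid.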
Update
Based on your update, I think you might want something like:
from itertools import islice
it = iter([11, 12, 13, 14, 15, 16])
for s in slice(1,3), slice(4,6), slice(7,9):
    a[s] = list(islice(it, s.stop-s.start))
print(a)
array([ 1, 11, 12, 4, 13, 14, 7, 15, 16, 10])
You can only concatenate views if they are contiguous in terms of dtypes, strides and offsets. Here is one way to check. This way is likely incomplete, but it illustrates the gist of it. Basically, if the views share a base, and the strides and offsets are aligned so that they are on the same grid, you can concatenate.
In the spirit of TDD, I will work with the following example:
x = np.arange(24).reshape(4, 6)
We (or at least I) want the following to be concatenatable:
a, b = x[:, :4], x[:, 4:] # Basic case
a, b = x[:, :4:2], x[:, 4::2] # Strided
a, b = x[:, :4:2], x[:, 2::2] # Strided overlapping
a, b = x[1:2, 1:4], x[2:4, 1:4] # Stacked
# Completely reshaped:
a, b = x.ravel()[:12].reshape(3, 4), x.ravel()[12:].reshape(3, 4)
# Equivalent to
a, b = x[:2, :].reshape(3, 4), x[2:, :].reshape(3, 4)
We do not want the following to be concatenatable:
a, b = x, np.arange(12).reshape(2, 6) # Buffer mismatch
a, b = x[0, :].view(np.uint), x[1:, :] # Dtype mismatch
a, b = x[:, ::2], x[:, ::3] # Stride mismatch
a, b = x[:, :4], x[:, 4::2] # Stride mismatch
a, b = x[:, :3], x[:, 4:] # Overlap mismatch
a, b = x[:, :4:2], x[:, 3::2] # Overlap mismatch
a, b = x[:-1, :-1], x[1:, 1:] # Overlap mismatch
a, b = x[:-1, :4], x[:, 4:] # Shape mismatch
The following could be interpreted as concatenatable, but won't be in this case:
a, b = x, x[1:-1, 1:-1]
The idea is that everything (dtype, strides, offsets) has to match exactly. Only one axis offset is allowed to be different between the views, as long as it is no more than one stride away from the edge of the other view. The only possible exception is when one view is fully contained in another, but we will ignore this scenario here. Generalizing to multiple dimensions should be pretty simple if we use array operations on the offsets and strides.
def cat_slices(a, b):
    if a.base is not b.base:
        raise ValueError('Buffer mismatch')
    if a.dtype != b.dtype: # I don't think you can use `is` here in general
        raise ValueError('Dtype mismatch')
    sa = np.array(a.strides)
    sb = np.array(b.strides)
    if (sa != sb).any():
        raise ValueError('Stride mismatch')
    # Note: in NumPy 2.0+, byte_bounds lives in np.lib.array_utils
    oa = np.byte_bounds(a)[0]
    ob = np.byte_bounds(b)[0]
    if oa > ob:
        a, b = b, a
        oa, ob = ob, oa
    offset = ob - oa
    # Check if you can get to `b` from a by moving along exactly one axis
    # This part works consistently for arrays with internal overlap
    div = np.zeros_like(sa)
    mod = np.ones_like(sa) # Use ones to auto-flag divide-by zero
    np.divmod(offset, sa, where=sa.astype(bool), out=(div, mod))
    zeros = np.flatnonzero((mod == 0) & (div >= 0) & (div <= a.shape))
    if not zeros.size:
        raise ValueError('Overlap mismatch')
    axis = zeros[0]
    check_shape = np.equal(a.shape, b.shape)
    check_shape[axis] = True
    if not check_shape.all():
        raise ValueError('Shape mismatch')
    shape = list(a.shape)
    shape[axis] = b.shape[axis] + div[axis]
    start = np.byte_bounds(a)[0] - np.byte_bounds(a.base)[0]
    return np.ndarray(shape, dtype=a.dtype, buffer=a.base, offset=start, strides=a.strides)
Some things that this function does not handle:
Merging flags
Broadcasting
Handling arrays that are fully contained within each other but with multi-axis offsets
Negative strides
You can, however, check that it returns the expected views (and errors) for all the cases shown above. In a more production-y version, I could envision this enhancing np.concatenate, so for failed cases, it would just copy data instead of raising an error.
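To make the basic case concrete without the full checking logic, here is a minimal sketch that merges two adjacent column slices by hand. It assumes, without verifying, that b starts exactly where a ends along axis 1 and that both share strides; as_strided is used only for illustration and performs no bounds checking of its own.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(24).reshape(4, 6)
a, b = x[:, :4], x[:, 4:]               # the "basic case" from the list above

# Assumption: b continues a along axis 1, so widening a's shape while
# keeping its strides yields one view covering both slices.
merged = as_strided(a, shape=(a.shape[0], a.shape[1] + b.shape[1]),
                    strides=a.strides)

merged[0, 5] = -1                       # writes through to x (and to b)
print(np.shares_memory(merged, x), x[0, 5], b[0, 1])
```

Getting the shape wrong here would read or write outside the buffer, which is why a real implementation needs checks like the ones in cat_slices.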
Ok, so I wrote some code for vectorizing a symmetric matrix. It just takes the unique elements and turns them into a 1D vector, while also multiplying the off-diagonal elements by sqrt(2):
def vectorize_mat(mat):
    assert mat.shape[0] == mat.shape[1], 'Matrix is not square'
    n = int(mat.shape[0])
    vec_len = n*(n+1)//2  # number of unique elements (not used below)
    weight_mat = (np.tri(n,k=-1)*np.sqrt(2))+np.identity(n)
    mask_mat = np.tri(n).astype(bool)
    vec_mat = (mat*weight_mat)[mask_mat]
    return vec_mat
and this works really well, now I'm trying to figure out how to reconstruct the original array from the vector. I've gotten the original matrix dimensions like so:
v = len(vec_mat)
n = isqrt(2*v)
where isqrt() is an integer square root taken from the question "Integer square root in python" (Python 3.8+ also provides math.isqrt),
but I'm struggling with what to do next. I can now reconstruct the weight and mask matrices. So obviously I could vectorize the weight matrix and divide the vector by it, or divide the reconstructed matrix by the weight matrix to undo that step, but it's the reshaping (undoing the boolean indexing) that I don't know how to do. Maybe there's some super simple answer out there, but I can't seem to see it.
To answer your headline question. Indexing - including boolean indexing - can be used for assignment.
Here is an example. Let us first extract the lower triangle using a mask.
>>> a = np.arange(25).reshape(5, 5)
>>> y, x = np.ogrid[:5, :5]
>>> lower = y>=x
>>> b = a[lower]
Now b contains the lower triangle. We can use the same mask to reconstruct the lower triangle and fill the upper triangle symmetrically:
>>> recon = np.empty_like(a)
>>> recon[lower] = b
>>> recon.T[lower] = b
>>> recon
array([[ 0, 5, 10, 15, 20],
[ 5, 6, 11, 16, 21],
[10, 11, 12, 17, 22],
[15, 16, 17, 18, 23],
[20, 21, 22, 23, 24]])
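Putting the pieces together for the weighted case in the question, a round trip might look like the following sketch. The function names are invented, np.tril_indices stands in for the boolean mask (both walk the lower triangle in row-major order), and math.isqrt needs Python 3.8 or later.

```python
import math
import numpy as np

def vectorize_sym(mat):
    """Lower triangle as a vector, off-diagonal entries scaled by sqrt(2)."""
    rows, cols = np.tril_indices(mat.shape[0])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return mat[rows, cols] * weights

def unvectorize_sym(vec):
    """Inverse of vectorize_sym: rebuild the full symmetric matrix."""
    # len(vec) = n(n+1)/2, so n = (sqrt(8*len + 1) - 1) / 2
    n = (math.isqrt(8 * len(vec) + 1) - 1) // 2
    rows, cols = np.tril_indices(n)
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    vals = vec / weights
    mat = np.zeros((n, n), dtype=vals.dtype)
    mat[rows, cols] = vals      # lower triangle
    mat[cols, rows] = vals      # mirror into the upper triangle
    return mat

m = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 5.0, 6.0]])
v = vectorize_sym(m)
restored = unvectorize_sym(v)
print(np.allclose(restored, m))
```

Index-array assignment is used for both triangles here; the boolean-mask assignment shown above works equally well.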
I have an application that has two perspective transforms obtained from two findHomography calls that get applied in succession to a set of points (python):
pts = np.float32([ [758,141],[769,141],[769,146],[758,146] ]).reshape(-1,1,2)
pts2 = cv2.perspectiveTransform(pts, trackingM)
dst = cv2.perspectiveTransform(pts2, updateM)
I would like to combine this into a single transformation. I've tried the following but the transformation is not correct:
M = trackingM * updateM
dst = cv2.perspectiveTransform(pts, M)
How can I combine two matrix transforms into a single transform? For now I'm prototyping in python. A C++ solution in addition to python would be a bonus.
Let us say, we have a series of Perspective Transformation as follows. Let tij be the perspective transform matrix from image_i to image_j
image_1 -- t12 --> image_2 -- t23 --> .... -- tN-1N --> image_N
The point p1 in image_1 would be transformed to point p2 in image_2 as p2 = t12.p1
The point p2 in image_2 would be transformed to point p3 in image_3 as p3 = t23.p2 = t23.t12.p1
Hence pN = tN-1N...t23.t12.p1
Hence the resultant transformation matrix t1N = tN-1N...t23.t12
Hence you can get M = np.dot(updateM, trackingM)
Note that all products are matrix (np.dot) products.
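The ordering can be checked without OpenCV. The sketch below applies 3x3 homographies in plain NumPy (apply_homography is a hypothetical stand-in for cv2.perspectiveTransform, and the two matrices are arbitrary made-up values) and compares the two-step result with the single combined matrix.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]              # divide by w

# Arbitrary example matrices, not taken from the question.
tracking_M = np.array([[1.0, 0.1, 5.0],
                       [0.0, 1.2, -3.0],
                       [1e-4, 0.0, 1.0]])
update_M = np.array([[0.9, 0.0, 2.0],
                     [0.05, 1.0, 1.0],
                     [0.0, 2e-4, 1.0]])
pts = np.array([[758, 141], [769, 141], [769, 146], [758, 146]])

two_step = apply_homography(update_M, apply_homography(tracking_M, pts))
combined = apply_homography(update_M @ tracking_M, pts)
wrong_order = apply_homography(tracking_M @ update_M, pts)
print(np.allclose(two_step, combined), np.allclose(two_step, wrong_order))
```

The transform applied first goes on the right of the product, which is why the combined matrix is update_M @ tracking_M.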
In NumPy, np.multiply(M, N) (or M * N on arrays) is the elementwise product, while np.dot(M, N) is the matrix product for 2D arrays. I think, in your case, you should choose np.dot.
For example:
>>> import numpy as np
>>> x = np.array([[1,2],[3,4]])
## elementwise-product
>>> x*x
array([[ 1, 4],
[ 9, 16]])
>>> np.multiply(x,x)
array([[ 1, 4],
[ 9, 16]])
## dot-product
>>> x.dot(x)
array([[ 7, 10],
[15, 22]])
>>> np.dot(x,x)
array([[ 7, 10],
[15, 22]])
I found a clue to the answer in this post. In essence my original code was doing element-wise multiplication, as pointed out by @Silencer. The trick is to convert the transforms to np.matrix objects before multiplying, so that * means a matrix product:
M = np.matrix(updateM) * np.matrix(trackingM)
dst = cv2.perspectiveTransform(pts, M)
The order of the operands is significant. One can think of the above as applying the updateM transformation to the trackingM transform to match the order stated in the original problem.
I'm pretty new to Python, so I'm doing a project in it. Part of it includes a diffusion across a map. I'm implementing it by going through and making the current tile equal to .2 * the sum of its neighbors n, w, s, e. If I was doing this in C, I'd just do a double for loop over an array, setting arr[i*width + j] from the neighbors at j+1, j-1, i+1, i-1, and have several different arrays that I'd do the same thing for (different qualities of the map I'd be changing). However, I'm not sure if this is really the fastest way in Python. Some people I have asked suggest things like NumPy, but the width probably won't be more than ~200 (so 40-50k elements max) and I wasn't sure if the overhead is worth it. I don't really know any builtin functions to do what I want. Any advice?
edit: This will be very dense i.e. every spot is going to have a non-trivial calculation
This is quite simple to arrange with NumPy. The function np.roll returns a copy of the array, "rolled" in a specified direction.
For example, given the array x,
x=np.arange(9).reshape(3,3)
# array([[0, 1, 2],
# [3, 4, 5],
# [6, 7, 8]])
you can roll the columns to the right with
np.roll(x,shift=1,axis=1)
# array([[2, 0, 1],
# [5, 3, 4],
# [8, 6, 7]])
Using np.roll, boundaries are wrapped like on a torus. If you do not want wrapped boundaries, you could pad the array with an edge of zeros, and reset the edge to zero before every iteration.
import numpy as np
def diffusion(arr):
    while True:
        arr+=0.2*np.roll(arr,shift=1,axis=1) # right
        arr+=0.2*np.roll(arr,shift=-1,axis=1) # left
        arr+=0.2*np.roll(arr,shift=1,axis=0) # down
        arr+=0.2*np.roll(arr,shift=-1,axis=0) # up
        yield arr
N=5
initial=np.random.random((N,N))
for state in diffusion(initial):
    print(state)
    input()  # wait for Enter before the next step (raw_input in Python 2)
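If you want the non-wrapping behaviour and the rule exactly as stated in the question (each tile becomes 0.2 times the sum of its four neighbours in the previous state), one possible sketch pads with zeros and builds the new array from shifted slices instead of updating in place. The name diffuse_step is made up for this example.

```python
import numpy as np

def diffuse_step(arr):
    """One step: each tile = 0.2 * (N + S + W + E) of the previous state."""
    p = np.pad(arr, 1, mode="constant")   # zero border, so nothing wraps
    north = p[:-2, 1:-1]
    south = p[2:, 1:-1]
    west = p[1:-1, :-2]
    east = p[1:-1, 2:]
    return 0.2 * (north + south + west + east)

grid = np.zeros((3, 3))
grid[1, 1] = 1.0
out = diffuse_step(grid)
print(out)
```

Because the result is a new array, every tile is computed from the same previous state, unlike the in-place += version where later updates see earlier ones.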
Use convolution.
from numpy import *
from scipy.signal import convolve2d
mapArr=array(map)
kernel=array([[0 , 0.2, 0],
[0.2, 0, 0.2],
[0 , 0.2, 0]])
diffused=convolve2d(mapArr,kernel,mode='same',boundary='wrap')  # mode='same' keeps the map's shape
Is this for the ants challenge? If so, in the ants context, convolve2d worked ~20 times faster than the loop, in my implementation.
This modification of unutbu's code keeps the global sum of the array constant while diffusing its values:
import numpy as np
def diffuse(arr, d):
    contrib = (arr * d)
    w = contrib / 8.0
    r = arr - contrib
    N = np.roll(w, shift=-1, axis=0)
    S = np.roll(w, shift=1, axis=0)
    E = np.roll(w, shift=1, axis=1)
    W = np.roll(w, shift=-1, axis=1)
    NW = np.roll(N, shift=-1, axis=1)
    NE = np.roll(N, shift=1, axis=1)
    SW = np.roll(S, shift=-1, axis=1)
    SE = np.roll(S, shift=1, axis=1)
    diffused = r + N + S + E + W + NW + NE + SW + SE
    return diffused
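The conservation claim is easy to check numerically. The sketch below is a compact restatement of the same idea (a loop over the eight neighbour offsets rather than eight named arrays; diffuse_conserving is a made-up name) followed by a comparison of the total before and after one step.

```python
import numpy as np

def diffuse_conserving(arr, d):
    """Each cell keeps (1 - d) of its value and gives d/8 to each neighbour."""
    share = arr * d / 8.0
    out = arr * (1.0 - d)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:    # skip the (0, 0) offset, i.e. the cell itself
                out = out + np.roll(share, (dy, dx), axis=(0, 1))
    return out

grid = np.zeros((5, 5))
grid[2, 2] = 1.0
after = diffuse_conserving(grid, 0.8)
print(after.sum(), after[2, 2], after[1, 1])
```

The boundaries still wrap around because of np.roll, which is also what keeps the total exactly constant.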