Suppose I have an input matrix of shape (batch_size, channels, h, w), in this case (1, 2, 3, 3):
[[[[ 0.,  1.,  2.],
   [ 3.,  4.,  5.],
   [ 6.,  7.,  8.]],

  [[ 9., 10., 11.],
   [12., 13., 14.],
   [15., 16., 17.]]]]
To do a convolution with it, I unroll it to the shape (batch_size, channels * kernel_size * kernel_size, out_h * out_w), which is:
[[[ 0., 1., 3., 4.],
[ 1., 2., 4., 5.],
[ 3., 4., 6., 7.],
[ 4., 5., 7., 8.],
[ 9., 10., 12., 13.],
[10., 11., 13., 14.],
[12., 13., 15., 16.],
[13., 14., 16., 17.]]]
Now I want to get the unrolled matrix back into its original form, which looks like this:
# for demonstration only the first and second column of the unrolled matrix
# the output should be the same shape as the initial matrix -> initialized to zeros
# current column -> [ 0., 1., 3., 4., 9., 10., 12., 13.]
[[[[0+0, 0+1, 0],
[0+3, 0+4, 0],
[0 , 0 , 0]],
[[0+9 , 0+10, 0],
[0+12, 0+13, 0],
[0 , 0 , 0]]]]
# for the next column it would be
# current column -> [ 1., 2., 4., 5., 10., 11., 13., 14.]
[[[[0 , 1+1, 0+2],
[3 , 4+4, 0+5],
[0 , 0 , 0 ]],
[[9 , 10+10, 0+11],
[12 , 13+13, 0+14],
[0 , 0 , 0 ]]]]
You basically put the unrolled elements back in their original places and sum the overlapping parts together.
But now to my question:
How could one implement this as fast as possible using numpy and with as few loops as possible? I already looped through it kernel by kernel, but this approach isn't feasible with larger inputs. I think this could be parallelized quite a bit, but my numpy indexing and overall knowledge isn't good enough to figure out a good solution by myself.
Thanks for reading and have a nice day :)
With numpy, I expect this can be done using numpy.lib.stride_tricks.as_strided. However, I'd suggest that you look at pytorch, which interoperates easily with numpy and has quite efficient primitives for this operation. In your case, the code would look like:
import torch

kernel_size = 2
x = torch.arange(18).reshape(1, 2, 3, 3).to(torch.float32)
unfold = torch.nn.Unfold(kernel_size=kernel_size)
fold = torch.nn.Fold(kernel_size=kernel_size, output_size=(3, 3))

unfolded = unfold(x)
cols = torch.arange(kernel_size ** 2)
for col in range(kernel_size ** 2):
    # Keep only the current column (sliding-window position) of the unfolded
    # matrix, zero the rest, then fold that single column back into the image.
    unfolded_masked = torch.where(col == cols, unfolded, torch.tensor(0.0, dtype=torch.float32))
    refolded = fold(unfolded_masked)
    print(refolded)

Each iteration prints one partial reconstruction:
tensor([[[[ 0., 1., 0.],
[ 3., 4., 0.],
[ 0., 0., 0.]],
[[ 9., 10., 0.],
[12., 13., 0.],
[ 0., 0., 0.]]]])
tensor([[[[ 0., 1., 2.],
[ 0., 4., 5.],
[ 0., 0., 0.]],
[[ 0., 10., 11.],
[ 0., 13., 14.],
[ 0., 0., 0.]]]])
tensor([[[[ 0., 0., 0.],
[ 3., 4., 0.],
[ 6., 7., 0.]],
[[ 0., 0., 0.],
[12., 13., 0.],
[15., 16., 0.]]]])
tensor([[[[ 0., 0., 0.],
[ 0., 4., 5.],
[ 0., 7., 8.]],
[[ 0., 0., 0.],
[ 0., 13., 14.],
[ 0., 16., 17.]]]])
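If you want to stay in numpy, here is a minimal sketch of the reverse (col2im) operation, assuming stride 1 and no padding; fold_numpy and its parameter names are just made up for illustration. It only loops over the kernel offsets, never over the image itself:

import numpy as np

def fold_numpy(unrolled, channels, kernel_size, h, w):
    # unrolled has shape (batch, channels * k * k, out_h * out_w);
    # the result has shape (batch, channels, h, w), with overlaps summed.
    batch_size = unrolled.shape[0]
    out_h = h - kernel_size + 1
    out_w = w - kernel_size + 1
    cols = unrolled.reshape(batch_size, channels, kernel_size, kernel_size, out_h, out_w)
    out = np.zeros((batch_size, channels, h, w), dtype=unrolled.dtype)
    # Each kernel offset (ki, kj) contributes one (out_h, out_w) block,
    # shifted by that offset, so only kernel_size ** 2 iterations are needed.
    for ki in range(kernel_size):
        for kj in range(kernel_size):
            out[:, :, ki:ki + out_h, kj:kj + out_w] += cols[:, :, ki, kj]
    return out

Called on the unrolled matrix from the question with channels=2, kernel_size=2, h=w=3, this reproduces the summed-overlap result sketched above, and the loop count stays at 4 no matter how large the input is.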
I have a numpy array:
arr=np.array([[1., 2., 0.],
[2., 4., 1.],
[1., 3., 2.],
[-1., -2., 4.],
[-1., -2., 5.],
[1., 2., 6.]])
I want to flip the second half of this array upward. I mean I want to have:
flipped_arr=np.array([[-1., -2., 4.],
[-1., -2., 5.],
[1., 2., 6.],
[1., 2., 0.],
[2., 4., 1.],
[1., 3., 2.]])
When I try this code:
fliped_arr=np.flip(arr, 0)
It gives me:
fliped_arr= array([[1., 2., 6.],
[-1., -2., 5.],
[-1., -2., 4.],
[1., 3., 2.],
[2., 4., 1.],
[1., 2., 0.]])
In advance, I do appreciate any help.
You can simply concatenate the rows from the nth row (inclusive) onward at the top and the remaining rows at the bottom, with a row index n of your choice, using np.r_ for instance:
import numpy as np
n = 3
arr_flip_n = np.r_[arr[n:], arr[:n]]
array([[-1., -2., 4.],
[-1., -2., 5.],
[ 1., 2., 6.],
[ 1., 2., 0.],
[ 2., 4., 1.],
[ 1., 3., 2.]])
You can do this by slicing the array at its midpoint:
ans = np.vstack((arr[int(arr.shape[0]/2):], arr[:int(arr.shape[0]/2)]))
To break this down a little:
Find the midpoint of arr by taking its shape (the first entry is the number of rows), dividing by two and converting to an integer:
midpoint = int(arr.shape[0]/2)
The two halves of the array can then be sliced like so:
a = arr[:midpoint]
b = arr[midpoint:]
Then stack them back together, second half first, using np.vstack:
ans = np.vstack((b, a))
(note vstack takes a single argument, which is a tuple containing b and a: (b, a))
You can do this with array slicing and vstack -
arr=np.array([[1., 2., 0.],
[2., 4., 1.],
[1., 3., 2.],
[-1., -2., 4.],
[-1., -2., 5.],
[1., 2., 6.]])
mid = arr.shape[0]//2
np.vstack([arr[mid:],arr[:mid]])
array([[-1., -2., 4.],
[-1., -2., 5.],
[ 1., 2., 6.],
[ 1., 2., 0.],
[ 2., 4., 1.],
[ 1., 3., 2.]])
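As an aside beyond the answers above, np.roll gives the same result in one call, since moving the first half to the bottom is just a circular shift of the rows:

import numpy as np

arr = np.array([[ 1.,  2., 0.],
                [ 2.,  4., 1.],
                [ 1.,  3., 2.],
                [-1., -2., 4.],
                [-1., -2., 5.],
                [ 1.,  2., 6.]])

# Shift the rows up by half the row count; the first half wraps around to the bottom.
flipped_arr = np.roll(arr, -(arr.shape[0] // 2), axis=0)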
I have tensorA of size 10x4x9x2; the other tensorB is of size 10x5x2 and contains values from tensorA. Now, how can I find the index of each element of tensorB in tensorA?
Example:
First 2 elements of TensorA:
[[[[ 4., 1.],
[ 1., 2.],
[ 2., 5.],
[ 5., 3.],
[ 3., 11.],
[11., 10.],
[10., -1.],
[-1., -1.],
[-1., -1.]],
[[12., 13.],
[13., 9.],
[ 9., 7.],
[ 7., 5.],
[ 5., 3.],
[ 3., 4.],
[ 4., 1.],
[ 1., 0.],
[ 0., -1.]],
... and so on
First 2 elements of TensorB:
[[[ 2., 5.],
[ 5., 7.],
[ 7., 9.],
[ 9., 10.],
[10., 12.]],
[[ 0., 1.],
[ 1., 2.],
[ 2., 5.],
[ 5., -1.],
[-1., -1.]],
Now, in tensorB the first element is [2, 5], contained in the first 5x2 matrix (dimension 0), so the element should be matched against dimension 0 in tensorA. The output should be the index 0, 0, 2, since it is the 3rd element there.
You can compare elementwise for equality, sum along the last axis, and check that sum against the length of the searched row. Then the nonzero function will get you the indices you're looking for.
Since for the example tensors you have given, TensorB[0, 0] is [2., 5.], that looks like:
((TensorA == TensorB[0, 0]).sum(dim=3) == 2).nonzero()
This will return a tensor of [[0, 0, 2]] if that is the only matching row. If you don't want to hard-code 2 (the size of the searched tensor), you can use:
((TensorA == TensorB[0, 0]).sum(dim=3) == TensorB[0, 0].size()[0]).nonzero()
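Going a bit beyond the answer, broadcasting also lets you match every row of TensorB against its batch element of TensorA in one shot instead of picking out TensorB[0, 0] by hand. A sketch with stand-in random tensors of the shapes from the question:

import torch

TensorA = torch.randint(-1, 14, (10, 4, 9, 2)).float()
TensorB = TensorA[:, 0, :5, :].clone()  # rows guaranteed to occur in TensorA

# Compare each row of TensorB with every row of TensorA in the same batch element.
# matches[b, i, j, k] is True when TensorB[b, i] equals TensorA[b, j, k].
matches = (TensorB[:, :, None, None, :] == TensorA[:, None, :, :, :]).all(dim=-1)

# Each row of the result is (batch, row in B, dim-1 index in A, dim-2 index in A).
indices = matches.nonzero()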
I have the code:
import cv2

img = cv2.imread("poolPictures\chessboard3.jpg", cv2.IMREAD_COLOR)
chessboardImage = cv2.imread("poolPictures\chessboardActual.jpg", cv2.IMREAD_COLOR)

ret, corners = cv2.findChessboardCorners(img, (9,6), None)
cv2.drawChessboardCorners(img, (9,6), corners, ret)

chessRet, chessCorners = cv2.findChessboardCorners(chessboardImage, (9,6), None)
ret, matrix, dist, rvecs, tvecs = cv2.calibrateCamera(corners, chessCorners, chessboardImage.shape[::-1][1:3], None, None)
Running the code throws the error:
ret, matrix, dist, rvecs, tvecs = cv2.calibrateCamera(corners, chessCorners, chessboardImage.shape[::-1][1:3], None, None)
cv2.error: C:\projects\opencv-python\opencv\modules\calib3d\src\calibration.cpp:3110: error: (-210) objectPoints should contain vector of vectors of points of type Point3f in function cv::collectCalibrationData
(Images attached to the original post: chessboard3.jpg, chessboardActual.jpg, and the result of drawChessboardCorners.)
I have tried converting the object points to a 3-dimensional vector instead of 2 by introducing a dummy 3rd dimension - I could not find the Python equivalent of Point3f().
I also saw from https://github.com/opencv/opencv/issues/6002 that sometimes the error can be misleading, and that the real problem is that one of the vectors inside imagePoints is empty - I have tried printing the vectors and none are empty.
Hopefully someone can help, might just be a case of taking more pictures...
Cheers,
As Zenith042 pointed out, I had image points and object points the wrong way round. However, the main issue was that instead of a numpy array for my image points like:
[[[ 137.5 205. ]]
[[ 143.5 206.5]]
.
.
.
[[ 137.5 209.5]]]
I instead needed:
[[[ 137.5 205. ]
  [ 143.5 206.5]
  .
  .
  .
  [ 137.5 209.5]]]
Which I achieved with:
ret, corners = cv2.findChessboardCorners(img, (9,6), None)
corners = np.array([[corner for [corner] in corners]])
although I suspect there is a nicer way with numpy.reshape.
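For what it's worth, a reshape does seem to do the same job; a sketch, assuming corners has the usual (N, 1, 2) shape returned by findChessboardCorners:

import numpy as np

# Drop the singleton middle axis and add a leading one: (N, 1, 2) -> (1, N, 2),
# converting to single precision at the same time.
corners = corners.reshape(1, -1, 2).astype(np.float32)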
I also needed the same structure for the object points, i.e.
objp = np.array([objp])
OpenCV also requires floats to have single precision, as per this question.
This means every array needs to be converted to np.float32 before passing to calibrateCamera.
You can create dummy objPoints using
>>> objp = np.zeros((6*9, 3), np.float32)
>>> objp[:,:2] = np.mgrid[0:9, 0:6].T.reshape(-1,2)
>>> objp
array([[ 0., 0., 0.],
[ 1., 0., 0.],
[ 2., 0., 0.],
[ 3., 0., 0.],
[ 4., 0., 0.],
[ 5., 0., 0.],
[ 6., 0., 0.],
[ 7., 0., 0.],
[ 8., 0., 0.],
[ 0., 1., 0.],
[ 1., 1., 0.],
[ 2., 1., 0.],
[ 3., 1., 0.],
[ 4., 1., 0.],
[ 5., 1., 0.],
[ 6., 1., 0.],
[ 7., 1., 0.],
[ 8., 1., 0.],
[ 0., 2., 0.],
[ 1., 2., 0.],
[ 2., 2., 0.],
[ 3., 2., 0.],
[ 4., 2., 0.],
[ 5., 2., 0.],
[ 6., 2., 0.],
[ 7., 2., 0.],
[ 8., 2., 0.],
[ 0., 3., 0.],
[ 1., 3., 0.],
[ 2., 3., 0.],
[ 3., 3., 0.],
[ 4., 3., 0.],
[ 5., 3., 0.],
[ 6., 3., 0.],
[ 7., 3., 0.],
[ 8., 3., 0.],
[ 0., 4., 0.],
[ 1., 4., 0.],
[ 2., 4., 0.],
[ 3., 4., 0.],
[ 4., 4., 0.],
[ 5., 4., 0.],
[ 6., 4., 0.],
[ 7., 4., 0.],
[ 8., 4., 0.],
[ 0., 5., 0.],
[ 1., 5., 0.],
[ 2., 5., 0.],
[ 3., 5., 0.],
[ 4., 5., 0.],
[ 5., 5., 0.],
[ 6., 5., 0.],
[ 7., 5., 0.],
[ 8., 5., 0.]], dtype=float32)
You can then call:
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera([objp], [corners], img.shape[1::-1], None, None)
Note that calibrateCamera wants a list of point arrays (one per calibration image) and the image size as (width, height).
You don't need to actually provide calibrateCamera with an 'actual' chessboard image.
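As an aside, the np.mgrid expression above just enumerates the (x, y) coordinates of the 9x6 inner corners, with z fixed at 0; a smaller grid makes the pattern easier to see:

>>> np.mgrid[0:3, 0:2].T.reshape(-1, 2)
array([[0, 0],
       [1, 0],
       [2, 0],
       [0, 1],
       [1, 1],
       [2, 1]])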
I would like to get several values whose coordinates I have.
My coordinates are given by "Coord" (shape (3, 3, 2, 3): X and Y during 3 times, the 2 being the two coordinates) and my values are given by "Values" (shape (3, 3, 3), for 3 times).
In other words, I would like to concatenate the values in time with "slices" for each position...
I don't know how to undertake that... Here is a little part of the arrays.
import numpy as np
Coord = np.array([[[[ 4., 6., 10.],
[ 1., 3., 7.]],
[[ 3., 5., 9.],
[ 1., 3., 7.]],
[[ 2., 4., 8.],
[ 1., 3., 7.]]],
[[[ 4., 6., 10.],
[ 2., 4., 8.]],
[[ 3., 5., 9.],
[ 2., 4., 8.]],
[[ 2., 4., 8.],
[ 2., 4., 8.]]],
[[[ 4., 6., 10.],
[ 3., 5., 9.]],
[[ 3., 5., 9.],
[ 3., 5., 9.]],
[[ 2., 4., 8.],
[ 3., 5., 9.]]]])
Values = np.array([[[-4.24045246, 0.97551048, -5.78904502],
[-3.24218504, 0.9771782 , -4.79103141],
[-2.24390519, 0.97882129, -3.79298771]],
[[-4.24087775, 1.97719843, -5.79065966],
[-3.24261128, 1.97886271, -4.7926441 ],
[-2.24433235, 1.98050192, -3.79459845]],
[[-4.24129055, 2.97886284, -5.79224713],
[-3.24302502, 2.98052345, -4.79422942],
[-2.24474697, 2.98215901, -3.79618161]]])
EDIT LATER
I tried a simplified problem first (without time). I used a "for" loop, but some errors seem to remain... Do you think this is the best way to treat the problem? Because my arrays are large... 400x300x100
Coord3 = np.array([[[ 2, 2.],
[ 0., 1.],
[ 0., 2.]],
[[ 1., 0.],
[ 2., 1.],
[ 1., 2.]],
[[ 2., 0.],
[ 1., 1.],
[ 0., 0.]]])
Coord3 = Coord3.astype(int)
Values2 = np.array([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]])
b = np.zeros((3,3))
for i in range(Values2.shape[0]):
    for j in range(Values2.shape[1]):
        b[Coord3[i,j,0], Coord3[i,j,1]] = Values2[i,j]
b
Your second example is relatively easy to do with fancy indexing:
b = np.zeros((3, 3), Values2.dtype)
b[Coord3[..., 0], Coord3[..., 1]] = Values2
The original problem is a bit harder, but I think this takes care of it:
Coord = Coord.astype(int)
x_size = Coord[..., 0, :].max() + 1
y_size = Coord[..., 1, :].max() + 1
# x_size, y_size = Coord.max(axis=(0, 1, 3)) + 1
nt = Coord.shape[3]
b = np.zeros((x_size, y_size, nt), Values.dtype)
b[Coord[..., 0, :], Coord[..., 1, :], np.arange(nt)] = Values
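As a quick sanity check (an aside, using the small example from the question's edit), the fancy-indexed scatter produces the same b as the explicit loop:

import numpy as np

Coord3 = np.array([[[2., 2.], [0., 1.], [0., 2.]],
                   [[1., 0.], [2., 1.], [1., 2.]],
                   [[2., 0.], [1., 1.], [0., 0.]]]).astype(int)
Values2 = np.array([[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]])

# Loop version from the question.
b_loop = np.zeros((3, 3))
for i in range(Values2.shape[0]):
    for j in range(Values2.shape[1]):
        b_loop[Coord3[i, j, 0], Coord3[i, j, 1]] = Values2[i, j]

# Vectorized scatter: Values2[i, j] lands at b[Coord3[i, j, 0], Coord3[i, j, 1]].
b_fancy = np.zeros((3, 3), Values2.dtype)
b_fancy[Coord3[..., 0], Coord3[..., 1]] = Values2

print(np.array_equal(b_loop, b_fancy))  # True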