I'm working on a perspective transform application that maps 3D points to 2D camera pixels. It is a purely mathematical model, because I'm preparing to use it on hardware I don't currently have access to (so I'm making up focal length and offset values for the intrinsic camera matrix).
When I do the mapping, depending on the xyz location of the camera, I get huge differences in where my transformed image ends up, and I have to make the matrix I'm writing the pixels into really large. (I'm mapping a 1000x1000-pixel image to an image of about 600x600 pixels, but it lands around pixel coordinate 6000, so I have to make my output matrix 7000x7000, which takes a long time to plt.imshow.) I have no use for the actual location of the pixels, because I'm only concerned with what the remapped image looks like.
I was wondering how people deal with this issue.
I can think of just cropping the image down to the area that is non-zero (where my pixels are actually mapped to), like in:
How to crop a numpy 2d array to non-zero values?
but that still requires me to spend the space and time to allocate a 7000x7000 destination matrix.
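One way around allocating the huge array (a hedged sketch, not necessarily the only approach): project just the corner points first, take the bounding box of the projected image, and subtract its minimum from all pixel coordinates so the remapped image starts at (0, 0). The destination then only needs to be as big as the warped image itself. The intrinsics K, the pose (R, t) and the project helper below are made-up stand-ins, in the spirit of the made-up values above:

```python
import numpy as np

# Made-up intrinsics and camera pose, standing in for the values in the question.
K = np.array([[800.0,   0.0, 500.0],
              [  0.0, 800.0, 500.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([5.0, 5.0, 10.0])

def project(points_3d):
    """Project Nx3 world points to Nx2 pixel coordinates."""
    cam = points_3d @ R.T + t            # world -> camera frame
    pix = cam @ K.T                      # apply intrinsics
    return pix[:, :2] / pix[:, 2:3]      # perspective divide

# Example: the four corners of the source image plane (world units are arbitrary here).
corners = np.array([[ 0.0,  0.0, 0.0],
                    [10.0,  0.0, 0.0],
                    [10.0, 10.0, 0.0],
                    [ 0.0, 10.0, 0.0]])
pix = project(corners)

# Shift by the bounding-box minimum so the remapped image starts at (0, 0);
# the destination only needs to cover the warped image, not pixel ~6000.
offset = np.floor(pix.min(axis=0))
shifted = pix - offset
w, h = np.ceil(shifted.max(axis=0)).astype(int) + 1
out = np.zeros((h, w))                   # e.g. ~600x600 instead of 7000x7000
```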
I want to find an efficient way to rotate 4x4 image patches from a larger image by angles that are multiples of 15 degrees. I am currently extracting a 6x6 patch, e.g. patch=img[x-3:x+3,y-3:y+3], and then running scipy.ndimage.interpolation.rotate(patch,-15*o,reshape=False)[1:5,1:5]. However, I essentially need to do this at every location (x,y) in the image. I have a "stacked" version of the image, an array of size (m,n,6,6) where m and n are the dimensions of the original image. Even if I run interpolation.rotate on the stacked version, it looks like it internally just does it iteratively, and it takes a long time.
Since I only need to do this at fixed angles, I am trying to pre-compute some constants and vectorize the implementation so that I can process them all at once. I have tried digging into the implementation of SciPy rotate but it did not help much.
Is there a sensible way to do this?
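One possible direction, sketched under the assumption that nearest-neighbour sampling is acceptable for a 4x4 output (bilinear weights could be precomputed the same way): since the angles are fixed, precompute for each angle which pixel of the 6x6 patch each output pixel samples from, then apply that index table to all (m, n) patches at once with fancy indexing. Names and shapes below are illustrative, not from the original code:

```python
import numpy as np

def rotated_sample_coords(angle_deg, out_size=4, patch_size=6):
    """Precompute, once per angle, which patch pixel each output pixel samples from."""
    c = (patch_size - 1) / 2.0                       # centre of the 6x6 patch
    ys, xs = np.mgrid[:out_size, :out_size]
    dy = ys - (out_size - 1) / 2.0                   # offsets of the 4x4 grid
    dx = xs - (out_size - 1) / 2.0                   # relative to the patch centre
    a = np.deg2rad(angle_deg)
    # inverse mapping: where each output pixel lands inside the unrotated patch
    src_y = c + dy * np.cos(a) - dx * np.sin(a)
    src_x = c + dy * np.sin(a) + dx * np.cos(a)
    return (np.clip(np.rint(src_y), 0, patch_size - 1).astype(int),
            np.clip(np.rint(src_x), 0, patch_size - 1).astype(int))

# One index table per multiple of 15 degrees, computed once.
tables = {o: rotated_sample_coords(15 * o) for o in range(24)}

stacked = np.random.rand(100, 120, 6, 6)             # stands in for the (m, n, 6, 6) array
iy, ix = tables[1]                                   # e.g. 15 degrees
rotated = stacked[:, :, iy, ix]                      # (m, n, 4, 4), all patches at once
```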
I know the basic flow of image registration/alignment, but what happens at the pixel level when two images are registered/aligned? I.e., similar pixels of the moving image, which is transformed onto the fixed image, are kept intact, but what happens to the pixels that are not matched? Are they averaged, or something else?
And how is the correct transformation estimated, i.e. how do I know whether to apply translation, scaling, rotation, etc., and how much (i.e. how many degrees of rotation, what translation values, etc.) to apply?
Also, in the initial step, how are the similar pixel values identified and matched?
I've implemented the python code given in https://simpleitk.readthedocs.io/en/master/Examples/ImageRegistrationMethod1/Documentation.html
Input images are of prostate MRI scans:
(screenshots: fixed image, moving image, output image, console output)
The difference can be seen in the output image at the top right and top left. But I can't interpret the console output, or how things actually work internally.
It would be very helpful to get a deeper explanation of this. Thank you.
A transformation is applied to all pixels. You might be confusing rigid transformations, which will only translate, rotate and scale your moving image to match the fixed image, with elastic transformations, which will also allow some morphing of the moving image.
Any pixel that the transformation cannot place in the fixed image is interpolated from the pixels that it is able to place. A registration is not really intelligent, though: what it attempts to do is simply reduce a cost function, where a high cost is associated with a large difference and a low cost with a small difference. Cost functions can be intensity-based (pixel values) or feature-based (shapes). It will (semi-)randomly shift the image around until a preset criterion is met, generally a maximum number of iterations.
What that might look like can be seen in the following gif:
http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/registration_visualization.gif
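As a concrete illustration of that cost-minimisation loop, here is a hedged sketch along the lines of the linked SimpleITK example: an intensity-based metric, a gradient-descent optimizer that iterates until its stopping criterion is met, and a final resampling step in which an interpolator fills in pixel values and a default value is used wherever the transform maps outside the moving image. File paths and parameter values are placeholders, not the ones behind the console output above:

```python
import SimpleITK as sitk

fixed = sitk.ReadImage("fixed.nii", sitk.sitkFloat32)     # paths are placeholders
moving = sitk.ReadImage("moving.nii", sitk.sitkFloat32)

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMeanSquares()                               # intensity-based cost
reg.SetOptimizerAsRegularStepGradientDescent(
    learningRate=4.0, minStep=0.01, numberOfIterations=200)  # stopping criteria
reg.SetInitialTransform(sitk.TranslationTransform(fixed.GetDimension()))
reg.SetInterpolator(sitk.sitkLinear)

transform = reg.Execute(fixed, moving)                     # iteratively lowers the cost
print("Final metric:", reg.GetMetricValue())
print("Stop reason :", reg.GetOptimizerStopConditionDescription())

# Resample: every output pixel is interpolated from the transformed moving image;
# pixels with no source data get the default value (0 here).
result = sitk.Resample(moving, fixed, transform,
                       sitk.sitkLinear, 0.0, moving.GetPixelID())
```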
Input image
This is a 3000x3000 greyscale image, and I would like to get the coordinates of the diagonal components in the image.
I tried the Hough transform and pylsd (line segment detection), but neither works as I hoped. Here are some unsatisfactory outcomes:
result with too much junk
I would like to get as many true diagonals as possible with a minimum amount of junk, using a simple parameter such as the length above which a cluster can be labelled as a line. Any suggestions or tips will be appreciated.
Either Python or R is preferred (not MATLAB).
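For what it's worth, OpenCV's probabilistic Hough transform exposes exactly the kind of length parameter described above (minLineLength), and the returned segments can additionally be filtered by their angle to keep only near-diagonal ones. A hedged sketch, with every threshold value purely illustrative:

```python
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # placeholder path
edges = cv2.Canny(img, 50, 150)

# minLineLength discards short clusters; maxLineGap merges nearby collinear bits.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=5)

diagonals = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
        # keep only segments reasonably close to +/-45 degrees
        if 30 < angle < 60 or 120 < angle < 150:
            diagonals.append((x1, y1, x2, y2))
```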
I'm trying to implement a blob detector based on LoG (Laplacian of Gaussian); the steps are:
create an array of n levels of LoG filters
use each of the filters on the input image to create a 3D array of h*w*n, where h = height, w = width and n = number of levels
find the local maxima and circle the blobs in the original image.
I already created the filters and the 3d array (which is an array of 2d images).
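For concreteness, a hedged sketch of what that stack construction might look like using scipy's gaussian_laplace (this is not the actual code; the scale ladder and the names are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_stack(image, sigmas):
    """Apply a LoG filter at each scale and stack the responses into an h*w*n array."""
    h, w = image.shape
    stack = np.empty((h, w, len(sigmas)))
    for i, s in enumerate(sigmas):
        # sigma**2 scaling keeps responses comparable across scales
        stack[:, :, i] = (s ** 2) * gaussian_laplace(image, sigma=s)
    return stack

sigmas = [1.6 * (2 ** (k / 3)) for k in range(5)]        # illustrative scale ladder
image = np.random.rand(128, 128)                          # dummy input
cube = log_stack(image, sigmas)                           # shape (128, 128, 5)
```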
I used padding to make sure I don't have any problems around the borders (which includes creating a constant border for each image and creating two extra empty images).
Now I'm trying to figure out how to find the local maxima in the array.
I need to compare each pixel to its 26 neighbours (8 in the same image and the 9 pixels in each of the two adjacent scales).
The brute force way of checking the pixel value directly seems ugly and not very efficient.
What's the best way to find a local maximum point in Python using OpenCV?
I'd take advantage of the fact that dilation is efficiently implemented in OpenCV. If a point is a local maximum in 3D, then it is also a local maximum in its own 2D slice, therefore:
Dilate each image in the array with a 3x3 kernel, and keep as candidate maxima the points whose intensity is unchanged by the dilation.
Brute-force test the candidates against their upper and lower slices.
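A hedged sketch of that two-step approach, assuming the scale-space stack is stored as an (h, w, n) NumPy array (the response threshold is an illustrative extra, not part of the recipe above):

```python
import cv2
import numpy as np

def local_maxima_3d(stack, threshold=0.01):
    """stack: (h, w, n) array of LoG responses, one slice per scale."""
    h, w, n = stack.shape
    kernel = np.ones((3, 3), np.uint8)
    maxima = []
    for k in range(n):
        sl = stack[:, :, k].astype(np.float32)
        # Step 1: 2D candidates = pixels left unchanged by a 3x3 dilation.
        dil = cv2.dilate(sl, kernel)
        candidates = np.argwhere((sl == dil) & (sl > threshold))
        # Step 2: brute-force check against the 3x3 neighbourhoods in adjacent scales.
        for y, x in candidates:
            y0, y1 = max(y - 1, 0), min(y + 2, h)
            x0, x1 = max(x - 1, 0), min(x + 2, w)
            k0, k1 = max(k - 1, 0), min(k + 2, n)
            if sl[y, x] >= stack[y0:y1, x0:x1, k0:k1].max():
                maxima.append((y, x, k))
    return maxima
```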
I have two images of the same size and I have computed a vector field to warp the second image onto the first one.
However, since my vector field is computed over a grid with 10-pixel spacing along both directions, I would like to define such a vector field for every pixel of my image.
Thus, I am wondering how I could achieve this.
Possibilities:
interpolate between the points: 2D interpolation over a regular grid, which should be fast using scipy
compute your vector field at 1-pixel resolution
reduce the size of your original image (using PIL) and use the 10-pixel vector field
Either way, it is a tradeoff between image size/quality and speed.
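A minimal sketch of the first option, assuming the coarse field is stored as a (grid_h, grid_w, 2) array of (dy, dx) displacements sampled every 10 pixels: upsample it to full resolution with scipy's RegularGridInterpolator, then warp with map_coordinates. All array names and sizes here are assumptions:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator
from scipy.ndimage import map_coordinates

H, W = 200, 200                                    # image size (illustrative)
img2 = np.random.rand(H, W)                        # second image, to be warped

# Coarse displacement field: one (dy, dx) vector every 10 pixels.
gy, gx = np.arange(0, H, 10), np.arange(0, W, 10)
field = np.random.randn(len(gy), len(gx), 2)       # stands in for the computed field

# Interpolate the field to every pixel of the image.
yy, xx = np.mgrid[0:H, 0:W]
pts = np.stack([yy.ravel(), xx.ravel()], axis=-1)
dy = RegularGridInterpolator((gy, gx), field[..., 0],
                             bounds_error=False, fill_value=0.0)(pts).reshape(H, W)
dx = RegularGridInterpolator((gy, gx), field[..., 1],
                             bounds_error=False, fill_value=0.0)(pts).reshape(H, W)

# Warp image 2 onto image 1 using the dense field (backward mapping, bilinear).
warped = map_coordinates(img2, [yy + dy, xx + dx], order=1, mode='nearest')
```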