I'm looking for a way to implement motion blur along an arbitrary angle using Pillow or OpenCV in Python. I have an implementation, which I describe below, but is there a more efficient, built-in method for this in the commonly used libraries? I.e., something like
implement_motion_blur(image, kernel_size, angle)
After doing some reading on kernel convolution, I think finding such an implementation also amounts to finding a kernel/matrix that graphically represents a straight line with a slope corresponding to the angle of rotation. So if there is a formulaic implementation of such a kernel in NumPy, that would also work for me.
Currently, I am implementing motion blur along an arbitrary angle as follows:
Create a horizontal motion blur kernel with kernel size ceil(sqrt(2)*original_kernel_size) (since the diagonal is sqrt(2) times as long as the horizontal/vertical side).
Convert the motion blur kernel into a PIL image, and rotate the image by the prescribed angle using PIL.Image.rotate
Reconvert the image into a matrix, and extract a submatrix of the desired dimension from the center to recover the motion blur kernel.
A horizontal motion blur kernel can be built directly in NumPy (or adapted from the GeeksForGeeks example). Effectively, I am rotating a straight line by an angle to obtain a motion blur kernel along that angle.
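For concreteness, here is a minimal sketch of this rotate-and-crop approach (the sizes and names are illustrative, and the kernel is normalized at the end so overall brightness is preserved):

import math
import numpy as np
import cv2
from PIL import Image

def motion_blur_kernel(kernel_size, angle):
    # Oversized horizontal line so the rotation does not clip it.
    big = math.ceil(math.sqrt(2) * kernel_size)
    k = np.zeros((big, big), dtype=np.float32)
    k[big // 2, :] = 1.0
    # Rotate the line with PIL, then crop the centre back to kernel_size.
    rotated = np.array(Image.fromarray(k).rotate(angle))
    off = (big - kernel_size) // 2
    k = rotated[off:off + kernel_size, off:off + kernel_size]
    return k / k.sum()

# blurred = cv2.filter2D(image, -1, motion_blur_kernel(15, 30))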
Although this method works, I would prefer a more conventional approach, if possible. Thanks!
Not completely sure what to call this problem but I will try my best to explain it here.
I have the coordinates of a line I want to draw onto a numpy array. However, I don't just want a simple line, but a thick line where I can specify the falloff (brightness as a function of distance from the line) with a curve or mathematical function. For example, I might want a gaussian falloff, which would look something like the example below, where a gaussian blur was applied to the image.
However, using blur filters does not give me the flexibility in falloff functions that I would like, and does not enable precise control of the falloff (for example, when I want points on the line to have a value of exactly 1.0 and points further than, say, 10 pixels away to be 0.0).
I have attempted to solve this problem by creating the falloff pattern for a point, and then drawing that pattern into a new numpy channel for every point of the line, before merging them via the max function. This works but is too slow.
Is there a more efficient way to draw such a line from my input coordinates?
The solution I came up with is to make use of dilations. This method is more general and can be applied to any polygonal shape or binary mask.
Rasterize the geometry the simple way first. For points, set the corresponding pixel; for lines, draw 1-pixel-thick lines with a library function from OpenCV or similar; for polygons, draw the boundary or fill the polygon with OpenCV functions. This gives the initial mask, with value 1 on the geometry.
Iteratively apply dilations to this mask. This grows the mask pixel by pixel. Set the value of the newly added pixels according to an arbitrary falloff function.
The dilation operation is available in OpenCV (cv2.dilate). Alternatively, it can be implemented efficiently as a simple convolution with boolean matrices, which can then run on GPU devices.
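A minimal sketch of this iterative-dilation idea (the function name and the falloff curve below are just placeholders):

import numpy as np
import cv2

def draw_with_falloff(mask, falloff, n_steps):
    # mask: uint8 array with 1 on the rasterized geometry (e.g. from cv2.line).
    out = mask.astype(np.float32) * falloff(0)
    kernel = np.ones((3, 3), np.uint8)
    current = mask.copy()
    for step in range(1, n_steps + 1):
        grown = cv2.dilate(current, kernel)   # grow the mask by one pixel
        ring = grown - current                # the newly reached pixels
        out[ring > 0] = falloff(step)         # value taken from the falloff curve
        current = grown
    return out

# Example: a 1-pixel line with a gaussian-like falloff over 10 pixels.
canvas = np.zeros((200, 200), np.uint8)
cv2.line(canvas, (20, 30), (180, 150), 1, 1)
img = draw_with_falloff(canvas, lambda d: float(np.exp(-(d / 4.0) ** 2)), 10)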
An example of the results can be seen with the polygonal input:
Exponential falloff:
Sinusoidal falloff:
I have two images, a blurred one and a sharp one. I need to recover the original image using these images. I have used a simple FFT and inverse FFT to estimate the point spread function and the deblurred image.
import numpy as np
import matplotlib.pyplot as plt

fftorg = np.fft.fft2(img1)    # FFT of the sharp (original) image
fftblur = np.fft.fft2(img2)   # FFT of the blurred image
psffft = fftblur / fftorg     # estimate of the PSF in the frequency domain
psfifft = np.fft.ifftshift(np.fft.ifft2(psffft))   # back to the spatial domain, centred
plt.imshow(abs(psfifft), cmap='gray')
plt.show()
This is the point spread function image I got. I need to find the type of kernel used for blurring, and also its size. Is it possible to get the kernel used from the PSF?
I have pairs of images: an image (intentionally blurred) and its depth map (given as a PNG).
For example:
However, there seems to be a shift between the depth map and the real image as can be seen in this example:
All I know is that these images were shot with a RealSense LiDAR Camera L515 (I have no knowledge of the underlying camera characteristics or of the distance between the RGB and infrared sensors).
Is there a way to align the two images? I searched the internet for possible solutions; however, all of them rely on data that I do not have, such as the intrinsic matrix, the camera's SDK and more.
Since the two imaging systems are physically very close, the homography between them would likely be a good approximation. You can find the homography from 4 corresponding points that you choose manually.
You can use the OpenCV implementation (cv2.findHomography).
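A minimal sketch with cv2.findHomography and cv2.warpPerspective (the point coordinates and file names below are placeholders to replace with your own):

import numpy as np
import cv2

# Four manually picked correspondences: pixel (x, y) in the depth map and the
# matching pixel in the RGB image. The numbers here are placeholders.
pts_depth = np.float32([[110, 80], [520, 95], [505, 400], [120, 385]])
pts_rgb   = np.float32([[100, 70], [515, 90], [500, 395], [112, 378]])

H, _ = cv2.findHomography(pts_depth, pts_rgb)

rgb = cv2.imread('image.png')
depth = cv2.imread('depth.png', cv2.IMREAD_UNCHANGED)   # keep the raw depth values

# Warp the depth map into the RGB frame; nearest neighbour avoids mixing depths.
aligned = cv2.warpPerspective(depth, H, (rgb.shape[1], rgb.shape[0]),
                              flags=cv2.INTER_NEAREST)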
I have been capturing a photo of my face every day for the last couple of months, resulting in a sequence of images taken from the same spot, but with slight variations in orientation of my face. I have tried several ways to stabilize this sequence using Python and OpenCV, with varying rates of success. My question is: "Is the process I have now the best way to tackle this, or are there better techniques / order to execute things in?"
My process so far looks like this:
Collect images, keep the original image, a downscaled version and a downscaled grayscale version
Using dlib.get_frontal_face_detector() on the grayscale image, get a rectangle containing my face.
Using the dlib shape predictor 68_face_landmarks.dat, obtain the coordinates of the 68 face landmarks, and extract the positions of the eyes, nose, chin and mouth (specifically landmarks 8, 30, 36, 45, 48 and 54)
Using a 3D representation of my face (i.e. a numpy array containing the 3D coordinates of an approximation of these landmarks on my actual face, in an arbitrary reference frame) and cv2.solvePnP, calculate a perspective transform matrix M1 to align the face with my 3D representation
Using the transformed face landmarks (i.e. cv2.projectPoints(face_points_3D, rvec, tvec, ...) with _, rvec, tvec = cv2.solvePnP(...)), calculate the 2D rotation and translation required to align the eyes vertically, center them horizontally and place them on a fixed distance from each other, and obtain the transformation matrix M2.
Using M = np.matmul(M2, M1) and cv2.warpPerspective, warp the image.
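For reference, a rough sketch of the detection, landmark and solvePnP steps described above (the 3D model coordinates and camera intrinsics here are made-up placeholders; the predictor file name is the standard dlib one):

import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# Chin, nose tip, eye corners and mouth corners; these 3D coordinates are
# illustrative only, not a measured model of anyone's face.
LANDMARK_IDS = [8, 30, 36, 45, 48, 54]
FACE_POINTS_3D = np.float32([[0, -63, -12], [0, 0, 0], [-43, 32, -26],
                             [43, 32, -26], [-28, -28, -24], [28, -28, -24]])

def head_pose(gray):
    # gray: uint8 grayscale image; assumes at least one face is detected.
    rect = detector(gray, 1)[0]
    shape = predictor(gray, rect)
    pts_2d = np.float32([[shape.part(i).x, shape.part(i).y] for i in LANDMARK_IDS])
    h, w = gray.shape
    cam = np.float32([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]])   # crude intrinsics
    _, rvec, tvec = cv2.solvePnP(FACE_POINTS_3D, pts_2d, cam, None)
    return rvec, tvec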
Using this method, I get okay-ish results, but it seems the 68-landmark prediction is far from perfect, resulting in twitchy stabilization and sometimes very skewed images (in that I can't remember having such a large forehead...). For example, the predicted landmark for one of the eye corners does not always align with the actual eye, resulting in a perspective transform that skews the actual eye 20px down.
In an attempt to fix this, I have tried using SIFT to find features in two different photos (aligned using the above method) and obtain another perspective transform. I then force the features to be somewhere around my detected face landmarks so as not to align on the background (using a mask in cv2.SIFT_create().detectAndCompute(...)), but this sometimes results in features being found only (or predominantly) around one of the eyes, or not around the mouth, again resulting in extremely skewed images.
What would be a good way to get a smooth sequence of images, stabilized around my face? For reference, see this video (not mine), which is stabilized around the eyes.
Firstly, I wanted to know the metric unit of the 3D points we get from the OpenCV reprojectImageTo3D() function.
Secondly, I have calibrated each camera individually with a chessboard, with "mm" as the metric unit, and then used the OpenCV functions to calibrate the stereo system, rectify the stereo pair and compute the disparity map.
Basically, I want the distance of the center of a bounding box.
So I compute the disparity map, reproject it to 3D with the reprojectImageTo3D() function, and then take from those 3D points the one which corresponds to the center of the bbox (x, y).
But which image should I use to get the center of the bbox? The rectified one or the original?
Lastly, is it better to use the same camera model for both cameras of a stereo system?
Thank you
During the calibration process (calibrateCamera) you have to provide the grid of object points of your calibration target. The unit you use there then defines the unit for the rest of the process.
When calling reprojectImageTo3D, you probably used the matrix Q output by stereoRectify, which takes in the individual calibrations (cameraMatrix1, cameraMatrix2). That's where the unit came from.
So in your case you get mm I guess.
reprojectImageTo3D has to use the rectified image, since the disparity is calculated from the rectified images (it wouldn't be properly aligned otherwise). Also, the disparity is computed relative to the first image given (the left one in the documentation). So you should use the left rectified image if you computed the disparity like this: cv::StereoMatcher::compute(left, right)
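A hedged sketch of how this fits together in Python (Q comes from your stereoRectify call; left_rect, right_rect and the bbox corners x1, y1, x2, y2 are placeholders for your own data):

import numpy as np
import cv2

# left_rect / right_rect: rectified left and right images, Q: from cv2.stereoRectify.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left_rect, right_rect).astype(np.float32) / 16.0  # SGBM output is fixed-point, scaled by 16

points_3d = cv2.reprojectImageTo3D(disparity, Q)   # same unit as the calibration grid, mm here

# Take the bbox centre in the LEFT rectified image, since disparity is relative to it.
cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
X, Y, Z = points_3d[cy, cx]
distance_mm = float(np.linalg.norm([X, Y, Z]))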
I have never used two different cameras, but it makes sense to use the same ones. I think that if the two images differ a lot in color, edges or anything else, that could potentially affect the disparity quality.
What is actually very important (unless you are only working with still pictures), is to use cameras that can be synchronized by hardware (e.g. GENLOCK signal: https://en.wikipedia.org/wiki/Genlock). If you have a bit of delay between left and right and a moving subject, the disparity can be wrong. This is also true for the calibration.
Hope this helps!