OpenCV Python - Image distortion, radial and perspective - python

I have a distorted picture, where without distortion the points A, B, C and D would form a square of 1 cm x 1 cm.
I tried to use a homography to correct it, but it distorts the lines AD and BC, as you can see in the figure.
Do you have an idea how I could correct that?
Thanks a lot!
Marie- coder beginner
PS: for info, the image is taken inside a tube with an endoscope camera that has a large field of view, allowing it to capture almost the whole circumference of the tube around the camera. I will use the 1 x 1 cm square to estimate root growth from several pictures taken over time.
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
if __name__ == '__main__':
    # Read source image.
    im_src = cv2.imread('points2.jpg', cv2.IMREAD_COLOR)
    # Four points of the miniR image
    pts_src = np.array([[742, 223], [806, 255], [818, 507], [753, 517]], dtype=float)
    # Read destination image.
    im_dst = cv2.imread('rectangle.jpg', cv2.IMREAD_COLOR)
    # Four points of the square
    pts_dst = np.array([[200, 200], [1000, 200], [1000, 1000], [200, 1000]], dtype=float)
    # Calculate Homography
    h, status = cv2.findHomography(pts_src, pts_dst)
    # Warp source image to destination based on homography
    im_out = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))
    cv2.imwrite('corrected2.jpg', im_out)
    # Display images
    cv2.imshow("Source Image", im_src)
    cv2.imshow("Destination Image", im_dst)
    cv2.imshow("Warped Source Image", im_out)
    cv2.waitKey(0)

A homography is a projective transformation. As such it can only map straight lines to straight lines. The straight sides of your input curvilinear quadrangle are correctly rectified, but there is no way that you can straighten the curved sides using a projective transform.
In the photo you posted it may be reasonable to assume that the overall geometry is approximately a cylinder, and the "vertical" lines are parallel to the axis of the cylinder. So they are approximately straight, and a projective transformation (the camera projection) will map them to straight lines. The "horizontal" lines are the images of circles, or ellipses if the cylinder is squashed. A projective transformation will map ellipses (in particular, circles) into ellipses. So you could proceed by fitting ellipses. See this other answer for hints.
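For the ellipse-fitting step, a minimal sketch could look like the following. The ROI coordinates and Canny thresholds are placeholders (only the file name comes from your code), and you would still have to decide how to use the fitted ellipses to unwarp the curved sides:

import cv2
import numpy as np

# Isolate the edge pixels belonging to ONE curved "horizontal" line with a
# hypothetical rectangular ROI mask, then fit an ellipse to those points.
img = cv2.imread('points2.jpg', cv2.IMREAD_GRAYSCALE)   # file name from the question
edges = cv2.Canny(img, 50, 150)
roi = np.zeros_like(edges)
roi[200:270, 700:900] = 255                 # placeholder region around one curved line
ys, xs = np.nonzero(cv2.bitwise_and(edges, roi))
pts = np.column_stack([xs, ys]).astype(np.float32)
(cx, cy), (major, minor), angle = cv2.fitEllipse(pts)   # needs at least 5 points
print('ellipse centre:', (cx, cy), 'axes:', (major, minor), 'angle:', angle)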

I found a solution using GDAL. You can use two chessboard images: one is imaged with the device that creates the distortion, and the other is left unchanged, so it has no distortion. With the help of QGIS you create a file associating each distorted point with its undistorted counterpart: add a Ground Control Point at each intersection, using a defined grid interval (e.g. 100 px), and export the resulting GCPs as pointsfile.points.
After that, you can use a batch file that a collaborator created here. It uses GDAL to geo-correct the images.
You just need to put the images that you would like to transform (JPEG format) into the root directory of the repo and run bash warp.sh. The re-transformed images are written to the out/ directory.
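If you prefer to stay in Python rather than run the shell script, something along these lines should do the same GCP-based correction with the GDAL Python bindings. This is only a sketch of the idea: the GCP values and file names below are placeholders, and warp.sh itself may use different options.

from osgeo import gdal

# Pixel/line -> target x/y pairs; in practice these come from pointsfile.points.
gcps = [
    gdal.GCP(0,   0,   0, 742, 223),   # target x, y, z, source pixel, line
    gdal.GCP(100, 0,   0, 806, 255),
    gdal.GCP(100, 100, 0, 818, 507),
    gdal.GCP(0,   100, 0, 753, 517),
]
# Attach the GCPs to a virtual copy of the distorted image...
gdal.Translate('distorted_gcps.vrt', 'distorted.jpg', format='VRT', GCPs=gcps)
# ...then warp it with a thin-plate-spline transform derived from the GCPs.
gdal.Warp('corrected.tif', 'distorted_gcps.vrt', tps=True)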

Related

Python Tesseract figuring out text orientation/transformation

# Tesseract Windows installer: https://github.com/UB-Mannheim/tesseract/wiki
import pytesseract
import cv2

# Point pytesseract at the local Tesseract binary (Windows install path).
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

image = cv2.imread("img.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # pytesseract expects RGB order
text = pytesseract.image_to_string(image)
print(text)
The output is not even close: "WX017," instead of "MX011A".
However, if I manually rearrange the characters it works. I could transform the input image and define an ROI, but the orientation could be anything; it could be upside down as well.
I want to recognize curved text around a circle
This is pretty difficult, as tesseract expects text to be minimally distorted.
One (admittedly far-fetched) possibility would be to try and detect the circle, then map it onto a rectangle.
To do this, you can reduce each letter to a non-connected blob by applying a blur filter and discarding grey values below a threshold; iterate until you get more or less circular blobs, then take their centers. Pick several of those centers, three at a time, at random, and for each triplet calculate the center of the circle passing through all three points. The average of those centers should be more or less the center of the lettering circle.
Having the center, and the approximate radius, it is relatively easy to map the circular crown of appropriate height to a rectangle (e.g. using polar-to-cartesian transform).
You then apply tesseract to the transformed rectangle.
It should also be possible to use autocorrelation to average and sharpen multiple repetitions of the same text along said rectangle (i.e. "MX011A MXOI1A MX017A" --> "MX011A").
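A rough sketch of the polar-to-cartesian step, assuming you already have the centre and radius of the lettering circle (the values below are placeholders):

import cv2
import numpy as np

# "Unroll" the ring of text into a strip and hand that strip to Tesseract.
img = cv2.imread('img.png')
center = (260, 260)     # assumed centre of the lettering circle
max_radius = 250        # assumed outer radius of the ring of text
unrolled = cv2.warpPolar(img, (600, 200), center, max_radius,
                         cv2.WARP_POLAR_LINEAR)
# warpPolar maps the angle to the y axis, so rotate to get horizontal text.
unrolled = cv2.rotate(unrolled, cv2.ROTATE_90_COUNTERCLOCKWISE)
cv2.imwrite('unrolled.png', unrolled)
# text = pytesseract.image_to_string(unrolled)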

Kernel size estimation for Point spread function for image deblurring

I have two images, one blurred and one sharp. I need to recover the original image using these images. I have used a simple FFT and inverse FFT to estimate the point spread function and the deblurred image.
import numpy as np
import matplotlib.pyplot as plt

fftorg = np.fft.fft2(img1)     # FFT of the sharp image
fftblur = np.fft.fft2(img2)    # FFT of the blurred image
psffft = fftblur / fftorg      # blurred = sharp * PSF, so in frequency space PSF = blurred / sharp
psfifft = np.fft.ifftshift(np.fft.ifft2(psffft))  # back to the spatial domain, centred
plt.imshow(abs(psfifft), cmap='gray')
plt.show()
This is the point spread function image I got. I need to find the type of kernel used for blurring and also its size. Is it possible to get the kernel used from the PSF?
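Not a full answer, but a rough way to read an approximate kernel size off the PSF estimate above is to threshold its magnitude and take the bounding box of the significant region around the centre. The 1% threshold is an arbitrary assumption, and psfifft refers to the variable from the snippet above:

import numpy as np

psf = np.abs(psfifft)
mask = psf > 0.01 * psf.max()          # keep only the "significant" part of the PSF
ys, xs = np.nonzero(mask)
height = ys.max() - ys.min() + 1
width = xs.max() - xs.min() + 1
print('approximate kernel support:', height, 'x', width, 'pixels')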

Find the coordinate of a source image's point opencv python

I am working on a project on image alignment. So far I have computed the warped image using cv2.warpPerspective(). Now, to ensure the robustness of the algorithm, I want to do the following:
Pick 4 or 5 pixel coordinates from the source image and find their corresponding pixel points in the warped image (sketched below).
Compute the difference between these values and the locations of those points in the reference image.
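A minimal sketch of the first step using cv2.perspectiveTransform; the homography H and the point coordinates below are placeholders, not values from the question:

import cv2
import numpy as np

H = np.eye(3)  # replace with the homography actually passed to warpPerspective
src_pts = np.array([[[100, 150]], [[420, 80]], [[300, 300]], [[50, 400]]],
                   dtype=np.float32)      # shape (N, 1, 2) as perspectiveTransform expects
warped_pts = cv2.perspectiveTransform(src_pts, H)
# Compare against the locations of the same points in the reference image, e.g.:
# error = np.linalg.norm(warped_pts - ref_pts, axis=2)
print(warped_pts.reshape(-1, 2))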

Measuring a real object with two static calibrated cameras

The goal. To estimate the 3D location (x, y, z) of the centre, the width (larger diameter of the glass) and the height of the glass, similarly to this drawing. The inputs are two images, each coming from a different camera (here and here).
The setup. The images come from two fixed and calibrated (known intrinsic and extrinsic parameters) cameras.
My attempt.
I have segmented the image using FCN or DeepLab. Results here and here.
Then I obtained a binary mask of the class of interest (glass) and extracted the left-most, top-most, right-most and bottom-most points of that mask. Results here and here.
I obtained four 3D points through triangulation of the "corresponding points" (the top point of image 1 with the top point of image 2, the right-most point of image 1 with the right-most point of image 2, etc.); a sketch of this step follows below.
I compute the dimensions as width = |left - right| and height = |up - bottom|.
Issues. The points are not actual correspondences, so the reprojection is inaccurate and the measurement is inaccurate as well (up to 3 cm of error). Note that if I manually choose the corresponding pixels on both images and then triangulate, I get approximately 0.1 cm of error.
Can you guide me on how better (more accurately) solve this problem?
Thank you!
PS: I am using python and OpenCV.
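For reference, a minimal sketch of the triangulation step described in point 3. The projection matrices and pixel coordinates below are placeholders, not values from the question:

import cv2
import numpy as np

# P1 and P2 are the 3x4 projection matrices of the two calibrated cameras (K @ [R | t]).
P1 = np.eye(3, 4)                                               # replace with camera 1
P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])   # replace with camera 2
pts1 = np.array([[512., 300.], [512., 700.]]).T   # 2xN points in image 1 (e.g. top, bottom)
pts2 = np.array([[498., 305.], [495., 702.]]).T   # 2xN points in image 2
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4xN homogeneous points
X = (X_h[:3] / X_h[3]).T                          # Nx3 metric 3D points
height = np.linalg.norm(X[0] - X[1])
print('3D points:\n', X, '\nheight estimate:', height)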

Extracting data from graphs in a scanned document

EDIT: This is a more detailed explanation of a question I asked earlier, which is still unsolved for me.
I'm currently trying to write some code that can extract data from some uncommon graphs in a book. I scanned the pages of the book, and using OpenCV I would like to detect some features of the graphs in order to convert them into usable data. In the left graph I'm looking for the height of the "triangles", and in the right graph for the distance from the center to the points where the dotted lines intersect the gray area. In both cases I would like to convert these values into numeric data for further use.
For the left graph, I thought of detecting all the individual colors and computing the area of each sector by counting the number of pixels of that color. Once I have the areas of these sectors, I can easily calculate their heights using basic math. The following code snippet shows how far I've gotten with identifying different colors. However, I can't manage to make this work accurately: it always seems to detect some colors of other sectors as well, or to miss some pixels of one sector. I think it has something to do with the boundaries I'm using, but I can't quite figure out how to make them work. Does someone know how I can determine these values?
import numpy as np
import cv2

img = cv2.imread('images/test2.jpg')
# Note: cv2.imread returns BGR, so these bounds are in B, G, R order.
lower = np.array([0, 0, 100])
upper = np.array([50, 56, 150])
mask = cv2.inRange(img, lower, upper)
output = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('img', img)
cv2.imshow('mask', mask)
cv2.imshow('output', output)
cv2.waitKey(0)
cv2.destroyAllWindows()
For the right graph, I still have no idea how to extract the data. I thought of identifying the center by detecting all the dotted lines, and then, by detecting the intersections of these dotted lines with the gray area, measuring the distance between the center and these intersections. However, I couldn't yet figure out how to do this properly, since it sounds quite complex. The following code snippet shows how far I've gotten with the line detection. In this case too, the detection is far from accurate. Does someone have an idea how to tackle this problem?
import numpy as np
import cv2

# Reading the image
img = cv2.imread('test2.jpg')
# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Apply edge detection
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
# Line detection
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=50, maxLineGap=20)
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imwrite('linesDetected.jpg', img)
For the left image, using your approach, try looking at the RGB histogram: the sector colors should show up as significant peaks, which you can then use to measure the relative areas of the segments.
Another alternative could be the Hough Circle Transform, which should work on circle segments. See also here.
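A rough sketch of the histogram idea; the file name comes from your snippet, but the hue window, saturation/value cut-offs and the number of sectors are assumptions:

import cv2
import numpy as np

img = cv2.imread('images/test2.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Build a hue histogram; the strongest peaks should correspond to the sector colors.
hist = cv2.calcHist([hsv], [0], None, [180], [0, 180]).ravel()
peaks = np.argsort(hist)[-6:]            # assume at most 6 colored sectors
for hue in sorted(peaks):
    lower = np.array([max(int(hue) - 5, 0), 50, 50])
    upper = np.array([min(int(hue) + 5, 179), 255, 255])
    mask = cv2.inRange(hsv, lower, upper)
    print('hue', int(hue), 'area (px):', int(cv2.countNonZero(mask)))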
For the right image ... let me think ...
You could create an "empty" diagram with no data inside. You know the locations of the circle segments ("cake pieces"). Then you could identify the area where the data is (the dark one), either with a grey threshold, an RGB threshold, Find Contours, or a Watershed / Distance Transform.
In the end the idea is to make a boolean overlay between the cleared image and the data segments that were found. Then you can identify which share of each circle segment is covered, or, knowing the center, find the farthest covered point from the center.
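A rough sketch of that overlay idea; "sectors.png" is a hypothetical image of the empty diagram in which each circle segment has been pre-labelled with a distinct grey value, and the grey threshold is an assumption:

import cv2
import numpy as np

sectors = cv2.imread('sectors.png', cv2.IMREAD_GRAYSCALE)        # hypothetical label image
scan = cv2.cvtColor(cv2.imread('test2.jpg'), cv2.COLOR_BGR2GRAY)
# The filled (dark grey) data area, by a simple grey threshold.
_, data_mask = cv2.threshold(scan, 120, 255, cv2.THRESH_BINARY_INV)
for label in np.unique(sectors):
    if label == 0:
        continue                       # 0 assumed to be background
    sector_mask = (sectors == label)
    covered = np.count_nonzero(data_mask[sector_mask] > 0)
    share = covered / sector_mask.sum()
    print('sector', int(label), 'covered share:', round(share, 3))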
