# Tesseract Windows installer: https://github.com/UB-Mannheim/tesseract/wiki
import pytesseract
import cv2

# Load the image and convert from OpenCV's BGR order to RGB before OCR
image = cv2.imread("img.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Point pytesseract at the installed Tesseract binary and run OCR
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
text = pytesseract.image_to_string(image)
print(text)
The output is not even close: I get "WX017," instead of "MX011A".
However, if I manually rearrange the characters, it works. I could transform the input image and define an ROI, but the orientation could be anything; the text could be upside down as well.
I want to recognize curved text around a circle.
This is pretty difficult, as Tesseract expects text to be minimally distorted.
One (admittedly far-fetched) possibility would be to try and detect the circle, then map it onto a rectangle.
To do this, you can reduce each letter to a separate blob by applying a blur filter and discarding grey values below a threshold; iterate until you get more or less circular blobs, then take their centers. Pick several of those centers three at a time, at random, and for each triplet calculate the center of the circle passing through all three. The average of those centers should be more or less the center of the lettering circle.
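A rough sketch of that blob-and-circumcenter idea; the file name, blur size, thresholding choice and area cut-off below are only guesses that would need tuning for your image:

import cv2
import numpy as np

def circumcenter(a, b, c):
    # Center of the circle through three 2D points, via perpendicular bisectors
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-9:
        return None  # (nearly) collinear points
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay) + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx) + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)

# Blur and threshold until each letter collapses into a single blob
gray = cv2.cvtColor(cv2.imread("img.png"), cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (15, 15), 0)
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Blob centers via connected components (drop tiny specks)
n, _, stats, centroids = cv2.connectedComponentsWithStats(binary)
centers = [tuple(c) for c, s in zip(centroids[1:], stats[1:]) if s[cv2.CC_STAT_AREA] > 20]

# Average the circumcenters of random triplets to estimate the lettering circle
rng = np.random.default_rng(0)
estimates = []
for _ in range(200):
    i, j, k = rng.choice(len(centers), size=3, replace=False)
    cc = circumcenter(centers[i], centers[j], centers[k])
    if cc is not None:
        estimates.append(cc)
center = np.median(estimates, axis=0)
radius = np.median([np.hypot(x - center[0], y - center[1]) for x, y in centers])
print("estimated center:", center, "estimated radius:", radius)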
Having the center and the approximate radius, it is relatively easy to map the circular crown of appropriate height to a rectangle (e.g. using a polar-to-cartesian transform).
You then apply tesseract to the transformed rectangle.
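A minimal sketch of the unwrapping step, assuming the center and radius estimated above; the crop band and the rotation/flip direction would need adjusting to the actual image:

import cv2
import numpy as np
import pytesseract

image = cv2.imread("img.png", cv2.IMREAD_GRAYSCALE)
center, radius = (240.0, 240.0), 200.0   # placeholders: use the estimates from the blob step

# Unwrap the disc: rows of the result correspond to angle, columns to radius
dsize = (int(radius), int(2 * np.pi * radius))   # (radial samples, angular samples)
polar = cv2.warpPolar(image, dsize, center, radius, cv2.WARP_POLAR_LINEAR)

# Put the angle on the x-axis so the lettering runs horizontally, then crop the
# radial band ("crown") containing the text; a flip may also be needed
unwrapped = cv2.rotate(polar, cv2.ROTATE_90_CLOCKWISE)
band = unwrapped[int(radius * 0.6):int(radius * 0.9), :]

print(pytesseract.image_to_string(band))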
It should also be possible to use autocorrelation to average and sharpen the multiple copies of the identical text along that rectangle (i.e. "MX011A MXOI1A MX017A" --> "MX011A").
I have two images: one blurred and one sharp. I need to recover the original image using these two images. I have used a simple FFT and inverse FFT to estimate the point spread function and the deblurred image.
import numpy as np
import matplotlib.pyplot as plt

# PSF estimate in the frequency domain: FFT(blurred) / FFT(sharp)
fftorg = np.fft.fft2(img1)     # sharp image
fftblur = np.fft.fft2(img2)    # blurred image
psffft = fftblur / fftorg
# Back to the spatial domain, shifted so the kernel ends up centred
psfifft = np.fft.ifftshift(np.fft.ifft2(psffft))
plt.imshow(abs(psfifft), cmap='gray')
This is the point spread function image I got. I need to find the type of kernel used for blurring and also its size. Is it possible to get the kernel used from the PSF?
I am working on a project on image alignment. So far I've computed the warped image using cv2.warpPerspective(). Now, to ensure the robustness of the algorithm, I want to do the following:
Pick 4 or 5 pixel coordinates from the source image and find their corresponding pixel points in the warped image.
Compute the difference between these values and the locations of those points in the reference image.
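Something like the sketch below is roughly what I have in mind; the homography file, the sample pixels, and the reference locations are just placeholders:

import cv2
import numpy as np

H = np.load("homography.npy")            # the 3x3 matrix used with cv2.warpPerspective
src_pts = np.float32([[100, 150], [400, 80], [250, 300], [50, 420]]).reshape(-1, 1, 2)

# Map the source pixels through the homography to their warped locations
warped_pts = cv2.perspectiveTransform(src_pts, H)

# Where those points actually sit in the reference image (e.g. hand-picked)
ref_pts = np.float32([[102, 149], [398, 83], [251, 297], [48, 423]]).reshape(-1, 1, 2)
errors = np.linalg.norm(warped_pts - ref_pts, axis=2).ravel()
print("per-point error (px):", errors, "mean:", errors.mean())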
The goal. To estimate the 3D location (x, y, z) of the centre, the width (the larger diameter of the glass) and the height of the glass, similar to this drawing. The inputs are two images, each coming from a different camera (here and here).
The setup. The images come from two fixed and calibrated (known intrinsic and extrinsic parameters) cameras.
My attempt.
I have segmented the image using FCN or DeepLab. Results here and here.
Then I obtained a binary mask of the class of interest (glass) and extracted the left-, up-, right- and bottom-most points of that mask. Results here and here.
I obtained four 3D points through triangulation of the "corresponding points" (the upper point of image 1 with the upper point of image 2, the rightmost point of image 1 with the rightmost point of image 2, etc.).
I compute the dimensions as width = |left - right| and height = |up - bottom|.
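For reference, a simplified sketch of these steps (the mask and projection-matrix file names and values are placeholders, not my actual code):

import cv2
import numpy as np

P1, P2 = np.load("P1.npy"), np.load("P2.npy")        # 3x4 matrices, K @ [R | t] per camera
mask1 = cv2.imread("mask1.png", cv2.IMREAD_GRAYSCALE)
mask2 = cv2.imread("mask2.png", cv2.IMREAD_GRAYSCALE)

def extreme_points(mask):
    # Left-, up-, right- and bottom-most pixels of a binary mask, as (x, y)
    ys, xs = np.nonzero(mask)
    return np.float64([[xs.min(), ys[xs.argmin()]],    # left
                       [xs[ys.argmin()], ys.min()],    # up
                       [xs.max(), ys[xs.argmax()]],    # right
                       [xs[ys.argmax()], ys.max()]])   # bottom

pts1 = extreme_points(mask1).T        # 2x4 layout required by triangulatePoints
pts2 = extreme_points(mask2).T

pts4d = cv2.triangulatePoints(P1, P2, pts1, pts2)     # homogeneous 4x4
pts3d = (pts4d[:3] / pts4d[3]).T                      # 4x3, in the units of the calibration

left, up, right, bottom = pts3d
width = np.linalg.norm(left - right)
height = np.linalg.norm(up - bottom)
center = pts3d.mean(axis=0)
print("center:", center, "width:", width, "height:", height)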
Issues. The points are not actual correspondences, therefore the reprojection is inaccurate and the measurement is inaccurate as well (resulting in up to 3 cm of error). Note that if I manually choose the corresponding pixels on both images and then triangulate, I get approximately 0.1 cm of error.
Can you guide me on how to solve this problem better (more accurately)?
Thank you!
PS: I am using python and OpenCV.
EDIT: This is a deeper explanation of a question I asked earlier, which is still not solved for me.
I'm currently trying to write some code that can extract data from some uncommon graphs in a book. I scanned the pages of the book, and using OpenCV I would like to detect some features of the graphs in order to convert them into usable data. In the left graph I'm looking for the height of the "triangles", and in the right graph the distance from the center to the points where the dotted lines intersect with the gray area. In both cases I would like to convert these values into numeric data for further usage.
For the left graph, I thought of detecting all the individual colors and computing the area of each sector by counting the number of pixels of that color. Once I have the areas of these sectors, I can easily calculate their heights using basic math. The following code snippet shows how far I've already gotten with identifying different colors. However, I can't manage to make this work accurately: it always seems to detect some colors of other sectors as well, or to miss some pixels of the sector itself. I think it has something to do with the boundaries I'm using, but I can't quite figure out how to set them. Does someone know how I can determine these values?
import numpy as np
import cv2

img = cv2.imread('images/test2.jpg')

# BGR bounds for one sector color (these are the values I'm unsure about)
lower = np.array([0, 0, 100])
upper = np.array([50, 56, 150])

# Keep only the pixels whose color falls inside the bounds
mask = cv2.inRange(img, lower, upper)
output = cv2.bitwise_and(img, img, mask=mask)

cv2.imshow('img', img)
cv2.imshow('mask', mask)
cv2.imshow('output', output)
cv2.waitKey(0)
cv2.destroyAllWindows()
For the right graph, I still have no idea how to extract data from it. I thought of identifying the center by detecting all the dotted lines, and then, by detecting the intersections of these dotted lines with the gray area, I could measure the distance between the center and these intersections. However, I couldn't figure out how to do this properly yet, since it sounds quite complex. The following code snippet shows how far I've gotten with the line detection. In this case too, the detection is far from accurate. Does someone have an idea how to tackle this problem?
import numpy as np
import cv2

# Read the image
img = cv2.imread('test2.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply edge detection
edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# Line detection
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=50, maxLineGap=20)
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

cv2.imwrite('linesDetected.jpg', img)
For the left image, using your approach, try looking at the RGB histogram; the sector colors should show up as significant peaks if you want to use the relative area of the segments.
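A small sketch of that histogram idea (the file name and the peak criterion are only placeholders):

import cv2
import numpy as np

img = cv2.imread('images/test2.jpg')

for i, channel in enumerate(('blue', 'green', 'red')):
    hist = cv2.calcHist([img], [i], None, [256], [0, 256]).ravel()
    # Crude peak picking: bins that are clearly above the average count
    peaks = np.where(hist > 5 * hist.mean())[0]
    print(channel, "peaks around intensities:", peaks)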
Another alternative could be to use the Hough Circle Transform, which should also work on circle segments. See also here.
For the right image ... let me think ...
You could create an "empty" diagram with no data inside. You know the locations of the circle segments ("cake pieces"). Then you could identify the area where the data is (the dark one), either by using a grey threshold, an RGB threshold, by finding contours, or with a watershed / distance-transform approach.
In the end, the idea is to make a boolean overlay between the cleared image and the data segments that were found. Then you can identify what share of each circle segment is covered, or, knowing the center, find the farthest covered point from the center.
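A sketch of that overlay idea, assuming you already have per-segment masks of the empty template and the diagram center (all file names and thresholds below are placeholders):

import cv2
import numpy as np

# Dark data area of the scanned diagram (threshold is a guess)
img = cv2.imread('test2.jpg', cv2.IMREAD_GRAYSCALE)
_, data_mask = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY_INV)

center = (350, 350)                                   # known center of the empty template
segment_files = ('segment0.npy', 'segment1.npy')      # precomputed per-sector masks

for i, f in enumerate(segment_files):
    seg = (np.load(f) > 0).astype(np.uint8) * 255     # ensure a 0/255 uint8 mask
    overlap = cv2.bitwise_and(data_mask, seg)
    share = cv2.countNonZero(overlap) / max(cv2.countNonZero(seg), 1)
    ys, xs = np.nonzero(overlap)
    if len(xs):
        dist = np.hypot(xs - center[0], ys - center[1]).max()
        print("segment", i, "covered share:", round(share, 2), "farthest point (px):", round(dist, 1))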