I'm working on an ANPR system and need to convert the registration plate image to text output. I previously tried (py)tesseract for the OCR, but the results were not good enough.
As my current training set I'm using this font, as all registration fonts are the same.
In my images, some of the resulting number plates are at odd angles, so the plate isn't recognised correctly.
So I am asking: is there a way to distort each digit in many different ways, store those distortions in a NumPy array in a file, and then apply machine learning techniques to that data?
Something like this (although the output would be different):
https://archive.ics.uci.edu/ml/datasets/Letter+Recognition
I used a previous post to help me so far with separating the characters and so on:
Recognize the characters of license plate
Thanks, any help would be appreciated.
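One way to build such a distorted training set is to apply random affine warps (rotation, shear, scaling) to each clean glyph and stack the results in a NumPy array. The following is only a minimal sketch: it assumes each character of the font is already available as a 2-D array, and the function names, distortion ranges, and output file name are all illustrative choices, not part of any library.

```python
import numpy as np


def random_affine(rng, max_rotation_deg=15.0, max_shear=0.3, scale_range=(0.8, 1.2)):
    """Return a random 2x2 matrix combining rotation, shear and uniform scaling."""
    angle = np.deg2rad(rng.uniform(-max_rotation_deg, max_rotation_deg))
    shear = rng.uniform(-max_shear, max_shear)
    scale = rng.uniform(*scale_range)
    rotation = np.array([[np.cos(angle), -np.sin(angle)],
                         [np.sin(angle), np.cos(angle)]])
    shear_matrix = np.array([[1.0, shear], [0.0, 1.0]])
    return scale * rotation @ shear_matrix


def distort(glyph, matrix):
    """Warp a 2-D glyph about its centre using nearest-neighbour sampling."""
    height, width = glyph.shape
    centre = np.array([(height - 1) / 2.0, (width - 1) / 2.0])
    rows, cols = np.indices((height, width))
    out_coords = np.stack([rows.ravel(), cols.ravel()]) - centre[:, None]
    # Inverse mapping: for every output pixel, look up the source pixel.
    src = np.linalg.inv(matrix) @ out_coords + centre[:, None]
    src = np.rint(src).astype(int)
    inside = (src[0] >= 0) & (src[0] < height) & (src[1] >= 0) & (src[1] < width)
    out = np.zeros(height * width, dtype=glyph.dtype)  # pixels from outside stay 0
    out[inside] = glyph[src[0][inside], src[1][inside]]
    return out.reshape(height, width)


def build_dataset(glyphs, samples_per_glyph=50, seed=0):
    """glyphs maps a label such as "7" to a 2-D array; all arrays share one shape."""
    rng = np.random.default_rng(seed)
    images, labels = [], []
    for label, glyph in glyphs.items():
        for _ in range(samples_per_glyph):
            images.append(distort(glyph, random_affine(rng)))
            labels.append(label)
    return np.stack(images), np.array(labels)


def save_dataset(path, images, labels):
    """Store both arrays in one compressed .npz file (path is up to the caller)."""
    np.savez_compressed(path, images=images, labels=labels)
```

The saved file can later be read back with `np.load`, and the `images` array flattened into feature vectors for whatever classifier is used.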
Related
I'm trying to convert images of text to text using Tesseract. The results seem accurate in my trials. However, it seems that I can also train Tesseract to be more accurate, although that is complicated.
My question is: how reliable is out-of-the-box Tesseract for image-to-text on digital images containing popular fonts like Times New Roman, Arial, etc.?
It usually depends on the content of the image. If there is noise or a background unrelated to the text (logos, tables, random things), the quality will drop, especially if the contrast between the text and the noise is not big enough.
It also depends on the text size: if you have multiple text areas with different font sizes, you'd most likely need to process those separately (or figure out whether a different PSM, page segmentation mode, could help you), so it would be hard to prepare a generic solution that works in all cases.
In general, you can visit the Tesseract "how to improve quality" page and try to follow all the instructions there.
For my school project, I need to find images in a large dataset. I'm working with Python and OpenCV. So far I've managed to find an exact match of an image in the dataset, but it takes a lot of time even though I only had 20 images for the test code. So I've searched a few pages of Google results and tried the code on these pages:
image hashing
building an image hashing search engine
feature matching
I've also been thinking of searching through the hashed dataset first, saving the paths of the candidates, and then finding the best feature-matching image among them. But most of the time, the narrowed-down candidate set is very different from my query image.
Image hashing is really great. It looks like what I need, but there is a problem: I need to find an exact match, not similar photos. So I'm asking: if you have any suggestion, or a piece of code that might help or improve the reference code I've linked, can you share it with me? I'd be really happy to try or research whatever you send or suggest.
OpenCV is probably the wrong tool for this. Its algorithms are geared towards finding similar matches, not exact ones. The general idea there is to use machine learning to teach the code to recognize, for example, what a car looks like, so it can detect cars in videos even when the color or form changes (driving in shadow, a different make, etc.).
I've found two approaches work well when trying to build an image database.
1. Use a normal hash algorithm like SHA-256 plus maybe some metadata (file or image size) to find matches.
2. Resize the image down to 4x4 or even 2x2. Use the pixel RGB values as the "hash".
The first approach is to reduce the image to a number. You can then put the number in a look up table. When searching for the image, apply the same hashing algorithm to the image you're looking for. Use the new number to look in the table. If it's there, you have a match.
Note: In all cases, hashing can produce the same number for different pictures. So you have to compare all the pixels of two pictures to make sure it's really an exact match. That's why it sometimes helps to add information like the picture size (in pixels, not file size in bytes).
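A minimal sketch of the first approach, using only the Python standard library. It makes one simplifying assumption that should be stated loudly: it hashes and compares the raw file bytes, so "exact match" here means byte-identical files, not identical pixels stored in different encodings. The function names are illustrative.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of the file contents, read in chunks to keep memory use low."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def build_index(paths):
    """Map (file size in bytes, SHA-256) to the list of files sharing that key."""
    index = defaultdict(list)
    for path in paths:
        path = Path(path)
        index[(path.stat().st_size, file_digest(path))].append(path)
    return index


def find_exact(query, index):
    """Look the query up by key, then confirm each candidate byte for byte."""
    query = Path(query)
    key = (query.stat().st_size, file_digest(query))
    wanted = query.read_bytes()
    return [path for path in index.get(key, []) if path.read_bytes() == wanted]
```

The index is built once over the whole dataset; each later query costs one hash computation plus a dictionary lookup, and the final byte comparison only runs on the few candidates that share the key.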
The second approach lets you find pictures which look very similar to the eye but are in fact slightly different. Imagine cropping off a single pixel column on the left or tilting the image by 0.01°. To you, the image will be the same, but to a computer the two will be totally different. The second approach tries to average small changes out. The cost here is that you will get more collisions, especially for B&W pictures.
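A rough sketch of the second approach in plain Python, assuming the image has already been decoded into a list of rows of `(r, g, b)` tuples and is at least `grid` pixels wide and high. The quantisation into `levels` buckets is an added assumption, used so that tiny differences in the averaged colours still produce the same key; values that sit right on a bucket boundary can still end up with different hashes.

```python
def thumbnail_hash(pixels, grid=4, levels=16):
    """Average the image down to grid x grid cells and quantise each channel.

    pixels: list of rows, each row a list of (r, g, b) tuples with values 0-255.
    Returns a tuple of grid * grid * 3 small integers usable as a dictionary key.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                mean = sum(pixel[channel] for pixel in block) / len(block)
                cells.append(int(mean * levels / 256))  # coarse bucket, 0..levels-1
    return tuple(cells)
```

Lower values of `grid` and `levels` tolerate larger changes but cause more collisions, which matches the trade-off described above.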
Finding exact image matches using hash functions can be done with the undouble library (Disclaimer: I am also the author). It works using a multi-step process of pre-processing the images (grayscaling, normalizing, and scaling), computing the image hash, and the grouping of images based on a threshold value.
We want to train on a particular font, with all letters A-Z and all digits 0-9. How many positive and negative samples of each would do the job?
It would be a tedious task, but Tesseract is not accurate enough to read the number plates of moving vehicles. Any other suggestions for this task?
I am quoting from the following Wikipedia article: https://en.m.wikipedia.org/wiki/Automatic_number_plate_recognition
There are seven primary algorithms that the software requires for identifying a license plate:
1. Plate localization – responsible for finding and isolating the plate on the picture.
2. Plate orientation and sizing – compensates for the skew of the plate and adjusts the dimensions to the required size.
3. Normalization – adjusts the brightness and contrast of the image.
4. Character segmentation – finds the individual characters on the plates.
5. Optical character recognition.
6. Syntactical/Geometrical analysis – check characters and positions against country-specific rules.
7. The averaging of the recognised value over multiple fields/images to produce a more reliable or confident result. Especially since any single image may contain a reflected light flare, be partially obscured or other temporary effect.
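Steps 6 and 7 of the list above are easy to illustrate in a few lines. The sketch below assumes the OCR stage already returns one string per video frame; the plate format in `PLATE_PATTERN` (two letters, two digits, three letters) is a made-up example and must be replaced by the rules of the country in question.

```python
import re
from collections import Counter

# Hypothetical format for illustration only: two letters, two digits, three letters.
PLATE_PATTERN = re.compile(r"[A-Z]{2}[0-9]{2}[A-Z]{3}")


def is_plausible(reading):
    """Step 6: reject readings that break the assumed syntax rules."""
    return PLATE_PATTERN.fullmatch(reading) is not None


def vote(readings):
    """Step 7: per-position majority vote over all plausible readings."""
    valid = [reading for reading in readings if is_plausible(reading)]
    if not valid:
        return None
    length = len(valid[0])  # every valid reading has the same length
    return "".join(
        Counter(reading[i] for reading in valid).most_common(1)[0][0]
        for i in range(length)
    )
```

A single frame with a light flare or a misread character is then either filtered out by the syntax check or outvoted by the other frames.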
Coming back to your question: Haar cascades can be used to localise number plates. For the OCR part, however, I would personally recommend a CNN. You can find an implementation here: https://matthewearl.github.io/2016/05/06/cnn-anpr/
There is also a library that specialises in this task, so check that out as well: https://github.com/openalpr/openalpr
For the Haar cascade: https://github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_licence_plate_rus_16stages.xml
Good luck
Here is the pre-processed image of a water meter reading:
But whenever I use Tesseract to recognize the digits, it does not give the correct output.
So I want to extract/segment only the digits as a region of interest and save it in a new image file, so that Tesseract can recognize it properly.
I am able to remove those extra noises in an image, that's why I am using this option.
Is there any way to do that?
The unprocessed image is:
Before you try extracting the digits from this image, try to reduce the image size so that the digits are about 16 pixels high. Secondly, restrict Tesseract's character whitelist to "0123456789", to avoid other characters like ",.;'/" being returned (which is quite common on this type of picture). Lowering the image size should help Tesseract to drop this noise rather than read it or mix it in with the digits. This method will certainly not work 100% of the time on this kind of image, but clearing this kind of noise by other means would without a doubt be a challenge. Maybe you could provide the unprocessed image if you have one, and then we can see what is possible.
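The two suggestions above could look roughly like this with pytesseract and Pillow. This is only a sketch under several assumptions: both packages and the Tesseract binary are installed, the current digit height in pixels is measured by hand and passed in, and the `--psm 7` (single text line) setting is a guess for this kind of image. Whether `tessedit_char_whitelist` is honoured can depend on the Tesseract version and OCR engine in use, so the result is filtered to digits afterwards as well.

```python
import re

# Assumed settings: single text line, digits only.
DIGITS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"


def target_size(width, height, digit_height, wanted_digit_height=16):
    """New (width, height) so that digits end up about wanted_digit_height px high."""
    scale = wanted_digit_height / digit_height
    return max(1, round(width * scale)), max(1, round(height * scale))


def digits_only(text):
    """Drop everything that is not a digit, in case the whitelist is ignored."""
    return re.sub(r"\D", "", text)


def read_meter(path, digit_height):
    """Shrink the image, then OCR it with the digit whitelist."""
    # Imported here so the helpers above work without the OCR stack installed.
    from PIL import Image
    import pytesseract

    image = Image.open(path).convert("L")
    image = image.resize(target_size(image.width, image.height, digit_height))
    return digits_only(pytesseract.image_to_string(image, config=DIGITS_CONFIG))
```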
I want to extract the text information contained in a postscript image file (the captions to my axis labels).
These images were generated with pgplot. I have tried ps2ascii and ps2txt on Ubuntu but they didn't produce any useful results. Does anyone know of another method?
Thanks
It's likely that pgplot drew the characters of the text directly as line strokes rather than emitting them as text, especially since pgplot is designed to output to a huge range of devices, including plotters, where you would have to do this.
Edit:
If you have enough plots to be worth the effort, then it's a very simple image processing task. Convert each page to something like TIFF, in monochrome. Threshold the image to binary; the text will be the max pixel value.
Use a template matching technique. If you have a limited set of possible labels, then just match the entire label; you can even start with a template of the correct size and rotation. Then just flag each plot as containing label[1-n], with no need to read the actual text.
If you don't know the label, then you can still do OCR fairly easily: just extract the region around the axis, rotate it for the vertical axis, and use Google's free OCR lib.
If you have pgplot, you can even build the training set for OCR or the template images directly, rather than having to harvest them from the image list.
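The threshold-and-template-match idea described above can be sketched in plain Python. It assumes each page has already been converted to a grayscale raster held as a list of rows of 0-255 values, that the text is the bright side after thresholding, and that the templates are binary rasters at the correct size and rotation. All names here are illustrative, and the brute-force search is only practical for small images.

```python
def threshold(gray, cutoff=128):
    """Binarise a grayscale raster: 1 where the pixel is at least cutoff, else 0."""
    return [[1 if value >= cutoff else 0 for value in row] for row in gray]


def match_score(image, template, top, left):
    """Fraction of template pixels that agree with the image at this offset."""
    matches = 0
    for dy, row in enumerate(template):
        for dx, value in enumerate(row):
            if image[top + dy][left + dx] == value:
                matches += 1
    return matches / (len(template) * len(template[0]))


def best_match(image, template):
    """Slide the template over the image; return (best score, (top, left))."""
    t_height, t_width = len(template), len(template[0])
    best = (0.0, None)
    for top in range(len(image) - t_height + 1):
        for left in range(len(image[0]) - t_width + 1):
            score = match_score(image, template, top, left)
            if score > best[0]:
                best = (score, (top, left))
    return best


def identify_label(image, templates):
    """Return the name of the template that fits the binary image best."""
    scores = {name: best_match(image, tpl)[0] for name, tpl in templates.items()}
    return max(scores, key=scores.get)
```

For real page-sized images, restricting the search to the region around each axis keeps the number of offsets manageable.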