I am trying to analyse an image and extract each number, then process each one with a CNN trained on MNIST. The images show garments with a grid-like pattern; at each intersection of the grid there is a number (e.g. 0412). I want to detect which number it is and then store its coordinates. Does anyone have any recommendations on how to preprocess the image, given that it is quite noisy and contains multiple numbers? I have tried using contours and it didn't work. I also converted the image to binary, but there are areas of the image which are unreadable. My initial idea was to isolate each number and then process it.
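To illustrate, here is a rough sketch of the kind of isolation step I have in mind, using OpenCV (the threshold block size and the area bounds are placeholders I would still have to tune):

```python
import cv2

# Load the garment photo in grayscale (path is a placeholder).
img = cv2.imread("garment.jpg", cv2.IMREAD_GRAYSCALE)

# Adaptive thresholding copes with uneven lighting better than a
# single global threshold; block size 31 and offset 10 are guesses.
binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 31, 10)

# Morphological opening to remove small speckle noise.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

# Keep each connected component in a plausible digit-size range,
# along with its coordinates for later.
n, labels, stats, centroids = cv2.connectedComponentsWithStats(clean)
digit_crops = []
for i in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[i]
    if 50 < area < 2000:  # placeholder size filter
        digit_crops.append(((x, y), clean[y:y + h, x:x + w]))
```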
Thanks in advance!
Related
I have large images (5000x3500) and I want to divide each into small 512x512 images without losing the original image coordinates. The large images are annotated/labelled, which is why I want to keep the original coordinates; I will use the small images to train a YOLO model. I am not sure if this is called tiling, but is there any suggestion on how to do it using Python or opencv-python?
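One common approach is to crop tiles on a fixed grid and record each tile's top-left offset, so a box (x1, y1, x2, y2) in the full image becomes (x1 - x0, y1 - y0, x2 - x0, y2 - y0) inside the tile at offset (x0, y0). A minimal sketch with opencv-python (file names are placeholders; edge tiles come out smaller than 512, so either pad or skip them):

```python
import cv2

TILE = 512

def tile_image(path):
    """Split a large image into TILE x TILE crops, remembering each
    crop's top-left offset in the original image."""
    img = cv2.imread(path)
    h, w = img.shape[:2]
    tiles = []
    for y0 in range(0, h, TILE):
        for x0 in range(0, w, TILE):
            crop = img[y0:y0 + TILE, x0:x0 + TILE]
            tiles.append((x0, y0, crop))  # offset maps labels back
            cv2.imwrite(f"tile_{x0}_{y0}.png", crop)
    return tiles
```

Labels whose boxes straddle a tile boundary need to be clipped to the tile (or the tiling done with some overlap) before training YOLO.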
Is there a way to attach low-feature images together vertically? I have tried OpenCV's ORB, SIFT and SURF; however, if the images have no features or few features, they fail to stitch them together.
I want these images put together:
Please let me know if there is a way to stitch them together, or if blending works.
Feature matching is often based on contours inside the images, and there are no contours like corners or distinctive polygons in either of your images. Once OpenCV can't find contours, and therefore features, it can't do the matching needed to check that the features correspond and assemble the two images.
If you do have some features despite the image content, try lowering the matching threshold. This allows the algorithm to match features that are not exactly the same, as in your two different pictures.
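For example, with ORB descriptors and Lowe's ratio test, loosening the ratio (say from the usual 0.7 to 0.9) keeps matches that are not clearly unambiguous, which may be the only option on low-texture images. A sketch, assuming your two pictures are top.png and bottom.png:

```python
import cv2

img1 = cv2.imread("top.png", cv2.IMREAD_GRAYSCALE)     # placeholder paths
img2 = cv2.imread("bottom.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
pairs = matcher.knnMatch(des1, des2, k=2)

# Lowe's ratio test with a deliberately loose 0.9 threshold
# (0.7 is the usual value) to accept weaker matches.
good = [p[0] for p in pairs
        if len(p) == 2 and p[0].distance < 0.9 * p[1].distance]
```

Expect more false matches at a loose threshold, so a robust estimator such as cv2.findHomography with RANSAC is still needed afterwards.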
I am trying to write a script (in bash using ImageMagick, or in Python) to generate an image similar to the one in this example:
The source is 25 separate JPEGs. So far I have written a script (ImageMagick) which takes each of the images, detects the contours of the person, and replaces the white background with a transparent one.
The next step is to fit the contours randomly into one large image. Each image should fit into the larger image without overlapping its neighbours. It seems I need some type of collision detection.
I am looking for pointers on how to tackle this problem.
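For the rectangle case, I imagine rejection sampling with an axis-aligned overlap test would be enough, something like this sketch (canvas size and retry count are placeholders):

```python
import random

def overlaps(a, b):
    """Rectangles given as (x, y, w, h); True if they intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_randomly(sizes, canvas_w, canvas_h, tries=1000):
    """Drop each (w, h) at a random spot, rejecting positions that
    collide with anything already placed."""
    placed = []
    for w, h in sizes:
        for _ in range(tries):
            rect = (random.randint(0, canvas_w - w),
                    random.randint(0, canvas_h - h), w, h)
            if not any(overlaps(rect, p) for p in placed):
                placed.append(rect)
                break
    return placed
```

For contour-accurate packing, the same loop would work with the images' alpha masks (testing whether two masks overlap) instead of rectangles, at the cost of slower checks.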
I have pictures of networks, and my goal is to process the images in order to extract the skeleton of the network.
My approach consists of two steps (sketched below):
1) Going from a grayscale image to a binary image, using local thresholding or Otsu's method, and then a median filter (Python function medfilt).
2) Using thinning algorithms in order to extract the skeleton of the network.
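A minimal sketch of both steps with scikit-image and SciPy (the median kernel size and the speck-size threshold are placeholders; flip the comparison if the network is darker than the background):

```python
from scipy.signal import medfilt2d
from skimage import filters, io, morphology

gray = io.imread("network.png", as_gray=True)  # placeholder path

# Step 1: Otsu threshold, then a median filter to suppress
# salt-and-pepper noise before thinning.
binary = gray > filters.threshold_otsu(gray)
binary = medfilt2d(binary.astype(float), kernel_size=5) > 0.5

# Step 2: thin down to a one-pixel-wide skeleton.
skeleton = morphology.skeletonize(binary)

# Dropping small disconnected specks removes some of the artefacts.
skeleton = morphology.remove_small_objects(skeleton, min_size=20)
```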
Considering the quality of the first image, I'm pretty sure I can do a lot better than that. I thus have two questions:
1) Taking the last image, how would you remove all the small lines perpendicular to the ridges, and the small gaps, that are artefacts?
2) What algorithms would you actually advise me to use for these two steps?
I have a set of images which represent a sequence of characters. I'm wondering whether OpenCV or other techniques can segment and crop each character from the image. For instance:
I have as input:
I want to get:
(six individual crops, one per character: 5, 0, 4, 1, 9 and 2)
You have two problems here in going from your input to your output:
The first is separating your characters. If your images always look like this, with the numbers neatly separated, then you should have no problem separating them using findContours or connectedComponents, maybe along with a bounding-box function like minAreaRect.
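A rough sketch of that first step with findContours, assuming dark digits on a light background (OpenCV 4 return signature):

```python
import cv2

img = cv2.imread("digits.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Invert so the digits are white on black, as findContours expects.
_, thresh = cv2.threshold(img, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# Sort bounding boxes left to right so crops come out in reading order.
boxes = sorted(cv2.boundingRect(c) for c in contours)
crops = [img[y:y + h, x:x + w] for x, y, w, h in boxes]
```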
The second problem is, once you have separated your digits, how to tell which digit the image represents. This problem has a name: OCR.
If you have a lot of images, it is also possible to train a classification algorithm, as the tagging of your question suggests. The "hot topic" right now is deep learning with neural networks, but for simple applications, regular machine-learning classification with hand-designed features might do the trick.
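As an illustration of the non-deep-learning route, even a k-nearest-neighbours classifier on raw pixels does respectably on digits. A sketch with scikit-learn, using its small built-in 8x8 digits set as a stand-in for MNIST:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 8x8 digit images, flattened to 64 raw-pixel features.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```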
If you want to segment the numbers, I would first try playing with opening operations (because your letters are black on a white background; it would be closing if it were the opposite) in order to fill the holes in your numbers. Then I would project the pixels vertically and analyse the shape you get. The valley points in this projected profile give you the vertical limits between characters. You can do the same horizontally to get the top and bottom limits of your characters. This approach will only work if the text is horizontal.
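A sketch of that projection idea (assumes dark text on a white background and a horizontal line; the 2% gap threshold is a guess):

```python
import cv2
import numpy as np

img = cv2.imread("line.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Invert so ink is white, then close small holes inside the strokes
# (the equivalent of opening on the original black-on-white image).
_, ink = cv2.threshold(img, 0, 255,
                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
ink = cv2.morphologyEx(ink, cv2.MORPH_CLOSE, np.ones((3, 3), np.uint8))

# Vertical projection: total ink per column.
profile = ink.sum(axis=0)

# Columns with (almost) no ink are the valleys between characters;
# the transitions in this mask are the character boundaries.
is_gap = profile <= profile.max() * 0.02
cuts = np.flatnonzero(np.diff(is_gap.astype(int)))
```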
Then you could use a standard OCR library or go for deep learning. Since these numbers appear to be from the MNIST dataset, you will find a lot of examples of doing OCR on it using deep learning or other techniques:
http://yann.lecun.com/exdb/mnist/
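For instance, a minimal MNIST CNN with TensorFlow/Keras (one epoch only, just to show the shape of the pipeline; a real model would train longer and add more layers):

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0  # add channel axis, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))
```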