Removing lines from an image of a notebook for digit detection in Python

I need to remove the lines from the image below of numbers on a piece of ruled paper without distorting my digits. Otherwise my digit detection algorithm fails, because artefacts of the ruling lines remain in the regions of interest.
a cleaner version of the file without any artefacts

This is a classic task for Fourier-domain filtering.
Perform the Fourier transform:
import numpy as np
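# scipy.misc.imread/imsave/imshow were removed from recent SciPy releases; imageio offers equivalents
from imageio.v2 import imread, imwrite as imsave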
img = imread("10XIn.jpg")[:,:,:3]
imggray = np.mean(img, -1)
imfft = np.fft.fft2(imggray)
mags = np.abs(np.fft.fftshift(imfft))
angles = np.angle(np.fft.fftshift(imfft))
visual = np.log(mags)
visual2 = (visual - visual.min()) / (visual.max() - visual.min())*255
visual2 will look like the following:
Note the diagonal line across the center; it represents your ruling lines.
I created the mask for this line manually, but ideally you would filter it out programmatically.
Then we read the mask and paint out the line:
mask = imread("fftimg4_mask.jpg")[:,:,:3]
mask = (np.mean(mask,-1) > 20)
visual[mask] = np.mean(visual)
And then reverse the fft:
newmagsshift = np.exp(visual)
newffts = newmagsshift * np.exp(1j*angles)
newfft = np.fft.ifftshift(newffts)
imrev = np.fft.ifft2(newfft)
newim2 = 255 - np.abs(imrev).astype(np.uint8)
imsave("fftimg2.jpg", newim2 )
Here is newim2
Of course, you could do more accurate patching in Fourier space, and you could also apply the result back to the original image to keep the colors, but I think this post illustrates the idea.
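As for building the mask programmatically instead of painting it by hand, here is a minimal sketch of one possible approach. It is not what I did above: the function name `suppress_spectral_peaks` and the parameters `keep_radius` and `percentile` are made up for illustration, and the idea is simply to zero the strongest frequency components that lie outside a small disc around the DC term. On a real page the digits also carry energy, so both parameters would need tuning.

```python
import numpy as np

def suppress_spectral_peaks(gray, keep_radius=3, percentile=99.5):
    """Zero the strongest frequency components outside a small disc around DC."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    mags = np.abs(spectrum)
    rows, cols = gray.shape
    cy, cx = rows // 2, cols // 2              # DC location after fftshift
    yy, xx = np.ogrid[:rows, :cols]
    outside_dc = (yy - cy) ** 2 + (xx - cx) ** 2 > keep_radius ** 2
    # Candidate "line" components: unusually strong and not part of the DC disc.
    peaks = (mags > np.percentile(mags, percentile)) & outside_dc
    spectrum[peaks] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

# Synthetic page (an assumption for the demo): one ruled line every 8 rows.
page = np.zeros((64, 64))
page[::8, :] = 1.0
cleaned = suppress_spectral_peaks(page)
```

With a purely periodic pattern like this synthetic page, all of the line energy sits in a handful of isolated peaks, which is why a simple percentile threshold is enough here.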

Okay, this might be a bit complicated, as the color of the notebook lines seems quite close to the color of the digits in your example. I presume that the green boxes are your addition and not part of the data itself.
You don't state which framework you use, so I will only provide some general tips on how to approach this problem.
The first step would be some thresholding. You can use either binary thresholding or, better, adaptive thresholding with correctly sized windows. You will have to experiment with this. The result of the thresholding will be a binary image, still with lines.
The second step will be to use morphological operations to clean up the image. If you are not sure what morphology is, look at this morphology tutorial.
Around halfway through, there are some examples of removing lines from images. The biggest problem is that some digits also contain horizontal lines. So one option is to use a rather small morphology kernel (maybe 3 rows and 1 column), as the notebook lines are thinner, and to update the recognizer so that it recognizes even distorted digits. This should be doable, because all the digits will be distorted in the same way.

Another way to do it is to exploit the known structure.
deskew the image (skew can be found with Hough transform in opencv)
locate peaks in row sums
physically clone pixels above and below lines
I just implemented this for another dataset, example attached. This could be tuned further.
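A minimal sketch of the last two steps, assuming the image is already deskewed and has dark ink on a light background. The function name `remove_ruled_rows` and the `rel_thresh` parameter are hypothetical, and this is not the exact code I used for the attached example.

```python
import numpy as np

def remove_ruled_rows(gray, rel_thresh=0.5):
    """Find rows whose summed darkness peaks, then rebuild them from clean neighbours."""
    gray = gray.astype(np.float64)
    darkness = (255.0 - gray).sum(axis=1)      # row sums of "ink"
    line_rows = np.flatnonzero(darkness > rel_thresh * darkness.max())
    clean_rows = np.setdiff1d(np.arange(gray.shape[0]), line_rows)
    out = gray.copy()
    for r in line_rows:
        above = clean_rows[clean_rows < r]
        below = clean_rows[clean_rows > r]
        neighbours = []
        if above.size:
            neighbours.append(gray[above[-1]])  # nearest clean row above
        if below.size:
            neighbours.append(gray[below[0]])   # nearest clean row below
        if neighbours:
            out[r] = np.mean(neighbours, axis=0)
    return out.astype(np.uint8), line_rows

# Synthetic page: one ruling line at row 10 and a vertical stroke crossing it.
page = np.full((30, 30), 255, np.uint8)
page[10, :] = 0
page[3:20, 5] = 0
cleaned, line_rows = remove_ruled_rows(page)
```

Averaging the rows above and below keeps a stroke that crosses the line, because both neighbours are dark at that column.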

Related

Choosing right structuring element

I am creating a program to automatically separate the solar cells from a PV module. I first thresholded the image using an adaptive threshold to obtain the following image
After that I intend to remove the black pixels within the cell boundaries using dilation, for which I used an elliptical structuring element of size (10,10) and obtained the following image
As you can see, there are still some black pixels left. If I increase the size of the structuring element, I lose the cell boundaries
I have tried the other available structuring elements, such as cross and rectangular, without any success. Hence I need to define a custom kernel, and I don't know how to go about defining one.
import numpy as np
import cv2
img=cv2.imread('result2.jpg',0)
th1 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
cv2.THRESH_BINARY,25,-2)
kernel=cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(15,15))
closing = cv2.morphologyEx(th1, cv2.MORPH_CLOSE, kernel)
cv2.imwrite('closing.jpg',closing)
cv2.imwrite('threshold.jpg',th1)
cv2.waitKey(0)
cv2.destroyAllWindows()
Original image before thresholding
My first piece of advice is to not threshold right away. Thresholding is something you want to do way at the end. Thresholding throws away valuable information. Morphological operations work on grey-value images too!
Choosing the right morphological operators to keep only the shapes you are interested in is actually quite intuitive. In this case, you want to keep both horizontal and vertical lines. Let's use line structuring elements. The lines are dark, so we use a closing to remove things that do not look like our lines.
A closing with vertical lines will remove all horizontal lines, and a closing with horizontal lines will remove all vertical lines. So how to combine these two? It turns out that the infimum (pixel-wise minimum) of two closings is a closing also. So the infimum of a closing with vertical lines and one with horizontal lines is a closing with the two lines at the same time: you'll preserve shapes where either of those two lines fits.
Here is an example. I'm using PyDIP (I don't have OpenCV).
import diplib as dip
img = dip.ImageRead('/Users/cris/Downloads/ZrF7k.tif')
img = img.TensorElement(1) # keep only green channel
img = img[0:-2,1:-1] # let's remove the artifacts at the right and top edges
f1 = dip.Closing(img, dip.SE([50,1],'rectangular'))
f2 = dip.Closing(img, dip.SE([1,50],'rectangular'))
out = dip.Infimum(f1, f2)
out.Show('lin')
You can try to tweak that a bit, and add some additional processing, and add your adaptive thresholding at the end to get the edges of the PV cells. But there is actually a much better way of finding these.
I'm taking advantage here of the fact that the panel is so very straight w.r.t. the image, and that it covers the whole image. We can simply take a mean projection along rows and along columns:
x = dip.Mean(out, process=[1, 0]).Squeeze()
y = dip.Mean(out, process=[0, 1]).Squeeze()
import matplotlib.pyplot as pp
pp.subplot(2,1,1)
pp.plot(x)
pp.subplot(2,1,2)
pp.plot(y)
pp.show()
It should be fairly straight-forward to detect the edges of the cells from these projections.
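One simple way to do that detection, sketched with plain numpy: mark the positions where the projection drops well below its mean and take the centre of each run. The function name `dark_line_positions` and the factor `k` are hypothetical, and the profile below is synthetic rather than taken from the PV image.

```python
import numpy as np

def dark_line_positions(profile, k=1.0):
    """Centres of runs where the projection falls below mean - k * std."""
    profile = np.asarray(profile, dtype=float)
    below = np.flatnonzero(profile < profile.mean() - k * profile.std())
    if below.size == 0:
        return []
    # Split the indices wherever consecutive positions are not adjacent.
    runs = np.split(below, np.flatnonzero(np.diff(below) > 1) + 1)
    return [int(round(run.mean())) for run in runs]

# Synthetic projection: bright cells with two dark cell boundaries.
profile = np.full(100, 200.0)
profile[20:23] = 50
profile[60] = 50
positions = dark_line_positions(profile)
```

Applied to the x and y projections separately, the two position lists would give the grid of cell boundaries.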

Defining color range for histologic image mask within HSV colorspace (Python, OpenCV, Image-Analysis):

In an effort to separate histologic slides into several layers based on color, I modified some widely distributed code (1) available through OpenCV's community. Our staining procedure marks different cell types of tissue cross sections with different colors (B cells are red, Macrophages are brown, background nuclei have a bluish color).
I'm interested in selecting only the magenta-colored and brown parts of the image.
Here's my attempt to create a mask for the magenta pigment:
import cv2
import numpy as np
def mask_builder(filename,hl,hh,sl,sh,vl,vh):
    #load image, convert to hsv
    bgr = cv2.imread(filename)
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    #set lower and upper bounds of range according to arguments
    lower_bound = np.array([hl,sl,vl],dtype=np.uint8)
    upper_bound = np.array([hh,sh,vh],dtype=np.uint8)
    return cv2.inRange(hsv, lower_bound,upper_bound)
mask = mask_builder('sample 20 138 1.jpg', 170,180, 0,200, 0,230)
cv2.imwrite('mask.jpg', mask)
So far a trial and error approach has produced poor results:
Can anyone suggest a smarter method to threshold within the HSV colorspace? I've done my best to search for answers in previous posts, but it seems that these color ranges are particularly difficult to define due to the nature of the image.
References:
Separation with Colorspaces: http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html
python opencv color tracking
BGR separation: http://www.pyimagesearch.com/2014/08/04/opencv-python-color-detection/
UPDATE:
I've found a working solution to my problem. I increased the lower bounds of 'S' and 'V' at regular intervals using a simple FOR loop, output the result for each test image and chose the best. I found that my lower bounds for S and V should be set at 100 and 125. This systematic method of trial and error produced better results.
I am happy you found your answer.
I will suggest an alternative method that might work. Unfortunately I am not proficient with Python, so you'll need to find out how to code it in Python (it's basic).
If I had the first image you have after the HSV threshold, I would use morphological operations to get the information I want.
I would probably try a "closing" first, but if that doesn't work I would dilate, then fill, and then erode by the same amount as the initial dilation.
After this first step you'll probably need to delete the small "noise" blobs around it, and then you'll have the image.
This is how it would be in Matlab (showing this mainly so you can see the results):
I=imread('http://i.stack.imgur.com/RlH4V.jpg');
I=I>230; % Create Black and white image (this is because in stackoverflow its a jpg)
ker=strel('square',3); % Create a 3x3 square kernel
I1=imdilate(I,ker); % Dilate
I2=imfill(I1,'holes'); % Fill holes
I3=imerode(I2,ker); % Erode
Ilabel=bwlabel(I3,8); % Get a label per independent blob
% Get maximum area blob (you can do this with a for in python easily)
areas = regionprops(Ilabel,'Centroid','Area','PixelIdxList');
[~,index] = max([areas.Area]); % Get the maximum area
Imask=Ilabel==index; % Get the image with only the max area.
% Plot: This is just matlab code, no relevance
figure;
subplot(131)
imshow(I1);
title('Dilated')
subplot(132)
imshow(I2);
title('Closed')
subplot(133)
imshow(I3);
title('Eroded')
figure;
imshow(imread('http://i.stack.imgur.com/ZqrF9.jpg'))
hold on
h=imshow(bwperim(Imask));
set(h,'alphadata',Imask/2)
Note that I started from the "bad" HSV segmentation. If you try a better one the results may improve. Also, play with the kernel size for the erosion and dilation.
Through trial-and-error (incrementing down and up the "S" and "V" scales), I found that my desired colors require a relaxed range for "S" and "V" values. I'll refrain from sharing the particular values I use because I don't think anyone would find such information useful.
Note that the code originally shared works fine once more representative ranges are used.

Cases where Morphological Opening and Closing yields the same results?

I would like to know if there are any examples or cases where Opening and Closing Morphology operations on a single image produce the same results.
As an example, let's say we have an image X, and we have done opening operation to produce Y. Similarly, we have done a closing operation on the original X to produce the same Y. I would like to know if there are examples for these type of images X. Programming examples in Python or MATLAB are also appreciated.
Yes, there are. As one small example, take a binary image that consists of a number of squares that are disconnected and distinct. Provided that you specify a square structuring element, and that you choose it to be smaller than the smallest square in the image, either operation will give you the same result.
If you did an opening on this image and a closing on this image, you will produce the same results. Remember, an opening is an erosion followed by a dilation where a closing is a dilation followed by an erosion. In terms of analyzing the shapes, erosion slightly shrinks the area of the image while dilation slightly enlarges it.
By doing an erosion followed by a dilation (opening), you're shrinking the object and then growing it again. This will bring the image back to where it was before, provided that you choose the structuring element like what we talked about before. Similarly, if you did a dilation followed by an erosion (closing), you're growing the object and then shrinking it again, also bringing the image back to where it was before... following that same guideline I just talked about of course.
If you were to choose a structuring element where it is larger than the smallest object, doing an opening will remove this object from the image, and so you won't get the original image back. Also, you need to make sure that the objects are well far away from each other, and that the size of the structuring element does not overlap any of the objects as you slide over and do the morphology operations. The reason why is because if you were to do a closing, you would join these two objects together and so that won't get you the same results either!
Here's an example image that I generated that is binary:
To generate this image in MATLAB, you can do:
A = false(200,200);
A(30:60,30:60) = true;
A(90:110,90:110) = true;
A(10:30, 135:155) = true;
A(150:180,100:120) = true;
In Python, you can do this with numpy:
import numpy as np
A = np.zeros((200,200), dtype='uint8')
A[29:60,29:60] = 255
A[89:110,89:110] = 255
A[9:30, 134:155] = 255
A[149:180, 99:120] = 255
The reason why I had to create the array as uint8 in numpy is because when we want to show this image, I'm going to use OpenCV and it requires that the image be at least a uint8 type.
Now, let's choose a 5 x 5 square structuring element, and let's perform a closing and an opening with this image. We will display the results in a single figure going from left to right:
se = strel('square', 5);
A_close = imclose(A, se);
A_open = imopen(A, se);
figure;
subplot(1,3,1);
imshow(A);
title('Original');
subplot(1,3,2);
imshow(A_close);
title('Closed');
subplot(1,3,3);
imshow(A_open);
title('Open');
This is the result:
It certainly looks the same! To really show the difference, let's subtract the closed and opened result from the original image. You should get a blank image in the end if they're both equal to the original image.
figure;
subplot(1,2,1);
imshow(abs(double(A) - double(A_close)));
subplot(1,2,2);
imshow(abs(double(A) - double(A_open)));
Bear in mind that I converted the images to double to facilitate subtraction, and I used abs to ensure that negative differences are reflected. This is what I get:
As you can see, both results are totally blank, meaning they're exact copies of the original image after each result.
The equivalent code in Python for the first part is the following:
import cv2
se = np.ones((5,5), dtype='uint8')
A_close = cv2.morphologyEx(A, cv2.MORPH_CLOSE, se)
A_open = cv2.morphologyEx(A, cv2.MORPH_OPEN, se)
cv2.imshow('Original', A)
cv2.imshow('Close', A_close)
cv2.imshow('Open', A_open)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here's what I get:
You'll need to install the OpenCV package for this Python code. I displayed all of the images as three separate figures, then left the windows there until you choose any one of them and push a key. Once you do this, all of the windows will close. If you want to show the subtraction stuff, this is the code in Python:
A_close_diff = A - A_close
A_open_diff = A - A_open
cv2.imshow('Close Diff', A_close_diff)
cv2.imshow('Open Diff', A_open_diff)
cv2.waitKey(0)
cv2.destroyAllWindows()
I didn't name the figures in MATLAB because what we're showing is obvious, but OpenCV requires a name for each window, so I chose names that describe what each one shows. I also didn't need to take the absolute value: in numpy, uint8 arithmetic that overflows or underflows simply wraps around, whereas in MATLAB the values get clipped. That's why in MATLAB I needed to convert to double and take the absolute value: imshow doesn't display negative intensities, and for 0 - 1 the output would be clipped to 0, so you couldn't tell that this location differs. In Python, 0 - 1 for uint8 results in 255, so a difference is clearly visible and there's no need for the abs and casting that we did in MATLAB. Here's what I get:
In general, you can reproduce what I did with any kind of shape and any size shape, so long as you choose a structuring element that mimics the properties of the shape that is in your image, and you choose a structuring element that is smaller than the smallest shape seen in that image. I'm sure there are many more examples that don't have to follow these specific guidelines, but this is the best example that I can think of at this moment.
This should hopefully get you started.
Good luck!
Yes, there are such images. One of the properties of opening (it's mentioned in wiki article, for example) is that it is an anti-extensive operation, i.e. if Y is opening of X, then Y ⊆ X. Similarly, closing is an extensive operation (i.e. X ⊆ Y), therefore for any such image X = Y. Any image invariant to both opening and closing will satisfy your requirement (and, as I have just shown, only such images will).
Concrete examples depend on structuring element used when performing erosion or dilation. For example, if it is a square n x n matrix with all elements equal to 1, then any rectangle with both height and width greater than n (and located far enough, i.e. at least n/2 pixels, from image edges) will satisfy this requirement.

Detect the size of a QR Code in Python using OpenCV and Zbar

I have code that takes an image from the webcam, scans it for QR codes using zBar and returns the value of the code and an image with the QR code highlighted (based off http://sourceforge.net/p/qrtracker/wiki/Home/). How can I also make it tell me the size of the code (as a pixel value or % of the screen)?
Additional question: is there a way to detect how skewed it is (e.g rotation in Z about the Y-axis)?
Regarding the size of Code
zBar provides a method to do this in terms of pixel values (Once you know the size in pixel values, you can find it in %)
I would like to extend the code here: http://sourceforge.net/apps/mediawiki/zbar/index.php?title=HOWTO:_Scan_images_using_the_API
The code above finds a QR code in an image, prints its data, etc. Now extend the last few lines of that code:
import math
scanner.scan(image)
# x is one decoded symbol obtained by iterating over image after the scan
# (it is called symbol in the linked example)
[a,b,c,d] = x.location # it returns the four corners of the QR code in order
w = math.sqrt((a[0]-b[0])**2 + (a[1]-b[1])**2) # Just distance between two points
h = math.sqrt((b[0]-c[0])**2 + (b[1]-c[1])**2)
Area = w*h
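Note that w*h is only exact when the code appears as an unskewed rectangle. A slightly more general sketch is shown below. It assumes the four corners are given in order around the quadrilateral; the helper names and the 640 x 480 frame size are made up for illustration. It computes the area with the shoelace formula, expresses it as a percentage of the frame, and compares opposite edge lengths as a crude hint of perspective skew.

```python
import math

def quad_area(corners):
    """Area of a polygon from its ordered corners (shoelace formula)."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def edge_lengths(corners):
    """Length of each edge, in the same order as the corners."""
    n = len(corners)
    return [math.hypot(corners[(i + 1) % n][0] - corners[i][0],
                       corners[(i + 1) % n][1] - corners[i][1])
            for i in range(n)]

# Hypothetical corner coordinates and frame size.
corners = [(100, 100), (100, 300), (300, 300), (300, 100)]
frame_w, frame_h = 640, 480

area_px = quad_area(corners)
area_percent = 100.0 * area_px / (frame_w * frame_h)
e = edge_lengths(corners)
# Ratios far from 1.0 suggest the code is tilted away from the camera.
opposite_ratios = (e[0] / e[2], e[1] / e[3])
```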
Skewness of QRCode
I think you want to transform it into a pre-defined shape (like square, rectangle, etc). If so, you can define corners of a pre-defined shape, say ((100,100), (300,100),(300,300),(100,300)). Then find the perspective transform and apply the transformation if you would like. An example in OpenCV is provided here: http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html#perspective-transformation

Image Gurus: Optimize my Python PNG transparency function

I need to replace all the white(ish) pixels in a PNG image with alpha transparency.
I'm using Python in AppEngine and so do not have access to libraries like PIL, imagemagick etc. AppEngine does have an image library, but is pitched mainly at image resizing.
I found the excellent little pyPNG module and managed to knock up a little function that does what I need:
make_transparent.py
pseudo-code for the main loop would be something like:
for each pixel:
    if pixel looks "quite white":
        set pixel values to transparent
    otherwise:
        keep existing pixel values
and (assuming 8bit values) "quite white" would be:
    where each r,g,b value is greater than "240"
    AND each r,g,b value is within "20" of each other
This is the first time I've worked with raw pixel data in this way, and although it works, it also performs extremely poorly. It seems like there must be a more efficient way of processing the data without iterating over each pixel in this manner? (Matrices?)
I was hoping someone with more experience in dealing with these things might be able to point out some of my more obvious mistakes/improvements in my algorithm.
Thanks!
This still visits every pixel, but may be faster:
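from array import array
# pixels, threshold, tolerance and nearly_eq() are assumed to come from the question's make_transparent.py
new_pixels = []
for row in pixels:
    new_row = array('B', row)  # copy of the RGBA row as unsigned bytes
    i = 0
    while i < len(new_row):
        r = new_row[i]
        g = new_row[i + 1]
        b = new_row[i + 2]
        if r>threshold and g>threshold and b>threshold:
            m = int((r+g+b)/3)
            if nearly_eq(r,m,tolerance) and nearly_eq(g,m,tolerance) and nearly_eq(b,m,tolerance):
                new_row[i + 3] = 0  # only the alpha value is changed
        i += 4
    new_pixels.append(new_row)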
It avoids the slicen generator, which will be copying the entire row of pixels for every pixel (less one pixel each time).
It also pre-allocates the output row by directly copying the input row, and then only writes the alpha value of pixels which have changed.
Even faster would be to not allocate a new set of pixels at all, and just write directly over the pixels in the source image (assuming you don't need the source image for anything else).
Honestly, the only heuristic I could conceive is picking a few arbitrary, random points on your image and using a flood fill.
This only works well if your image has large contiguous white portions (if your image is an object with few or no holes in front of a background, then you're in luck -- you actually have a heuristic for which points to flood fill from).
(disclaimer: I am no image guru =/ )
I'm quite sure there is no short cut for this. You have to visit every single pixel.
The issue seems to have more to do with loops in Python than with images.
Python loops are extremely slow; it is best to avoid them and use built-in looping constructs instead.
Here, if you were willing to copy the image, you could use a list comprehension:
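def make_transparent(pixel):
    # pixel is assumed to be an (r, g, b, a) tuple; "quite white" follows the question's rule
    r, g, b, a = pixel
    if min(r, g, b) > 240 and max(r, g, b) - min(r, g, b) <= 20:
        return (r, g, b, 0)
    return pixel
newImage = [make_transparent(p) for p in oldImage]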
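If numpy happens to be available in your environment (an assumption; the question does not say so), the whole test can be done without any Python-level loop. This is only a sketch: the function name is made up, and the pixel data is assumed to be arranged as a height x width x 4 RGBA array, so flat rows from pyPNG would need a reshape first.

```python
import numpy as np

def whiten_to_transparent(rgba, threshold=240, tolerance=20):
    """Set alpha to 0 wherever a pixel is 'quite white' per the question's rule."""
    rgba = np.array(rgba, dtype=np.uint8)       # works on a copy
    rgb = rgba[..., :3].astype(np.int16)        # avoid uint8 wrap-around in the subtraction
    bright = rgb.min(axis=-1) > threshold
    grey = (rgb.max(axis=-1) - rgb.min(axis=-1)) <= tolerance
    rgba[bright & grey, 3] = 0
    return rgba

# One row of four RGBA pixels: white, near-white, red, light grey.
image = [[[255, 255, 255, 255],
          [250, 245, 241, 255],
          [255, 0, 0, 255],
          [230, 230, 230, 255]]]
result = whiten_to_transparent(image)
```

Every pixel is still visited, but the visiting happens inside numpy's compiled loops rather than in the Python interpreter.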
