I want to remove anything other than text from a license plate image with a binary filter.
I have the projections on each axis but I don't know how to apply them. My idea is to erase the white outlines.
This is the image I'm working with for now:
This is the projection on the X axis:
from matplotlib import pyplot as plt
import numpy as np

# img is the binary license-plate image, loaded elsewhere
(rows, cols) = img.shape
# normalized column sums (horizontal projection)
h_projection = np.array([x / 255 / rows for x in img.sum(axis=0)])
plt.plot(range(cols), h_projection.T)
And this is the result:
As you can see in the graph, the line spikes at the end because of the white contour.
How can I erase everything that is above a certain threshold in the photo? Any help is appreciated.
So, you want to extract the black areas within the white characters.
For example, you can select the columns (or rows) in your histograms where the value is less than a certain threshold.
from matplotlib import pyplot as plt
import numpy as np
from scipy import ndimage

img = plt.imread('binary_image/iXWgw.png')
(rows, cols) = img.shape
h_projection = np.array([x / rows for x in img.sum(axis=0)])
threshold = (np.max(h_projection) - np.min(h_projection)) / 4
print("we will use threshold {} for horizontal".format(threshold))
# select the black areas
black_areas = np.where(h_projection < threshold)
fig = plt.figure(figsize=(16, 8))
fig.add_subplot(121)
for j in black_areas:
    img[:, j] = 0
    plt.plot((j, j), (0, 1), 'g-')
plt.plot(range(cols), h_projection.T)

v_projection = np.array([x / cols for x in img.sum(axis=1)])
threshold = (np.max(v_projection) - np.min(v_projection)) / 4
print("we will use threshold {} for vertical".format(threshold))
black_areas = np.where(v_projection < threshold)
fig.add_subplot(122)
for j in black_areas:
    img[j, :] = 0
    plt.plot((0, 1), (j, j), 'g-')
plt.plot(v_projection, range(rows))
plt.show()

# obscure the masked areas on the image
plt.figure(figsize=(16, 12))
plt.subplot(211)
plt.title("Image with the projection mask")
plt.imshow(img)

# erode the features
plt.subplot(212)
plt.title("Image after erosion (suggestion)")
eroded_img = ndimage.binary_erosion(img, structure=np.ones((5, 5))).astype(img.dtype)
plt.imshow(eroded_img)
plt.show()
So now you have the horizontal and vertical projections, which look like this:
And after that you can apply the mask. There are several ways of doing this; in the code above it is already applied within the for loops, where we set img[:, j] = 0 for the columns and img[j, :] = 0 for the rows. That was easy and, I think, intuitive, but you can look for other methods; one alternative using boolean indexing is sketched below.
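For example, here is a minimal sketch with NumPy boolean indexing, assuming the arrays from the code above (col_threshold and row_threshold stand for the two thresholds that are both called threshold there):
import numpy as np

col_mask = h_projection < col_threshold  # columns whose projection falls below the threshold
row_mask = v_projection < row_threshold  # rows whose projection falls below the threshold

img[:, col_mask] = 0  # blank all selected columns at once
img[row_mask, :] = 0  # blank all selected rows at once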
As a suggestion, I would say you can look into the morphological operator of erosion, which can help to separate the white parts.
So the output would look like this.
Unfortunately, the upper and lower parts still show white regions. You can manually set those rows to black with img[:10, :] = 0 and img[100:, :] = 0, but that would probably not work on all the images you have (if you are trying to train a neural network I assume you have lots of them, so you need code that works on all of them). A more general option is sketched below.
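As a minimal sketch of such a general option (my suggestion, not part of the original code; the 0.5 threshold is an assumed value to tune), you can blank border rows whose projection is high, scanning inwards from the top and bottom:
import numpy as np

def trim_white_borders(img, threshold=0.5):
    # Mean intensity of each row (binary image with values in [0, 1])
    v_projection = img.sum(axis=1) / img.shape[1]
    top = 0
    while top < img.shape[0] and v_projection[top] > threshold:
        img[top, :] = 0  # blank a mostly-white top row
        top += 1
    bottom = img.shape[0] - 1
    while bottom > top and v_projection[bottom] > threshold:
        img[bottom, :] = 0  # blank a mostly-white bottom row
        bottom -= 1
    return img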
So, since you now ask about segmentation as well, this opens another topic. Segmentation is a complex task, and it is not as straightforward as a binary mask. I would strongly suggest you read some material on it before you apply something without understanding what it does. For example, here is a guide on image processing with scipy, but you may look for more.
As a suggestion and a small snippet to make it work, you can use the labeling from scipy.ndimage.
Here is a small piece of code (from the guide):
label_im, nb_labels = ndimage.label(eroded_img)
plt.figure(figsize=(16,12))
plt.subplot(211)
plt.title("Segmentation")
plt.imshow(label_im)
plt.subplot(212)
plt.title("One Object as an example")
plt.imshow(label_im == 6) # change number for the others!
Which will output:
As an example I showed the letter S. If you change label_im == 6 you will get the next letter. As you will see yourself, it is not always correct, and other little pieces of the image are also considered as objects. So you will have to work a little bit more on that; a size filter like the one sketched below can help.
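As a minimal sketch of such a size filter (assuming label_im and nb_labels from above; the 50-pixel minimum area is an assumed value to tune), small labelled components can be discarded by pixel count:
import numpy as np

# Pixel count of each label (index 0 is the background)
sizes = np.bincount(label_im.ravel())
min_area = 50

# Zero out every component smaller than min_area
for lbl in range(1, nb_labels + 1):
    if sizes[lbl] < min_area:
        label_im[label_im == lbl] = 0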
I've been using the skimage.segmentation module to find contiguous segments within an image.
For example,
segments quite nicely to
I want to be able to view the distinct regions of the original image in isolation (such that the above image would result in 6 roughly rectangular sub-images). I have obtained some degree of success in doing this, but it's been difficult. Is there any pre-existing module I can use to accomplish this?
If not, high-level algorithm advice would be appreciated.
Approach thus far:
image_slic = seg.slic(image, n_segments=6)
borders = seg.find_boundaries(image_slic)

sub_images = []
new_seg = []
for every row of borders:
    new_seg.append([])
    for every pixel in every row:
        if (pixel is not a border and is not already processed):
            new_seg[-1].append(pixel)
            mark pixel as processed
        elif (pixel is a border and is not already processed):
            break
    if (on the first pixel of a row OR the first unprocessed pixel):
        sub_images.append(new_seg)
        new_seg = []
With this approach, I can generate the four regions from the example image that border the left side without error. While it's not shown in the above pseudo-code, I'm also padding segments with transparent pixels to preserve their shape. This additional consideration makes finding right-side sub-images more difficult.
This can be readily accomplished through NumPy's boolean indexing:
import numpy as np
from skimage import io, segmentation
import matplotlib.pyplot as plt
n_segments = 6
fig_width = 2.5*n_segments
img = io.imread('https://i.imgur.com/G44JEG7.png')
segments = segmentation.slic(img, n_segments=n_segments)
fig, ax = plt.subplots(1, n_segments)
fig.set_figwidth(fig_width)

for index in np.unique(segments):
    # assumes segment labels run 0..n_segments-1
    segment = img.copy()
    segment[segments != index] = 0
    ax[index].imshow(segment)
    ax[index].set(title=f'Segment {index}')
    ax[index].set_axis_off()

plt.show()
You could obtain the same result using NumPy's where function like this:
for index in np.unique(segments):
    segment = np.where(np.expand_dims(segments, axis=-1) == index, img, [0, 0, 0])
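If you then want the roughly rectangular sub-images in isolation, here is a minimal sketch (my suggestion, using skimage.measure.regionprops) that crops each segment to its bounding box:
from skimage import measure

sub_images = []
# note: regionprops ignores label 0, so shift the labels up by one
for region in measure.regionprops(segments + 1):
    minr, minc, maxr, maxc = region.bbox  # bounding box of this segment
    sub_images.append(img[minr:maxr, minc:maxc])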
I am trying to create an OCR system in python - the first part involves extracting all characters from an image. This works fine and all characters are separated into their own bounding boxes.
Code attached below:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from scipy.misc import imread, imresize
from skimage.segmentation import clear_border
from skimage.morphology import label
from skimage.measure import regionprops
image = imread('./ocr/testing/adobe.png',1)
bw = image < 120
cleared = bw.copy()
clear_border(cleared)
label_image = label(cleared,neighbors=8)
borders = np.logical_xor(bw, cleared)
label_image[borders] = -1
print(label_image.max())
fig, ax = plt.subplots(ncols=1, nrows=1, figsize=(6, 6))
ax.imshow(bw, cmap='jet')
for region in regionprops(label_image):
    if region.area > 20:
        minr, minc, maxr, maxc = region.bbox
        rect = mpatches.Rectangle((minc, minr), maxc - minc, maxr - minr,
                                  fill=False, edgecolor='red', linewidth=2)
        ax.add_patch(rect)
plt.show()
However, since the letters i and j have dots on top of them, the code takes the dots as separate bounding boxes. I am using the regionprops function from skimage.measure. Would it also be a good idea to resize and normalise each bounding box?
How would I modify this code to account for i and j? My understanding is that I would need to merge the bounding boxes that are close by? I tried, with no luck... Thanks.
Yes, you generally want to normalize the content of your bounding boxes to fit your character classifier's input dimensions (assuming you are working with character classifiers using explicit segmentation, and not with sequence classifiers that segment implicitly); a sketch follows below.
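As a minimal sketch of that normalization (my suggestion; the 32x32 output size and the use of skimage.transform.resize are assumptions, not part of the original answer):
from skimage.transform import resize

def normalize_box(image, bbox, out_shape=(32, 32)):
    # Crop a bounding box (minr, minc, maxr, maxc) and scale it to a fixed size
    minr, minc, maxr, maxc = bbox
    crop = image[minr:maxr, minc:maxc].astype(float)
    return resize(crop, out_shape, anti_aliasing=True)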
For merging vertically isolated CCs of the same letter, e.g. i and j, I'd try an anisotropic Gaussian filter (very small sigma in x-direction, larger in y-direction). The exact parameterization will depend on your input data, but it should be easy to find a suitable value experimentally such that all letters result in exactly one CC.
An alternative would be to analyze CCs which exhibit horizontal overlap with other CCs and merge those pairs where the overlap exceeds a certain relative threshold (a sketch of this follows after the example below).
Illustrating on the given example:
# Anisotropic Gaussian: blur strongly in y, not at all in x
from scipy.ndimage import gaussian_filter

# f is the grayscale input image as a float array with values in [0, 255]
filtered = gaussian_filter(f, (2, 0))
plt.imshow(filtered, cmap=plt.cm.gray)

# Now threshold (dark pixels become foreground; the value 1 suits this example image)
binarized = filtered < 1
plt.imshow(binarized, cmap=plt.cm.gray)
It's easy to see that each character is now represented by exactly one CC. Now we pretty much only have to apply each mask and crop away the white areas to end up with the bounding box for each character. After normalizing their size we can feed them directly to the classifier. (Consider that we lose ascender/descender-line information as well as the width/height ratio, and those may be useful features for the classifier, so it should help to feed them to the classifier explicitly in addition to the normalized bounding-box content.)
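And here is a minimal sketch of the overlap-based merging mentioned earlier (my reconstruction; the 0.5 relative threshold is an assumed value):
def merge_overlapping_boxes(boxes, min_overlap=0.5):
    # Merge bounding boxes (minr, minc, maxr, maxc) whose column ranges overlap enough
    merged = True
    while merged:
        merged = False
        out = []
        for a in boxes:
            for i, b in enumerate(out):
                overlap = min(a[3], b[3]) - max(a[1], b[1])  # shared width in columns
                narrower = min(a[3] - a[1], b[3] - b[1])
                if overlap > min_overlap * narrower:
                    # Replace with the union of the two boxes
                    out[i] = (min(a[0], b[0]), min(a[1], b[1]),
                              max(a[2], b[2]), max(a[3], b[3]))
                    merged = True
                    break
            else:
                out.append(a)
        boxes = out
    return boxes
For example, boxes = [r.bbox for r in regionprops(label_image)] from the question's code would be a suitable input.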
Many papers on image segmentation provide examples where each segment is covered with half-transparent color mask:
If I have an image and a mask, can I achieve the same result in Matplotlib?
EDIT:
The mask in my case is defined as an array with the same width and height as an image filled with numbers from 0 to (num_segments+1), where 0 means "don't apply any color" and other numbers mean "cover this pixel with some distinct color". Yet, if another representation of a mask is more suitable, I can try to convert to it.
Here are a couple of complications that I found in this task so that it doesn't sound that trivial:
Colored regions are not regular shapes like lines, squares or circles so functions like plot(..., 'o'), fill() or fill_between() don't work here. They are not even contours (or at least I don't see how to apply them here).
Modifying the alpha channel isn't the most popular thing in plots, so it is rarely mentioned in Matplotlib's docs.
This can surely be done. The implementation would depend on what your mask looks like.
Here is an example
import matplotlib.pyplot as plt
import numpy as np

image = plt.imread("https://i.stack.imgur.com/9qe6z.png")

ar = np.zeros((image.shape[0], image.shape[1]))
ar[100:300, 50:150] = 1.0   # one region gets a distinct color
ar[:, 322:] = np.nan        # NaN pixels are drawn fully transparent

fig, ax = plt.subplots()
ax.imshow(image)
ax.imshow(ar, alpha=0.5, cmap="RdBu")
plt.show()
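Since your mask is an integer array where 0 means "don't apply any color", here is a minimal sketch (my variation, reusing image from above with a toy seg_mask) that uses a masked array so the zeros stay fully transparent:
import numpy as np
import matplotlib.pyplot as plt

# Toy label mask: 0 = uncolored, 1..n = distinct segments (assumed layout)
seg_mask = np.zeros(image.shape[:2], dtype=int)
seg_mask[100:300, 50:150] = 1
seg_mask[50:150, 200:300] = 2

# Masked entries (label 0) are not drawn at all by imshow
masked = np.ma.masked_where(seg_mask == 0, seg_mask)

fig, ax = plt.subplots()
ax.imshow(image)
ax.imshow(masked, alpha=0.5, cmap="tab10")
plt.show()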
I'm trying to stretch an image's histogram using a logarithmic transformation. Basically, I am applying a log operation to each pixel's intensity. When I try to change the image's values pixel by pixel, the new values are not saved, although the histogram looks OK. Also, the maximum value is not correct. This is my code:
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt
img = cv2.imread('messi.jpg',0)
img2 = img
for i in range(0,img2.shape[0]-1):
    for j in range(0,img2.shape[1]-1):
        if (math.log(1+img2[i,j],2)) < 0:
            img2[i,j] = 0
        else:
            img2[i,j] = np.int(math.log(1+img2[i,j],2))
            print(np.int(math.log(1+img2[i,j],2)))

print(img2.ravel().max())
cv2.imshow('LSP',img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.set_window_title('LSP histogram')
plt.hist(img2.ravel(),256,[0,256]); plt.show()
img3 = img2
B = np.int(img3.max())
A = np.int(img3.min())
print ("Maximum intensity = ", B)
print ("minimum intensity = ", A)
This is also the histogram I get:
However, the maximum intensity shows 186! This isn't applying the proper logarithmic operation at all.
Any ideas?
The code you wrote performs a logarithmic transformation applied to the image intensities. The reason why you are getting such a high spurious intensity as the maximum is because your for loops are wrong. Specifically, your range is incorrect. range is exclusive of the ending interval, which means that you must go up to img.shape[0] and img.shape[1] respectively, and not img.shape[0]-1 or img.shape[1]-1. Therefore, you are missing the last row and last column of the image, and these don't get touched by logarithmic operation. The maximum that is reported is from one of these pixels in the last row or column that you didn't touch.
Once you correct this, you don't get those bad intensities anymore:
for i in range(0,img2.shape[0]): # Change
    for j in range(0,img2.shape[1]): # Change
        if (math.log(1+img2[i,j],2)) < 0:
            img2[i,j] = 0
        else:
            img2[i,j] = np.int(math.log(1+img2[i,j],2))
Doing that now gives us:
('Maximum intensity = ', 7)
('minimum intensity = ', 0)
However, what you're going to get now is a very dark image. The histogram that you have shown us illustrates that all of the image pixels are in the dark range... roughly between [0-7]. Because of that, the majority of your image is going to be dark if you use uint8 as the data type for visualization. Take note that I searched for the Lionel Messi image that's part of the OpenCV tutorials, and this is the image I found:
Source: https://opencv-python-tutroals.readthedocs.org/en/latest/_images/roi.jpg
Your code is converting this to grayscale, and that's fine for the purpose of your question. Now, using the above image, if you actually show what the histogram count looks like as well as what the intensities are per bin in the histogram, this is what we get for img2:
In [41]: np.unique(img2)
Out[41]: array([0, 1, 2, 3, 4, 5, 6, 7], dtype=uint8)
In [42]: np.bincount(img2.ravel())
Out[42]: array([ 86, 88, 394, 3159, 14841, 29765, 58012, 19655])
As you can see, the bulk of the image pixels are hovering in the [0-7] range, which is why everything looks black. If you want to see this better, scale the image by roughly 255 / 7 ≈ 36 so we can see it better:
img2 = 36*img2
cv2.imshow('LSP',img2)
cv2.waitKey(0)
We get this image:
I also get this histogram:
That personally looks very ugly... at least to me. As such, I would recommend that you choose a more meaningful image transformation if you want to stretch the histogram. In fact, the log operation compresses the dynamic range of the histogram. If you want to stretch the histogram, go the opposite way and try a power-law operation. Specifically, given an input intensity in, the output is defined as:
out = c*in^(p)
in is the input intensity, p is a power, and c is a constant that scales the image so that the maximum intensity of the input gets mapped back to the same maximum intensity when you're finished, and not to anything larger. It can be computed as:
c = (img2.max()) / (img2.max()**p)
... where p is the power you want. In addition, the transformation via power-law can be explained with this nice diagram:
Source: http://www.nptel.ac.in/courses/117104069/chapter_8/8_14.html
Basically, powers that are less than 1 perform an intensity expansion where darker intensities get pushed towards the lighter side. Similarly, powers that are greater than 1 perform an intensity compression where lighter intensities get pushed to the darker side. In your case, you want to expand the histogram, and so you want the first option. Specifically, try making the intensities that are smaller go towards the larger range. This can be done by choosing a power that's smaller than 1... try 0.5 for example.
You'd modify your code so that it is like this:
img2 = img2.astype(np.float) # Cast to float
c = (img2.max()) / (img2.max()**(0.5))
for i in range(0, img2.shape[0]):  # Note: full range, not shape[0]-1
    for j in range(0, img2.shape[1]):
        img2[i,j] = np.int(c*img2[i,j]**(0.5))

# Cast back to uint8 for display
img2 = img2.astype(np.uint8)
Doing that, I get this image:
I also get this histogram:
Minor Note
If I can suggest something in terms of efficiency, I wouldn't recommend that you loop through the entire image and set each pixel individually... that's not how numpy arrays are supposed to be used. You can achieve what you want, vectorized, in a single line of code.
For your old code, use np.log2 rather than math.log with base 2, since np.log2 operates on whole numpy arrays:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
img2 = np.log2(1 + img.astype(np.float)).astype(np.uint8)
# Back to your code
img2 = 36*img2 # Edit from before
cv2.imshow('LSP',img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.set_window_title('LSP histogram')
plt.hist(img2.ravel(),256,[0,256]); plt.show()
img3 = img2
B = np.int(img3.max())
A = np.int(img3.min())
print ("Maximum intensity = ", B)
print ("minimum intensity = ", A)
cv2.destroyAllWindows() # Don't forget this
Similarly, if you want to apply a power-law transformation, it's very simply:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
c = (img.max()) / (img.max()**(0.5))
img2 = (c*img.astype(np.float)**(0.5)).astype(np.uint8)
#... rest of code as before
I am attempting to write a program that will automatically locate a protein in an image, this will ultimately be used to differentiate between two proteins of different heights that are present.
The white area on top of the background is a membrane in which the proteins sit, and the white blobs are the proteins. The proteins have two lobes, hence they appear in pairs (the pair is actually one protein).
I have been writing a script in Fiji (Jython) to try to locate the proteins so we can work out their height from the local background. So far this involves applying adaptive histogram equalisation and then subtracting the background with a rolling ball of radius 10 pixels. After that I have been applying a kernel of sorts, 10 pixels by 10 pixels, which works out the average of the 5 centre pixels and divides it by the average of the pixels on the 4 edges of the kernel to get a ratio. If the ratio is above a certain value, then it is a candidate. A rough sketch of this ratio test follows below.
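As a rough, hedged sketch of that ratio test (my reconstruction in Python/NumPy rather than the original Jython, with assumed details such as which pixels count as "centre"):
import numpy as np

def ratio_candidates(img, size=10, ratio_threshold=2.0):
    # Slide a size x size window; flag positions where centre mean / edge mean is high
    rows, cols = img.shape
    half = size // 2
    candidates = np.zeros_like(img, dtype=bool)
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = img[r - half:r + half, c - half:c + half]
            # The 5 centre pixels, taken as a plus shape (an assumption)
            centre = np.mean([window[half, half],
                              window[half - 1, half], window[half + 1, half],
                              window[half, half - 1], window[half, half + 1]])
            # Pixels on the 4 edges of the window
            edges = np.concatenate([window[0, :], window[-1, :],
                                    window[:, 0], window[:, -1]]).mean()
            if edges > 0 and centre / edges > ratio_threshold:
                candidates[r, c] = True
    return candidates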
The output I got was this image, which, apart from some wrapping and sensitivity (ratio = 2.0) issues, seems to be OK. My questions are:
Is this a reasonable approach or is there an obviously better way of doing this?
Can you suggest a way on from here? I am a little stuck now and not really sure how to proceed.
Code, if necessary: http://pastebin.com/D45LNJCu
Thanks!
Sam
How about starting off a bit more simply, using a Harris-point-style approach and detecting local maxima? E.g.:
import numpy as np
from PIL import Image
from scipy import ndimage
import matplotlib.pyplot as plt

roi = 2.5
peak_threshold = 120

im = Image.open('Q766c.png').convert('L')  # load and convert to grayscale
image = np.array(im, dtype=float)

size = int(2 * roi) + 1
image_max = ndimage.maximum_filter(image, size=size, mode='constant')
mask = (image == image_max)
image *= mask

# Remove the image borders
image[:size] = 0
image[-size:] = 0
image[:, :size] = 0
image[:, -size:] = 0

# Find peaks
image_t = (image > peak_threshold) * 1

# Get coordinates of peaks
f = np.transpose(image_t.nonzero())

# Show
plt.imshow(np.asarray(im), cmap='gray')
plt.plot(f[:, 1], f[:, 0], 'o', markeredgewidth=0.45,
         markeredgecolor='b', markerfacecolor='None')
plt.axis('off')
plt.savefig('local_max.png', format='png', bbox_inches='tight')
plt.show()
Which gives this:
ImageJ "Find maxima" does also similar.
Here is the Jython code
from ij import ImagePlus, IJ, Prefs
from ij.plugin import RGBStackMerge
from ij.process import ImageProcessor, ImageConverter
from ij.plugin.filter import Binary, MaximumFinder
from jarray import array
# define the background as black (0)
Prefs.blackBackground = True
# find maxima
#imp = IJ.getImage()
imp = ImagePlus('http://i.stack.imgur.com/Q766c.png')
ImageConverter(imp).convertToGray8()
ip = imp.getProcessor()
segip = MaximumFinder().findMaxima( ip, 10, 200, MaximumFinder.SINGLE_POINTS , False, False)
# display detection result
binner = Binary()
binner.setup("dilate", None)
binner.run(segip)
segimp = ImagePlus("seg", segip)
mergeimp = RGBStackMerge.mergeChannels(array([segimp, imp, None, None, None, None, None], ImagePlus), True)
mergeimp.show()
EDIT: Updated the code to allow processing a PNG (RGB) image and loading the image directly from this thread. See comments for more details.