Pillow - How to binarize an image with threshold? - python

I would like to binarize a png image.
I would like to use Pillow if possible.
I've seen two methods used:
image_file = Image.open("convert_image.png") # open colour image
image_file = image_file.convert('1') # convert image to black and white
This method appears to handle a region filled with a light colour by dithering the image. I don't want this behaviour. If there is, for example, a light yellow circle, I want that to become a black circle.
More generally, if a pixel's RGB is (x,y,z) then I want the pixel to become black if x<=t OR y<=t OR z<=t for some threshold 0<t<255.
I can convert the image to greyscale or RGB and then manually apply a threshold test, but this seems inefficient.
The second method I've seen is this:
threshold = 100
im = im2.point(lambda p: p > threshold and 255)
I don't know how this works, though, or what the threshold does here, or what "and 255" does.
I am looking for either an explanation of how to apply method 2 or an alternative method using Pillow.

I think you need to convert to grayscale, apply the threshold, then convert to monochrome.
image_file = Image.open("convert_iamge.png")
# Grayscale
image_file = image_file.convert('L')
# Threshold
image_file = image_file.point( lambda p: 255 if p > threshold else 0 )
# To mono
image_file = image_file.convert('1')
The expression "p > threshhold and 255" is a Python trick. The definition of "a and b" is "a if a is false, otherwise b". So that will produce either "False" or "255" for each pixel, and the "False" will be evaluated as 0. My if/else does the same thing in what might be a more readable way.


How to remove the empty images using python [duplicate]

Using the Python Imaging Library PIL, how can someone detect if an image has all its pixels black or white?
Update
Condition: do not iterate through each pixel!
if not img.getbbox():
... will test to see whether an image is completely black. (Image.getbbox() returns the falsy None if there are no non-black pixels in the image, otherwise it returns a tuple of points, which is truthy.) To test whether an image is completely white, invert it first:
if not ImageChops.invert(img).getbbox():
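Wrapped up as helper functions, the two tests might look like this (a minimal sketch; it assumes img is a PIL Image in a mode that ImageChops.invert supports, such as "L" or "RGB"):
from PIL import ImageChops

def is_all_black(img):
    # getbbox() returns None when there are no non-black pixels
    return img.getbbox() is None

def is_all_white(img):
    # invert first, then the same test applies
    return ImageChops.invert(img).getbbox() is None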
You can also use img.getextrema(). This will tell you the highest and lowest values within the image. To work with this most easily you should probably convert the image to grayscale mode first (otherwise the extrema might be an RGB or RGBA tuple, or a single grayscale value, or an index, and you have to deal with all those).
extrema = img.convert("L").getextrema()
if extrema == (0, 0):
# all black
elif extrema == (255, 255):
# all white
The latter method will likely be faster, but not so you'd notice in most applications (both will be quite fast).
A one-line version of the above technique that tests for either black or white:
if sum(img.convert("L").getextrema()) in (0, 510):
# either all black or all white
Expanding on Kindall:
If you look at an image called img with:
extrema = img.convert("L").getextrema()
it gives you the range of the values in the image. So an all-black image would be (0, 0) and an all-white image (255, 255). So you can look at:
if extrema[0] == extrema[1]:
    return "This image is one solid color, so I won't use it"
else:
    # do something with the image img
    pass
Useful to me when I was creating a thumbnail from some data and wanted to make sure it was reading correctly.
from PIL import Image
img = Image.open("test.png")
clrs = img.getcolors()
clrs contains [(num_of_occurrences, color), ...]
By checking for len(clrs) == 1 you can verify that the image contains only one color, and by looking at the second element of the first tuple in clrs you can infer that color. (Note that getcolors() returns None if the image contains more than 256 colors, unless you pass a larger maxcolors.)
If the image contains multiple colors, then by taking the number of occurrences into account you can also handle almost-completely-single-colored images, where, say, 99% of the pixels share the same color.
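A minimal sketch of that idea (the file name and the 99% cutoff are arbitrary choices):
from PIL import Image

img = Image.open("test.png")
# maxcolors must be at least the number of distinct colors,
# otherwise getcolors() returns None
clrs = img.getcolors(maxcolors=img.width * img.height)
dominant_count, dominant_color = max(clrs)
if dominant_count / float(img.width * img.height) >= 0.99:
    print("almost single-colored:", dominant_color)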
I tried the Kindall solution ImageChops.invert(img).getbbox() without success; my test images failed.
I had noticed a problem: white should be 255, BUT I have found white images whose numeric extrema are (0, 0). Why? See the update below.
I have changed Kindall's second solution (getextrema), which works correctly, so that it doesn't need an image conversion. I wrote a function and verified that it works with both grayscale and RGB images:
def is_monochromatic_image(img):
    extr = img.getextrema()
    a = 0
    for i in extr:
        if isinstance(i, tuple):
            a += abs(i[0] - i[1])
        else:
            a = abs(extr[0] - extr[1])
            break
    return a == 0
The img argument is a PIL Image object.
You can also check, with small modifications, whether images are black or white, but you have to decide whether "white" is 0 or 255; perhaps you have the definitive answer, I have not. :-)
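For instance, a minimal sketch of such a modification, assuming "white" means 255 in every band:
def is_white_image(img):
    extr = img.getextrema()
    if isinstance(extr[0], tuple):  # multi-band image, e.g. RGB
        return all(band == (255, 255) for band in extr)
    return extr == (255, 255)  # single-band (grayscale) image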
Hope this is useful.
UPDATE: I suppose that the white images with zeros inside may be PNGs or some other image format with transparency.

The mask I am creating is clipping the image I am trying to paste over it

I am trying to paste an image (noise) on top of a background image (back_eq).
The problem is that when applying the mask (mask = np.uint8(alpha/255)) the mask gets clipped.
This is the original shape I am trying to paste; the white shape should end up on the background (but in black).
Instead, the result comes out clipped.
The problem goes away when, instead of normalizing with 255, we use a smaller value such as 245 or 240 (mask = np.uint8(alpha/240)).
The problem is that 255 is the correct normalization. Any suggestions on how to fix the mask, with a correct normalization being mandatory?
import numpy as np
import cv2
import matplotlib.pyplot as plt
noise = cv2.imread("3_noisy.jpg")
noise = cv2.resize(noise,(300,300), interpolation = cv2.INTER_LINEAR)
alpha = cv2.imread("3_alpha.jpg")
alpha = cv2.resize(alpha,(300,300), interpolation = cv2.INTER_LINEAR)
back_eq = cv2.imread('Results/back_eq.jpg')
back_eq_crop = cv2.imread('Results/back_eq_crop.jpg')
im_3_tone = cv2.imread('Results/im_3_tone.jpg')
final = back_eq.copy()
back_eq_h, back_eq_w, _ = back_eq.shape
noisy_h, noisy_w,_ = noise.shape
l1 = back_eq_h//2 - noisy_h//2
l2 = back_eq_h//2 + noisy_h//2
l3 = back_eq_w//2 - noisy_w//2
l4 = back_eq_w//2 + noisy_w//2
print(alpha.shape)
# normalizing the values
mask = np.uint8(alpha/255)
# masking back_eq_crop
masked_back_eq_crop = cv2.multiply(back_eq_crop,(1-mask))
cv2.imshow('as',masked_back_eq_crop)
cv2.waitKey(0)
cv2.destroyAllWindows()
# creating the masked region
mask_to_add = cv2.multiply(im_3_tone, mask)
cv2.imshow('as',mask_to_add)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Combining
masked_image = cv2.add(masked_back_eq_crop, mask_to_add)
final[l1:l2, l3:l4] = masked_image
cv2.imshow('aa',masked_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
plt.figure()
plt.imshow(final[:, :, ::-1]);plt.axis("off");plt.title("Final Image")
plt.show()
retval=cv2.imwrite("Results/Final Image.jpg", final)
To use a binary mask threshold of 255, you have to have a properly prepared image - preferably one that is already binary. 255 means only pure white (#FFFFFF) will stay white; even the lightest gray will become black.
And in your case, well... the image has antialiasing (the edges are softened), and you're doing scaling in the code. Moreover, your white is not pure white. There's a hole in the result.
To show it, instead of just talking:
I loaded your mask in GIMP, got the 'select by colour' tool, disabled antialiasing and turned the threshold to 0 - everything so that only pure white (#FFFFFF) gets selected, the same as your code.
And we see the holes. The tail is already pixelated, same with the hair... the hole in the face is there... The hole's colour is #FEFEFE (254), making it black with a threshold of 255.
The best threshold for such (pseudo) "black-and-white" images is actually near the middle (128), because antialiasing makes the edges blackish-gray or whitish-gray - there are no middle grays, so a middle gray separates the two groups nicely. Your "visually white but not pure white" pixels (and the similar blacks) fall into those groups as well. Even if you believe you have only pure black and pure white in your image, if you load it as colour or grayscale you get 0 and 255 values anyway, so 128 will work. (I don't have access to my old code right now, but I believe I kept my thresholds around 200 when I played with images.)
tl;dr:
A threshold of 255 makes only #FFFFFF count as white; it's never good.
Your picture has a lot of "visually white but not #FFFFFF white" pixels.
There's nothing wrong with using a lower threshold, even around the middle of the range, for pseudo black-and-white images.
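A minimal sketch of that fix, reusing the alpha image from the question (the 127 cutoff is the mid-range suggestion above, not the asker's original value):
import cv2
import numpy as np

alpha = cv2.imread("3_alpha.jpg")
# anything at least mid-gray counts as white; output pixels become 0 or 255
_, mask255 = cv2.threshold(alpha, 127, 255, cv2.THRESH_BINARY)
mask = np.uint8(mask255 / 255)  # a clean 0/1 mask with no holes from near-white pixels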

Proper image thresholding to prepare it for OCR in python using opencv

I am really new to opencv and a beginner to python.
I have this image:
I want to somehow apply proper thresholding to keep nothing but the 6 digits.
The bigger picture is that I intend to try to perform manual OCR on the image for each digit separately, using the k-nearest neighbours algorithm on a per-digit level (kNearest.findNearest).
The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.
The steps I have tried so far are the following:
I am reading the image from disk
# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)
Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting the image to a single-channel one:
image = image[:,:,0]
# opened with -1, which means "as is",
# so the blue channel is the first in BGR
Then I'm multiplying it a bit to increase contrast between the digits and the background:
image = cv2.multiply(image, 1.5)
Finally I perform Binary+Otsu thresholding:
_, threshed1 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
As you can see the end result is pretty good except for the digit '7' which has kept a lot of noise.
How can I improve the end result? Please supply an example result image where possible; it is easier to understand than code snippets alone.
You can try to medianBlur the gray image with different kernel sizes (such as 3 and 51), divide the blurred results, and threshold that. Something like this:
#!/usr/bin/python3
# 2018/09/23 17:29 (CST)
# (中秋节快乐)
# (Happy Mid-Autumn Festival)
import cv2
import numpy as np
fname = "color.png"
bgray = cv2.imread(fname)[...,0]
blured1 = cv2.medianBlur(bgray,3)
blured2 = cv2.medianBlur(bgray,51)
divided = np.ma.divide(blured1, blured2).data
normed = np.uint8(255*divided/divided.max())
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)
dst = np.vstack((bgray, blured1, blured2, normed, threshed))
cv2.imwrite("dst.png", dst)
The result:
Why not just keep values in the image that are above a certain threshold?
Like this:
import cv2
import numpy as np
img = cv2.imread("./a.png")[:,:,0] # the last readable image
new_img = []
for line in img:
    new_img.append(np.array(list(map(lambda x: 0 if x < 100 else 255, line))))
new_img = np.array(list(map(lambda x: np.array(x), new_img)))
cv2.imwrite("./b.png", new_img)
Looks great:
You could probably play with the threshold even more and get better results.
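The same per-pixel rule can also be written as a single vectorized NumPy expression, which avoids the Python-level loops entirely (a sketch using the same cutoff of 100):
import cv2
import numpy as np

img = cv2.imread("./a.png")[:, :, 0]
new_img = np.where(img < 100, 0, 255).astype(np.uint8)  # 0 below the cutoff, 255 otherwise
cv2.imwrite("./b.png", new_img)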
It doesn't seem easy to completely remove the annoying stamp.
What you can do is flatten the background intensity by
computing a lowpass image (Gaussian filter, morphological closing); the filter size should be a little larger than the character size;
and dividing the original image by the lowpass image.
Then you can use Otsu.
As you see, the result isn't perfect.
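A minimal sketch of this flattening approach (the file name and filter size are assumptions):
import cv2

gray = cv2.imread("digits.png", cv2.IMREAD_GRAYSCALE)
# lowpass image: a blur with a kernel a little larger than the characters
lowpass = cv2.GaussianBlur(gray, (51, 51), 0)
# divide the original by the lowpass image to flatten the background
flattened = cv2.divide(gray, lowpass, scale=255)
_, binary = cv2.threshold(flattened, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("flattened_otsu.png", binary)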
I tried a slightly different approach than Yves on the blue channel:
Apply a median filter (r=2).
Use edge detection (e.g. the Sobel operator).
Automatic thresholding (Otsu).
Closing of the image.
This approach seems to make the output a little less noisy. However, one has to address the holes in the numbers. This can be done by detecting black contours which are completely surrounded by white pixels and simply filling them with white.
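A rough sketch of that pipeline (the kernel sizes are guesses, since the original post only shows the intermediate images):
import cv2

blue = cv2.imread("digits.png")[:, :, 0]
med = cv2.medianBlur(blue, 5)  # median filter, radius ~2
# Sobel edge magnitude
sx = cv2.Sobel(med, cv2.CV_32F, 1, 0)
sy = cv2.Sobel(med, cv2.CV_32F, 0, 1)
edges = cv2.convertScaleAbs(cv2.magnitude(sx, sy))
# automatic thresholding (Otsu), then a closing to seal small gaps
_, binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
cv2.imwrite("digits_binary.png", closed)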

Remove features from binarized image

I wrote a little script to transform pictures of chalkboards into a form that I can print off and mark up.
I take an image like this:
Auto-crop it, and binarize it. Here's the output of the script:
I would like to remove the largest connected black regions from the image. Is there a simple way to do this?
I was thinking of eroding the image to eliminate the text and then subtracting the eroded image from the original binarized image, but I can't help thinking that there's a more appropriate method.
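For concreteness, a sketch of that idea, using a morphological opening so the large regions regain their original size before the subtraction (the file name and kernel size are guesses):
import cv2

binary = cv2.imread("board_binarized.png", cv2.IMREAD_GRAYSCALE)
inv = 255 - binary  # treat the black regions as white foreground
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
# opening removes thin text strokes but keeps (and restores) the large blobs
big_blobs = cv2.morphologyEx(inv, cv2.MORPH_OPEN, kernel)
text_only = cv2.subtract(inv, big_blobs)  # drop the large blobs, keep the text
cv2.imwrite("text_only.png", 255 - text_only)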
Sure, you can just get the connected components (of a certain size) with findContours or floodFill and erase them, leaving some smear. However, if you'd like to do it right, you should think about why you have the black area in the first place.
You did not use adaptive (locally adaptive) thresholding, and this made your output sensitive to shading. Try not to get the black region in the first place by running something like this:
Mat img = imread("desk.jpg", 0);
Mat img2, dst;
pyrDown(img, img2);
adaptiveThreshold(255 - img2, dst, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, 9, 10);
imwrite("adaptiveT.png", dst);
imshow("dst", dst);
waitKey(-1);
In the future, you may want to read about adaptive thresholds and how to sample colors locally. I personally found it useful to sample binary colors orthogonally to the image gradient (that is, on both sides of it). This way the samples of white and black are of equal size, which is a big deal, since typically there is more background color, which biases the estimation. Using SWT and MSER may give you even more ideas about text segmentation.
I tried this:
import numpy as np
import cv2
im = cv2.imread('image.png')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
grayout = 255*np.ones((im.shape[0],im.shape[1],1), np.uint8)
blur = cv2.GaussianBlur(gray,(5,5),1)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
wcnt = 0
for item in contours:
    area = cv2.contourArea(item)
    print(wcnt, area)
    [x, y, w, h] = cv2.boundingRect(item)
    if area > 10 and area < 200:
        roi = gray[y:y+h, x:x+w]
        cntd = 0
        for i in range(x, x+w):
            for j in range(y, y+h):
                if gray[j, i] == 0:
                    cntd = cntd + 1
        density = cntd / float(h*w)
        if density < 0.5:
            for i in range(x, x+w):
                for j in range(y, y+h):
                    grayout[j, i] = gray[j, i]
        wcnt = wcnt + 1
cv2.imwrite('result.png',grayout)
You have to balance two things: removing the black spots while not losing the contents of what is on the board. The output I got is this:
Here is a Python numpy implementation (using my own mahotas package) of the method from the top answer (almost the same, I think):
import mahotas as mh
import numpy as np
Imported mahotas & numpy with standard abbreviations
im = mh.imread('7Esco.jpg', as_grey=1)
Load the image & convert to gray
im2 = im[::2,::2]
im2 = mh.gaussian_filter(im2, 1.4)
Downsample and blur (for speed and noise removal).
im2 = 255 - im2
Invert the image
mean_filtered = mh.convolve(im2.astype(float), np.ones((9,9))/81.)
Mean filtering is implemented "by hand" with a convolution.
imc = im2 > mean_filtered - 4
You might need to adjust the number 4 here, but it worked well for this image.
mh.imsave('binarized.png', (imc*255).astype(np.uint8))
Convert to 8 bits and save in PNG format.

Python (creating a negative of this black and white image)

I am trying to create a negative of this black and white image. The opposite of white (255) is black (0) and vice versa. The opposite of a pixel with a value of 100 is 155.
I cannot use convert, invert, point, eval, lambda.
Here is my code, but it doesn't work yet. Could you please let me know which part I got wrong?
def bw_negative(filename):
    """
    This function creates a black and white negative of a bitmap image
    using the following parameters:
    filename is the name of the bitmap image
    """
    # Create the handle and then create a list of pixels.
    image = Image.open(filename)
    pixels = list(image.getdata())
    pixel[255] = 0
    pixel[0] = 255
    for i in range(255,0):
        for j in range(0,255):
            pixel[i] = j
            print pixels[i]
    image.putdata(pixels)
    image.save('new.bmp')
Python is an interpreted language, which has the advantage that you can use an interactive interpreter session to try things out. Try opening the image file in an interactive session and looking at the list you get from list(image.getdata()). Once you understand what that list contains, you can think about a way to invert the image.
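For example (the file name is hypothetical, and the values you see depend on your image):
>>> from PIL import Image
>>> image = Image.open("picture.bmp")
>>> pixels = list(image.getdata())
>>> pixels[:10]  # for a grayscale image: a flat list of ints in the range 0..255
Once you see what the values look like, note that the negative of a value v is simply 255 - v, which requires none of the forbidden functions.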
