Cropping an image of a handwritten digit (Python)

I'm trying to predict handwritten digits in Python, using MNIST as the dataset. Right now I have to give already-cropped images as input to the program.
Further processing into the MNIST format is done with the following function, but how can I auto-crop an arbitrary image given as input?
from PIL import Image, ImageFilter

def imageprepare(argv):
    """
    This function returns the pixel values.
    The input is a png file location.
    """
    im = Image.open(argv).convert('L')
    width = float(im.size[0])
    height = float(im.size[1])
    newImage = Image.new('L', (28, 28), (255))  # creates white canvas of 28x28 pixels
    if width > height:  # check which dimension is bigger
        # Width is bigger. Width becomes 20 pixels.
        nheight = int(round((20.0 / width * height), 0))  # resize height according to width ratio
        if nheight == 0:  # rare case, but the minimum is 1 pixel
            nheight = 1
        # resize and sharpen
        img = im.resize((20, nheight), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
        wtop = int(round(((28 - nheight) / 2), 0))  # calculate vertical position
        newImage.paste(img, (4, wtop))  # paste resized image on white canvas
    else:
        # Height is bigger. Height becomes 20 pixels.
        nwidth = int(round((20.0 / height * width), 0))  # resize width according to height ratio
        if nwidth == 0:  # rare case, but the minimum is 1 pixel
            nwidth = 1
        # resize and sharpen
        img = im.resize((nwidth, 20), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
        wleft = int(round(((28 - nwidth) / 2), 0))  # calculate horizontal position
        newImage.paste(img, (wleft, 4))  # paste resized image on white canvas
    # newImage.save("sample.png")
    tv = list(newImage.getdata())  # get pixel values
    # normalize pixels to 0 and 1. 0 is pure white, 1 is pure black.
    tva = [(255 - x) * 1.0 / 255.0 for x in tv]
    return tva

You can use OpenCV contours to locate candidate digits within your actual image; the exact technique will depend on the data you are working from. There is an example of digit-candidate location at http://www.pyimagesearch.com/2017/02/13/recognizing-digits-with-opencv-and-python/ that can give you some pointers.
However, you may run into problems with some scripts: this approach assumes that every digit is a contiguous, distinct blob. That holds for European-style digits, but I am not sure both assumptions apply to all scripts.

Related

Slice image in dynamic number of squares (grid) and save corner coordinates of those squares in list

I'm reading in an image with the Pillow library in Python. I want to "slice" it into squares and save the corner coordinates of each square in a list. For example, in the image below I would like to save the corner coordinates of square 15. (The top-left corner is 0,0.)
The first thing I do after reading in the image is compute the height and width in pixels modulo the number of slices, and crop the image so that the number of pixels per square is a whole number and the same for every square.
from PIL import Image, ImageDraw, ImageFont
fileName = 'eyeImg_86.png'
img = Image.open(fileName)
vertical_slices = 8
horizontal_slices = 4
height = img.height
width = img.width
new_height = height - (height % horizontal_slices)
new_width = width - (width % vertical_slices)
img = img.crop((0, 0, new_width, new_height))
Then I calculate the size in pixels of each vertical and horizontal step.
horizontal_step = int(new_width / vertical_slices)
vertical_step = int(new_height / horizontal_slices)
And then I loop over the ranges between 0 to the total number of vertical and horizontal slices and append to a nested list (each inner list is a row)
points = []
for i in range(horizontal_slices+1):
    row = []
    for j in range(vertical_slices+1):
        row.append((horizontal_step*j, vertical_step*i))
    points.append(row)
Here's where I'm struggling, both with drawing the squares and with calculating what I need inside them. If I loop over all those points and draw them on the image, I don't get the grid I expect.
with Image.open(fileName) as im:
    im = im.convert(mode='RGB')
    draw = ImageDraw.Draw(im)
    for i in range(horizontal_slices+1):
        if i < horizontal_slices:
            for j in range(vertical_slices+1):
                if j < vertical_slices:
                    draw.line([points[i][j], points[i+1][j-1]], fill=9999999)
Is there an easy way that I can dynamically give it the rows and columns and save each of the square coordinates to a list of tuples for example?
I'd like to both be able to draw them on top of the original image, and also calculate the number of black pixels inside each of the squares.
EDIT: To add some clarification, since the number of rows and columns of the grid is arbitrary, it will likely not be made of squares but rectangles. Furthermore, the numbering of these rectangles should be done row-wise from left to right, like reading.
Thank you
There were (from my understanding) some inconsistencies in your use of "horizontal"/"vertical". I also removed the points list, since you can easily convert a rectangle's number to its upper-left corner coordinates (see the function at the end), and I draw the grid directly by drawing all the horizontal and all the vertical lines.
from PIL import Image, ImageDraw, ImageFont
fileName = 'test.png'
img = Image.open(fileName)
vertical_slices = 8
horizontal_slices = 6
height = img.height
width = img.width
new_height = height - (height % vertical_slices)
new_width = width - (width % horizontal_slices)
img = img.crop((0, 0, new_width, new_height))
horizontal_step = int(new_width / horizontal_slices)
vertical_step = int(new_height / vertical_slices)
# drawing the grid
img = img.convert(mode='RGB')
pix = img.load()
draw = ImageDraw.Draw(img)
for i in range(horizontal_slices+1):
    draw.line([(i*horizontal_step, 0), (i*horizontal_step, new_height)], fill=9999999)
for j in range(vertical_slices+1):
    draw.line([(0, j*vertical_step), (new_width, j*vertical_step)], fill=9999999)
# with rectangles being numbered from 1 (upper left) to v_slices*h_slices (lower right) in reading order
def num_to_ul_corner_coords(num):
    i = (num-1) % horizontal_slices
    j = (num-1) // horizontal_slices
    return (i*horizontal_step, j*vertical_step)
This should do what you want, provided your picture is pure black and white:
def count_black_pixels(num):
    cnt = 0
    x, y = num_to_ul_corner_coords(num)
    for i in range(horizontal_step):
        for j in range(vertical_step):
            if pix[x+i, y+j] == (0, 0, 0):
                cnt += 1
    perc = round(cnt/(horizontal_step*vertical_step)*100, 2)
    return cnt, perc

Cropping an image by discarding boundary pixels such that it matches 3:4 ratio

I am working on an image-enhancement use case where one of the tasks is to rescale an image to a 3:4 ratio. Rather than blindly resizing based on the original height and width, I want the image cropped; in other words, I want to discard boundary pixels so that it matches the ratio without cutting into the primary object.
I have the segmentation mask, from which I am getting the bounding box. I am also removing the background (making it transparent) for other purposes. I am sharing both the binary mask and the original image.
I am using the code below to generate the box.
import cv2
import numpy as np
THRESHOLD = 0.9
mask = cv2.imread("mask.png")
mask = mask/255
mask[mask > THRESHOLD] = 1
mask[mask <= THRESHOLD] = 0
out_layer = mask[:,:,2]
x_starts = [np.where(out_layer[i]==1)[0][0] if len(np.where(out_layer[i]==1)[0])!=0 else out_layer.shape[0]+1 for i in range(out_layer.shape[0])]
x_ends = [np.where(out_layer[i]==1)[0][-1] if len(np.where(out_layer[i]==1)[0])!=0 else 0 for i in range(out_layer.shape[0])]
y_starts = [np.where(out_layer.T[i]==1)[0][0] if len(np.where(out_layer.T[i]==1)[0])!=0 else out_layer.T.shape[0]+1 for i in range(out_layer.T.shape[0])]
y_ends = [np.where(out_layer.T[i]==1)[0][-1] if len(np.where(out_layer.T[i]==1)[0])!=0 else 0 for i in range(out_layer.T.shape[0])]
startx = min(x_starts)
endx = max(x_ends)
starty = min(y_starts)
endy = max(y_ends)
start = (startx,starty)
end = (endx,endy)
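As an aside, the per-row and per-column scans above can be collapsed with NumPy's nonzero; a sketch on a synthetic mask (the array size and object position here are made up):

```python
import numpy as np

# Synthetic single-channel mask standing in for out_layer
mask = np.zeros((100, 200), dtype=np.uint8)
mask[10:40, 60:90] = 1  # the "object"

ys, xs = np.nonzero(mask)          # row and column indices of all 1-pixels
startx, endx = xs.min(), xs.max()  # bounding box, inclusive
starty, endy = ys.min(), ys.max()
```

This gives the same inclusive bounding box in a few lines, provided the mask contains at least one nonzero pixel.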
If I understood your problem correctly, you just want the mask of the person placed in an image with a 3:4 aspect ratio, without cropping the mask. The approach you describe is possible but a bit unnecessary. Below is an approach you can use, with explanation; I have also used a different method to find the box. Use whichever approach you like.
import cv2
import numpy as np
MaskImg = cv2.imread("WomanMask.png", cv2.IMREAD_GRAYSCALE)
cv2.imwrite("RuntimeImages/Input MaskImg.png", MaskImg)
ret, MaskImg = cv2.threshold(MaskImg, 20, 255, cv2.THRESH_BINARY)
cv2.imwrite("RuntimeImages/MaskImg after threshold.png", MaskImg)
# Finding biggest contour in the image
# (Assuming that the woman mask will cover the biggest area of the mask image)
# Getting all external contours
Contours = cv2.findContours(MaskImg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]
# exit if no white pixel in the image (no contour found)
if len(Contours) == 0:
    print("There was no white pixel in the image.")
    exit()
# Sorting contours in decreasing order according to their area
Contours = sorted(Contours, key=lambda x:cv2.contourArea(x), reverse=True)
# Getting the biggest contour
BiggestContour = Contours[0] # This is the contour of the girl mask
# Finding the bounding rectangle
BB = cv2.boundingRect(BiggestContour)
print(f"Bounding rectangle : {BB}")
# Getting the position, width, and height of the woman mask
x, y = BB[0], BB[1]
Width, Height = BB[2], BB[3]
# Setting the (height / width) ratio required
Ratio = ( 3 / 4 ) # 3 : 4 :: Height : Width
# Getting the new dimensions of the image to fit the mask
if (Height / Width) > Ratio:  # mask is taller than the 3:4 target shape
    NewHeight = Height
    NewWidth = int(NewHeight / Ratio)
else:
    NewWidth = Width
    NewHeight = int(NewWidth * Ratio)
# Getting the position of the woman mask in this new image
# It will be placed at the center
X = int((NewWidth - Width) / 2)
Y = int((NewHeight - Height) / 2)
# Creating the new image with the woman mask at the center
NewImage = np.zeros((NewHeight, NewWidth), dtype=np.uint8)
NewImage[Y : Y+Height, X : X+Width] = MaskImg[y : y+Height, x : x+Width]
cv2.imwrite("RuntimeImages/Final Image.png", NewImage)
Below is the final output mask image

Image translation using numpy

I want to perform image translation by a certain amount (shift the image vertically and horizontally).
The problem is that when I paste the cropped image back on the canvas, I just get back a white blank box.
Can anyone spot the issue here?
Many thanks
img_shape = image.shape
# translate image
# percentage of the dimension of the image to translate
translate_factor_x = random.uniform(*translate)
translate_factor_y = random.uniform(*translate)
# initialize a black image the same size as the image
canvas = np.zeros(img_shape)
# get the top-left corner coordinates of the shifted image
corner_x = int(translate_factor_x*img_shape[1])
corner_y = int(translate_factor_y*img_shape[0])
# determine which part of the image will be pasted
mask = image[max(-corner_y, 0):min(img_shape[0], -corner_y + img_shape[0]),
             max(-corner_x, 0):min(img_shape[1], -corner_x + img_shape[1]),
             :]
# determine which part of the canvas the image will be pasted on
target_coords = [max(0, corner_y),
                 max(corner_x, 0),
                 min(img_shape[0], corner_y + img_shape[0]),
                 min(img_shape[1], corner_x + img_shape[1])]
# paste image on selected part of the canvas
canvas[target_coords[0]:target_coords[2], target_coords[1]:target_coords[3],:] = mask
transformed_img = canvas
plt.imshow(transformed_img)
This is what I get:
For image translation, you can make use of the somewhat obscure numpy.roll function. In this example I'm going to use a white canvas so it is easier to visualize.
import cv2
import numpy as np

image = np.full_like(original_image, 255)  # white canvas, same shape as the source
height, width = image.shape[:-1]
shift = 100
# shift image (np.roll wraps pixels around the edges)
rolled = np.roll(image, shift, axis=[0, 1])
# black out the wrapped-around parts
rolled = cv2.rectangle(rolled, (0, 0), (width, shift), 0, -1)
rolled = cv2.rectangle(rolled, (0, 0), (shift, height), 0, -1)
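The same roll-then-blank idea can be illustrated on a tiny numeric array (using plain slicing instead of cv2.rectangle to zero the wrapped region):

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
shift = 1
rolled = np.roll(a, shift, axis=[0, 1])  # shift down and right, with wraparound
rolled[:shift, :] = 0  # blank the rows that wrapped in from the bottom
rolled[:, :shift] = 0  # blank the columns that wrapped in from the right
```

The top row and left column are zeroed out, and the remaining values are the original upper-left block shifted by one.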
If you want to flip the image so the black part is on the other side, you can use both np.fliplr and np.flipud.
Result:
Here is a simple solution that translates an image by a given number of pixels using only array indexing; it does not wrap around, and it handles negative shifts as well:
ty, tx = 5, 8 # translation in pixels: ty rows down, tx columns right
N, M = image.shape # N rows, M columns
image_translated = np.zeros_like(image)
image_translated[max(ty,0):N+min(ty,0), max(tx,0):M+min(tx,0)] = image[-min(ty,0):N-max(ty,0), -min(tx,0):M-max(tx,0)]
Example:
(Note that for simplicity it does not handle shifts larger than the image dimensions.)
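A small self-contained check of this indexing technique, wrapped in a hypothetical helper (dy and dx are counted in rows and columns respectively):

```python
import numpy as np

def translate(image, dy, dx):
    """Shift image dy rows down and dx columns right, filling with zeros."""
    N, M = image.shape
    out = np.zeros_like(image)
    out[max(dy, 0):N + min(dy, 0), max(dx, 0):M + min(dx, 0)] = \
        image[-min(dy, 0):N - max(dy, 0), -min(dx, 0):M - max(dx, 0)]
    return out

a = np.arange(12).reshape(3, 4)
b = translate(a, 1, -1)  # one row down, one column left
```

The shifted-out row and column disappear and zeros fill the vacated edge, with no wraparound.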

Importing images like MNIST

I have 100 images, 10 for each digit, and I am trying to convert them to MNIST-style images in Python. But I keep getting an error, posted below.
from PIL import Image, ImageFilter
from os import listdir
def imageprepare(argv):
    """
    This function returns the pixel values.
    The input is a png file location.
    """
    imagesList = listdir(argv)
    for image in imagesList:
        im = Image.open(argv).convert('L')
        width = float(im.size[0])
        height = float(im.size[1])
        newImage = Image.new('L', (28, 28), (255))  # creates white canvas of 28x28 pixels
        if width > height:  # check which dimension is bigger
            # Width is bigger. Width becomes 20 pixels.
            nheight = int(round((20.0 / width * height), 0))  # resize height according to width ratio
            if nheight == 0:  # rare case, but the minimum is 1 pixel
                nheight = 1
            # resize and sharpen
            img = im.resize((20, nheight), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
            wtop = int(round(((28 - nheight) / 2), 0))  # calculate vertical position
            newImage.paste(img, (4, wtop))  # paste resized image on white canvas
        else:
            # Height is bigger. Height becomes 20 pixels.
            nwidth = int(round((20.0 / height * width), 0))  # resize width according to height ratio
            if nwidth == 0:  # rare case, but the minimum is 1 pixel
                nwidth = 1
            # resize and sharpen
            img = im.resize((nwidth, 20), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
            wleft = int(round(((28 - nwidth) / 2), 0))  # calculate horizontal position
            newImage.paste(img, (wleft, 4))  # paste resized image on white canvas
        # newImage.save("sample.png")
        tv = list(newImage.getdata())  # get pixel values
        # normalize pixels to 0 and 1. 0 is pure white, 1 is pure black.
        tva = [(255 - x) * 1.0 / 255.0 for x in tv]
        print(tva)
    return tva
argv= 'images/'
x=imageprepare(argv)#file path here
print(len(x))# mnist IMAGES are 28x28=784 pixels
Error:
File "C:/Users/lenovo/.spyder-py3/Project1/test12.py", line 47, in <module>
  x=imageprepare(argv)#file path here
File "C:/Users/lenovo/.spyder-py3/Project1/test12.py", line 14, in imageprepare
  im = Image.open(argv).convert('L')
File "C:\Users\lenovo\Anaconda3\lib\site-packages\PIL\Image.py", line 2477, in open
  fp = builtins.open(filename, "rb")
PermissionError: [Errno 13] Permission denied: 'images/'
From the log above, it seems that you don't have permission on the folder images/, which was passed as an argument to imageprepare. Have you tried changing the access privileges of images? Or run this from a prompt as Administrator.
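One more thing worth checking, independent of folder permissions: in the posted loop, Image.open is called on argv (the folder path) rather than on each file inside it, and on Windows opening a directory for reading raises exactly this PermissionError. A sketch of building proper per-file paths (using a throwaway temp folder purely for illustration):

```python
import os
import tempfile

# Build a throwaway folder with a couple of dummy files to demonstrate
folder = tempfile.mkdtemp()
for name in ("0.png", "1.png"):
    open(os.path.join(folder, name), "wb").close()

# The fix: join the folder with each filename before opening
paths = sorted(os.path.join(folder, name) for name in os.listdir(folder))
# each entry in paths is now a file path suitable for Image.open(...)
```

In the posted code that would mean `im = Image.open(os.path.join(argv, image)).convert('L')` inside the loop.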

Python PIL/Image make 3x3 Grid from sequence Images

I'm trying to make a 3x3 grid from a sequence of images but can't seem to get it right. The images are in a folder and named 0 - 8 (9 images in total); the final output should be a single 3x3 grid image, laid out as follows
image0 image1 image2
image3 image4 image5
image6 image7 image8
I was trying to follow How do you merge images into a canvas using PIL/Pillow? but couldn't get it to work correctly.
There is no need to change anything in the images; just merge them into a 3x3 grid.
To make a grid of cols*img_width by rows*img_height pixels out of rows*cols images:
from PIL import Image

def image_grid(imgs, rows, cols):
    assert len(imgs) == rows*cols
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid
In your case, assuming imgs is a list of PIL images:
grid = image_grid(imgs, rows=3, cols=3)
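A quick self-contained check of image_grid using nine solid-color placeholder tiles (the function is restated so the snippet runs on its own; tile sizes and colors are made up):

```python
from PIL import Image

def image_grid(imgs, rows, cols):
    assert len(imgs) == rows * cols
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))
    for i, img in enumerate(imgs):
        # i % cols gives the column, i // cols the row (reading order)
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid

# Nine 10x10 tiles, each a different shade of gray
imgs = [Image.new('RGB', (10, 10), (i * 25, i * 25, i * 25)) for i in range(9)]
grid = image_grid(imgs, rows=3, cols=3)
```

Tile 0 ends up at the top left and tile 8 at the bottom right, matching the layout in the question.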
Here's an example of how this can be done (where image is one of your images):
img_w, img_h = image.size
background = Image.new('RGBA', (1300, 1300), (255, 255, 255, 255))
bg_w, bg_h = background.size
offset = (10, ((bg_h - img_h) // 2) - 370)  # paste offsets must be ints
background.paste(image, offset)
Adjust the offset, width and height to fit your requirements.