Recognise numbers from a Sudoku grid using pillow - python

I am making a sudoku solver and I want to be able to recognise numbers from a sudoku grid. I'm using pillow's cropping tool to cut out the squares and then see if they match a png of any of the numbers. But the function I made to crop out a square isn't consistent.
I first tried resizing the image so that all the squares would be the same size, then cropping based on a row and column input multiplied by a ninth of the image size. That ended up including grid lines that made the crops differ, so I shrank the crop. I tried many different offsets, but none of them worked.
from PIL import Image
Sudoku = Image.open('image.png')
w,h = Sudoku.size
new_w = w//9*9
new_h = h//9*9
Sudoku = Sudoku.resize((new_w,new_h))
def get_square(r,c,cutoff):
    # note: the cell size is taken from the width only
    square_size = Sudoku.size[0]/9
    Top = r*square_size+cutoff
    Left = c*square_size+cutoff
    Bottom = (r+1)*square_size-cutoff
    Right = (c+1)*square_size-cutoff
    tup = (Left,Top,Right,Bottom)
    return Sudoku.crop(tup)
I expected the square to be shown normally and that numbers would always appear in the same place for each square. Instead when I save an image of a number in one square and then compare it with the same number from a different square, the numbers were shifted instead of lining up.
Any ideas?
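One thing that may be worth checking, as a sketch rather than a confirmed fix: get_square derives both the row and the column positions from the image width, so any difference between width and height would make rows drift. The hypothetical helper below (cell_box and margin_pct are names made up for illustration) computes integer cell boundaries separately for each axis and trims a percentage margin to drop the grid lines. It only does the arithmetic, so it runs without Pillow.

```python
def cell_box(width, height, row, col, margin_pct=10):
    """Return an integer (left, top, right, bottom) box for one Sudoku cell.

    Boundaries come from the full image size (k * size // 9), so rounding
    errors do not accumulate from cell to cell. margin_pct is the share of
    the cell trimmed on every side to skip the grid lines (an assumed value).
    """
    left = col * width // 9
    right = (col + 1) * width // 9
    top = row * height // 9
    bottom = (row + 1) * height // 9
    mx = (right - left) * margin_pct // 100
    my = (bottom - top) * margin_pct // 100
    return (left + mx, top + my, right - mx, bottom - my)

# Boxes for every cell of a hypothetical 453 x 450 pixel image
boxes = [cell_box(453, 450, r, c) for r in range(9) for c in range(9)]
print(boxes[0], boxes[-1])
```

Each tuple can be passed to Sudoku.crop(). Since cells may differ by a pixel in size, resizing every crop to one fixed size before comparing it with the reference digits would probably still be needed.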

Related

Find minimal number of rectangles in the image

I have binary images where rectangles are placed randomly and I want to get the positions and sizes of those rectangles.
If possible I want the minimal number of rectangles necessary to exactly recreate the image.
On the left is my original image and on the right the image I get after applying scipy's find_objects() (as suggested for this question).
import numpy as np
from scipy import ndimage
# image = ndimage.zoom(image, 9, order=0)
# label connected regions (8-connectivity), then take one bounding box per label
labels, n = ndimage.label(image, np.ones((3, 3)))
bboxes = ndimage.find_objects(labels)
img_new = np.zeros_like(image)
for bb in bboxes:
    img_new[bb[0], bb[1]] = 1
This works fine if the rectangles are far apart, but if they overlap and build more complex structures this algorithm just gives me the largest bounding box (upsampling the image made no difference). I have the feeling that there should already exist a scipy or opencv method which does this.
I would be glad to know if somebody has an idea on how to tackle this problem or even better knows of an existing solution.
As a result I want a list of rectangles (i.e. lower-left corner : upper-right corner) in the image. The condition is that when I redraw those filled rectangles I get exactly the same image as before. If possible, the number of rectangles should be minimal.
Here is the code for generating sample images (and a more complex example original vs scipy)
import numpy as np
def random_rectangle_image(grid_size, n_obstacles, rectangle_limits):
    n_dim = 2
    rect_pos = np.random.randint(low=0, high=grid_size-rectangle_limits[0]+1,
                                 size=(n_obstacles, n_dim))
    rect_size = np.random.randint(low=rectangle_limits[0],
                                  high=rectangle_limits[1]+1,
                                  size=(n_obstacles, n_dim))
    # Crop rectangle size if it goes over the boundaries of the world
    diff = rect_pos + rect_size
    ex = np.where(diff > grid_size, True, False)
    rect_size[ex] -= (diff - grid_size)[ex].astype(int)
    img = np.zeros((grid_size,)*n_dim, dtype=bool)
    for i in range(n_obstacles):
        p_i = np.array(rect_pos[i])
        ps_i = p_i + np.array(rect_size[i])
        img[tuple(map(slice, p_i, ps_i))] = True
    return img
img = random_rectangle_image(grid_size=64, n_obstacles=30,
                             rectangle_limits=[4, 10])
Here is something to get you started: a naïve algorithm that walks your image and creates rectangles as large as possible. As it is now, it only marks the rectangles but does not report back coordinates or counts. This is to visualize the algorithm alone.
It does not need any external libraries except for PIL, to load and access the left side image when saved as a PNG. I'm assuming a border of 15 pixels all around can be ignored.
from PIL import Image
def fill_rect (pixels,xp,yp,w,h):
    # fill the rectangle in red
    for y in range(h):
        for x in range(w):
            pixels[xp+x,yp+y] = (255,0,0,255)
    # draw an orange outline around it
    for y in range(h):
        pixels[xp,yp+y] = (255,192,0,255)
        pixels[xp+w-1,yp+y] = (255,192,0,255)
    for x in range(w):
        pixels[xp+x,yp] = (255,192,0,255)
        pixels[xp+x,yp+h-1] = (255,192,0,255)
def find_rect (pixels,x,y,maxx,maxy):
    # assume we're at the top left
    # get max horizontal span
    width = 0
    height = 1
    while x+width < maxx and pixels[x+width,y] == (0,0,0,255):
        width += 1
    # now walk down while every pixel under the span is still black
    while y+height < maxy:
        if any(pixels[w,y+height] != (0,0,0,255) for w in range(x,x+width)):
            break
        height += 1
    # fill rectangle
    fill_rect (pixels,x,y,width,height)
image = Image.open('A.png')
pixels = image.load()
width, height = image.size
print (width,height)
# skip the 15-pixel border
for y in range(16,height-15,1):
    for x in range(16,width-15,1):
        if pixels[x,y] == (0,0,0,255):
            find_rect (pixels,x,y,width,height)
image.show()
From the output you can observe that the detection algorithm can be improved: for example, the "obvious" two top-left rectangles are split up into three. Similarly, the larger structure in the center contains one rectangle more than absolutely needed.
Possible improvements are either to adjust the find_rect routine to locate a best fit¹, or store the coordinates and use math (beyond my ken) to find which rectangles may be joined.
¹ A further idea on this. Currently all found rectangles are immediately filled with the "found" color. You could try to detect obviously multiple rectangles, and then, after marking the first, the other rectangle(s) to check may then either be black or red. Off the cuff I'd say you'd need to try different scan orders (top-to-bottom or reverse, left-to-right or reverse) to actually find the minimally needed number of rectangles in any combination.
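As a rough sketch of the "store the coordinates" idea, the same greedy scan can be run on a plain binary grid and return the rectangles instead of painting them. The function names and the tiny sample grid are made up for illustration, and the result is a non-overlapping cover that is not guaranteed to be minimal.

```python
def greedy_rectangles(grid):
    """Cover every truthy cell of a 2D grid with non-overlapping rectangles.

    Returns a list of (row, col, height, width) tuples. The scan is greedy
    (widest run first, then extended downwards), so the count may be larger
    than the true minimum.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    todo = [[bool(v) for v in row] for row in grid]  # cells not yet covered
    rects = []
    for r in range(rows):
        for c in range(cols):
            if not todo[r][c]:
                continue
            # widest horizontal run of uncovered cells starting at (r, c)
            w = 0
            while c + w < cols and todo[r][c + w]:
                w += 1
            # extend downwards while the whole run is still uncovered
            h = 1
            while r + h < rows and all(todo[r + h][c:c + w]):
                h += 1
            for rr in range(r, r + h):
                for cc in range(c, c + w):
                    todo[rr][cc] = False
            rects.append((r, c, h, w))
    return rects

def redraw(rects, rows, cols):
    """Paint the rectangles back onto an empty grid, for checking."""
    out = [[False] * cols for _ in range(rows)]
    for r, c, h, w in rects:
        for rr in range(r, r + h):
            for cc in range(c, c + w):
                out[rr][cc] = True
    return out

grid = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
rects = greedy_rectangles(grid)
print(rects)
```

On this sample the scan reports three rectangles, while two overlapping ones (rows 0-1, columns 0-2 and rows 1-2, columns 2-4) would redraw the same image, which illustrates why a smarter merge step or different scan orders would be needed for a minimal answer.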

Split Image into arbitrary number of boxes

I need to split an RGBA image into an arbitrary number of boxes that are as equally sized as possible.
I have attempted to use numpy.array_split, but am unsure how to do so while preserving the RGBA channels.
I have looked at the following questions, but none of them detail how to split an image into n boxes; they cover splitting the image into boxes of a predetermined pixel size, or into some shape.
While it seems that it would be some simple math to get number of boxes from box size and image size, I am unsure of how to do so.
How to Split Image Into Multiple Pieces in Python
Cutting one image into multiple images using the Python Image Library
Divide image into rectangles information in Python
While attempting to determine the number of boxes from pixel box size, I used the formula
num_boxes = (img_size[0]*img_size[1])/ (box_size_x * box_size_y)
but that did not result in the image being split up properly
To clarify, I would like to be able to input an image that is a numpy array of size (a,b,4) and a number of boxes and output the images in some form (np array preferred, but whatever works)
I appreciate any help, even if you aren't able to provide the full method, I would appreciate some direction.
I have tried
def split_image(image, n_boxes):
    return numpy.array_split(image,n_boxes)
#doesn't work with colors
def split_image(image, n_boxes):
    box_size = factor_int(n_boxes)
    M = image.shape[0]//box_size[0]
    N = image.shape[1]//box_size[1]
    return [image[x:x+M,y:y+N] for x in range(0,image.shape[0],M) for y in range(0,image.shape[1],N)]
factor_int returns the pair of integer factors that is as close to a square as possible; it comes from the question "Factor an integer to something as close to a square as possible".
I am still not sure if your inputs are actually the image and the dimensions of the boxes or the image and the number of boxes. Nor am I sure if your problem is deciding where to chop the image or knowing how to chop a 4-channel image, but maybe something in here will get you started.
I started with this RGBA image - the circles are transparent, not white:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
import math
# Open image and get dimensions
im = Image.open('start.png').convert('RGBA')
# Make Numpy array from image and get height and width
ni = np.array(im)
h, w = ni.shape[:2]
print(f'Height: {h}, width: {w}')
BOXES = 4
for i in range(BOXES):
    this = ni[:, i*w//BOXES:(i+1)*w//BOXES, :]
    Image.fromarray(this).save(f'box-{i}.png')
You can change BOXES but leaving it at 4 gets you these 4 output images:
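If the boxes should form a grid rather than vertical strips, the same i*w//BOXES arithmetic can be applied to both axes. This is only a sketch: box_bounds is a hypothetical helper, and it assumes the number of boxes has already been factored into rows and columns (for example with factor_int from the question).

```python
def box_bounds(height, width, rows, cols):
    """Return (top, bottom, left, right) pixel bounds for a rows x cols grid.

    The i * size // n form spreads any remainder over the boxes, so sizes
    differ by at most one pixel along each axis and every pixel is used once.
    """
    bounds = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            bounds.append((top, bottom, left, right))
    return bounds

# A hypothetical 10 x 7 pixel image split into 2 rows and 3 columns
boxes = box_bounds(height=10, width=7, rows=2, cols=3)
print(boxes)
```

With the array from the code above, ni[top:bottom, left:right] then yields one box per tuple, and because only the first two axes are sliced, the four RGBA channels stay intact.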

Remove noise from grayscale image PIL

I am working on handwriting recognition code for a school project. We want to collect the data ourselves, and I’m currently working on a program that scans a document with handwritten letters on it and creates a separate image for every letter. I cut the image to the exact size of the letter, resize it so that every letter has the same dimensions, and place it on a white background so that all images have the same dimensions while the original aspect ratio stays the same. It already works quite well; the only problem is that it fails when there is a little bit of noise in the picture. I have the image (see attachment) and a list of all the pixels of the image. What would be a good way to cut the image to the letter's boundaries and not to the noise?
The code I use to cut the image:
def cut_to_edge(image, data, width, height):
    left = width
    right = 0
    down = 0
    up = height
    for i in range(len(data)):
        for j in range(len(data[i])):
            if data[i][j] < 225:
                if j < left:
                    left = j
                if j > right:
                    right = j
                if i < up:
                    up = i
                if i > down:
                    down = i
    letter = image.crop((left, up, right, down))
    return letter
image is the image (obviously),
data is a 2 dimensional list with every pixel ([[row1][row2]etc.]),
width and height are the dimensions of the image
Attached: the image I need to cut, how it should look, and how it looks now.
If your noise is like little points, you can write a Median filter and that will solve your problem. The main idea of this filter is to loop through the image pixel by pixel and replace each pixel value with the median of the neighboring pixels.
But first you need to identify the type of noise you have and then apply the right filter.
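A minimal sketch of such a median filter, assuming data is the 2-dimensional list of grey values from the question (the function name and the tiny sample are made up for illustration):

```python
def median_filter(data, radius=1):
    """Return a filtered copy of a 2D list of grey values.

    Each value is replaced by the median of its (2*radius+1) x (2*radius+1)
    neighbourhood; near the borders the window is clipped to the image.
    For windows with an even number of values the upper median is used.
    """
    height = len(data)
    width = len(data[0]) if height else 0
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            window = sorted(
                data[y][x]
                for y in range(max(0, i - radius), min(height, i + radius + 1))
                for x in range(max(0, j - radius), min(width, j + radius + 1))
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out

noisy = [
    [255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255],  # one isolated dark speck
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
clean = median_filter(noisy)
print(clean[1][1])
```

The filtered list could then be passed to cut_to_edge in place of data. Pillow also ships a ready-made version, image.filter(ImageFilter.MedianFilter(size=3)). Either way, very thin pen strokes may be weakened along with the noise, so the window size is worth testing on real scans.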

Using OpenCV remap function crops image

I am trying to warp a 640x360 image via the OpenCV remap function (in Python 2.7). The steps executed are the following:
Generate a curve and store its x and y coordinates in two separate arrays, curve_x and curve_y. I am attaching the generated curve as an image (using pyplot):
Load image via the opencv imread function
original = cv2.imread('C:\\Users\\User\\Desktop\\alaskan-landscaps3.jpg')
Execute a nested for loop so that each pixel is shifted upwards in proportion to the height of the curve at that point. For each pixel I calculate a warping factor by dividing the distance between the curve's y coordinate and the "ceiling" (360) by the height of the image. The factor is then multiplied by the distance between the pixel's y-coordinate and the "ceiling" in order to find the new distance that the pixel must have from the "ceiling" (it will be shorter since we have an upward shift). Finally I subtract this new distance from the "ceiling" to obtain the new y-coordinate for the pixel. I chose this formula to ensure that all entries in the map_y array used in the remap function are within the area of the original image.
for i in range(0, y_size):
    for j in range(0,x_size):
        map_y[i][j]= y_size-((y_size - i) * ((y_size - curve_y[j]) / y_size))
        map_x[i][j]=j
Then using the remap function
warped=cv2.remap(original,map_x,map_y,cv2.INTER_LINEAR)
The resulting image appears to be warped somewhat along the curve's path but it is cropped - I am attaching both the original and resulting image
I know I must be missing something, but I can't figure out where the mistake is in my code. Since all y-coordinates in map_y are between 0 and 360, I don't understand why the top third of the image has disappeared after the remapping.
Any pointers or help will be appreciated. Thanks
[EDIT:] I have edited my function as follows:
#array to store previous y-coordinate, used as a counter during mapping process
floor_y=np.zeros((x_size),np.float32)
#for each row and column of picture
for i in range(0, y_size):
    for j in range(0,x_size):
        #calculate distance between top of the curve at given x coordinate and top
        height_above_curve = (y_size-1) - curve_y_points[j]
        #calculated a mapping factor, using total height of picture and distance above curve
        mapping_factor = (y_size-1)/height_above_curve
        # if there was no curve at given x-coordinate then do not change the pixel coordinate
        if(curve_y_points[j]==0):
            map_y[i][j]=j
        #if this is the first time the column is traversed, save the curve y-coordinate
        elif (floor_y[j]==0):
            #the pixel is translated upwards according to the height of the curve at that point
            floor_y[j]=i+curve_y_points[j]
            map_y[i][j]=i+curve_y_points[j] # new coordinate saved
        # use a modulo operation to only translate each nth pixel where n is the mapping factor.
        # the idea is that in order to fit all pixels from the original picture into a new smaller space
        #(because the curve squashes the picture upwards) a number of pixels must be removed
        elif ((math.floor(i % mapping_factor))==0):
            #increment the "floor" counter so that the next group of pixels from the original image
            #are mapped 1 pixel higher up than the previous group in the new picture
            floor_y[j]=floor_y[j]+1
            map_y[i][j]=floor_y[j]
        else:
            #for pixels that must be skipped map them all to the last pixel actually translated to the new image
            map_y[i][j]=floor_y[j]
        #all x-coordinates remain unchanged as we only translate pixels upwards
        map_x[i][j] = j
#printout function to test mappings at x=383
for j in range(0, 360):
    print('At x=383,y='+str(j)+'for curve_y_points[383]='+str(curve_y_points[383])+' and floor_y[383]='+str(floor_y[383])+' mapping is:'+str(map_y[j][383]))
The bottom line is that the higher part of the image should now not receive mappings from the lowest part, so overwriting of pixels should not take place. Yet I am still getting a hugely exaggerated upwards warping effect in the picture which I cannot explain (see new image below). The top of the curved part is at around y=140 in the original picture, yet it is now very close to the top, i.e. y around 300. There is also the question of why I am not getting a blank space at the bottom for the pixels below the curve.
I'm thinking that maybe there is also something going on with the order of rows and columns in the map_y array?
I don't think the image is being cropped. Rather, the values are "crowded" in the top-middle pixels, so that they get overwritten. Consider the following example with a simple function on a checkerboard.
import numpy as np
import cv2
import matplotlib.pyplot as plt
y_size=200
x_size=200
x=np.linspace(0,x_size,x_size+1)
y=(-(x-x_size/2)*(x-x_size/2))/x_size+x_size
plt.plot(x,y)
The function looks like this:
Then let's produce an image with a regular pattern.
test=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
    for j in range(0,x_size):
        if i%2 and j%2:
            test[i][j]=255
cv2.imwrite('checker.png',test)
Now let's apply your shift function to that pattern:
map_y=np.zeros((x_size,y_size),dtype=np.float32)
map_x=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
    for j in range(0,x_size):
        map_y[i][j]= y_size-((y_size - i) * ((y_size - y[j]) / y_size))
        map_x[i][j]=j
warped=cv2.remap(test,map_x,map_y,cv2.INTER_LINEAR)
cv2.imwrite('warped.png',warped)
As you can see, because of the shift more than one value corresponds to the top-middle areas, which makes the image look cropped. If you check the top left and right corners of the image, the values are sparser there, so the "cropping" effect is much weaker. I hope this simple example helps to understand what is going on.
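It may also help to look at how the maps are read. cv2.remap is documented as a backward mapping, dst(x, y) = src(map_x(x, y), map_y(x, y)), so each map entry names the source pixel that an output pixel is taken from. The pure-Python sketch below mimics that lookup with simple truncation instead of cv2.INTER_LINEAR; remap_lookup, the 8x3 image and the constant curve height of 4 are assumptions for illustration only.

```python
def remap_lookup(src, map_x, map_y):
    """Backward mapping: dst[i][j] = src[int(map_y[i][j])][int(map_x[i][j])].

    Coordinates are truncated to whole pixels (no interpolation) and
    out-of-range lookups are left at 0.
    """
    rows, cols = len(src), len(src[0])
    dst = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            sy, sx = int(map_y[i][j]), int(map_x[i][j])
            if 0 <= sy < rows and 0 <= sx < cols:
                dst[i][j] = src[sy][sx]
    return dst

y_size, x_size = 8, 3
curve = [4, 4, 4]  # hypothetical constant curve height for every column

# every source row holds its own row index, so the lookups stay visible
src = [[i] * x_size for i in range(y_size)]
map_x = [[j for j in range(x_size)] for _ in range(y_size)]
# same formula as in the question
map_y = [[y_size - (y_size - i) * ((y_size - curve[j]) / y_size)
          for j in range(x_size)] for i in range(y_size)]

dst = remap_lookup(src, map_x, map_y)
sampled_rows = sorted({dst[i][0] for i in range(y_size)})
print(sampled_rows)
```

In this toy case only source rows 4 to 7 are ever read, because the formula gives map_y = curve[j] for output row 0. Under that reading, source rows above the curve value are never sampled, which might be another way to look at the missing top part of the picture.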

Trim scanned images with PIL?

What would be the approach to trim an image that's been input using a scanner and therefore has a large white/black area?
The entropy solution seems problematic and computationally expensive. Why not edge detect?
I just wrote this Python code to solve the same problem for myself. My background was a dirty off-white, so the criteria I used were darkness and color. I simplified these criteria by taking the smallest of the R, G or B values for each pixel, so that black and saturated red both stand out equally. I also used the average of a fixed number of the darkest pixels in each row or column. Then I started at each edge and worked my way in until I crossed a threshold.
Here is my code:
#these values set how sensitive the bounding box detection is
threshold = 200 #the average of the darkest values must be _below_ this to count (0 is darkest, 255 is lightest)
obviousness = 50 #how many of the darkest pixels to include (1 would mean a single dark pixel triggers it)
from PIL import Image
import numpy as np
def find_line(vals):
    #implement edge detection once, use many times
    for i,tmp in enumerate(vals):
        tmp.sort()
        average = float(sum(tmp[:obviousness]))/len(tmp[:obviousness])
        if average <= threshold:
            return i
    return i #i is left over from failed threshold finding, it is the bounds
def getbox(img):
    #get the bounding box of the interesting part of a PIL image object
    #this is done by getting the darkest of the R, G or B value of each pixel
    #and finding where the edge gets dark/colored enough
    #returns a tuple of (left,upper,right,lower)
    width, height = img.size #for making a 2d array
    retval = [0,0,width,height] #values will be disposed of, but this is a black image's box
    pixels = list(img.getdata())
    vals = [] #store the value of the darkest color
    for pixel in pixels:
        vals.append(min(pixel)) #the darkest of the R,G or B values
    #make 2d array
    vals = np.array([vals[i * width:(i + 1) * width] for i in range(height)])
    #start with upper bounds
    forupper = vals.copy()
    retval[1] = find_line(forupper)
    #next, do lower bounds
    forlower = vals.copy()
    forlower = np.flipud(forlower)
    retval[3] = height - find_line(forlower)
    #left edge, same as before but rotate the data so left edge is top edge
    forleft = vals.copy()
    forleft = np.swapaxes(forleft,0,1)
    retval[0] = find_line(forleft)
    #and right edge is bottom edge of rotated array
    forright = vals.copy()
    forright = np.swapaxes(forright,0,1)
    forright = np.flipud(forright)
    retval[2] = width - find_line(forright)
    if retval[0] >= retval[2] or retval[1] >= retval[3]:
        print("error, bounding box is not legit")
        return None
    return tuple(retval)
if __name__ == '__main__':
    image = Image.open('cat.jpg')
    box = getbox(image)
    print("result is: ", box)
    result = image.crop(box)
    result.show()
For starters, here is a similar question, here is a related question, and here is another related question.
Here is just one idea; there are certainly other approaches. I would select an arbitrary crop edge and then measure the entropy* on either side of the line, then re-select the crop line (probably using something like a bisection method) until the entropy of the cropped-out portion falls below a defined threshold. I think you may need to resort to a brute-force root-finding method, as you will not have a good indication of when you have cropped too little. Then repeat for the remaining 3 edges.
*I recall discovering that the entropy method in the referenced website was not completely accurate, but I could not find my notes (I'm sure it was in a SO post, however.)
Edit:
Other criteria for the "emptiness" of an image portion (other than entropy) might be contrast ratio or contrast ratio on an edge-detect result.
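A minimal sketch of the entropy idea for the top edge, under stated assumptions: the image is given as rows of grey values, the entropy is the Shannon entropy of the value histogram, the threshold of 0.5 bits is arbitrary, and a plain row-by-row scan stands in for the bisection mentioned above. Both function names are made up for illustration.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits, of the histogram of a list of pixel values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def top_crop_line(rows, threshold=0.5):
    """Return how many leading rows can be cropped away.

    The strip is grown one row at a time and the scan stops as soon as the
    entropy of the cropped-out strip would exceed the threshold.
    """
    strip = []
    k = 0
    for row in rows:
        strip.extend(row)
        if shannon_entropy(strip) > threshold:
            break
        k += 1
    return k

rows = [
    [255, 255, 255, 255],  # blank scanner background
    [255, 255, 255, 255],
    [255, 0, 255, 0],      # content starts here
    [0, 255, 0, 0],
]
print(top_crop_line(rows))
```

The other three edges could be handled by feeding the rows reversed, or the columns, into the same function. A noisy scanner background raises the entropy of the strip, so the threshold would need tuning on real scans.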
