I am currently working on the problem below and am struggling to understand the best approach.
I've searched a lot but was not able to find answers that match what I am trying to do.
The problem:
Relocating an object (e.g. a shoe) within the existing image (white background) to a certain location (e.g. moving it up)
Inserting and positioning the object (e.g. a shoe) at a user-specified location within a new background (still white) with a user-specified new height / width
How far I got:
I've managed to identify the object within the picture using cv2, got the outer contours, added a little padding and cropped the object (see below). I am happy with cropping it that way, as all my images have a single-coloured background and I will keep the background in the same colour.
Where I am stuck:
My cropped object and the old image background / new background do not share the same shape, hence I am not able to overlay / concatenate / merge them.
Given both images are stored as np arrays, I assume the answer will be to somehow place the shoe crop np.array within the background np.array, however I have no clue how to do this.
Maybe there is an easier / different way to do this?
Would be very grateful to hear from anyone who can point me in the right direction.
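EDIT: for anyone with the same issue, placing one array inside another turned out to be plain NumPy slice assignment. A minimal sketch, assuming crop_img is the crop produced by the code below and that it fits inside the background at the chosen offset (the offsets are made-up values):
import numpy as np
background = 255 * np.ones((1200, 1200, 3), dtype=np.uint8)  # white canvas
y_off, x_off = 100, 200  # hypothetical top-left position for the crop
h, w = crop_img.shape[:2]
background[y_off:y_off + h, x_off:x_off + w] = crop_img  # paste the crop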
Code
#importing dependencies
import os
import numpy as np
import cv2
from matplotlib import pyplot as plt
# Config
path = '/Users/..../Shoes/'
img_list = os.listdir(path)
img_path = path + img_list[0]
#Outline
color = (0,255,0)
thickness = 3
padding = 10
# load image and convert to RGB
image = cv2.imread(img_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# create a binary thresholded image
_, binary = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Identifying outer contours
x_axis = []
y_axis = []
for i in range(len(contours)):
    for y in range(len(contours[i])):
        x_axis.append(contours[i][y][0][0])
        y_axis.append(contours[i][y][0][1])
min_x = min(x_axis) - padding
min_y = min(y_axis) - padding
max_x = max(x_axis) + padding
max_y = max(y_axis) + padding
# Defining start and endpoint of outline Rectangle based on identified outer corners + Padding
start_point = (min_x, min_y)
end_point = (max_x, max_y)
image_outline = cv2.rectangle(image.copy(), start_point, end_point, color, thickness)  # draw on a copy so the crop below stays clean
plt.imshow(image_outline)
plt.show()
#Crop Image
crop_img = image[min_y:max_y, min_x:max_x]
print(crop_img.shape)
plt.imshow(crop_img)
plt.show()
I think I got to a solution; this centers the image for any given new background height/width.
Still interested in quicker / cleaner ways.
#Define the new height and width you want to have
new_height = 1200
new_width = 1200
#Check current height and width of the cropped image
crop_height = crop_img.shape[0]
crop_width = crop_img.shape[1]
#calculate how much you need to add to the sides and top - basically half of the remaining height / width ... currently not working correctly for odd numbers
add_sides = int((new_width - crop_width)/2)
add_top_and_btm = int((new_height - crop_height)/2)
# Adding background to the sides
bg_sides = 255 * np.ones(shape=[crop_height, add_sides, 3], dtype=np.uint8)  # white columns
new_crop_img = np.insert(crop_img, [1], bg_sides, axis=1)
new_crop_img = np.insert(new_crop_img, [-1], bg_sides, axis=1)
# Then adding background to top and bottom
bg_top_and_btm = 255 * np.ones(shape=[add_top_and_btm, new_width, 3], dtype=np.uint8)  # white rows
new_crop_img = np.insert(new_crop_img, [1], bg_top_and_btm, axis=0)
new_crop_img = np.insert(new_crop_img, [-1], bg_top_and_btm, axis=0)
plt.imshow(new_crop_img)
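A possibly cleaner alternative to np.insert is cv2.copyMakeBorder, which pads in a single call; splitting the remainder explicitly also fixes the odd-number issue noted above. A sketch, assuming the same crop_img and 1200x1200 target as before:
top = (new_height - crop_height) // 2
bottom = new_height - crop_height - top  # gets the extra pixel when the difference is odd
left = (new_width - crop_width) // 2
right = new_width - crop_width - left
centered = cv2.copyMakeBorder(crop_img, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(255, 255, 255))
plt.imshow(centered)  # exactly new_height x new_width, crop centred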
Related
So I have started a program that takes two images: one is the model image and the other is an image with a change. I want it to detect the differences and show them to me by circling them. I have come to an issue with finding the difference coordinates, as my circle keeps ending up in the middle of the image.
This is the code I have:
import cv2 as cv
import numpy as np
from PIL import Image, ImageChops
#Ideal Image and The main Image
img2= cv.imread("ideal.jpg")
img1 = cv.imread("Actual.jpg")
#Verifies whether there is a difference in the image, for the if statement below
diff = cv.subtract(img2, img1)
results = not np.any(diff)
#Tells the user whether the two images (the model image and the given image) differ
if results:
    print("The images are the same!")
else:
    print("The images are different")
#This creates an image showing the difference, which we will circle
img_1=Image.open("Actual.jpg")
img_2=Image.open("ideal.jpg")
diff=ImageChops.difference(img_1,img_2)
diff.save("Differance.jpg")
#Reads the image Just saved
Differance = cv.imread("Differance.jpg", 0)
#Resize the Image to make it smaller
img1s = cv.resize(img1, (0, 0), fx=0.5, fy=0.5)
Differance = cv.resize(Differance, (0, 0), fx=0.5, fy=0.5)
# Find anything not black, i.e. the difference
nz = cv.findNonZero(Differance)
# Find top, bottom, left and right edges of the difference
a = nz[:,0,0].min()
b = nz[:,0,0].max()
c = nz[:,0,1].min()
d = nz[:,0,1].max()
# Average top and bottom edges, left and right edges, to give centre
c0 = (a+b)/2
c1 = (c+d)/2
#The Center Coords
c3 = (int(c0),int(c1))
#Values for the code below so it doesn't look messy
radius = 50
color = (0, 0, 255)
thickness = 2
#This places a circle around the center of the difference
Finished = cv.circle(img1s, c3, radius, color, thickness)
#Saves the Final Image with the circle around it
cv.imwrite("Final.jpg", Finished)
(The two input images were attached to the original post.)
This code currently takes both images and blacks out the background, leaving only the difference within the image. The program is then meant to take the location of the difference and place a circle around its center on the main image (the one containing the difference).
Your main problem is the JPG format, which alters pixels to compress the image better, and this creates small differences across the whole area. If you display diff or difference, you should see many gray pixels.
I hope you can see the stray pixels below the ball.
If you use PNG for the original image (without the ball), later use this image to create the image with the ball, and also save that as PNG, then the code will work correctly.
My version without PIL.
Press any key to close window with image.
import cv2 as cv
import numpy as np
# load images
img1 = cv.imread("img1.png")
img2 = cv.imread("img2.png")
# calculate difference
diff = cv.subtract(img1, img2) # other order `(img2, img1)` gives worse result
# saves difference
cv.imwrite("difference.png", diff)
# show difference - press any key to close
cv.imshow('diff', diff)
cv.waitKey(0)
cv.destroyWindow('diff')
if not np.any(diff):
    print("The images are the same!")
else:
    print("The images are different")
# resize images to make them smaller
#img1_resized = cv.resize(img1, (0, 0), fx=0.5, fy=0.5)
#diff_resized = cv.resize(diff, (0, 0), fx=0.5, fy=0.5)
img1_resized = img1
diff_resized = diff
# convert to grayscale (without saving and loading again)
diff_resized = cv.cvtColor(diff_resized, cv.COLOR_BGR2GRAY)
# find anything not black in differance
non_zero = cv.findNonZero(diff_resized)
#print(non_zero)
# find top, bottom, left and right edge of the differance
x_min = non_zero[:,0,0].min()
x_max = non_zero[:,0,0].max()
y_min = non_zero[:,0,1].min()
y_max = non_zero[:,0,1].max()
print('x:', x_min, x_max)
print('y:', y_min, y_max)
sizes = [x_max-x_min+1, y_max-y_min+1]
print('width :', sizes[0])
print('height:', sizes[1])
# center
center_x = (x_min + x_max) // 2
center_y = (y_min + y_max) // 2
center = (center_x, center_y)
print('center:', center)
# radius
radius = max(sizes) // 2
print('radius:', radius)
color = (0, 0, 255)
thickness = 2
# draw circle around the center of the differance
finished = cv.circle(img1_resized, center, radius, color, thickness)
# saves final image with circle
#cv.imwrite("final.png", finished)
# show final image - press any key to close
cv.imshow('finished', finished)
cv.waitKey(0)
cv.destroyWindow('finished')
(The example images img1.png, img2.png, difference.png and final.png were attached to the original answer.)
EDIT:
If you work with JPG, then you can try to reduce the noise:
diff = cv.subtract(img1, img2)
diff_gray = cv.cvtColor(diff, cv.COLOR_BGR2GRAY)
diff_gray[diff_gray < 50] = 0
For different images you may need different values instead of 50.
You may also try thresholding
(_, diff_gray) = cv.threshold(diff_gray, 50, 0, cv.THRESH_TOZERO)
It may also need other functions, like blur(), erode() and dilate().
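Putting these together, a sketch of a small noise-suppression chain for the difference image (the threshold of 50 and the kernel size are guesses to tune per image):
diff_gray = cv.cvtColor(diff, cv.COLOR_BGR2GRAY)
diff_gray = cv.GaussianBlur(diff_gray, (5, 5), 0)  # smooth compression noise
_, diff_gray = cv.threshold(diff_gray, 50, 255, cv.THRESH_TOZERO)  # drop faint pixels
kernel = np.ones((3, 3), np.uint8)
diff_gray = cv.erode(diff_gray, kernel, iterations=1)   # remove isolated specks
diff_gray = cv.dilate(diff_gray, kernel, iterations=1)  # restore the real difference blob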
You do not need PIL:
take the Differance image
threshold it
use findContours to find regions
if contours are found, draw them:
# threshold the Differance image, then find the changed regions (OpenCV 4 signature)
_, thresh = cv2.threshold(Differance, 50, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
out_image = img1s.copy()  # draw on a copy of the resized input image
for cnt in contours:
    out_image = cv2.drawContours(out_image, [cnt], 0, (255, 0, 0), -1)
    (x, y), radius = cv2.minEnclosingCircle(cnt)
    center = (int(x), int(y))
    radius = int(radius)
    out_image = cv2.circle(out_image, center, radius, (0, 255, 0), 2)
I would like to be able to make a certain shape in either a PIL image or an OpenCV image 3 times larger or smaller, without changing the resolution of the image or distorting the shape itself. I have tried using OpenCV's dilation method, but that is not its intended use, plus it changed the shape of the image. For example:
Thanks.
Here's a way of doing it:
find the interesting shape, i.e. non-white ROI area
extract it
scale it up by a factor
clear the original image to white
paste the scaled ROI back into image with same centre
#!/usr/bin/env python3
import cv2
import numpy as np
if __name__ == "__main__":
    # Open image
    orig = cv2.imread('image.png', cv2.IMREAD_COLOR)

    # Get extent of interesting part, i.e. non-white part
    y, x, _ = np.nonzero(~orig)
    y0, y1 = np.min(y), np.max(y)  # top and bottom rows
    x0, x1 = np.min(x), np.max(x)  # left and right cols
    h, w = y1 - y0, x1 - x0        # height and width
    ROI = orig[y0:y1, x0:x1]       # extract ROI
    cv2.imwrite('ROI.png', ROI)    # DEBUG only

    # Upscale ROI
    factor = 3
    scaledROI = cv2.resize(ROI, (w*factor, h*factor), interpolation=cv2.INTER_NEAREST)
    newH, newW = scaledROI.shape[:2]

    # Clear original image to white
    orig[:] = [255, 255, 255]

    # Get centre of original shape, and position of top-left of ROI in output image
    cx, cy = (x0 + x1)//2, (y0 + y1)//2
    top = cy - newH//2
    left = cx - newW//2

    # Paste in rescaled ROI
    orig[top:top+newH, left:left+newW] = scaledROI
    cv2.imwrite('result.png', orig)
That transforms the input image into the same shape scaled up in place (the before/after images were attached to the original answer).
Puts me in mind of a pantograph.
I have a set of arbitrary images. Half the images are pictures, half are masks defining ROIs.
In the current version of my program I use the ROI to crop the image (i.e. I extract the rectangle in the image matching the bounding box of the ROI mask). The problem is, the ROI mask isn't perfect, and in my case it's better to over-predict than under-predict.
So I want to copy more than the ROI rectangle, but if I do this, I may be trying to crop outside of the image.
i.e:
x, y, w, h = cv2.boundingRect(mask_contour)
img = img[int(y-h*0.05):int(y + h * 1.05), int(x-w*0.05):int(x + w * 1.05)]
can fail because it tries to access out of bounds pixels. I could just clamp the values, but I wanted to know if there is a better approach
You can add a border using OpenCV:
import cv2 as cv
import random
src = cv.imread('/home/stephen/lenna.png')
borderType = cv.BORDER_REPLICATE
borderSize = .5
top = int(borderSize * src.shape[0]) # shape[0] = rows
bottom = top
left = int(borderSize * src.shape[1]) # shape[1] = cols
right = left
value = [random.randint(0,255), random.randint(0,255), random.randint(0,255)]
dst = cv.copyMakeBorder(src, top, bottom, left, right, borderType, None, value)
cv.imshow('img', dst)
c = cv.waitKey(0)
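To apply this to the original problem, one option (a sketch, assuming the 5% margin from the question) is to pad first and then crop with shifted coordinates, so the expanded ROI can never leave the image:
margin = 0.05
x, y, w, h = cv.boundingRect(mask_contour)
pad_y, pad_x = int(h * margin), int(w * margin)
padded = cv.copyMakeBorder(img, pad_y, pad_y, pad_x, pad_x, cv.BORDER_REPLICATE)
# (x, y) in the original image is (x + pad_x, y + pad_y) in the padded one
crop = padded[y:y + h + 2 * pad_y, x:x + w + 2 * pad_x]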
Maybe you could try to limit the coordinates beforehand, clamping them to the image bounds. Please see the code below:
img_h, img_w = img.shape[:2]
ymin, ymax = max(0, int(y - h*0.05)), min(img_h, int(y + h*1.05))
xmin, xmax = max(0, int(x - w*0.05)), min(img_w, int(x + w*1.05))
img = img[ymin:ymax, xmin:xmax]
I have this image with tables, where I want to remove the tabular structure from the image so that Tesseract can work on it more effectively. I used the following code to create a boundary around the table (and the individual cells) so that it can be deleted.
img = cv2.imread('bfir.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
img1 = np.ones(img.shape, dtype=np.uint8) * 255
ret, thresh = cv2.threshold(gray, 127, 255, 1)
(_, contours, h) = cv2.findContours(thresh, 1, 2)  # OpenCV 3 signature; flags 1, 2 = RETR_LIST, CHAIN_APPROX_SIMPLE
for cnt in contours:
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        cv2.drawContours(img1, [cnt], 0, (0, 255, 0), 2)
This draws green lines around the table like this image.
Next, I tried the cv2.subtract method to subtract the table from the image, somewhat like this.
final_img = cv2.subtract(img1, img)
But this didn't work as I expected; it gives me a grayscale image with the table still in it (link), while I just want the original image in B&W with the table removed. I am using OpenCV for the first time, so I don't know what I am doing wrong, and I am sorry for the long post, but if anybody can help with how to go about this, or just point me in the right direction about how to remove the table, that would be very much appreciated.
EDIT:
As suggested by RobAu, it could also work to simply draw the contours in white in the first place, but I don't know how to do that without losing the rest of the data in the preprocessing stage.
You could try and simply overwrite the cells that represent the borders. This can be done by creating a mask image, and then using that as reference as to where to overwrite pixels in the original.
This can be done with:
mask_image = np.zeros(img.shape[0:2], np.uint8)
cv2.drawContours(mask_image, contours, -1, color=255, thickness=2)
border_points = np.array(np.where(mask_image == 255)).transpose()
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
Update:
You could use the 3-channel image you already created for the mask, but that slightly complicates the algorithm. The mask image I proposed is better suited to the task, but I will try to adapt it to your code:
# Create your mask image as usual...
border_points = np.array(np.where(img1[:, :, 1] == 255)).transpose()  # Only look at channel 2
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
Update to do as #RobAu suggested (quicker than my previous methods):
line_thickness = 3 # Change this value until it looks the best.
cv2.drawContours(img, contours, -1, color=(0,0,0), thickness=line_thickness )
Please note I didn't test this code. So it might need some further fiddling.
As a reference to the comments on this question, this is an example of code that locates rectangles and creates a new image for each one; it was an attempt at creating individual images from a picture of shredded paper. Some of the values will need to be changed for it to locate rectangles of the right size.
There is also some code for tracking the sizes of the images. The code is made up 50% of what I have written and 50% of Stack Overflow help.
import cv2
import numpy as np
fileName = ['9','8','7','6','5','4','3','2','1','0']
img = cv2.imread('#YOUR IMAGE#')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(gray,kernel,iterations = 2)
kernel = np.ones((4,4),np.uint8)
dilation = cv2.dilate(erosion,kernel,iterations = 2)
edged = cv2.Canny(dilation, 30, 200)
_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
rects = [cv2.boundingRect(cnt) for cnt in contours]
rects = sorted(rects,key=lambda x:x[1],reverse=True)
i = -1
j = 1
y_old = 5000
x_old = 5000
for rect in rects:
    x, y, w, h = rect
    area = w * h
    print('width: %d and height: %d' % (w, h))
    if w > 50 and h > 500:
        print('abs:')
        print(abs(x_old - x))
        if abs(x_old - x) > 0:
            print('writing')
            x_old = x
            x, y, w, h = rect
            out = img[y+10:y+h-10, x+10:x+w-10]
            cv2.imwrite('assets/newImage' + fileName[i] + '.jpg', out)
            j += 1
    if (y_old - y) > 1000:
        i += 1
        y_old = y
Even though the given input image links are not working (so I obviously don't know whether the following is what you asked for), I learnt something from your question when I was working on removing table structure lines from an image, and I'd like to share what I learnt for future readers.
I followed the steps provided in opencv documentation to remove the lines.
But that only removed the horizontal lines. When I tried to remove vertical lines, the result image only had the vertical lines. The text in the table was not there.
Then I came across your question & saw final_img = cv2.subtract(img1, img) in the question. Tried that & it worked great.
Here are the steps that I followed:
# Imports and a small helper, added so the snippet runs standalone
import sys
import numpy as np
import cv2 as cv

def show_wait_destroy(winname, img):
    cv.imshow(winname, img)
    cv.waitKey(0)
    cv.destroyWindow(winname)

argv = sys.argv[1:]
# Load the image
src = cv.imread(argv[0], cv.IMREAD_COLOR)
# Check if image is loaded fine
if src is None:
    print('Error opening image: ' + argv[0])
    sys.exit(-1)
# Show source image
cv.imshow("src", src)
# [load_image]
# [gray]
# Transform source image to gray if it is not already
if len(src.shape) != 2:
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
else:
    gray = src
# Show gray image
# show_wait_destroy("gray", gray)
# [gray]
# [bin]
# Apply adaptiveThreshold at the bitwise_not of gray, notice the ~ symbol
gray = cv.bitwise_not(gray)
bw = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                          cv.THRESH_BINARY, 15, -2)
# Show binary image
# show_wait_destroy("binary", bw)
# [bin]
# [init]
# Create the images that will use to extract the horizontal and vertical lines
horizontal = np.copy(bw)
vertical = np.copy(bw)
# [horiz]
# [vert]
# Specify size on vertical axis
rows = vertical.shape[0]
verticalsize = rows // 10  # integer division; Python 3 needs an int here
# Create structure element for extracting vertical lines through morphology operations
verticalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, verticalsize))
# Apply morphology operations
vertical = cv.erode(vertical, verticalStructure)
vertical = cv.dilate(vertical, verticalStructure)
# [init]
# [horiz]
# Specify size on horizontal axis
cols = horizontal.shape[1]
horizontal_size = cols // 30  # integer division; Python 3 needs an int here
# Create structure element for extracting horizontal lines through morphology operations
horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontal_size, 1))
# Apply morphology operations
horizontal = cv.erode(horizontal, horizontalStructure)
horizontal = cv.dilate(horizontal, horizontalStructure)
lines_removed = cv.subtract(gray, vertical + horizontal)
show_wait_destroy("lines_removed", ~lines_removed)
(The input and output images were attached to the original answer.)
Few things that I changed from the sources:
verticalsize = rows // 10: here, I do not understand the significance of the number 10. In the documentation, 30 was used, but I got a better result with 10. I guess the smaller the divisor, the larger the structuring element, and as we are targeting straight lines here, reducing the number works. For example, on an image 600 rows tall, rows // 10 gives a (1, 60) element, which only survives erosion where a vertical run of at least 60 foreground pixels exists.
In the documentation, vertical lines are processed after horizontal lines. I reversed the order
I swapped the parameters to cv2.subtract(): I used cv2.subtract(img, img1).
I am very new to OpenCV Python and I really need some help here.
So what I am trying to do here is to extract out these words in the image below.
The words and shapes are all hand-drawn, so they are not perfect. I have done some coding, shown below.
First of all, I grayscale the image
img = cv2.imread(file_name)
img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Then I use THRESH_INV to show the content
ret, new_img = cv2.threshold(img2gray, 100, 255, cv2.THRESH_BINARY_INV)
After which, I dilate the content
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3 , 3))
dilated = cv2.dilate(new_img,kernel,iterations = 3)
I dilate the image because that way I can identify the text as one cluster.
After that, I apply boundingRect around the contour and draw around the rectangle
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # get contours
index = 0
for contour in contours:
    # get rectangle bounding contour
    [x, y, w, h] = cv2.boundingRect(contour)
    # Don't plot small false positives that aren't text
    if w < 10 or h < 10:
        continue
    # draw rectangle around contour on original image
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 255), 2)
This is what I got after that.
I am only able to detect one piece of text. I have tried many other methods, but this is the closest result I have got, and it does not fulfil the requirement.
The reason for identifying the text is so that I can get the X and Y coordinates of each piece of text in this image by putting a bounding rectangle around it with boundingRect().
Please help me out. Thank you so much
You can use the fact that the connected component of the letters are much smaller than the large strokes of the rest of the diagram.
I used OpenCV 3 connected components in the code, but you can do the same thing using findContours (see the sketch after the code).
The code:
import cv2
import numpy as np
# Params
maxArea = 150
minArea = 10
# Read image
I = cv2.imread('i.jpg')
# Convert to gray
Igray = cv2.cvtColor(I, cv2.COLOR_BGR2GRAY)  # imread loads BGR, not RGB
# Threshold
ret, Ithresh = cv2.threshold(Igray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Keep only small components but not to small
comp = cv2.connectedComponentsWithStats(Ithresh)
labels = comp[1]
labelStats = comp[2]
labelAreas = labelStats[:,4]
for compLabel in range(1, comp[0], 1):
    if labelAreas[compLabel] > maxArea or labelAreas[compLabel] < minArea:
        labels[labels == compLabel] = 0
labels[labels>0] = 1
# Do dilation
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(25,25))
IdilateText = cv2.morphologyEx(labels.astype(np.uint8),cv2.MORPH_DILATE,se)
# Find connected component again
comp = cv2.connectedComponentsWithStats(IdilateText)
# Draw a rectangle around the text
labels = comp[1]
labelStats = comp[2]
#labelAreas = labelStats[:,4]
for compLabel in range(1, comp[0], 1):
    x, y = labelStats[compLabel, 0], labelStats[compLabel, 1]
    w, h = labelStats[compLabel, 2], labelStats[compLabel, 3]
    cv2.rectangle(I, (x, y), (x + w, y + h), (0, 0, 255), 2)
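As mentioned above, the same size-based filtering can be done with findContours instead of connected components. A sketch, reusing Ithresh, minArea and maxArea from the code above (note that cv2.contourArea is not exactly the same measure as a component's pixel count, so the thresholds may need adjusting; OpenCV 3 findContours returns three values instead of two):
contours, _ = cv2.findContours(Ithresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
letter_mask = np.zeros_like(Ithresh)
for cnt in contours:
    if minArea < cv2.contourArea(cnt) < maxArea:
        cv2.drawContours(letter_mask, [cnt], 0, 255, -1)
# letter_mask now plays the role of labels before the dilation step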