Remove background of the image using opencv Python

Remove background of the image using opencv Python - python

I have two images, one with only background and the other with background + detectable object (in my case its a car). Below are the images
I am trying to remove the background such that I only have car in the resulting image. Following is the code that with which I am trying to get the desired results
import numpy as np
import cv2
original_image = cv2.imread('IMG1.jpg', cv2.IMREAD_COLOR)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
background_image = cv2.imread('IMG2.jpg', cv2.IMREAD_COLOR)
gray_background = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
foreground = np.absolute(gray_original - gray_background)
foreground[foreground > 0] = 255
cv2.imshow('Original Image', foreground)
cv2.waitKey(0)
The resulting image by subtracting the two images is
Here is the problem. The expected resulting image should be a car only.
Also, If you take a deep look in the two images, you'll see that they are not exactly same that is, the camera moved a little so background had been disturbed a little. My question is that with these two images how can I subtract the background. I do not want to use grabCut or backgroundSubtractorMOG algorithm right now because I do not know right now whats going on inside those algorithms.
What I am trying to do is to get the following resulting image
Also if possible, please guide me with a general way of doing this not only in this specific case that is, I have a background in one image and background+object in the second image. What could be the best possible way of doing this. Sorry for such a long question.

I solved your problem using the OpenCV's watershed algorithm. You can find the theory and examples of watershed here.
First I selected several points (markers) to dictate where is the object I want to keep, and where is the background. This step is manual, and can vary a lot from image to image. Also, it requires some repetition until you get the desired result. I suggest using a tool to get the pixel coordinates.
Then I created an empty integer array of zeros, with the size of the car image. And then I assigned some values (1:background, [255,192,128,64]:car_parts) to pixels at marker positions.
NOTE: When I downloaded your image I had to crop it to get the one with the car. After cropping, the image has size of 400x601. This may not be what the size of the image you have, so the markers will be off.
Afterwards I used the watershed algorithm. The 1st input is your image and 2nd input is the marker image (zero everywhere except at marker positions). The result is shown in the image below.
I set all pixels with value greater than 1 to 255 (the car), and the rest (background) to zero. Then I dilated the obtained image with a 3x3 kernel to avoid losing information on the outline of the car. Finally, I used the dilated image as a mask for the original image, using the cv2.bitwise_and() function, and the result lies in the following image:
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image
img = cv2.imread("/path/to/image.png", 3)
# Create a blank image of zeros (same dimension as img)
# It should be grayscale (1 color channel)
marker = np.zeros_like(img[:,:,0]).astype(np.int32)
# This step is manual. The goal is to find the points
# which create the result we want. I suggest using a
# tool to get the pixel coordinates.
# Dictate the background and set the markers to 1
marker[204][95] = 1
marker[240][137] = 1
marker[245][444] = 1
marker[260][427] = 1
marker[257][378] = 1
marker[217][466] = 1
# Dictate the area of interest
# I used different values for each part of the car (for visibility)
marker[235][370] = 255 # car body
marker[135][294] = 64 # rooftop
marker[190][454] = 64 # rear light
marker[167][458] = 64 # rear wing
marker[205][103] = 128 # front bumper
# rear bumper
marker[225][456] = 128
marker[224][461] = 128
marker[216][461] = 128
# front wheel
marker[225][189] = 192
marker[240][147] = 192
# rear wheel
marker[258][409] = 192
marker[257][391] = 192
marker[254][421] = 192
# Now we have set the markers, we use the watershed
# algorithm to generate a marked image
marked = cv2.watershed(img, marker)
# Plot this one. If it does what we want, proceed;
# otherwise edit your markers and repeat
plt.imshow(marked, cmap='gray')
plt.show()
# Make the background black, and what we want to keep white
marked[marked == 1] = 0
marked[marked > 1] = 255
# Use a kernel to dilate the image, to not lose any detail on the outline
# I used a kernel of 3x3 pixels
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(marked.astype(np.float32), kernel, iterations = 1)
# Plot again to check whether the dilation is according to our needs
# If not, repeat by using a smaller/bigger kernel, or more/less iterations
plt.imshow(dilation, cmap='gray')
plt.show()
# Now apply the mask we created on the initial image
final_img = cv2.bitwise_and(img, img, mask=dilation.astype(np.uint8))
# cv2.imread reads the image as BGR, but matplotlib uses RGB
# BGR to RGB so we can plot the image with accurate colors
b, g, r = cv2.split(final_img)
final_img = cv2.merge([r, g, b])
# Plot the final result
plt.imshow(final_img)
plt.show()
If you have a lot of images you will probably need to create a tool to annotate the markers graphically, or even an algorithm to find markers automatically.

The problem is that you're subtracting arrays of unsigned 8 bit integers. This operation can overflow.
To demonstrate
>>> import numpy as np
>>> a = np.array([[10,10]],dtype=np.uint8)
>>> b = np.array([[11,11]],dtype=np.uint8)
>>> a - b
array([[255, 255]], dtype=uint8)
Since you're using OpenCV, the simplest way to achieve your goal is to use cv2.absdiff().
>>> cv2.absdiff(a,b)
array([[1, 1]], dtype=uint8)

I recommend using OpenCV's grabcut algorithm. You first draw a few lines on the foreground and background, and keep doing this until your foreground is sufficiently separated from the background. It is covered here: https://docs.opencv.org/trunk/d8/d83/tutorial_py_grabcut.html
as well as in this video: https://www.youtube.com/watch?v=kAwxLTDDAwU

Related

How set white color background when concatenating images via python?

import numpy as np
from imageio import imread, imwrite
im1 = imread('https://api.sofascore.app/api/v1/team/2697/image')[...,:3]
im2 = imread('https://api.sofascore.app/api/v1/team/2692/image')[...,:3]
result = np.hstack((im1,im2))
imwrite('result.jpg', result)
Original images opening directly from the url's (I'm trying to concatenate the two images into one and keep the background white):
As can be seen both have no background, but when joining the two via Python, the defined background becomes this moss green:
I tried modifying the color reception:
im1 = imread('https://api.sofascore.app/api/v1/team/2697/image')[...,:1]
im2 = imread('https://api.sofascore.app/api/v1/team/2692/image')[...,:1]
But the result is a Black & White with the background still looking like it was converted from the previous green, even though the PNG's don't have such a background color.
How should I proceed to solve my need?

There is a 4th channel in your images - transparency. You are discarding that channel with [...,:1]. This is a mistake.
If you retain the alpha channel this will work fine:
import numpy as np
from imageio import imread, imwrite
im1 = imread('https://api.sofascore.app/api/v1/team/2697/image')
im2 = imread('https://api.sofascore.app/api/v1/team/2692/image')
result = np.hstack((im1,im2))
imwrite('result.png', result)
However, if you try to make a jpg, you will have a problem:
>>> imwrite('test.jpg', result)
OSError: JPEG does not support alpha channel.
This is correct, as JPGs do not do transparency. If you would like to use transparency and also have your output be a JPG, I suggest a priest.
You can replace the transparent pixels by using np.where and looking for places that the alpha channel is 0:
result = np.hstack((im1,im2))
result[np.where(result[...,3] == 0)] = [255, 255, 255, 255]
imwrite('result.png', result)

If you want to improve image quality, here is a solution. #Brondy
# External libraries used for
# Image IO
from PIL import Image
# Morphological filtering
from skimage.morphology import opening
from skimage.morphology import disk
# Data handling
import numpy as np
# Connected component filtering
import cv2
black = 0
white = 255
threshold = 160
# Open input image in grayscale mode and get its pixels.
img = Image.open("image.jpg").convert("LA")
pixels = np.array(img)[:,:,0]
# Remove pixels above threshold
pixels[pixels > threshold] = white
pixels[pixels < threshold] = black
# Morphological opening
blobSize = 1 # Select the maximum radius of the blobs you would like to remove
structureElement = disk(blobSize) # you can define different shapes, here we take a disk shape
# We need to invert the image such that black is background and white foreground to perform the opening
pixels = np.invert(opening(np.invert(pixels), structureElement))
# Create and save new image.
newImg = Image.fromarray(pixels).convert('RGB')
newImg.save("newImage1.PNG")
# Find the connected components (black objects in your image)
# Because the function searches for white connected components on a black background, we need to invert the image
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(np.invert(pixels), connectivity=8)
# For every connected component in your image, you can obtain the number of pixels from the stats variable in the last
# column. We remove the first entry from sizes, because this is the entry of the background connected component
sizes = stats[1:,-1]
nb_components -= 1
# Define the minimum size (number of pixels) a component should consist of
minimum_size = 100
# Create a new image
newPixels = np.ones(pixels.shape)*255
# Iterate over all components in the image, only keep the components larger than minimum size
for i in range(1, nb_components):
if sizes[i] > minimum_size:
newPixels[output == i+1] = 0
# Create and save new image.
newImg = Image.fromarray(newPixels).convert('RGB')
newImg.save("new_img.PNG")

If you want to change the background of a Image, pixellib is the best solution because it seemed the most reasonable and easy library to use.
import pixellib
from pixellib.tune_bg import alter_bg
change_bg = alter_bg()
change_bg.load_pascalvoc_model("deeplabv3_xception_tf_dim_ordering_tf_kernels.h5")
change_bg.color_bg("sample.png", colors=(255,255,255), output_image_name="colored_bg.png")
This code requires pixellib to be higher or the same as 0.6.1

How To Get The Pixel Count Of A Segmented Area in an Image I used Vgg16 for Segmentation

I am new to deep learning but have succeeded in semantic segmentation of the image I am trying to get the pixel count of each class in the label. As an example in the image I want to get the pixel count of the carpet, or the chandelier or the light stand. How do I go about? Thanks any suggestions will help.

Edit: In what format the regions are returned? Do you have only the final image or the regions are given as contours? If you have them as contours (list of coordinates), you can apply findContourArea directly on that structure.
If you can receive/sample the regions one by one in an image (but do not have the contour), you can sequentially paint each of the colors/classes in a clear image, either convert it to grayscale or directly paint it in grayscale or binary, or binarize with threshold; then numberPixels = len(cv2.findNonZero(bwImage)). cv2.findContour and cv2.contourArea should do the same.
Instead of rendering each class in a separate image, if your program receives only the final segmentation and not per-class contours, you can filter/mask the regions by color ranges on that image. I built that and it seemed to do the job, 14861 pixels for the pink carpet:
import cv2
import numpy as np
# rgb 229, 0, 178 # the purple carpet in RGB (sampled with IrfanView)
# b,g,r = 178, 0, 229 # cv2 uses BGR
class_color = [178, 0, 229]
multiclassImage = cv2.imread("segmented.png")
cv2.imshow("MULTI", multiclassImage)
filteredImage = multiclassImage.copy()
low = np.array(class_color);
mask = cv2.inRange(filteredImage, low, low)
filteredImage[mask == 0] = [0, 0, 0]
filteredImage[mask != 0] = [255,255,255]
cv2.imshow("FILTER", filteredImage)
# numberPixelsFancier = len(cv2.findNonZero(filteredImage[...,0]))
# That also works and returns 14861 - without conversion, taking one color channel
bwImage = cv2.cvtColor(filteredImage, cv2.COLOR_BGR2GRAY)
cv2.imshow("BW", bwImage)
numberPixels = len(cv2.findNonZero(bwImage))
print(numberPixels)
cv2.waitKey(0)
If you don't have the values of the colors given or/and can't control them, you can use numpy.unique(): https://numpy.org/doc/stable/reference/generated/numpy.unique.html and it will return the unique colors, then they could be applied in the algorithm above.
Edit 2: BTW, another way to compute or verify such counts is by calculating histograms. That's with IrfanView on the black-white image:

Set the values below a certain threshold of a CV2 Colormap to transparent

I'm currently trying to apply an activation heatmap to a photo.
Currently, I have the original photo, as well as a mask of probabilities. I multiply the probabilities by 255 and then round down to the nearest integer. I'm then using cv2.applyColorMap with COLORMAP.JET to apply the colormap to the image with an opacity of 25%.
img_cv2 = cv2.cvtColor(np_img, cv2.COLOR_RGB2BGR)
heatmapshow = np.uint8(np.floor(mask * 255))
colormap = cv2.COLORMAP_JET
heatmapshow = cv2.applyColorMap(np.uint8(heatmapshow - 255), colormap)
heatmap_opacity = 0.25
image_opacity = 1.0 - heatmap_opacity
heatmap_arr = cv2.addWeighted(heatmapshow, heatmap_opacity, img_cv2, image_opacity, 0)
This current code successfully produces a heatmap. However, I'd like to be able to make two changes.
Keep the opacity at 25% For all values above a certain threshold (Likely > 0, but I'd prefer more flexibility), but then when the mask is below that threshold, reduce the opacity to 0% for those cells. In other words, if there is very little activation, I want to preserve the color of the original image.
If possible I'd also like to be able to specify a custom colormap, since the native ones are pretty limited, though I might be able to get away without this if I can do the custom opacity thing.
I read on Stackoverflow that you can possibly trick cv2 into not overlaying any color with NaN values, but also read that only works for floats and not ints, which complicates things since I'm using int8. I'm also concerned that this functionality could change in the future as I don't believe this is intentional design purposefully built into cv2.
Does anyone have a good way of accomplishing these goals? Thanks!

With regard to your second question:
Here is how to create a simple custom two color gradient color map in Python/OpenCV.
Input:
import cv2
import numpy as np
# load image as grayscale
img = cv2.imread('lena_gray.png', cv2.IMREAD_GRAYSCALE)
# convert to 3 equal channels
img = cv2.merge((img, img, img))
# create 1 pixel red image
red = np.full((1, 1, 3), (0,0,255), np.uint8)
# create 1 pixel blue image
blue = np.full((1, 1, 3), (255,0,0), np.uint8)
# append the two images
lut = np.concatenate((red, blue), axis=0)
# resize lut to 256 values
lut = cv2.resize(lut, (1,256), interpolation=cv2.INTER_LINEAR)
# apply lut
result = cv2.LUT(img, lut)
# save result
cv2.imwrite('lena_red_blue_lut_mapped.png', result)
# display result
cv2.imshow('RESULT', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result of colormap applied to image:
With regard to your first question:
You are blending the heat map image with the original image using a constant "opacity" value. You can replace the single opacity value with an image. You just have to do the addWeighted manually as heatmap * opacity_img + original * (1-opacity_img) where your opacity image is float in the range 0 to 1. Then clip and convert back to uint8. If your opacity image is binary, then you can use cv2.bitWiseAnd() in place of multiply.

denoising binary image in python

For my project i'm trying to binarize an image with openCV in python. I used the adaptive gaussian thresholding from openCV to convert the image with the following result:
I want to use the binary image for OCR but it's too noisy. Is there any way to remove the noise from the binary image in python? I already tried fastNlMeansDenoising from openCV but it doesn't make a difference.
P.S better options for binarization are welcome as well

You should start by adjusting the parameters to the adaptive threshold so it uses a larger area. That way it won't be segmenting out noise. Whenever your output image has more noise than the input image, you know you're doing something wrong.
I suggest as an adaptive threshold to use a closing (on the input grey-value image) with a structuring element just large enough to remove all the text. The difference between this result and the input image is exactly all the text. You can then apply a regular threshold to this difference.

It is also possible using GraphCuts for this kind of task. You will need to install the maxflow library in order to run the code. I quickly copied the code from their tutorial and modified it, so you could run it more easily. Just play around with the smoothing parameter to increase or decrease the denoising of the image.
import cv2
import numpy as np
import matplotlib.pyplot as plt
import maxflow
# Important parameter
# Higher values means making the image smoother
smoothing = 110
# Load the image and convert it to grayscale image
image_path = 'your_image.png'
img = cv2.imread('image_path')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = 255 * (img > 128).astype(np.uint8)
# Create the graph.
g = maxflow.Graph[int]()
# Add the nodes. nodeids has the identifiers of the nodes in the grid.
nodeids = g.add_grid_nodes(img.shape)
# Add non-terminal edges with the same capacity.
g.add_grid_edges(nodeids, smoothing)
# Add the terminal edges. The image pixels are the capacities
# of the edges from the source node. The inverted image pixels
# are the capacities of the edges to the sink node.
g.add_grid_tedges(nodeids, img, 255-img)
# Find the maximum flow.
g.maxflow()
# Get the segments of the nodes in the grid.
sgm = g.get_grid_segments(nodeids)
# The labels should be 1 where sgm is False and 0 otherwise.
img_denoised = np.logical_not(sgm).astype(np.uint8) * 255
# Show the result.
plt.subplot(121)
plt.imshow(img, cmap='gray')
plt.title('Binary image')
plt.subplot(122)
plt.title('Denoised binary image')
plt.imshow(img_denoised, cmap='gray')
plt.show()
# Save denoised image
cv2.imwrite('img_denoised.png', img_denoised)
Result

You could try the morphological transformation close to remove small "holes".
First define a kernel using numpy, you might need to play around with the size. Choose the size of the kernel as big as your noise.
kernel = np.ones((5,5),np.uint8)
Then run the morphologyEx using the kernel.
denoised = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)
If text gets removed you can try to erode the image, this will "grow" the black pixels. If the noise is as big as the data, this method will not help.
erosion = cv2.erode(img,kernel,iterations = 1)

Remove features from binarized image

I wrote a little script to transform pictures of chalkboards into a form that I can print off and mark up.
I take an image like this:
Auto-crop it, and binarize it. Here's the output of the script:
I would like to remove the largest connected black regions from the image. Is there a simple way to do this?
I was thinking of eroding the image to eliminate the text and then subtracting the eroded image from the original binarized image, but I can't help thinking that there's a more appropriate method.

Sure you can just get connected components (of certain size) with findContours or floodFill, and erase them leaving some smear. However, if you like to do it right you would think about why do you have the black area in the first place.
You did not use adaptive thresholding (locally adaptive) and this made your output sensitive to shading. Try not to get the black region in the first place by running something like this:
Mat img = imread("desk.jpg", 0);
Mat img2, dst;
pyrDown(img, img2);
adaptiveThreshold(255-img2, dst, 255, ADAPTIVE_THRESH_MEAN_C,
THRESH_BINARY, 9, 10); imwrite("adaptiveT.png", dst);
imshow("dst", dst);
waitKey(-1);
In the future, you may read something about adaptive thresholds and how to sample colors locally. I personally found it useful to sample binary colors orthogonally to the image gradient (that is on the both sides of it). This way the samples of white and black are of equal size which is a big deal since typically there are more background color which biases estimation. Using SWT and MSER may give you even more ideas about text segmentation.

I tried this:
import numpy as np
import cv2
im = cv2.imread('image.png')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
grayout = 255*np.ones((im.shape[0],im.shape[1],1), np.uint8)
blur = cv2.GaussianBlur(gray,(5,5),1)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
wcnt = 0
for item in contours:
area =cv2.contourArea(item)
print wcnt,area
[x,y,w,h] = cv2.boundingRect(item)
if area>10 and area<200:
roi = gray[y:y+h,x:x+w]
cntd = 0
for i in range(x,x+w):
for j in range(y,y+h):
if gray[j,i]==0:
cntd = cntd + 1
density = cntd/(float(h*w))
if density<0.5:
for i in range(x,x+w):
for j in range(y,y+h):
grayout[j,i] = gray[j,i];
wcnt = wcnt + 1
cv2.imwrite('result.png',grayout)
You have to balance two things, removing the black spots but balance that with not losing the contents of what is on the board. The output I got is this:

Here is a Python numpy implementation (using my own mahotas package) of the method for the top answer (almost the same, I think):
import mahotas as mh
import numpy as np
Imported mahotas & numpy with standard abbreviations
im = mh.imread('7Esco.jpg', as_grey=1)
Load the image & convert to gray
im2 = im[::2,::2]
im2 = mh.gaussian_filter(im2, 1.4)
Downsample and blur (for speed and noise removal).
im2 = 255 - im2
Invert the image
mean_filtered = mh.convolve(im2.astype(float), np.ones((9,9))/81.)
Mean filtering is implemented "by hand" with a convolution.
imc = im2 > mean_filtered - 4
You might need to adjust the number 4 here, but it worked well for this image.
mh.imsave('binarized.png', (imc*255).astype(np.uint8))
Convert to 8 bits and save in PNG format.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.