I am trying to use DeepLab v3 to detect object and mask where the actual object is.
DeepLab model produces a resized_im(3D) and a mask seg_map (2D) of 0 and non-zero values, 0 means it's the background.
Currently, it is only possible to plot an image with an overlay mask on the object. I want to crop the object out of the resized_im with transparent background. Is there any advice for the work?
You can play around with the notebook here:
https://colab.research.google.com/drive/138dTpcYfne40hqrb13n_36okSGYhrJnz?usp=sharing&hl=en#scrollTo=p47cYGGOQE1W&forceEdit=true&sandboxMode=true
I also tried here: How to crop image based on binary mask but none seems to work on my case
You just need to convert your segmentation mask to boolean numpy array, then multiply image by it. Don't forget that your image has 3 channels while mask has only 1. It may look something like that:
# seg_map - segmentation mask from network, resized_im - your input image
mask = np.greater(seg_map, 0) # get only non-zero positive pixels/labels
mask = np.expand_dims(mask, axis=-1) # (H, W) -> (H, W, 1)
mask = np.concatenate((mask, mask, mask), axis=-1) # (H, W, 1) -> (H, W, 3), (don't like it, so if you know how to do it better, please let me know)
crops = resized_im * mask # apply mask on image
You can use different logical numpy function if you want to choose certain labels, for example:
mask = np.equal(seg_map, 5) # to get only objects with label 5
Related
I'm currently trying to apply an activation heatmap to a photo.
Currently, I have the original photo, as well as a mask of probabilities. I multiply the probabilities by 255 and then round down to the nearest integer. I'm then using cv2.applyColorMap with COLORMAP.JET to apply the colormap to the image with an opacity of 25%.
img_cv2 = cv2.cvtColor(np_img, cv2.COLOR_RGB2BGR)
heatmapshow = np.uint8(np.floor(mask * 255))
colormap = cv2.COLORMAP_JET
heatmapshow = cv2.applyColorMap(np.uint8(heatmapshow - 255), colormap)
heatmap_opacity = 0.25
image_opacity = 1.0 - heatmap_opacity
heatmap_arr = cv2.addWeighted(heatmapshow, heatmap_opacity, img_cv2, image_opacity, 0)
This current code successfully produces a heatmap. However, I'd like to be able to make two changes.
Keep the opacity at 25% For all values above a certain threshold (Likely > 0, but I'd prefer more flexibility), but then when the mask is below that threshold, reduce the opacity to 0% for those cells. In other words, if there is very little activation, I want to preserve the color of the original image.
If possible I'd also like to be able to specify a custom colormap, since the native ones are pretty limited, though I might be able to get away without this if I can do the custom opacity thing.
I read on Stackoverflow that you can possibly trick cv2 into not overlaying any color with NaN values, but also read that only works for floats and not ints, which complicates things since I'm using int8. I'm also concerned that this functionality could change in the future as I don't believe this is intentional design purposefully built into cv2.
Does anyone have a good way of accomplishing these goals? Thanks!
With regard to your second question:
Here is how to create a simple custom two color gradient color map in Python/OpenCV.
Input:
import cv2
import numpy as np
# load image as grayscale
img = cv2.imread('lena_gray.png', cv2.IMREAD_GRAYSCALE)
# convert to 3 equal channels
img = cv2.merge((img, img, img))
# create 1 pixel red image
red = np.full((1, 1, 3), (0,0,255), np.uint8)
# create 1 pixel blue image
blue = np.full((1, 1, 3), (255,0,0), np.uint8)
# append the two images
lut = np.concatenate((red, blue), axis=0)
# resize lut to 256 values
lut = cv2.resize(lut, (1,256), interpolation=cv2.INTER_LINEAR)
# apply lut
result = cv2.LUT(img, lut)
# save result
cv2.imwrite('lena_red_blue_lut_mapped.png', result)
# display result
cv2.imshow('RESULT', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result of colormap applied to image:
With regard to your first question:
You are blending the heat map image with the original image using a constant "opacity" value. You can replace the single opacity value with an image. You just have to do the addWeighted manually as heatmap * opacity_img + original * (1-opacity_img) where your opacity image is float in the range 0 to 1. Then clip and convert back to uint8. If your opacity image is binary, then you can use cv2.bitWiseAnd() in place of multiply.
By reading a few answers on stackoverflow, I've learned this much so far:
The mask has to be a numpy array (which has the same shape as the image) with data type CV_8UC1 and have values from 0 to 255.
What is the meaning of these numbers, though? Is it that any pixels with a corresponding mask value of zero will be ignored in the detection process and any pixels with a mask value of 255 will be used? What about the values in between?
Also, how do I initialize a numpy array with data type CV_8UC1 in python? Can I just use dtype=cv2.CV_8UC1
Here is the code I am using currently, based on the assumptions I'm making above. But the issue is that I don't get any keypoints when I run detectAndCompute for either image. I have a feeling it might be because the mask isn't the correct data type. If I'm right about that, how do I correct it?
# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)
# initialize feature detector
detector = cv2.ORB_create()
# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0))
curr_cond = self.base[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)
# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)
print("base keys: ", base_keys)
# []
print("curr keys: ", curr_keys)
# []
So here is most, if not all, of the answer:
What is the meaning of those numbers
0 means to ignore the pixel and 255 means to use it. I'm still unclear on the values in between, but I don't think all nonzero values are considered "equivalent" to 255 in the mask. See here.
Also, how do I initialize a numpy array with data type CV_8UC1 in python?
The type CV_8U is the unsigned 8-bit integer, which, using numpy, is numpy.uint8. The C1 postfix means that the array is 1-channel, instead of 3-channel for color images and 4-channel for rgba images. So, to create a 1-channel array of unsigned 8-bit integers:
import numpy as np
np.zeros((480, 720), dtype=np.uint8)
(a three-channel array would have shape (480, 720, 3), four-channel (480, 720, 4), etc.) This mask would cause the detector and extractor to ignore the entire image, though, since it's all zeros.
how do I correct [the code]?
There were two separate issues, each separately causing each keypoint array to be empty.
First, I forgot to set the type for the base_mask
base_mask = np.array(np.where(base_cond, 255, 0)) # wrong
base_mask = np.array(np.where(base_cond, 255, 0), dtype=uint8) # right
Second, I used the wrong image to generate my curr_cond array:
curr_cond = self.base[:,:,3] == 255 # wrong
curr_cond = self.curr[:,:,3] == 255 # right
Some pretty dumb mistakes.
Here is the full corrected code:
# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)
# initialize feature detector
detector = cv2.ORB_create()
# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0), dtype=np.uint8)
curr_cond = self.curr[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)
# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)
TL;DR: The mask parameter is a 1-channel numpy array with the same shape as the grayscale image in which you are trying to find features (if image shape is (480, 720), so is mask).
The values in the array are of type np.uint8, 255 means "use this pixel" and 0 means "don't"
Thanks to Dan Mašek for leading me to parts of this answer.
I'm trying to paint part of an image as black and white using OpenCV2 and Python3. This is the code I'm trying:
(x, y, w, h) = cv2.boundingRect(c)
cv2.rectangle(frame, (x,y), (x+w,y+h),0,0)
sub_face = frame[y:y+h, x:x+w]
# apply a gaussian blur on this new recangle image
# sub_face = cv2.GaussianBlur(sub_face,(9, 9), 30, borderType = 0)
sub_face = cv2.cvtColor(sub_face, cv2.COLOR_BGR2GRAY)
# merge this blurry rectangle to our final image
result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1]] = sub_face
When I apply the GaussianBlur method, it works properly, but when I try the cvtColor method it fails with a message (on the last line): could not broadcast input array from shape (268,182) into shape (268,182,3). What am I doing wrong?
The c variable in first line is a contour (from motion detection).
I'm new into Python and OpenCV.
Thanks!
You are trying to assign a single channel that results from your cv2.cvtColor call to three channels at once as result_frame is a RGB / three channel image. You are probably wanting to assign the single channel to all three channels. One way to do this cleanly is to exploit NumPy broadcasting by creating a singleton channel in the third dimension, then broadcasting the result over all channels. Since you are using the cv2 interface to OpenCV, the native datatype used for manipulating images is a NumPy array:
# merge this blurry rectangle to our final image
result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1]] = sub_face[:,:,None]
The : operation in this context accesses all values in a particular dimension. In this case, we want the first and second dimensions. Therefore, sub_face[:,:,None] will make your single channel image 3D with the third dimension being a singleton (i.e. 1). Using NumPy broadcasting will then broadcast this single channel image to all channels simultaneously.
Note that I didn't have to explicitly access the third dimension when assigning to result_frame. That is because result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1]] and result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1],:] are the same thing as dropping indexing after the last dimension you specify implicitly assumes :.
You converted sub_face to a single channel image, but result_frame is a 3 channel image.
In the last line you are trying to assign a single channel array to a 3 channel slice.
You could do this:
result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1], 0] = sub_face
result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1], 1] = sub_face
result_frame[y:y+sub_face.shape[0], x:x+sub_face.shape[1], 2] = sub_face
I am trying to extract blue colour of an input image. For that I create a blue HSV colour boundary and threshold HSV image by using the command
mask_img = cv2.inRange(hsv, lower_blue, upper_blue)
After that I used a bitwise_and on the input image and the threshold image by using
res = cv2.bitwise_and(img, img, mask = mask_img)
Where img is the input image. I got this code from opencv. But I didn't understand why are three arguments used in bitwise_and and what actually each arguments mean? Why the same image is used at src1 and src2 ?
And also what is the use of mask keyword here? Please help me to find out the answer
The basic concept behind this is the value of color black ,it's value is 0 in OPEN_CV.So black + anycolor= anycolor because value of black is 0.
Now suppose we have two images one is named img1 and other is img2.
img2 contains a logo which we want to place on the img1. We create threshold and then the mask and mask_inv of img2,and also create roi of img1.
Now we have to do two things to add the logo of img2 on img1.
We create background of roi as img1_bg with help of : mask_inv,mask_inv will have two region one black and one white, in the white region we will put img1 part and leave black as it is-
img1_bg = cv2.bitwise_and(roi,roi,mask = mask_inv)
In your question you have used directly the mask of the img created
res = cv2.bitwise_and(img,img,mask = mask_img)
and in img2 we need to create the logo as foreground of roi ,
img2_fg = cv2.bitwise_and(img2,img2,mask = mask)
here we have used mask layer , the logo part of img2 gets filled in the white part of mask
Now when we add both we get a perfect combined roi
For full description and understanding visit:
OPEN CV CODE FILES AND FULL DESCRIPTION
The operation of "And" will be performed only if mask[i] doesn't equal zero, else the the result of and operation will be zero. The mask should be either white or black image with single channel. you can see this link
http://docs.opencv.org/2.4.13.2/modules/core/doc/operations_on_arrays.html?highlight=bitwise#bitwise-and
what is actually each arguments mean?
res = cv2.bitwise_and(img,img,mask = mask_img)
src1: the first image (the first object for merging)
src2: the second image (the second object for merging)
mask: understood as rules to merge. If region of image (which is gray-scaled, and then masked) has black color (valued as 0), then it is not combined (merging region of the first image with that of the second one), vice versa, it will be carried out. In your code, referenced image is "mask_img".
In my case, my code is correct, when it makes white + anycolor = anycolor
import cv2
import numpy as np
# Load two images
img1 = cv2.imread('bongSung.jpg')
img2 = cv2.imread('opencv.jpg')
# I want to put logo on top-left corner, so I create a ROI
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols]
# NOw we need to create a mask of the logo, mask is conversion to grayscale of an image
img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 220, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('mask', mask)
mask_inv = cv2.bitwise_not(mask)
#cv2.imshow("mask_inv", mask_inv)
#When using bitwise_and() in opencv with python then white + anycolor = anycolor; black + anycolor = black
img1_bg = cv2.bitwise_and(roi,roi,mask = mask_inv)
#cv2.imshow("img1_bg", img1_bg)
cv2.imshow("img2", img2)
img2_fg = cv2.bitwise_and(img2,img2,mask = mask)
cv2.imshow('img2_fg', img2_fg)
dst = cv2.add(img1_bg,img2_fg)
img1[0:rows, 0:cols] = dst
#cv2.imshow("Image", img1)
cv2.waitKey(0)
cv2.destroyAllWindows()
From above answers we may know the definitions of the parameters of bitwise_and(), but they all do not answer the other question
Why the same image is used at src1 and src2 ?
This question should be caused by the too simplified function definition in the document of OpenCV, it may be ambiguous to some people, in the document the bitwise_and() is defined as
dst(I)=sur1(I) ^ sur2(I), if mask(I) != 0, where ^ represents the 'and' operator
from this definition at first sight I cannot get the picture about how to process the dst(I) when the mask(I) is 0.
From the test result, I think that it should give a more clear function definition as
dst(I)=sur1(I) ^sur2(I), if mask(I) != 0,
otherwise the dst(I) keep its original value and the default value of all elements of the dst array is 0.
Now we may know that using the same image for sur1 and sur2, it will only keep the original image parts in the area of mask(I) !=0 and the other area shows the part of the dst image (as the mask shape)
Additionally for other bitwise operations the definitions should be the same as above, they also need to add the otherwise condition and the default value description of the dst array
The below link explains clearly the bitwise operation and also the significance of each parameters.
http://opencvexamples.blogspot.com/2013/10/bitwise-and-or-xor-and-not.html
void bitwise_and(InputArray src1, InputArray src2, OutputArray dst, InputArray mask=noArray())
Calculates the per-element bit-wise conjunction of two arrays or an array and a scalar.
Parameters:
src1 – first input array or a scalar.
src2 – second input array or a scalar.
src – single input array.
value – scalar value.
dst – output array that has the same size and type as the input arrays.
mask – optional operation mask, 8-bit single channel array, that specifies elements of the output array to be changed
Regarding using img twice, my guess is that we don't really care what img[i] and img[i] is, as it's just img[i] for binary. What matters is that, as mentioned by Mohammed Awney, when the mask is 0, we make img[i] be 0, otherwise we leave the pixel alone. This is a way to make certain pixels in img black, according to our mask.
bitwise_and ( InputArray src1,
InputArray src2,
OutputArray dst,
InputArray mask = noArray()
)
src1 first input array or a scalar.
src2 second input array or a scalar.
dst output array that has the same size and type as the input arrays.
mask optional operation mask, 8-bit single channel array, that specifies elements of the output array to be changed.
dst(I)=src1(I)∧src2(I)if mask(I)≠0
and mask operate on dst
computes bitwise conjunction of the two arrays (dst = src1 & src2) Calculates the per-element bit-wise conjunction of two arrays or an array and a scalar.
I need to calculate histogram on only one part of on my image, but this part has circular shape (like disc). I create mask to find that part on image
cv2.rectangle(mask,(0, 0), (width, height), (0,0,0), -1)
cv2.circle(mask,(int(avgkrug[0]),int(avgkrug[1])),radijusp2,(255,255,255),-1)
cv2.circle(mask,(int(avgkrug[0]),int(avgkrug[1])),radijusp1,(0,0,0),-1)
Using code above, I found my "disc-shape" region of interest.
Now I'm trying to calculate histogram :
for ch, col in enumerate(color):
hist_item = cv2.calcHist([img],[ch],mask,[256],[0,255])
...
but got this error
error: (-215) !mask.data || mask.type() == CV_8UC1 in function cv::calcHist
However, if I save mask on dics and read it using cv2.imread() then this error doesn't appear.
I also tried this use this line
hist_item = cv2.calcHist([slika],[ch],mask.astype(np.uint8),[256],[0,255])
How can I use mask that I create to calc histogram, so I don't need to w/r from disc?
The mask you create needs to be uint8 type , so when creating the mask make it uint8, and then pass it to compute the histogram.
mask = np.zeros(image.shape[:2], dtype="uint8")
and now compute histogram by passing original image, and the curresponding mask.
hist_item = cv2.calcHist([image],[ch],mask,[256],[0,255])