I'm trying to apply a threshold to an image, but not a regular simple threshold.
I need to set pixels to black if they satisfy the conditional, and to white if they don't.
I could just loop over the pixels, but on a 1080p image that takes far too long.
I'm using HSV for the comparisons I need to make.
Here is the conditional (this example shows how I would use it inside a loop):
if abs(input_pixel_color.hue - reference.hue) < 2 and input_pixel_color.saturation >= 0.25 and input_pixel_color.brightness >= 0.42:
    set_to_black
else:
    set_to_white
input_pixel_color is the HSV value of the current pixel in the loop.
reference is the value to compare against.
I thought about using numpy, but I really don't know how to write this :/
Thanks in advance
Updated
Now that your actual intended processing has become clearer, you would probably be better served by OpenCV's inRange() function. Like this:
#!/usr/local/bin/python3
import cv2 as cv
import numpy as np
# Load the image and convert to HLS
image = cv.imread("image.jpg")
hls = cv.cvtColor(image,cv.COLOR_BGR2HLS)
# Define lower and upper limits for each component
lo = np.array([50,0,0])
hi = np.array([70,255,255])
# Mask image to only select filtered pixels
mask = cv.inRange(hls,lo,hi)
# Change image to white where we found our colour
image[mask>0]=(255,255,255)
cv.imwrite("result.png",image)
So, if we use this image:
We are selecting Hues in the range 50-70, and making them white:
If you look up "Green" in an online colour converter, you will see that it is Hue=120, but OpenCV divides Hues by 2 so that 360 degrees becomes 180 and still fits in a uint8. So the 60 in the middle of our 50-70 range means 120 in online colour converters.
The ranges OpenCV uses for uint8 images are:
Hue 0..179
Lightness 0..255
Saturation 0..255
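As a small illustration of the halving described above, the following sketch converts a hue in degrees to OpenCV's 8-bit scale. The helper name degrees_to_opencv_hue is made up for this example and is not part of OpenCV.

```python
def degrees_to_opencv_hue(degrees):
    """Map a hue in degrees to OpenCV's 8-bit hue scale (hypothetical helper)."""
    # Wrap into 0..359 first, then halve so the result fits in a uint8.
    return int(degrees % 360) // 2

print(degrees_to_opencv_hue(120))  # green: 60
print(degrees_to_opencv_hue(100), degrees_to_opencv_hue(140))  # 50 70
```

So a range of 100 to 140 degrees in a colour picker corresponds to the 50 to 70 limits used in the code above.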
As I said before, you should get in the habit of looking at your data types, shapes and ranges in your debugger. To see the shape, dtype, and maximum Hue, Lightness and Saturation, use:
print(hls.dtype, hls.shape)
print(hls[...,0].max())
print(hls[...,1].max())
print(hls[...,2].max())
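For the exact conditional in the question, a NumPy-only sketch could look like the following. It assumes the HSV array already holds hue in degrees and saturation and value in the range 0 to 1, as the thresholds 0.25 and 0.42 in the question suggest; it does not use OpenCV's uint8 scaling. The function name threshold_hsv and its parameters are illustrative, and hue wraparound near 0/360 is not handled, just as in the original conditional.

```python
import numpy as np

def threshold_hsv(hsv, ref_hue, hue_tol=2.0, min_sat=0.25, min_val=0.42):
    """Return a uint8 image: 0 (black) where the condition holds, 255 elsewhere."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    # Same test as the per-pixel conditional, evaluated for all pixels at once.
    match = (np.abs(h - ref_hue) < hue_tol) & (s >= min_sat) & (v >= min_val)
    return np.where(match, 0, 255).astype(np.uint8)

# Tiny 2x2 test image: hue in degrees, saturation and value in 0..1.
hsv = np.array([[[120.0, 0.5, 0.5], [121.5, 0.3, 0.9]],
                [[120.0, 0.1, 0.5], [200.0, 0.9, 0.9]]])
out = threshold_hsv(hsv, ref_hue=120.0)
print(out)
```

The top row satisfies all three conditions and becomes black; the bottom row fails on saturation and hue respectively and becomes white.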
Original Answer
There are several ways to do that. The most performant is probably with the OpenCV function cv2.inRange() and there are plenty of answers on StackOverflow about that.
Here is a Numpy way. If you read the comments and look at the printed values, you can see how to combine logical AND with logical OR and so on, as well as how to address specific channels.
#!/usr/bin/env python3
import numpy as np
# Generate a repeatable random HSV image
np.random.seed(42)
h, w = 4, 5
HSV = np.random.randint(1,100,(h,w,3),dtype=np.uint8)
print('Initial HSV\n',HSV)
# Create mask of all pixels with acceptable Hue, i.e. H > 50
HueOK = HSV[...,0] > 50
print('HueOK\n',HueOK)
# Create mask of all pixels with acceptable Saturation, i.e. S > 20 AND S < 80
SatOK = np.logical_and(HSV[...,1]>20, HSV[...,1]<80)
print('SatOK\n',SatOK)
# Create mask of all pixels with acceptable value, i.e. V < 20 OR V > 60
ValOK = np.logical_or(HSV[...,2]<20, HSV[...,2]>60)
print('ValOK\n',ValOK)
# Combine masks
combinedMask = HueOK & SatOK & ValOK
print('Combined\n',combinedMask)
# Now, if you just want to set the masked pixels to 255
HSV[combinedMask] = 255
print('Result1\n',HSV)
# Or, if you want to set the masked pixels to one value and the others to another value
HSV = np.where(combinedMask,255,0)
print('Result2\n',HSV)
Sample Output
Initial HSV
[[[93 98 96]
[52 62 76]
[93 4 99]
[15 22 47]
[60 72 85]]
[[26 72 61]
[47 66 26]
[21 45 76]
[25 87 40]
[25 35 83]]
[[66 40 87]
[24 26 75]
[18 95 15]
[75 86 18]
[88 57 62]]
[[94 86 45]
[99 26 19]
[37 24 63]
[69 54 3]
[33 33 39]]]
HueOK
[[ True True True False True]
[False False False False False]
[ True False False True True]
[ True True False True False]]
SatOK
[[False True False True True]
[ True True True False True]
[ True True False False True]
[False True True True True]]
ValOK
[[ True True True False True]
[ True False True False True]
[ True True True True True]
[False True True True False]]
Combined
[[False True False False True]
[False False False False False]
[ True False False False True]
[False True False True False]]
Result1
[[[ 93 98 96]
[255 255 255]
[ 93 4 99]
[ 15 22 47]
[255 255 255]]
[[ 26 72 61]
[ 47 66 26]
[ 21 45 76]
[ 25 87 40]
[ 25 35 83]]
[[255 255 255]
[ 24 26 75]
[ 18 95 15]
[ 75 86 18]
[255 255 255]]
[[ 94 86 45]
[255 255 255]
[ 37 24 63]
[255 255 255]
[ 33 33 39]]]
Result2
[[ 0 255 0 0 255]
[ 0 0 0 0 0]
[255 0 0 0 255]
[ 0 255 0 255 0]]
Notes:
1) You can also access pixels not selected by the mask, using negation:
# All unmasked pixels become 3
HSV[~combinedMask] = 3
2) The ellipsis (...) is just a shortcut meaning "all other dimensions I didn't bother listing", so HSV[...,1] is the same as HSV[:,:,1]
3) If you don't like writing HSV[...,0] for Hue, and HSV[...,1] for Saturation, you can split the channels
H, S, V = cv2.split(HSV)
Then you can just use H instead of HSV[...,0]. When you are finished, if you want to re-assemble the channels back into a 3-channel image, you can do:
HSV = cv2.merge((H,S,V))
or
HSV = np.dstack((H,S,V))
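A quick check of notes 2) and 3), using only NumPy. The array here is random stand-in data, and the channels are split by plain indexing rather than cv2.split so that the sketch runs without OpenCV.

```python
import numpy as np

# Small random 3-channel array standing in for an HSV image.
rng = np.random.default_rng(42)
hsv = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# The ellipsis expands to "all leading axes", so both forms select channel 1.
same_view = np.array_equal(hsv[..., 1], hsv[:, :, 1])
print(same_view)  # True

# Split into channels by indexing, then stack them back together.
h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
rebuilt = np.dstack((h, s, v))
print(rebuilt.shape, np.array_equal(rebuilt, hsv))  # (4, 5, 3) True
```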
Related
I am trying to normalize my images with the following code:
img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)
When I print the image using print(img),
I get the following, as if no normalization had been applied to the image:
[[199 204 205 ... 29 30 34]
[195 200 203 ... 30 30 32]
[190 195 200 ... 35 31 29]
...
[ 7 3 1 ... 16 16 15]
[ 19 13 7 ... 18 18 17]
[ 35 26 19 ... 18 20 19]]
I also tried another approach:
img/255 or img/255.0.
I still see black images, and print(img) gives the following:
[[0.78039216 0.8 0.80392157 ... 0.11372549 0.11764706 0.13333333]
[0.76470588 0.78431373 0.79607843 ... 0.11764706 0.11764706 0.1254902 ]
[0.74509804 0.76470588 0.78431373 ... 0.1372549 0.12156863 0.11372549]
I am confused about why I get black images.
...
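You probably have very small areas with luminosity that is very close to 255. Min-max normalization only stretches the existing minimum and maximum to 0 and 255, so if the image already contains values near both ends, those pixels "halt" the normalization and the output is almost unchanged.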
What you can do is use some kind of thresholding to remove, say, all intensities from 220 to 255 and map them to 220. If you normalize that, the points with intensity 220 will be driven up to 255, but this time the darker values will get amplified too.
However, I think you're likely to get better answers if you describe in more detail what you're trying to accomplish - what the image is, and to what end you want to normalize it.
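The clip-then-normalize idea can be sketched with plain NumPy as below. The cap of 220 comes from the suggestion above; the function name clip_and_stretch is made up for this example, and the min-max stretch is written out by hand rather than calling cv2.normalize.

```python
import numpy as np

def clip_and_stretch(img, cap=220):
    """Cap bright outliers at `cap`, then stretch the result to 0..255."""
    clipped = np.minimum(img, cap).astype(np.float64)
    lo, hi = clipped.min(), clipped.max()
    if hi == lo:
        # Flat image: nothing to stretch.
        return np.zeros_like(img, dtype=np.uint8)
    return np.round((clipped - lo) / (hi - lo) * 255).astype(np.uint8)

img = np.array([[0, 55, 110], [165, 220, 255]], dtype=np.uint8)
out = clip_and_stretch(img)
print(out)
```

Both 220 and 255 end up at 255, and the darker values are amplified because the stretch now uses 220 as the maximum.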
I recently trained an object detection model in TensorFlow, but for some of the images the input tensors are incompatible with the Python signature. This is the code I'm running in Google Colab for inference:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore') # Suppress Matplotlib warnings
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.
    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.
    Args:
      path: the file path to the image
    Returns:
      uint8 numpy array with shape (img_height, img_width, 3)
    """
    return np.array(Image.open(path))
for image_path in img:
    print('Running inference for {}... '.format(image_path), end='')
    image_np=load_image_into_numpy_array(image_path)
    # Things to try:
    # Flip horizontally
    # image_np = np.fliplr(image_np).copy()
    # Convert image to grayscale
    # image_np = np.tile(
    #     np.mean(image_np, 2, keepdims=True), (1, 1, 3)).astype(np.uint8)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor=tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor=input_tensor[tf.newaxis, ...]
    # input_tensor = np.expand_dims(image_np, 0)
    detections=detect_fn(input_tensor)
    # All outputs are batches tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections=int(detections.pop('num_detections'))
    detections={key:value[0,:num_detections].numpy()
                for key,value in detections.items()}
    detections['num_detections']=num_detections
    # detection_classes should be ints.
    detections['detection_classes']=detections['detection_classes'].astype(np.int64)
    image_np_with_detections=image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        detections['detection_boxes'],
        detections['detection_classes'],
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=100, #max number of bounding boxes in the image
        min_score_thresh=.25, #min prediction threshold
        agnostic_mode=False)
    %matplotlib inline
    plt.figure()
    plt.imshow(image_np_with_detections)
    print('Done')
plt.show()
And this is the error message I get when running inference:
Running inference for /content/gdrive/MyDrive/TensorFlow/workspace/training_demo/images/test/image_part_002.png...
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-23-5b465e5474df> in <module>()
40
41 # input_tensor = np.expand_dims(image_np, 0)
---> 42 detections=detect_fn(input_tensor)
43
44 # All outputs are batches tensors.
6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py in _convert_inputs_to_signature(inputs, input_signature, flat_input_signature)
2804 flatten_inputs)):
2805 raise ValueError("Python inputs incompatible with input_signature:\n%s" %
-> 2806 format_error_message(inputs, input_signature))
2807
2808 if need_packing:
ValueError: Python inputs incompatible with input_signature:
inputs: (
tf.Tensor(
[[[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
...
[[ 34 32 34 255]
[ 35 33 35 255]
[ 35 33 35 255]
...
[ 41 38 38 255]
[ 40 37 37 255]
[ 40 37 37 255]]
[[ 36 34 36 255]
[ 35 33 35 255]
[ 36 34 36 255]
...
[ 41 38 38 255]
[ 41 38 38 255]
[ 43 40 40 255]]
[[ 36 34 36 255]
[ 36 34 36 255]
[ 37 35 37 255]
...
[ 41 38 38 255]
[ 40 37 37 255]
[ 39 36 36 255]]]], shape=(1, 1219, 1920, 4), dtype=uint8))
input_signature: (
TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name='input_tensor'))
Does anyone know how I could convert the input tensors of my images so that I can run inference on them? For example, one image where inference works has resolution 400x291, and the image where inference fails has resolution 1920x1219. I used the SSD MobileNet V1 FPN 640x640 model for my training.
The problem in your case is that your input tensor has shape (1, 1219, 1920, 4); more precisely, the trailing 4 is the problem.
The first element, 1, is the batch size (added by input_tensor[tf.newaxis, ...]).
You get that part right; the problem arises where you read the images, because they have 4 channels (presumably RGBA) rather than 3 (typical RGB) or 1 (grayscale), while the model's input signature expects 3.
I recommend that you check your images and force the conversion to RGB, i.e. Image.open(path).convert('RGB')
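If the image is already a NumPy array, the channel count can also be fixed at that level. The sketch below is one possible way to do it and is not part of the TensorFlow or PIL APIs; the helper name ensure_three_channels is hypothetical. Note that it simply discards the alpha channel instead of blending it against a background.

```python
import numpy as np

def ensure_three_channels(image_np):
    """Return an (H, W, 3) array from a grayscale, RGB or RGBA array."""
    if image_np.ndim == 2:
        # Grayscale: repeat the single channel three times.
        return np.stack([image_np] * 3, axis=-1)
    if image_np.shape[-1] == 4:
        # RGBA: keep R, G, B and drop alpha.
        return image_np[..., :3]
    return image_np

rgba = np.zeros((2, 3, 4), dtype=np.uint8)
gray = np.zeros((2, 3), dtype=np.uint8)
print(ensure_three_channels(rgba).shape)  # (2, 3, 3)
print(ensure_three_channels(gray).shape)  # (2, 3, 3)
```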
I have an array of an image
img = [[[63 48 27]
[ 63 48 27]
[ 63 48 27]
...
[117 88 70]
[113 84 66]
[111 82 64]]
[[ 64 49 28]
[ 64 49 28]
[ 64 49 28]
...
[117 88 70]
[114 85 67]
[111 82 64]]
[[ 65 50 29]
[ 66 51 30]
[ 66 51 30]
...
[118 89 71]
[114 85 67]
[111 82 64]]...
And another array of the pixels that I want to keep from that image array:
mask = [[[False False False ... False False False]
[False False False ... False False False]
[False False False ... False False False]
...
[False False False ... False False False]
[False False False ... False False False]
[False False False ... False False False]]]
I thought I could just do img[mask], but I get "boolean index did not match indexed array along dimension 0; dimension is 549 but corresponding boolean dimension is 1". The mask comes from converting a detectron2 mask to a NumPy array with mask = outputs['instances'].pred_masks.numpy() (originally it's a tensor). How can I either expand the mask array to the right dimensions or, and this could be easier I think, set every element of the image array to white (255) wherever the mask is False?
The function I'm using is:
from matplotlib.image import imread
import scipy.misc
def cropper(org_image_path, mask_array, out_file_name):
    img = imread(org_image_path)
    output = img[mask_array]
    scipy.misc.toimage(output).save(out_file_name)
Given an image of shape (M, N, 3) and a mask of shape (1, M, N), you can set the False elements of the image to 255 in all channels by using simple boolean indexing. Whereas broadcasting lines up indices on the right, indexing fills them in on the left. That means that to get the mask to correspond to your image dimensions you need to remove the first axis. There are a number of ways of doing this:
mask = mask[0]
mask = mask[0, ...]
mask = mask.squeeze()
mask = np.squeeze(mask)
mask = mask.reshape(mask.shape[1:])
mask = np.reshape(mask, mask.shape[1:])
...
With the first two dimensions matching up, you can negate the mask and perform a direct assignment:
img[~mask] = 255
This can be combined into a simple one-liner:
img[~mask[0]] = 255
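Here is a tiny self-contained demonstration of that one-liner. The image and mask are made-up data with the same shapes as in the question, (M, N, 3) and (1, M, N).

```python
import numpy as np

# 2x3 "image" with 3 channels and distinct values so the result is easy to read.
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
# Mask with a leading axis of length 1, as produced in the question.
mask = np.array([[[True, False, False],
                  [False, True, False]]])

out = img.copy()
# mask[0] has shape (2, 3), matching the first two image axes;
# every pixel where the mask is False becomes white in all channels.
out[~mask[0]] = 255
print(out)
```

Only the two pixels where the mask is True keep their original values.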
For those interested, Nicholas' comment was a useful bump (thank you). I just changed the mask array so that its axes and shape match the image, and then it was a simple edit to change the values:
from matplotlib.image import imread
import numpy as np
from PIL import Image
def cropper(org_image_path, mask_array, out_file_name):
    img = imread(org_image_path)
    # Move the leading mask axis to the end: (1, M, N) -> (M, N, 1)
    mask_array = np.moveaxis(mask_array, 0, -1)
    # Repeat the mask across the 3 colour channels: (M, N, 3)
    mask_array = np.repeat(mask_array, 3, axis=2)
    # Keep masked pixels, set everything else to white
    output = np.where(mask_array==False, 255, img)
    im = Image.fromarray(output)
    im.save(out_file_name)
I have a function like:
def calcChromaFromPixel(red, green, blue):
    r = int(red)
    g = int(green)
    b = int(blue)
    return math.sqrt(math.pow(r - g, 2) +
                     math.pow(r - b, 2) +
                     math.pow(g - b, 2))
and I have an RGB image, which is already converted into a NumPy array with a shape like [width, height, 3], where 3 is the number of colour channels.
What I want to do is apply the function to every pixel and take the mean of the results. I have already done the obvious thing and iterated over the array with two loops, but that is really slow. Is there a faster and prettier way to do that?
Thanks :)
Code:
import math
import numpy as np
np.random.seed(1)
# FAKE-DATA
img = np.random.randint(0,255,size=(4,4,3))
print(img)
# LOOP APPROACH
def calcChromaFromPixel(red, green, blue):
    r = int(red)
    g = int(green)
    b = int(blue)
    return math.sqrt(math.pow(r - g, 2) +
                     math.pow(r - b, 2) +
                     math.pow(g - b, 2))
bla = np.zeros(img.shape[:2])
for a in range(img.shape[0]):
    for b in range(img.shape[1]):
        bla[a,b] = calcChromaFromPixel(*img[a,b])
print('loop')
print(bla)
# VECTORIZED APPROACH
print('vectorized')
res = np.linalg.norm(np.stack(
(img[:,:,0] - img[:,:,1],
img[:,:,0] - img[:,:,2],
img[:,:,1] - img[:,:,2])), axis=0)
print(res)
Out:
[[[ 37 235 140]
[ 72 137 203]
[133 79 192]
[144 129 204]]
[[ 71 237 252]
[134 25 178]
[ 20 254 101]
[146 212 139]]
[[252 234 156]
[157 142 50]
[ 68 215 215]
[233 241 247]]
[[222 96 86]
[141 233 137]
[ 7 63 61]
[ 22 57 1]]]
loop
[[ 242.56545508 160.44313634 138.44132331 97.21111048]
[ 246.05283985 192.94040531 291.07730932 98.66103588]
[ 124.99599994 141.90842117 207.88939367 17.20465053]
[ 185.66636744 133.02631319 77.82030583 69.29646456]]
vectorized
[[ 242.56545508 160.44313634 138.44132331 97.21111048]
[ 246.05283985 192.94040531 291.07730932 98.66103588]
[ 124.99599994 141.90842117 207.88939367 17.20465053]
[ 185.66636744 133.02631319 77.82030583 69.29646456]]
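The question also asks for the mean, and real images are usually uint8, where the channel differences would wrap around instead of going negative. The following sketch covers both points by casting to a signed type first; mean_chroma is an illustrative name, and the two test pixels are taken from the first row of the fake data above.

```python
import numpy as np

def mean_chroma(img):
    """Mean over all pixels of sqrt((r-g)^2 + (r-b)^2 + (g-b)^2)."""
    # Cast to a signed type so differences of uint8 values cannot wrap around.
    rgb = img.astype(np.int64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    chroma = np.sqrt((r - g) ** 2 + (r - b) ** 2 + (g - b) ** 2)
    return chroma.mean()

img = np.array([[[37, 235, 140], [72, 137, 203]]], dtype=np.uint8)
result = mean_chroma(img)
print(result)
```

The result is the average of the first two per-pixel values in the output above (242.56545508 and 160.44313634).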
I'm working on a project where I need to subtract the RGB channels of an image from each other. For example, I want to subtract the BLUE channel from RED, so that RED holds the difference.
The image has the following properties:
Dimension:1456x2592,
bpp:3
The image I'm using gives me the following arrays:
[[[ 63 58 60]
[ 63 58 60]
[ 64 59 61]
...,
[155 155 161]
[155 155 161]
[155 155 161]]
[[ 58 53 55]
[ 60 55 57]
[ 62 57 59]
...,
[157 157 163]
[157 157 163]
[158 158 164]]
I know those are the RGB values from the image, so I moved on to writing the code (I based it on this code):
import cv2
import numpy as np
from PIL import Image
# read image into matrix.
m = cv2.imread("ITESO.jpeg")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
# BLUE = 0, GREEN = 1, RED = 2.
for py in range(0,h):
    for px in range(0,w):
        #m[py][px][2] = 2
        n = m[py][px][2] # n takes the value of RED
        Y = [n, 0, 0] # I create an array with [RED, 0, 0]
m, Y = np.array(m), np.array(Y)
m = np.absolute(m - Y) # Get the matrix with the subtraction
y = 1
x = 1
print (m)
print (m[x][y])
#display image
#cv2.imshow('matrix', m)
#cv2.waitKey(0)
cv2.imwrite('new.jpeg',m)
img = Image.open('new.jpeg')
img.show()
img = Image.open('new.jpeg').convert('L')
img.save('new_gray_scale.jpg')
img.show()
When I print the resulting matrix, I get the following arrays:
B,G,R
Blue = BLUE - RED
[[[ 3 58 60]
[ 3 58 60]
[ 4 59 61]
...,
[ 95 155 161]
[ 95 155 161]
[ 95 155 161]]
[[ 2 53 55]
[ 0 55 57]
[ 2 57 59]
...,
[ 97 157 163]
[ 97 157 163]
[ 98 158 164]]
But I'm not able to open the new image, whereas if I set one RGB channel to a constant value, the image displays fine. I use the following lines for that:
import cv2
import numpy as np
# read image into matrix.
m = cv2.imread("python.png")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
for py in range(0,h):
    for px in range(0,w):
        m[py][px][0] = 0 # setting channel Blue to values of 0
# display image
cv2.imshow('matrix', m)
cv2.waitKey(0)
How can I subtract the RGB channels from each other?
PS: In MatLab it works like a charm, but I'm not able to do it in python.
Note that this operation changes the dtype of the matrix (image) from uint8 to int32, which can cause other problems. A better (and more efficient) way to do this, IMO, is:
import cv2
import numpy as np
img = cv2.imread('image.png').astype(np.float64) # BGR, float (the np.float alias was removed in NumPy 1.24)
img[:, :, 2] = np.absolute(img[:, :, 2] - img[:, :, 0]) # R = |R - B|
img = img.astype(np.uint8) # convert back to uint8
cv2.imwrite('new-image.png', img) # save the image
cv2.imshow('img', img)
cv2.waitKey()
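The same channel subtraction can be sketched with NumPy alone, which also shows why the cast matters. The pixel values are the first and last BGR triples printed in the question; red_minus_blue is a made-up helper name, and a signed 16-bit intermediate is used here in place of float.

```python
import numpy as np

def red_minus_blue(bgr):
    """Return a copy of a BGR uint8 image with R replaced by |R - B|."""
    out = bgr.copy()
    # int16 is wide enough to hold any difference of two uint8 values.
    diff = bgr[..., 2].astype(np.int16) - bgr[..., 0].astype(np.int16)
    out[..., 2] = np.abs(diff).astype(np.uint8)
    return out

bgr = np.array([[[63, 58, 60], [155, 155, 161]]], dtype=np.uint8)

# Subtracting uint8 arrays directly wraps around: 60 - 63 becomes 253.
wrapped = bgr[..., 2] - bgr[..., 0]
print(wrapped)

out = red_minus_blue(bgr)
print(out[..., 2])
```

With the cast, the red channel holds 3 and 6, the absolute differences, and the other channels are left untouched.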
Code that sets negative RGB values to zero:
import cv2
import numpy as np
from PIL import Image
m = cv2.imread("img.jpg")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
# BLUE = 0, GREEN = 1, RED = 2.
for py in range(0,h):
    for px in range(0,w):
        n = m[py][px][1]
        Y = [0, 0, n]
        m, Y = np.array(m), np.array(Y)
        a = (m - Y)
        if (a[py][px][0] <=0): #if Blue is negative or equal 0
            a[py][px][0] = 0 #Blue set to 0
cv2.imwrite('img_R-G.jpg',a)
img = Image.open('img_R-G.jpg').convert('L')
img.save('img_R-G_GS.jpg')
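A vectorized sketch of the same idea, subtracting one channel from another per pixel and clamping negative results to zero, might look like this. It assumes the intent is red minus green, as the output file name img_R-G suggests; the function name and its src/dst parameters are hypothetical.

```python
import numpy as np

def subtract_and_clamp(bgr, src=1, dst=2):
    """Replace channel dst with max(dst - src, 0), per pixel (BGR order)."""
    out = bgr.copy()
    # Signed intermediate so negative differences are visible before clamping.
    diff = bgr[..., dst].astype(np.int16) - bgr[..., src].astype(np.int16)
    out[..., dst] = np.clip(diff, 0, 255).astype(np.uint8)
    return out

bgr = np.array([[[63, 58, 60], [155, 155, 161], [10, 200, 50]]], dtype=np.uint8)
out = subtract_and_clamp(bgr)
print(out[..., 2])
```

The third pixel has more green than red, so its difference is negative and is clamped to 0, while the result stays uint8 and can be written with cv2.imwrite as before.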