I recently trained an object detection model in TensorFlow, but for some of the images the input tensors are incompatible with the Python input signature. This is the code I'm running in Google Colab for inference:
import numpy as np
import tensorflow as tf
from PIL import Image
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore') # Suppress Matplotlib warnings
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.
    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.
    Args:
      path: the file path to the image
    Returns:
      uint8 numpy array with shape (img_height, img_width, 3)
    """
    return np.array(Image.open(path))
for image_path in img:
    print('Running inference for {}... '.format(image_path), end='')
    image_np=load_image_into_numpy_array(image_path)
    # Things to try:
    # Flip horizontally
    # image_np = np.fliplr(image_np).copy()
    # Convert image to grayscale
    # image_np = np.tile(
    #     np.mean(image_np, 2, keepdims=True), (1, 1, 3)).astype(np.uint8)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor=tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor=input_tensor[tf.newaxis, ...]
    # input_tensor = np.expand_dims(image_np, 0)
    detections=detect_fn(input_tensor)
    # All outputs are batches tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections=int(detections.pop('num_detections'))
    detections={key:value[0,:num_detections].numpy()
                for key,value in detections.items()}
    detections['num_detections']=num_detections
    # detection_classes should be ints.
    detections['detection_classes']=detections['detection_classes'].astype(np.int64)
    image_np_with_detections=image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        detections['detection_boxes'],
        detections['detection_classes'],
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=100, #max number of bounding boxes in the image
        min_score_thresh=.25, #min prediction threshold
        agnostic_mode=False)
    %matplotlib inline
    plt.figure()
    plt.imshow(image_np_with_detections)
    print('Done')
plt.show()
And this is the error message I get when running inference:
Running inference for /content/gdrive/MyDrive/TensorFlow/workspace/training_demo/images/test/image_part_002.png...
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-23-5b465e5474df> in <module>()
40
41 # input_tensor = np.expand_dims(image_np, 0)
---> 42 detections=detect_fn(input_tensor)
43
44 # All outputs are batches tensors.
6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py in _convert_inputs_to_signature(inputs, input_signature, flat_input_signature)
2804 flatten_inputs)):
2805 raise ValueError("Python inputs incompatible with input_signature:\n%s" %
-> 2806 format_error_message(inputs, input_signature))
2807
2808 if need_packing:
ValueError: Python inputs incompatible with input_signature:
inputs: (
tf.Tensor(
[[[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
[[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]
...
[ 0 0 0 255]
[ 0 0 0 255]
[ 0 0 0 255]]
...
[[ 34 32 34 255]
[ 35 33 35 255]
[ 35 33 35 255]
...
[ 41 38 38 255]
[ 40 37 37 255]
[ 40 37 37 255]]
[[ 36 34 36 255]
[ 35 33 35 255]
[ 36 34 36 255]
...
[ 41 38 38 255]
[ 41 38 38 255]
[ 43 40 40 255]]
[[ 36 34 36 255]
[ 36 34 36 255]
[ 37 35 37 255]
...
[ 41 38 38 255]
[ 40 37 37 255]
[ 39 36 36 255]]]], shape=(1, 1219, 1920, 4), dtype=uint8))
input_signature: (
TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name='input_tensor'))
Does anyone know a way I could convert the input tensors of my images so I can run inference on them? For example, one image where inference works has a resolution of 400x291, and the image where inference doesn't work has a resolution of 1920x1219. I used the SSD MobileNet V1 FPN 640x640 model for my training.
The problem in your case is that your input tensor has shape (1, 1219, 1920, 4); more precisely, the 4 is the problem, because the model's input signature expects shape (1, None, None, 3).
The first element, 1, is the batch size (added by input_tensor[tf.newaxis, ...]).
You got that part right. The problem arises where you read the images: this one has 4 channels (presumably RGBA), not 3 (typical RGB) or 1 (grayscale). The resolution is not the issue, since height and width are None in the signature.
I recommend that you check your images and force the conversion to RGB, i.e. Image.open(path).convert('RGB').
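The same check can also be done on the array itself. This is only a minimal sketch, assuming the image is already a uint8 numpy array. `to_rgb_array` is a hypothetical helper (not part of TensorFlow or PIL), and it simply discards the alpha channel instead of compositing it:

```python
import numpy as np

def to_rgb_array(image_np):
    """Return an (H, W, 3) array from a grayscale, RGB or RGBA array (hypothetical helper)."""
    if image_np.ndim == 2:
        # Grayscale: repeat the single channel three times.
        return np.stack([image_np] * 3, axis=-1)
    if image_np.shape[-1] == 4:
        # RGBA: drop the alpha channel and keep R, G, B.
        return image_np[..., :3]
    return image_np

# Dummy array with the shape from the error message (without the batch axis).
rgba = np.zeros((1219, 1920, 4), dtype=np.uint8)
rgb = to_rgb_array(rgba)
print(rgba.shape, '->', rgb.shape)
```

Calling this on image_np before tf.convert_to_tensor would give the channel count the signature asks for, although convert('RGB') at load time is the more direct fix.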
Related
I am doing a project in which I have an image of an electricity meter reading. I need to extract the digits from the image.
I converted the image to a numpy array using PIL's Image module.
This is the code I wrote:
import numpy as np
from PIL import Image
img_data = Image.open('meter1.jpg' )
img_arr = np.array(img_data)
print(img_arr)
I got this numpy array as the output
[[[ 2 96 10]
[ 2 96 10]
[ 2 96 10]
...
[ 18 144 47]
[ 13 141 48]
[ 10 139 46]]
[[ 11 105 19]
[ 10 106 19]
[ 10 104 18]
...
[ 28 156 59]
[ 26 156 60]
[ 24 155 59]]
[[ 19 115 26]
[ 16 115 25]
[ 17 113 24]
...
[ 30 162 60]
[ 28 164 62]
[ 26 165 64]]
...
[[ 0 126 18]
[ 0 126 18]
[ 0 126 18]
...
[ 4 211 77]
[ 4 211 79]
[ 6 213 83]]
[[ 0 126 18]
[ 0 126 18]
...
[ 4 212 76]
[ 4 211 79]
[ 6 213 83]]
[[ 1 124 17]
[ 1 124 17]
[ 1 124 17]
...
[ 5 211 76]
[ 5 210 79]
[ 7 212 81]]]
How do I use this numpy array to extract the digits from the image?
It is a seven-segment display. Was it useful to convert the image to a numpy array? Is there any other approach? I have not done much hands-on Python, so please help.
It's better to try some coding yourself first, because that is how your coding skills improve. That said, I wrote a script that saves your digits into separate image files. I hope it helps with your project and with improving your skills.
import numpy as np
from PIL import Image
import os
directory,filename = os.path.split(__file__)
#x = np.array([ [ [255,0,0], [0,255,0], [0,0,255] ],[ [0,0,0],[128,128,128],[2,5,200] ] ],dtype=np.uint8)
main_img = Image.open('img.jpg')
x = np.array(main_img)
print(x)
#print(x[1][1] )# x[row number][column number]
#print(x.shape[1] )# x.shape = # of rows,# of cols
data_cols = []
bg_color = x[0][0]
start = False
start_id = -1
for j in range(0,x.shape[1]):
    for i in range(0,x.shape[0]):
        if (x[i][j][0] < 5) and (x[i][j][2] < 10):
            if not start:
                start_id = j
                start = True
            break
        if i == x.shape[0]-1:
            start = False
            end_id = j
            if start_id>=0:
                data_cols.append([start_id,end_id])
            start_id=-1
print("Number of digits>",len(data_cols))
images = []
for i in range(0,len(data_cols)):
    images.append(x[:,data_cols[i][0]:data_cols[i][1]])
i = 0
for im_array in images:
    im = Image.fromarray(im_array,'RGB')
    im.save(directory + "\\" + str(i) + ".png")
    i += 1
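The column scan in the loop above can also be written without Python-level loops. The following is a sketch under assumptions, not the answerer's method: `find_digit_columns` is a hypothetical helper, and the thresholds simply mirror the red < 5 and blue < 10 test used above. Unlike the loop, it also closes a run of digit columns that touches the right edge of the image.

```python
import numpy as np

def find_digit_columns(img, r_max=5, b_max=10):
    """Return half-open [start, end) column ranges that contain digit pixels."""
    # A pixel belongs to a digit when its red and blue values are both small.
    is_digit = (img[..., 0] < r_max) & (img[..., 2] < b_max)
    # A column is part of a digit if any of its pixels is a digit pixel.
    col_has_digit = is_digit.any(axis=0)
    # Pad with False so every run has a rising and a falling edge.
    padded = np.concatenate(([False], col_has_digit, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

# Synthetic test image: white background with two green "digits".
demo = np.full((10, 12, 3), 255, dtype=np.uint8)
demo[2:8, 2:4] = (0, 200, 0)
demo[2:8, 7:10] = (0, 200, 0)
ranges = find_digit_columns(demo)
print(ranges)
```

Each (start, end) pair can be used directly as a slice, e.g. x[:, start:end], in the same way as data_cols.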
I'm trying to apply a threshold to an image, but not a regular simple threshold.
I need to set pixels to black if they satisfy a condition, and to white if they do not.
I could just loop over the pixels, but on a 1080p image that takes far too long.
I'm using HSV for the comparisons I need to make.
Here is the condition (this example is how I would use it if it were in a loop):
if abs(input_pixel_color.hue - reference.hue) < 2 and input_pixel_color.saturation >= 0.25 and input_pixel_color.brightness >= 0.42:
    set_to_black
else:
    set_to_white
input_pixel is the HSV value of the pixel in the loop.
reference is a variable to be compared to.
I thought about using numpy, but I really don't know how to write this :/
Thanks in advance
Updated
Now that your actual intended processing has become clearer, you would probably be better served by OpenCV inRange() function. Like this:
#!/usr/local/bin/python3
import cv2 as cv
import numpy as np
# Load the image and convert to HLS
image = cv.imread("image.jpg")
hls = cv.cvtColor(image,cv.COLOR_BGR2HLS)
# Define lower and upper limits for each component
lo = np.array([50,0,0])
hi = np.array([70,255,255])
# Mask image to only select filtered pixels
mask = cv.inRange(hls,lo,hi)
# Change image to white where we found our colour
image[mask>0]=(255,255,255)
cv.imwrite("result.png",image)
So, if we use this image:
We are selecting Hues in the range 50-70, and making them white:
If you look up "Green" in an online colour converter, you can see that it is Hue=120, but OpenCV divides Hues by 2 so that 360 degrees becomes 180 and still fits in a uint8. So, the 60 at the centre of our 50-70 range in the code means 120 in online colour converters.
The ranges OpenCV uses for uint8 images are:
Hue 0..179 (degrees divided by 2)
Lightness 0..255
Saturation 0..255
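As a small worked example of that scaling, here is a sketch that converts values as shown by a typical online converter (Hue in degrees, Lightness and Saturation as fractions) to the uint8 scale listed above. `to_opencv_hls` is a hypothetical helper name, not an OpenCV function:

```python
def to_opencv_hls(h_deg, l_frac, s_frac):
    """Convert H in degrees (0-360) and L, S as fractions (0-1) to OpenCV's uint8 HLS scale."""
    h = int(round(h_deg / 2)) % 180   # 360 degrees are squeezed into 0..179
    l = int(round(l_frac * 255))      # fraction -> 0..255
    s = int(round(s_frac * 255))      # fraction -> 0..255
    return h, l, s

# Pure green as an online converter reports it: H=120 degrees, S=100%.
green = to_opencv_hls(120, 0.0, 1.0)
print(green)
```

Limits for cv.inRange() can then be built around the converted hue, as with the 50-70 range in the code above.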
As I said before, you should get in the habit of looking at your data types, shapes and ranges in your debugger. To see the shape, dtype, and maximum Hue, Lightness and Saturation, use:
print(hls.dtype, hls.shape)
print(hls[...,0].max())
print(hls[...,1].max())
print(hls[...,2].max())
Original Answer
There are several ways to do that. The most performant is probably with the OpenCV function cv2.inRange() and there are plenty of answers on StackOverflow about that.
Here is a Numpy way. If you read the comments and look at the printed values, you can see how to combine logical AND with logical OR and so on, as well as how to address specific channels.
#!/usr/bin/env python3
from random import randint, seed
import numpy as np
# Generate a repeatable random HSV image
np.random.seed(42)
h, w = 4, 5
HSV = np.random.randint(1,100,(h,w,3),dtype=np.uint8)
print('Initial HSV\n',HSV)
# Create mask of all pixels with acceptable Hue, i.e. H > 50
HueOK = HSV[...,0] > 50
print('HueOK\n',HueOK)
# Create mask of all pixels with acceptable Saturation, i.e. S > 20 AND S < 80
SatOK = np.logical_and(HSV[...,1]>20, HSV[...,1]<80)
print('SatOK\n',SatOK)
# Create mask of all pixels with acceptable value, i.e. V < 20 OR V > 60
ValOK = np.logical_or(HSV[...,2]<20, HSV[...,2]>60)
print('ValOK\n',ValOK)
# Combine masks
combinedMask = HueOK & SatOK & ValOK
print('Combined\n',combinedMask)
# Now, if you just want to set the masked pixels to 255
HSV[combinedMask] = 255
print('Result1\n',HSV)
# Or, if you want to set the masked pixels to one value and the others to another value
HSV = np.where(combinedMask,255,0)
print('Result2\n',HSV)
Sample Output
Initial HSV
[[[93 98 96]
[52 62 76]
[93 4 99]
[15 22 47]
[60 72 85]]
[[26 72 61]
[47 66 26]
[21 45 76]
[25 87 40]
[25 35 83]]
[[66 40 87]
[24 26 75]
[18 95 15]
[75 86 18]
[88 57 62]]
[[94 86 45]
[99 26 19]
[37 24 63]
[69 54 3]
[33 33 39]]]
HueOK
[[ True True True False True]
[False False False False False]
[ True False False True True]
[ True True False True False]]
SatOK
[[False True False True True]
[ True True True False True]
[ True True False False True]
[False True True True True]]
ValOK
[[ True True True False True]
[ True False True False True]
[ True True True True True]
[False True True True False]]
Combined
[[False True False False True]
[False False False False False]
[ True False False False True]
[False True False True False]]
Result1
[[[ 93 98 96]
[255 255 255]
[ 93 4 99]
[ 15 22 47]
[255 255 255]]
[[ 26 72 61]
[ 47 66 26]
[ 21 45 76]
[ 25 87 40]
[ 25 35 83]]
[[255 255 255]
[ 24 26 75]
[ 18 95 15]
[ 75 86 18]
[255 255 255]]
[[ 94 86 45]
[255 255 255]
[ 37 24 63]
[255 255 255]
[ 33 33 39]]]
Result2
[[ 0 255 0 0 255]
[ 0 0 0 0 0]
[255 0 0 0 255]
[ 0 255 0 255 0]]
Notes:
1) You can also access pixels not selected by the mask, using negation:
# All unmasked pixels become 3
HSV[~combinedMask] = 3
2) The ellipsis (...) is just a shortcut meaning "all other dimensions I didn't bother listing", so HSV[...,1] is the same as HSV[:,:,1]
3) If you don't like writing HSV[...,0] for Hue, and HSV[...,1] for Saturation, you can split the channels
H, S, V = cv2.split(HSV)
Then you can just use H instead of HSV[...,0]. When you are finished, if you want to re-assemble the channels back into a 3-channel image, you can do:
HSV = cv2.merge((H,S,V))
or
HSV = np.dstack((H,S,V))
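Putting the pieces together for the condition in the question, a minimal numpy sketch could look like the following. It assumes a float HSV array with Hue in degrees and Saturation and Value as fractions in 0..1 (the question does not state its ranges), `threshold_hsv` is a hypothetical name, and hue wraparound near 0/360 degrees is ignored:

```python
import numpy as np

def threshold_hsv(hsv, ref_hue, hue_tol=2.0, min_sat=0.25, min_val=0.42):
    """Return a uint8 image: 0 (black) where the condition holds, 255 (white) elsewhere."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    # Same test as the per-pixel loop, evaluated on whole channels at once.
    mask = (np.abs(h - ref_hue) < hue_tol) & (s >= min_sat) & (v >= min_val)
    return np.where(mask, 0, 255).astype(np.uint8)

hsv = np.array([[[120.0, 0.5, 0.5],    # matches: black
                 [121.5, 0.3, 0.9],    # matches: black
                 [125.0, 0.5, 0.5],    # hue too far from reference: white
                 [120.0, 0.1, 0.5]]])  # saturation too low: white
out = threshold_hsv(hsv, ref_hue=120.0)
print(out)
```

With OpenCV's uint8 HSV the thresholds would need rescaling first (Hue halved, Saturation and Value multiplied by 255).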
I am hoping to fix the uneven brightness/lightness across all my images (ideally so that they all have the same brightness).
I compute the difference in the lightness channel between each image in my loop and my reference image, add the difference, and save the result as a new image. However, after checking the new images, I realised the brightness is still uneven. Is there anything wrong with my code? Any help or correction is appreciated. I have tried this code in both the LAB and HSV colour spaces, with the same result. Below is the code and a couple of the results I got.
from PIL import Image
import numpy as np
import cv2
path = 'R:\\Temp\\zzzz\\AlignedPhoto_in_PNG\\'
path1 = 'R:\\Temp\\zzzz\\Testing1\\'
img = cv2.imread(path + 'aligned_IMG_1770.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
a = np.mean(img[:,:,0])
for i in range (1770,1869):
    img1 = cv2.imread(path + 'aligned_IMG_%d.png'%(i))
    img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2LAB)
    img1[:,:,0], img1[:,:,1], img1[:,:,2] = cv2.split(img1)
    print(img1[:,:,0])
    b = np.mean(img1[:,:,0])
    diff= b-a
    print(diff)
    img1[:,:,0] = img1[:,:,0] + diff
    img1 = cv2.merge([img1[:,:,0], img1[:,:,1], img1[:,:,2]])
    print(img1[:,:,0])
    img1 = cv2.cvtColor(img1, cv2.COLOR_LAB2BGR)
    cv2.imwrite(path1 + 'Testing1_%d.png'%(i), img1)
Also, any guidance on how I can edit the existing code to make sure that, after adding the difference, the new value does not exceed the max/min range of the Lightness channel in LAB or of the Value channel in HSV? I realised that if the new value is >255 after the addition, it wraps around and starts counting again from the low end of the range. I googled how to fix this or limit the range, but I don't understand how to do it.
Below are a few of the results I got from the above code. Hopefully they help to identify what went wrong, as I am still getting uneven brightness in the new images after adding the difference.
[[ 39 39 39 ..., 38 38 36]
[ 39 38 39 ..., 39 39 39]
[ 40 40 40 ..., 39 39 39]
...,
[119 119 122 ..., 165 166 167]
[118 118 120 ..., 169 166 166]
[115 116 117 ..., 175 169 167]]
0.0
[[ 39 39 39 ..., 38 38 36]
[ 39 38 39 ..., 39 39 39]
[ 40 40 40 ..., 39 39 39]
...,
[119 119 122 ..., 165 166 167]
[118 118 120 ..., 169 166 166]
[115 116 117 ..., 175 169 167]]
[[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
...,
[117 119 119 ..., 165 163 131]
[117 117 118 ..., 170 166 131]
[115 116 116 ..., 176 171 134]]
-1.48181156101
[[255 255 255 ..., 255 255 255]
[255 255 255 ..., 255 255 255]
[255 255 255 ..., 255 255 255]
...,
[115 117 117 ..., 163 161 129]
[115 115 116 ..., 168 164 129]
[113 114 114 ..., 174 169 132]]
[[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
...,
[ 0 97 115 ..., 165 164 165]
[ 0 96 114 ..., 169 166 164]
[ 0 95 113 ..., 175 170 166]]
-3.69765536832
[[253 253 253 ..., 253 253 253]
[253 253 253 ..., 253 253 253]
[253 253 253 ..., 253 253 253]
...,
[253 93 111 ..., 161 160 161]
[253 92 110 ..., 165 162 160]
[253 91 109 ..., 171 166 162]]
That's why maths is a skill every programmer should have.
You correct your brightness by adding diff.
So if you want a to equal the sum of b and diff
a = b + diff
and you know a and b, then how do you get diff?
diff = a - b
not
diff = b - a
Otherwise you will make darker images darker and brighter images brighter instead of bringing them to your reference mean a.
Of course, a global offset will cause problems for pixels that end up outside your value range: in a uint8 array they wrap around instead of saturating. You have to work around this, for example by doing the addition in a wider type and clipping to 0..255 before converting back; otherwise your new mean will be wrong.
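A minimal sketch of the corrected offset with clipping, using only numpy. `match_mean_lightness` is a hypothetical helper name, and the tiny array stands in for a real uint8 lightness channel such as img1[:,:,0]:

```python
import numpy as np

def match_mean_lightness(lightness, target_mean):
    """Shift a uint8 lightness channel towards target_mean without wraparound."""
    diff = target_mean - lightness.mean()           # a - b, not b - a
    shifted = lightness.astype(np.float32) + diff   # do the arithmetic in float
    # Clip to the valid range before converting back to uint8.
    return np.clip(np.rint(shifted), 0, 255).astype(np.uint8)

L = np.array([[0, 10], [200, 250]], dtype=np.uint8)   # mean is 115
brighter = match_mean_lightness(L, target_mean=125.0)  # offset +10
darker = match_mean_lightness(L, target_mean=105.0)    # offset -10
print(brighter)
print(darker)
```

Pixels that hit 0 or 255 are saturated rather than wrapped, so the resulting mean can still differ slightly from the target when many pixels are clipped.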
I'm trying to reshape a numpy array as:
data3 = data3.reshape((data3.shape[0], 28, 28))
where data3 is:
[[54 68 66 ..., 83 72 58]
[63 63 63 ..., 51 51 51]
[41 45 80 ..., 44 46 81]
...,
[58 60 61 ..., 75 75 81]
[56 58 59 ..., 72 75 80]
[ 4 4 4 ..., 8 8 8]]
data3.shape is (52, 2352 )
But I keep getting the following error:
ValueError: cannot reshape array of size 122304 into shape (52,28,28)
Exception TypeError: TypeError("'NoneType' object is not callable",) in <function _remove at 0x10b6477d0> ignored
What is happening and how to fix this error?
UPDATE:
I'm doing this to obtain data3 that is being used above:
def image_to_feature_vector(image, size=(28, 28)):
    return cv2.resize(image, size).flatten()
data3 = np.array([image_to_feature_vector(cv2.imread(imagePath)) for imagePath in imagePaths])
imagePaths contains paths to all the images in my dataset. I actually want to convert data3 to a flat list of 784-dim vectors; however, the image_to_feature_vector function converts each image to a 2352-dim vector (28 x 28 x 3)!
A reshape must preserve the total number of elements: the product of the dimensions before the reshape must equal the product after it. In your case 52 x 2352 = 122304 but 52 x 28 x 28 = 40768, so the reshape fails. Since 122304 = 156 x 28 x 28, you can reshape data3 to (156, 28, 28) instead:
import numpy as np
data3 = np.arange(122304).reshape(52, 2352 )
data3 = data3.reshape((data3.shape[0]*3, 28, 28))
print(data3.shape)
Output is of the form
[[[ 0 1 2 ..., 25 26 27]
[ 28 29 30 ..., 53 54 55]
[ 56 57 58 ..., 81 82 83]
...,
[ 700 701 702 ..., 725 726 727]
[ 728 729 730 ..., 753 754 755]
[ 756 757 758 ..., 781 782 783]]
...,
[122248 122249 122250 ..., 122273 122274 122275]
[122276 122277 122278 ..., 122301 122302 122303]]]
First, the number of elements in your input images must match the number of elements in the desired feature vectors.
Assuming the above is satisfied, the below should work:
# Read all the images into one numpy array. The paths of the images are in imagePaths
data = np.array([cv2.imread(path) for path in imagePaths])
# This will contain an array of feature vectors, one row per 784 elements
features = data.reshape(-1, 784)
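Since each row of data3 holds 28 x 28 x 3 values, one way to get 784-dim vectors is to restore the channel axis and collapse it to a single gray value. This is a sketch under assumptions: the array below is a synthetic stand-in for the asker's data3, and a plain channel average is used where a weighted luminance formula, or loading with cv2.IMREAD_GRAYSCALE, would be alternatives.

```python
import numpy as np

# Stand-in for data3: 52 flattened 28x28 three-channel images.
data3 = (np.arange(52 * 2352).reshape(52, 2352) % 256).astype(np.uint8)

# Each row holds 28 * 28 * 3 values, so restore the channel axis first.
images = data3.reshape(data3.shape[0], 28, 28, 3)

# Collapse the three channels into one by averaging them.
gray = images.mean(axis=-1)

# Flatten each 28x28 gray image into one 784-dim feature vector.
features = gray.reshape(gray.shape[0], -1)
print(images.shape, features.shape)
```

After this, features.reshape(52, 28, 28) works, because 52 x 784 elements are available.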
I'm working on a project where I need to subtract the RGB channels of an image from each other. For example, I want to subtract the BLUE channel from RED, so that RED holds the difference.
The image has the following properties:
Dimension:1456x2592,
bpp:3
The image I'm using gives me the following arrays:
[[[ 63 58 60]
[ 63 58 60]
[ 64 59 61]
...,
[155 155 161]
[155 155 161]
[155 155 161]]
[[ 58 53 55]
[ 60 55 57]
[ 62 57 59]
...,
[157 157 163]
[157 157 163]
[158 158 164]]
I know those are the (RGB) values of the image, so I moved on to the code (which I based on existing code):
import cv2
import numpy as np
from PIL import Image
# read image into matrix.
m = cv2.imread("ITESO.jpeg")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
# BLUE = 0, GREEN = 1, RED = 2.
for py in range(0,h):
    for px in range(0,w):
        #m[py][px][2] = 2
        n = m[py][px][2] # n takes the value of RED
        Y = [n, 0, 0] # I create an array with [RED, 0, 0]
        m, Y = np.array(m), np.array(Y)
        m = np.absolute(m - Y) # Get the matrix with the subtraction
y = 1
x = 1
print (m)
print (m[x][y])
#display image
#cv2.imshow('matrix', m)
#cv2.waitKey(0)
cv2.imwrite('new.jpeg',m)
img = Image.open('new.jpeg')
img.show()
img = Image.open('new.jpeg').convert('L')
img.save('new_gray_scale.jpg')
img.show()
When I print the J matrix it gives the following arrays:
B,G,R
Blue = BLUE - RED
[[[ 3 58 60]
[ 3 58 60]
[ 4 59 61]
...,
[ 95 155 161]
[ 95 155 161]
[ 95 155 161]]
[[ 2 53 55]
[ 0 55 57]
[ 2 57 59]
...,
[ 97 157 163]
[ 97 157 163]
[ 98 158 164]]
But I'm not able to open the new image, whereas if I set one RGB channel to a fixed value the image displays fine. I use the following lines for that:
import cv2
import numpy as np
# read image into matrix.
m = cv2.imread("python.png")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
for py in range(0,h):
    for px in range(0,w):
        m[py][px][0] = 0 # setting channel Blue to values of 0
# display image
cv2.imshow('matrix', m)
cv2.waitKey(0)
How can I subtract the RGB channels from each other?
PS: In MatLab it works like a charm, but I'm not able to do it in python.
Note that this operation changes the dtype of the matrix (image) from uint8 to a wider integer type such as int32, which can cause other problems. A better (and more efficient) way to do this, IMO, is:
import cv2
import numpy as np
img = cv2.imread('image.png').astype(np.float32) # BGR, float (the np.float alias was removed in NumPy 1.24)
img[:, :, 2] = np.absolute(img[:, :, 2] - img[:, :, 0]) # R = |R - B|
img = img.astype(np.uint8) # convert back to uint8
cv2.imwrite('new-image.png', img) # save the image
cv2.imshow('img', img)
cv2.waitKey()
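To see why the dtype matters, here is a small numpy-only sketch on a hand-made two-pixel BGR array (the values are made up for illustration). Subtracting uint8 arrays directly wraps around modulo 256, while widening to a signed type first gives the true difference:

```python
import numpy as np

# Two pixels in OpenCV's B, G, R channel order.
img = np.array([[[60, 58, 63],
                 [200, 10, 50]]], dtype=np.uint8)

# Direct uint8 subtraction wraps around: 50 - 200 becomes 106, not -150.
wrapped = img[..., 2] - img[..., 0]

# Widening to int16 first keeps the sign, so the absolute value is correct.
b = img[..., 0].astype(np.int16)
r = img[..., 2].astype(np.int16)
abs_diff = np.abs(r - b).astype(np.uint8)

print(wrapped)
print(abs_diff)
```

The float conversion in the answer's code serves the same purpose as the int16 cast here.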
Code that sets negative channel values to zero:
m = cv2.imread("img.jpg")
# get image properties.
h,w,bpp = np.shape(m)
# iterate over the entire image.
# BLUE = 0, GREEN = 1, RED = 2.
for py in range(0,h):
    for px in range(0,w):
        n = m[py][px][1]
        Y = [0, 0, n]
        m, Y = np.array(m), np.array(Y)
        a = (m - Y)
        if (a[py][px][0] <=0): #if Blue is negative or equal 0
            a[py][px][0] = 0 #Blue set to 0
cv2.imwrite('img_R-G.jpg',a)
img = Image.open('img_R-G.jpg').convert('L')
img.save('img_R-G_GS.jpg')
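The same idea can be sketched without the per-pixel loop. This is an assumption about the intent (red minus green with negative results set to zero, as the file name img_R-G.jpg suggests), and `subtract_channel` is a hypothetical helper rather than an OpenCV function:

```python
import numpy as np

def subtract_channel(img_bgr, minuend=2, subtrahend=1):
    """Return a copy where channel `minuend` becomes max(minuend - subtrahend, 0)."""
    out = img_bgr.copy()
    # Subtract in a signed type so negative results are visible.
    diff = img_bgr[..., minuend].astype(np.int16) - img_bgr[..., subtrahend].astype(np.int16)
    # Negative values become 0; everything stays within 0..255.
    out[..., minuend] = np.clip(diff, 0, 255).astype(np.uint8)
    return out

# Two B, G, R pixels: R > G in the first, R < G in the second.
img = np.array([[[10, 50, 80], [10, 90, 40]]], dtype=np.uint8)
res = subtract_channel(img)
print(res)
```

The result stays uint8, so it can be passed to cv2.imwrite or cv2.imshow without further conversion.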