I am a beginner learning to code an image classifier, and my goal is to create a predict function.
In this project I want to build a car prediction model. I have a problem: when I call image.load_img from keras.preprocessing, I get the error 'JpegImageFile' object has no attribute 'load_img'.
This is my code:
from google.colab.patches import cv2_imshow
import cv2
import glob
from keras.preprocessing import image
import numpy as np
from PIL import Image  # needed for Image.open below
ambil = glob.glob("*.jpg")
for foto in ambil:
    lol = cv2.imread(foto)
    with open(foto, 'rb') as f:
        np_image_string = np.array([f.read()])
    image = Image.open(foto)
    width, height = image.size
    gambar_masuk = np.array(image.getdata()).reshape(height, width, 3).astype(np.uint8)
    num_detections, detection_boxes, detection_classes, detection_scores, detection_masks, image_info = session.run(
        ['NumDetections:0', 'DetectionBoxes:0', 'DetectionClasses:0', 'DetectionScores:0', 'DetectionMasks:0', 'ImageInfo:0'],
        feed_dict={'Placeholder:0': np_image_string})
    num_detections = np.squeeze(num_detections.astype(np.int32), axis=(0,))
    detection_boxes = np.squeeze(detection_boxes * image_info[0, 2], axis=(0,))[0:num_detections]
    detection_scores = np.squeeze(detection_scores, axis=(0,))[0:num_detections]
    detection_classes = np.squeeze(detection_classes.astype(np.int32), axis=(0,))[0:num_detections]
    detection_boxes = detection_boxes[detection_classes==3]
    detection_scores = detection_scores[detection_classes==3]
    detection_boxes = detection_boxes[detection_scores>0.8]
    detection_boxes = detection_boxes.astype(int)
    print(detection_boxes)
    urut = 1
    for kotak in detection_boxes:
        hasil = lol[kotak[0]:kotak[2], kotak[1]:kotak[3], :]
        hasil_potong = 'hasil' + str(urut) + '.jpg'
        cv2.imwrite(hasil_potong, hasil)
        lihat = cv2.imread(hasil_potong)
        cv2_imshow(lihat)
        img = image.load_img(lihat, target_size = (size_, size_))
You are overwriting image. You have these two lines:
from keras.preprocessing import image
:
:
image = Image.open(foto)
You import image from keras.preprocessing, but then you overwrite that name with the PIL image in the second line shown.
Either import the module under a different name, or use a different variable name for the opened image...
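For example (a minimal sketch; keras_image and pil_img are just illustrative names, and note that load_img expects a file path, so the saved crop hasil_potong is passed rather than the array lihat):
from keras.preprocessing import image as keras_image  # keep the Keras helper under its own name
from PIL import Image

# ... inside the loops ...
pil_img = Image.open(foto)        # the PIL image no longer shadows the Keras module
width, height = pil_img.size
img = keras_image.load_img(hasil_potong, target_size=(size_, size_))  # pass the saved crop's path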
For face recognition with Deepface, I am trying to extract the vector information of an image to store in a DB, so that next time, in order to match, I can extract the new image's vector information and look it up in the DB; if the search has results, it's a match. I used the verify method of DeepFace, but it compares two images and returns a verification result rather than a single embedding. My code:
from deepface import DeepFace
import os
detected_face = DeepFace.detectFace("sly.jpg")
print (detected_face)
This prints the detected face array. For the comparison I ran:
result = DeepFace.verify("sly1.jpg", "sly2.jpg")
and for this I get:
Using VGG-Face model backend and cosine distance.
{'verified': True, 'distance': 1.1920928955078125e-07, 'max_threshold_to_verify': 0.4, 'model': 'VGG-Face', 'similarity_metric': 'cosine'}
This is the comparison result, but I need the information of only one image, without any comparison, because I will have lots of records to search (for vector info) when a new face is tested. Any help will be appreciated.
I am assuming it is this repo (among others with the same name), installed with setup.py or pip install deepface.
I tested this on Google Colab. To run it locally, use cv2.imshow(...) instead of cv2_imshow(...).
Downloading test images
!wget "http://*.jpg" -O "1.jpg"
!wget "https://*.jpg" -O "2.jpg"
Check image
import cv2
from google.colab.patches import cv2_imshow
im1 = cv2.imread("1.jpg")
#cv2.imshow("img", im1)
cv2_imshow(im1)
Face Detection
The output from DeepFace.detectFace is a normalized, cropped face. For the mtcnn backend I got an image of shape (224, 224, 3). You can verify and view the image with:
from deepface import DeepFace
import cv2
from google.colab.patches import cv2_imshow
#backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
backends = ['mtcnn']
for backend in backends:
    #face detection and alignment
    detected_face = DeepFace.detectFace("1.jpg", detector_backend = backend)
    print(detected_face)
    print(detected_face.shape)
    im = cv2.cvtColor(detected_face * 255, cv2.COLOR_BGR2RGB)
    #cv2.imshow("image", im)
    cv2_imshow(im)
Output
[[[0.12156863 0.05882353 0.02352941]
[0.2901961 0.18039216 0.1254902 ]
[0.3137255 0.20392157 0.14901961]
...
[0.06666667 0.01176471 0.01176471]
[0.05882353 0.01176471 0.00784314]
[0.03921569 0.00784314 0.00392157]]
[[0.26666668 0.2 0.16470589]
[0.19215687 0.08235294 0.02745098]
[0.33333334 0.22352941 0.16862746]
...
[0.03921569 0.00392157 0.00392157]
[0.04313726 0.00784314 0.00784314]
[0.04313726 0. 0.00392157]]
[[0.11764706 0.05098039 0.01568628]
[0.21176471 0.10588235 0.05882353]
[0.44313726 0.3372549 0.27058825]
...
[0.02352941 0.00392157 0. ]
[0.02352941 0.00392157 0. ]
[0.02745098 0. 0. ]]
...
[[0.24313726 0.1882353 0.13725491]
[0.24313726 0.18431373 0.13725491]
[0.22745098 0.16470589 0.11372549]
...
[0.654902 0.69803923 0.78431374]
[0.62352943 0.67058825 0.7529412 ]
[0.38431373 0.4117647 0.45882353]]
[[0.23529412 0.18039216 0.12941177]
[0.22352941 0.16862746 0.11764706]
[0.22745098 0.16470589 0.11764706]
...
[0.6392157 0.69803923 0.78039217]
[0.6156863 0.6745098 0.75686276]
[0.36862746 0.40392157 0.4627451 ]]
[[0.21568628 0.16862746 0.10980392]
[0.2 0.15294118 0.09803922]
[0.20784314 0.14901961 0.10196079]
...
[0.6313726 0.6901961 0.77254903]
[0.6039216 0.6627451 0.74509805]
[0.36078432 0.39607844 0.4509804 ]]]
(224, 224, 3)
Face Embedding
Since you are looking for the embedding vector, you can get it with the code below. It is a modified version of the verify function: I kept the option for two images, the distance calculation and the verification, but you can modify it to generate the face embedding for a single face only. I did not remove any unused imports.
"""
Modified verify function for face embedding generation
backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
"""
from keras.preprocessing import image
import warnings
warnings.filterwarnings("ignore")
import time
import os
from os import path
from pathlib import Path
import gdown
import numpy as np
import pandas as pd
from tqdm import tqdm
import json
import cv2
from keras import backend as K
import keras
import tensorflow as tf
import pickle
from deepface import DeepFace
from deepface.basemodels import VGGFace, OpenFace, Facenet, FbDeepFace, DeepID
from deepface.extendedmodels import Age, Gender, Race, Emotion
from deepface.commons import functions, realtime, distance as dst
def FaceEmbeddingAndDistance(img1_path, img2_path = '', model_name ='Facenet', distance_metric = 'cosine', model = None, enforce_detection = True, detector_backend = 'mtcnn'):
    #--------------------------------
    #ensemble learning disabled.
    if model == None:
        if model_name == 'VGG-Face':
            print("Using VGG-Face model backend and", distance_metric,"distance.")
            model = VGGFace.loadModel()
        elif model_name == 'OpenFace':
            print("Using OpenFace model backend", distance_metric,"distance.")
            model = OpenFace.loadModel()
        elif model_name == 'Facenet':
            print("Using Facenet model backend", distance_metric,"distance.")
            model = Facenet.loadModel()
        elif model_name == 'DeepFace':
            print("Using FB DeepFace model backend", distance_metric,"distance.")
            model = FbDeepFace.loadModel()
        elif model_name == 'DeepID':
            print("Using DeepID2 model backend", distance_metric,"distance.")
            model = DeepID.loadModel()
        elif model_name == 'Dlib':
            print("Using Dlib ResNet model backend", distance_metric,"distance.")
            from deepface.basemodels.DlibResNet import DlibResNet #this is not a must because it is very huge.
            model = DlibResNet()
        else:
            raise ValueError("Invalid model_name passed - ", model_name)
    else: #model != None
        print("Already built model is passed")

    #------------------------------
    #face recognition models have different size of inputs
    #my environment returns (None, 224, 224, 3) but some people mentioned that they got [(None, 224, 224, 3)]. I think this is because of version issue.
    if model_name == 'Dlib': #this is not a regular keras model
        input_shape = (150, 150, 3)
    else: #keras based models
        input_shape = model.layers[0].input_shape
        if type(input_shape) == list:
            input_shape = input_shape[0][1:3]
        else:
            input_shape = input_shape[1:3]

    input_shape_x = input_shape[0]
    input_shape_y = input_shape[1]

    #------------------------------
    #tuned thresholds for model and metric pair
    threshold = functions.findThreshold(model_name, distance_metric)

    #------------------------------
    #----------------------
    #crop and align faces
    img1 = functions.preprocess_face(img=img1_path, target_size=(input_shape_y, input_shape_x), enforce_detection = enforce_detection, detector_backend = detector_backend)
    img2 = functions.preprocess_face(img=img2_path, target_size=(input_shape_y, input_shape_x), enforce_detection = enforce_detection, detector_backend = detector_backend)

    #----------------------
    #find embeddings
    img1_representation = model.predict(img1)[0,:]
    img2_representation = model.predict(img2)[0,:]

    print("FACE 1 Embedding:")
    print(img1_representation)
    print("FACE 2 Embedding:")
    print(img2_representation)

    #----------------------
    #find distances between embeddings
    if distance_metric == 'cosine':
        distance = dst.findCosineDistance(img1_representation, img2_representation)
    elif distance_metric == 'euclidean':
        distance = dst.findEuclideanDistance(img1_representation, img2_representation)
    elif distance_metric == 'euclidean_l2':
        distance = dst.findEuclideanDistance(dst.l2_normalize(img1_representation), dst.l2_normalize(img2_representation))
    else:
        raise ValueError("Invalid distance_metric passed - ", distance_metric)

    print("DISTANCE")
    print(distance)

    #----------------------
    #decision
    if distance <= threshold:
        identified = "true"
    else:
        identified = "false"

    print("IDENTIFIED")
    print(identified)
The above function is called via:
FaceEmbeddingAndDistance("1.jpg", "2.jpg", model_name='Facenet', detector_backend = 'mtcnn')
Output
FACE 1 Embedding:
[-0.7229302 -1.766835 -1.5399052 0.59634393 1.203212 -1.693247
-0.90845925 0.5264039 2.148173 -0.9786542 -0.00369854 -1.2710322
-1.5515596 -0.4111185 -0.36896533 -0.30051672 0.35091963 0.5073533
-1.7270111 -0.5230838 0.3376239 -1.0811361 1.5242224 -0.6137103
-1.3100258 0.80050004 -0.7087368 -0.64483845 1.0830203 2.6056807
-0.76527536 -0.83047277 -0.7335422 -0.01964059 -0.86749244 2.9645889
-2.426583 -0.11157394 -2.3535717 -0.65058017 0.30864614 -0.77746457
-0.6233895 0.44898677 2.5578005 -0.583796 0.8406945 1.1105415
-1.652044 -0.6351479 0.07651432 -1.0454555 -1.8752071 0.50948805
-1.6050931 -1.1769634 -0.02965304 1.5107706 0.83292925 -0.5382068
-1.5981512 -0.6405941 0.5521577 0.22957848 0.506649 0.24680384
-0.91464925 -0.18441322 -0.6801975 -1.0448433 0.52288735 -0.79405725
0.5974493 -0.40668172 -0.00640235 -0.742475 0.1928863 0.31236258
-0.37383577 -1.5883486 -1.5336255 -0.74254227 -0.8524561 -1.4625055
-2.718953 -0.7180952 -1.2140683 -0.5232462 1.2576898 -1.1097553
2.3971314 0.8855096 -0.16556528 -0.07307663 -1.8778017 0.8690948
-0.39043528 -0.5494097 -2.2382076 0.7101087 0.15859437 0.2959841
0.8605075 -0.2040207 0.77952844 0.04542177 0.92514265 -1.988945
0.9418363 1.6509243 -0.20324889 0.2974357 0.37681833 1.095943
1.6308782 -1.2553837 -0.10246387 -1.4697052 -0.5832107 -0.34192032
-1.1347024 1.5154309 -0.00527111 -1.165709 -0.7296148 -0.20767921
1.2530949 -0.9487353 ]
FACE 2 Embedding:
[ 0.9399996 1.3996615 -1.2931366 0.6869738 -0.03219241 0.96111965
0.7378809 -0.24804354 -0.8128112 0.19901593 0.48911542 -0.91603553
-1.1671298 0.88576627 0.25427592 1.1395477 0.45400882 -1.4845027
-0.90582514 -1.1371222 0.47669724 1.2933927 1.4533392 -0.46943524
0.10245587 -1.4916894 -2.3223586 -0.10979578 1.7803721 1.0051152
-0.09164213 -0.64848715 -1.4191641 1.811776 0.73174113 0.2582223
-0.26430857 1.7021953 -1.0571098 -1.1215096 0.3606074 1.5136883
-0.30045512 0.26225814 -0.19101554 1.269355 1.0674374 -0.2550623
-1.0582973 1.7474637 -1.7739134 -0.67914337 -0.1877765 1.1581128
-2.281225 1.3955555 -1.2690883 -0.16299461 1.337664 -0.8831901
-0.6862674 2.0526903 -0.6325836 1.333468 -0.10851342 -0.64831966
-1.0277263 1.4572504 -0.29905424 -0.33187118 -0.54727656 1.1528811
0.12454037 -1.5835186 -0.2271783 1.3911225 1.0170195 0.5741334
-1.3088373 -0.5950714 -0.6856393 -0.910367 -2.0136826 -0.73777384
0.319223 -2.1968741 0.9673934 -0.604423 -0.08049382 -1.948634
1.88159 0.20169139 0.7295723 -1.0224706 1.2995481 -0.3402595
1.1711328 -0.64862376 0.42063504 -0.01502114 -0.7048841 1.4360497
-1.2988033 0.31773448 1.534014 0.98858756 1.3450235 -0.9417385
0.26414695 -0.01988658 0.7418235 -0.04945141 -0.44838902 1.5288658
-1.1905407 0.13961646 -0.17101136 -0.18599203 -1.9648114 0.66071814
-0.07431012 1.5870664 1.5989372 -0.21751085 0.78908855 -1.5576671
0.02266342 0.20999858]
DISTANCE
0.807837575674057
IDENTIFIED
false
It becomes easier in DeepFace 0.0.41.
from deepface import DeepFace
from deepface.commons import functions
models = ['VGG-Face', 'Facenet', 'OpenFace', 'DeepFace', 'DeepID', 'Dlib']
model = DeepFace.build_model(models[0])
target_size = model.layers[0].input_shape[1:3]  # e.g. (224, 224); drop the batch and channel dimensions
img1_path = "img1.jpg"
img2_path = "img2.jpg"
#detect and align
img1 = functions.preprocess_face(img1_path, target_size = target_size)
img2 = functions.preprocess_face(img2_path, target_size = target_size)
#find vector embeddings
img1_embedding = model.predict(img1)
img2_embedding = model.predict(img2)
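If you only need the vector for a single image (for example to store in your DB), a minimal sketch based on the same calls (the file name new_face.jpg and the tolist() serialization are just illustrations):
img = functions.preprocess_face("new_face.jpg", target_size = target_size)
embedding = model.predict(img)[0]        # 1-D vector describing this face
embedding_as_list = embedding.tolist()   # e.g. for storing in a JSON/DB field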
TensorFlow version: 1.14
Python version: 3.6.9
My purpose is to build an object detection system with classification. I used the Object Detection API, and I want to feed its output bounding boxes to other neural networks (there are 6 different objects to detect, and I then want to classify those objects by their features with Keras neural networks).
When I use the Object Detection API alone it is OK, but the script crashes as soon as I call model.predict(). From what I have read, the problem is related to graphs and sessions.
I am pretty new to all of this, so I want to ask: is it possible to use multiple models simultaneously?
I have read about creating two sessions and graphs, but the input of the Object Detection model is a live video from the webcam and I do not want to lose performance. I tried starting a session for each frame, but it is very slow.
Also, would upgrading the script to TensorFlow 2.0 help?
EDIT:
I want to detect fruits and pass them to other Keras models which will predict their state. Detecting fruits works well, but I cannot use the additional Keras model because of the following error:
Tensor Tensor("dense_3/Sigmoid:0", shape=(?, 1), dtype=float32) is not an element of this graph.
Code provided:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from keras import models
from keras.preprocessing import image
import cv2
if 'cap' in globals():
    cap.release()
cap = cv2.VideoCapture(0)
sys.path.append("..")
graph = tf.get_default_graph()
from utils import label_map_util
from utils import visualization_utils as vis_util
def limit(value, max_val, min_val):
    if(value > max_val):
        value = max_val
    elif(value < min_val):
        value = min_val
    return value
# What model to download.
MODEL_NAME = 'inference_graph'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = 'training/labelmap.pbtxt'
NUM_CLASSES = 6
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
def load_image_into_numpy_array_updated(image):
    return np.array(image).astype(np.uint8)
# PATH_TO_TEST_IMAGES_DIR = 'test_images'
# TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
# Loading a keras model
model = models.load_model('new_banana.h5')
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # Score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            image_np_copy = image_np.copy()

            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8,
                min_score_thresh=0.7)

            # Get the thresholded bounding boxes from the image,
            # enlarge them by the compenser value, clamp them to the image,
            # print them and send them to another script.
            # 0 - apple, 2 - banana, 3 - orange, 4 - pear, 5 - pepper, 6 - tomato
            min_score_thresh = 0.7
            bboxes = boxes[scores > min_score_thresh]
            bclasses = classes[scores > min_score_thresh]
            image_np_new = cv2.resize(image_np_copy, (800, 600))
            im_width, im_height = (800, 600)

            if bclasses.size > 0:
                final_box = []
                cropped_images = []
                compenser = 30
                if(bclasses[0] == 2):  # if the detected class stands for 'banana'
                    for box in bboxes:
                        ymin, xmin, ymax, xmax = box
                        ymin0 = int(im_height * ymin) - compenser
                        ymax0 = int(im_height * ymax) + compenser
                        xmin0 = int(im_width * xmin) - compenser
                        xmax0 = int(im_width * xmax) + compenser
                        ymin1 = limit(ymin0, im_height, 0)
                        ymax1 = limit(ymax0, im_height, 0)
                        xmax1 = limit(xmax0, im_width, 0)
                        xmin1 = limit(xmin0, im_width, 0)
                        image_cropped = image_np_new[ymin1:ymax1, xmin1:xmax1]
                        height, width, _ = image_cropped.shape
                        if width > height:
                            image_cropped = cv2.resize(image_cropped, (200, 150))
                            image_cropped = cv2.rotate(image_cropped, cv2.ROTATE_90_CLOCKWISE)
                        else:
                            image_cropped = cv2.resize(image_cropped, (150, 200))
                        image_cropped = load_image_into_numpy_array_updated(image_cropped)
                        image_cropped = image_cropped.reshape((1,) + image_cropped.shape)
                        image_cropped = image_cropped / 255
                        cropped_images.append(image_cropped)

                if (len(cropped_images) > 0):
                    for image in cropped_images:
                        print(image.shape)
                        # input tensor 200, 150, 3
                        classes = model.predict_classes(image, batch_size=10)
                        print(classes)

            cv2.imshow('object detection', image_np)
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                cap.release()
                break
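The script already stores the default graph in graph = tf.get_default_graph() before the Keras model is loaded. A commonly suggested workaround for the "is not an element of this graph" error (a sketch under TF 1.x graph mode, not a confirmed fix for this exact script) is to make that graph current again around every Keras prediction:
# inside the detection loop, when classifying a crop
with graph.as_default():
    classes = model.predict_classes(image, batch_size=10)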
I have a problem with a TF dataset generator. I do not know why, but when I get a picture from the dataset by running it through a session, it returns tensors where the colors are inverted. I tried changing BGR to RGB, but this is not the problem.
It is partially solved by inverting the image array (img = 1 - img), but I would prefer this problem not to occur in the first place. Does anybody know what the cause could be?
import os
import glob
import random
import tensorflow as tf
from tensorflow import Tensor
class PairGenerator(object):
    person1 = 'img'
    person2 = 'person2'
    label = 'same_person'

    #def __init__(self, lfw_path='./tf_dataset/resources' + os.path.sep + 'lfw'):
    def __init__(self, lfw_path='/home/tom/Devel/ai-dev/tensorflow-triplet-loss/data/augmentor'):
        self.all_people = self.generate_all_people_dict(lfw_path)
        print(self.all_people.keys())

    def generate_all_people_dict(self, lfw_path):
        # generates a dictionary between a person and all the photos of that person
        all_people = {}
        for person_folder in os.listdir(lfw_path):
            person_photos = glob.glob(lfw_path + os.path.sep + person_folder + os.path.sep + '*.jpg')
            all_people[person_folder] = person_photos
        return all_people

    def get_next_pair(self):
        all_people_names = list(self.all_people.keys())
        while True:
            # draw a person at random
            person1 = random.choice(all_people_names)
            # flip a coin to decide whether we fetch a photo of the same person vs different person
            same_person = random.random() > 0.5
            if same_person:
                person2 = person1
            else:
                # repeatedly pick random names until we find a different name
                person2 = person1
                while person2 == person1:
                    person2 = random.choice(all_people_names)
            person1_photo = random.choice(self.all_people[person1])
            yield ({self.person1: person1_photo,
                    self.label: same_person})
class Inputs(object):
    def __init__(self, img: Tensor, label: Tensor):
        self.img = img
        self.label = label

    def feed_input(self, input_img, input_label=None):
        # feed the input images that are necessary to make a prediction
        feed_dict = {self.img: input_img}
        # optionally also include the label:
        # if we're just making a prediction without calculating loss, that won't be necessary
        if input_label is not None:
            feed_dict[self.label] = input_label
        return feed_dict
class Dataset(object):
    img_resized = 'img_resized'
    label = 'same_person'

    def __init__(self, generator=PairGenerator()):
        self.next_element = self.build_iterator(generator)

    def build_iterator(self, pair_gen: PairGenerator):
        batch_size = 10
        prefetch_batch_buffer = 5
        dataset = tf.data.Dataset.from_generator(pair_gen.get_next_pair,
                                                 output_types={PairGenerator.person1: tf.string,
                                                               PairGenerator.label: tf.bool})
        dataset = dataset.map(self._read_image_and_resize)
        dataset = dataset.batch(batch_size)
        dataset = dataset.prefetch(prefetch_batch_buffer)
        iter = dataset.make_one_shot_iterator()
        element = iter.get_next()
        return Inputs(element[self.img_resized],
                      element[PairGenerator.label])

    def _read_image_and_resize(self, pair_element):
        target_size = [224, 224]
        # read images from disk
        img_file = tf.read_file(pair_element[PairGenerator.person1])
        print("////")
        print(PairGenerator.person1)
        img = tf.image.decode_image(img_file, channels=3)
        # let tensorflow know that the loaded images have unknown dimensions, and 3 color channels (rgb)
        img.set_shape([None, None, 3])
        # resize to model input size
        img_resized = tf.image.resize_images(img, target_size)
        #img_resized = tf.image.flip_up_down(img_resized)
        #img_resized = tf.image.rot90(img_resized)
        pair_element[self.img_resized] = img_resized
        pair_element[self.label] = tf.cast(pair_element[PairGenerator.label], tf.float32)
        return pair_element
generator = PairGenerator()
iter = generator.get_next_pair()
for i in range(10):
    print(next(iter))
ds = Dataset(generator)
import matplotlib.pyplot as plt
imgplot = plt.imshow(out)
imgplot = plt.imshow(1 - out)
OK, so the solution was
imgplot = plt.imshow(out/255)
tf.image.resize_images returns float32 values that still range from 0 to 255, while plt.imshow expects float RGB images to lie in the 0-1 range, so the array has to be rescaled before plotting.
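An equivalent fix (a small sketch; it assumes numpy is imported as np) is to cast the array back to uint8 so that matplotlib uses the 0-255 interpretation:
imgplot = plt.imshow(out.astype(np.uint8))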
I have set up Google's DeepLab V3 demo on my local system and it runs successfully after some minor changes. Here it is:
# -*- coding: utf-8 -*-
# DeepLab Demo
# This demo will demostrate the steps to run deeplab semantic segmentation model on sample input images.
import os
from io import BytesIO
import tarfile
import tempfile
from six.moves import urllib
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import tensorflow as tf
class DeepLabModel(object):
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = 'ImageTensor:0'
    OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
    INPUT_SIZE = 513
    FROZEN_GRAPH_NAME = 'frozen_inference_graph'

    def __init__(self, tarball_path):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None
        # Extract frozen graph from tar archive.
        tar_file = tarfile.open(tarball_path)
        for tar_info in tar_file.getmembers():
            if self.FROZEN_GRAPH_NAME in os.path.basename(tar_info.name):
                file_handle = tar_file.extractfile(tar_info)
                graph_def = tf.GraphDef.FromString(file_handle.read())
                break
        tar_file.close()
        if graph_def is None:
            raise RuntimeError('Cannot find inference graph in tar archive.')
        with self.graph.as_default():
            tf.import_graph_def(graph_def, name='')
        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            resized_image: RGB image resized from original input image.
            seg_map: Segmentation map of `resized_image`.
        """
        width, height = image.size
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        target_size = (int(resize_ratio * width), int(resize_ratio * height))
        resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
        seg_map = batch_seg_map[0]
        return resized_image, seg_map
def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
        A Colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)
    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
    return colormap
def label_to_color_image(label):
    """Adds color defined by the dataset colormap to the label.

    Args:
        label: A 2D array with integer type, storing the segmentation label.

    Returns:
        result: A 2D array with floating type. The element of the array
            is the color indexed by the corresponding element in the input label
            to the PASCAL color map.

    Raises:
        ValueError: If label is not of rank 2 or its value is larger than color
            map maximum entry.
    """
    if label.ndim != 2:
        raise ValueError('Expect 2-D input label')
    colormap = create_pascal_label_colormap()
    if np.max(label) >= len(colormap):
        raise ValueError('label value too large.')
    return colormap[label]
def vis_segmentation(image, seg_map):
    """Visualizes input image, segmentation map and overlay view."""
    plt.figure(figsize=(15, 5))
    grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])

    plt.subplot(grid_spec[0])
    plt.imshow(image)
    plt.axis('off')
    plt.title('input image')

    plt.subplot(grid_spec[1])
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    plt.imshow(seg_image)
    plt.axis('off')
    plt.title('segmentation map')

    plt.subplot(grid_spec[2])
    plt.imshow(image)
    plt.imshow(seg_image, alpha=0.7)
    plt.axis('off')
    plt.title('segmentation overlay')

    unique_labels = np.unique(seg_map)
    ax = plt.subplot(grid_spec[3])
    plt.imshow(
        FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation='nearest')
    ax.yaxis.tick_right()
    plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
    plt.xticks([], [])
    ax.tick_params(width=0.0)
    plt.grid('off')
    plt.show()
LABEL_NAMES = np.asarray([
'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
])
FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)
# #title Select and download models {display-mode: "form"}
MODEL_NAME = 'mobilenetv2_coco_voctrainaug' # #param ['mobilenetv2_coco_voctrainaug', 'mobilenetv2_coco_voctrainval', 'xception_coco_voctrainaug', 'xception_coco_voctrainval']
_DOWNLOAD_URL_PREFIX = 'http://download.tensorflow.org/models/'
_MODEL_URLS = {
'mobilenetv2_coco_voctrainaug':
'deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz',
'mobilenetv2_coco_voctrainval':
'deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz',
'xception_coco_voctrainaug':
'deeplabv3_pascal_train_aug_2018_01_04.tar.gz',
'xception_coco_voctrainval':
'deeplabv3_pascal_trainval_2018_01_04.tar.gz',
}
_TARBALL_NAME = 'deeplab_model.tar.gz'
model_dir = tempfile.mkdtemp()
tf.gfile.MakeDirs(model_dir)
download_path = os.path.join(model_dir, _TARBALL_NAME)
print('downloading model, this might take a while...')
urllib.request.urlretrieve(_DOWNLOAD_URL_PREFIX + _MODEL_URLS[MODEL_NAME],
download_path)
print('download completed! loading DeepLab model...')
MODEL = DeepLabModel(download_path)
print('model loaded successfully!')
# """## Run on sample images
#
# Select one of sample images (leave `IMAGE_URL` empty) or feed any internet image
# url for inference.
#
# Note that we are using single scale inference in the demo for fast computation,
# so the results may slightly differ from the visualizations in
# [README](https://github.com/tensorflow/models/blob/master/research/deeplab/README.md),
# which uses multi-scale and left-right flipped inputs.
# """
# #title Run on sample images {display-mode: "form"}
SAMPLE_IMAGE = 'image1.jpg' # #param ['image1', 'image2', 'image3']
IMAGE_URL = 'https://raw.githubusercontent.com/tensorflow/models/master/research/deeplab/g3doc/img/image1.jpg' ##param {type:"string"}
_SAMPLE_URL = ('https://github.com/tensorflow/models/blob/master/research/'
'deeplab/g3doc/img/%s.jpg?raw=true')
def run_visualization(url):
    """Inferences DeepLab model and visualizes result."""
    try:
        # f = urllib.request.urlopen(url)
        # jpeg_str = f.read()
        # original_im = Image.open(BytesIO(jpeg_str))
        original_im = Image.open("human.jpg")
    except IOError:
        print('Cannot retrieve image. Please check url: ' + url)
        return
    print('running deeplab on image %s...' % url)
    resized_im, seg_map = MODEL.run(original_im)
    vis_segmentation(resized_im, seg_map)
image_url = SAMPLE_IMAGE
run_visualization(SAMPLE_IMAGE)
I have used various images with this model and it is working. Now I need to extract the mask as a separate image. How can I achieve that?
Thanks in advance!
The seg_map returned by
resized_im, seg_map = MODEL.run(original_im)
holds the segmented image: it is a 2-D array with one class label per pixel of the resized image. You can treat it as a NumPy array (np.array(seg_map)) or use it in whatever way you like.
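For example, to write the mask out as a separate image file (a sketch that reuses label_to_color_image from the demo code above; the output file names are just placeholders):
import numpy as np
from PIL import Image

resized_im, seg_map = MODEL.run(original_im)

# raw label mask: one class id per pixel
Image.fromarray(seg_map.astype(np.uint8)).save("mask_labels.png")

# colour-coded mask using the demo's PASCAL colormap
color_mask = label_to_color_image(seg_map).astype(np.uint8)
Image.fromarray(color_mask).save("mask_color.png")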
In the TensorFlow Object Detection API I am using the ssd_mobilenet_v1_coco_2017_11_17 model to detect 90 general objects, and I want to keep using it for detection.
Next, I have trained the faster_rcnn_inception_v2_coco_2018_01_28 model to detect a custom object. I wish to use both in the same code, so that I can detect those 90 objects as well as my newly trained custom object. How can I achieve this with a single script?
I have achieved this with the following code in detect_object.py:
import numpy as np
import tensorflow as tf
import sys
from PIL import Image
import cv2
from utils import label_map_util
from utils import visualization_utils as vis_util
# ------------------ Knife Model Initialization ------------------------------ #
knife_label_map = label_map_util.load_labelmap('training/labelmap.pbtxt')
knife_categories = label_map_util.convert_label_map_to_categories(
knife_label_map, max_num_classes=1, use_display_name=True)
knife_category_index = label_map_util.create_category_index(knife_categories)
knife_detection_graph = tf.Graph()
with knife_detection_graph.as_default():
    knife_od_graph_def = tf.GraphDef()
    with tf.gfile.GFile('inference_graph_3/frozen_inference_graph.pb', 'rb') as fid:
        knife_serialized_graph = fid.read()
        knife_od_graph_def.ParseFromString(knife_serialized_graph)
        tf.import_graph_def(knife_od_graph_def, name='')
knife_session = tf.Session(graph=knife_detection_graph)
knife_image_tensor = knife_detection_graph.get_tensor_by_name('image_tensor:0')
knife_detection_boxes = knife_detection_graph.get_tensor_by_name(
'detection_boxes:0')
knife_detection_scores = knife_detection_graph.get_tensor_by_name(
'detection_scores:0')
knife_detection_classes = knife_detection_graph.get_tensor_by_name(
'detection_classes:0')
knife_num_detections = knife_detection_graph.get_tensor_by_name(
'num_detections:0')
# ---------------------------------------------------------------------------- #
# ------------------ General Model Initialization ---------------------------- #
general_label_map = label_map_util.load_labelmap('data/mscoco_label_map.pbtxt')
general_categories = label_map_util.convert_label_map_to_categories(
general_label_map, max_num_classes=90, use_display_name=True)
general_category_index = label_map_util.create_category_index(
general_categories)
general_detection_graph = tf.Graph()
with general_detection_graph.as_default():
    general_od_graph_def = tf.GraphDef()
    with tf.gfile.GFile('ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb', 'rb') as fid:
        general_serialized_graph = fid.read()
        general_od_graph_def.ParseFromString(general_serialized_graph)
        tf.import_graph_def(general_od_graph_def, name='')
general_session = tf.Session(graph=general_detection_graph)
general_image_tensor = general_detection_graph.get_tensor_by_name(
'image_tensor:0')
general_detection_boxes = general_detection_graph.get_tensor_by_name(
'detection_boxes:0')
general_detection_scores = general_detection_graph.get_tensor_by_name(
'detection_scores:0')
general_detection_classes = general_detection_graph.get_tensor_by_name(
'detection_classes:0')
general_num_detections = general_detection_graph.get_tensor_by_name(
'num_detections:0')
# ---------------------------------------------------------------------------- #
def knife(image_path):
    try:
        image = cv2.imread(image_path)
        image_expanded = np.expand_dims(image, axis=0)
        (boxes, scores, classes, num) = knife_session.run(
            [knife_detection_boxes, knife_detection_scores,
             knife_detection_classes, knife_num_detections],
            feed_dict={knife_image_tensor: image_expanded})
        classes = np.squeeze(classes).astype(np.int32)
        scores = np.squeeze(scores)
        boxes = np.squeeze(boxes)

        for c in range(0, len(classes)):
            class_name = knife_category_index[classes[c]]['name']
            if class_name == 'knife' and scores[c] > .80:
                confidence = scores[c] * 100
                break
            else:
                confidence = 0.00
    except:
        print("Error occurred in knife detection")
        confidence = 0.0  # Some error has occurred
    return confidence
def general(image_path):
    try:
        image = cv2.imread(image_path)
        image_expanded = np.expand_dims(image, axis=0)
        (boxes, scores, classes, num) = general_session.run(
            [general_detection_boxes, general_detection_scores,
             general_detection_classes, general_num_detections],
            feed_dict={general_image_tensor: image_expanded})
        classes = np.squeeze(classes).astype(np.int32)
        scores = np.squeeze(scores)
        boxes = np.squeeze(boxes)

        object_name = []
        object_score = []
        for c in range(0, len(classes)):
            class_name = general_category_index[classes[c]]['name']
            if scores[c] > .30:  # If confidence level is good enough
                object_name.append(class_name)
                object_score.append(str(scores[c] * 100)[:5])
    except:
        print("Error occurred in general detection")
        object_name = ['']
        object_score = ['']
    return object_name, object_score


if __name__ == '__main__':
    print(' in main')
I can do
import detect_object
detect_object.knife("image.jpg") # to detect whether knife is present in image(this is custom trained model)
detect_object.general("image.jpg") # to detect those 90 objects from TF API
I know there is a knife class in the TF API model, but it is not accurate enough, so I retrained it for knife only. In the end I have two models:
1. The first model detects only the knife,
2. The second model detects the general objects as usual.
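The two sets of results can then be merged by the caller, for example (a sketch that only uses the two functions above; the 80 threshold mirrors the one inside knife()):
knife_confidence = detect_object.knife("image.jpg")
object_name, object_score = detect_object.general("image.jpg")
if knife_confidence > 80:
    object_name.append('knife')
    object_score.append(str(knife_confidence)[:5])
print(object_name, object_score)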
You can't combine the two models into one. Have two sections of code, each of which loads one model at a time and identifies whatever it can see in the image.
The other option is to retrain a single model that can identify all the objects you are interested in.