I am using TensorFlow's ImageNet-trained model to extract the last pooling layer's features as representation vectors for a new dataset of images.
The model as is predicts on a new image as follows:
python classify_image.py --image_file new_image.jpeg
I edited the main function so that it can take a folder of images, run the prediction on all of them at once, and write the feature vectors to a CSV file. Here is how I did that:
def main(_):
    maybe_download_and_extract()
    # image = (FLAGS.image_file if FLAGS.image_file else
    #          os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
    # Edited to take a directory of image files instead of a single file
    if FLAGS.data_folder:
        images_folder = FLAGS.data_folder
        list_of_images = os.listdir(images_folder)
    else:
        raise ValueError("Please specify image folder")

    with open("feature_data.csv", "wb") as f:
        feature_writer = csv.writer(f, delimiter='|')
        for image in list_of_images:
            print(image)
            current_features = run_inference_on_image(images_folder + "/" + image)
            feature_writer.writerow([image] + current_features)
It worked just fine for around 21 images but then crashed with the following error:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1912, in as_graph_def
raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.
I thought that by calling run_inference_on_image(images_folder+"/"+image) the previous image data would be overwritten and only the new image data would be considered, which doesn't seem to be the case. How can I resolve this issue?
The problem here is that each call to run_inference_on_image() adds nodes to the same graph, which eventually exceeds the maximum size. There are at least two ways to fix this:
The easy but slow way is to use a different default graph for each call to run_inference_on_image():
for image in list_of_images:
    # ...
    with tf.Graph().as_default():
        current_features = run_inference_on_image(images_folder + "/" + image)
    # ...
The more involved but more efficient way is to modify run_inference_on_image() to run on multiple images: relocate your for loop so that it surrounds the sess.run() call inside that function. You will then no longer reconstruct the entire model on each call, which should make processing each image much faster.
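For example, a rough sketch of that restructuring (a hypothetical run_inference_on_images() that reuses create_graph() and the 'DecodeJpeg/contents:0' input from classify_image.py; 'pool_3:0' is the usual name of the last pooling layer in the Inception-v3 graph, so adjust it if your graph differs):

# Sketch only, not a drop-in replacement: build the graph and open the
# session once, then run inference for every image inside the same session.
def run_inference_on_images(image_paths):
    create_graph()  # build the model graph exactly once
    features = []
    with tf.Session() as sess:
        pool_tensor = sess.graph.get_tensor_by_name('pool_3:0')
        for image_path in image_paths:
            image_data = tf.gfile.FastGFile(image_path, 'rb').read()
            pool_values = sess.run(pool_tensor,
                                   {'DecodeJpeg/contents:0': image_data})
            features.append(np.squeeze(pool_values).tolist())
    return features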
You can move the create_graph() call to somewhere before the loop for image in list_of_images: (which loops over files). That way inference is performed multiple times on the same graph. The simplest way is to put create_graph() at the start of the main function; then the graph is created only once.
A good explanation of why such errors occur is given here. I encountered the same error while using the tf dataset API, and came to understand that when the data is iterated over in the session, it gets appended to the existing graph. So I used tf.reset_default_graph() before building the dataset iterator to make sure the previous graph is cleared away.
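For example, a minimal sketch of where the reset might go in a TF 1.x input pipeline (filenames here is a hypothetical list of file paths standing in for the original dataset definition):

# Clear anything built by previous runs before constructing a new input
# pipeline, so nodes do not keep accumulating in the default graph.
tf.reset_default_graph()

dataset = tf.data.Dataset.from_tensor_slices(filenames)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()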
Hope this helps for such a scenario.
I am trying to read a large number (54K) of 512x512x3 .png images into an array to create a dataset and feed it to a Keras model. I am using the code below, but I am getting a cv2 OutOfMemory error (at around image 50K...) pointing to the fourth line of my code. I have been reading a bit about it: I am using the 64-bit version, and the images cannot be resized as they are a fixed input representation. Is there anything that can be done from a memory-management side of things to make it work?
# Images (512x512x3)
X_data = []
files = glob.glob('C:\\Users\\77901677\\Projects\\images1\\*.png')
for myFile in files:
    image = cv2.imread(myFile)
    X_data.append(image)
dataset_image = np.array(X_data)

# Annotations (multilabel) 512x512x2
Y_data = []
files = glob.glob('C:\\Users\\77901677\\Projects\\annotations1\\*.png')
for myFile in files:
    mask = cv2.imread(myFile)
    # Gets rid of first channel which is empty
    mask = mask[:, :, 1:]
    Y_data.append(mask)
dataset_mask = np.array(Y_data)
Any ideas or advice are welcome.
You can reduce the memory usage by cutting out one of your variables, because at the moment you hold the array twice.
You could use yield for this, creating a generator that loads your files one at a time instead of storing them all in an auxiliary variable.
def myGenerator():
    files = glob.glob('C:\\Users\\77901677\\Projects\\annotations1\\*.png')
    for myFile in files:
        mask = cv2.imread(myFile)
        # Gets rid of first channel which is empty
        yield mask[:, :, 1:]

# initialise your numpy array here (N = number of files, H x W x C = 512 x 512 x 2)
yData = np.zeros((N, H, W, C))

# initialise the generator
mygenerator = myGenerator()  # create a generator
for i, data in enumerate(mygenerator):
    yData[i, ...] = data  # load the data
But this is not optimal for you. If you plan to train a model in the next step, you will run into memory issues for sure. In Keras you can additionally implement a Keras Sequence generator, which loads your files in batches (similarly to this yield generator) and feeds them to your model during training. I recommend this article here, which demonstrates an easy implementation of it; it's what I use for my Keras/TF model pipelines.
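For illustration, a minimal Sequence sketch (the class name MaskSequence, the batch_size value, and the assumption that image and mask files come in matching sorted lists are all placeholders for your own setup):

import math

import cv2
import numpy as np
from tensorflow.keras.utils import Sequence

class MaskSequence(Sequence):
    """Loads image/mask file pairs lazily, one batch at a time."""

    def __init__(self, image_files, mask_files, batch_size=8):
        self.image_files = image_files
        self.mask_files = mask_files
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.image_files) / self.batch_size)

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        images = [cv2.imread(f) for f in self.image_files[lo:hi]]
        masks = [cv2.imread(f)[:, :, 1:] for f in self.mask_files[lo:hi]]
        return np.array(images), np.array(masks)

# usage: model.fit(MaskSequence(image_files, mask_files), epochs=10)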
It's good practice to use generators when feeding our models large amounts of data.
The following code is part of my tf graph for reading images. When I use this code to iterate through the data, the program gets stuck in tf.io.read_file(path) forever after a few hundred images and doesn't do anything. More specifically, the code can't even be paused, and I had to restart the session every time.
#tf.function()
def read_image(path):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image)
    return image
...
div8k_list=[os.path.join(div8k_save_path, x) for x in os.listdir(div8k_save_path)]
train_path = tf.data.Dataset.from_tensor_slices(div8k_list)
train_images = train_path.map(read_image, num_parallel_calls=tf.data.AUTOTUNE)
I first suspected that there were a few corrupted images or wrong paths in the data that were causing this problem and tested the following code.
for path in train_path:
    print(path)
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image)
Surprisingly, there was no common characteristic among the image paths where the loop got stuck. And it was not a problem with the image itself: the loop once got stuck at 1056.png, but when I loaded 1056.png explicitly there was no problem.
What could be the cause of this problem?
edit: to summarize, the program is stuck at read_image forever, while I couldn't find a problem in the dataset.
My dataset is the DIV8K dataset and I am running in COLAB.
EDIT: The function that is slowing down my code is decode_jpeg, because the following definition of read_image (without the decode step) worked multiple times.

#tf.function()
def read_image(path):
    image = tf.io.read_file(path)
    return image
As mentioned in the comment, try the following function to decode the image file, as it can handle mixed file extensions (jpg, png etc.), ref.
tf.io.decode_image(image, expand_animations = False)
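For example, a minimal sketch of the mapping function with decode_image swapped in (train_path and the rest of the pipeline are taken from the question; only the decode call changes):

def read_image(path):
    image = tf.io.read_file(path)
    # decode_image dispatches on the actual file contents (JPEG, PNG, BMP, GIF);
    # expand_animations=False keeps the result a 3-D image tensor.
    image = tf.io.decode_image(image, expand_animations=False)
    return image

train_images = train_path.map(read_image, num_parallel_calls=tf.data.AUTOTUNE)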
However, decode_jpeg should also be able to handle the .png file format now. Without the image file, it's hard to break down what's causing the hang. Most probably the file is somehow corrupted, or it is not a valid file for decode_jpeg even though it is named that way; check this solution.
I wrote a Keras/TensorFlow callback that writes a confusion matrix to the Images tab in TensorBoard. This worked fine in TF 2.1. Unfortunately I had to convert it to TF 1.14, as other packages depend on this version.
Everything works fine (more or less), except for the TensorBoard reports.
The Problem
As you can see in the screenshot below, there are many categories (tags? channels? I am not sure about the terminology) instead of only one.
screenshot: multiple categories for images
Seemingly correlated to this issue, training scalars like "val_loss" only plot the first data point and nothing afterwards. See the second screenshot.
screenshot: scalars showing one data point
Also, Tensorboard is printing the following error:
File <path/to/event-file> updated even though the current file is <path/to/new-event-file>
So I assume somehow my TF FileWriters are disagreeing on where to write.
Regarding the Confusion Matrix Callback: The writing function looks like this:
def _figure_to_summary(self, fig, step):
    # attach a new canvas if not exists
    if fig.canvas is None:
        matplotlib.backends.backend_agg.FigureCanvasAgg(fig)

    fig.canvas.draw()
    w, h = fig.canvas.get_width_height()
    img = np.fromstring(fig.canvas.tostring_rgb(), dtype=np.uint8, sep='')
    img = img.reshape((1, h, w, 3))

    with K.get_session().as_default():
        tensor = tf.constant(img)
        image = tf.summary.image(self.title, tensor).eval()
        self._summary_writer.add_summary(image, global_step=step)
        self._summary_writer.flush()
Here fig is the matplotlib figure of the confusion matrix and step is an int, which should enable TensorBoard to add the little slider over the image to show the history of the matrix.
The model is trained as follows:
run_id = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
tb_path = os.path.join("tensor/", run_id)
tb_reporter = TensorBoard(tb_path)
summ_wr = FileWriter(tb_path)
conf_matr = ConfusionMatrix(va, TARGET_CLASSES.keys(), summ_wr, normalize=True)
cb_list = [tb_reporter, conf_matr]
model.fit(tr, epochs=500, validation_data=va, callbacks=cb_list)
Where summ_wr becomes self._summary_writer of the callback.
What I tried
I only tried changing the way the summary is written: tf.merge_all(), and opening, closing and reopening the FileWriter in various combinations, but nothing changed. When I deactivate the custom callback, the TensorBoard callback works as expected.
My Questions
How can I write Image Data into the same category every time?
Will the image get the scrollbar?
How can I fix the problem, that the Keras Tensorboard callback does not show its data?
I assume all the problems are related and solving one results in solving all of them, but I am completely stumped on how to do it.
I am thankful for any suggestions :)
Edit:
I just found this question: Tensorboard Image Summaries
This seems to solve the problem; unfortunately the answer does not have a lot of context, so I don't know how to integrate the solution into the code.
The solution I found consists of two parts.
First: I took the FileWriter from the Keras TensorBoard callback; its writer attribute contains the FileWriter after set_model has been called.
Second: The Summary must be created via its constructor and be passed the tag keyword. This forces the images into the same group, and the writers no longer interfere with each other.
I found the second part of the solution here: TensorBoard: How to write images to get a steps slider?
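A sketch of what the second step could look like in TF 1.x (assuming the figure rendering from the original _figure_to_summary, with the canvas-attach check omitted for brevity, and PIL/Pillow available for PNG encoding):

import io

import numpy as np
import tensorflow as tf
from PIL import Image

def _figure_to_summary(self, fig, step):
    fig.canvas.draw()
    w, h = fig.canvas.get_width_height()
    img = np.fromstring(fig.canvas.tostring_rgb(), dtype=np.uint8, sep='').reshape((h, w, 3))

    # Encode the pixels as PNG bytes for the Summary.Image proto.
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format='PNG')

    # Building the Summary proto directly with a fixed tag keeps every step
    # under one entry in the Images tab, so TensorBoard shows the step slider.
    summary = tf.Summary(value=[tf.Summary.Value(
        tag=self.title,
        image=tf.Summary.Image(height=h, width=w, colorspace=3,
                               encoded_image_string=buf.getvalue()))])
    self._summary_writer.add_summary(summary, global_step=step)
    self._summary_writer.flush()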
I am pretty new to Keras. I am trying to train a model using ImageDataGenerator. I have a very large number of training images saved in .npy format. I wanted to use flow_from_directory(), so I stored the images as recommended in the documentation (one folder per class). The problem is that this only works for png, jpeg, tiff, etc., but won't work with my .npy files.
Is there any way I could use this function or something similar that gives me all the augmentation possibilities that ImageDataGenerator gives?
Thank you very much, any help is appreciated
Yes, it's possible if you are willing to adapt the source code of the ImageDataGenerator (which is actually quite straightforward to read and understand). Looking at the keras-preprocessing GitHub repository, I think it would suffice to replace the load_img call in the DirectoryIterator class with your own load_array function that reads .npy files from disk instead of images:
...
# build batch of image data
for i, j in enumerate(index_array):
    fname = self.filenames[j]
    ## Replace the code below with your own function
    img = load_img(os.path.join(self.directory, fname),
                   color_mode=self.color_mode,
                   target_size=self.target_size,
                   interpolation=self.interpolation)
    x = img_to_array(img, data_format=self.data_format)
...
So minimally, you would make the following change to that line:
...
# build batch of image data
for i, j in enumerate(index_array):
    fname = self.filenames[j]
    img = np.load(os.path.join(self.directory, fname))
...
But probably you will want to implement some of the additional logic that Keras' load_img utility function has, like color mode and target size handling, and wrap everything in your own load_array function, as sketched below.
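As an illustration only, a hypothetical load_array wrapper (the handling of color_mode and target_size here is deliberately minimal and is not taken from Keras):

import numpy as np

def load_array(path, color_mode='rgb', target_size=None):
    """Minimal stand-in for load_img that reads an HxWxC .npy array."""
    arr = np.load(path)
    if color_mode == 'grayscale' and arr.ndim == 3 and arr.shape[-1] > 1:
        # crude channel reduction; replace with whatever your data needs
        arr = arr.mean(axis=-1, keepdims=True)
    if target_size is not None and arr.shape[:2] != tuple(target_size):
        raise ValueError('resizing .npy arrays is not handled in this sketch')
    return arr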
Hoping that somebody can help out with a TensorFlow query. It's not a difficult one, I'm sure. I am just somewhat lacking in knowledge relating to TensorFlow and NumPy.
Without any prior experience of TensorFlow I have implemented Python code from a tutorial for doing image classification. This works. Once trained, it can tell the difference between a cat and a dog.
This is currently hard-wired for a single image. I would like to be able to classify multiple images (i.e. the contents of a folder), and do this efficiently. What I have done so far in an effort to achieve this is to simply add a loop around everything, so it runs all the code for each image. However, timing the operation shows that classification of each successive image takes longer than the previous one. Therefore there is some kind of incremental overhead. Some operation is taking more time with every loop. I cannot immediately see what this is.
There are two options to improve this. Either:
(1) Leave the loop largely as it is and prevent this slowdown, or
(2) (Preferable IMHO, if it is possible) Pass a list of images to TensorFlow for classification, and get back a list of results. This seems more efficient.
This is the code:
import tensorflow as tf
import numpy as np
import os, glob, cv2
import sys, argparse
import time

try:
    inputdir = [redacted - insert input dir here]
    for f in os.listdir(inputdir):
        start_time = time.time()
        filename = os.path.join(inputdir, f)
        image_size = 128
        num_channels = 3
        images = []
        image = cv2.imread(filename)  # read image using OpenCV
        # Resize image to desired size and preprocess exactly as done during training...
        image = cv2.resize(image, (image_size, image_size), 0, 0, cv2.INTER_LINEAR)
        images.append(image)
        images = np.array(images, dtype=np.uint8)
        images = images.astype('float32')
        images = np.multiply(images, 1.0/255.0)
        # The input to the network is of shape [None image_size image_size num_channels]. Hence we reshape.
        x_batch = images.reshape(1, image_size, image_size, num_channels)

        sess = tf.Session()  # restore the saved model
        saver = tf.train.import_meta_graph('dogs-cats-model.meta')  # Step 1: Recreate the network graph. At this step only graph is created
        saver.restore(sess, tf.train.latest_checkpoint('./'))  # Step 2: Load the weights saved using the restore method
        graph = tf.get_default_graph()  # access the default graph which we have restored

        # Now get hold of the op that we can be processed to get the output.
        # In the original network y_pred is the tensor that is the prediction of the network
        y_pred = graph.get_tensor_by_name("y_pred:0")

        ## Feed the images to the input placeholders...
        x = graph.get_tensor_by_name("x:0")
        y_true = graph.get_tensor_by_name("y_true:0")
        y_test_images = np.zeros((1, 2))

        # Create the feed_dict that is required to be fed to calculate y_pred...
        feed_dict_testing = {x: x_batch, y_true: y_test_images}
        result = sess.run(y_pred, feed_dict=feed_dict_testing)
        # Note: result is a numpy.ndarray
        print(f + '\t' + str(result) + ' ' + '%.2f' % (time.time()-start_time) + ' seconds')
        # next image

except:
    import traceback
    tb = traceback.format_exc()
    print(tb)

finally:
    input()  # keep window open until key is pressed
What I tried in order to modify the above was to build up a list of images inside the loop using...
images.append(image)
...and then take the rest of the code out of the loop. However, this didn't work. It resulted in the following error:
ValueError: cannot reshape array of size 294912 into shape (1,128,128,3)
At this line:
x_batch = images.reshape(1, image_size,image_size,num_channels)
Apparently this reshape call doesn't work (as implemented, at least) on a list of images.
So my questions are:
What would cause the steady increase in image classification time that I have seen as images are iterated?
Can I perform classification on multiple images in one go, rather than one-by-one in a loop?
Thanks in advance.
Your issues:
1 a) The main reason it is so slow is that you are re-creating the graph for every image.
1 b) The incremental overhead comes from creating a new session every time without destroying the old session. The with syntax helps with that, e.g.:
with tf.Session(graph=tf.Graph()) as session:
    # do something with the session
But that won't be a noticeable issue after addressing a).
When thinking about the problem, one might realise which parts of the code depend on the image and which don't. The TensorFlow-related part that differs per image is the call to session.run, feeding in the image. Everything else can be moved out of the loop.
2) You can also classify multiple images in one go. The first dimension of x_batch is the batch size; you are currently specifying one. But you may exhaust your memory resources trying to do that for a very large number of images.
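For example, a rough sketch of that restructuring (the tensor names, checkpoint paths, image_size and num_channels are taken from the question; inputdir is a placeholder):

import os

import cv2
import numpy as np
import tensorflow as tf

image_size, num_channels = 128, 3
inputdir = 'path/to/images'  # placeholder

# Restore the graph and the weights once, outside the loop.
sess = tf.Session()
saver = tf.train.import_meta_graph('dogs-cats-model.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))
graph = tf.get_default_graph()
y_pred = graph.get_tensor_by_name("y_pred:0")
x = graph.get_tensor_by_name("x:0")
y_true = graph.get_tensor_by_name("y_true:0")
y_test_images = np.zeros((1, 2))

for f in os.listdir(inputdir):
    image = cv2.imread(os.path.join(inputdir, f))
    image = cv2.resize(image, (image_size, image_size), 0, 0, cv2.INTER_LINEAR)
    x_batch = image.astype('float32').reshape(1, image_size, image_size, num_channels) / 255.0
    # Only this sess.run call depends on the current image.
    result = sess.run(y_pred, feed_dict={x: x_batch, y_true: y_test_images})
    print(f, result)

sess.close()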