Sorry, this is a long one!
I am 80% sure that the problem is that I don't fully understand how tensorflow uses the tf.train.batch function to queue data.
I am trying to adapt one of the tensorflow tutorials to classify a large number of images.
Tutorial can be found here: https://www.tensorflow.org/tutorials/deep_cnn
I have built some modules which can encode my raw data in the same format that cifar10 uses. I am using these to construct training and evaluation data which the program is able to evaluate to a high degree of accuracy. Accuracy varies depending on the quality of the image sets I put in. To keep things simple I have trained it using 32x32 monochrome tiles of either yellow or blue (category 0 and 1 respectively). Conveniently, the network is able to identify whether it is being given a yellow or blue tile with 100% accuracy.
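For context, each record in that format is one label byte followed by 3072 image bytes (3073 bytes in total). My encoder modules aren't pasted here, but they do something along these lines (a sketch, not my actual code; tiles is assumed to be a uint8 array of shape [N, 32, 32, 3] and labels a list of ints):

# cifar10-style encoding: 1 label byte + 3072 image bytes per record
with open('eval_batch_0.bin', 'wb') as f:
    for tile, label in zip(tiles, labels):
        f.write(bytes([label]))
        # CIFAR-10 binaries are depth-major: all red bytes, then green, then blue
        f.write(tile.transpose(2, 0, 1).tobytes())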
I have also been able to adapt cifar10_eval.py to output predictions rather than an accuracy percentage. This allows me to feed in un-classified data and output predictions as a list. To do this I have exchanged the statement:
top_k_op = tf.nn.in_top_k(logits, labels, 1)
for:
output_2 = tf.argmax(logits, 1)
I have added a variable and a boolean to the eval_once function call to allow it to access the definition for "output_2" and to let me switch between this and "top_k_op" depending on whether I am in evaluation mode or if I am predicting new data.
So far so good. This method works for small amounts of input data but fails as soon as I want to output more than 128 classifications. Not coincidentally 128 is the batch size.
In theory the first item (3073 bytes) in the binary should correspond to the first item in the list which is churned out when I am predicting new data. This happens for inputs of up to 128 images but the data gets jumbled up when I try to categorise more images. Actually, some of the predictions are lost completely!
There are a couple of reasons why this happens. The tutorial isn't designed to care about the order in which data is read or processed, just that individual images correspond with their labels. Originally the data loss was randomised(!) but I have managed to remove the random element by removing multi-threading (threads = 1 rather than 16) and by stopping it from shuffling filenames:
filename_queue = tf.train.string_input_producer(filenames, shuffle=False)
string_input_producer has an optional argument which shuffles the file names. For model evaluation I have set this to False, as above.
However.... I am still stuck with jumbled data loss when evaluating data larger than a single batch.
Does anyone know why this happens and have any ideas about how it could be fixed?
In theory I could redesign the code to rebuild the graph and evaluate it for 128 images at a time. However, I want to classify millions of images and feel that I'd be asking for trouble trying to open a new graph instance per batch.
PS, I've done my homework:
I have verified that my initial data to binary conversion works by running a program which can read cifar10-style files and interpret it as a big tile of images. I have run this code on both the original cifar10 binaries and my own binaries and am able to reconstruct both perfectly.
When I encode uncategorised data I add a category label of zero to make sure the tutorial can read the file. However, I make sure that this label is chucked away at the file reading stage and thus is not used when generating a list of predictions.
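For reference, this is roughly what the file reading stage does (a sketch based on the tutorial's read_cifar10 in cifar10_input.py, not my exact modified code); the dummy zero label is what gets read off the front of each record and then ignored in mapping mode:

label_bytes = 1
image_bytes = 32 * 32 * 3
record_bytes = label_bytes + image_bytes  # 3073 bytes per record

reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
key, value = reader.read(filename_queue)
record = tf.decode_raw(value, tf.uint8)

# First byte: the label (the dummy zero for uncategorised data)
label = tf.cast(tf.strided_slice(record, [0], [label_bytes]), tf.int32)

# Remaining 3072 bytes: the image, depth-major [3, 32, 32]
image = tf.reshape(tf.strided_slice(record, [label_bytes], [record_bytes]),
                   [3, 32, 32])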
I have verified the output predictions by printing the list directly onto the screen as a python output and also by using it to assemble a PNG image which can be compared with the original inputs. This verification works perfectly for small batch sizes and starts to fall apart in larger batch sizes.
I've also made some modifications to the tutorial not discussed in this post. These are simple modifications, such as changing the number of categories to 2 rather than 10. I am confident that this is not the issue.
PPS, here is a copy of some functions from the modified script. I haven't pasted everything because this question is already huge:
from cifar10_eval:
def eval_once(saver, summary_writer, top_k_op, output_2, summary_op, mapping=False):
  """Run Eval once.

  Args:
    saver: Saver.
    summary_writer: Summary writer.
    top_k_op: Top K op.
    output_2: Argmax op used to output raw predictions.
    summary_op: Summary op.
    mapping: bool; if True, return a list of predictions instead of tallying accuracy.
  """
  with tf.Session() as sess:
    ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
      # Restores from checkpoint
      saver.restore(sess, ckpt.model_checkpoint_path)
      # Assuming model_checkpoint_path looks something like:
      #   /my-favorite-path/cifar10_train/model.ckpt-0,
      # extract global_step from it.
      global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
    else:
      print('No checkpoint file found')
      return

    # Start the queue runners.
    coord = tf.train.Coordinator()
    try:
      threads = []
      for qr in tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
        threads.extend(qr.create_threads(sess, coord=coord, daemon=True,
                                         start=True))

      num_iter = int(math.ceil(FLAGS.num_examples / FLAGS.batch_size))
      true_count = 0  # Counts the number of correct predictions.
      total_sample_count = num_iter * FLAGS.batch_size
      step = 0
      output = []
      # In mapping mode generate a map; in default mode (mapping=False)
      # tally predictions instead.
      if mapping:
        while step < num_iter and not coord.should_stop():
          step += 1
          hold = sess.run(output_2)
          print(hold)
          for i in range(len(hold)):
            output.append(hold[i])
        return output
from cifar10_input:
def inputs(mapping, data_dir, batch_size):
  """Construct input for CIFAR evaluation using the Reader ops.

  Args:
    mapping: bool, indicating if one should use the raw or pre-classified eval data set.
    data_dir: Path to the CIFAR-10 data directory.
    batch_size: Number of images per batch.

  Returns:
    images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
    labels: Labels. 1D tensor of [batch_size] size.
  """
  filelist = os.listdir(data_dir)
  filenames = []
  if mapping:
    # from Raw_Image_Processor import file_name
    for f in filelist:
      if f.startswith("raw_batch"):
        filenames.append(os.path.join(data_dir, f))
    num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN
  else:
    for f in filelist:
      if f.startswith("eval_batch"):
        filenames.append(os.path.join(data_dir, f))
    num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_EVAL

  for f in filenames:
    if not tf.gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)

  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames, shuffle=False)

  # Read examples from files in the filename queue.
  read_input = read_cifar10(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)

  height = IMAGE_SIZE
  width = IMAGE_SIZE

  # Image processing for evaluation.
  # Crop the central [height, width] of the image.
  resized_image = tf.image.resize_image_with_crop_or_pad(reshaped_image,
                                                         height, width)

  # Subtract off the mean and divide by the variance of the pixels.
  float_image = tf.image.per_image_standardization(resized_image)

  # Set the shapes of tensors.
  float_image.set_shape([height, width, 3])
  read_input.label.set_shape([1])

  # Ensure that the random shuffling has good mixing properties.
  min_fraction_of_examples_in_queue = 0.4
  min_queue_examples = int(num_examples_per_epoch *
                           min_fraction_of_examples_in_queue)

  # Generate a batch of images and labels by building up a queue of examples.
  return _generate_image_and_label_batch(float_image, read_input.label,
                                         min_queue_examples, batch_size,
                                         shuffle=False)
from cifar10_input:
def _generate_image_and_label_batch(image, label, min_queue_examples,
                                    batch_size, shuffle):
  """Construct a queued batch of images and labels.

  Args:
    image: 3-D Tensor of [height, width, 3] of type.float32.
    label: 1-D Tensor of type.int32
    min_queue_examples: int32, minimum number of samples to retain
      in the queue that provides the batches of examples.
    batch_size: Number of images per batch.
    shuffle: boolean indicating whether to use a shuffling queue.

  Returns:
    images: Images. 4D tensor of [batch_size, height, width, 3] size.
    labels: Labels. 1D tensor of [batch_size] size.
  """
  # Create a queue that shuffles the examples, and then
  # read 'batch_size' images + labels from the example queue.
  num_preprocess_threads = 16
  if shuffle:
    images, label_batch = tf.train.shuffle_batch(
        [image, label],
        batch_size=batch_size,
        num_threads=num_preprocess_threads,
        capacity=min_queue_examples + 3 * batch_size,
        min_after_dequeue=min_queue_examples)
  else:
    images, label_batch = tf.train.batch(
        [image, label],
        batch_size=batch_size,
        num_threads=1,
        capacity=1,
        enqueue_many=False)

  # Display the training images in the visualizer.
  tf.summary.image('images', images)

  return images, tf.reshape(label_batch, [batch_size])
Edit:
A partial solution is given in the comments below. The information loss depends on the batch size, so it turns out that increasing the batch size (in mapping mode only) is an effective fix.
However, I'm still unsure why it loses and/or scrambles information when the batch size is exceeded. Presumably the batches are taken in some non-sequential order. I don't need this to take the project forwards, but if someone could explain how or why this happens it would be greatly appreciated.
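My best guess, and it is only a guess: the filename queue cycles endlessly by default, so whenever num_examples isn't an exact multiple of batch_size the final sess.run still has to fill a whole batch and grabs examples that have wrapped around to the start of the next epoch, which would look like loss and jumbling. If that is the cause, something like this ought to avoid it (an untested sketch):

filename_queue = tf.train.string_input_producer(filenames, shuffle=False,
                                                num_epochs=1)
# note: num_epochs requires running tf.local_variables_initializer()
images, label_batch = tf.train.batch(
    [image, label],
    batch_size=batch_size,
    num_threads=1,
    capacity=3 * batch_size,
    allow_smaller_final_batch=True)  # final batch just holds the leftovers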
Edit 2:
It's back! I've set the batch size to be equivalent to one binary file (in my case roughly 10,000 images). Data is not lost or jumbled within this batch but when I try to process multiple files (about 30) it mixes up the batches a little rather than outputting them on a FIFO basis.
A picture is probably the easiest way for you to see what is going on:
classification map
This is a reconstructed image of a rock face from which the classifier has been trained to recognize three categories. As you can see, the reconstruction is mostly smooth. However, there are two clean breaks near the top of the image where a batch (or three) has been output in non-chronological order. These should have appeared at the bottom of the image rather than near the top.
Related
To train a neural network, I modified some code I found on YouTube. It looks as follows:
def data_generator(samples, batch_size, shuffle_data=True, resize=224):
    num_samples = len(samples)
    while True:
        random.shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch_samples = samples[offset: offset + batch_size]
            X_train = []
            y_train = []
            for batch_sample in batch_samples:
                img_name = batch_sample[0]
                label = batch_sample[1]
                img = cv2.imread(os.path.join(root_dir, img_name))
                # img, label = preprocessing(img, label, new_height=224, new_width=224, num_classes=37)
                img = preprocessing(img, new_height=224, new_width=224)
                label = my_onehot_encoded(label)
                X_train.append(img)
                y_train.append(label)
            X_train = np.array(X_train)
            y_train = np.array(y_train)
            yield X_train, y_train
Now, I tried to train a neural network using this code. The training sample size is 105,000 (image files which contain 8 characters out of 37 possibilities: A-Z, 0-9 and blank space).
I used a relatively small batch size (32; I think that is already too small) to make it more efficient, but nevertheless it took forever to train one quarter of the first epoch (I had 826 steps per epoch, and it took 90 minutes for 199 steps... steps_per_epoch = num_train_samples // batch_size).
The following functions are included in the data generator:
def shuffle_data(data):
    random.shuffle(data)  # shuffles in place; it returns None, so don't assign its result
    return data

I don't think this function can be made any more efficient, nor can it be excluded from the generator.
def preprocessing(img, new_height, new_width):
    # note: cv2.resize expects dsize as (width, height); with a square target this is harmless
    img = cv2.resize(img, (new_height, new_width))
    img = img / 255
    return img

For preprocessing/resizing the data I use this code to get the images to a uniform size of e.g. (224, 224, 3). I think this part of the generator takes the most time, but I don't see a possibility of excluding it from the generator (since my memory would be full if we resized the images outside the batches).
# One-hot encoding of the labels
from numpy import argmax  # only needed if predictions are decoded back to characters

def my_onehot_encoded(label):
    # define universe of possible input values
    characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
    # define a mapping of chars to integers
    char_to_int = dict((c, i) for i, c in enumerate(characters))
    int_to_char = dict((i, c) for i, c in enumerate(characters))  # unused here; handy for decoding
    # integer encode input data
    integer_encoded = [char_to_int[char] for char in label]
    # one-hot encode
    onehot_encoded = list()
    for value in integer_encoded:
        character = [0 for _ in range(len(characters))]
        character[value] = 1
        onehot_encoded.append(character)
    return onehot_encoded
I think in this part there could be one approach to make it more efficient. I am thinking about excluding this code from the generator and producing the array y_train outside of the generator, so that the generator does not have to one-hot encode the labels every time.
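Something like this is what I have in mind (an untested sketch, reusing the names from the code above):

# Encode every label once, outside the generator.
samples = [(img_name, my_onehot_encoded(label)) for img_name, label in samples]

def data_generator(samples, batch_size, resize=224):
    num_samples = len(samples)
    while True:
        random.shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch_samples = samples[offset: offset + batch_size]
            X_train = np.array([preprocessing(cv2.imread(os.path.join(root_dir, name)),
                                              new_height=resize, new_width=resize)
                                for name, _ in batch_samples])
            # Labels are already one-hot encoded; just stack them.
            y_train = np.array([label for _, label in batch_samples])
            yield X_train, y_train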
What do you think? Or should I maybe go for a completely different approach?
I have found your question very intriguing because you give only clues. So here is my investigation.
Using your snippets, I found the GitHub repository and a three-part video tutorial on YouTube that mainly focuses on the benefits of using generator functions in Python.
The data is based on this Kaggle competition (I would recommend checking out the different kernels on that problem to compare the approach you already tried with other CNN networks, and to review the APIs in use).
You do not need to write a data generator from scratch; it is not hard, but reinventing the wheel is not productive.
Keras has the ImageDataGenerator class.
Plus here is a more generic example for DataGenerator.
Tensorflow offers very neat pipelines with their tf.data.Dataset.
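For illustration, a minimal ImageDataGenerator setup looks like this (the directory layout train_dir/<class>/*.png and the compiled model are assumptions, not your data):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255, validation_split=0.1)
train_flow = datagen.flow_from_directory(
    "train_dir",              # hypothetical layout: train_dir/<class>/*.png
    target_size=(224, 224),
    batch_size=32,
    subset="training")
model.fit(train_flow, epochs=10)  # 'model' assumed to be built and compiled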
Nevertheless, to solve the Kaggle task, the model needs to perceive single images only, hence the model is a simple deep CNN. But as I understand it, you are combining 8 random characters (classes) into one image to recognize multiple classes at once. For that task, you need an R-CNN or YOLO as your model. I only recently discovered YOLO v4 for myself, and it is possible to make it work for a specific task really quickly.
General advice about your design and code.
Make sure the library uses the GPU. It saves a lot of time. (Even though I repeated the flowers experiment from the repository quickly on CPU, about 10 minutes, the resulting predictions were no better than a random guess. So full training requires a lot of time on CPU.)
Compare different versions to find the bottleneck. Try a dataset with 48 images (1 per class), increase the number of images per class, and compare. Reduce the image size, change the model structure, etc.
Test brand-new models on small, artificial data to prove the idea, or use an iterative process: start from projects that can be converted to your task (handwriting recognition?).
I am trying to fit a CNN model (AlexNet architecture) on a dataset of 4900 images (480*640*3), and I would like to do data augmentation. I have written a custom generator which uses the ImageDataGenerator method. Because the images and the labels are on different paths, I have written a class which takes all the paths and saves the image paths and their labels in two lists; it then loads batches of 32 images and labels and fits the image data generator:
This is the method of the custom generator that is called when the model is fit, and it is where I apply the ImageDataGenerator:
def __getitem__(self, index):
    batch_x = self.img_filenames[index * self.batch_size : (index + 1) * self.batch_size]
    batch_y = self.labels[index * self.batch_size : (index + 1) * self.batch_size]

    gen = ImageDataGenerator(rescale=1./255,
                             rotation_range=90,
                             brightness_range=(0.1, 0.9),
                             horizontal_flip=True)

    X = [plt.imread(filename) for filename in batch_x]
    X, Y = next(gen.flow(x=np.array(X), y=np.array(batch_y), batch_size=self.batch_size))
    return X, Y
I have some questions:
What is ImageDataGenerator supposed to return? If I pass 32 (batch_size) different images, does it return 32 modified images, one for each input, or 32 images for each input? And if I pass only 1 image with a batch size of 32, does it return 32 modified images from that one? I'm almost sure it is one for each, but I want to confirm.
Secondly, if I want to have 40k images: if I reset the index to 0 when it exceeds samples // batch_size and change the len method by multiplying by 2 or whatever I want, then, since the images are generated randomly, I should get 4900 new images, or as many as I want, shouldn't I?
The main problem is that when the accuracy reaches 0.5 it stops increasing. I have tried with 3 epochs and it is the same: it increases for 3 or 4 batches and then stops, which is why I have these doubts.
Thank you.
Let me try to answer.
1. If you pass batch size 32 to ImageDataGenerator with horizontal_flip=True only, it flips all 32 images horizontally and passes these 32 + 32 (original + flipped) for training.
If you set both horizontal_flip and vertical_flip, then 32 + 32 + 32 images will be passed for training.
For brightness_range it produces one image for each brightness scale corresponding to each original image. That means if your brightness scale is 0.1-0.5, then 32 * 5 images will be produced.
I am not sure about the second question. A better choice is to do more data augmentation on both training and test data.
For the third question, you should try EfficientNet with focal loss.
I printed X.shape and it seems to be the 32 images, but modified, so it doesn't multiply the images. And the method I described to augment the data works fine too.
I would like to train a convolutional recurrent neural network for video frame prediction. The individual frames are quite big so it is challenging to fit the entire training data in memory at once. As such, I followed some tutorials online to create a custom data generator. When testing it, it seems to work but it is slower by a factor of at least 100 than using the pre-loaded data directly. Since I can only fit about a batch size of 8 on the GPU I understand that the data needs to be generated really fast, however, this does not seem to be the case.
I train my model on a single P100 and have 32 GB of memory available to be used by up to 16 cores.
class DataGenerator(tf.keras.utils.Sequence):

    def __init__(self, images, input_images=5, predict_images=5, batch_size=16,
                 image_size=(200, 200), channels=1):
        self.images = images
        self.input_images = input_images
        self.predict_images = predict_images
        self.batch_size = batch_size
        self.image_size = image_size
        self.channels = channels
        self.nr_images = int(len(self.images) - input_images - predict_images)

    def __len__(self):
        return int(np.floor(self.nr_images / self.batch_size))

    def __getitem__(self, item):
        # Randomly select the beginning image of each sequence in the batch
        batch_indices = random.sample(range(0, self.nr_images), self.batch_size)

        # Allocate the output images
        x = np.empty((self.batch_size, self.input_images,
                      *self.image_size, self.channels), dtype='uint8')
        y = np.empty((self.batch_size, self.predict_images,
                      *self.image_size, self.channels), dtype='uint8')

        # Get the lists of input and prediction images
        for i in range(self.batch_size):
            list_images_input = range(batch_indices[i], batch_indices[i] + self.input_images)
            list_images_predict = range(batch_indices[i] + self.input_images,
                                        batch_indices[i] + self.input_images + self.predict_images)

            # Read in the input images
            for j, ID in enumerate(list_images_input):
                x[i, j] = np.reshape(np.load(self.images[ID]),
                                     (*self.image_size, self.channels))

            # Read in the prediction images
            for j, ID in enumerate(list_images_predict):
                y[i, j] = np.reshape(np.load(self.images[ID]),
                                     (*self.image_size, self.channels))

        return x, y
# Training the model using fit_generator
params = {'batch_size': 8,
          'input_images': 5,
          'predict_images': 5,
          'image_size': (100, 100),
          'channels': 1
          }

data_path = "input_frames/"
input_images = sorted(glob.glob(data_path + "*.png"))
training_generator = DataGenerator(input_images, **params)

model.fit_generator(generator=training_generator, epochs=10, workers=6)
I would have expected that Keras would prepare the next data batch while the current batch is being processed on the GPU, but it does not seem to keep up. In other words, preparing the data before sending it to the GPU seems to be the bottleneck.
Any idea on how to improve the performance of a data generator like this? Is there something missing that guarantees that the data is being prepared in a timely manner?
Thanks a lot!
When you use fit_generator, there is a workers= setting that can be used to scale up the number of generator workers. However, you should ensure that the item parameter in __getitem__ is taken into account, so that the different workers (which are not synchronised) return different values depending on the item index. That is, instead of a random sample, perhaps just return a slice of the data based on the index. You can shuffle the entire dataset before starting to make sure the dataset order is randomised.
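A sketch of that idea applied to the generator above (shuffling the window start indices once, up front, rather than sampling randomly per batch; the attribute names are taken from the question's class, and the elided parts stay exactly as they were):

def __init__(self, images, input_images=5, predict_images=5, batch_size=16,
             image_size=(200, 200), channels=1):
    # ... same attribute assignments as in the original __init__ ...
    self.start_indices = list(range(self.nr_images))
    random.shuffle(self.start_indices)  # randomise the whole dataset once

def __getitem__(self, item):
    # Deterministic slice based on the batch index, so unsynchronised
    # workers return disjoint batches instead of overlapping random ones.
    start = item * self.batch_size
    batch_indices = self.start_indices[start: start + self.batch_size]
    # ... build x and y from batch_indices exactly as in the original ...
    return x, y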
Can you please try with use_multiprocessing=True? These are the numbers I observe on my GTX 1080Ti based system with the data generator you provided.
model.fit_generator(generator=training_generator, epochs=10, workers=6)
148/148 [==============================] - 9s 60ms/step
model.fit_generator(generator=training_generator, epochs=10, workers=6, use_multiprocessing=True)
148/148 [==============================] - 2s 11ms/step
You can try the prefetching of tf.data.Dataset. Prefetching allows you to compute the next batch(es) on your CPU while your GPU computes the gradient descent at the same time. Be careful: you need to change the numpy arrays into tf.constant in the data generator. Then try:
import tensorflow as tf

generator = DataGenerator(images)
spec = (tf.TensorSpec(shape=(generator.batch_size, generator.input_images,
                             *generator.image_size, generator.channels), dtype=tf.uint8),
        tf.TensorSpec(shape=(generator.batch_size, generator.predict_images,
                             *generator.image_size, generator.channels), dtype=tf.uint8))
# from_generator expects a callable that returns an iterable
dataset = tf.data.Dataset.from_generator(lambda: generator, output_signature=spec)
dataset = dataset.batch(batch_size).prefetch(-1)  # this order is important

# a custom training loop is better than model.fit(), otherwise prefetching can fail
def train_loop():
    ...
You can change the "-1" in prefetch() to another value, like 1, 2 or more, to get the maximum speed depending on your machine and the batch size (-1 lets tf.data decide, i.e. tf.data.AUTOTUNE).
This blog helps with setting up an input data pipeline with tf.data; it is also much more efficient than using ImageDataGenerator, and the code is explained using a custom data directory.
It also enhances performance with prefetch and cache.
Prefetch processes the next batch while the current batch is being used.
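A minimal tf.data sketch of that pattern, reusing the paths and sizes from the question above (the decode and normalisation details are assumptions):

import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

def load_image(path):
    raw = tf.io.read_file(path)
    img = tf.io.decode_png(raw, channels=1)
    return tf.image.convert_image_dtype(img, tf.float32)

dataset = (tf.data.Dataset.list_files("input_frames/*.png", shuffle=True)
           .map(load_image, num_parallel_calls=AUTOTUNE)
           .cache()              # omit if the decoded frames don't fit in memory
           .batch(8)
           .prefetch(AUTOTUNE))  # prepare the next batch while the GPU trains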
I'd like to be able to read in batches of images. However, these batches must be constructed by a python script (they cannot be placed into a file ahead of time for various reasons). What's the most efficient way, in tensorflow to do the following:
(1) Provided: a python list of B variable-length strings that point to images, all of which have the same size. B is the batch size.
(2) For each string, load the image it corresponds to, and apply a random crop of 5% (the crop is random but the size of the crop is fixed)
(3) Concatenate the images together into a tensor of size B x H x W x 3
If this is not possible, does anyone have any benchmarks / data on the efficiency loss of loading and preprocessing the images in python then placing them into a queue? I assume the net will run considerably faster if image loading / preprocessing is done internally on tensorflow.
This is how I understand your problem:
you have some images
you have a function sample_batch() which returns a batch of filenames of size B
you want to read the images corresponding to these filenames and preprocess them
finally you output a batch of these examples
input = tf.placeholder(tf.string, name='Input')
queue = tf.FIFOQueue(capacity, tf.string, [()], name='Queue')
enqueue_op = queue.enqueue_many(input)
reader = tf.WholeFileReader()
filename, content = reader.read(queue)
image = tf.image.decode_jpeg(content, channels=3)
# Preprocessing
image = tf.random_crop(image, [H, W, 3])
image = tf.to_float(image)
batch_image = tf.train.batch([image], batch_size=B, name='Batch')
output = inference(batch_image)
Then in the session, you have to run the enqueue operation with the filenames from your sample_batch() function:
with tf.Session() as sess:
    tf.train.start_queue_runners()
    for i in range(NUM_STEPS):
        batch_filenames = sample_batch()
        sess.run(enqueue_op, feed_dict={input: batch_filenames})
        sess.run(output)
If you have the images as a byte array you can use something similar to this in your graph:
jpegs = tf.placeholder(tf.string, shape=(None))
images = tf.map_fn(lambda jpeg: your_processing_fn(jpeg), jpegs,
                   dtype=tf.float32)
logits = your_inference_model(images, labels)
where your_processing_fn is a function which receives a JPEG tensor of bytes, decodes, resizes and crops it, and returns an image of H x W x 3.
You need a recent version of tensorflow, as map_fn is not available in 0.8 and below.
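A plausible shape for your_processing_fn (H and W are assumed to be defined elsewhere; the crop mirrors the roughly-5% random crop described in the question):

def your_processing_fn(jpeg_bytes):
    image = tf.image.decode_jpeg(jpeg_bytes, channels=3)
    # Resize slightly larger than the target, then take a random H x W crop.
    image = tf.image.resize_images(image, [int(H * 1.05), int(W * 1.05)])
    image = tf.random_crop(image, [H, W, 3])
    return tf.to_float(image)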
How do I get TensorFlow example queues into proper batches for training?
I've got some images and labels:
IMG_6642.JPG 1
IMG_6643.JPG 2
(feel free to suggest another label format; I think I may need another dense to sparse step...)
I've read through quite a few tutorials but don't quite have it all together yet.
Here's what I have, with comments indicating the steps required from TensorFlow's Reading Data page.
The list of filenames
(optional steps removed for the sake of simplicity)
Filename queue
A Reader for the file format
A decoder for a record read by the reader
Example queue
And after the example queue I need to get this queue into batches for training; that's where I'm stuck...
1. List of filenames
files = tf.train.match_filenames_once('*.JPG')
4. Filename queue
filename_queue = tf.train.string_input_producer(files, num_epochs=None, shuffle=True, seed=None, shared_name=None, name=None)
5. A reader
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
6. A decoder
record_defaults = [[""], [1]]
col1, col2 = tf.decode_csv(value, record_defaults=record_defaults)
(I don't think I need this step below because I already have my label in a tensor but I include it anyways)
features = tf.pack([col2])
The documentation page has an example to run one image, not get the images and labels into batches:
for i in range(1200):
  # Retrieve a single instance:
  example, label = sess.run([features, col5])
And then below it has a batching section:
def read_my_file_format(filename_queue):
  reader = tf.SomeReader()
  key, record_string = reader.read(filename_queue)
  example, label = tf.some_decoder(record_string)
  processed_example = some_processing(example)
  return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example, label = read_my_file_format(filename_queue)
  # min_after_dequeue defines how big a buffer we will randomly sample
  # from -- bigger means better shuffling but slower start up and more
  # memory used.
  # capacity must be larger than min_after_dequeue and the amount larger
  # determines the maximum we will prefetch. Recommendation:
  # min_after_dequeue + (num_threads + a small safety margin) * batch_size
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch(
      [example, label], batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
My question is: how do I use the above example code with the code I have above? I need batches to work with, and most of the tutorials come with mnist batches already.
with tf.Session() as sess:
  sess.run(init)
  # Training cycle
  for epoch in range(training_epochs):
    total_batch = int(mnist.train.num_examples / batch_size)
    # Loop over all batches
    for i in range(total_batch):
      batch_xs, batch_ys = mnist.train.next_batch(batch_size)
If you wish to make this input pipeline work, you will need to add an asynchronous queueing mechanism that generates batches of examples. This is performed by creating a tf.RandomShuffleQueue or a tf.FIFOQueue and inserting JPEG images that have been read, decoded and preprocessed.
You can use handy constructs that will generate the queues and the corresponding threads for running the queues via tf.train.shuffle_batch_join or tf.train.batch_join. Here is a simplified example of what this would look like. Note that this code is untested:
# Let's assume there is a Queue that maintains a list of all filenames
# called 'filename_queue'
_, file_buffer = reader.read(filename_queue)

# Decode the JPEG image
image = decode_jpeg(file_buffer)

# Generate batches of images of this size.
batch_size = 32

# Depends on the number of files and the training speed.
min_queue_examples = batch_size * 100

images_batch = tf.train.shuffle_batch_join(
    [[image]],  # one [image] list per reading/preprocessing thread
    batch_size=batch_size,
    capacity=min_queue_examples + 3 * batch_size,
    min_after_dequeue=min_queue_examples)[0]

# Run your network on this batch of images.
predictions = my_inference(images_batch)
Depending on how you need to scale up your job, you might need to run multiple independent threads that read/decode/preprocess images and dump them into your example queue. A complete example of such a pipeline is provided in the Inception/ImageNet model. Take a look at batch_inputs:
https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L407
Finally, if you are working with >O(1000) JPEG images, keep in mind that it is extremely inefficient to individually read thousands of small files. This will slow down your training quite a bit.
A more robust and faster solution is to convert the dataset of images to a sharded TFRecord of Example protos. Here is a fully worked script for converting the ImageNet data set to such a format. And here is a set of instructions for running a generic version of this preprocessing script on an arbitrary directory containing JPEG images.
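For a sense of what the conversion involves, here is a bare-bones, single-shard sketch (the feature keys follow the ImageNet script's convention; the filenames and labels are just the examples from the question):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

with tf.python_io.TFRecordWriter('train-00000-of-00001') as writer:
    for filename, label in [('IMG_6642.JPG', 1), ('IMG_6643.JPG', 2)]:
        with open(filename, 'rb') as f:
            encoded = f.read()  # raw JPEG bytes; decode with tf.image.decode_jpeg at read time
        example = tf.train.Example(features=tf.train.Features(feature={
            'image/encoded': _bytes_feature(encoded),
            'image/class/label': _int64_feature(label)}))
        writer.write(example.SerializeToString())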