I am building a standard image classification model with TensorFlow. For this I have input images, each assigned a label (a number in {0,1}). The data can hence be stored in a list using the following format:
/path/to/image_0 label_0
/path/to/image_1 label_1
/path/to/image_2 label_2
...
I want to use TensorFlow's queuing system to read my data and feed it to my model. Ignoring the labels, one can easily achieve this by using string_input_producer and WholeFileReader. Here is the code:
def read_my_file_format(filename_queue):
    reader = tf.WholeFileReader()
    key, value = reader.read(filename_queue)
    example = tf.image.decode_png(value)
    return example
#removing label, obtaining list containing /path/to/image_x
image_list = [line[:-2] for line in image_label_list]
input_queue = tf.train.string_input_producer(image_list)
input_images = read_my_file_format(input_queue)
However, the labels are lost in that process as the image data is purposely shuffled as part of the input pipeline. What is the easiest way of pushing the labels together with the image data through the input queues?
Using slice_input_producer provides a much cleaner solution. slice_input_producer lets us create an input queue containing arbitrarily many separable values. The snippet from the question would then look like this:
def read_labeled_image_list(image_list_file):
    """Reads a .txt file containing paths and labels.
    Args:
        image_list_file: a .txt file with one "/path/to/image label" pair per line
    Returns:
        Two lists: all filenames in image_list_file, and their integer labels
    """
    filenames = []
    labels = []
    # The context manager closes the file even if parsing fails
    with open(image_list_file, 'r') as f:
        for line in f:
            # strip() removes the trailing newline without cutting a last line that has none
            filename, label = line.strip().split(' ')
            filenames.append(filename)
            labels.append(int(label))
    return filenames, labels
def read_images_from_disk(input_queue):
    """Consumes a single (filename, label) slice from the input queue.
    Args:
        input_queue: a list of two tensors, the filename (scalar string) and the label (scalar int32).
    Returns:
        Two tensors: the decoded image, and the label.
    """
    label = input_queue[1]
    file_contents = tf.read_file(input_queue[0])
    example = tf.image.decode_png(file_contents, channels=3)
    return example, label
# Reads paths of images together with their labels
image_list, label_list = read_labeled_image_list(filename)
# tf.convert_to_tensor avoids importing the internal ops and dtypes modules
images = tf.convert_to_tensor(image_list, dtype=tf.string)
labels = tf.convert_to_tensor(label_list, dtype=tf.int32)
# Makes an input queue
input_queue = tf.train.slice_input_producer([images, labels],
                                            num_epochs=num_epochs,
                                            shuffle=True)
image, label = read_images_from_disk(input_queue)
# Optional Preprocessing or Data Augmentation
# tf.image implements most of the standard image augmentation
image = preprocess_image(image)
label = preprocess_label(label)
# Optional Image and Label Batching
image_batch, label_batch = tf.train.batch([image, label],
                                          batch_size=batch_size)
See also the generic_input_producer from the TensorVision examples for a full input pipeline.
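The key property of slice_input_producer is that each dequeued slice keeps a filename and its label together, so shuffling cannot separate them. The following plain-Python sketch illustrates only that idea, without TensorFlow; the helper name shuffled_slices and the sample paths are made up for illustration.

```python
import random

def shuffled_slices(filenames, labels, seed=None):
    """Yield (filename, label) pairs in random order, never separating a pair."""
    if len(filenames) != len(labels):
        raise ValueError("filenames and labels must have the same length")
    pairs = list(zip(filenames, labels))
    random.Random(seed).shuffle(pairs)  # shuffle whole pairs, not each list
    for filename, label in pairs:
        yield filename, label

# Hypothetical sample data in the question's format
filenames = ["/path/to/image_0", "/path/to/image_1", "/path/to/image_2"]
labels = [0, 1, 0]
original = dict(zip(filenames, labels))
for filename, label in shuffled_slices(filenames, labels, seed=0):
    assert original[filename] == label  # the pairing survives the shuffle
```

Shuffling the two lists independently would break this pairing, which is the situation the question ran into.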
There are three main steps to solving this problem:
Populate the tf.train.string_input_producer() with a list of strings, each being the original space-delimited string that contains the filename and the label.
Use tf.read_file(filename) rather than tf.WholeFileReader() to read your image files. tf.read_file() is a stateless op that consumes a single filename and produces a single string containing the contents of the file. It has the advantage that it's a pure function, so it's easy to associate data with the input and the output. For example, your read_my_file_format function would become:
def read_my_file_format(filename_and_label_tensor):
    """Consumes a single filename and label as a ' '-delimited string.
    Args:
        filename_and_label_tensor: A scalar string tensor.
    Returns:
        Two tensors: the decoded image, and the string label.
    """
    filename, label = tf.decode_csv(filename_and_label_tensor, [[""], [""]], " ")
    file_contents = tf.read_file(filename)
    example = tf.image.decode_png(file_contents)
    return example, label
Invoke the new version of read_my_file_format by passing a single dequeued element from the input_queue:
image, label = read_my_file_format(input_queue.dequeue())
You can then use the image and label tensors in the remainder of your model.
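Before the strings go into the queue, it can help to check the list file in plain Python. The sketch below plays the same role as the tf.decode_csv call above, but it is not identical: it assumes the label is the last space-separated token and splits from the right, so it also tolerates spaces inside a path. The function name split_filename_and_label is hypothetical.

```python
def split_filename_and_label(line):
    """Split one 'path label' line into (path, label_string)."""
    # rsplit from the right so that only the final token is taken as the label
    filename, label = line.rstrip("\n").rsplit(" ", 1)
    return filename, label

lines = ["/path/to/image_0 0\n", "/path/with space/image_1 1\n"]
parsed = [split_filename_and_label(line) for line in lines]
print(parsed)
```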
In addition to the answers provided, there are a few other things you can do:
Encode your label into the filename. If you have N different categories you can rename your files to something like: 0_file001, 5_file002, N_file003. Afterwards, when you read the data with key, value = reader.read(filename_queue), your key and value are:
The output of Read will be a filename (key) and the contents of that file (value)
Then parse your filename, extract the label and convert it to int. This will require a little bit of preprocessing of the data.
Use TFRecords, which will allow you to store the data and labels in the same file.
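Parsing the label out of the file name could look roughly like the sketch below. It is plain Python operating on an already-evaluated key; it assumes the "label_name" naming scheme from the example above and that the key may arrive as bytes. The function name label_from_key and the sample path are made up.

```python
import os

def label_from_key(key):
    """Extract the integer label from a key such as b'/data/5_file002.png'."""
    if isinstance(key, bytes):
        key = key.decode("utf-8")
    basename = os.path.basename(key)            # e.g. '5_file002.png'
    label_text, _, _ = basename.partition("_")  # text before the first underscore
    return int(label_text)

print(label_from_key(b"/data/5_file002.png"))
```

Inside a TensorFlow graph the same split would have to be done with string ops rather than Python string methods.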
Hi everyone, I'm facing an issue after processing my images and labels. To create a single dataset I use the zip function. After processing, there are 18k images and 18k labels, which is correct, but when I call zip(images, labels), the number of items becomes 563.
Here is some code to help you understand:
# Map the load_and_preprocess_image function over the dataset of image paths
images = image_paths.map(load_and_preprocess_image)
# Map the extract_label function over the dataset of image paths
labels = image_paths.map(extract_label)
# Zip the labels and images together to create a dataset of (image, label) pairs
#HERE SOMETHING STRANGE HAPPENS
data = tf.data.Dataset.zip((images,labels))
# Shuffle and batch the data
data = data.shuffle(buffer_size=1000).batch(32)
# Split the data into train and test sets
data = data.shuffle(buffer_size=len(data))
# Convert the dataset into a collection of data
num_train = int(0.8 * len(data))
train_data = image_paths.take(num_train)
val_data = image_paths.skip(num_train)
I cannot see where the error is. Can you help me, please? Thanks.
I'd like to have a dataset of 18k (image, label) pairs.
tf's zip
tf.data.Dataset.zip is not like Python's zip: its inputs are tf.data.Dataset objects. Check that the images and labels returned from your map calls are valid tf.data.Dataset objects.
check tf.ds
Make sure your image and label datasets are valid tf.data.Dataset objects:
print("ele: ", images_dataset.element_spec)
print("num: ", images_dataset.cardinality().numpy())
print("ele: ", labels_dataset.element_spec)
print("num: ", labels_dataset.cardinality().numpy())
workaround
In your case, combine the image and label processing in one map function and return both, which avoids tf.data.Dataset.zip altogether:
# load_and_preprocess_image_and_label
def load_and_preprocess_image_and_label(image_path):
    """ load image and label then some operations """
    # placeholder: compute image and label from image_path here
    return image, label
# Map the load_and_preprocess_image_and_label function over the dataset of image/label paths
train_list = tf.data.Dataset.list_files(str(PATH / 'train/*.jpg'))
data = train_list.map(load_and_preprocess_image_and_label,
                      num_parallel_calls=tf.data.AUTOTUNE)
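A possible explanation for the number 563, assuming the count was taken with len(data) after .batch(32) as in the question's code: batching turns the dataset into a dataset of batches, so its length is the number of batches rather than the number of images. The arithmetic below assumes exactly 18,000 images, which the question only gives as "18k".

```python
import math

num_images = 18000   # assumption: "18k" taken as exactly 18,000
batch_size = 32      # from .batch(32) in the question

# Without drop_remainder, the final partial batch is kept, hence the ceiling
num_batches = math.ceil(num_images / batch_size)
print(num_batches)

# An 80/20 split computed on this length counts batches, not images
num_train_batches = int(0.8 * num_batches)
print(num_train_batches, num_batches - num_train_batches)
```

If that is the cause, zip itself is behaving as intended. Note also that the last two lines of the question's code split image_paths rather than data.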
I have a folder of JPEG images that I'm trying to convert to a folder of TFRecords. The best I can do, with the code below, is to write all JPEGs to one TFRecords file, but I'm not sure how to use that large file, and my other starter code requires an individual TFRecord file for each image. For example, I was given a folder of 5 tfrecs to begin with.
# Source: https://stackoverflow.com/questions/33849617/how-do-i-convert-a-directory-of-jpeg-images-to-tfrecords-file-in-tensorflow
# Note: modified from source
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
# images and labels array as input
def convert_to(images, labels, output_directory, name):
    num_examples = labels.shape[0]
    if images.shape[0] != num_examples:
        raise ValueError("Images size %d does not match label size %d." %
                         (images.shape[0], num_examples))
    rows = images.shape[1]
    cols = images.shape[2]
    depth = 1
    filename = os.path.join(output_directory, name + '.tfrecords')
    print('Writing', filename)
    writer = tf.python_io.TFRecordWriter(filename)
    for index in range(num_examples):
        image_raw = images[index].tobytes()
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(rows),
            'width': _int64_feature(cols),
            'depth': _int64_feature(depth),
            'label': _int64_feature(int(labels[index])),
            'image_raw': _bytes_feature(image_raw)}))
        writer.write(example.SerializeToString())
    # close the writer so that all records are flushed to disk
    writer.close()
Above is my function convert_to; can it be changed to answer my question? Below is the rest. As the second-to-last line (the commented shape output) shows, it is correctly given the image array and labels from the 300 images.
def read_image(file_name, images_path):
    image = skimage.io.imread(images_path + file_name)
    return image
def extract_image_index_make_label(img_name):
    #remove_ext = img_name.split(".")[0]
    # name, serie, repetition, char = remove_ext.split("_")
    # label = int(char) + 1000 * int(repetition) + 1000_000 * int(serie)
    label = random.randint(1,300)
    return label
images_path = "/content/monet_jpg/"
image_list = os.listdir(images_path)
images = []
labels = []
for img_name in tqdm(image_list):
    images.append(read_image(img_name, images_path))
    labels.append(extract_image_index_make_label(img_name))
images_array = np.array(images)
labels = np.array(labels)
#print(images_array.shape)
print(images_array.shape, labels.shape)
# (300, 256, 256, 3) (300,)
convert_to(images_array, labels, ".", "ALL_MONET_TFREC")
Even a folder of tfrecs would still have efficiency benefits over a folder of JPEGs, correct? Anyway, that is what my starter code is set up to use.
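One way to get several TFRecord files instead of a single large one is to split the examples into shards and call convert_to once per shard. The sketch below only computes the index ranges and file names in plain Python; the helper name shard_ranges, the shard count of 5, and the "name-XXXXX-of-YYYYY" naming pattern are assumptions for illustration, not something the starter code is known to require.

```python
def shard_ranges(num_examples, num_shards):
    """Return (start, stop) index pairs that split num_examples into num_shards
    contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(num_examples, num_shards)
    ranges, start = [], 0
    for shard in range(num_shards):
        stop = start + base + (1 if shard < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

num_shards = 5  # hypothetical; using the number of images gives one file per image
for shard, (start, stop) in enumerate(shard_ranges(300, num_shards)):
    name = "MONET-%05d-of-%05d" % (shard, num_shards)
    # With the question's function this would become, for example:
    # convert_to(images_array[start:stop], labels[start:stop], ".", name)
    print(name, start, stop)
```

Because convert_to already takes arrays and an output name, slicing the arrays per shard should not require changes inside the function itself.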
I can give you some examples from a real-case situation I have been working on:
Set of images (15000 .jpeg images, 15 classes (i.e. 1000 per class), 224x224x3). The size on disk is 398 MB, while the same set stored as TFRecord takes 336 MB. This figure already includes the overhead of other metadata attached to each record (for example, a str label attached to every protobuf instance). Therefore, you may see a reduction of 398 - 336 = 62 MB in spite of the additional metadata. Further reduction would be possible if the proto contained only the serialized image.
Training speed. When using tf.data.Dataset together with a TFRecord dataset, training gets faster. For example, in the same scenario, the duration of an epoch decreased by ~30 seconds in my case when using tf.data.Dataset + TFRecord versus tf.data.Dataset + tf.data.Dataset.from_tensor_slices() (with nothing else changed: same pipeline, network, hyperparameters, CPU, GPU, etc.).
Indeed, those differences are specific to the task, but I have noticed overall improvements in:
Size allotted on disk (allocating one big chunk takes a bit less space than the total resulting from splitting into a lot of chunks).
Training speed (which I find even more important).
Note also some other advantages not included in my previous example:
Fast I/O: the TFRecord format can be read with parallel I/O operations, which is useful for TPUs or multiple hosts.
I've created a dataloader for my object detection task.
However, I cannot put the image path or name into a tensor. Instead I have it indexed; the last portion of the dataloader class looks like this:
target = {}
target['boxes'] = boxes
target['labels'] = labels
target['image_id'] = torch.tensor([index])
target['area'] = area
target['iscrowd'] = iscrowd
target['image_name'] = torch.tensor(index)
return image, target
where, at the moment, image_id and image_name are the same thing.
When I print out the image_name from the dataloader, I of course get this:
for image, target in valid_data_loader:
    print(target[0]['image_name'])
Output:
tensor(0)
tensor(1)
tensor(2)
tensor(3)
tensor(4)
tensor(5)
tensor(6)
tensor(7)
I'm aware that strings can't be saved into torch tensors, so is there any way I can refer back to the original image name rather than the index of the tensor? Or would I just have to use the number that comes out and refer back to the dataset class (not dataloader)?
I ultimately want to save the image name, and attributes such as bounding box info to a separate numpy dataframe.
OK, so this is a bit ad hoc and not exactly what I had in mind, but here is one method I have used to retrieve the paths/image names. I get the id from the dataloader by extracting it from the tensor with .item(). I then use that tensor_id to find the corresponding row in the original dataframe:
for image, target in valid_data_loader:
    tensor_id = target[0]['image_name'].item()
    print(valid_df.iloc[tensor_id]['image_id'])
I don't know whether this is efficient, but it got me what I wanted...
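An alternative sketch, under the assumption that the loader uses a detection-style collate function such as collate_fn=lambda batch: tuple(zip(*batch)) (the target[0][...] indexing in the question suggests this, but does not prove it): such a collate function does not stack the targets into tensors, so the target dict can carry a plain Python string. The example below uses no PyTorch at all; ToyDetectionDataset and its file names are made up to show the data flow only.

```python
class ToyDetectionDataset:
    """Stand-in for a torch Dataset that returns (image, target) pairs."""
    def __init__(self, image_names):
        self.image_names = image_names

    def __len__(self):
        return len(self.image_names)

    def __getitem__(self, index):
        image = [[0.0]]  # placeholder for the real image tensor
        target = {
            "image_id": index,                      # numeric id, as in the question
            "image_name": self.image_names[index],  # plain string, not a tensor
        }
        return image, target

def collate_fn(batch):
    """Keep samples as tuples instead of stacking them into tensors."""
    return tuple(zip(*batch))

dataset = ToyDetectionDataset(["a.jpg", "b.jpg", "c.jpg"])
batch = [dataset[i] for i in range(len(dataset))]
images, targets = collate_fn(batch)
records = [{"image_name": t["image_name"], "image_id": t["image_id"]} for t in targets]
print(records)
```

A list of such records, extended with the bounding box fields, could later be turned into a dataframe in one step.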
I have a lot of images (pydicom files). I would like to divide each of them in half: from 1 image, I would like to get 2 images, the left part and the right part.
Input: 1000x1000
Output: 500x1000 (width x height).
Currently, I can only read a file.
ds = pydicom.read_file(image_fps[0]) # read dicom image from filepath
As a first step, I would like to put the left halves in one folder and the right halves in a second folder.
I use Mask-RCNN for an object localization problem. I would like to crop each image (pydicom file) to 50% of its size.
EDIT1:
import SimpleITK as sitk
filtered_image = sitk.GetImageFromArray(left_part)
sitk.WriteImage(filtered_image, '/home/wojtek/Mask/nnna.dcm', True)
I now have a DICOM file, but I can't display it:
this transfer syntax JPEG 2000 Image Compression (Lossless Only), can not be read because Pillow lacks the jpeg 2000 decoder plugin
Once you have executed pydicom.dcmread(), your pixel data is available at ds.pixel_array. You can just slice the data you want and save it with any suitable library. In this example I will be using matplotlib, as I also use it to verify that my slicing is correct. Adjust to your needs; one thing you do need to do is generate the correct paths/filenames for saving. Have fun!
(this script assumes the filepaths are available in a paths variable)
import pydicom
import matplotlib
# for testing if the slice is correct
from matplotlib import pyplot as plt
for path in paths:
    # read the dicom file
    ds = pydicom.dcmread(path)
    # find the shape of your pixel data
    shape = ds.pixel_array.shape
    # get the half of the x dimension. For the y dimension use shape[0]
    half_x = int(shape[1] / 2)
    # slice the halves
    # [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
    left_part = ds.pixel_array[:, :half_x]
    right_part = ds.pixel_array[:, half_x:]
    # to check whether the slices are correct, matplotlib can be convenient
    # plt.imshow(left_part); do not do this in the loop
    # save the files, see the documentation for matplotlib if you want a different format
    # bmp, png are surely supported
    # raw strings keep the backslashes from being read as escapes such as \t or \f
    path_to_left_image = r'generate\the\path\and\filename\for\the\left\image.bmp'
    path_to_right_image = r'generate\the\path\and\filename\for\the\right\image.bmp'
    matplotlib.image.imsave(path_to_left_image, left_part)
    matplotlib.image.imsave(path_to_right_image, right_part)
If you want to save the DICOM files keep in mind that they may not be valid DICOM if you do not update the appropriate data. For instance the SOP Instance UID is technically not allowed to be the same as in the original DICOM file, or any other SOP Instance UID for that matter. How important that is, is up to you.
With a script like below you can define named slices and split any dicom image file it finds in the supplied path into the appropriate slices.
import os
import pydicom
import numpy as np
def save_partials(parts, path_to_directory):
    """
    parts: list of tuples, each tuple specifying a name and a list of four slice offsets
        in the order [row_start, row_stop, column_start, column_stop]
    path_to_directory: path to directory containing dicom files
    any file with a .dcm extension will have its image data split into the specified slices and saved accordingly.
    original file will not be modified
    """
    dir_content = [os.path.join(path_to_directory, item) for item in os.listdir(path_to_directory)]
    # dir_content already holds full paths, so test them directly
    files = [i for i in dir_content if os.path.isfile(i)]
    for file in files:
        root, extension = os.path.splitext(file)
        if extension.lower() != '.dcm':
            # not a .dcm file, continue with next iteration of loop
            continue
        for part in parts:
            # re-read the file for every part so each slice starts from the original data
            ds = pydicom.dcmread(file)
            if not isinstance(ds.pixel_array, np.ndarray):
                # no image data available
                continue
            part_name = part[0]
            p = part[1]  # slice list
            ds.PixelData = ds.pixel_array[p[0]:p[1], p[2]:p[3]].tobytes()
            ds.Rows = p[1] - p[0]
            ds.Columns = p[3] - p[2]
            ##
            ## Here you can modify any tags using ds.KeyWord
            ##
            new_file_name = "{r}-{pn}{ext}".format(r=root, pn=part_name, ext=extension)
            ds.save_as(new_file_name)
            print('saved {}'.format(new_file_name))
dir_path = '/home/wojtek/Mask'
parts = [('left', [0,512,0,256]),
('right', [0,512,256,512])]
save_partials(parts, dir_path)
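The parts list above hard-codes the offsets for a 512x512 image. A small helper could derive the left and right halves from the image dimensions instead; half_parts is a hypothetical name, and the offsets follow the [row_start, row_stop, column_start, column_stop] order that save_partials uses when slicing.

```python
def half_parts(rows, columns):
    """Build the parts list for a left/right split of a rows x columns image."""
    half = columns // 2
    return [("left", [0, rows, 0, half]),
            ("right", [0, rows, half, columns])]

# For the 1000x1000 image from the question
print(half_parts(1000, 1000))
```

The two arguments could be taken from ds.Rows and ds.Columns of each file, so that images of different sizes are split correctly.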
I am trying to create a Dataset for TensorFlow from a CSV file that I created with pandas.
The csv file looks like this:
feature1 feature2 filepath label
0.25 0.35 test1.jpg A
0.33 0.15 test2.jpg B
I read the dataframe like this
mydf = pd.read_csv("TraingDatafinal.csv",header=0)
Now I have defined a function which should return a dataset. This is all according to the quickstart guide:
def train_input_fn(features, labels, batch_size):
    """An input function for training"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    dataset = dataset.map(mappingfunction)
    # Return the dataset
    return dataset
I call this function like this:
mydataset = train_input_fn(mydf.drop(["label"],axis=1),mydf["label"],200)
This works if I remove the mapping, but I get a question mark when I print the shape. Why? The dimensions seem to be clearly defined.
This is where the real struggle begins. I want to create a mapping function that replaces the filepath with an array of the image.
I tried to achieve that by writing this mapping function:
def mappingfunction(feature,label):
    print(feature['Filename'])
    image = tf.read_file(feature['Filename'])
    image = tf.image.decode_image(image)
    return image,label
This will only return the image and the label. I don't know how to make it return all the features except the filepath.
But even this simplified version won't work. I get an "expected binary or unicode string" error. Can you help me?
The mapping function should return all features and the label. For example:
def mappingfunction(features, label):
    print(features['Filename'])
    image = tf.read_file(features['Filename'])
    image = tf.image.decode_image(image)
    # the parameter is named features so that this assignment refers to the input dict
    features['image'] = image
    return features, label
This will add an image key to the features dictionary.
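To return every feature except the file path, the path can be dropped from the dictionary once the image has been loaded. The sketch below shows only that dictionary handling in plain Python, with a fake loader standing in for tf.read_file and tf.image.decode_image; the key name "filepath" is taken from the CSV header in the question, and load_image and replace_path_with_image are placeholder names.

```python
def load_image(path):
    """Placeholder for reading and decoding the image at `path`."""
    return "decoded:" + path

def replace_path_with_image(features, label, path_key="filepath"):
    """Return a new features dict with the path replaced by the image."""
    new_features = {k: v for k, v in features.items() if k != path_key}
    new_features["image"] = load_image(features[path_key])
    return new_features, label

row = {"feature1": 0.25, "feature2": 0.35, "filepath": "test1.jpg"}
print(replace_path_with_image(row, "A"))
```

Building a new dictionary instead of modifying the one passed in leaves the original features untouched.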