I have 3D image (TIFF) data, with each volume inside its own folder. I want to read the data and build a batch tensor for a convolutional network. I can read the data as a numpy array, but I don't know how to make the batch tensor input for the CNN. Here is the code I have:
import os
import tensorflow as tf
import numpy as np
from skimage import io
from matplotlib import pyplot as plt
from pathlib import Path
data_dir = 'C:/Users/myname/Documents/Projects/Segmentation/DeepLearning/L-net/data/'
data_folders = os.listdir(data_dir)
train_input = []
train_output = []
test_input = []
test_output = []
for idx, folder in enumerate(data_folders):
    im = io.imread(data_dir+folder+'/f0.tiff')
    im = im/im.max()
    train_input.append(tf.convert_to_tensor(im, dtype=tf.float32))
    im = io.imread(data_dir+folder+'/g0.tiff')
    im = im/im.max()
    train_output.append(tf.convert_to_tensor(im, dtype=tf.float32))
Since I am using 3D filters for the CNN, the input should be a 5D tensor. Can someone help me with this? Thanks.
With your approach, you have to load all the data into memory at once, and you also have to take care of all the dimensions yourself (if you do want to build the 5D batch tensor directly from your lists, see the sketch below). I would suggest using Keras' flow_from_directory and generators instead. Keras has an ImageDataGenerator class which lets you collect images from directories, resize them to any size you want, shuffle them, and so on. You can find the documentation here at their website.
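For completeness, here is a minimal sketch of building the 5D batch tensor directly from the lists in your code, assuming every volume has the same (depth, height, width) shape:

import tensorflow as tf

# train_input is the list of 3D float32 tensors built in your loop
batch = tf.stack(train_input, axis=0)   # -> (batch, depth, height, width)
batch = tf.expand_dims(batch, axis=-1)  # -> (batch, depth, height, width, 1), ready for Conv3D
print(batch.shape)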
Download the train dataset and test dataset, and extract them into 2 different folders named "train" and "test". The train folder should contain 'n' folders, each containing images of the respective class. For example, in the Dogs vs. Cats dataset, the train folder should have 2 folders, namely "Dog" and "Cat", containing the respective images inside them.
This is an example of how to create a dataset for your model's input:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)  # define the generator first (rescale is one common option)
train_generator = train_datagen.flow_from_directory(
    directory=r"C:/Users/myname/Documents/Projects/Segmentation/DeepLearning/L-net/data/",
    target_size=(224, 224),   # the size of your input images
    color_mode="rgb",         # could be "grayscale" or "rgb"
    batch_size=32,            # number of images in each batch
    class_mode="categorical",
    shuffle=True,             # whether to shuffle the images or not
    seed=42                   # random seed for shuffling and augmentation
)
You can perform training like this:
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
STEP_SIZE_VALID = valid_generator.n // valid_generator.batch_size  # valid_generator is assumed to be defined analogously
model.fit_generator(generator=train_generator,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_data=valid_generator,
                    validation_steps=STEP_SIZE_VALID,
                    epochs=10
)
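Note that fit_generator is deprecated in TensorFlow 2.x; model.fit accepts generators directly, so an equivalent call with the same variables would be:

model.fit(train_generator,
          steps_per_epoch=STEP_SIZE_TRAIN,
          validation_data=valid_generator,
          validation_steps=STEP_SIZE_VALID,
          epochs=10)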
Related
When I append my labels, I end up with 20580 as the length of y, when what I'm hoping for is 120, which is the number of categories. How can I append the categories to my labels?
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import random as rand
import time
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Activation, Dropout, Flatten, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.optimizers import Adam
config = tf.compat.v1.ConfigProto(gpu_options=tf.compat.v1.GPUOptions(allow_growth=True))
sess = tf.compat.v1.Session(config=config)
DATADIR = "C:/Users/samue/Documents/Datasets/DogBreeds/images/Images"
CATEGORIES = os.listdir("C:/Users/samue/Documents/Datasets/DogBreeds/images/Images")
IMG_SIZE = 100
training_data = []
def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception as e:
                pass
create_training_data()
rand.shuffle(training_data)
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
y = np.array(y).reshape(-1,)
print(len(CATEGORIES))
print(len(X))
print(len(y))
The outputs I get at the end are:
120
20580
20580
I think you should step back a little from the implementation details, or even from this specific problem, to understand what is going on. In image classification, the objective is to classify the input (a 2D tensor, or a 3D tensor if it's a multichannel image) by assigning it a label. The number of labels is finite; you can only classify into a certain number of classes.
To give an example, let's take the MNIST database. It is a well-known toy dataset used for image classification tasks. The training set contains 60,000 1x28x28 images representing handwritten digits. Generally speaking, the goal with this dataset is to classify each image into one of 10 labels, corresponding to the digits "0", "1", "2", and so on up to "9". So the question in this particular case is: given image X, the model needs to predict a class for this image, either "0", "1", ..., or "9"; there are only 10 possibilities. In supervised learning, we use labels to train the model. For any given input, we need to know the ground truth, i.e. the real class this input belongs to. So you end up with as many labels as there are inputs: each input is assigned its own label, regardless of the number of unique possible classes.
In your use case, it seems you are working with a total of 120 classes and 20,580 images. That's 20,580 unique data inputs. Remember, we need to have, for each one of those images, a corresponding ground truth: the real class this image belongs to. So naturally you end up with a total of 20,580 labels as well.
This might have been the source of your confusion: in my own terms, a label is different from a class. A class set is a unique set of entities (animals, digits, ...), while a label refers to the particular class assigned to a particular input.
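As a quick illustration (with randomly generated labels standing in for your data), the length of y counts inputs, while np.unique counts classes:

import numpy as np

y = np.random.randint(0, 120, size=20580)  # one label per image
print(len(y))              # 20580 -> one ground-truth label per input
print(len(np.unique(y)))   # 120   -> number of distinct classes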
I think you are a bit confused. You should have a data set consisting of 120 classes.
For each of those classes you need to have images characteristic of that class. For example, assume you are building a classifier to distinguish between images of dogs and images of cats, so you have 2 classes. You can structure your directories as follows:
source_dir
----------cats_dir
------------------cats first image
------------------cats second image
------------------cats nth image
----------dogs_dir
------------------dogs first image
------------------dogs second image
------------------dogs mth image
For your case you will have 120 subdirectories (class directories) below the source_dir, and each of these should contain the images associated with its class. It appears you have a total of 20580 images; if they are evenly distributed, that is roughly 171 images per class.
Now you want to use these images to train a CNN. You can do it the way you were proceeding, but I recommend against it, because you would end up putting all 20580 100 x 100 images into memory at once. That takes a lot of memory, and you are likely to get an OOM (out of memory) error. The way to solve that is to feed the data to your model in batches, for example 32 images at a time. Keras has useful functions to assist you with that: if you have the directory structure shown above, you can use ImageDataGenerator.flow_from_directory to feed your images to the model in batches (documentation is here). This function also enables you to use image augmentation to help expand the diversity of your data set. Below is the code I recommend for the dog/cat classification example mentioned above.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

source_dir = r'c:\temp\cats_and_dogs'
v_split = .2  # percentage of the data to allocate to the validation set
data_gen = ImageDataGenerator(rescale=1/255, validation_split=v_split)
train_gen = data_gen.flow_from_directory(source_dir, target_size=(100,100),
                                         class_mode='categorical', batch_size=32,
                                         subset='training', color_mode='grayscale')
valid_gen = data_gen.flow_from_directory(source_dir, target_size=(100,100),
                                         class_mode='categorical', batch_size=32,
                                         subset='validation', color_mode='grayscale',
                                         shuffle=False)
When you compile your model, set loss='categorical_crossentropy'. You can use the two generators above as inputs to model.fit.
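For example, a minimal sketch (the model definition itself is assumed):

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_gen, validation_data=valid_gen, epochs=10)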
I would like to ask my almost-final question, related to the previous questions.
Here is the problem description: I have two categories (dogs and cats). Below is the code for reading the data into a list and converting it to a numpy array.
This is for mounting Google Drive:
from google.colab import drive
drive.mount("/content/drive", force_remount=True)
Importing all the necessary libraries (I don't actually need glob, but let it stay):
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import glob
This just demonstrates reading and displaying images, for later use:
# Set the main directory and the categories, then read the images
MainDirectory = "/content/drive/My Drive/Colab Notebooks/2020YearDeepLearning/Animals/PetImages/"
Categories = ["Dog", "Cat"]
for category in Categories:
    path = os.path.join(MainDirectory, category)
    print(path)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        plt.imshow(img_array, cmap="gray")
        plt.show()
        break
    break
Demonstration of resizing an image:
IMG_SIZE=70
img_array =cv2.resize(img_array,(IMG_SIZE,IMG_SIZE))
plt.imshow(img_array,cmap='gray')
plt.show()
Now here is the actual code, which reads the data and the labels (dog is 0 and cat is 1) and puts them into an array:
# Create the training data
training_data = []
for category in Categories:
    path = os.path.join(MainDirectory, category)
    class_num = Categories.index(category)
    for img in os.listdir(path):
        try:
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
            img_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            training_data.append([img_array, class_num])
        except Exception as e:
            pass
After that I just shuffled the data:
import random
random.shuffle(training_data)
Separating the data into X and y, and converting them to numpy arrays with the corresponding reshaping:
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
y = np.array(y)
To demonstrate that there are only two possible values for y (dog is 0 and cat is 1):
print(np.unique(y))  # returns [0, 1]
Now the actual code:
# Create a simple convolutional neural network
# Normalize the data and load all necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPool2D,Activation
X =X/255.0
model =Sequential()
model.add(Conv2D(filters=32,kernel_size=(3,3),input_shape=X.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(filters=32,kernel_size=(3,3)))
model.add(Activation('relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(units=32))
model.add(Dense(units=1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
I trained the model using the following command:
model.fit(X,y,batch_size=16,validation_split=0.1,epochs=10)
And here is an image of the training run.
After that, I took a random picture of a cat and of a dog and ran the following command (in this example I am using the dog picture):
#for testing
image =cv2.imread("/content/drive/My Drive/Colab Notebooks/2020YearDeepLearning/Animals/test.jpg")
image =cv2.resize(image,(IMG_SIZE,IMG_SIZE))
image =np.array(image).reshape(-1,IMG_SIZE,IMG_SIZE,1)
print(model.predict_classes(image))
The result is this:
[[0]
[0]
[0]]
For the cat I am getting this:
[[1]
[0]
[0]]
Should I get a result with three elements? I mean, an array of three elements? I actually have only two classes, right? Please tell me if I am wrong.
Here is what I suspect:
If your image is not grayscale, i.e. it has three channels like a normal RGB image, then your reshape here, image = np.array(image).reshape(-1, IMG_SIZE, IMG_SIZE, 1), actually makes the returned array have shape (3, IMG_SIZE, IMG_SIZE, 1). That means you are feeding in three samples, each with 1 channel, when you predict, and of course you get back three results.
Also, when you load images for training you load them in grayscale, but when you load one for prediction you forgot to do so. This is why your training works but your prediction does not.
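A sketch of the corrected prediction path, reusing the IMG_SIZE and test path from your question: load the image in grayscale so it matches the single-channel training data, then add explicit batch and channel dimensions:

image = cv2.imread("/content/drive/My Drive/Colab Notebooks/2020YearDeepLearning/Animals/test.jpg",
                   cv2.IMREAD_GRAYSCALE)
image = cv2.resize(image, (IMG_SIZE, IMG_SIZE))
image = image.reshape(1, IMG_SIZE, IMG_SIZE, 1)  # one sample, one channel
print(model.predict(image))  # a single sigmoid probability, not three rows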
I'm using Keras' pre-trained model VGG16, following this link: Transfer learning. I'm trying to predict the content of an image:
# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
from keras.applications.vgg16 import VGG16
# load an image from file
image = load_img('dog.jpg', target_size=(224, 224))
# convert the image pixels to a numpy array
image = img_to_array(image)
# reshape data for the model
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
# load the model
model = VGG16()
# predict the probability across all output classes
yhat = model.predict(image)
# convert the probabilities to class labels
label = decode_predictions(yhat)
# retrieve the most likely result, e.g. highest probability
label = label[0][0]
# print the classification
print('%s (%.2f%%)' % (label[1], label[2]*100))
Full Error Output:
ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 2622)) for V1 or (samples, 8631) for V2.Found array with shape: (1, 1000)
This is link to a seemingly similar question on SO.
Any comments and suggestions highly appreciated. Thank you!
I ran your code and it works properly. Since I do not have your image dog.jpg, I used a color JPG image of an Afghan dog, and the network identified it correctly as an Afghan Hound. So I suspect there is something amiss with your image. yhat is a 1 x 1000 array, as expected. Ensure your image is an RGB image.
Thank you for your help. I was running this in Colab and had earlier test code where, in a different cell, I had imported:
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input
from keras_vggface.utils import decode_predictions
That was the reason for the error.
As I read during the research for my studies, it is not necessary to feed a convolutional neural network with inputs of the same size, because we can apply so-called Spatial Pyramid Pooling as a layer to obtain same-sized representations before the fully connected layers, where same-sized inputs are required. That's clear to me.
But I am completely lost as to how I can get the input, in my case a bunch of differently sized images, into a useful dataframe or array...
I know how to load one image in Python. I used this code to get an array of one image:
from PIL import Image
import numpy as np
# Open image and make sure it is RGB - not palette
im = Image.open('C:/Users/tobis/OneDrive/Desktop/Masterarbeit/data/2017-IWT4S-HDR_LP-dataset/crop_h1/I00001.png').convert('RGB')
# Make into Numpy array
na = np.array(im)
# Check shape
print(na.shape)
But loading the next picture into this array is already a problem for me. Several questions arise:
1. Is an array a useful tool to work with these images of different sizes? Or do I need a pandas dataframe or something like this?
2. Is there a way to automate the process of loading this images to my dataframe/array?
I am very confused at the moment because I cannot imagine how to work around these issues; I do not understand how we can handle loading images of different sizes and how Python works with them. I hope my questions are more or less clear.
Thank you!
Yes, multi-dimensional arrays (tensors) are very useful to store image representations; for different sizes, keep them in a Python list of arrays, since a single rectangular numpy array requires a uniform shape (see the sketch below). Avoid Pandas for data-input purposes - it is much less computationally efficient than numpy arrays or tensors (i.e. TensorFlow or PyTorch).
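For example, a minimal sketch with hypothetical file paths, keeping differently sized images in a plain Python list of numpy arrays:

from PIL import Image
import numpy as np

paths = ['img_small.png', 'img_large.png']  # hypothetical files of different sizes
images = [np.array(Image.open(p).convert('RGB')) for p in paths]
print([im.shape for im in images])  # each element can have its own (height, width, 3)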
Absolutely. Keras has the ImageDataGenerator class for this express purpose. Some examples are on that page, as well as from here:
# example of progressively loading images from file
from keras.preprocessing.image import ImageDataGenerator
# create generator
datagen = ImageDataGenerator()
# prepare an iterator for each dataset
train_it = datagen.flow_from_directory('data/train/', class_mode='binary')
val_it = datagen.flow_from_directory('data/validation/', class_mode='binary')
test_it = datagen.flow_from_directory('data/test/', class_mode='binary')
# confirm the iterator works
batchX, batchy = train_it.next()
print('Batch shape=%s, min=%.3f, max=%.3f' % (batchX.shape, batchX.min(), batchX.max()))
And pytorch has the DataLoader class. Example:
import torch
import torchvision
import torchvision.transforms as transforms

# normalize data inputs
transform = transforms.Compose([
    transforms.ToTensor(),                # transform to tensor, scales pixels to [0, 1]
    transforms.Normalize((0.5,), (0.5,))  # shift/scale to [-1, 1]
])
# load train/test sets
trainset = torchvision.datasets.FashionMNIST(root=data_dir, train=True, download=True, transform=transform)
testset = torchvision.datasets.FashionMNIST(root=data_dir, train=False, download=True, transform=transform)
# define classes
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal',
'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# initialize train/test generators
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=5, shuffle=False)
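As with the Keras batch check above, you can pull one batch to confirm shapes:

# fetch a single batch from the training generator
images, labels = next(iter(trainloader))
print(images.shape, labels.shape)  # torch.Size([32, 1, 28, 28]) torch.Size([32])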
I am interested in training and evaluating a convolutional neural net model on my own set of images. I want to use the tf.layers module for my model definition, along with a tf.learn.Estimator object to train and evaluate the model using the fit() and evaluate() methods, respectively.
Here is the tutorial that I have been following, which is helpful for showcasing the tf.layers module and the tf.learn.Estimator class. However, the dataset that it uses (MNIST) is simply imported and loaded (as NumPy arrays). See the following main function from the tutorial script:
def main(unused_argv):
    # Load training and eval data
    mnist = learn.datasets.load_dataset("mnist")
    train_data = mnist.train.images  # Returns np.array
    train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    eval_data = mnist.test.images  # Returns np.array
    eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)

    # Create the Estimator
    mnist_classifier = learn.Estimator(
        model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")

    # Set up logging for predictions
    # Log the values in the "Softmax" tensor with label "probabilities"
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=50)

    # Train the model
    mnist_classifier.fit(
        x=train_data,
        y=train_labels,
        batch_size=100,
        steps=20000,
        monitors=[logging_hook])

    # Configure the accuracy metric for evaluation
    metrics = {
        "accuracy":
            learn.MetricSpec(
                metric_fn=tf.metrics.accuracy, prediction_key="classes"),
    }

    # Evaluate the model and print results
    eval_results = mnist_classifier.evaluate(
        x=eval_data, y=eval_labels, metrics=metrics)
    print(eval_results)
Full code here
I have my own images in jpg format, within the following directory structure:
data
  train
    classA
      1.jpg
      2.jpg
      ...
    classB
      3.jpg
      4.jpg
      ...
    ...
  validate
    classA
      5.jpg
      6.jpg
      ...
    classB
      ...
    ...
And I have also converted my image directories into TFRecord format, with one TFRecord file for train and one for validation. I followed this tutorial, which uses the build_image_data.py script from the Inception model that comes with TensorFlow as a blackbox that outputs these TFRecord files. I admit that I may have put the cart before the horse a bit by creating these, but I thought that perhaps there was a way to use these as inputs to the tf.learn.Estimator's fit() and evaluate() methods.
Questions
How can I format my jpg (or TFRecord) data so that I can use them as inputs to the Estimator object's functions?
I'm assuming I have to convert my images and labels to NumPy arrays, as shown in the code above; however, it is not clear how mnist.train.images and mnist.train.validation are formatted.
Does anyone have any experience with converting jpg files and labels to NumPy arrays that this Estimator class expects as inputs?
Any help would be greatly appreciated.
The file that you referenced, cnn_mnist.py, and specifically the function mnist_classifier.fit, requires NumPy arrays as input for x and y. Therefore, I will address your second and third questions, as TFRecords may not be easily incorporated into the referenced code.
however, it is not clear how the mnist.train.images and mnist.train.validation are formatted
mnist.train.images is a Numpy array with shape (55000, 784), where 55000 is the number of images and 784 is the dimension of each flattened image (28 x 28). mnist.validation.images is also a Numpy array with shape (5000, 784).
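In other words, the fit() call above expects something shaped like this (a zero-filled stand-in, not real data):

import numpy as np

x_train = np.zeros((55000, 784), dtype=np.float32)  # (num_images, flattened 28 x 28 pixels)
y_train = np.zeros((55000,), dtype=np.int32)        # one integer label per image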
Does anyone have any experience with converting jpg files and labels to NumPy arrays that this Estimator class expects as inputs?
The following code reads in one JPEG image as a three-dimensional Numpy array:
from scipy.misc import imread  # note: removed in SciPy >= 1.2; imageio.imread is the usual replacement

filename = '1.jpg'
np_1 = imread(filename)
I assume all of these images are the same size or that you are able to resize them to the same size, considering that you have already generated TFRecords files from this dataset. All that is left to do is flatten the image, read in the other images iteratively and flatten them, and then vertically stack all the images. This object can be fed into the Estimator function.
Below is code to flatten and vertically stack two three-dimensional Numpy arrays:
import numpy as np

# np_2 is assumed to have been read the same way as np_1, with the same size
np_1_2 = np.vstack((np_1.flatten(), np_2.flatten()))
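Extending that to a whole set of files, a sketch with hypothetical filenames (imageio is used here because scipy.misc.imread has been removed from recent SciPy):

import numpy as np
import imageio.v2 as imageio

filenames = ['1.jpg', '2.jpg', '3.jpg']  # hypothetical same-sized images
x = np.vstack([imageio.imread(f).flatten() for f in filenames])
print(x.shape)  # (3, height * width * channels)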