How do I reshape the dimensions of an image to contain the number of images (i.e., 1) as well? - python

I am running a neural network model on some images. Initially, for training, I converted all the images into a pandas dataframe of dimension (# of images in the dataset) x r x g x b, where r, g, b are the colour values of each image. Now when I am trying to test the model on a single externally downloaded image, it is giving a dimension error as, obviously, the image's dimension is only r x g x b. How do I add the number of images as a dimension into this image?
EDIT: Here's the code:
#load the data as a pandas data frame
import pandas as pd
dataset = pd.read_csv(os.path.join(data_path, 'data.csv'))
# split into input (X) and output (Y) variables
X = dataset.values[:,0]
Y = dataset.values[:,1]
# Load all the images and resize them into a single numpy array of consistent dimension
from scipy.misc import imresize
from scipy.misc import imread
import numpy as np
temp = []
for img_name in X:
    img_path = os.path.join(data_dir, 'Train', img_name)
    img = imread(img_path)
    img = imresize(img, (32, 32))
    img = img.astype('float32')
    temp.append(img)
X = np.stack(temp)
# Convert the data classes from words into a number format readable by the program
from sklearn.preprocessing import LabelEncoder
lb = LabelEncoder()
Y = lb.fit_transform(Y)
Y = keras.utils.np_utils.to_categorical(Y)
# Split the data into 67% for training and 33% for testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33)
### Define the neural network model
### Compile and train the model on the data
### Evaluate it
# Test it on an externally downloaded image
img = imread(os.path.join(image_folder, downloaded_image)).astype('float32')
plt.imshow(imresize(img, (128, 128)))
print('X_train shape: ', X_train.shape)
print('Downloaded image shape: ', img.shape)
This returns:
X_train shape: (13338, 32, 32, 3)
Downloaded image shape: (448, 720, 3)
I want to make the downloaded image's shape to be (1, 448, 720, 3) so that it matches the dimensions of X_train's shape, because when I try to predict the class of the downloaded image, it returns a dimension error:
pred = cnn_model.predict_classes(img)
print('Predicted:', lb.inverse_transform(pred))
This returns:
ValueError: Error when checking : expected conv2d_71_input to have 4 dimensions, but got array with shape (960, 640, 3)

From your description, it seems like you don't really mean to use the number of images as a feature, but rather as a sample weight. Conceptually, you probably want to transform
k x r x g x b
to
r x g x b
... # repeat k times
r x g x b
which would naturally make the input and output dimensions identical, BTW. If this increases learning time too much, and your library has a sample weight parameter, you should consider using it.
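With Keras, the sample-weight route might look like the following minimal sketch (model, X, Y, and the per-sample counts here are hypothetical illustrations, not taken from your code):
import numpy as np
# instead of physically repeating each sample k times,
# weight it by k during training
counts = np.array([3.0, 1.0, 2.0])
model.fit(X, Y, sample_weight=counts)  # Keras fit() accepts per-sample weights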
If you'd like to just technically add a dimension, you can use np.expand_dims:
>>> np.expand_dims(np.array([[1, 2, 3], [3, 4, 5]]), axis=0).shape
(1, 2, 3)
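Applied to your case, a minimal sketch might be (assuming the imresize and cnn_model from your code; note that the network was trained on 32x32 inputs, so the downloaded image should be resized to match before the batch axis is added):
img_small = imresize(img, (32, 32)).astype('float32')
batch = np.expand_dims(img_small, axis=0)  # shape (1, 32, 32, 3)
# equivalently: batch = img_small[np.newaxis, ...]
pred = cnn_model.predict_classes(batch)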
However, I cannot say I'm sure that this is fundamentally what you want.

Related

What shape do I need to have my array of color jpeg images in order to enter it into a CNN, and how do I reshape it to the required shape?

I am trying to train a CNN with 20 jpeg images just as an exercise. I chose the shape of the input layer to be input_shape=(32, 32, 3) but I am getting errors. When I run a "print shape" for the image data array I get (10,). I am not sure why it is like this. Shouldn't a color image shape have 3 or 4 dimensions? The shape of my array of jpegs seems to be (10,): one-dimensional. How do I transform the shape in order to use the fit function below, and to what shape?
import tensorflow as tf
from tensorflow.keras import layers, models
from matplotlib import pyplot
import random
import numpy as np
from os import listdir
from matplotlib import image
# load all the cat train images in the cat train directory
imagesWithLabels = []
for filename in listdir('C:/AI/images/airplanes'):
    # load image
    img_data = image.imread('C:/AI/images/airplanes/' + filename)
    # store loaded image in a list
    imagesWithLabels.append((img_data, 0))
    print('> loaded %s %s' % (filename, img_data.shape))
for filename in listdir('C:/AI/images/automobiles'):
    # load image
    img_data = image.imread('C:/AI/images/automobiles/' + filename)
    # store loaded image in the list
    imagesWithLabels.append((img_data, 1))
    print('> loaded %s %s' % (filename, img_data.shape))
#check to see that all 20 images of planes and autos are there
len(imagesWithLabels)
random.shuffle(imagesWithLabels)
type(imagesWithLabels)
train = imagesWithLabels[:10]
test = imagesWithLabels[10:]
x_train, y_train = zip(*train)
x_test, y_test = zip(*test)
x_train = np.array(x_train)
x_test = np.array(x_test)
y_train = np.array(y_train)
y_test = np.array(y_test)
type(x_train)
type(x_test)
for i in range(10):
    pyplot.imshow(x_train[i])
    pyplot.show()
CNN_model = models.Sequential()
CNN_model.add(layers.Conv2D(50, (2, 2), activation='relu',
                            input_shape=(32, 32, 3)))
CNN_model.add(layers.MaxPooling2D((3, 3)))
CNN_model.add(layers.Flatten())
CNN_model.add(layers.Dense(50, activation='relu'))
CNN_model.add(layers.Dropout(.1))
CNN_model.add(layers.Dense(10, activation='softmax'))
optimizer = tf.optimizers.Adam(learning_rate=.005)
CNN_model.compile(optimizer=optimizer,
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=['accuracy'])
history = CNN_model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
Assuming all the images are RGB images of the same size (W, H, C) = (32, 32, 3), you can load them and store them in a Python list as you are doing (together with their labels). If you read 10 images, this list has length 10, and each item is a NumPy array or Tensor of shape (32, 32, 3).
However, to train a network you must turn this list of tensors into one Tensor. You achieve this by stacking the images along one new dimension:
x_train, y_train = zip(*train)
x_test, y_test = zip(*test)
x_train = tf.stack(x_train)
x_test = tf.stack(x_test)
y_train = tf.convert_to_tensor(y_train)
y_test = tf.convert_to_tensor(y_test)
Those tensors should be of size (N, 32, 32, 3), where N is the number of images in each set. Do not forget to transform everything into a Tensor before using it in a network; tf.convert_to_tensor() converts Python lists and NumPy arrays to Tensors.
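As a quick sanity check, here is a minimal sketch with fake data (the random images are purely for illustration):
import tensorflow as tf
import numpy as np
imgs = [np.random.rand(32, 32, 3).astype('float32') for _ in range(10)]
batch = tf.stack(imgs)  # stacks along a new leading axis
print(batch.shape)  # (10, 32, 32, 3)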

Wrong shape Dataset Tensorflow

I'm new to TensorFlow and I'm trying to feed some data with tensorflow.Dataset. I'm using the Cityscape dataset with 8 different classes. Here is my code:
import os
import cv2
import numpy as np
import tensorflow as tf
H = 256
W = 256
id2cat = np.array([0,0,0,0,0,0,0, 1,1,1,1, 2,2,2,2,2,2, 3,3,3,3, 4,4, 5, 6,6, 7,7,7,7,7,7,7,7,7])
def readImage(x):
    x = cv2.imread(x, cv2.IMREAD_COLOR)
    x = cv2.resize(x, (W, H))
    x = x / 255.0
    x = x.astype(np.float32)
    return x

def readMask(path):
    mask = cv2.imread(path, 0)
    mask = cv2.resize(mask, (W, H))
    mask = id2cat[mask]
    return mask.astype(np.int32)

def preprocess(x, y):
    def f(x, y):
        image = readImage(x)
        mask = readMask(y)
        return image, mask
    image, mask = tf.numpy_function(f, [x, y], [tf.float32, tf.int32])
    mask = tf.one_hot(mask, 3, dtype=tf.int32)
    image.set_shape([H, W, 3])
    mask.set_shape([H, W, 3])
    return image, mask

def tf_dataset(x, y, batch=8):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=5000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(2)
    return dataset

def loadCityscape():
    trainPath = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'datasets\\Cityscape\\train')
    imagesPath = os.path.join(trainPath, 'images')
    maskPath = os.path.join(trainPath, 'masks')
    images = []
    masks = []
    print('Loading images and masks for Cityscape dataset...')
    for image in os.listdir(imagesPath):
        images.append(readImage(os.path.join(imagesPath, image)))
    for mask in os.listdir(maskPath):
        if 'label' in mask:
            masks.append(readMask(os.path.join(maskPath, mask)))
    print('Loaded {} images\n'.format(len(images)))
    return images, masks
images, masks = loadCityscape()
dataset = tf_dataset(images, masks, batch=8)
print(dataset)
That last print(dataset) shows:
<PrefetchDataset shapes: ((None, 256, 256, 3), (None, 256, 256, 3)), types: (tf.float32, tf.int32)>
Why am I obtaining (None, 256, 256, 3) instead of (8, 256, 256, 3)? I also have some doubts about how to iterate over this dataset.
Thanks a lot.
TensorFlow is a graph-based mathematical framework that abstracts away, for you, all of the complex vector and matrix operations you face, particularly in machine learning.
What the developers thought is that it would be uncomfortable to specify, every single time, how many input samples you need to pass to your model for training, so they decided to abstract it away for you.
Your model is not interested in whether it is fed one single sample or thousands, as long as the output dimension matches the input dimension (and any internal operation matches in dimensions, too!).
So the None size is a placeholder for a shape that may change; it is usually the batch size of the input.
We need a placeholder because (None, 2) is a different shape from just (2,): in the first case we know we will face two dimensions.
Even though the None dimension is unknown when you "compile" your model, it is only evaluated when strictly needed, in other words when you run the model. This way your model is happy to run on a batch of 64 samples just as well as on 128.
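For instance, a minimal sketch of a Keras input shows the placeholder directly:
import tensorflow as tf
inp = tf.keras.Input(shape=(2,))  # the batch dimension is left as None
print(inp.shape)  # (None, 2)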
For the rest, a (non-scalar) Tensor behaves like a normal NumPy array:
tensor1 = tf.constant([0, 1, 2, 3])          # shape (4,)
tensor2 = tf.constant([[0], [1], [2], [3]])  # shape (4, 1)
for x in tensor1:
    print(x)  # 0, 1, 2, 3
for x in tensor2:
    print(x)  # Tensor([0]), Tensor([1]), Tensor([2]), Tensor([3])
The only difference is that it can be allocated into any supported device memory (CPU / CUDA GPU).
Iterating through the dataset is just like slicing it into (usually) constant-size pieces, where that constant is your batch size, which fills the empty None dimension.
This line of code is responsible for slicing your dataset into "sub-tensors" ("sub-arrays") composed of its samples:
dataset = dataset.batch(N)
# iterating over it:
for batch in dataset:  # I'm taking N samples here
    ...
Your "runtime" shape will be (N, 256, 256, 3), but if you take an element from the dataset it may still have None in the shape. That is because we cannot guarantee, for example, that the size of the dataset is exactly divisible by the batch size, so a trailing batch of a different shape is still possible. You will hardly get rid of that None dimension, although in some custom methods of your model you could achieve that.
If you are still uncomfortable with tensors, there is the tensor.numpy() method that gives you back a NumPy array, at the cost of copying it (usually to your CPU). This is not available in every step of the process.
There are many ways to define a dataset in TensorFlow; I suggest reading how they think you should build an input pipeline, because understanding how TensorFlow handles your code at higher levels of abstraction will make your life easier.

How to convert grayscale image shape with 1 channel to coloured image shape with 3 channels?

I want to feed the MNIST dataset to the MobileNet V1 CNN. Then I faced this problem:
ValueError: Error when checking input: expected input_1 to have shape (32, 32, 3) but got array with shape (28, 28, 1)
Below is my code:
image_data, label_data = data['image'], data['label']
idx_list = {}
for i in range(10):
    idx_list[i] = np.where(label_data == i)  # returns a tuple of (row indices, column indices)
selected_test_sample_indices = {}
for label in range(10):
    selected_test_sample_indices[label] = random.sample(set(idx_list[label][0]), int(len(idx_list[label][0]) * 0.2))
selected_train_sample_indicies = {}
for label in range(10):
    selected_train_sample_indicies[label] = list(set(idx_list[label][0]) - set(selected_test_sample_indices[label]))
train_data_indicies, test_data_indicies = [], []
for label, indicies in selected_train_sample_indicies.items():
    train_data_indicies = train_data_indicies + indicies  # merge the two lists
for label, indicies in selected_test_sample_indices.items():
    test_data_indicies = test_data_indicies + indicies
random.shuffle(train_data_indicies)
random.shuffle(test_data_indicies)
y_train_data = np.array([label_data[idx] for idx in train_data_indicies])
X_train_data = np.array([image_data[idx] for idx in train_data_indicies])
y_test_data = np.array([label_data[idx] for idx in test_data_indicies])
X_test_data = np.array([image_data[idx] for idx in test_data_indicies])
number_of_classes = 10
y_train = y_train_data
y_test = y_test_data
X_train = X_train_data.reshape(X_train_data.shape[0], img_rows, img_cols, 1)
X_test = X_test_data.reshape(X_test_data.shape[0], img_rows, img_cols, 1)
When I tried to reshape, I got the following error:
ValueError: cannot reshape array of size 11146912 into shape (14218,32,32,1)
When I change it to (4500, 32, 32, 3), the total size is lower than 11146912.
It really confused me. Please help me fix this bug.
The MNIST dataset contains grayscale images with a size of 28x28 pixels. That is why the shape of each image is (28, 28, 1), with each value between 0 and 255. There is another Stack Overflow question with the same problem; the suggested answer is to convert the grayscale images into RGB images and then resize them.
After converting the grayscale images to RGB images, the shape of your images will change from
28 x 28 x 1 to 28 x 28 x 3
Then you need to resize them to 32 x 32. You can use the OpenCV library for that:
resized_image = cv2.resize(image, (32, 32))
Then your resized_image shape would be 32 x 32 x 3
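Putting the two steps together, a minimal sketch might be (assuming X holds the MNIST images as uint8 arrays of shape (N, 28, 28)):
import cv2
import numpy as np
rgb_resized = np.stack([
    cv2.cvtColor(cv2.resize(img, (32, 32)), cv2.COLOR_GRAY2RGB)
    for img in X
])
print(rgb_resized.shape)  # (N, 32, 32, 3)
An alternative that avoids the OpenCV colour conversion is np.repeat(x, 3, axis=-1) on a resized image with a trailing channel axis, which copies the single channel three times.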

Keras ImageDataGenerator: problem with data and label shape

I wanted to generate more images using Keras, as you can see here, using this code (almost the same as the source's "Random Rotations" example):
# Random Rotations
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
from keras import backend as K
datagen = ImageDataGenerator(rotation_range=90)
# fit parameters from data
datagen.fit(cats["images"])
print(np.asarray(cats["label"]).shape) #output=(12464,)
print(np.asarray(cats["images"]).shape) #output=(12464, 60, 60, 1)
# configure batch size and retrieve one batch of images
for X_batch, y_batch in datagen.flow(cats["images"], cats["label"], batch_size=9):
    # create a grid of 3x3 images
    for i in range(0, 9):
        pyplot.subplot(330 + 1 + i)
        pyplot.imshow(X_batch[i].reshape(28, 28), cmap=pyplot.get_cmap('gray'))
    # show the plot
    pyplot.show()
    break
But I get the following error:
ValueError: x (images tensor) and y (labels) should have the same length. Found: x.shape = (60, 60, 1), y.shape = (12464,)
This might help for further inspection: I imagine there could be something wrong with the library, because if I change the shape of my images into 60x60 instead of 60x60x1 I get:
ValueError: Input to .fit() should have rank 4. Got array with shape: (12464, 60, 60)
It is very likely that cats['images'] and cats['labels'] are Python lists. First convert them to arrays using np.array and then pass them to the flow method:
cats['images'] = np.array(cats['images'])
cats['labels'] = np.array(cats['labels'])
You need to change the shape of your labels:
labels = np.asarray(cats["label"]).reshape(( -1 , 1 ))
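Combining both suggestions, a minimal sketch (the cats dict and datagen are from your code; the shapes in the comments are the ones your prints reported):
import numpy as np
images = np.array(cats["images"])                # (12464, 60, 60, 1)
labels = np.array(cats["label"]).reshape(-1, 1)  # (12464, 1)
datagen.fit(images)
X_batch, y_batch = next(datagen.flow(images, labels, batch_size=9))
print(X_batch.shape, y_batch.shape)  # (9, 60, 60, 1) (9, 1)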

Reshape list of images into right format for CNN

I have a set of images in an np.array with shape (nb_examples, 1), where each element has shape (128, 128, 3). What I am trying to do is get an array of shape (nb_examples, 128, 128, 3). I have tried many techniques; this is one of them:
import cv2
import os
import glob
import re
img_size_target = (128,128)
img_dir = "train_set/" # Enter Directory of all images
data_path = os.path.join(img_dir,'*.bmp')
files = glob.glob(data_path)
data = []
indexes = []
files_names = []
for f1 in np.sort(files):
    # reading images using OpenCV
    img = cv2.imread(f1)
    files_names.append(str(f1))
    data.append(img)
    # using the filename number to get the index of the sample
    result = re.search('/foie_(.*)\.bmp', str(f1))
    indexes.append(int(result.group(1)))
# Create the dataframe
train_df = pd.DataFrame({"id": indexes, "file": files_names, "images": data})
train_df.sort_values(by="id", ascending=True, inplace=True)
# Split train/validation set
ids_train, ids_valid, x_train, x_valid, y_train, y_valid = train_test_split(
    train_df.id.values,
    # Here I resize the original images from (700, 960, 3) to (128, 128, 3)
    np.array(train_df.images.apply(lambda x: cv2.resize(x, img_size_target).reshape(img_size_target[0], img_size_target[1], 3))),
    train_df.target,
    test_size=0.2, stratify=train_df.target, random_state=1337)
print(x_train.shape)     # result is (1510, 1)
print(x_train[0].shape)  # result is (128, 128, 3)
x_train.reshape(x_train.shape[0], 128, 128, 3)
I get the error
ValueError: cannot reshape array of size 1510 into shape (1510,128,128,3)
The final idea is to use these images for a CNN model. When I give the images as inputs to the CNN without reshaping I get the following error:
ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (1510, 1)
Thanks for the hint, @madjaoue. The answer was:
x_train = np.array([img for img in x_train])
x_valid = np.array([img for img in x_valid])
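An equivalent, assuming each element really is a (128, 128, 3) array, is np.stack:
x_train = np.stack([img for img in x_train])  # (1510, 128, 128, 3)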
