How to convert a grayscale image with 1 channel to a colour image with 3 channels? - python

I want to feed the MNIST dataset to the MobileNet V1 CNN, but I ran into this problem:
ValueError: Error when checking input: expected input_1 to have shape (32, 32, 3) but got array with shape (28, 28, 1)
Below is my code
import random
import numpy as np

image_data, label_data = data['image'], data['label']
idx_list = {}
for i in range(10):
    idx_list[i] = np.where(label_data == i)  # np.where returns a tuple of index arrays
selected_test_sample_indices = {}
for label in range(10):
    selected_test_sample_indices[label] = random.sample(
        list(idx_list[label][0]), int(len(idx_list[label][0]) * 0.2))
selected_train_sample_indices = {}
for label in range(10):
    selected_train_sample_indices[label] = list(
        set(idx_list[label][0]) - set(selected_test_sample_indices[label]))
train_data_indices, test_data_indices = [], []
for label, indices in selected_train_sample_indices.items():
    train_data_indices = train_data_indices + indices  # merge the two lists
for label, indices in selected_test_sample_indices.items():
    test_data_indices = test_data_indices + indices
random.shuffle(train_data_indices)
random.shuffle(test_data_indices)
y_train_data = np.array([label_data[idx] for idx in train_data_indices])
X_train_data = np.array([image_data[idx] for idx in train_data_indices])
y_test_data = np.array([label_data[idx] for idx in test_data_indices])
X_test_data = np.array([image_data[idx] for idx in test_data_indices])
number_of_classes = 10
y_train = y_train_data
y_test = y_test_data
img_rows, img_cols = 28, 28  # MNIST images are 28x28
X_train = X_train_data.reshape(X_train_data.shape[0], img_rows, img_cols, 1)
X_test = X_test_data.reshape(X_test_data.shape[0], img_rows, img_cols, 1)
When I tried to reshape, I got the following error:
ValueError: cannot reshape array of size 11146912 into shape (14218,32,32,1)
When I change the target shape to (4500, 32, 32, 3), the total element count still does not match 11146912.
This really confuses me. Please help me fix this bug.

The MNIST dataset contains grayscale images of size 28x28 pixels, which is why each image has shape (28, 28, 1) with values between 0 and 255. Note that 11146912 = 14218 x 28 x 28, so your array still holds 28x28 grayscale images and must first be reshaped to (14218, 28, 28, 1), not to a 32x32 shape. There's another Stack Overflow question with the same problem; the accepted approach is to convert the grayscale images to RGB and then resize them.

After converting the grayscale images to RGB, the shape of each image changes from
28 x 28 x 1 to 28 x 28 x 3
Then you need to resize them to 32x32. You can use the OpenCV library for that:
resized_image = cv2.resize(image, (32, 32))
Your resized_image shape will then be 32 x 32 x 3
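Putting both steps together, here is a minimal sketch (assuming X_train is a uint8 array of shape (N, 28, 28, 1); the helper name to_mobilenet_input is my own, not from your code):
import cv2
import numpy as np

def to_mobilenet_input(images):
    # images: (N, 28, 28, 1) grayscale, dtype uint8 -> (N, 32, 32, 3) RGB
    out = []
    for img in images:
        rgb = cv2.cvtColor(img.squeeze(), cv2.COLOR_GRAY2RGB)  # (28, 28) -> (28, 28, 3)
        out.append(cv2.resize(rgb, (32, 32)))                  # (28, 28, 3) -> (32, 32, 3)
    return np.stack(out)

X_train = to_mobilenet_input(X_train)
X_test = to_mobilenet_input(X_test)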

Related

TensorFlow dataset with multiple inputs and target

I am trying to implement a model with the ArcFace layer:
https://github.com/4uiiurz1/keras-arcface
To this end I created a tf.data.Dataset like so:
images = tf.data.Dataset.from_tensor_slices(train.A_image.to_numpy())
target = tf.keras.utils.to_categorical(
    train.Label.to_numpy(), num_classes=n_class, dtype='float32'
)
target = tf.data.Dataset.from_tensor_slices(target)
images = images.map(transform_img)
dataset = tf.data.Dataset.zip((images, target, target))
When I call model.fit(dataset), I get the following error:
ValueError: Layer model expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=<unknown> dtype=float32>]
But this should work according to:
tf.data with multiple inputs / outputs in Keras
Can someone point out my folly?
Thanks!
Edit:
This solves some problems:
# reads in filepaths to images from dataframe train
images = tf.data.Dataset.from_tensor_slices(train.image.to_numpy())
# converts labels to one-hot encoded vectors
target = tf.keras.utils.to_categorical(train.Label.to_numpy(), num_classes=n_class, dtype='float32')
# reads in the image and resizes it
images = images.map(transform_img)
input_1 = tf.data.Dataset.zip((anchors, target))
dataset = tf.data.Dataset.zip((input_1, target))
I think this is what we are trying to do, but I get a shape error for the targets: it's (n_class, 1) instead of just (n_class,). That is, the fit method throws this error:
ValueError: Shapes (n_class, 1) and (n_class, n_class) are incompatible
and this warning:
input expected is (None, n_class) but received an input of (n_class, 1)
I've made changes to the solution based on the ArcFace model you wanted; here is the code, and I've managed to train with it.
The first version uses from_tensor_slices with a dict as the original input; I used MNIST to test it out:
def map_data(inputs, outputs):
    image = tf.cast(inputs['image_input'], tf.float32)
    image = image / 255.                   # scale pixel values to the 0-1 range
    image = tf.expand_dims(image, axis=2)  # add the channel dimension
    labels = tf.one_hot(outputs, 10)
    return {'image_input': image, 'label_input': labels}, labels

dataset = tf.data.Dataset.from_tensor_slices(({
    'image_input': x_train, 'label_input': y_train
}, y_train))
dataset = dataset.map(map_data)
dataset = dataset.batch(2)
Here is the second variant I tried: a plain from_tensor_slices call which I then converted into a multi-input structure, since the labels are used both as an input and as the output:
def map_data(images, annot_labels):
    image = tf.cast(images, tf.float32)
    image = image / 255.                   # scale pixel values to the 0-1 range
    image = tf.expand_dims(image, axis=2)  # add the channel dimension
    labels = tf.one_hot(annot_labels, 10)
    return {'image_input': image, 'label_input': labels}, labels

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.map(map_data)
dataset = dataset.batch(2)
I think you should do it like this:
target = tf.keras.utils.to_categorical(train.Label.to_numpy(), num_classes=n_class, dtype='float32')
images_target = tf.data.Dataset.from_tensor_slices((train.A_image.to_numpy(), target))
images_target = images_target.map(lambda x, y: (transform_img(x), y))
target = tf.data.Dataset.from_tensor_slices(target)
dataset = tf.data.Dataset.zip((images_target, target))
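For reference, Keras expects each dataset element to be an (inputs, targets) pair, where inputs can itself be a tuple or dict matching the model's inputs. Here is a minimal self-contained sketch of that structure with dummy data (the shapes and layer choices are illustrative, not the ArcFace model itself):
import numpy as np
import tensorflow as tf

n_class = 10
x = np.random.rand(8, 32, 32, 3).astype('float32')  # dummy images
y = tf.keras.utils.to_categorical(np.random.randint(0, n_class, 8), n_class)

# Each element is ((image, label), target): two inputs, one target.
dataset = tf.data.Dataset.from_tensor_slices(((x, y), y)).batch(4)

image_in = tf.keras.Input(shape=(32, 32, 3), name='image_input')
label_in = tf.keras.Input(shape=(n_class,), name='label_input')
h = tf.keras.layers.GlobalAveragePooling2D()(image_in)
h = tf.keras.layers.Concatenate()([h, label_in])  # stand-in for the ArcFace layer
out = tf.keras.layers.Dense(n_class, activation='softmax')(h)
model = tf.keras.Model(inputs=[image_in, label_in], outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(dataset)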

ValueError: cannot reshape array of size 15525000 into shape (260,260) in Python?

I have a problem reshaping a dataframe for a CNN.
My dataframe shape: train.shape -> (230, 67502).
I wrote the code shown below:
Y_train = train["Label"]
X_train = train.drop(labels = ["Label"],axis = 1)
When I run the code below to plot some images via iloc, it throws an error:
img = X_train.iloc[0].to_numpy()
img = np.pad(img, (0, (67600-img.shape[0])), 'constant').reshape((260, 260))
plt.imshow(img,cmap='gray')
plt.title(train.iloc[0,0])
plt.axis("off")
plt.show()
Then I normalize X_train
X_train = X_train / 255.0
print("x_train shape: ",X_train.shape)
When I reshape X_train, it throws an error:
X_train = X_train.values.reshape(-1, 260, 260)
print("x_train shape: ",X_train.shape)
ValueError: cannot reshape array of size 15525000 into shape (260,260)
How can I fix the issue?
Are you absolutely sure you need a 260 x 260 image? As 67500 == 270 * 250, you can try that shape and see how it looks.
Otherwise you would need to pad; how exactly depends on your image.
One of the simplest options is to append 100 zeros at the end of each row to bring it to 67600, hence 260 * 260.
Well, 260 * 260 is 67600, not 67500, so you can't reshape your array into those dimensions.
To actually get those dimensions you would need to pad the source image arrays. For example, check the Keras pad_sequences documentation for dealing with this kind of issue.
Solution:
X_train = np.pad(X_train, ((0,0), (0, (67600-X_train.shape[1]))), 'constant').reshape(-1, 260, 260)
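As a quick self-contained check of that padding arithmetic (with a dummy array standing in for the real X_train.values):
import numpy as np

X = np.zeros((230, 67500), dtype='float32')  # dummy stand-in for X_train.values
X = np.pad(X, ((0, 0), (0, 67600 - X.shape[1])), 'constant')  # pad each row to 67600 = 260*260
X = X.reshape(-1, 260, 260)
print(X.shape)  # (230, 260, 260)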

ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (6243, 256, 256)

I want to attach the label to each sample of the training dataset, and I do it as follows:
import os
import cv2
import numpy as np
from random import shuffle
from tqdm import tqdm

def one_hot_label(img):
    label = img
    if label == 'A':
        ohl = np.array([1, 0])
    elif label == 'B':
        ohl = np.array([0, 1])
    return ohl

def train_data_with_label():
    train_images = []
    for i in tqdm(os.listdir(train_data)):
        path_pre = os.path.join(train_data, i)
        for img in os.listdir(path_pre):
            if img.endswith('.jpg'):
                path = os.path.join(path_pre, img)
                img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
                train_images.append([np.array(img), one_hot_label(i)])
    shuffle(train_images)
    return train_images
However, the error is raised when I feed the input to Keras:
training_images = train_data_with_label()
tr_img_data = np.array([i[0] for i in training_images])
tr_lbl_data = np.array([i[1] for i in training_images])
from keras.models import Sequential
from keras.layers import InputLayer

model = Sequential()
model.add(InputLayer(input_shape=(256, 256, 1)))
Can anyone help me to fix it?
Your input layer is expecting an array of shape (batch_size, 256, 256, 1) but it looks like you are passing in data of the shape (batch_size, 256, 256). You can try reshaping your training data as follows:
tr_img_data = np.expand_dims(tr_img_data, axis=-1)
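A quick illustration of what that call does (dummy zeros standing in for the real images):
import numpy as np

batch = np.zeros((6243, 256, 256))      # grayscale images without a channel axis
batch = np.expand_dims(batch, axis=-1)  # append the channel axis Keras expects
print(batch.shape)                      # (6243, 256, 256, 1)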

How do I reshape the dimensions of an image to contain the number of images (i.e., 1) as well?

I am running a neural network model on some images. Initially, for training, I converted all the images into a pandas dataframe of dimension (# of images in the dataset) x r x g x b, where r, g, b are the colour values of each image. Now, when I try to test the model on a single externally downloaded image, it gives a dimension error because, obviously, the image's dimensions are only r x g x b. How do I add the number of images as a dimension to this image?
EDIT: Here's the code:
#load the data as a pandas data frame
import pandas as pd
dataset = pd.read_csv(os.path.join(data_path, 'data.csv'))
# split into input (X) and output (Y) variables
X = dataset.values[:,0]
Y = dataset.values[:,1]
# Load all the images and resize them into a single numpy array of consistent dimension
from scipy.misc import imresize
from scipy.misc import imread
import numpy as np
temp = []
for img_name in X:
    img_path = os.path.join(data_dir, 'Train', img_name)
    img = imread(img_path)
    img = imresize(img, (32, 32))
    img = img.astype('float32')
    temp.append(img)
X = np.stack(temp)
# Convert the data classes from words into a number format readable by the program
from sklearn.preprocessing import LabelEncoder
lb = LabelEncoder()
Y = lb.fit_transform(Y)
Y = keras.utils.np_utils.to_categorical(Y)
# Split the data into 67% for training and 33% for testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33)
### Define the neural network model
### Compile and train the model on the data
### Evaluate it
# Test it on an externally downloaded image
img = imread(os.path.join(image_folder, downloaded_image)).astype('float32')
plt.imshow(imresize(img, (128, 128)))
print('X_train shape: ', X_train.shape)
print('Downloaded image shape: ', img.shape)
This returns:
X_train shape: (13338, 32, 32, 3)
Downloaded image shape: (448, 720, 3)
I want to make the downloaded image's shape (1, 448, 720, 3) so that it matches the dimensionality of X_train, because when I try to predict the class of the downloaded image, it returns a dimension error:
pred = cnn_model.predict_classes(img)
print('Predicted:', lb.inverse_transform(pred))
This returns:
ValueError: Error when checking : expected conv2d_71_input to have 4 dimensions, but got array with shape (960, 640, 3)
From your description, it seems like you don't really mean to use the number of images as a feature, but rather as a sample weight. Conceptually, you probably want to transform
k x r x g x b
to
r x g x b
... # repeated k times
r x g x b
which would naturally make the input and output dimensions identical, by the way. If this increases learning time too much, and your library has a sample weight parameter, you should consider using it.
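If you go the sample-weight route, Keras exposes this directly on fit (a sketch; the weight array here is illustrative, using cnn_model from the question's code):
import numpy as np

w = np.full(len(X_train), 1.0)  # one weight per training sample; raise it to k for repeated samples
cnn_model.fit(X_train, Y_train, sample_weight=w)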
If you'd like to just technically add a dimension, you can use np.expand_dims:
>>> np.expand_dims(np.array([[1, 2, 3], [3, 4, 5]]), axis=0).shape
(1, 2, 3)
However, I cannot say I'm sure that this is fundamentally what you want.
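Applied to the question's downloaded image, note that the model was trained on 32x32 inputs, so the image also needs resizing before the batch axis is added (a sketch using imresize and cnn_model from the question's own code):
img_small = imresize(img, (32, 32)).astype('float32')  # match the 32x32x3 training shape
img_batch = np.expand_dims(img_small, axis=0)          # shape (1, 32, 32, 3)
pred = cnn_model.predict_classes(img_batch)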

Why must I reshape one image to [n, height, width, channel] in a CNN?

I try to apply a convolutional layer to a picture of shape [256, 256, 3].
I get an error when I use the tensor of the image directly:
conv1 = conv2d(input, W_conv1) + b_conv1  # <=== error
Error message:
ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D')
with input shapes: [256,256,3], [3,3,3,1].
But when I reshape it, the conv2d function works normally:
x_image = tf.reshape(input, [-1, 256, 256, 3])
conv1 = conv2d(x_image, W_conv1) + b_conv1
If I must reshape the tensor, what is the best shape to use in my case, and why?
import tensorflow as tf
import numpy as np
from PIL import Image

def img_to_tensor(img):
    return tf.convert_to_tensor(img, np.float32)

def weight_generater(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_generater(shape):
    return tf.Variable(tf.constant(.1, shape=shape))

def conv2d(x, W):
    return tf.nn.conv2d(x, W, [1, 1, 1, 1], 'SAME')

def pool_max_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')

# read the image
img = Image.open("img.tif")
sess = tf.InteractiveSession()
# convert the image to a tensor
input = img_to_tensor(img).eval()
# print(input)
# get the image dimensions
img_dimension = tf.shape(input).eval()
print(img_dimension)
height, width, channel = img_dimension
filter_size = 3
feature_map = 32
x = tf.placeholder(tf.float32, shape=[height * width * channel])
y = tf.placeholder(tf.float32, shape=21)
# generate weights: [kernel size, kernel size, channels, number of filters]
W_conv1 = weight_generater([filter_size, filter_size, channel, 1])
# each filter W has its own specific bias
b_conv1 = bias_generater([feature_map])
""" I must reshape the picture
x_image = tf.reshape(input, [-1, 256, 256, 3])
"""
conv1 = conv2d(input, W_conv1) + b_conv1  # <=== error
h_conv1 = tf.nn.relu(conv1)
h_pool1 = pool_max_2x2(h_conv1)
layer1_dimension = tf.shape(h_pool1).eval()
print(layer1_dimension)
The first dimension is the batch size. If you are feeding one image at a time, you can simply make the first dimension 1; it doesn't change your data at all, it just changes the indexing to 4D:
x_image = tf.reshape(input, [1, 256, 256, 3])
If you reshape it with a -1 in the first dimension, what you are saying is that you will feed in a 4D batch of images (shaped [batch_size, height, width, color_channels]), and you are allowing the batch size to be dynamic (which is common to do).
You could also use
im = tf.expand_dims(input, axis=0)
to insert a dimension of 1 into the tensor's shape. im will be a rank 4 tensor. This way you do not have to specify the dimensions of the image.
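A minimal runnable sketch of the expand_dims fix (written in TF2 eager style rather than the question's TF1 session style; the random image and kernel are stand-ins):
import numpy as np
import tensorflow as tf

img = np.random.rand(256, 256, 3).astype('float32')  # stand-in for the loaded image
batch = tf.expand_dims(img, axis=0)                   # shape (1, 256, 256, 3), rank 4
kernel = tf.random.normal([3, 3, 3, 1])               # [kh, kw, in_channels, out_channels]
conv1 = tf.nn.conv2d(batch, kernel, strides=[1, 1, 1, 1], padding='SAME')
print(conv1.shape)  # (1, 256, 256, 1)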
