I am writing a data pipeline to feed batches of time-series sequences and corresponding labels into an LSTM model which requires a 3D input shape. I currently have the following:
def split(window):
    return window[:-label_length], window[-label_length]
dataset = tf.data.Dataset.from_tensor_slices(data.sin)
dataset = dataset.window(input_length + label_length, shift=label_shift, stride=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(input_length + label_length))
dataset = dataset.map(split, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.cache()
dataset = dataset.shuffle(shuffle_buffer, seed=shuffle_seed, reshuffle_each_iteration=False)
dataset = dataset.batch(batch_size=batch_size, drop_remainder=True)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
Inspecting the first batch with for x, y in dataset.take(1): print(x.shape) gives a shape of (32, 20), where 32 is the batch size and 20 is the length of the sequence, but I need a shape of (32, 20, 1), where the additional dimension denotes the feature.
My question is how I can add this dimension, ideally in the split function passed to dataset.map, so that it happens before the data is cached.
That's easy. Do this in your split function:
def split(window):
    return window[:-label_length, tf.newaxis], window[-label_length, tf.newaxis, tf.newaxis]
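A quick way to sanity-check the resulting shapes (a sketch, assuming input_length = 20 and label_length = 1, as in the question):
for x, y in dataset.take(1):
    print(x.shape)  # (32, 20, 1): batch, sequence, feature
    print(y.shape)  # (32, 1, 1)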
I am using a model I trained myself to translate braille characters into plain text. As you can see, this is a classification problem with 26 classes, one for each letter in the alphabet.
This is the dataset that I used to train my model: https://www.kaggle.com/datasets/shanks0465/braille-character-dataset
This is how I am generating my training and validation set:
import os
from shutil import copyfile

os.mkdir('./images/')
alpha = 'a'
for i in range(0, 26):
    os.mkdir('./images/' + alpha)
    alpha = chr(ord(alpha) + 1)

rootdir = "C:\\Users\\ffernandez\\Downloads\\capstoneProject\\Braille Dataset\\Braille Dataset\\"
for file in os.listdir(rootdir):
    letter = file[0]
    copyfile(rootdir + file, './images/' + letter + '/' + file)
The resulting folder looks like this:
(image: folder structure, one subfolder per letter a-z)
And this is how I create the train and validation split:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=20,
                             shear_range=10,
                             validation_split=0.2)

train_generator = datagen.flow_from_directory('./images/',
                                              target_size=(28, 28),
                                              subset='training')

val_generator = datagen.flow_from_directory('./images/',
                                            target_size=(28, 28),
                                            subset='validation')
Finally, this is the code corresponding to the design, compilation and training of the model:
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2
import tensorflow.keras.layers as L

K.clear_session()

model_ckpt = ModelCheckpoint('BrailleNet.h5', save_best_only=True)
reduce_lr = ReduceLROnPlateau(patience=8, verbose=0)
early_stop = EarlyStopping(patience=15, verbose=1)

entry = L.Input(shape=(28, 28, 3))
x = L.SeparableConv2D(64, (3, 3), activation='relu')(entry)
x = L.MaxPooling2D((2, 2))(x)
x = L.SeparableConv2D(128, (3, 3), activation='relu')(x)
x = L.MaxPooling2D((2, 2))(x)
x = L.SeparableConv2D(256, (2, 2), activation='relu')(x)
x = L.GlobalMaxPooling2D()(x)
x = L.Dense(256)(x)
x = L.LeakyReLU()(x)
x = L.Dense(64, kernel_regularizer=l2(2e-4))(x)
x = L.LeakyReLU()(x)
x = L.Dense(26, activation='softmax')(x)

model = Model(entry, x)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit_generator(train_generator, validation_data=val_generator, epochs=666,
                              callbacks=[model_ckpt, reduce_lr, early_stop], verbose=0)
Then this is the code for testing with an image of the letter 'a' in braille, which has the same size as the training and validation images (28x28):
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

img_path = "./test/a1.JPG10whs.jpg"
img = plt.imread(img_path)
img_array = tf.keras.utils.img_to_array(img)
img_batch = np.expand_dims(img_array, axis=0)
img_preprocessed = tf.keras.applications.resnet50.preprocess_input(img_batch)
prediction = model.predict(img_preprocessed)
print(tf.keras.applications.imagenet_utils.decode_predictions(prediction, top=3)[0])
When I execute that last line of code, this error appears:
ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 26)
I found a similar question here on Stack Overflow (ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 7)).
I've seen that using "decode_predictions" only makes sense if your model outputs the ImageNet classes (1000-dimensional), but if I can't use "decode_predictions" I don't know how to get my predictions.
My desired output would be like:
prediction = model.predict(img_preprocessed)
print(prediction)
output: 'a'
Any hint or suggestion on how to solve this issue is highly appreciated.
If we take a look at what the prediction object actually is, we can see that it has 26 values. These values are the probabilities for each letter that the model predicts.
So we need a way to map each prediction value to the respective letter.
A simple way to do this could be to create a list of all 26 possible letters and look up the index of the max value in the prediction array. Example:
# Create prediction labels from a-z
alpha = "a"
labels = ["a"]
for i in range(0, 25):
    alpha = chr(ord(alpha) + 1)
    labels.append(alpha)

# Search the max value in prediction
labels[np.argmax(prediction)]
The output should be the character with the highest probability, e.g. 'a'.
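As a side note, the same label list can be built from the standard library; a minimal sketch (assuming prediction is the model output from above):
import string
import numpy as np

labels = list(string.ascii_lowercase)  # ['a', 'b', ..., 'z']
print(labels[np.argmax(prediction)])   # e.g. 'a'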
I have read other people's questions about similar issues, but can't figure it out in my case. My code is below; how do I fix this? Thank you.
data = ImageFolder(data_dir, transform=transforms.Compose([transforms.Resize((224, 224)),
                                                           transforms.ToTensor()]))
trainloader = torch.utils.data.DataLoader(data, batch_size=3600,
                                          shuffle=True, num_workers=2)
dataiter = iter(trainloader)
x_train, y_train = next(dataiter)
print(x_train.size())
print(y_train.size())
torch.Size([3600, 3, 224, 224])
torch.Size([3600])
class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # here we set up the tensors......
        self.layer1 = torch.nn.Linear(224, 12)
        self.layer2 = torch.nn.Linear(12, 10)

    def forward(self, x):
        # here we define the (forward) computational graph,
        # in terms of the tensors, and elt-wise non-linearities
        x = F.relu(self.layer1(x))
        x = self.layer2(x)
        return x
net = Net()
y = net.forward(x_train)
lossFn = torch.nn.CrossEntropyLoss()
loss = lossFn(y, y_train)
print(loss)
Your input to the network is a 2D image. That is a tensor with 4 dimensions: batch, channel, height and width.
However, you treat the 2D input as a 1D signal by applying nn.Linear layers to its width dimension only, resulting in an output of shape batch × channel × height × output_dim. In contrast, nn.CrossEntropyLoss expects only one output vector per target label.
You need to change your Net to properly process images into a single vector of predictions.
You can check out milestone image classification architectures here.
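A minimal sketch of one way to do that (the flatten-then-linear layout and layer sizes are illustrative assumptions, not the only fix; a convolutional architecture would usually work better):
import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # flatten each (3, 224, 224) image into one vector per sample
        self.layer1 = torch.nn.Linear(3 * 224 * 224, 12)
        self.layer2 = torch.nn.Linear(12, 10)

    def forward(self, x):
        x = x.flatten(start_dim=1)   # (batch, 3*224*224)
        x = F.relu(self.layer1(x))
        return self.layer2(x)        # (batch, 10): one vector per target label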
I am trying to implement a model with the ArcFace Layer:
https://github.com/4uiiurz1/keras-arcface
To this end I created a tf.data.Dataset like so:
images = tf.data.Dataset.from_tensor_slices(train.A_image.to_numpy())
target = tf.keras.utils.to_categorical(
    train.Label.to_numpy(), num_classes=n_class, dtype='float32'
)
target = tf.data.Dataset.from_tensor_slices(target)
images = images.map(transform_img)
dataset = tf.data.Dataset.zip((images, target, target))
when I call model.fit(dataset)
I get the following error:
ValueError: Layer model expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=<unknown> dtype=float32>]
But this should work according to:
tf.data with multiple inputs / outputs in Keras
Can someone point out my folly?
Thanks!
Edit:
this solves some problems:
# reads in filepaths to images from dataframe train
images = tf.data.Dataset.from_tensor_slices(train.image.to_numpy())

# converts labels to one-hot encoding vectors
target = tf.keras.utils.to_categorical(train.Label.to_numpy(), num_classes=n_class, dtype='float32')

# reads in the image and resizes it
images = images.map(transform_img)

input_1 = tf.data.Dataset.zip((anchors, target))
dataset = tf.data.Dataset.zip((input_1, target))
And I think it's what we are trying to do. But I get a shape error for the targets: it's (n_class, 1) instead of just (n_class,).
I.e. the fit method throws this error:
ValueError: Shapes (n_class, 1) and (n_class, n_class) are incompatible
and this warning
input expected is (None, n_class) but received an input of (n_class, 1)
I've made changes to the solution based on the ArcFace model you wanted. Here is the code; I've managed to train it.
The first variant builds the dataset from tensor slices with the original input structure; I used MNIST to test it out:
def map_data(inputs, outputs):
    image = tf.cast(inputs['image_input'], tf.float32)
    image = image / 255.
    image = tf.expand_dims(image, axis=2)
    labels = tf.one_hot(outputs, 10)
    return {'image_input': image, 'label_input': labels}, labels

dataset = tf.data.Dataset.from_tensor_slices(({
    'image_input': x_train, 'label_input': y_train
}, y_train))
dataset = dataset.map(map_data)
dataset = dataset.batch(2)
The second variant I have tried uses a normal from_tensor_slices dataset which is then converted to a multiple-input one, since the same labels are used for both the input and the output:
def map_data(images, annot_labels):
    image = tf.cast(images, tf.float32)
    image = image / 255.  # convert to 0 - 1 range
    image = tf.expand_dims(image, axis=2)
    labels = tf.one_hot(annot_labels, 10)
    return {'image_input': image, 'label_input': labels}, labels

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.map(map_data)
dataset = dataset.batch(2)
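For either variant, the model's input layers must be named to match the dictionary keys, since Keras routes a dict of inputs by layer name; a minimal sketch (shapes assume MNIST, as in the test above):
image_input = tf.keras.layers.Input(shape=(28, 28, 1), name='image_input')
label_input = tf.keras.layers.Input(shape=(10,), name='label_input')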
I think you should do it like this:
target = tf.keras.utils.to_categorical(train.Label.to_numpy(), num_classes=n_class, dtype='float32')
images_target = tf.data.Dataset.from_tensor_slices((train.A_image.to_numpy(), target))
images_target = images_target.map(lambda x, y: (transform_img(x), y))
target = tf.data.Dataset.from_tensor_slices(target)
dataset = tf.data.Dataset.zip((images_target, target))
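This yields elements with the structure ((image, one_hot_label), one_hot_label), i.e. two inputs and one output. A quick sanity check (a sketch, assuming transform_img returns a single image tensor):
(image, label_input), labels = next(iter(dataset))
print(image.shape, label_input.shape, labels.shape)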
I'm new to TensorFlow and I'm trying to feed some data with tf.data.Dataset. I'm using the Cityscape dataset with 8 different classes. Here is my code:
import os
import cv2
import numpy as np
import tensorflow as tf
H = 256
W = 256
id2cat = np.array([0,0,0,0,0,0,0, 1,1,1,1, 2,2,2,2,2,2, 3,3,3,3, 4,4, 5, 6,6, 7,7,7,7,7,7,7,7,7])
def readImage(x):
    x = cv2.imread(x, cv2.IMREAD_COLOR)
    x = cv2.resize(x, (W, H))
    x = x / 255.0
    x = x.astype(np.float32)
    return x

def readMask(path):
    mask = cv2.imread(path, 0)
    mask = cv2.resize(mask, (W, H))
    mask = id2cat[mask]
    return mask.astype(np.int32)

def preprocess(x, y):
    def f(x, y):
        image = readImage(x)
        mask = readMask(y)
        return image, mask

    image, mask = tf.numpy_function(f, [x, y], [tf.float32, tf.int32])
    mask = tf.one_hot(mask, 3, dtype=tf.int32)
    image.set_shape([H, W, 3])
    mask.set_shape([H, W, 3])
    return image, mask

def tf_dataset(x, y, batch=8):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=5000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(2)
    return dataset

def loadCityscape():
    trainPath = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'datasets\\Cityscape\\train')
    imagesPath = os.path.join(trainPath, 'images')
    maskPath = os.path.join(trainPath, 'masks')
    images = []
    masks = []
    print('Loading images and masks for Cityscape dataset...')
    for image in os.listdir(imagesPath):
        images.append(readImage(os.path.join(imagesPath, image)))
    for mask in os.listdir(maskPath):
        if 'label' in mask:
            masks.append(readMask(os.path.join(maskPath, mask)))
    print('Loaded {} images\n'.format(len(images)))
    return images, masks
images, masks = loadCityscape()
dataset = tf_dataset(images, masks, batch=8)
print(dataset)
That last print(dataset) shows:
<PrefetchDataset shapes: ((None, 256, 256, 3), (None, 256, 256, 3)), types: (tf.float32, tf.int32)>
Why am I obtaining (None, 256, 256, 3) instead of (8, 256, 256, 3)? I also have some doubts about how to iterate over this dataset.
Thanks a lot.
TensorFlow is a graph-based mathematical framework that abstracts away all of those complex vector and matrix operations you face, particularly in machine learning.
What the developers thought is that it would be inconvenient to specify every single time how many input samples you need to pass to your model for training, so they decided to abstract it for you.
You are not interested in whether your model is fed a single sample or thousands, as long as the output matches the input dimension (and any internal operation matches in dimensions too!).
So the None size is a placeholder for a possibly changing shape, which is usually the batch size of the input.
We need a placeholder because (None, 2) is a different shape with respect to just (2,): in the first case we know we will face 2 dimensions.
Even if the None dimension is unknown when you "compile" your model, it will be evaluated only when it is strictly needed, in other words when you run it. This way your model will be as happy to run on a batch of 64 samples as on one of 128.
For the rest, a (non-scalar) Tensor behaves like a normal NumPy array:
tensor1 = tf.constant([0, 1, 2, 3])          # shape (4,)
tensor2 = tf.constant([[0], [1], [2], [3]])  # shape (4, 1)

for x in tensor1:
    print(x)  # 0, 1, 2, 3
for x in tensor2:
    print(x)  # Tensor([0]), Tensor([1]), Tensor([2]), Tensor([3])
The only difference is that it can be allocated into any supported device memory (CPU / Cuda GPU).
Iterating through the dataset is just like slicing it at (usually) constant sizes, where that constant is your batch size, which will fill that empty None dimension.
This line of code is responsible for slicing your dataset into "sub-tensors" ("sub-arrays") composed of its samples:
dataset = dataset.batch(N)
# iterating over it:
for batch in dataset:  # I'm taking N samples here
    ...
Your "runtime" shape will be (N, 256, 256, 3), but if you will try to take an element from the dataset it could still have None in the shape... That's because we can't guarantee, for example, that the dimension of the dataset is exactly divisible by the batch size, so some trailing samples of a variable shape could still be possible. You will hardly get rid off that None dimension, but in some custom methods of your model you could achieve that.
If you are still uncomfortable with tensors, there is the tensor.numpy() method that gives you back a NumPy array, but at the cost of copying it (usually to your CPU). This is not available in every step of the process.
There are many ways to define a dataset in TensorFlow; I suggest reading how they think you should build an input pipeline, because it will make your life easier once you understand how much TensorFlow lifts your code to higher levels of abstraction.
I am feeding many time series of length 100 and 3 features into a 1D Convnet. I have too many of these to use numpy arrays, so I need to use Dataset.from_generator().
The problem is that when I train the model on the dataset, it gives the error:
expected conv1d_input to have 3 dimensions, but got array with shape (100, 3)
The code below demonstrates the problem. The generator produces each element as an expected (100,3) array. Why does the model not recognise the generator output as valid?
Many thanks for any help. Julian
import numpy as np
import tensorflow as tf
def create_timeseries_element():
    # returns a random time series of 100 intervals, each with 3 features,
    # and a random one-hot array of 5 entries
    data = np.random.rand(100, 3)
    label = np.eye(5, dtype='int')[np.random.choice(5)]
    return data, label

def data_generator():
    d, l = create_timeseries_element()
    yield (d, l)
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(128, 9, activation='relu', input_shape=(100, 3)),
    tf.keras.layers.Conv1D(128, 9, activation='relu'),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(256, 5, activation='relu'),
    tf.keras.layers.Conv1D(256, 5, activation='relu'),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(5, activation='softmax')])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
x_train = []
y_train = []
for _ in range(1000):
    d, l = create_timeseries_element()
    x_train.append(d)
    y_train.append(l)
x_train = np.array(x_train)
y_train = np.array(y_train)

# train model with numpy arrays - this works
model.fit(x=x_train, y=y_train)

ds = tf.data.Dataset.from_generator(data_generator, output_types=(tf.float32, tf.int32),
                                    output_shapes=(tf.TensorShape([100, 3]), tf.TensorShape([5])))
# train model with dataset - this fails
model.fit(ds)
The model expects a batch/list of samples. You can provide that by simply batching the dataset when you create it, as follows:
ds = tf.data.Dataset.from_generator(data_generator, output_types=(tf.float32, tf.int32),
                                    output_shapes=(tf.TensorShape([100, 3]), tf.TensorShape([5])))
ds = ds.batch(16)
You can also do it another way, when you prepare the samples. In that case you need to expand the sample dimension so that each sample acts as a batch (you can pass a list of samples too), and you have to make the following modifications to the dataset's output_shapes and to the create_timeseries_element function:
def create_timeseries_element():
    # returns a random time series of 100 intervals, each with 3 features,
    # and a random one-hot array of 5 entries;
    # expand dimensions to create a batch of a single sample
    data = np.expand_dims(np.random.rand(100, 3), axis=0)
    label = np.expand_dims(np.eye(5, dtype='int')[np.random.choice(5)], axis=0)
    return data, label

ds = tf.data.Dataset.from_generator(data_generator, output_types=(tf.float32, tf.int32),
                                    output_shapes=(tf.TensorShape([None, 100, 3]), tf.TensorShape([None, 5])))
The above changes will supply only a single batch (a single sample for the first solution) per epoch of your dataset. You can generate as many batches (samples for the first solution) as you want (e.g. 25) by passing a parameter to the data_generator function when you define your dataset, like this:
def data_generator(count=1):
    for _ in range(count):
        d, l = create_timeseries_element()
        yield (d, l)

ds = tf.data.Dataset.from_generator(data_generator, args=[25], output_types=(tf.float32, tf.int32),
                                    output_shapes=(tf.TensorShape([None, 100, 3]), tf.TensorShape([None, 5])))
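Either variant can then be passed straight to fit; a minimal usage sketch:
model.fit(ds, epochs=5)  # 25 batches per epoch with the generator above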