I'm trying to fit a Keras CNN model, feeding it with data handled by the Datasets API from TensorFlow. However, I keep running into the same exception, despite following the official documentation:
ValueError: No data provided for "conv2d_8_input". Need data for each key in: ['conv2d_8_input']
# conv2d_8 is the first Conv2D layer of my model, see below
I'm using the MNIST dataset from tensorflow-datasets; images are normalized and class labels are converted into one-hot encodings. An excerpt from the code is below.
test_data, train_data = tfds.load("mnist", split=Split.ALL.subsplit([1, 3]))
# [...] Images are normalized using Dataset.map method
# [...] Labels are converted into one-hot encodings as well, using tf.one_hot function
model = keras.Sequential([
keras.layers.Conv2D(
32,
kernel_size=5,
padding="same",
input_shape=(28, 28, 1),
activation="relu",
),
keras.layers.MaxPooling2D(
(2, 2),
padding="same"
),
keras.layers.Conv2D(
64,
kernel_size=5,
padding="same",
activation="relu"
),
keras.layers.MaxPooling2D(
(2, 2),
padding="same"
),
keras.layers.Flatten(),
keras.layers.Dense(
512,
activation="relu"
),
keras.layers.Dropout(rate=0.4),
keras.layers.Dense(10, activation="softmax")
])
model.compile(
optimizer=tf.train.AdamOptimizer(0.01),
loss="categorical_crossentropy",
metrics=["accuracy"]
)
train_data = train_data.batch(32).repeat()
test_data = test_data.batch(32).repeat()
model.fit(
train_data,
epochs=10,
steps_per_epoch=30,
validation_data=test_data,
validation_steps=3
) # The exception occurs at this step
I don't understand why it doesn't work. I tried feeding the fit method with one-shot iterators instead of the datasets, but I get the same result. I'm not used to Keras and TensorFlow (I usually work with PyTorch), so I may be missing something obvious.
For those coming to this page after following TF 2.0 Beta tutorial on Loading images (https://www.tensorflow.org/beta/tutorials/load_data/images):
I was able to avoid the error by returning a tuple from the preprocess_image function:
def preprocess_image(image):
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [192, 192])
    image /= 255.0  # normalize to [0,1] range
    return (image, image)
I am not using the labels in my use case, so you may need other changes to follow the tutorial.
Ok, I got it. I enabled eager execution to see if Keras would yield a more precise exception, and I got this:
ValueError: Output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: {'image': <tf.Tensor: id=1012, shape=(32, 28, 28, 1), dtype=float64, numpy=array([...])>, 'label': <tf.Tensor: id=1013, shape=(32, 10), dtype=uint8, numpy=array([...]), dtype=uint8)>}
Indeed, the components of my datasets (images and their associated labels) have names ("image" and "label"), because this is how tensorflow_datasets loads them. As a result, an iterator on the datasets yields a dictionary with two values: "image" and "label".
However, Keras expects a tuple of two values (inputs, targets) (or three values (inputs, targets, sample_weights)), and it doesn't accept the dictionary yielded by the Dataset iterator (hence the error I got).
I added the following code before model.fit:
train_data = train_data.map(lambda x: tuple(x.values()))
test_data = test_data.map(lambda x: tuple(x.values()))
And it works.
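Note that tuple(x.values()) relies on the dictionary's entries happening to be ordered as (image, label). Below is a minimal sketch of a more explicit mapping, assuming the feature keys are "image" and "label" as in the error message above. The name to_supervised is hypothetical, and the plain dictionary only stands in for one dataset element.

```python
def to_supervised(example, input_key="image", target_key="label"):
    """Turn a feature dictionary into the (inputs, targets) tuple Keras expects.

    The entries are picked by key, so the result does not depend on the
    order in which the dictionary happens to store them.
    """
    return example[input_key], example[target_key]


# Stand-in for one dataset element; the key order is deliberately reversed.
example = {"label": 7, "image": [[0.0, 0.5], [1.0, 0.25]]}
inputs, targets = to_supervised(example)
print(targets)  # 7
```

The same function should be usable in place of the lambda, e.g. train_data.map(to_supervised), since Dataset.map passes a dictionary element to the function as a single argument.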
You can also load the data from tensorflow-datasets directly as (input, label) tuples by passing as_supervised=True:
test_data, train_data = tfds.load("mnist", split=tfds.Split.ALL.subsplit([1, 3]), as_supervised=True)
I'm trying to train a CNN model for a speech emotion recognition task using spectrograms as input. I've reshaped the spectrograms to have the shape (num_frequency_bins, num_time_frames, 1) which I thought would be sufficient, but upon trying to fit the model to the dataset, which is stored in a Tensorflow dataset, I got the following error:
Input 0 of layer "sequential_12" is incompatible with the layer: expected shape=(None, 257, 1001, 1), found shape=(257, 1001, 1)
I tried reshaping the spectrograms to have the shape (1, num_frequency_bins, num_time_frames, 1), but that produced an error when creating the Sequential model:
ValueError: Exception encountered when calling layer "resizing_14" (type Resizing).
'images' must have either 3 or 4 dimensions.
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 257, 1001, 1), dtype=float32)
So I passed in the shape as (num_frequency_bins, num_time_frames, 1) when creating the model, and then fitted the model to the training data with the 4-dimensional data, but that raised this error:
InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: strided_slice/
So I'm kind of at a loss now. I genuinely have no idea what to do and how I can go about fixing this. I've read around but haven't come across anything useful. Would really appreciate any help.
Here's some of the code for context.
dataset = [[specgram_files[i], labels[i]] for i in range(len(specgram_files))]
specgram_files_and_labels_dataset = tf.data.Dataset.from_tensor_slices((specgram_files, labels))
def read_npy_file(data):
    # 'data' stores the file name of the numpy binary file storing the features of a particular sound file
    # item() returns numpy array of size 1 as a suitable python scalar.
    # data.item() then returns the bytes string stored in the numpy array.
    # decode() is then called on the bytes string to decode it from a bytes string to a regular string
    # so that it can be passed as a parameter in np.load()
    data = np.load(data.item().decode())
    # Shape of data is now (1, rows, columns)
    # Needs to be reshaped to (rows, columns, 1):
    data = np.reshape(data, (data.shape[0], data.shape[1], 1))
    return data.astype(np.float32)
specgram_dataset = specgram_files_and_labels_dataset.map(
lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
num_parallel_calls=tf.data.AUTOTUNE)
num_files = len(train_df)
num_train = int(0.8 * num_files)
num_val = int(0.1 * num_files)
num_test = int(0.1 * num_files)
specgram_dataset.shuffle(buffer_size=1000)
specgram_train_ds = specgram_dataset.take(num_train)
specgram_test_ds = specgram_dataset.skip(num_train)
specgram_val_ds = specgram_test_ds.take(num_val)
specgram_test_ds = specgram_test_ds.skip(num_val)
batch_size = 32
specgram_train_ds.batch(batch_size)
specgram_val_ds.batch(batch_size)
specgram_train_ds = specgram_train_ds.cache().prefetch(tf.data.AUTOTUNE)
specgram_val_ds = specgram_val_ds.cache().prefetch(tf.data.AUTOTUNE)
for specgram, label in specgram_train_ds.take(1):
    input_shape = specgram.shape
num_emotions = len(train_df["emotion"].unique())
model = models.Sequential([
layers.Input(shape=input_shape),
# downsampling the input.
layers.Resizing(32, 128),
layers.Conv2D(32, 3, activation="relu"),
layers.MaxPooling2D(),
layers.Conv2D(64, 3, activation="relu"),
layers.MaxPooling2D(),
layers.Flatten(),
layers.Dense(128, activation="softmax"),
layers.Dense(num_emotions)
])
model.compile(
optimizer=tf.keras.optimizers.Adam(0.01),
loss=tf.keras.losses.SparseCategoricalCrossentropy(),
metrics=["accuracy"]
)
EPOCHS = 10
model.fit(
specgram_train_ds,
validation_data=specgram_val_ds,
epochs=EPOCHS,
callbacks=tf.keras.callbacks.EarlyStopping(verbose=1, patience=2)
)
Assuming you know your input_shape, I would recommend first hard-coding it into your model:
model = models.Sequential([
    layers.Input(shape=(257, 1001, 1)),
    # downsampling the input.
    layers.Resizing(32, 128),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="softmax"),
    layers.Dense(num_emotions)
])
Also, tf.data.Dataset.batch does not modify the dataset in place; it returns a new Dataset, so you need to assign the result to a variable (the same is true of shuffle):
batch_size = 32
specgram_train_ds = specgram_train_ds.batch(batch_size)
specgram_val_ds = specgram_val_ds.batch(batch_size)
Afterwards, make sure that specgram_train_ds really does have the correct shape:
specgrams, _ = next(iter(specgram_train_ds.take(1)))
assert specgrams.shape == (32, 257, 1001, 1)
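To see why an unbatched dataset triggers the shape error, here is a small NumPy sketch (not tf.data itself) of what batching does to the shapes. The spectrogram size is taken from the error message; the batch size of 4 and the zero-filled arrays are placeholders chosen only to keep the example light.

```python
import numpy as np

# One spectrogram as yielded by the unbatched dataset:
# (frequency_bins, time_frames, channels)
sample_shape = (257, 1001, 1)
batch_size = 4  # placeholder value, smaller than the 32 used above

samples = [np.zeros(sample_shape, dtype=np.float32) for _ in range(batch_size)]

# Batching stacks samples along a new leading axis. That axis is the
# `None` dimension in the expected shape (None, 257, 1001, 1).
batch = np.stack(samples, axis=0)

print(samples[0].shape)  # (257, 1001, 1)
print(batch.shape)       # (4, 257, 1001, 1)
```

The model's Input layer describes a single sample, while fit consumes whole batches, which is why the leading axis has to come from the dataset rather than from the input shape.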
Some days ago I started with ML because I wanted to build an hCaptcha solver. I have everything ready; I just need to train a model that classifies the captcha images so I can send a request with the correct answer and get the captcha token.
I've looked into some tutorials on how to train my own model with several classes. My setup is the following:
One trainer folder, one validation folder and one testing folder. The trainer and validation folders contain subfolders named airplane, truck, boat, train, ..., each holding approx. 20 images. The testing folder contains some random images related to the classes I have.
I have trained the model and it seems I'm getting an accuracy of 1. Then I take some of the random testing images and try to predict them using the saved model. It does its job and predicts them, returning an array of numbers. The thing is, I don't know how to decode those predictions, nor how to see the list of classes with their corresponding integers before predicting.
I'm super new to this, so I'm sure anything will help :)
My code below:
import os
from keras.preprocessing import image
from keras.models import Sequential
from keras import layers
from keras.models import load_model
import numpy as np
trainer_path = "./img/trainer"
validator_path = "./img/validator"
testing_path = "./img/tester"
WIDTH = 128
HEIGHT = 128
BATCH = 30
EPOCHS = 15
train_dataset = image.image_dataset_from_directory(
trainer_path,
label_mode="int",
batch_size=BATCH,
image_size=(WIDTH, HEIGHT)
)
validator_dataset = image.image_dataset_from_directory(
validator_path,
label_mode="int",
batch_size=BATCH,
image_size=(WIDTH, HEIGHT)
)
model = Sequential([
layers.Input((WIDTH, HEIGHT, 3)),
layers.Conv2D(16, 3, padding="same"),
layers.Conv2D(32, 3, padding="same"),
layers.MaxPooling2D(),
layers.Flatten(),
layers.Dense(10)
])
model.compile(
optimizer="adam",
loss=[
"sparse_categorical_crossentropy"
],
metrics=["accuracy"]
)
model_fit = model.fit(
train_dataset,
epochs=EPOCHS,
validation_data=validator_dataset,
verbose=2
)
#loading the saved model
model = load_model("./model")
for i in os.listdir(testing_path):
    img = image.load_img(testing_path + "/" + i, target_size=(WIDTH, HEIGHT, 3))
    img_array = image.img_to_array(img)
    img_batch = np.expand_dims(img_array, axis=0)
    prediction = model.predict(img_batch)
    print(prediction)
    print()
Output example:
[[ 875.5614 3123.8257 1521.7046 90.056526 335.5274
-785.3671 1075.9199 1105.3068 -14.917503 -3745.6494 ]]
You have to apply an activation function on the last Dense layer. If you want to classify the image, it should be softmax (you will get probabilities for all classes). Here is the link:
https://keras.io/api/layers/activations/
As for class names, the class indices follow the alphanumerical order of the subfolder names. You can also pass the class_names argument to control the order; here is the link to the arguments of this function:
https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory
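As a rough illustration of the decoding step, here is a NumPy sketch that turns raw outputs into a class name and a probability. The folder names and logits are made up, and decode_predictions is a hypothetical helper, not a Keras function. In a real run the list of names should be available as train_dataset.class_names on the dataset returned by image_dataset_from_directory.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def decode_predictions(logits, class_names):
    """Return (class_name, probability) for each row of raw model outputs."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    best = np.argmax(probs, axis=-1)
    return [(class_names[i], float(probs[row, i])) for row, i in enumerate(best)]


# Hypothetical folder names; indices follow their alphanumeric order.
class_names = sorted(["truck", "airplane", "train", "boat"])
# class_names is now ['airplane', 'boat', 'train', 'truck']

logits = np.array([[2.0, -1.0, 0.5, 4.0]])  # made-up raw outputs for one image
print(decode_predictions(logits, class_names))  # the top class here is 'truck'
```

If the last Dense layer already uses softmax, the model's output can be passed to np.argmax directly and the softmax helper is not needed.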
I want to train an autoencoder on mp3 songs. Given the size of the dataset, it would be better if only part of the dataset is in memory at any given time.
What I tried is using tfio and tf.data.Dataset, but that gives me an error when fitting the model:
ValueError: Cannot iterate over a shape with unknown rank.
The code was as follows
segment_length = 1024
filenames= tf.data.Dataset.list_files('data/*')
def decode_mp3(mp3_path):
    mp3_path = mp3_path.numpy().decode("utf-8")
    audio = tfio.audio.AudioIOTensor(mp3_path)
    audio_tensor = tf.cast(audio[:], tf.float32)
    overflow = len(audio_tensor) % segment_length
    audio_tensor = audio_tensor[:-overflow, 0]
    audio_tensor = tf.reshape(audio_tensor,(len(audio_tensor), 1))
    audio_tensor = audio_tensor[:, 0]
    return audio_tensor
song_dataset = filenames.map(lambda path:
    tf.py_function(func=decode_mp3, inp=[path], Tout=tf.float32))
segment_dataset = song_dataset.flat_map(lambda song:
    tf.data.Dataset.from_tensor_slices(song)).batch(segment_length)
dataset = segment_dataset.map(lambda x: (x, x)) # add labels (identical to inputs here)
With a model like so
encoder = keras.models.Sequential([
    keras.layers.Input((segment_length, 1)),
    keras.layers.Conv1D(128, 3, strides=2, padding="same"),
    ...
])
But as I said, calling fit throws the error above, even though the shape is exactly what I would hope for:
for x, y in dataset.take(1):
    print(x.shape, y.shape)
> (1024, 1) (1024, 1)
Any help on this would be appreciated. I might be misunderstanding something with input shapes and datasets.
So I finally found part of the answer. The Input layer seems to be meant for models with the functional API (?) and I removed it. Now the model is like this
encoder = keras.models.Sequential([
keras.layers.Conv1D(128, 3, strides=2, padding="same", input_shape=(segment_length, 1)),
...
where the Input layer is replaced with an input_shape parameter in the first Conv1D layer. Also I batched the dataset with
ds = dataset.batch(2)
and that was important too. Any further clarification would still be appreciated. Nonetheless, I hope this can help people with the same problem.
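To make the role of the batch axis concrete, here is a NumPy sketch (not the tf.data pipeline above) of cutting a signal into fixed-length segments. The function name split_into_segments and the 2500-sample placeholder signal are assumptions for illustration. Computing the usable length up front also sidesteps the case where the remainder is 0, in which a slice like [:-0] would return an empty array.

```python
import numpy as np


def split_into_segments(samples, segment_length):
    """Cut a 1-D signal into (num_segments, segment_length, 1), dropping the remainder."""
    usable = (len(samples) // segment_length) * segment_length
    return samples[:usable].reshape(-1, segment_length, 1)


segment_length = 1024
signal = np.arange(2500, dtype=np.float32)  # placeholder for decoded audio samples

segments = split_into_segments(signal, segment_length)
print(segments.shape)     # (2, 1024, 1): two segments, i.e. a batch of 2
print(segments[0].shape)  # (1024, 1): a single segment has no batch axis
```

A Conv1D model with input shape (segment_length, 1) consumes arrays shaped like the first print, which is consistent with the extra dataset.batch(2) call being needed.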
I am using a wrapper from sklearn to find the best hyperparameters for my Keras model. Briefly, the model is a convolutional autoencoder that takes data of shape (x, x, x), whereas the Keras wrapper seems to expect targets of shape (x, x). Since it is an autoencoder, the targets have the same 3-D shape as the inputs, and I think this is why I get the following error: ValueError: Invalid shape for y: (3744, 288, 1). How can I resolve this?
Full code:
"""
# Load libraries
"""
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from matplotlib import pyplot as plt
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
# Set random seed
np.random.seed(0)
"""
## Load the data
"""
master_url_root = "https://raw.githubusercontent.com/numenta/NAB/master/data/"
df_small_noise_url_suffix = "artificialNoAnomaly/art_daily_small_noise.csv"
df_small_noise_url = master_url_root + df_small_noise_url_suffix
df_small_noise = pd.read_csv(
df_small_noise_url, parse_dates=True, index_col="timestamp"
)
df_daily_jumpsup_url_suffix = "artificialWithAnomaly/art_daily_jumpsup.csv"
df_daily_jumpsup_url = master_url_root + df_daily_jumpsup_url_suffix
df_daily_jumpsup = pd.read_csv(
df_daily_jumpsup_url, parse_dates=True, index_col="timestamp"
)
"""
## Prepare training data
"""
# Normalize and save the mean and std we get,
# for normalizing test data.
training_mean = df_small_noise.mean()
training_std = df_small_noise.std()
df_training_value = (df_small_noise - training_mean) / training_std
print("Number of training samples:", len(df_training_value))
"""
### Create sequences
Create sequences combining `TIME_STEPS` contiguous data values from the
training data.
"""
TIME_STEPS = 288
# Generated training sequences for use in the model.
def create_sequences(values, time_steps=TIME_STEPS):
    output = []
    for i in range(len(values) - time_steps):
        output.append(values[i : (i + time_steps)])
    return np.stack(output)
x_train = create_sequences(df_training_value.values)
print("Training input shape: ", x_train.shape)
"""
## Build a model
We will build a convolutional reconstruction autoencoder model. The model will
take input of shape `(batch_size, sequence_length, num_features)` and return
output of the same shape. In this case, `sequence_length` is 288 and
`num_features` is 1.
"""
# Create function returning a compiled network
def create_network(optimizer='Adam'):
    model = keras.Sequential(
        [
            layers.Input(shape=(x_train.shape[1], x_train.shape[2])),
            layers.Conv1D(
                filters=32, kernel_size=7, padding="same", strides=2, activation="relu"
            ),
            layers.Dropout(rate=0.2),
            layers.Conv1D(
                filters=16, kernel_size=7, padding="same", strides=2, activation="relu"
            ),
            layers.Conv1DTranspose(
                filters=16, kernel_size=7, padding="same", strides=2, activation="relu"
            ),
            layers.Dropout(rate=0.2),
            layers.Conv1DTranspose(
                filters=32, kernel_size=7, padding="same", strides=2, activation="relu"
            ),
            layers.Conv1DTranspose(filters=1, kernel_size=7, padding="same"),
        ]
    )
    # Use the optimizer passed in by the grid search rather than a hard-coded one
    model.compile(optimizer=optimizer, loss="mse", metrics=['mae'])
    return model
# Hyper-parameter tuning
# Wrap Keras model so it can be used by scikit-learn
CAE = KerasClassifier(build_fn=create_network, verbose=0)
# Create hyperparameter space
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
# Create hyperparameter options
hyperparameters = dict(optimizer=optimizers, epochs=epochs, batch_size=batches)
# Create grid search
grid = GridSearchCV(estimator=CAE, cv=3, param_grid=hyperparameters)
# Fit grid search (we use train data as test data here since this is a reconstruction model)
grid_result = grid.fit(x_train, x_train, validation_split=0.1)
# View hyperparameters of best neural network
print(grid_result.best_params_)
This is a limitation of KerasClassifier.fit(). If you look at its source code, you'll see that it raises an error if y has more than 2 dimensions. It is probably not aimed at autoencoder optimization :)
Your choices are:
1) subclass KerasClassifier and override fit() to remove this limitation
2) use another optimization engine (my preference would be optuna)
3) squeeze out the extra dimension at the end of the model and reduce the dimensions of y_train accordingly.
For 3), use these lines:
layers.Reshape((288,)) # add in the end of model constructor
y_train = x_train.reshape(x_train.shape[:-1]) # to match the above change
grid_result = grid.fit(x_train, y_train, validation_split=0.1) # feed y_train
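The effect of the reshape in option 3 on the target array can be checked in isolation. This is only a NumPy sketch: the zero-filled array is a placeholder with the shape reported in the error message, not the real training data.

```python
import numpy as np

# Placeholder with the shape from the error message: (samples, time_steps, features)
x_train = np.zeros((3744, 288, 1), dtype=np.float32)

# Drop the trailing feature axis so that y is 2-D, matching a model
# whose last layer is Reshape((288,)).
y_train = x_train.reshape(x_train.shape[:-1])

print(x_train.shape)  # (3744, 288, 1)
print(y_train.shape)  # (3744, 288)
```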
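There is one more solution, and it is the most elegant:
replace keras.wrappers.scikit_learn.KerasClassifier with keras.wrappers.scikit_learn.KerasRegressor. The latter does not check the dimensions of y, and it is also the more natural fit, since reconstruction with an MSE loss is a regression task.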
I want to try making softmax faster by applying it only to the top k values in the vector.
For that, I tried implementing a custom TensorFlow function to use in a model:
def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    return_value = tf.sparse_to_dense(indices, logits_shape, softmax)
    return_value = tf.convert_to_tensor(return_value, dtype=logits.dtype, name=logits.name)
    return return_value
I'm using Fashion-MNIST to test whether this approach works:
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# normalize the data
train_images = train_images / 255.0
test_images = test_images / 255.0
# split the training data into train and validate arrays (will be used later)
train_images, train_images_validate, train_labels, train_labels_validate = train_test_split(
train_images, train_labels, test_size=0.2, random_state=133742,
)
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation=tf.nn.relu),
keras.layers.Dense(10, activation=softmax_top_k)
])
model.compile(
loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
model.fit(
train_images, train_labels,
epochs=10,
validation_data=(train_images_validate, train_labels_validate),
)
model_without_cnn.compile(
loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
model_without_cnn.fit(
train_images, train_labels,
epochs=10,
validation_data=(train_images_validate, train_labels_validate),
)
But during execution an error occurs:
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable).
I've found this: (How to make a custom activation function), which explains how to implement a completely custom activation function in TensorFlow. But since my function uses and extends softmax, I thought that the gradient should still be the same.
This is my first week of coding with python and tensorflow, therefore I don't have a good overview over all the internal implementations, yet.
Is there a simpler way to extend softmax into a new function, rather than implementing it from scratch?
Thanks in advance!
Instead of using sparse tensors to make the tensor with "all zeros except softmaxed top-K values", use tf.scatter_nd:
import tensorflow as tf
def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    # Assuming that logits is 2D
    rows = tf.tile(tf.expand_dims(tf.range(logits_shape[0]), 1), [1, k])
    scatter_idx = tf.stack([rows, indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)
EDIT: Here is a slightly more complex version for tensors with an arbitrary number of dimensions. The code still requires that the number of dimensions is known at graph construction time, though.
import tensorflow as tf
def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    # Make nd indices
    logits_shape = tf.shape(logits)
    dims = [tf.range(logits_shape[i]) for i in range(logits_shape.shape.num_elements() - 1)]
    grid = tf.meshgrid(*dims, tf.range(k), indexing='ij')
    scatter_idx = tf.stack(grid[:-1] + [indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)
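For checking the output of either TensorFlow version on small inputs, a NumPy reference can be handy. The following is only a sketch of the same idea (softmax over the k largest entries, zeros elsewhere) and is not part of the TensorFlow code above; softmax_top_k_reference is a hypothetical name, and ties between equal logits are not handled specially.

```python
import numpy as np


def softmax_top_k_reference(logits, k):
    """Softmax over the k largest entries along the last axis; other entries are 0."""
    logits = np.asarray(logits, dtype=np.float64)
    # Indices of the k largest values along the last axis (in no particular order)
    top_idx = np.argpartition(logits, -k, axis=-1)[..., -k:]
    top_vals = np.take_along_axis(logits, top_idx, axis=-1)
    # Numerically stable softmax restricted to the selected values
    exp = np.exp(top_vals - top_vals.max(axis=-1, keepdims=True))
    probs = exp / exp.sum(axis=-1, keepdims=True)
    # Scatter the probabilities back to their original positions
    out = np.zeros_like(logits)
    np.put_along_axis(out, top_idx, probs, axis=-1)
    return out


logits = np.array([[1.0, 3.0, 2.0, 0.0],
                   [0.5, 0.1, 0.9, 0.7]])
result = softmax_top_k_reference(logits, k=2)
print(result)
```

Each row of the result should sum to 1 and contain exactly k non-zero entries, which gives a simple property to compare against the scatter_nd output.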