I'm new to TensorFlow and I'm having trouble feeding my custom data to a Keras model.
I followed this guide: Load images, to convert my .jpg files to tf.data.
Now I have my data converted to (image_batch, label_batch). The image_batch is an EagerTensor with shape (32,224,224,3) and the label_batch is an EagerTensor with shape (32,2).
Then I found this guide: Custom training: walkthrough, but the data in that guide is converted to an EagerTensor with shape (32,4).
I got a warning when executing this code:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(3,)),  # input shape required
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(3)
])
predictions = model(image_batch)
WARNING:tensorflow:Model was constructed with shape (None, 3) for input Tensor("dense_input:0", shape=(None, 3), dtype=float32), but it was called on an input with incompatible shape (32, 224, 224, 3).
How should I adjust my model or what should I do with my data?
EDIT:
The model now works, but with one additional problem.
When I run the following code:
print("Prediction: {}".format(tf.argmax(predictions, axis=1)))
print(" Labels: {}".format(labels_batch))
it prints:
Prediction: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Labels: [[ True False]
[False True]
[ True False]
[False True]
[ True False]...(omitted)]
But I expected it to print something like:
Prediction: [0 1 0 1 1 1 0 1 0 1 1 0 0 0 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 0 1 0]
Labels: [2 0 2 0 0 0 1 0 2 0 0 1 1 2 2 2 1 0 1 0 1 2 0 1 1 1 1 0 2 2 0 2]
with Labels as a one-dimensional array of integers.
Is it normal that the predictions are all 1? What should I do?
Your input is 32 images of shape (224, 224, 3), not (3,), so your input shape needs to be (224,224,3).
Also note that your output shape would then be (224,224,3) per image as well, which won't match your labels. You need to flatten the data at some point or do something similar.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(224,224,3)),  # input shape required
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2)
])
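To see why the Flatten layer matters here, recall that a Dense layer only transforms the last axis of its input. The NumPy sketch below mimics that with a plain matrix product. The random weights, the missing bias and activation, and the small stand-in image size are all assumptions for illustration, so this is not the Keras implementation itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in batch: 2 "images" of 4x4 pixels with 3 channels
# (the real batch in the question is (32, 224, 224, 3)).
images = rng.random((2, 4, 4, 3))

# A dense layer with 10 units acts on the last axis only.
weights = rng.random((3, 10))
hidden = images @ weights
print(hidden.shape)  # (2, 4, 4, 10): still one vector per pixel

# Flattening merges everything except the batch axis.
flat = hidden.reshape(hidden.shape[0], -1)
print(flat.shape)  # (2, 160)

# A final 2-unit layer then yields one row of scores per image.
out_weights = rng.random((flat.shape[1], 2))
scores = flat @ out_weights
print(scores.shape)  # (2, 2)
```

Without the flattening step, the last product would keep the two spatial axes and the result could not be compared with labels of shape (batch, 2).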
The input to a Dense layer should have shape (None, n), where None is the batch size. In your case, if you'd like to use a Dense layer, you should first use a Flatten layer, which unrolls your images to the shape (32, 224 * 224 * 3). The code should be:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(3)
])
For more details please see https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten
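As a rough check of the shapes involved, the NumPy sketch below mimics what Flatten does to a batch shaped like the one in the question. The zero-filled array is only a placeholder for the real image_batch.

```python
import numpy as np

# Placeholder with the same shape as image_batch in the question.
image_batch = np.zeros((32, 224, 224, 3), dtype=np.float32)

# Keep the batch axis and merge the remaining axes into one.
flattened = image_batch.reshape(image_batch.shape[0], -1)

print(flattened.shape)  # (32, 150528)
print(224 * 224 * 3)    # 150528
```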
Related
I'm attempting to train my model using the train_on_batch function, as the data is too large to load all at once. The shape of my training data is as follows: X.shape = (388, 108, 36, 36, 36), Y.shape = (388, 108). To make the data clear, there are 388 x and 388 y training files. Each training file contains 108 3D arrays of shape (36,36,36). For every 3D array, there is a corresponding binary label. I'm trying to iterate through these 388 pairs of files one by one and pass each pair to train_on_batch. Below is the CNN model:
model = Sequential()
model.add(Conv3D(filters=16, kernel_size=(3,3,3), padding='valid', input_shape=(108, 36, 36, 36)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2,2,2)))
model.add(Conv3D(32, kernel_size=(3,3,3)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2,2,2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(32))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
This was my first for loop for trying to input the data:
for i in range(len(X_train)):
    model.train_on_batch(X_train[i], Y_train[i], sample_weight=None)
Which resulted in the following error:
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 108, 36, 36, 36), found shape=(108, 36, 36, 36)
To work around this I reshaped my data, after which my input was accepted. I made sure the y data had the matching shape, but then I hit an error I cannot figure out myself. Here is the reshape that results in ValueError: Shapes (1, 108) and (1, 2) are incompatible:
for i in range(len(X_train)):
    new_X_train = X_train[i].reshape(1, 108, 36, 36, 36)
    new_Y_train = Y_train[i].reshape(1, 108)
When I apply .astype('float32').reshape((-1,1)) to the Y data, I get ValueError: Data cardinality is ambiguous. This makes sense to me, because the x and y data then no longer have matching shapes.
The output should be 0 or 1, as these are ct_scan slices, so it's identifying the array as either "nodule" or "non-nodule". For reference, here is what Y_train[0] looks like:
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
I've been trying to wrap my head around this for a while. There are many questions that help with the individual errors, but when I solve "Data cardinality is ambiguous", I run into "shapes are incompatible", and vice versa. I might be missing something; I tried what several threads suggest for these individual problems, but I can't figure it out. Is it just the data format that my training files are in?
As it turns out, I was misinterpreting a comment I had read while following a guide on how to setup this model. By writing (108,36,36,36) instead of (36,36,36,1) I was telling the model the incorrect input shape. Once that was fixed, it worked.
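A minimal NumPy sketch of the shapes this fix implies, assuming each training file is treated as a batch of 108 single-channel volumes and that the binary labels are one-hot encoded to match the two-unit softmax output. The zero-filled array, the made-up label split, and the names x_file, y_file, x_batch and y_batch are placeholders, not the original data or code.

```python
import numpy as np

# Placeholder for one training file: 108 volumes of 36x36x36 voxels.
x_file = np.zeros((108, 36, 36, 36), dtype=np.float32)
# Placeholder labels in the same 0/1 format as Y_train[0].
y_file = np.array([1] * 48 + [0] * 60)

# Add a trailing channel axis so each sample has shape (36, 36, 36, 1).
x_batch = x_file[..., np.newaxis]

# One-hot encode the labels to match Dense(2) + softmax.
y_batch = np.eye(2, dtype=np.float32)[y_file]

print(x_batch.shape)  # (108, 36, 36, 36, 1)
print(y_batch.shape)  # (108, 2)
```

With input_shape=(36, 36, 36, 1), a call such as model.train_on_batch(x_batch, y_batch) should then see 108 samples and 108 matching label rows per file.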
I'm working on a classification problem. The data I use is from the Aras dataset. Lines of the data look like the following:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
The first 19 columns represent sensor data (binary). The last two columns represent the activities of the two people who lived in the household where the data was collected.
I have divided the dataset into several pieces because it is quite large: 30 days with one data point every second.
What I want to do with my model: I want to train it so it can predict what persons A and B are doing at the moment.
So here is my code (X data: columns 1-19; Y data: columns 20-21):
import keras
from keras import losses
from keras import regularizers
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
from keras.utils import to_categorical
import matplotlib.pyplot as plt
from tensorflow.keras import optimizers
batch_size =512
no_epochs = 5
verbosity = 1
x_train=np.loadtxt('x_train.txt')
x_val=np.loadtxt('x_val.txt')
x_test=np.loadtxt('x_test.txt')
y_train=np.loadtxt('y_train.txt')
y_val=np.loadtxt('y_val.txt')
y_test=np.loadtxt('y_test.txt')
y_train_onehot=keras.utils.to_categorical(y_train)
y_val_onehot=keras.utils.to_categorical(y_val)
y_test_onehot=keras.utils.to_categorical(y_test)
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=[19,]))
model.add(Dense(128, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(
                  learning_rate=0.000001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False,
                  name='Adam'),
              metrics=['accuracy'])
model.summary()
history=model.fit(x_train, y_train_onehot, batch_size, epochs=no_epochs,verbose=verbosity, shuffle=True,validation_data=(x_val, y_val_onehot))
Error: ValueError: Shapes (None, 2, 28) and (None, 2) are incompatible
When I do not convert the labels to the one-hot format it runs, but the result is not useful (I guess). The problem is that I get this ValueError, and I know it has something to do with the fact that each label vector contains two one-hot vectors, but I have no idea how to solve it.
--> I tried to combine both one-hot vectors into one, but then every line has 729 columns (27*27, one for each label combination), and the label data gets too big for Python to run the script.
Windows 10
Keras 2.4.3
Tensorflow 2.3.1
Python 3.7.9
I'm new to this whole topic, so please don't be mad at me if my question is stupid.
Your model requires two outputs, one per person. That is not possible with the Sequential API, so create a new model with the Functional API.
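Before building a two-output model, the labels have to be split into one target per person. The NumPy sketch below reproduces the (N, 2, 28) shape from the error message and then separates it into two (N, 28) targets, one for each output. The tiny label array is made up, the class count of 28 is an assumption taken from the error message, and np.eye stands in for to_categorical.

```python
import numpy as np

num_classes = 28  # assumed from the (None, 2, 28) shape in the error

# Tiny stand-in for y_train: columns are person A and person B.
y = np.array([[12, 17],
              [12, 17],
              [3, 27]])

# One-hot encoding the 2-column array in one go gives a 3D target,
# which a single Dense(2) output cannot match.
combined = np.eye(num_classes)[y]
print(combined.shape)  # (3, 2, 28)

# Encode each person separately: one 2D target per output.
y_a = np.eye(num_classes)[y[:, 0]]
y_b = np.eye(num_classes)[y[:, 1]]
print(y_a.shape, y_b.shape)  # (3, 28) (3, 28)
```

In a Functional API model, each target would then pair with its own softmax output of 28 units, for example by passing [y_a, y_b] as the targets to model.fit.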
This is my first attempt at using an LSTM layer and I can't make it work. I have checked the GitHub bug tracker and SO topics, but none of the available solutions solved my issue.
Whenever I change the dense layer dimension or the data shape, I receive a similar error.
Traceback (most recent call last):
File "F:/Programowanie/GitHub Repositories/MYOPM/ui_main.py", line 640, in train_ml_algorithm
self.ml.keras_LSTM_train()
File "F:\Programowanie\GitHub Repositories\MYOPM\machine_learning.py", line 648, in keras_LSTM_train
verbose=2)
File "F:\Programowanie\GitHub Repositories\MYOPM\venv\lib\site-packages\keras\engine\training.py", line 1154, in fit
batch_size=batch_size)
File "F:\Programowanie\GitHub Repositories\MYOPM\venv\lib\site-packages\keras\engine\training.py", line 621, in _standardize_user_data
exception_prefix='target')
File "F:\Programowanie\GitHub Repositories\MYOPM\venv\lib\site-packages\keras\engine\training_utils.py", line 135, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (20, 84, 1)
I have 3D arrays that contain only binary data:
train - shape (20, 84, 147)
[[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
...
[[1 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
...
[0 0 0 ... 1 0 0]
[0 0 0 ... 1 0 0]
[0 0 0 ... 1 0 0]]
label - shape (20, 84, 1)
[[[0]
[0]
[0]
...
[1]
[1]
[1]]
...
[[1]
[1]
[1]
...
[1]
[1]
[1]]
Code:
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers.core import Dense, Dropout, Activation
model = Sequential()
model.add(LSTM(32, input_shape=(84, 147), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train,
label,
epochs=100,
batch_size=64,
verbose=2)
This question is a follow-up to a previous question I've asked.
I've trained an LSTM model to predict a binary class (1 or 0) for batches of 100 samples with 3 features each, i.e: the shape of the data is (m, 100, 3), where m is the number of batches.
Data:
[
[[1,2,3],[1,2,3]... 100 samples],
[[1,2,3],[1,2,3]... 100 samples],
... available batches in the training data
]
Target:
[
[1]
[0]
...
]
Model code:
def build_model(num_samples, num_features, is_training):
    model = Sequential()
    opt = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)
    batch_size = None if is_training else 1
    stateful = False if is_training else True
    first_lstm = LSTM(32, batch_input_shape=(batch_size, num_samples, num_features), return_sequences=True,
                      activation='tanh', stateful=stateful)
    model.add(first_lstm)
    model.add(LeakyReLU())
    model.add(Dropout(0.2))
    model.add(LSTM(16, return_sequences=True, activation='tanh', stateful=stateful))
    model.add(Dropout(0.2))
    model.add(LeakyReLU())
    model.add(LSTM(8, return_sequences=False, activation='tanh', stateful=stateful))
    model.add(LeakyReLU())
    model.add(Dense(1, activation='sigmoid'))
    if is_training:
        model.compile(loss='binary_crossentropy', optimizer=opt,
                      metrics=['accuracy', keras_metrics.precision(), keras_metrics.recall(), f1])
    return model
For the training stage, the model is NOT stateful. When predicting I'm using a stateful model, iterating over the data and outputting a probability for each sample:
for index, row in data.iterrows():
    if index % 100 == 0:
        predicting_model.reset_states()
    vals = np.array([[row[['a', 'b', 'c']].values]])
    prob = predicting_model.predict_on_batch(vals)
When looking at the probability at the end of a batch, it is exactly the value I get when predicting with the entire batch (not one by one). However, I expected the probability to keep moving in the right direction as new samples arrive. What actually happens is that the probability output can spike to the wrong class on an arbitrary sample (see below).
Two samples of 100 sample batches over the time of prediction (label = 1):
and Label = 0:
Is there a way to achieve what I want (avoid extreme spikes while predicting probability), or is that a given fact?
Any explanation, advice would be appreciated.
Update
Thanks to @today's advice, I've tried training the network with the hidden state output for each input time step, using return_sequences=True on the last LSTM layer.
So now the labels look like this (shape (100,100)):
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
...]
the model summary:
Layer (type)                 Output Shape              Param #
=================================================================
lstm_1 (LSTM)                (None, 100, 32)           4608
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 100, 32)           0
_________________________________________________________________
dropout_1 (Dropout)          (None, 100, 32)           0
_________________________________________________________________
lstm_2 (LSTM)                (None, 100, 16)           3136
_________________________________________________________________
dropout_2 (Dropout)          (None, 100, 16)           0
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 100, 16)           0
_________________________________________________________________
lstm_3 (LSTM)                (None, 100, 8)            800
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 100, 8)            0
_________________________________________________________________
dense_1 (Dense)              (None, 100, 1)            9
=================================================================
Total params: 8,553
Trainable params: 8,553
Non-trainable params: 0
_________________________________________________________________
However, I get an exception:
ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (75, 100)
What do I need to fix?
Note: This is just an idea and it might be wrong. Try it if you would like and I would appreciate any feedback.
Is there a way to achieve what I want (avoid extreme spikes while
predicting probability), or is that a given fact?
You can do this experiment: set the return_sequences argument of the last LSTM layer to True and replicate the label of each sample as many times as the sample has timesteps. For example, if a sample has a length of 100 and its label is 0, then create a new label for this sample which consists of 100 zeros (you can probably do this easily with a NumPy function such as np.repeat). Then retrain your new model and test it on new samples afterwards. I am not sure of this, but I would expect more monotonically increasing/decreasing probability graphs this time.
Update: The error you mentioned is caused by the fact that the labels should be a 3D array (look at the output shape of the last layer in the model summary). Use np.expand_dims to add another axis of size one at the end. The correct way of repeating the labels would look like this, assuming y_train has a shape of (num_samples,):
rep_y_train = np.repeat(y_train, num_reps).reshape(-1, num_reps, 1)
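A small worked example of the two shape fixes mentioned above, using made-up labels. The values of y_train and num_reps and the array y_2d are illustrative, not the data from the question.

```python
import numpy as np

# Three samples with one binary label each.
y_train = np.array([0, 1, 1])
num_reps = 4  # number of timesteps per sample

# Repeat every label once per timestep and add a trailing axis of size 1.
rep_y_train = np.repeat(y_train, num_reps).reshape(-1, num_reps, 1)
print(rep_y_train.shape)     # (3, 4, 1)
print(rep_y_train[1, :, 0])  # [1 1 1 1]

# Labels that were already repeated into a 2D array only need the extra axis.
y_2d = np.zeros((75, 100))
y_3d = np.expand_dims(y_2d, axis=-1)
print(y_3d.shape)  # (75, 100, 1)
```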
The experiment on IMDB dataset:
I tried the experiment suggested above on the IMDB dataset using a simple model with one LSTM layer. The first time, I used only one label per sample (as in the original approach of @Shlomi), and the second time I replicated the labels to have one label per timestep of a sample (as I suggested above). Here is the code if you would like to try it yourself:
from keras.layers import *
from keras.models import Sequential, Model
from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences
import numpy as np
vocab_size = 10000
max_len = 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
X_train = pad_sequences(x_train, maxlen=max_len)
def create_model(return_seq=False, stateful=False):
    batch_size = 1 if stateful else None
    model = Sequential()
    model.add(Embedding(vocab_size, 128, batch_input_shape=(batch_size, None)))
    model.add(CuDNNLSTM(64, return_sequences=return_seq, stateful=stateful))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
    return model
# train model with one label per sample
train_model = create_model()
train_model.fit(X_train, y_train, epochs=10, batch_size=128, validation_split=0.3)
# replicate the labels
y_train_rep = np.repeat(y_train, max_len).reshape(-1, max_len, 1)
# train model with one label per timestep
rep_train_model = create_model(True)
rep_train_model.fit(X_train, y_train_rep, epochs=10, batch_size=128, validation_split=0.3)
Then we can create the stateful replicas of the training models and run them on some test data to compare their results:
# replica of `train_model` with the same weights
test_model = create_model(False, True)
test_model.set_weights(train_model.get_weights())
test_model.reset_states()
# replica of `rep_train_model` with the same weights
rep_test_model = create_model(True, True)
rep_test_model.set_weights(rep_train_model.get_weights())
rep_test_model.reset_states()
def stateful_predict(model, samples):
    preds = []
    for s in samples:
        model.reset_states()
        ps = []
        for ts in s:
            p = model.predict(np.array([[ts]]))
            ps.append(p[0,0])
        preds.append(list(ps))
    return preds
X_test = pad_sequences(x_test, maxlen=max_len)
The first sample of X_test has a 0 label (i.e. belongs to the negative class) and the second sample of X_test has a 1 label (i.e. belongs to the positive class). So let's first see what the stateful predictions of test_model (i.e. the one that was trained using one label per sample) look like for these two samples:
import matplotlib.pyplot as plt
preds = stateful_predict(test_model, X_test[0:2])
plt.plot(preds[0])
plt.plot(preds[1])
plt.legend(['Class 0', 'Class 1'])
The result:
Correct label (i.e. probability) at the end (i.e. timestep 200) but very spiky and fluctuating in between. Now let's compare it with the stateful predictions of the rep_test_model (i.e. the one that was trained using one label per timestep):
preds = stateful_predict(rep_test_model, X_test[0:2])
plt.plot(preds[0])
plt.plot(preds[1])
plt.legend(['Class 0', 'Class 1'])
The result:
Again, correct label prediction at the end, but this time with a much smoother and more monotonic trend, as expected.
Note that this was just an example for demonstration and therefore I have used a very simple model here with just one LSTM layer and I did not attempt to tune it at all. I guess with a better tuning of the model (e.g. adjusting the number of layers, number of units in each layer, activation functions used, optimizer type and parameters, etc.), you might get far better results.
I have data of the following form:
A B C D E F G
1 0 0 1 0 0 1
1 0 0 1 0 0 1
1 0 0 1 0 1 0
1 0 1 0 1 0 0
...
1 0 1 0 1 0 0
0 1 1 0 0 0 1
0 1 1 0 0 0 1
0 1 0 1 1 0 0
0 1 0 1 1 0 0
A,B,C,D are my inputs and E,F,G are my outputs. I wrote the following code in Python using TensorFlow:
from __future__ import print_function
#from random import randint
import numpy as np
import tflearn
import pandas as pd
data,labels =tflearn.data_utils.load_csv('dummy_data.csv',target_column=-1,categorical_labels=False, n_classes=None)
print(data)
# Build neural network
net = tflearn.input_data(shape=[None, 4])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 3, activation='softmax')
net = tflearn.regression(net)
# Define model
model = tflearn.DNN(net)
#Start training (apply gradient descent algorithm)
data_to_array = np.asarray(data)
print(data_to_array.shape)
#data_to_array= data_to_array.reshape(6,9)
print(data_to_array.shape)
model.fit(data_to_array, labels, n_epoch=10, batch_size=3, show_metric=True)
I am getting an error which says:
ValueError: Cannot feed value of shape (3, 6) for Tensor 'InputData/X:0', which has shape '(?, 4)'
I am guessing this is because my input data has 7 columns (0...6), but I want the input layer to take only the first four columns as input and predict the last 3 columns in the data as output. How can I model this?
If the data is a NumPy array, then the first 4 columns are taken with a simple slice:
data[:,0:4]
The : means "all rows", and 0:4 is a range of values 0,1,2,3, the first 4 columns.
If the data isn't a NumPy array, convert it to one (for example with np.asarray) so you can slice easily.
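A short sketch of that split, assuming the seven columns A to G are held in a nested Python list. The rows are copied from the sample in the question, and the names rows, inputs and outputs are illustrative.

```python
import numpy as np

# A few rows from the question: columns A, B, C, D, E, F, G.
rows = [
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 1, 0, 0],
]

# Convert to a NumPy array so that column slicing works.
data = np.asarray(rows)

inputs = data[:, 0:4]   # columns A-D
outputs = data[:, 4:7]  # columns E-G

print(inputs.shape)   # (4, 4)
print(outputs.shape)  # (4, 3)
```

The inputs array then matches an input layer of shape [None, 4], and outputs matches a final layer with 3 units.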
Here's a related article on numpy slices: Numpy - slicing 2d row or column vector from array