I am working on a regression CNN using Keras/Tensorflow. I have a multi-output feed-forward model that I have trained up with some success. The model takes in a 201x201 grayscale image and returns two regression targets.
Here is an example of an input/target pair:
[input image omitted] is associated with (z=562.59, a=4.53)
There exists an analytical solution for this problem, so I know it's solvable.
Here is the model architecture:
model_input = keras.Input(shape=input_shape, name='image')
x = model_input
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size = (2,2))(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size = (2,2))(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size = (2,2))(x)
x = Conv2D(16, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size = (4,4))(x)
x = Flatten()(x)
model_outputs = list()
out_names = ['z', 'a']
for i in range(2):
    out_name = out_names[i]
    local_output = x
    local_output = Dense(10, activation='relu')(local_output)
    local_output = Dropout(0.2)(local_output)
    local_output = Dense(units=1, activation='linear', name=out_name)(local_output)
    model_outputs.append(local_output)
model = Model(model_input, model_outputs)
model.compile(loss = 'mean_squared_error', optimizer='adam', loss_weights = [1,1])
My targets are on different scales, so I normalized one of them (named 'a') to the range [0,1] for training. Here is how I rescale:
def rescale(min, max, list):
    # linearly map values from the known [min, max] range to [0, 1]
    scalar = 1. / (max - min)
    list = (list - min) * scalar
    return list
Where min,max for each parameter are known a priori and are constant.
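For reference, the inverse transform to bring predictions back to physical units is just the reverse of the above (a minimal sketch, assuming the same a priori min/max; a_min and a_max are placeholder names, not from the original code):
import numpy as np

def inverse_rescale(min_val, max_val, scaled):
    # undo rescale(): map values from [0, 1] back to [min_val, max_val]
    return np.asarray(scaled) * (max_val - min_val) + min_val

# e.g. for the second output ('a') of the multi-output model:
# a_pred = inverse_rescale(a_min, a_max, model.predict(x_test)[1])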
Here is how I trained:
model.fit({'image' : x_train},
{'z' : z_train, 'a' : a_train},
batch_size = 32,
epochs=20,
verbose=1,
validation_data = ({'image' : x_test},
{'z' : z_test, 'a' : a_test}))
When I predict for 'a', I get fairly good accuracy, but with an offset (plot omitted).
This is a fairly easy thing to fix: I just apply a linear fit to the predictions and invert it to rescale (plot omitted).
But I can't think of a reason why this would be happening in the first place. I've used this same model architecture for other problems, and I get that same offset again. Has anyone seen this sort of thing before?
EDIT: This offset occurs in multiple different models of mine, which each predict different parameters but are rescaled/preprocessed in the same way. It happens regardless of how many epochs I train for, with more training resulting in predictions hugging the green line (in the first graph) more closely.
As a temporary work-around, I trained a single-node model that takes the original model's prediction as input and the ground truth as output. This trained up nicely and corrects the offset. What's strange, though, is that I can apply this rescale model to ANY of the models with this issue, and it corrects the offset equally well.
Essentially: the offset is the same for multiple different models, which predict completely different parameters. This makes me think it has something to do with the activation or regularization.
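To illustrate that work-around, here is a minimal sketch of such a single-node correction model (a_pred and a_true are placeholders for one model's raw predictions and the matching ground truth, each of shape (num_samples, 1)):
import keras
from keras.layers import Dense
from keras.models import Model

# single-node linear model: learns a scale and offset from raw predictions to ground truth
corr_input = keras.Input(shape=(1,), name='raw_prediction')
corr_output = Dense(1, activation='linear', name='corrected')(corr_input)
rescale_model = Model(corr_input, corr_output)
rescale_model.compile(loss='mean_squared_error', optimizer='adam')
rescale_model.fit(a_pred, a_true, epochs=50, batch_size=32)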
My problem is to create an autoencoder model to recognize anomalies on a cardboard-like surface. I know that I can use an autoencoder and train it using "good" samples, and later I can use it to recognize "bad" samples (i.e., anomalies).
I've built convolutional autoencoders in TensorFlow based on Building autoencoder in Keras and PyImageSearch autoencoders. Both examples use the MNIST dataset and they work perfectly (the loss decreases and accuracy grows to about 0.85). However, when I tried to train both autoencoders on my custom pictures, my models don't converge: the training loss (I tried binary_crossentropy and mse, as stated on those websites) gets stuck at some level, e.g. 3.0873e-04 or 0.0016 (depending on the loss and the way of normalizing the data), and accuracy is 0 or something like 1.2618e-07. Below is the sample network architecture.
input_layer = layers.Input(shape=(28, 28, 1))
data_augmentation = tf.keras.Sequential()
data_augmentation.add(tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal'))
data_augmentation.add(tf.keras.layers.experimental.preprocessing.RandomFlip('vertical'))
x = data_augmentation(input_layer, 0)  # second positional argument is `training`; 0 means the flips are not applied
x = layers.Conv2D(16, (3, 3), padding='same', activation='relu')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
# the block below was added once and removed later to reduce network depth
x = layers.Conv2D(8, (3, 3), padding='same', activation='relu')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), padding='same', activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
# the block below was added once and removed later to reduce network depth - END
x = layers.Conv2D(16, (3, 3), padding='same', activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), padding='same', activation='sigmoid')(x)
autoencoder = Model(input_layer, decoded)
My dataset consists of about 50k pictures of size 50x50 (and resized to 28x28) presenting tiny parts of cardboard-like sheet: Example 1, Example 2, and here you can see the source 900x900 picture I used to create 50x50 pictures. For the model, I convert them to grayscale.
I used two ways of data normalization: the first one (taken from the websites) was to divide the values by 255, and the second one was min-max normalization. The pixel values are in the range 59-168. Here is how I create the dataset:
imgs = [cv2.imread(fname) for fname in glob.glob('{}/*.jpg'.format(dir_with_pictures))]
imgs = [img for img in imgs if img is not None] # to remove pictures which OpenCV was not able to load for some reason
dataset = np.array([np.expand_dims(cv2.resize(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), (28, 28)), axis=-1) for img in imgs])
dataset = (dataset - np.min(dataset)) / (np.max(dataset) - np.min(dataset)) # one way of normalization, OR
dataset = dataset.astype('float32') / 255.0 # second way of normalization
And here I compile my model; later it is trained on my dataset - either the whole dataset, or the training set created by splitting the dataset 80/20:
autoencoder.compile(optimizer=Adam(learning_rate=1e-3), loss='mse', metrics=['accuracy'])
Is it possible that the items in my dataset are 'too similar' to each other and that's why I can't train a good model? Or might there be something else wrong with the dataset? What can I try to get a converging model? Should I pay attention to the accuracy metric at all?
I've adapted a simple CNN from a tutorial on Analytics Vidhya.
The problem is that my accuracy on a holdout set is no better than random. I am training on ~8600 images each of cats and dogs, which should be enough data for a decent model, but accuracy on the test set sits at 49%. Is there a glaring omission in my code somewhere?
import os
import numpy as np
import keras
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from datetime import datetime
from PIL import Image
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
def main():
    cat = os.listdir("train/cats")
    dog = os.listdir("train/dogs")
    filepath = "train/cats/"
    filepath2 = "train/dogs/"

    print("[INFO] Loading images of cats and dogs each...", datetime.now().time())
    #print("[INFO] Loading {} images of cats and dogs each...".format(num_images), datetime.now().time())

    images = []
    label = []
    for i in cat:
        image = Image.open(filepath + i)
        image_resized = image.resize((300, 300))
        images.append(image_resized)
        label.append(0)  # for cat images
    for i in dog:
        image = Image.open(filepath2 + i)
        image_resized = image.resize((300, 300))
        images.append(image_resized)
        label.append(1)  # for dog images

    images_full = np.array([np.array(x) for x in images])
    label = np.array(label)
    label = to_categorical(label)
    images_full, label = shuffle(images_full, label)

    print("[INFO] Splitting into train and test", datetime.now().time())
    (trainX, testX, trainY, testY) = train_test_split(images_full, label, test_size=0.25)

    filters = 10
    filtersize = (5, 5)
    epochs = 5
    batchsize = 32
    input_shape = (300, 300, 3)
    #input_shape = (30, 30, 3)

    print("[INFO] Designing model architecture...", datetime.now().time())
    model = Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1), padding='same',
                                                data_format="channels_last", activation='relu'))
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=2, input_dim=50, activation='softmax'))
    #model.add(keras.layers.Dense(units=2, input_dim=5, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    print("[INFO] Fitting model...", datetime.now().time())
    model.fit(trainX, trainY, epochs=epochs, batch_size=batchsize, validation_split=0.3)
    model.summary()

    print("[INFO] Evaluating on test set...", datetime.now().time())
    eval_res = model.evaluate(testX, testY)
    print(eval_res)

if __name__ == "__main__":
    main()
For me the problem comes from the size of your network: you have only one Conv2D layer with 10 filters. This is way too small to learn a deep representation of your images.
Try increasing this a lot by using blocks from common architectures like VGGNet!
Example of a block :
x = Conv2D(32, (3, 3) , padding='SAME')(model_input)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = Conv2D(32, (3, 3) , padding='SAME')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)
You need to try several blocks like that, increasing the number of filters as you go deeper, in order to capture richer features.
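For instance, a rough sketch of two such blocks stacked, with 32 and then 64 filters (illustrative only; model_input is the Input layer as in the block above):
from keras.layers import Conv2D, LeakyReLU, BatchNormalization, MaxPooling2D, Dropout

# first block: 32 filters
x = Conv2D(32, (3, 3), padding='same')(model_input)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)
# second block: 64 filters, capturing higher-level features
x = Conv2D(64, (3, 3), padding='same')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)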
Another thing: you don't need to specify the input_dim of your Dense layer; Keras automatically takes care of that!
Last but not least, you need a fully connected network (a stack of Dense layers) in order to correctly classify your images, not just a single layer.
For example :
x = Flatten()(x)
x = Dense(256)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(128)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(2)(x)
x = Activation('softmax')(x)
Try those changes and let me know how it goes!
Update after OP's questions
Images are complex; they contain a lot of information such as shapes, edges, colors, etc.
In order to capture the maximum amount of information, you need to pass through multiple convolutions which will learn the different aspects of the image.
Imagine, for example, that the first convolution learns to recognise squares, the second one circles, the third one edges, and so on.
As for my second point, the final fully connected layers act like a classifier: the conv network outputs a vector that "represents" a dog or a cat, and you now need to learn that this kind of vector belongs to one class or the other.
Directly feeding that vector into the final layer is not enough to learn this representation.
Is that clearer?
Last update for OP's second comment
Here are two ways of defining a Keras model; both produce the same thing!
# Functional API
model_input = Input(shape=(200, 1))
x = Dense(32)(model_input)
x = Dense(16)(x)
x = Activation('relu')(x)
model = Model(inputs=model_input, outputs=x)

# Sequential API (equivalent)
model = Sequential()
model.add(Dense(32, input_shape=(200, 1)))
model.add(Dense(16, activation='relu'))
Example of architecture
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.convolutional.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.convolutional.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.convolutional.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.convolutional.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Don't forget to normalize your data before feeding it into your network.
A simple images_full = images_full / 255.0 on your data can boost your accuracy a lot.
Try it with grayscale images too; it's more computationally efficient.
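For instance, a rough sketch of how the loading loop could do both (grayscale plus scaling); illustrative only, reusing the variable names from the question:
import numpy as np
from PIL import Image

# inside the loading loop: open as single-channel grayscale ('L'), resize, scale to [0, 1]
image = Image.open(filepath + i).convert('L')
image_resized = image.resize((300, 300))
images.append(np.array(image_resized) / 255.0)
# with grayscale input the model's input_shape becomes (300, 300, 1),
# so add a channel axis when building the array, e.g.
# images_full = np.expand_dims(np.array(images), axis=-1)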
I have a set of 100x100 images, and an output array corresponding to the size of the input (i.e. a length of 10000), where each element can be a 1 or a 0.
I am trying to write a python program using TensorFlow/Keras to train a CNN on this data, however, I am not sure how to setup the layers to handle it, or the type of network to use.
Currently, I am doing the following (based off the TensorFlow tutorials):
model = keras.Sequential([
keras.layers.Flatten(input_shape=(100, 100)),
keras.layers.Dense(128, activation=tf.nn.relu),
keras.layers.Dense(10000, activation=tf.nn.softmax)
])
model.compile(optimizer=tf.train.AdamOptimizer(),
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
However, I can't seem to find out what type of activation I should be using for the output layer to enable me to have multiple output values.
How would I set that up?
I am not sure how to setup the layers to handle it.
Your code is one way to handle it, but as you might read in the literature, it is not the best one. State-of-the-art models usually use 2D convolutional neural networks, e.g.:
img_input = keras.layers.Input(shape=img_shape)
conv1 = keras.layers.Conv2D(16, 3, activation='relu', padding='same')(img_input)
pol1 = keras.layers.MaxPooling2D(2)(conv1)
conv2 = keras.layers.Conv2D(32, 3, activation='relu', padding='same')(pol1)
pol2 = keras.layers.MaxPooling2D(2)(conv2)
conv3 = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(pol2)
pol3 = keras.layers.MaxPooling2D(2)(conv3)
flatten = keras.layers.Flatten()(pol3)
dens1 = keras.layers.Dense(512, activation='relu')(flatten)
dens2 = keras.layers.Dense(512, activation='relu')(dens1)
drop1 = keras.layers.Dropout(0.2)(dens2)
output = keras.layers.Dense(10000, activation='softmax')(drop1)
I can't seem to find what type of activation I should be using for the output layer to enable me to have multiple output values
Softmax is a good choice. It squashes a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values, where each entry is in the range (0, 1) and the entries sum to 1.
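A tiny illustration (values are approximate):
import tensorflow as tf

logits = tf.constant([2.0, 1.0, 0.1])
probs = tf.nn.softmax(logits)
# probs is roughly [0.659, 0.242, 0.099]; the entries sum to 1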
You can pass the output of your softmax to the top_k function to extract the top k predictions:
softmax_out = tf.nn.softmax(logit)
tf.nn.top_k(softmax_out, k=5, sorted=True)
If you need multi-label classification, you should change the above network: the last activation function changes to sigmoid:
output = keras.layers.Dense(10000, activation='sigmoid')(drop1)
Then use tf.round and tf.where to extract labels:
indices = tf.where(tf.round(output) > 0.5)
final_output = tf.gather(x, indices)
I understand that the features extracted from an auto-encoder can be fed into an MLP for classification or regression purposes. This is something that I did earlier.
But what if I have 2 auto-encoders? Can I extract the features from the bottleneck layers of 2 auto-encoders and feed them into an MLP which performs classification based on these features? If yes, then how? I am not sure how to concatenate these two feature sets. I tried numpy.hstack(), which gives me an 'unhashable slice' error, whereas using tf.concat() gives me the error 'Input tensors to a Model must be Keras tensors.' The bottleneck layers of the two auto-encoders each have dimension (None, 100), so if I stack them horizontally I should get (None, 200). The hidden layer of the MLP may contain some (num_hidden=100) neurons. Could anyone please help?
x1 = autoencoder1.get_layer('encoder2').output
x2 = autoencoder2.get_layer('encoder2').output
#inp = np.hstack((x1, x2))
inp = tf.concat([x1, x2], 1)
x = tf.concat([x1, x2], 1)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inp, outputs=y)
# Compile model
mymlp.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train model
mymlp.fit(x_train, y_train, epochs=20, batch_size=8)
Updated as per @twolffpiggott's suggestion:
from keras.layers import Input, Dense, Dropout
from keras import layers
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy as np
x1 = Data1
x2 = Data2
y = Data3
num_neurons1 = x1.shape[1]
num_neurons2 = x2.shape[1]
# Train-test split
x1_train, x1_test, x2_train, x2_test, y_train, y_test = train_test_split(x1, x2, y, test_size=0.2)
# scale data within [0-1] range
scalar = MinMaxScaler()
x1_train = scalar.fit_transform(x1_train)
x1_test = scalar.transform(x1_test)
x2_train = scalar.fit_transform(x2_train)
x2_test = scalar.transform(x2_test)
x_train = np.concatenate([x1_train, x2_train], axis =-1)
x_test = np.concatenate([x1_test, x2_test], axis =-1)
# Auto-encoder1
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons1,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded1 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded1)
decoded = Dense(num_neurons1, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder1 = Model(inputs=input_data, outputs=decoded)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x1_train, x1_train,
epochs=100,
batch_size=8,
shuffle=True,
validation_data=(x1_test, x1_test))
# Auto-encoder2
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons2,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded2 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded2)
decoded = Dense(num_neurons2, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder2 = Model(inputs=input_data, outputs=decoded)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(x2_train, x2_train,
epochs=100,
batch_size=8,
shuffle=True,
validation_data=(x2_test, x2_test))
# MLP
num_hidden = 100
encoded1.trainable = False
encoded2.trainable = False
encoded1 = autoencoder1(autoencoder1.inputs)
encoded2 = autoencoder2(autoencoder2.inputs)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
x = Dropout(0.2)(concatenated)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
# Compile model
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit(x_train, y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict(x_test)
giving me an error: unhashable type: 'list' from the line:
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
The problem is that you're mixing numpy arrays with keras tensors. That can't work.
There are two approaches:
1. Predict numpy arrays from each autoencoder, concatenate the arrays, and send them to the third model.
2. Connect all models, probably make the autoencoders untrainable, and fit with one input for each autoencoder.
Personally, I'd go for the first. (Assuming the autoencoders are already trained and don't need change).
First approach
numpyOutputFromAuto1 = autoencoder1.predict(numpyInputs1)
numpyOutputFromAuto2 = autoencoder2.predict(numpyInputs2)
inputDataForThird = np.concatenate([numpyOutputFromAuto1,numpyOutputFromAuto2],axis=-1)
inputTensorForMlp = Input(inputDataForThird.shape[1:])
h = Dense(num_hidden, activation='relu', name='hidden')(inputTensorForMlp)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inputTensorForMlp, outputs=y)
....
mymlp.fit(inputDataForThird, someY)
Second approach
This is a little more complicated, and at first I don't see much reason to do this. (But of course there may be cases where it's a good choice)
Now we're totally forgetting numpy and working with keras tensors.
Creating the mlp on its own (good if you will use it later without the autoencoders):
inputTensorForMlp = Input(input_shape_compatible_with_concatenated_encoder_outputs)
x = Dropout(0.2)(inputTensorForMlp)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=inputTensorForMlp, outputs=y)
We probably want the bottleneck features of the autoencoders, right? If you happened to create the autoencoders properly (separate encoder and decoder models, joined afterwards), then it's easier to use just the encoder model. Otherwise:
encodedOutput1 = autoencoder1.layers[bottleneckLayer].output  # or encoder1.output
encodedOutput2 = autoencoder2.layers[bottleneckLayer].output  # or encoder2.output
Creating a joined model. The concatenation must use a keras layer (we're working with keras tensors):
concatenated = Concatenate()([encodedOutput1,encodedOutput2])
output = myMLP(concatenated)
joinedModel = Model([autoencoder1.input,autoencoder2.input],output)
I'd also go with Daniel's first approach (for simplicity and efficiency), but if you're interested in the second, for instance if you want to run the network end-to-end, you'd approach it like this:
# make autoencoders not trainable
autoencoder1.trainable = False
autoencoder2.trainable = False
encoded1 = autoencoder1(input_data1)
encoded2 = autoencoder2(input_data2)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
h = Dense(num_hidden, activation='relu', name='hidden')(concatenated)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model([input_data1, input_data2], y)
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit([x1_train, x2_train], y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict([x1_test, x2_test])
Key edits
The weights of both autoencoders should be frozen end-to-end (otherwise early-stage gradient updates from the randomly initialized MLP will likely result in the loss of much of their learning).
The autoencoder input layers should be assigned to separate variables, input_data1 and input_data2, one per autoencoder (instead of both to input_data). autoencoder1.inputs returns a list of tensors, which is the source of the unhashable type: 'list' exception; replacing it with [input_data1, input_data2] solves the issue.
When fitting the MLP for the end-to-end model, the input should be a list of x1_train and x2_train rather than the concatenated inputs. Same when predicting.
I am following the example on the page Multi-input and multi-output models.
The model is set up to predict how many retweets and likes a news headline will receive. So is the main_output predicting how many retweets, and the aux_output predicting the likes?
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
headline_data=[[i for i in range(100)]]
additional_data=[[100,200]]
labels=[1,2]
# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])
# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
# This defines a model with two inputs and two outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
# We compile the model and assign a weight of 0.2 to the auxiliary loss.
# To specify different loss_weights or loss for each different output,
# you can use a list or a dictionary. Here we pass a single loss as the loss argument,
# so the same loss will be used on all outputs.
# Since our inputs and outputs are named (we passed them a "name" argument), We could also have compiled the model via:
model.compile(optimizer='rmsprop',
loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
loss_weights={'main_output': 1., 'aux_output': 0.2})
# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
{'main_output': labels, 'aux_output': labels},
epochs=50, batch_size=32)
I get an error: AttributeError: 'list' object has no attribute 'ndim'
Your inputs/outputs must be NumPy arrays, in which the first dimension is the batch size. For instance:
headline_data = np.random.randint(1, 10000 + 1, size=(32, 100))
additional_data = np.random.randint(1, 10000 + 1, size=(32, 5))
labels = np.random.randint(0, 1 + 1, size=(32, 1))
Note that this is a toy example, and we are generating the input randomly.
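As a quick sanity check on the shapes (batch size 32 here) against the Input layers defined above; with arrays like these, the fit call from the question runs as written:
print(headline_data.shape)    # (32, 100) -> matches main_input, shape=(100,)
print(additional_data.shape)  # (32, 5)   -> matches aux_input, shape=(5,)
print(labels.shape)           # (32, 1)   -> one target per sample for each output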