I found this code at https://github.com/raducrs/Applications-of-Deep-Learning/blob/master/Image%20captioning%20Flickr8k.ipynb and tried to run it in Google Colab, but running the code below raises an error. It says:
Merge is deprecated
How can I run this code with the latest version of Keras?
LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000
image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))
caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
weights=[embedding_matrix],
input_length=MAX_SENTENCE,
trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))
combined = Sequential()
combined.add(Merge([image_pre, caption_model], mode='concat', concat_axis=1,name="merge_models"))
combined.add(Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm"))
combined.add(Dense(VOCABULARY_SIZE,name="fc_merged"))
combined.add(Activation('softmax',name="softmax_combined"))
predictive = Model([image_pre.input, caption_model.input],combined.output)
Merge(mode='concat') is now Concatenate(axis=1).
The following generates a graph correctly on colab.
import numpy as np  # needed for np.zeros below
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model, Sequential
IMG_FEATURES_SIZE = 10
MAX_SENTENCE = 80
VOCABULARY_SIZE = 1000
EMB_SIZE = 100
embedding_matrix = np.zeros((VOCABULARY_SIZE, EMB_SIZE))
LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000
image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))
caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
weights=[embedding_matrix],
input_length=MAX_SENTENCE,
trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))
merge = Concatenate(axis=1,name="merge_models")([image_pre.output, caption_model.output])
lstm = Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm")(merge)
output = Dense(VOCABULARY_SIZE, name="fc_merged", activation='softmax')(lstm)
predictive = Model([image_pre.input, caption_model.input], output)
predictive.compile('sgd', 'categorical_crossentropy')  # a softmax over the vocabulary pairs with categorical, not binary, cross-entropy
predictive.summary()
Description:
This is a model with 2 inputs per sample: an image and a caption (a sequence of words).
The two input branches merge at the concatenation point (name='merge_models').
The image is processed by a single Dense layer (you may want to add convolutions to the image branch); the output of this Dense layer is then copied MAX_SENTENCE times in preparation for the merge.
The captions are processed by an LSTM and a Dense layer.
Because the merge uses axis=1 (the time axis), it results in 2 * MAX_SENTENCE time-steps of 100 features each: the repeated image features followed by the caption features.
The combined branch then ends up predicting one class out of VOCABULARY_SIZE.
predictive.summary() is a good way to understand the graph.
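To make the shapes at the merge concrete, here is a minimal NumPy sketch (not Keras itself) that mimics RepeatVector and Concatenate(axis=1) on dummy arrays. The batch size of 4 is an assumption for illustration; the other sizes come from the code above.

```python
import numpy as np

BATCH = 4            # assumed batch size, for illustration only
MAX_SENTENCE = 80
FEATURES = 100       # width of the Dense(100) layers in both branches

# Image branch: Dense(100) output, then RepeatVector(MAX_SENTENCE).
image_dense = np.zeros((BATCH, FEATURES))
image_branch = np.repeat(image_dense[:, np.newaxis, :], MAX_SENTENCE, axis=1)

# Caption branch: TimeDistributed(Dense(100)) output.
caption_branch = np.ones((BATCH, MAX_SENTENCE, FEATURES))

# Concatenate(axis=1) joins the two sequences along the time axis.
merged = np.concatenate([image_branch, caption_branch], axis=1)

print(image_branch.shape)
print(caption_branch.shape)
print(merged.shape)
```

With axis=1 the merged tensor has 2 * MAX_SENTENCE time-steps. Concatenating on the last axis instead would keep MAX_SENTENCE time-steps and place the features of both branches side by side (200 per step).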
How should I edit my code to prevent this error?
I use a Keras Conv1D model on a raw dataset with X_train.shape = (142315, 23)
and Y_train.shape = (142315,).
My code is:
import tensorflow
n_timesteps = X_train.shape[1]  # 23
input_layer = tensorflow.keras.layers.Input(shape=(n_timesteps, 1))
conv_layer1 = tensorflow.keras.layers.Conv1D(filters=5,
                                             kernel_size=7,
                                             activation="relu")(input_layer)
max_pool1 = tensorflow.keras.layers.MaxPooling1D(pool_size=2, strides=5)(conv_layer1)
conv_layer2 = tensorflow.keras.layers.Conv1D(filters=3,
                                             kernel_size=3,
                                             activation="relu")(max_pool1)
flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2)
dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer)
output_layer = tensorflow.keras.layers.Dense(6, activation="softmax")(dense_layer)
model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
# Prints a string summary of the network.
model.summary()
After that I use an optimization technique to tune the hyperparameters, and at the step commented "# Returning the details of the best solution." this error is printed. Can you help me?
error
5121 # Use logits whenever they are available. softmax and sigmoid
ValueError: Shapes (142315,) and (142315, 2) are incompatible
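An error of this form usually means the loss compares a 1-D vector of class indices against a 2-D model output. A minimal sketch of one way to align them, assuming Y_train holds integer class ids in [0, NUM_CLASSES): one-hot encode the labels so their shape becomes (n_samples, NUM_CLASSES). The label values below are hypothetical. Note also that the (142315, 2) in the message suggests the model that failed had 2 output units rather than the 6 shown above, so the number of output units should be checked against the number of classes.

```python
import numpy as np

NUM_CLASSES = 6                       # matches Dense(6, activation="softmax")
y_int = np.array([0, 3, 5, 1, 2])     # hypothetical integer labels, shape (5,)

# One-hot encode: row i has a 1 in column y_int[i] and 0 elsewhere.
y_onehot = np.eye(NUM_CLASSES, dtype="float32")[y_int]

print(y_int.shape)
print(y_onehot.shape)
```

Alternatively, the integer labels can be kept as they are and the model compiled with loss='sparse_categorical_crossentropy', which accepts class indices directly.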
I structured a Convolutional LSTM model to predict the forthcoming Bitcoin price data, using the analyzed past data of the Bitcoin close price and other features.
Let me jump straight to the code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
import tensorflow.keras as keras
import keras_tuner as kt
from keras_tuner import HyperParameters as hp
from keras.models import Sequential
from keras.layers import InputLayer, ConvLSTM1D, LSTM, Flatten, RepeatVector, Dense, TimeDistributed
from keras.callbacks import EarlyStopping
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.optimizers import Adam
import keras.backend as K
from keras.losses import Huber
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
DIR = '../input/btc-features-targets'
SEG_DIR = '../input/segmented'
segmentized_features = os.listdir(SEG_DIR)
btc_train_features = []
for seg in segmentized_features:
    train_features = pd.read_csv(f'{SEG_DIR}/{seg}')
    train_features.set_index('date', inplace=True)
    btc_train_features.append(scaler.fit_transform(train_features.values))
btc_train_targets = pd.read_csv(f'{DIR}/btc_train_targets.csv')
btc_train_targets.set_index('date', inplace=True)
btc_test_features = pd.read_csv(f'{DIR}/btc_test_features.csv')
btc_tef1 = btc_test_features.iloc[:111]
btc_tef2 = btc_test_features.iloc[25:]
btc_tef1.set_index('date', inplace=True)
btc_tef2.set_index('date', inplace=True)
btc_test_targets = pd.read_csv(f'{DIR}/btc_test_targets.csv')
btc_test_targets.set_index('date', inplace=True)
btc_trt_log = np.log(btc_train_targets)
btc_tefs1 = scaler.fit_transform(btc_tef1.values)
btc_tefs2 = scaler.fit_transform(btc_tef2.values)
btc_tet_log = np.log(btc_test_targets)
scaled_train_features = []
for features in btc_train_features:
    shape = features.shape
    scaled_train_features.append(np.expand_dims(features, [0,3]))
shape_2 = btc_tefs1.shape
btc_tefs1 = np.expand_dims(btc_tefs1, [0,3])
shape_3 = btc_tefs2.shape
btc_tefs2 = np.expand_dims(btc_tefs2, [0,3])
btc_trt_log = btc_trt_log.values[0]
btc_tet_log = btc_tet_log.values[0]
def build(hp):
    model = keras.Sequential()
    # Input Layer
    model.add(InputLayer(input_shape=(111,32,1)))
    # ConvLSTM1D
    convLSTM_hp_filters = hp.Int(name='convLSTM_filters', min_value=32, max_value=512, step=32)
    convLSTM_hp_kernel_size = hp.Choice(name='convLSTM_kernel_size', values=[3,5,7])
    convLSTM_activation = hp.Choice(name='convLSTM_activation', values=['selu', 'relu'])
    model.add(ConvLSTM1D(filters=convLSTM_hp_filters,
                         kernel_size=convLSTM_hp_kernel_size,
                         padding='same',
                         activation=convLSTM_activation,
                         use_bias=True,
                         bias_initializer='zeros'))
    # Flatten
    model.add(Flatten())
    # RepeatVector
    model.add(RepeatVector(5))
    # LSTM
    LSTM_hp_units = hp.Int(name='LSTM_units', min_value=32, max_value=512, step=32)
    LSTM_activation = hp.Choice(name='LSTM_activation', values=['selu', 'relu'])
    model.add(LSTM(units=LSTM_hp_units, activation=LSTM_activation, return_sequences=True))
    # TimeDistributed Dense
    dense_units = hp.Int(name='dense_units', min_value=32, max_value=512, step=32)
    dense_activation = hp.Choice(name='dense_activation', values=['selu', 'relu'])
    model.add(TimeDistributed(Dense(units=dense_units, activation=dense_activation)))
    # Dense output (applied to each of the 5 time steps)
    model.add(Dense(1))
    # Set Learning Rate
    hp_learning_rate = hp.Choice(name='learning_rate', values=[1e-2, 1e-3, 1e-4])
    # Compile Model
    model.compile(optimizer=Adam(learning_rate=hp_learning_rate),
                  loss=Huber(),
                  metrics=[RootMeanSquaredError()])
    return model
tuner = kt.Hyperband(build,
                     objective=kt.Objective('root_mean_squared_error', direction='min'),
                     max_epochs=10,
                     factor=3)
early_stop = EarlyStopping(monitor='root_mean_squared_error', patience=5)
opt_hps = []
for train_features in scaled_train_features:
    tuner.search(train_features, btc_trt_log, epochs=50, callbacks=[early_stop])
    opt_hps.append(tuner.get_best_hyperparameters(num_trials=1)[0])
models, epochs = ([] for _ in range(2))
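# Pair each set of hyperparameters with the segment it was tuned on, so
# train_features is not just the last segment left over from the loop above.
for hps, train_features in zip(opt_hps, scaled_train_features):
    model = tuner.hypermodel.build(hps)
    models.append(model)
    history = model.fit(train_features, btc_trt_log, epochs=70, verbose=0)
    rmse = history.history['root_mean_squared_error']
    best_epoch = rmse.index(min(rmse)) + 1
    epochs.append(best_epoch)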
hypermodel = tuner.hypermodel.build(opt_hps[0])
for train_features, epoch in zip(scaled_train_features, epochs): hypermodel.fit(train_features, btc_trt_log, epochs=epoch)
tp1 = hypermodel.predict(btc_tefs1).flatten()
tp2 = hypermodel.predict(btc_tefs2).flatten()
test_predictions = np.concatenate((tp1, tp2[86:]), axis=None)
The hyperparameters of the model are tuned with keras_tuner. Because the notebook raised ResourceExhaustedError when training on the full features dataset, sequentially segmented datasets are used instead (a study using a similar model architecture suggests that training can be done efficiently this way).
The input dimension of each segmented dataset is (111,32,1).
No issues are reported before the last code block, and the models train fine. Yet when .predict() is executed, the notebook prints an error stating that the dimension of the input features used for prediction is incompatible with the dimension of the input features used in training. I do not understand why this happens since, as far as I know, the input dimensions of a training dataset for a DNN model need not be identical to those of a test dataset.
Even though all the price data from 2018 to early 2021 are used as training datasets, predictions are only needed for the mid 2021 timeframe.
The dataset used for prediction has a dimension of (136,32,1).
I tried matching the dimension of this dataset to (111,32,1), through index slicing.
This instead caused issues in the output dimension: while predictions should be made for 136 data points, only 10 were returned.
Is there an issue with the model configuration? I cannot make sense of the current situation.
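One way to read the shapes here: np.expand_dims(features, [0,3]) turns each (111, 32) segment into a single sample of shape (1, 111, 32, 1), so 111 is the fixed number of time steps per sample and only the leading batch axis can vary between training and prediction. Each sample also produces 5 outputs because of RepeatVector(5), which would explain 5 values per predict call and 10 across the two calls. A minimal NumPy sketch, assuming the goal is to feed the 136 test rows as overlapping windows of 111 steps (the random array is a stand-in for the scaled test features):

```python
import numpy as np

WINDOW = 111      # time steps per sample: InputLayer(input_shape=(111, 32, 1))
N_FEATURES = 32

# Stand-in for the scaled test features, shape (136, 32).
test_features = np.random.rand(136, N_FEATURES).astype("float32")

# One window per start position; each window becomes a separate sample.
n_windows = test_features.shape[0] - WINDOW + 1
windows = np.stack([test_features[i:i + WINDOW] for i in range(n_windows)])

# Add the trailing channel axis that the ConvLSTM1D input expects.
batch = windows[..., np.newaxis]

print(batch.shape)
```

A batch shaped like this matches the model's input shape on every axis except the batch axis. Whether overlapping windows are the right framing depends on how the training targets were aligned with each segment, which the question does not show.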
I am working on a modified resnet, and want to insert dropout after activation layers.
I have tried the following but due to the model not being sequential, it did not work:
def add_dropouts(model, probability = 0.5):
    print("Adding Dropouts")
    updated_model = tf.keras.models.Sequential()
    for layer in model.layers:
        print("layer = ", layer)
        updated_model.add(layer)
        if isinstance(layer, tf.keras.layers.Activation):
            updated_model.add(tf.keras.layers.Dropout(probability))
    # summary() prints the table itself, so call it rather than printing the method
    print("updated model summary:")
    updated_model.summary()
    print("model summary:")
    model.summary()
    model = updated_model
    return model
base_model = tf.keras.applications.ResNet50V2(include_top=False, input_shape=input_img_shape, pooling='avg')
base_model = add_dropouts(base_model, probability = 0.5)
Then I tried my own version using the functional API, but this doesn't work either: it raises a ValueError saying that the Tensor doesn't have an output.
prev_layer = base_model.layers[0]
for layer in base_model.layers:
    next_layer = layer(prev_layer.output)
    if isinstance(layer, tf.keras.layers.Activation):
        next_layer = Dropout(0.5)(next_layer.output)
    prev_layer = next_layer
Does anyone know how someone would add dropout layers into resnet or any other pretrained network?
I eventually figured out how to do it, but it's very hacky. Go to:
C:\ProgramData\Anaconda3\envs\<your env name>\Lib\site-packages\tensorflow\python\keras\applications
Open resnet.py. Editing it also changes the ResNetV2 models, because they are built on the original ResNet code. Press Ctrl+F and search for "activation". Wherever you see an activation layer (usually in the form x = Layer(x), since the model is built one layer at a time), add this line after it:
x = Dropout(prob)(x)
Here is an example:
if not preact:
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name='conv1_bn')(x)
    x = layers.Activation('relu', name='conv1_relu')(x)  # insert a layer after each of these
    x = layers.Dropout(prob)(x)  # added dropout
Do this for all similar search results for 'activation'.
Then you will see the dropout added in your model summary.
This question already has an answer here:
InvalidArgumentError: Received a label value of 8825 which is outside the valid range of [0, 8825) SEQ2SEQ model
I'm new to TensorFlow and Python.
I'm trying to build a deep CNN for cell image classification on the Hep-2 dataset. The dataset consists of 13596 images, and I'm using 8701 of them as training data for the CNN. I also have a .CSV file containing each image ID and its cell type. I extracted its content and am using the image_ID from the .CSV file as my labels. Both the training data and the image IDs have been converted with .astype('float32'). However, I'm getting an InvalidArgumentError and have no idea what is going wrong.
I've posted my code and the error below; any tips or help would be highly appreciated. Thank you in advance :)
I'm new to Stack Overflow as well, so sorry for my messy formatting.
My CODE:
from PIL import Image
import glob
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D
from keras.optimizers import SGD
def extract_labels(image_names, Original_Labels):
    temp = np.array([image.split('.')[0] for image in image_names])
    temp2 = np.array([j[0] for i in temp for j in Original_Labels if(int(i) == int(j[0]))])
    return temp2
def get_Labels():
    df=pd.read_csv('gt_training.csv', sep=',')
    labels = np.asarray(df)
    path = 'path..../training/'
    image_names_train = [f for f in os.listdir(path) if os.path.splitext(f)[-1] == '.png']
    return labels, image_names_train
Train_images = glob.glob('path.../training/*.png')
train_data = np.array([np.array(Image.open(fname)) for fname in Train_images])
train_data = train_data.astype('float32')
train_data /= 255
#getting labels from .csv file for training data
labels, image_names_train = get_Labels()
train_labels = extract_labels(image_names_train, labels)
train_labels = train_labels.astype('float32')
print(train_labels.shape)
train_data = train_data.reshape(train_data.shape[0],78,78,1) #reshaping into 4-Dim
input_shape = (78, 78, 1) #1 because the provided dataset is in grey scale
#Adding pooling, dense layers to an an non-optimized empty CNN
model = Sequential()
model.add(Conv2D(6, kernel_size=(7,7),activation = tf.nn.tanh, input_shape = input_shape))
model.add(MaxPooling2D(pool_size = (2, 2)))
model.add(Conv2D(16, kernel_size=(4,4),activation = tf.nn.tanh))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(Conv2D(32, kernel_size=(3,3),activation = tf.nn.tanh))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(Flatten())
model.add(Dense(150, activation = tf.nn.tanh, kernel_regularizer = keras.regularizers.l2(0.00005)))
model.add(Dropout(0.5))
model.add(Dense(6, activation = tf.nn.softmax))
#setting an optimizer with a given loss function
opt = SGD(lr = 0.01, momentum = 0.9)
model.compile(optimizer = opt, loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.fit(x = train_data, y = train_labels, epochs = 10, batch_size = 77)
The error message I got:
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
InvalidArgumentError: Received a label value of 13269 which is outside the valid range of [0, 6). Label values: 8823 3208 9410 5223 8817 3799 6588 1779 1371 5017 9788 9886 3345 1815 5943 37 675 2396 4485 9528 11082 12457 13269 5488 3250 12896 13251 1854 10942 6287 6232 2944
[[node loss_24/dense_55_loss/sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at C:\Users\vardh\Anaconda3\envs\tf\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_676176]
Function call stack:
keras_scratch_graph
I later realized that my question is related to this question: InvalidArgumentError: Received a label value of 8825.....
Solution from that post:
#shaili posted:
In the last layer you used, for example, model.add(Dense(1, activation='softmax')). Here the 1 restricts the valid labels to [0, 1); change it to the number of output labels. For example, if your labels are in the range [0, 7), use model.add(Dense(7, activation='softmax')).
input_text = Input(shape=(max_len,), dtype=tf.string)
embedding = Lambda(ElmoEmbedding, output_shape=(max_len, 1024))(input_text)
x = Bidirectional(LSTM(units=512, return_sequences=True,
recurrent_dropout=0.2, dropout=0.2))(embedding)
x_rnn = Bidirectional(LSTM(units=512, return_sequences=True,
recurrent_dropout=0.2, dropout=0.2))(x)
x = add([x, x_rnn]) # residual connection to the first biLSTM
out = TimeDistributed(Dense(n_tags, activation="softmax"))(x)
Here, in the TimeDistributed layer, n_tags is the number of tags I want to classify into.
If I instead predict some other quantity, such as q_tag, whose number of values differs from n_tags (say 10 where n_tags is 7), and a label of 8 appears, it raises the invalid argument error: Received a label value of 8 which is outside the valid range of [0, 7).
From my experience,
Usually this error occurs because the number of output classes is specified incorrectly. In my code, model.add(Dense(6, activation = tf.nn.softmax)) declared 6 output classes instead of 13596. Using 13596 is not a fully working solution, but at least it gets my code running.
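If the goal is to predict the cell type rather than the image ID, an alternative to enlarging the output layer is to make the labels themselves fall in [0, 6). A minimal sketch, assuming the CSV has a cell-type column next to the image ID; the class names below are hypothetical placeholders, not read from gt_training.csv:

```python
import numpy as np

# Hypothetical cell-type names, one per training image; the real values
# would come from the cell-type column of the CSV.
cell_types = np.array(["homogeneous", "speckled", "nucleolar",
                       "centromere", "golgi", "numem", "speckled"])

# classes: sorted unique names; labels: index of each sample's class.
classes, labels = np.unique(cell_types, return_inverse=True)

num_classes = len(classes)

# sparse_categorical_crossentropy needs integer labels in [0, num_classes).
assert labels.min() >= 0 and labels.max() < num_classes

print(num_classes)
print(labels)
```

With labels built this way, Dense(num_classes, activation='softmax') and the sparse categorical loss agree on the valid label range.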
I'm using Tensorflow 2.0 and a pre-trained VGG16 model. I would like to visualize the activations. Therefore I want to extract them.
Currently I'm doing the following:
model = tf.keras.applications.VGG16(input_shape=(224, 224, 3), weights='imagenet')
model.outputs = [layer.output for layer in model.layers]
model.build(input_shape=(1, 224, 224 ,3))
activations = model(image_data)
However, I'm getting the following error when I try to call the last line:
ValueError: Structure is a scalar but len(flat_sequence) == 23 > 1
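One commonly used alternative to overwriting model.outputs is to build a second tf.keras.Model that shares the same layers and lists every layer output as its outputs. The sketch below is hedged: it uses weights=None and include_top=False only to keep the example light, and a random array as a placeholder for image_data; neither choice comes from the question.

```python
import numpy as np
import tensorflow as tf

# weights=None and include_top=False keep the sketch light; use
# weights='imagenet' (and the default top) to match the question.
base = tf.keras.applications.VGG16(
    input_shape=(224, 224, 3), weights=None, include_top=False)

# Skip the InputLayer at index 0 and collect every other layer's output.
layer_outputs = [layer.output for layer in base.layers[1:]]
activation_model = tf.keras.Model(inputs=base.input, outputs=layer_outputs)

# Placeholder input; replace with a real preprocessed image batch.
image_data = np.random.rand(1, 224, 224, 3).astype("float32")
activations = activation_model(image_data)

for layer, act in zip(base.layers[1:], activations):
    print(layer.name, act.shape)
```

The call returns one tensor per layer, in layer order, which can then be plotted channel by channel.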
Would this approach (https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/) work for TF 2.0?
from keras.applications.vgg16 import VGG16
from matplotlib import pyplot
# load the model
model = VGG16()
# retrieve weights from the second hidden layer
filters, biases = model.layers[1].get_weights()
# normalize filter values to 0-1 so we can visualize them
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)