I have a two-layer LSTM network (config.n_input is 3, config.n_steps is 5).
I think this may be related to the shape of my inputs, but I'm not sure how to fix it. I tried changing the projection of the LSTM cells so that they would have the same input size, but that didn't work.
self.input_data = tf.placeholder(tf.float32, [None, config.n_steps, config.n_input], name='input')
# Tensorflow LSTM cell requires 2x n_hidden length (state & cell)
self.initial_state = tf.placeholder(tf.float32, [None, 2*config.n_hidden], name='state')
self.targets = tf.placeholder(tf.float32, [None, config.n_classes], name='target')
_X = tf.transpose(self.input_data, [1, 0, 2]) # permute n_steps and batch_size
_X = tf.reshape(_X, [-1, config.n_input]) # (n_steps*batch_size, n_input)
input_cell = rnn_cell.LSTMCell(num_units=config.n_hidden, input_size=3, num_proj=300, forget_bias=1.0)
print(input_cell.output_size)
inner_cell = rnn_cell.LSTMCell(num_units=config.n_hidden, input_size=300)
cells = [input_cell, inner_cell]
cell = rnn.rnn_cell.MultiRNNCell(cells)
It returns the following error when I attempt to run it:
tensorflow.python.pywrap_tensorflow.StatusNotOK: Invalid argument: Expected size[1] in [0, 0], but got 600
[[Node: RNN/MultiRNNCell/Cell1/Slice = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_state_0/_3, RNN/MultiRNNCell/Cell1/Slice/begin, RNN/MultiRNNCell/Cell1/Slice/size)]]
Is there a better explanation of this error message? Or is there an easy way to fix it?
Add num_proj to your initial state:
# Tensorflow LSTM cell requires 2x n_hidden length (state & cell)
self.initial_state = tf.placeholder(tf.float32, [None, 2*config.n_hidden + 300], name='state')
This is quite an opaque error, and it might be a good idea for you to raise it on the TF GitHub issues page!
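As a rough check on the width, the state sizes can be added up by hand. The sketch below is plain Python, not TensorFlow, and it assumes the legacy concatenated-state layout: a projected LSTM cell stores num_units + num_proj values, an unprojected one stores 2 * num_units, and a stacked cell places the per-layer states side by side. n_hidden = 300 is also an assumption, inferred from the "got 600" in the error. If those assumptions hold, the stacked cell needs a wider placeholder than a single layer does, so reading the width from cell.state_size is probably safer than hard-coding it.

```python
# Sketch of the legacy concatenated-state arithmetic (an assumption, not TF code).
def lstm_state_size(num_units, num_proj=None):
    """Cell state (num_units) plus output (num_proj if projecting, else num_units)."""
    output_size = num_proj if num_proj is not None else num_units
    return num_units + output_size


def stacked_state_size(cell_sizes):
    """A stacked cell keeps one state slice per layer, side by side."""
    return sum(cell_sizes)


n_hidden = 300  # assumed value, inferred from "got 600" in the error message
input_cell_size = lstm_state_size(n_hidden, num_proj=300)
inner_cell_size = lstm_state_size(n_hidden)
total = stacked_state_size([input_cell_size, inner_cell_size])
print(input_cell_size, inner_cell_size, total)
```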
I am working with an LSTM in TensorFlow. The code runs fine until I call tf.nn.dynamic_rnn(lstmCell, data, dtype=tf.float64), which raises a ValueError.
import numpy as np
import tensorflow as tf
wordsList = np.load('urduwords.npy')
wordVectors = np.load('urduwordsMatrix.npy')
batchSize = 24
lstmUnits = 64
numClasses = 2
iterations = 10000
tf.reset_default_graph()
labels = tf.placeholder(tf.float32, [batchSize, numClasses])
input_data = tf.placeholder(tf.int32, [batchSize, maxSeqLength])
print(labels)
data = tf.Variable(tf.zeros([batchSize, maxSeqLength, numDimensions]),dtype=tf.float32)
print(data)
data = tf.nn.embedding_lookup(wordVectors,input_data)
print(data)
lstmCell = tf.contrib.rnn.BasicLSTMCell(lstmUnits)
lstmCell = tf.contrib.rnn.DropoutWrapper(cell=lstmCell, output_keep_prob=0.1)
value, _ = tf.nn.dynamic_rnn(lstmCell, data, dtype=tf.float64)
How can I resolve this error in TensorFlow?
ValueError: Input 0 of layer basic_lstm_cell_1 is incompatible with the layer: expected ndim=2, found ndim=3. Full shape received: [24, 1, 2]
The shape of input_data is
(24, 30, 1, 2)
and the shape of wordVectors is
(24053, 1, 2)
The shape is 4-dimensional because you are feeding the wrong type of data to TensorFlow.
Please try using a NumPy array or a list instead.
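Given the shapes quoted in the question, one likely cause is the singleton axis in the embedding matrix: a lookup returns the shape of the ids followed by the shape of one embedding row. The NumPy sketch below uses zero-filled stand-ins with the reported shapes (not the real urduwordsMatrix.npy) and shows how dropping that axis gives the 3-D input an RNN expects. NumPy integer indexing is used here as an analogue of the lookup, which is an assumption of the sketch.

```python
import numpy as np

batch_size, max_seq_length = 24, 30

# Stand-in for urduwordsMatrix.npy with the shape reported in the question.
word_vectors = np.zeros((24053, 1, 2), dtype=np.float32)
token_ids = np.zeros((batch_size, max_seq_length), dtype=np.int32)

# Integer indexing on axis 0 mirrors an embedding lookup:
# the result has shape ids.shape + params.shape[1:].
looked_up = word_vectors[token_ids]
print(looked_up.shape)  # (24, 30, 1, 2)

# Dropping the singleton axis leaves one flat vector per word.
flat_vectors = word_vectors.reshape(word_vectors.shape[0], -1)
fixed = flat_vectors[token_ids]
print(fixed.shape)  # (24, 30, 2)
```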
I'm trying to learn tensorflow and I'm getting the following error:
logits and labels must be broadcastable: logits_size=[32,1] labels_size=[16,1]
The code runs fine when I use this as input:
self.input = np.ones((500, 784))
self.y = np.ones((500, 1))
However, when I add an extra dimension, the error is thrown:
self.input = np.ones((500, 2, 784))
self.y = np.ones((500, 1))
The code to build the graph:
self.x = tf.placeholder(tf.float32, shape=[None] + self.config.state_size)
self.y = tf.placeholder(tf.float32, shape=[None, 1])
# network architecture
d1 = tf.layers.dense(self.x, 512, activation=tf.nn.relu, name="dense1")
d2 = tf.layers.dense(d1, 1, name="dense2")
with tf.name_scope("loss"):
    self.cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=self.y, logits=d2))
    self.train_step = tf.train.AdamOptimizer(self.config.learning_rate).minimize(self.cross_entropy,
                                                                                 global_step=self.global_step_tensor)
correct_prediction = tf.equal(tf.argmax(d2, 1), tf.argmax(self.y, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Could someone explain to me why this is happening and how I can fix it?
logits is the name typically given to the output of the network; these are your predictions. A size of [32, 10] tells me that you have a batch size of 32 and 10 outputs, as is common with MNIST, which you appear to be working with.
Your labels are sized [16, 10], which is to say you're providing 16 label vectors of size 10. The number of labels you're providing conflicts with the output of the network; the two should be the same.
I'm not quite clear what you're doing with the extra dimension in the input, but I guess you must be accidentally doubling the samples in some way. Perhaps the [500, 2, 784] shape is being reshaped to [1000, 784] automatically somewhere along the way, which then does not match the 500 labels. Also, your self.y should be shaped [500, 10], not [500, 1]; your labels need to be in one-hot encoding format. E.g. a single label of shape [1, 10] for digit 3 would be [[0,0,0,1,0,0,0,0,0,0]], not the digit itself, e.g. [3], as you seem to have it set up in your sanity test here.
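The doubling can be reproduced with shapes alone. The NumPy sketch below assumes a batch of 16 (which the sizes in the error message suggest) and uses a matrix product as a stand-in for the final dense layer. It shows where 32 rows come from, one way to keep a single row per label, and how digit labels could be one-hot encoded for a 10-class problem.

```python
import numpy as np

batch, extra, features = 16, 2, 784
x = np.ones((batch, extra, features))
w = np.ones((features, 1))  # stand-in for the final dense layer's weights

# A dense layer acts on the last axis only, so the extra axis survives.
logits = x @ w
print(logits.shape)  # (16, 2, 1)

# Flattening everything but the last axis yields 32 rows for 16 labels.
flat_logits = logits.reshape(-1, logits.shape[-1])
print(flat_logits.shape)  # (32, 1)

# Merging the extra axis into the features keeps one row per label.
x_flat = x.reshape(batch, -1)
print(x_flat.shape)  # (16, 1568)

# One-hot encoding of digit labels for a 10-class problem.
digits = np.array([3, 0, 9])
one_hot = np.eye(10)[digits]
print(one_hot.shape)  # (3, 10)
```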
I am trying to develop a seq2seq model from a low-level perspective (creating all the tensors needed myself). I am trying to feed the model a sequence of vectors as a two-dimensional tensor; however, I can't iterate over one dimension of the tensor to extract the vectors one by one. Does anyone know how I can feed a batch of vectors and later get them one by one?
This is my code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
gru_layer1 = GRU(input_array, input_dim, hidden_dim) #This is a class created by myself
for i in range(input_array.shape[-1]):
    word = input_array[:,i]
    previous_state = gru_encoder.h_t
    gru_layer1.forward_pass(previous_state,word)
And this is the error that I get
TypeError: Expected binary or unicode string, got <tf.Tensor 'input_7:0' shape=(10, ?) dtype=float64>
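TensorFlow does deferred execution, so a placeholder holds no values that np.asarray could convert or that a Python loop could iterate over until the graph is actually run.
You usually can't know how big the vector will be (words in a sentence, audio samples, etc.). The common thing to do is to cap it at some reasonably large value and then pad the shorter sequences with an empty token.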
Once you do this you can select the data for a time slice with the slice operator:
data = tf.placeholder(tf.float32, shape=(batch_size, max_size, number_of_inputs))
....
for i in range(max_size):
    time_data = data[:, i, :]
    DoStuff(time_data)
Also look up tf.transpose for swapping the batch and time indices. It can help with performance in certain cases.
Alternatively consider something like tf.nn.static_rnn or tf.nn.dynamic_rnn to do the boilerplate stuff for you.
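The pad-then-slice idea can be tried outside the graph first. This is a minimal NumPy sketch, not TensorFlow code; pad_batch and the toy sequences are hypothetical names made up for illustration, and zero is assumed to be an acceptable padding value.

```python
import numpy as np


def pad_batch(sequences, max_size, pad_value=0.0):
    """Stack variable-length (steps, features) arrays into (batch, max_size, features)."""
    num_features = sequences[0].shape[1]
    batch = np.full((len(sequences), max_size, num_features), pad_value)
    for row, seq in enumerate(sequences):
        steps = min(len(seq), max_size)  # truncate anything longer than the cap
        batch[row, :steps] = seq[:steps]
    return batch


# Three toy sequences of 3, 5 and 2 steps with 4 features each.
sequences = [np.ones((3, 4)), np.ones((5, 4)), np.ones((2, 4))]
data = pad_batch(sequences, max_size=5)
print(data.shape)  # (3, 5, 4)

for i in range(data.shape[1]):
    time_data = data[:, i, :]  # one (batch, features) slice per time step
    print(i, time_data.sum())
```

In a real model the padded steps would normally be masked or ignored (for example by passing sequence lengths), since otherwise the padding contributes to the result.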
Finally I found an approach that solves my problem. It worked using tf.scan() instead of a loop, which doesn't require the input tensor to have a defined size in the second dimension. Consequently, you have to prepare the input tensor beforehand so that tf.scan() iterates over it the way you want. In my case this is the code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
x_t = tf.transpose(input_array, [1, 0], name='x_t')
h_0 = tf.convert_to_tensor(h_0, dtype=tf.float64)
h_t_transposed = tf.scan(forward_pass, x_t, h_0, name='h_t_transposed')
h_t = tf.transpose(h_t_transposed, [1, 0], name='h_t')
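To make the scan pattern concrete, here is a small NumPy emulation: the function is applied along the first axis, each call receives the previous result, and every intermediate result is kept. This is only a sketch of the idea, not the tf.scan implementation; the weights W and U and the tanh recurrence are hypothetical stand-ins for the GRU step, and embedding_dim = 10 is assumed from the shape in the error message.

```python
import numpy as np


def scan(fn, elems, initializer):
    """Apply fn(previous_result, elem) along axis 0 and keep every result."""
    state = initializer
    outputs = []
    for elem in elems:
        state = fn(state, elem)
        outputs.append(state)
    return np.stack(outputs)


hidden_dim, embedding_dim, num_words = 5, 10, 7
rng = np.random.default_rng(0)
W = rng.normal(size=(hidden_dim, embedding_dim))  # hypothetical input weights
U = rng.normal(size=(hidden_dim, hidden_dim))     # hypothetical recurrent weights


def forward_pass(h_prev, x):
    # Placeholder recurrence standing in for the GRU step.
    return np.tanh(W @ x + U @ h_prev)


input_sentence = rng.normal(size=(embedding_dim, num_words))
x_t = input_sentence.T  # (num_words, embedding_dim): time axis first
h_0 = np.zeros(hidden_dim)
h_t = scan(forward_pass, x_t, h_0).T  # back to (hidden_dim, num_words)
print(h_t.shape)  # (5, 7)
```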
I'm a beginner in deep learning and have taken a few courses on Udacity. Recently I have been trying to build a deep network that detects hand joints in input depth images, but it doesn't seem to be working well. (My dataset is the ICVL Hand Posture Dataset.)
The network structure is shown here.
① A batch of input images, 240x320;
② An 8-channel convolutional layer with a 5x5 kernel;
③ A max pooling layer, ksize = stride = 2;
④ A fully-connected layer, weight.shape = [38400, 1024];
⑤ A fully-connected layer, weight.shape = [1024, 48].
After several epochs of training, the output of the last layer converges to a (0, 0, ..., 0) vector. I chose the mean squared error as the loss function; its value stayed above 40000 and didn't seem to decrease.
The network structure is already too simple to be simplified again but the problem remains. Could anyone offer any suggestions?
My main code is posted below:
image = tf.placeholder(tf.float32, [None, 240, 320, 1])
annotations = tf.placeholder(tf.float32, [None, 48])
W_convolution_layer1 = tf.Variable(tf.truncated_normal([5, 5, 1, 8], stddev=0.1))
b_convolution_layer1 = tf.Variable(tf.constant(0.1, shape=[8]))
h_convolution_layer1 = tf.nn.relu(
tf.nn.conv2d(image, W_convolution_layer1, [1, 1, 1, 1], 'SAME') + b_convolution_layer1)
h_pooling_layer1 = tf.nn.max_pool(h_convolution_layer1, [1, 2, 2, 1], [1, 2, 2, 1], 'SAME')
W_fully_connected_layer1 = tf.Variable(tf.truncated_normal([120 * 160 * 8, 1024], stddev=0.1))
b_fully_connected_layer1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_pooling_flat = tf.reshape(h_pooling_layer1, [-1, 120 * 160 * 8])
h_fully_connected_layer1 = tf.nn.relu(
tf.matmul(h_pooling_flat, W_fully_connected_layer1) + b_fully_connected_layer1)
W_fully_connected_layer2 = tf.Variable(tf.truncated_normal([1024, 48], stddev=0.1))
b_fully_connected_layer2 = tf.Variable(tf.constant(0.1, shape=[48]))
detection = tf.nn.relu(
tf.matmul(h_fully_connected_layer1, W_fully_connected_layer2) + b_fully_connected_layer2)
mean_squared_error = tf.reduce_sum(tf.losses.mean_squared_error(annotations, detection))
training = tf.train.AdamOptimizer(1e-4).minimize(mean_squared_error)
# This data loader reads images and annotations and converts them into batches of numbers.
loader = ICVLDataLoader('../data/')
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for i in range(1000):
        # batch_images: a list with shape = [BATCH_SIZE, 240, 320, 1]
        # batch_annotations: a list with shape = [BATCH_SIZE, 48]
        [batch_images, batch_annotations] = loader.get_batch(100).to_1d_list()
        # The placeholder defined above is named image (x_image and images are not defined).
        [x_, t_, l_, p_] = session.run([image, training, mean_squared_error, detection],
                                       feed_dict={image: batch_images, annotations: batch_annotations})
And it runs like this.
The main issue is likely the relu activation in the output layer. You should remove this, i.e. let detection simply be the results of a matrix multiplication. If you want to force the outputs to be positive, consider something like the exponential function instead.
While relu is a popular hidden activation, I see one major problem with using it as an output activation: as is well known, relu maps negative inputs to 0. Crucially, however, the gradients there are also 0. When this happens in the output layer, your network cannot learn from its mistakes when it produces outputs < 0 (which is likely to happen with random initializations). This will likely heavily impair the overall learning process.
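The zero-gradient effect is easy to see numerically. The sketch below uses made-up pre-activations and targets and a hand-written squared-error gradient (an illustration, not the question's network): wherever the pre-activation is negative, the relu output layer passes back no gradient at all, while a plain linear output still does.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def relu_grad(z):
    # Derivative of relu with respect to its input: 1 where z > 0, else 0.
    return (z > 0).astype(z.dtype)


pre_activation = np.array([-2.0, -0.5, 1.5])  # hypothetical output-layer values
target = np.array([10.0, 20.0, 30.0])         # hypothetical annotations

output = relu(pre_activation)
# Gradient of the squared error with respect to the pre-activation (chain rule).
grad_relu_output = 2.0 * (output - target) * relu_grad(pre_activation)
grad_linear_output = 2.0 * (pre_activation - target)

print(grad_relu_output)    # zero wherever the pre-activation is negative
print(grad_linear_output)  # non-zero everywhere
```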
I just browsed through Stack Overflow and other forums but couldn't find anything helpful for my problem. But it seems related to this question.
I currently have a trained model of Tensorflow (128 inputs and 11 outputs) which I saved, following the MNIST tutorial by Tensorflow.
It seemed to be successful and I have a model in this folder now with the 3 files (.meta, .ckpt.data and .index). However, I want to restore it and use it for predictions:
#encoding[0] => numpy ndarray (128, ) # anyway a list with only one entry
#unknowndata = np.array(encoding[0])[None]
unknowndata = np.expand_dims(encoding[0], axis=0)
print(unknowndata.shape) # Output (1, 128)
# Restore pre-trained tf model
with tf.Session() as sess:
    #saver.restore(sess, "models/model_1.ckpt")
    saver = tf.train.import_meta_graph('models/model_1.ckpt.meta')
    saver.restore(sess,tf.train.latest_checkpoint('models/./'))
    y = tf.get_collection('final tensor') # tf.nn.softmax(tf.matmul(y2, W3) + b3)
    X = tf.get_collection('input') # tf.placeholder(tf.float32, [None, 128])
    # W1 = tf.get_collection('vars')[0]
    # b1 = tf.get_collection('vars')[1]
    # W2 = tf.get_collection('vars')[2]
    # b2 = tf.get_collection('vars')[3]
    # W3 = tf.get_collection('vars')[4]
    # b3 = tf.get_collection('vars')[5]
    # y1 = tf.nn.relu(tf.matmul(X, W1) + b1)
    # y2 = tf.nn.relu(tf.matmul(y1, W2) + b2)
    # yLog = tf.matmul(y2, W3) + b3
    # y = tf.nn.softmax(yLog)
    prediction = tf.argmax(y, 1)
    print(sess.run(prediction, feed_dict={i: d for i,d in zip(X, unknowndata.T)}))
    # also had sess.run(prediction, feed_dict={X: unknowndata.T}) and also not transposed, still errors
    # Output: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] # one should be 1 obviously with a specific percentage
But there I only run into problems:
ValueError: Cannot feed value of shape (1,) for Tensor 'x:0', which has shape '(?, 128)'
Although I print the shape of 'unknowndata' and it matches (1, 128).
I also tried it with
sess.run(prediction, feed_dict={X: unknownData}) # with transposed etc. but nothing worked for me; there I got the other error
TypeError: unhashable type: 'list'
I only want some predictions of this beautiful Tensorflow trained model.
I figured out the problem!
First I need to restore all of the values (weights and biases) and matmul them separately.
Second I need to create the same input as in the trained model, in my case:
X = tf.placeholder(tf.float32, [None, 128])
and then just call the prediction:
sess.run(prediction, feed_dict={X: unknownData})
However, I do not get a percentage distribution, which I would expect from the softmax function. Does anybody know how to access it?
The prediction tensor is obtained by an argmax on y. Instead of returning only prediction, you can add y to your output feed when you execute sess.run.
output_feed = [prediction, y]
preds, probs = sess.run(output_feed, feed_dict={X: unknownData})
preds will have the predictions of the model and probs will have the probability scores.
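The relationship between the two fetched values can be illustrated without a session. In this NumPy sketch the logits are made-up numbers (an assumption for illustration only): the softmax output is the probability distribution, and the prediction is just the index of its largest entry.

```python
import numpy as np


def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


logits = np.array([[0.1, 2.0, -1.0, 0.5]])  # hypothetical network output, batch of 1
probs = softmax(logits)       # one probability per class, summing to 1
preds = probs.argmax(axis=1)  # index of the most likely class

print(preds)
print(probs)
```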
First, when you save, you have to add the placeholders you need to a collection, e.g. tf.add_to_collection('i', i), and then retrieve them and pass them to the feed_dict.
In your example the placeholder is "i":
i = tf.get_collection('i')[0]
#sess.run(prediction, feed_dict={i: d for i,d in zip(X, unknowndata.T)})