I have created a TensorFlow model which uses the Dataset API to feed data into the network.
After the training phase, I would like to restore this model and perform inference with it once in a while.
Currently I am re-initializing the dataset iterator each time, but I'm wondering if there is an alternative way.
Moreover, at training time my dataset contains x and y data, while at prediction time I only have x. As a temporary solution I am providing a fake y, but again, this does not seem like the best solution.
Here is pseudocode of what I'm doing:
#### NETWORK
input_x = tf.placeholder(tf.int32, [None, None], name="input_x")
input_y = tf.placeholder(tf.int32, [None, 2], name="input_y")
dataset = tf.data.Dataset.from_tensor_slices((input_x, input_y))
iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
x_data, y_data = iterator.get_next()
output = tf.identity(x_data, name='output')  # stand-in for the real network output
.....
### INFERENCE
while True:
    x = new_input
    x_operation = session.graph.get_operation_by_name("input_x").outputs[0]
    y_operation = session.graph.get_operation_by_name("input_y").outputs[0]
    dataset_operation = session.graph.get_operation_by_name("dataset_init")
    output_operation = session.graph.get_operation_by_name("output").outputs[0]
    fake_y = np.array([[0, 0]])
    dic = {x_operation: x, y_operation: fake_y}
    session.run(dataset_operation, feed_dict=dic)
    prediction = session.run(output_operation)
Thank you for your help.
Problem
I'm trying to classify some 64x64 images as a black-box exercise. The NN I have written doesn't change my weights. This is my first time writing something like this; the same code works just fine on MNIST input, but on this data it does not train like it should:
import tensorflow as tf
import numpy as np

path = ""
# x is a placeholder for the 64x64 image, flattened to 4096 values
x = tf.placeholder(tf.float32, shape=[None, 4096])
# y_ is a 1-element vector, containing the predicted probability of the label
y_ = tf.placeholder(tf.float32, [None, 1])
# define weights and biases
W = tf.Variable(tf.zeros([4096, 1]))
b = tf.Variable(tf.zeros([1]))
# define our model
y = tf.nn.softmax(tf.matmul(x, W) + b)
# loss is cross entropy
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
# each training step in gradient descent we want to minimize cross entropy
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
train_labels = np.reshape(np.genfromtxt(path + "train_labels.csv", delimiter=',', skip_header=1), (14999, 1))
train_data = np.genfromtxt(path + "train_samples.csv", delimiter=',', skip_header=1)
# perform 150 training steps, each taking 100 training samples
for i in range(0, 15000, 100):
    sess.run(train_step, feed_dict={x: train_data[i:i+100], y_: train_labels[i:i+100]})
    if i % 500 == 0:
        print(sess.run(cross_entropy, feed_dict={x: train_data[i:i+100], y_: train_labels[i:i+100]}))
print(sess.run(b), sess.run(W))
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.close()
How do I solve this problem?
The key to the problem is that the number of classes in your output y_ and y is 1. You should use one-hot encoding when you use tf.nn.softmax_cross_entropy_with_logits for classification problems in TensorFlow. tf.nn.softmax_cross_entropy_with_logits first computes tf.nn.softmax, and when there is only one class the softmax result is always the same (always 1). For example:
import tensorflow as tf

y = tf.constant([[1], [0], [1]], dtype=tf.float32)
y_ = tf.constant([[1], [2], [3]], dtype=tf.float32)
softmax_var = tf.nn.softmax(logits=y_)
cross_entropy = tf.multiply(y, tf.log(softmax_var))
errors = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
with tf.Session() as sess:
    print(sess.run(softmax_var))
    print(sess.run(cross_entropy))
    print(sess.run(errors))
[[1.]
[1.]
[1.]]
[[0.]
[0.]
[0.]]
[0. 0. 0.]
This means that no matter what your output y_ is, your loss will be zero, so your weights and biases are never updated.
The solution is to modify the number of classes of y_ and y.
Suppose your number of classes is n.
First approach: convert your data to one-hot before feeding it, then use the following code.
y_ = tf.placeholder(tf.float32, [None, n])
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))
Second approach: convert the data to one-hot inside the graph, after feeding it.
y_ = tf.placeholder(tf.int32, [None, 1])
y_ = tf.one_hot(tf.squeeze(y_, axis=1), n)  # y_ needs an integer dtype such as tf.int32
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))
All your initial weights are zeros. When you initialize them that way, the NN doesn't learn well. You need to initialize the weights with small random values.
See Why should weights of Neural Networks be initialized to random numbers?
"Why Not Set Weights to Zero?
We can use the same set of weights each time we train the network; for example, you could use the values of 0.0 for all weights.
In this case, the equations of the learning algorithm would fail to make any changes to the network weights, and the model will be stuck. It is important to note that the bias weight in each neuron is set to zero by default, not a small random value.
"
See https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/
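As a minimal sketch of that suggestion (assuming the same 4096-input layer from the question, and an assumed class count n that is not specified in the original post), the zero initialization could be replaced with small random values like this:
import tensorflow as tf

n = 10  # assumed number of classes, for illustration only
# small random initial weights break the symmetry between units so their gradients differ
W = tf.Variable(tf.truncated_normal([4096, n], stddev=0.1))
# the biases can stay at zero
b = tf.Variable(tf.zeros([n]))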
I wonder whether there is any way to connect multiple NNs in series in TensorFlow.
For example, feed input features to a DNN structure, and use its resulting values as the input data of an RNN structure.
Example code:
import tensorflow as tf
import numpy as np
a = 50 #batch_size
b = 60 #sequence in RNN
c = 40 #features
d = 6 #label classes
rnn_size = b
x_data = np.random.rand(a,b,c)
y_data = np.random.randint(0,high=d,size=[a,1])
tf.reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None,b,c])
Y = tf.placeholder(tf.float32, shape=[None,d])
X = tf.transpose(X, (1,0,2))
X = tf.reshape(X, (-1,c))
X = tf.split(X, b)
hidden_units = [40,20,10]
# DNN Structure
dnn = []
for i in range(len(hidden_units)):
    if i == 0:
        T = X
    else:
        T = dnn[-1]
    dnn.append(tf.layers.dense(T, hidden_units[i], activation=tf.nn.relu, kernel_initializer=tf.contrib.layers.xavier_initializer()))
# RNN Structure
rnn = {'w': tf.Variable(tf.random_normal([rnn_size, d], stddev = 0.01), dtype=tf.float32),
'b': tf.Variable(tf.random_normal([d], stddev = 0.01), dtype=tf.float32)}
cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
outputs, states = tf.contrib.rnn.static_rnn(cell, dnn[-1], dtype=tf.float32)
output = tf.matmul(outputs[-1], rnn['w'])+rnn['b']
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y,logits=output))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
correct = tf.equal(tf.argmax(output,1),tf.argmax(cost,1))
acc = tf.reduce_mean(tf.cast(correct, tf.float32))
# Run Session
sess = tf.Session()
sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
_, c = sess.run([optimizer, cost], feed_dict={X: x_data, Y: tf.Session().run(tf.one_hot(y_data, d))})
print('Accuracy: ', sess.run(acc, feed_dict={X: x_data, Y: tf.Session().run(tf.one_hot(y_data, d))}))
When I run this code, an error is raised:
File "C:\Anaconda3\Lib\site-packages\tensorflow\python\layers\core.py", line 250, in dense
dtype=inputs.dtype.base_dtype,
AttributeError: 'list' object has no attribute 'dtype'
It seems to be related to the type of 'dnn[-1]'.
Is there a connecting function, or some way of controlling the data type, for chaining the neural networks together?
I've solved the problem, finally.
The reason for the error was a bit ambiguous, but X is turned into a Python 'list' after running 'tf.split'.
I then generated a list of DNN structures, one per element of the sequence, as follows:
seq = []
for i in range(b):
    # build the DNN structure for the i-th tensor of the split
    seq.append(<DNN structure applied to X[i]>)
After tuning some of the code, the whole thing worked well (a concrete sketch of the idea follows below).
Thanks for your attention :)
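As a concrete sketch of that idea (my own illustration plugged into the question's script; sharing the dense-layer weights across timesteps via reuse is an assumption, not something stated in the answer), each tensor returned by tf.split gets its own pass through the DNN before the list is handed to static_rnn:
# X is the list returned by tf.split(X, b); hidden_units and rnn_size come from the question
seq = []
for t, x_t in enumerate(X):
    h = x_t
    for j, units in enumerate(hidden_units):
        # name + reuse share one set of dense weights across all timesteps
        h = tf.layers.dense(h, units, activation=tf.nn.relu,
                            name='dnn_layer_%d' % j, reuse=(t > 0))
    seq.append(h)

cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
outputs, states = tf.contrib.rnn.static_rnn(cell, seq, dtype=tf.float32)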
I have a large .npz NumPy training file that I want to read more efficiently. I tried to follow the approach from the TensorFlow documentation (https://www.tensorflow.org/guide/datasets#consuming_numpy_arrays):
As an alternative, you can define the Dataset in terms of
tf.placeholder() tensors, and feed the NumPy arrays when you
initialize an Iterator over the dataset.
However, after implementing the iterator, my model consumes even 2x more memory. Do you have any clue what might be wrong here?
def model(batch_size):
    x = tf.placeholder(tf.float32, [None, IMGSIZE, IMGSIZE, 1])
    y = tf.placeholder(tf.float32, [None, n_landmark * 2])
    z = tf.placeholder(tf.int32, [None, ])

    dataset = tf.data.Dataset.from_tensor_slices((x, y, z)).batch(batch_size)
    iter_ = dataset.make_initializable_iterator()
    InputImage, GroundTruth, GroundTruth_Em = iter_.get_next()

    Ret_dict['x'] = x
    Ret_dict['y'] = y
    Ret_dict['z'] = z
    Ret_dict['iterator'] = iter_

    Conv1a = tf.layers.conv2d(InputImage, 64, 3, 1, ..)
    (...)

def main():
    trainSet = np.load(args.datasetDir)
    Xtrain = trainSet['Image']
    Ytrain = trainSet['Label_1']
    Ytrain_em = trainSet['Label_2']

    with tf.Session() as sess:
        my_model = model(BATCH_SIZE)
        Saver = tf.train.Saver()
        Saver.restore(sess, args.pretrainedModel)

        sess.run(
            [my_model['Optimizer'], my_model['iterator'].initializer],
            feed_dict={my_model['x']: Xtrain,
                       my_model['y']: Ytrain,
                       my_model['z']: Ytrain_em})
I'm working on developing a CNN with the Cifar-10 dataset and to feed the data to the network, I am using the Dataset API to use feedable iterators with the handle placeholders: https://www.tensorflow.org/programmers_guide/datasets#creating_an_iterator. Personally I really like this method because it provides a clear and simple way to feed data to the network and switch between my testing and validation sets. However, when I save the graph at the end of training, the .meta file created is as large as the testing data I started with. I am using these operations to provide access later to the input placeholders and output operators:
tf.get_collection("validation_nodes")
tf.add_to_collection("validation_nodes", input_data)
tf.add_to_collection("validation_nodes", input_labels)
tf.add_to_collection("validation_nodes", predict)
And then use the following to save the graph:
Before training:
saver = tf.train.Saver()
After training:
save_path = saver.save(sess, "./my_model")
Is there a way to prevent TensorFlow from storing all the data in the graph? Thanks in advance!
You're creating a tf.constant for the dataset which is why it's added to the graph definition. The solution is to use an initializable iterator and define a placeholder. The first thing you do before you start running operations against the graph is to feed it the dataset. See the programmers guide under the "creating an iterator" section for an example.
https://www.tensorflow.org/programmers_guide/datasets#creating_an_iterator
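For reference, a minimal sketch of that pattern (the names, shapes, and dummy data below are illustrative, not taken from the question); the NumPy arrays are fed only when the iterator is initialized, so they never become constants in the GraphDef that ends up in the .meta file:
import numpy as np
import tensorflow as tf

train_images = np.zeros([1000, 32, 32, 3], np.float32)  # stand-in data for the example
train_labels = np.zeros([1000], np.int64)

# placeholders keep the raw data out of the serialized graph
images_ph = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
labels_ph = tf.placeholder(tf.int64, shape=[None])

dataset = tf.data.Dataset.from_tensor_slices((images_ph, labels_ph)).batch(128)
iterator = dataset.make_initializable_iterator()
next_images, next_labels = iterator.get_next()

with tf.Session() as sess:
    # the arrays are supplied here, at initialization time, not at graph-construction time
    sess.run(iterator.initializer,
             feed_dict={images_ph: train_images, labels_ph: train_labels})
    print(sess.run(next_labels).shape)  # (128,)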
I do exactly the same, so here is a copy/paste of the relevant parts of code that I use to achieve exactly your description (train/test sets of cifar10 using an initializable iterator):
def build_datasets(self):
    """ Creates a train_iterator and test_iterator from the two datasets. """
    self.imgs_4d_uint8_placeholder = tf.placeholder(tf.uint8, [None, 32, 32, 3], 'load_images_placeholder')
    self.imgs_4d_float32_placeholder = tf.placeholder(tf.float32, [None, 32, 32, 3], 'load_images_float32_placeholder')
    self.labels_1d_uint8_placeholder = tf.placeholder(tf.uint8, [None], 'load_labels_placeholder')

    self.load_data_train = tf.data.Dataset.from_tensor_slices({
        'data': self.imgs_4d_uint8_placeholder,
        'labels': self.labels_1d_uint8_placeholder
    })
    self.load_data_test = tf.data.Dataset.from_tensor_slices({
        'data': self.imgs_4d_uint8_placeholder,
        'labels': self.labels_1d_uint8_placeholder
    })
    self.load_data_adversarial = tf.data.Dataset.from_tensor_slices({
        'data': self.imgs_4d_float32_placeholder,
        'labels': self.labels_1d_uint8_placeholder
    })

    # Train dataset pipeline
    dataset_train = self.load_data_train
    dataset_train = dataset_train.shuffle(buffer_size=50000)
    dataset_train = dataset_train.repeat()
    dataset_train = dataset_train.map(self._img_augmentation, num_parallel_calls=8)
    dataset_train = dataset_train.map(self._img_preprocessing, num_parallel_calls=8)
    dataset_train = dataset_train.batch(self.hyperparams['batch_size'])
    dataset_train = dataset_train.prefetch(2)
    self.iterator_train = dataset_train.make_initializable_iterator()

    # Test dataset pipeline
    dataset_test = self.load_data_test
    dataset_test = dataset_test.map(self._img_preprocessing, num_parallel_calls=8)
    dataset_test = dataset_test.batch(self.hyperparams['batch_size'])
    self.iterator_test = dataset_test.make_initializable_iterator()

def init(self, sess):
    self.cifar10 = Cifar10()  # a class I wrote for loading cifar10
    self.handle_train = sess.run(self.iterator_train.string_handle())
    self.handle_test = sess.run(self.iterator_test.string_handle())
    sess.run(self.iterator_train.initializer,
             feed_dict={self.handle: self.handle_train,
                        self.imgs_4d_uint8_placeholder: self.cifar10.train_data,
                        self.labels_1d_uint8_placeholder: self.cifar10.train_labels})
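The snippet above refers to self.handle, the string-handle placeholder of a feedable iterator, which is not included in the copied code. A self-contained sketch of that feedable-iterator plumbing (my own minimal example with dummy data, not the answerer's code) could look like this:
import numpy as np
import tensorflow as tf

# two placeholder-fed datasets, as in the answer above
images = tf.placeholder(tf.uint8, [None, 32, 32, 3])
labels = tf.placeholder(tf.uint8, [None])
train_ds = tf.data.Dataset.from_tensor_slices({'data': images, 'labels': labels}).batch(64)
test_ds = tf.data.Dataset.from_tensor_slices({'data': images, 'labels': labels}).batch(64)

# the feedable-iterator handle that the snippet calls self.handle
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(handle, train_ds.output_types, train_ds.output_shapes)
batch = iterator.get_next()

train_iter = train_ds.make_initializable_iterator()
test_iter = test_ds.make_initializable_iterator()

with tf.Session() as sess:
    train_handle = sess.run(train_iter.string_handle())
    test_handle = sess.run(test_iter.string_handle())
    x = np.zeros([100, 32, 32, 3], np.uint8)  # dummy data standing in for cifar10
    y = np.zeros([100], np.uint8)
    sess.run(train_iter.initializer, feed_dict={images: x, labels: y})
    sess.run(test_iter.initializer, feed_dict={images: x, labels: y})
    # switch between the datasets by feeding the corresponding handle
    print(sess.run(batch, feed_dict={handle: train_handle})['data'].shape)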
I am new to TensorFlow and neural networks, and I am trying to create a model that just multiplies two float values together.
I wasn't sure how many neurons I would need, but I picked 10 and tried to see where I could go from there. I figured that would probably introduce enough complexity to semi-accurately learn that operation.
Anyways, here is my code:
import tensorflow as tf
import numpy as np

# Teach how to multiply
def generate_data(how_many):
    data = np.random.rand(how_many, 2)
    answers = data[:, 0] * data[:, 1]
    return data, answers

sess = tf.InteractiveSession()

# Input data
input_data = tf.placeholder(tf.float32, shape=[None, 2])
correct_answers = tf.placeholder(tf.float32, shape=[None])

# Use 10 neurons--just one layer for now, but it'll be fully connected
weights_1 = tf.Variable(tf.truncated_normal([2, 10], stddev=.1))
bias_1 = tf.Variable(.1)

# Output of this will be a [None, 10]
hidden_output = tf.nn.relu(tf.matmul(input_data, weights_1) + bias_1)

# Weights
weights_2 = tf.Variable(tf.truncated_normal([10, 1], stddev=.1))
bias_2 = tf.Variable(.1)

# Softmax them together--this will be [None, 1]
calculated_output = tf.nn.softmax(tf.matmul(hidden_output, weights_2) + bias_2)

cross_entropy = tf.reduce_mean(correct_answers * tf.log(calculated_output))
optimizer = tf.train.GradientDescentOptimizer(.5).minimize(cross_entropy)

sess.run(tf.initialize_all_variables())

for i in range(1000):
    x, y = generate_data(100)
    sess.run(optimizer, feed_dict={input_data: x, correct_answers: y})

error = tf.reduce_sum(tf.abs(calculated_output - correct_answers))
x, y = generate_data(100)
print("Total Error: ", error.eval(feed_dict={input_data: x, correct_answers: y}))
It seems that the error is always around 7522.1, which is very, very bad for just 100 data points, so I assume it is not learning.
My questions: Is my machine learning? If so, what can I do to make it more accurate? If not, how can I make it learn?
There are a few major issues with the code. Aaron has already identified some of them, but there's another important one: calculated_output and correct_answers are not the same shape, so you're creating a 2D matrix when you subtract them. (The shape of calculated_output is (100, 1) and the shape of correct_answers is (100).) So you need to adjust the shape (for example, by using tf.squeeze on calculated_output).
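To see that shape problem concretely, here is a quick check of my own (not from the original answer) showing how broadcasting turns the difference of a (100, 1) tensor and a (100,) tensor into a (100, 100) matrix:
import numpy as np

calculated = np.zeros((100, 1))      # same shape as calculated_output
correct = np.zeros((100,))           # same shape as correct_answers
print((calculated - correct).shape)  # (100, 100) -- broadcasting expands both operands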
This problem also doesn't really require any non-linearities, so you could get by with no activations and only one layer. The following code gets a total error of about 6 (~0.06 error on average for each test point). Hope that helps!
import tensorflow as tf
import numpy as np

# Teach how to multiply
def generate_data(how_many):
    data = np.random.rand(how_many, 2)
    answers = data[:, 0] * data[:, 1]
    return data, answers

sess = tf.InteractiveSession()

input_data = tf.placeholder(tf.float32, shape=[None, 2])
correct_answers = tf.placeholder(tf.float32, shape=[None])

weights_1 = tf.Variable(tf.truncated_normal([2, 1], stddev=.1))
bias_1 = tf.Variable(.0)
output_layer = tf.matmul(input_data, weights_1) + bias_1

mean_squared = tf.reduce_mean(tf.square(correct_answers - tf.squeeze(output_layer)))
optimizer = tf.train.GradientDescentOptimizer(.1).minimize(mean_squared)

sess.run(tf.initialize_all_variables())

for i in range(1000):
    x, y = generate_data(100)
    sess.run(optimizer, feed_dict={input_data: x, correct_answers: y})

error = tf.reduce_sum(tf.abs(tf.squeeze(output_layer) - correct_answers))
x, y = generate_data(100)
print("Total Error: ", error.eval(feed_dict={input_data: x, correct_answers: y}))
The way you are using softmax is weird. Softmax is normally used when you want a probability distribution over a set of classes. In your code it looks like you have a one-dimensional output, so the softmax is not helping you there.
The cross-entropy loss function is appropriate for classification problems, but you are doing regression. You should try using a mean squared error loss function instead.
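As a minimal sketch of that change applied to the question's original two-layer network (reusing the tensors hidden_output, weights_2, bias_2 and correct_answers defined in the question; the 0.1 learning rate is just an illustrative choice):
# drop the softmax so the output is linear, and train with mean squared error
linear_output = tf.matmul(hidden_output, weights_2) + bias_2
mean_squared = tf.reduce_mean(tf.square(correct_answers - tf.squeeze(linear_output)))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(mean_squared)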