How to input char data to a TensorFlow RNN cell?

This is my model:
rnn_cell = tf.contrib.rnn.BasicRNNCell(512)
m_rnn_cell = tf.contrib.rnn.MultiRNNCell([rnn_cell]*3, state_is_tuple = False)
prediction, state = tf.nn.dynamic_rnn(m_rnn_cell, X, dtype=tf.float32)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
error = tf.reduce_mean((labels - prediction)**2)
train = tf.train.GradientDescentOptimizer(learning_rate).minimize(error)
X is the placeholder I use to input data. I want to train it using a few English sentences. How do I do that? How do I shape my data, labels and placeholders? Can you also provide the code to train it?
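One possible way to prepare the data, sketched in plain NumPy under some assumptions: the sentences list, the character-level vocabulary, and the helper one_hot_batch are all made up for illustration, and the labels are taken to be the next character at each time step. tf.nn.dynamic_rnn expects input of shape [batch, time, features] by default, so each sentence becomes a sequence of one-hot vectors:

```python
import numpy as np

# Hypothetical toy corpus; any list of strings would do.
sentences = ["hello world", "how are you?"]

# Character vocabulary built from the corpus itself.
chars = sorted(set("".join(sentences)))
char_to_id = {c: i for i, c in enumerate(chars)}
vocab_size = len(chars)

# Inputs drop the last character and labels drop the first,
# so every sequence is one step shorter than its sentence.
max_len = max(len(s) for s in sentences) - 1

def one_hot_batch(texts):
    """Return a float32 array of shape [batch, max_len, vocab_size].

    Time steps past the end of a text are left as all-zero padding.
    """
    batch = np.zeros((len(texts), max_len, vocab_size), dtype=np.float32)
    for row, text in enumerate(texts):
        for t, ch in enumerate(text):
            batch[row, t, char_to_id[ch]] = 1.0
    return batch

x_data = one_hot_batch([s[:-1] for s in sentences])  # inputs
y_data = one_hot_batch([s[1:] for s in sentences])   # next-character labels

print(x_data.shape, y_data.shape)
```

With data shaped like this, X (and a labels placeholder) could be declared as tf.placeholder(tf.float32, [None, max_len, vocab_size]) and filled through feed_dict. Note that the output of BasicRNNCell(512) has 512 features per time step, so a projection down to vocab_size would likely be needed before comparing prediction with labels.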

Related

How does dimensions for placeholders work for tensorflow?

Suppose I have x_train and y_train, which are arrays where each element is a data point (itself an array), so x_train has the form x_train[i][j] and x_train[0] is the first data point in the training set. I want to create a simple regression,
so I coded this:
input = tf.placeholder(tf.float32, shape=[len(data[0]),None])
target = tf.placeholder(tf.float32, shape=[len(data[0]),None])
network = tf.layers.Dense(10, tf.keras.activations.relu)(input)
network = tf.layers.BatchNormalization()(network)
network = tf.layers.Dense(10,tf.keras.activations.relu)(network)
network = tf.layers.BatchNormalization()(network)
network = tf.layers.Dense(10,tf.keras.activations.linear)(network)
cost = tf.reduce_mean((target - network)**2)
optimizer = tf.train.AdamOptimizer().minimize(cost)
with tf.Session() as sess:
    for epoch in range(1000):
        _, val = sess.run([optimizer,cost], feed_dict={input: x_train, target: y_train})
        print(val)
But is this correct? I'm not sure if the dimensions for the placeholders even match. When I try to run this code,
I get the error message
ValueError: The last dimension of the inputs to `Dense` should be defined. Found `None`.
So what I tried was to interchange the position of the dimensions' size for placeholders, so
the changed placeholders were
input = tf.placeholder(tf.float32, shape=[None,len(data[0])])
target = tf.placeholder(tf.float32, shape=[None,len(data[0])])
But with these, I then get the error message
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value dense/bias
[[{{node dense/bias/read}}]]
I was able to solve the above issue by calling np.expand_dims() on x_train and y_train at axis=0, and by initializing the batch-norm and network parameters with sess.run(tf.global_variables_initializer()) before optimizing the model.
Note: The presence of None in the first dimension of the shape of placeholder is alright as it allows TensorFlow to train models when batch_size is unknown (the same is true even for other dimensions of placeholder's shape). The error is due to mismatch in input and placeholder dimensions. Your inputs (x_train & y_train) were probably one-dimensional tensors while the placeholders either needed two-dimensional ones or one-dimensional vectors reshaped to two-dimensions.
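To make the shape mismatch concrete, here is a small NumPy-only check (the array values are just the sample data used in this answer). It shows that a flat list is one-dimensional, and that either np.expand_dims or reshape turns it into the [batch, features] layout that a placeholder of shape [None, 10] accepts:

```python
import numpy as np

x_train = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=np.float32)
print(x_train.shape)      # (10,)   one-dimensional, no batch axis

# Add a leading batch axis: one sample with ten features.
x_batched = np.expand_dims(x_train, axis=0)
print(x_batched.shape)    # (1, 10)

# reshape(1, -1) is an equivalent way to get the same layout.
x_reshaped = x_train.reshape(1, -1)
print(x_reshaped.shape)   # (1, 10)
```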
Below is my implementation, together with a matplotlib plot that verifies it:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = [[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20]]
x_train = data[0]
y_train = data[1]
x_train = np.expand_dims(x_train, 0)
y_train = np.expand_dims(y_train, 0)
input = tf.placeholder(tf.float32, shape=[None, len(data[0])])
target = tf.placeholder(tf.float32, shape=[None, len(data[1])])
network = tf.layers.Dense(10, tf.keras.activations.relu)(input)
network = tf.layers.BatchNormalization()(network)
network = tf.layers.Dense(10,tf.keras.activations.relu)(network)
network = tf.layers.BatchNormalization()(network)
network = tf.layers.Dense(10,tf.keras.activations.linear)(network)
cost = tf.reduce_mean((target - network)**2)
optimizer = tf.train.AdamOptimizer().minimize(cost)
costs = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(1000):
        _, val = sess.run([optimizer,cost], feed_dict={input: x_train, target: y_train})
        costs.append(val)
        print(val)
fig, ax = plt.subplots(figsize=(11, 8))
ax.plot(range(1000), costs)
ax.set_title("Costs vs epochs")
ax.set_xlabel("Epoch")
ax.set_ylabel("Cost")
Here's the plot of costs vs. epochs:
[Figure: Costs vs Epochs]
Additionally, to test the network on new data, say x_test = [[21,22,23,24,25,26,27,28,29,30]], you could use the code below:
y_pred = sess.run(network,feed_dict={input: x_test})
PS: Make sure you use the same TensorFlow Session sess created above to run the inference (unless you are saving and loading a model checkpoint).

How to use dataset in TensorFlow session for training

I would like to perform image classification on our own large image library (millions of labeled images) with TensorFlow. I'm new to Stack Overflow, Python and TensorFlow. I worked through a few tutorials (MNIST etc.) and got to the point where I was able to prepare a TensorFlow dataset from a dictionary containing the absolute paths to the images and the corresponding labels. However, I'm stuck on using the dataset in a TensorFlow session. Here is my (example) code:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import time
import mymodule # I build my module to read the images and labels
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
from tensorflow.contrib.data import Iterator
beginTime = time.time()
batch_size = 100
learning_rate = 0.005
max_steps = 2
NUM_CLASSES = 25
def input_parser(img_path, label):
    one_hot = tf.one_hot(label, NUM_CLASSES)
    img_file = tf.read_file(img_path)
    img_decoded = tf.image.decode_jpeg(img_file, channels = 3)
    return img_decoded, one_hot
#Import Training data (returns the dictionary with paths and labels)
train_dict = mymodule.getFileMap(labelList, imageList)
#Import Test data
test_dict = mymodule.getFileMap(labelList, imageList)
#Get train data
train_file_list, train_label_list = get_file_label_list(train_dict)
train_images_tensor = ops.convert_to_tensor(train_file_list, dtype=dtypes.string)
train_labels_tensor = ops.convert_to_tensor(train_label_list, dtype=dtypes.int64)
#Get test data
test_file_list, test_label_list = get_file_label_list(test_dict)
test_images_tensor = ops.convert_to_tensor(test_file_list, dtype=dtypes.string)
test_labels_tensor = ops.convert_to_tensor(test_label_list, dtype=dtypes.int64)
#Create TensorFlow Dataset object
train_data = tf.data.Dataset.from_tensor_slices((train_images_tensor, train_labels_tensor))
test_data = tf.data.Dataset.from_tensor_slices((test_images_tensor, test_labels_tensor))
# Transform the dataset so that it contains decoded images
# and one-hot vector labels
train_data = train_data.map(input_parser)
test_data = test_data.map(input_parser)
# Batching --> How to do it right?
#train_data = train_data.batch(batch_size = 100)
#test_data = train_data.batch(batch_size = 100)
#Define input placeholders
image_size = 990*990*3
images_placeholder = tf.placeholder(tf.float32, shape=[None, image_size])
labels_placeholder = tf.placeholder(tf.int64, shape=[None])
# Define variables (these are the values we want to optimize)
weigths = tf.Variable(tf.zeros([image_size, NUM_CLASSES]))
biases = tf.Variable(tf.zeros([NUM_CLASSES]))
# Define the classifier´s result
logits = tf.matmul(images_placeholder, weigths) + biases
# Define the loss function
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits, labels = labels_placeholder))
# Define the training operation
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Operation comparing prediction with true label
correct_prediciton = tf.equal(tf.argmax(logits, 1), labels_placeholder)
# Operation calculating the accuracy of our predictions
accuracy = tf.reduce_mean(tf.cast(correct_prediciton, tf.float32))
#Create TensorFlow Iterator object
iterator = Iterator.from_structure(train_data.output_types,
train_data.output_shapes)
next_element = iterator.get_next()
#Create two initialization ops to switch between the datasets
train_init_op = iterator.make_initializer(train_data)
test_init_op = iterator.make_initializer(test_data)
with tf.Session() as sess:
    #Initialize variables
    sess.run(tf.global_variables_initializer())
    sess.run(train_init_op)
    for _ in range(10):
        try:
            elem = sess.run(next_element)
            print(elem)
        except tf.errors.OutOfRangeError:
            print("End of training dataset.")
            break
Following this and this tutorial, I could not solve the problem of how to use the (image and label) dataset in a TensorFlow session for training. I was able to print out the dataset by iterating through it, but wasn't able to use it for learning.
I don't understand how to access the images and labels separately after they have been merged in the train_data = tf.data.Dataset.from_tensor_slices((train_images_tensor, train_labels_tensor)) operation, as required by the 2nd tutorial. I also don't know how to implement batching correctly.
What I want to do in the session is basically this (from the 2nd tutorial):
# Generate input data batch
indices = np.random.choice(data_sets['images_train'].shape[0], batch_size)
images_batch = data_sets['images_train'][indices]
labels_batch = data_sets['labels_train'][indices]
# Periodically print out the model's current accuracy
if i % 100 == 0:
    train_accuracy = sess.run(accuracy, feed_dict={
        images_placeholder: images_batch, labels_placeholder: labels_batch})
    print('Step {:5d}: training accuracy {:g}'.format(i, train_accuracy))
# Perform a single training step
sess.run(train_step, feed_dict={images_placeholder: images_batch,
                                labels_placeholder: labels_batch})
# After finishing the training, evaluate on the test set
test_accuracy = sess.run(accuracy, feed_dict={
    images_placeholder: data_sets['images_test'],
    labels_placeholder: data_sets['labels_test']})
print('Test accuracy {:g}'.format(test_accuracy))
endTime = time.time()
print('Total time: {:5.2f}s'.format(endTime - beginTime))
If anyone can tell me how to access the images and labels in the dataset separately and use them for training, I would be really thankful. A tip on where and how to do the batching would also be appreciated.
Thank you.
In your code, next_element is a tuple of two tensors, matching the structure of your datasets: i.e. it is a tuple whose first element is an image, and second element is a label. To access the individual tensors, you can do the following:
next_element = iterator.get_next()
next_image = next_element[0]
next_label = next_element[1]
# Or, in a single line:
next_image, next_label = iterator.get_next()
To batch a tf.data.Dataset, you can use the Dataset.batch() transformation. Your commented-out code for this should work, provided the second line batches test_data rather than train_data:
train_data = train_data.batch(batch_size = 100)
test_data = test_data.batch(batch_size = 100)
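As a rough illustration of what batching does to the (image, label) pairs, here is a plain NumPy generator. It is not the tf.data API itself: the function batched and the tiny stand-in arrays are hypothetical, chosen only to show that each batch is still a tuple of two arrays with a new leading batch axis, and that the final batch may be smaller:

```python
import numpy as np

def batched(images, labels, batch_size):
    """Yield (image_batch, label_batch) tuples; the last batch may be smaller."""
    for start in range(0, len(images), batch_size):
        end = start + batch_size
        yield images[start:end], labels[start:end]

# Hypothetical stand-in data: 5 tiny "images" of shape 2x2x3 with integer labels.
images = np.zeros((5, 2, 2, 3), dtype=np.float32)
labels = np.arange(5)

# Unpack each batch into its two parts, just like next_image, next_label above.
shapes = [(img.shape, lab.shape) for img, lab in batched(images, labels, 2)]
print(shapes)
```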

TensorFlow restore throwing "No Variable to save" error

I am working through some code to understand how to save and restore checkpoints in TensorFlow. To do so, I implemented a simple neural network that works with MNIST digits and saved the .ckpt file like so:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
learning_rate = 0.001
n_input = 784 # MNIST data input (img shape = 28*28)
n_classes = 10 # MNIST total classes 0-9
#import MNIST data
mnist = input_data.read_data_sets('.', one_hot = True)
#Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])
#Weights and biases
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))
#logits = xW + b
logits = tf.add(tf.matmul(features, weights), bias)
#Define loss and optimizer
cost = tf.reduce_mean(\
tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
.minimize(cost)
# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
import math
save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100
saver = tf.train.Saver()
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Training cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features: batch_features, labels: batch_labels})
        # Print status for every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch,
                valid_accuracy))
    # Save the model
    saver.save(sess, save_file)
    print('Trained Model Saved.')
This part works well, and I get the .ckpt file saved in the correct directory. The problem comes in when I try to restore the model in an attempt to work on it again. I use the following code to restore the model:
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'train_model.ckpt.meta')
    print('model restored')
and end up with the error: ValueError: No variables to save
I'm not sure what the mistake is here. Any help is appreciated. Thanks in advance.
A Graph is different to the Session. A graph is the set of operations joining tensors, each of which is a symbolic representation of a set of values. A Session assigns specific values to the Variable tensors, and allows you to run operations in that graph.
The .ckpt file saves variable values (i.e. those stored in the weights and biases) but not the graph itself.
The solution is simple: re-run the graph construction (everything before the Session), then start your session and load the values from the .ckpt file.
Alternatively, you can check out this guide for exporting and importing MetaGraphs.
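You should tell the Saver which Variables to restore; by default, the Saver collects all the Variables from the default graph. In your case that graph is empty when the Saver is created, which is why it reports that there are no variables to save.
So you should add the graph-construction code before saver = tf.train.Saver().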

Sample from tensorflow LSTM model when using symbolic batch inputs

I am building a next-character prediction LSTM for sentences.
I was following the tutorial here https://indico.io/blog/tensorflow-data-inputs-part1-placeholders-protobufs-queues/ on how to make the data input process part of the tensorflow graph, and now I have a stateful LSTM that is fed with symbolic (!) batches generated by tf.contrib.training.batch_sequences_with_states, which are in turn read from TF.SequenceExamples of varying lengths (Char-RNN working on characters in a sentence), as shown in the code below.
The whole input and batching process is therefore part of the compute graph.
The training works, but since the input is symbolic (not a TF.placeholder), I cannot figure out how to feed in my own sentence defined as a string to the LSTM to perform inference (sample from model). Any ideas?
import tensorflow as tf
import numpy as np
from tensorflow.python.util import nest
import SequenceHandler
import DataLoader
# SETTINGS
learning_rate = 0.001
batch_size = 128
num_unroll = 200
num_enqueue_threads = 10
lstm_size = 256
vocab_size = 39
# DATA
key, context, sequences = SequenceHandler.loadSequence("input.tf") # Loads TF.SequenceExample sequence using TF.RecordReader
# MODEL
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=lstm_size)
initial_states = {"lstm_state_c": tf.zeros(cell.state_size[0], dtype=tf.float32), "lstm_state_h": tf.zeros(cell.state_size[0], dtype=tf.float32)}
batch = tf.contrib.training.batch_sequences_with_states(
input_key=key,
input_sequences=sequences,
input_context=context,
input_length=tf.cast(context["length"], tf.int32),
initial_states=initial_states,
num_unroll=num_unroll,
batch_size=batch_size,
num_threads=num_enqueue_threads,
capacity=batch_size * num_enqueue_threads * 2)
# BATCH INPUT
inputs = batch.sequences["inputs"]
targets = batch.sequences["outputs"]
# Convert input into float one-hot representation
embedding = tf.constant(np.eye(vocab_size), dtype=tf.float32)
inputs = tf.nn.embedding_lookup(embedding, inputs)
# Reshape inputs (and targets respectively) into list of length T (unrolling length), with each element being a Tensor of shape (batch_size, input_dimensionality)
inputs_by_time = tf.split(1, num_unroll, inputs)
inputs_by_time = [tf.squeeze(elem, squeeze_dims=1) for elem in inputs_by_time]
targets_by_time = tf.split(1, num_unroll, targets)
targets_by_time = [tf.squeeze(elem, squeeze_dims=1) for elem in targets_by_time]
targets_by_time_packed = tf.pack(targets_by_time)
# Build RNN
state_name=("lstm_state_c", "lstm_state_h")
state_size = cell.state_size
state_is_tuple = nest.is_sequence(state_size)
state_name_tuple = nest.is_sequence(state_name)
state_name_flat = nest.flatten(state_name)
state_size_flat = nest.flatten(state_size)
initial_state = nest.pack_sequence_as(
structure=state_size,
flat_sequence=[batch.state(s) for s in state_name_flat])
seq_lengths = batch.context["length"]
(outputs, state) = tf.nn.state_saving_rnn(cell, inputs_by_time, state_saver=batch,
sequence_length=seq_lengths, state_name=state_name)
# Create softmax parameters, weights and bias, and apply to RNN outputs at each timestep
with tf.variable_scope('softmax') as sm_vs:
    softmax_w = tf.get_variable("softmax_w", [lstm_size, vocab_size])
    softmax_b = tf.get_variable("softmax_b", [vocab_size])
logits = [tf.matmul(outputStep, softmax_w) + softmax_b for outputStep in outputs]
logit = tf.pack(logits)
probs = tf.nn.softmax(logit)
with tf.name_scope('loss'):
    # Compute mean cross entropy loss for each output.
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logit, targets_by_time_packed)
    mean_loss = tf.reduce_mean(loss)
global_step = tf.get_variable('global_step', [],
initializer=tf.constant_initializer(0.0))
learning_rate = tf.constant(learning_rate)
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(mean_loss, tvars),
5.0)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = optimizer.apply_gradients(zip(grads, tvars),
global_step=global_step)
# TRAINING LOOP
# Start a prefetcher in the background
sess = tf.Session()
tf.train.start_queue_runners(sess=sess)
init_op = tf.initialize_all_variables()
sess.run(init_op)
# LOGGING
summary_writer = tf.train.SummaryWriter("log", sess.graph)
vocab_index_dict, index_vocab_dict, vocab_size = DataLoader.load_vocab("characters.json", "UTF-8")
while True:
    # Step through batches, perform training
    trainOps = [mean_loss, state, train_op,
                global_step]
    res = sess.run(trainOps) # THIS WORKS - LOSS DECLINES
    testString = "Hello"
    # HOW TO SAMPLE FROM MODEL, GIVEN INPUT testString HERE?
In general, I have trouble understanding how to work with the data input as part of the compute graph, in terms of how to split it for cross-validation etc., and there seem to be no examples in that direction using TFRecords.

Reusing the same layers for training and testing, but creating different nodes

I'm trying to (re)train AlexNet (based on the code found here) for a particular binary classification problem. Since my GPU is not very powerful, I settled on a batch size of 8 for training. This size determines the shape of the input tensor (8,227,227,3). However, one can use a larger batch size for the testing process, since there is no backprop involved.
My question is, how could I reuse the already trained hidden layers to create a different network on the same graph specifically for testing?
Here's a snippet of what I have tried to do:
NUM_TRAINING_STEPS = 200
BATCH_SIZE = 1
LEARNING_RATE = 1e-1
IMAGE_SIZE = 227
NUM_CHANNELS = 3
NUM_CLASSES = 2
def main():
    graph = tf.Graph()
    trace = Tracer()
    train_data = readImage(filename1)
    test_data = readImage(filename2)
    train_labels = np.array([[0.0,1.0]])
    with graph.as_default():
        batch_data = tf.placeholder(tf.float32, shape=(BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS) )
        batch_labels = tf.placeholder(tf.float32, shape=(BATCH_SIZE, NUM_CLASSES) )
        logits_training = createNetwork(batch_data)
        loss = lossLayer(logits_training, batch_labels)
        train_prediction = tf.nn.softmax(logits_training)
        print 'Prediction shape: ' + str(train_prediction.get_shape())
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=LEARNING_RATE).minimize(loss)
        test_placeholder = tf.placeholder(tf.float32, shape=(1, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS) )
        logits_test = createNetwork(test_placeholder)
        test_prediction = tf.nn.softmax(logits_test)
    with tf.Session(graph=graph) as session:
        tf.initialize_all_variables().run()
        for step in range(NUM_TRAINING_STEPS):
            print 'Step #: ' + str(step+1)
            feed_dict = {batch_data: train_data, batch_labels : train_labels}
            _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        feed_dict = {batch_data:test_data, test_placeholder:test_data}
        logits1, logits2 = session.run([logits_training,logits_test],feed_dict=feed_dict)
        print (logits1 - logits2)
    return
I'm only training with a single image, just to evaluate whether network is actually being trained and if the values of logits1 and logits2 are the same. They are not, by several orders of magnitude.
createNetwork is a function which loads the weights for AlexNet and builds the model, based on the code for the myalexnet.py script found on the page to which I linked.
I've tried to replicate the examples from the Udacity course on Deep Learning, in particular, assignments 3 and 4.
If anyone could figure out how I could use the same layers for training and testing, I would be very grateful.
Use shape=None for your placeholders: placeholder doc
This way you can feed data of any shape. Another (worse) option is to recreate your graph for testing with the shapes that you need, and load the ckpt that was created during training.
