I'm new to TensorFlow.
Using TensorFlow, I want to define a vector that depends on the output of my neural net, so that I can compute the cost function I want:
# Build the neural network
X = tf.placeholder(tf.float32, shape=[None, n_inputs], name='X')
hidden = fully_connected(X, n_hidden, activation_fn=tf.nn.elu, weights_initializer=initializer)
logits = fully_connected(hidden, n_outputs, activation_fn=None, weights_initializer=initializer)
outputs = tf.nn.softmax(logits)
# Select a random action based on the probability
action = tf.multinomial(tf.log(outputs), num_samples=1)
# Define the target if the action chosen was correct and the cost function
y = np.zeros(n_outputs)
y[int(tf.to_float(action))] = 1.0
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
To define y, I need the value of action (between 0 and 9) so that my vector y is [0, 0, 0, 1, 0, ...] with the 1 at index "action".
But action is a tensor, not an integer, so I can't do that!
The code above crashes because I can't apply int() to a Tensor object...
What should I do?
Many thanks
tf.one_hot() is the function you are looking for.
You will have to do it as follows:
action_indices = tf.cast(action, tf.int32)
y = tf.one_hot(action_indices, depth=n_outputs)
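Note that depth is a required argument of tf.one_hot; here it is the number of classes, n_outputs. As a quick illustration (a minimal sketch, assuming n_outputs = 4 and a TF 1.x session):
sess = tf.Session()
action_indices = tf.constant([[2]])      # e.g. a sampled action index of 2
y = tf.one_hot(action_indices, depth=4)  # depth = n_outputs
print(sess.run(y))                       # [[[0. 0. 1. 0.]]]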
Related
I want to use an external program as a custom operation.
Because the automatic gradient is not available, I wrote code that provides gradients using numerical methods. However, because it has to compute batch_size derivatives, I wrote it to get batch_size from the shape of x.
The following is an example that uses a numpy function as the external program:
f(x) = np.sum(x**2)
(In fact, for this simple numpy function, no loop over batch_size is necessary, but it is written this way for a general external function.)
import numpy as np
import tensorflow as tf

@tf.custom_gradient
def custom_op(x):
    # stand-in for the external function (numpy is used here)
    # assume x shape = (batch_size, 3)
    batch_size = x.shape[0]
    input_length = x.shape[1]
    # assert input_length == 3
    yout = []  # will end up with shape (batch_size, 1)
    gout = []  # will end up with shape (batch_size, 3)
    for i in range(batch_size):
        inputs = x[i, :]  # shape (3,)
        y = np.sum(inputs**2)  # scalar
        yout.append(y)
        # compute forward differences
        dy = []
        for j in range(len(inputs)):
            delta = np.zeros_like(inputs)
            delta[j] = np.abs(inputs[j]) * 0.001
            yplus = np.sum((inputs + delta)**2)  # change only the j-th input
            deriv = (yplus - y) / delta[j]  # scalar forward-difference derivative
            dy.append(deriv)
        gout.append(dy)
    yout = tf.convert_to_tensor(yout, dtype='float32')         # (batch_size,)
    yout = tf.reshape(yout, shape=(batch_size, 1))             # (batch_size, 1)
    gout = tf.convert_to_tensor(gout, dtype='float32')         # (batch_size, input_length)
    gout = tf.reshape(gout, shape=(batch_size, input_length))  # (batch_size, input_length)

    def grad(upstream):
        return upstream * gout

    return yout, grad
x = tf.Variable([[1., 2., 3.], [2., 3., 4.]], dtype='float32')
with tf.GradientTape() as tape:
    y = custom_op(x)
tape.gradient(y, x)
and found that it works: the result is close to the analytic gradient 2x, i.e. approximately [[2., 4., 6.], [4., 6., 8.]] for the x above.
However, when I tried to use it in a Keras model, for example:
def construct_model():
    inputs = tf.keras.Input(shape=(3,))  # input array
    x = tf.keras.layers.Dense(1)(inputs)
    outputs = custom_op(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    optimizer = 'adam'
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

model = construct_model()
it gives errors, because the KerasTensor "inputs" does not have a specified batch_size.
I tried to specify the batch size as tf.keras.Input(shape=(3,), batch_size=2).
However, that also raises errors because of the use of KerasTensor.
How should I change custom_op to be compatible with Keras?
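Not an authoritative fix, but one possible direction is to defer the external call to execution time with tf.numpy_function, so batch_size is read from the actual array rather than from the symbolic KerasTensor. A minimal sketch under that assumption (untested with this exact model; if the functional API still complains, wrapping the call in a tf.keras.layers.Lambda layer is the usual fallback):
import numpy as np
import tensorflow as tf

def external_f(x):
    # stand-in for the external program: per-sample f(x) and
    # forward-difference gradients, with batch_size taken from the real array
    batch_size, n = x.shape
    yout = np.sum(x**2, axis=1, keepdims=True)
    gout = np.empty_like(x)
    for i in range(batch_size):
        for j in range(n):
            delta = np.zeros(n, dtype=x.dtype)
            delta[j] = max(abs(x[i, j]) * 0.001, 1e-6)  # avoid a zero step
            gout[i, j] = (np.sum((x[i] + delta)**2) - yout[i, 0]) / delta[j]
    return yout.astype(np.float32), gout.astype(np.float32)

@tf.custom_gradient
def custom_op(x):
    y, g = tf.numpy_function(external_f, [x], [tf.float32, tf.float32])
    y.set_shape((None, 1))  # help Keras shape inference
    def grad(upstream):
        return upstream * g
    return y, grad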
So I have been trying to train a single-layer encoder-decoder network in TensorFlow. It is simply frustrating, given that the documentation is so sparse on explanation, and I have only taken Stanford's CS231n on TensorFlow.
So here's the straightforward model:
def simple_model(X, Y, is_training):
    """
    A simple, single-layered encoder-decoder network that encodes X
    of shape (batch_size, window_len, n_comp+1), then decodes Y of
    shape (batch_size, pred_len+1, n_comp+1), of which the vector
    Y[:, 0, :] is simply [0, ..., 0, 1] * batch_size, so that it
    starts the decoding.
    """
    num_units = 128
    window_len = X.shape[1]
    n_comp = X.shape[2] - 1
    pred_len = Y.shape[1] - 1
    init = tf.contrib.layers.variance_scaling_initializer()
    encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    encoder_output, encoder_state = tf.nn.dynamic_rnn(
        encoder_cell, X, dtype=tf.float32)
    decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    decoder_output, _ = tf.nn.dynamic_rnn(decoder_cell,
                                          encoder_output,
                                          initial_state=encoder_state)
    # we expect this to have the shape of Y
    print(decoder_output.shape)
    proj_layer = tf.layers.dense(decoder_output, n_comp)
    return proj_layer
now I try to set up the training details:
tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, 15, 74])
y = tf.placeholder(tf.float32, [None, 4, 74])
is_training = tf.placeholder(tf.bool)
y_out = simple_model(X,y,is_training)
mean_loss = 0.5*tf.reduce_mean((y_out-y[:,1:,:-1])**2)
optimizer = tf.train.AdamOptimizer(learning_rate=5e-4)
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    train_step = optimizer.minimize(mean_loss)
Okay, and now I get this error:
ValueError: Variable rnn/basic_lstm_cell/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
I'm not sure if I understand this correctly. You have two BasicLSTMCells in your graph. According to the documentation, you probably should use MultiRNNCell like this:
encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
rnn_layers = [encoder_cell, decoder_cell]
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)
decoder_output, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                          inputs=X,
                                          dtype=tf.float32)
If this is not the architecture you'd like to have and you need to use the two BasicLSTMCells separately, I think passing different/unique names when defining encoder_cell and decoder_cell will help solve this error. tf.nn.dynamic_rnn puts the cell under an 'rnn' scope; if you don't define the cell name explicitly, that causes a reuse confusion.
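For the second option, here is a minimal sketch (TF 1.x) that gives each dynamic_rnn call its own variable scope, so the two cells no longer collide under the default 'rnn' scope:
encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
with tf.variable_scope('encoder'):
    encoder_output, encoder_state = tf.nn.dynamic_rnn(
        encoder_cell, X, dtype=tf.float32)
with tf.variable_scope('decoder'):
    decoder_output, _ = tf.nn.dynamic_rnn(
        decoder_cell, encoder_output, initial_state=encoder_state)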
I know how to create an rnn in TensorFlow with a one_hot vector:
x = tf.placeholder(tf.int32, [batch_size, num_steps], name='input_placeholder')
y = tf.placeholder(tf.int32, [batch_size, num_steps], name='labels_placeholder')
init_state = tf.zeros([batch_size, state_size])
x_one_hot = tf.one_hot(x, num_classes)
rnn_inputs = tf.unstack(x_one_hot, axis=1)
But I am not really sure what to do when my input vector has multiple 1s, e.g. it could be 11011 as one input per time step, so: [[1,1,0,1,1],[0,0,1,1,1],...]
Is there an issue if I just feed this vector the way I would feed my one-hot representation? How should I formulate the above then? I feel like I shouldn't use the tf.one_hot function... I'm not sure how the shape of rnn_inputs (200 x 5 x 2) can be created without one_hot.
(using TF 1.0)
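For reference, a minimal sketch (TF 1.x; num_features is the length of each binary vector, e.g. 5 for 11011, and is an assumed name) that feeds the multi-hot rows directly as floats, with no tf.one_hot step:
x = tf.placeholder(tf.float32, [batch_size, num_steps, num_features], name='input_placeholder')
rnn_inputs = tf.unstack(x, axis=1)  # num_steps tensors, each [batch_size, num_features]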
The task I am trying to implement with a neural network differs a bit from its most common usage. I am trying to simulate a physical process by propagating something from the input to the output layer, optimizing the network's weights, which represent physical properties.
Therefore I need a network of, e.g., 150 layers, where each layer has the same properties, of the form
m*x + b
where x is the variable I would like to optimize and m is an external factor that is the same for each layer (b is not in use right now).
I would like to automate the process of creating the graph rather than copy/paste each layer. So is there a function to copy the structure of the first layer to all following layers?
In TensorFlow it should look something like this:
graph = tf.Graph()
with graph.as_default():
    # Input data.
    tf_input = tf.placeholder(tf.float32, shape=(n_data, n))
    tf_spatial_grid = tf.constant(m_index_mat)
    tf_ph_unit = tf.constant(m_unit_mat)
    tf_output = tf.placeholder(tf.float32, shape=(n_data, n))

    # new hidden layer 1
    hidden_weights = tf.Variable(tf.truncated_normal([n*n, 1]))
    hidden_layer = tf.matmul(tf.matmul(tf_input, hidden_weights), tf_ph_unit)

    # new hidden layer 2
    hidden_weights_2 = tf.Variable(tf.truncated_normal([n*n, 1]))
    hidden_layer_2 = tf.matmul(tf.matmul(hidden_layer, hidden_weights_2), tf_ph_unit)

    ......

    # new hidden layer n
    hidden_weights_n = tf.Variable(tf.truncated_normal([n*n, 1]))
    hidden_layer_n = tf.matmul(tf.matmul(hidden_layer_m, hidden_weights_n), tf_ph_unit)
    ...
So is there any option that automates this process somehow? Maybe I'm missing something
I really appreciate any help!
The easiest way to accomplish that is to create a function that builds your layer and simply invoke the function multiple times, possibly in a loop.
For instance:
def layer(x):
    hidden_weights = tf.Variable(tf.truncated_normal([n*n, 1]))
    hidden_layer = tf.matmul(tf.matmul(x, hidden_weights), tf_ph_unit)
    return hidden_layer
and then:
x = tf_input
for i in range(10):
    x = layer(x)
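If you want each layer's variables grouped under their own name, a slightly extended sketch of the same idea (the scope names and the 150-layer count here are just illustrative):
def layer(x, i):
    with tf.variable_scope('layer_%d' % i):
        hidden_weights = tf.Variable(tf.truncated_normal([n*n, 1]))
        return tf.matmul(tf.matmul(x, hidden_weights), tf_ph_unit)

net = tf_input
for i in range(150):
    net = layer(net, i)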
I am trying to save neural network weights into a file and then restore those weights by initializing the network with them instead of using random initialization. My code works fine with random initialization. But when I initialize the weights from a file, it shows me the error TypeError: Input 'b' of 'MatMul' Op has type float64 that does not match type float32 of argument 'a'. I don't know how to solve this issue. Here is my code:
Model Initialization
# Parameters
training_epochs = 5
batch_size = 64
display_step = 5
batch = tf.Variable(0, trainable=False)
regularization = 0.008
# Network Parameters
n_hidden_1 = 300 # 1st layer num features
n_hidden_2 = 250 # 2nd layer num features
n_input = model.layer1_size # Vector input (sentence shape: 30*10)
n_classes = 12 # Sentence Category detection total classes (0-11 categories)
#History storing variables for plots
loss_history = []
train_acc_history = []
val_acc_history = []
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
Model parameters
#loading Weights
def weight_variable(fan_in, fan_out, filename):
    stddev = np.sqrt(2.0/fan_in)
    if filename == "":
        initial = tf.random_normal([fan_in, fan_out], stddev=stddev)
    else:
        initial = np.loadtxt(filename)
        print(initial.shape)
    return tf.Variable(initial)

#loading Biases
def bias_variable(shape, filename):
    if filename == "":
        initial = tf.constant(0.1, shape=shape)
    else:
        initial = np.loadtxt(filename)
        print(initial.shape)
    return tf.Variable(initial)
# Create model
def multilayer_perceptron(_X, _weights, _biases):
    layer_1 = tf.nn.relu(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']))
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2']))
    return tf.matmul(layer_2, _weights['out']) + _biases['out']
# Store layers weight & bias
weights = {
    'h1': w2v_utils.weight_variable(n_input, n_hidden_1, filename="weights_h1.txt"),
    'h2': w2v_utils.weight_variable(n_hidden_1, n_hidden_2, filename="weights_h2.txt"),
    'out': w2v_utils.weight_variable(n_hidden_2, n_classes, filename="weights_out.txt")
}
biases = {
    'b1': w2v_utils.bias_variable([n_hidden_1], filename="biases_b1.txt"),
    'b2': w2v_utils.bias_variable([n_hidden_2], filename="biases_b2.txt"),
    'out': w2v_utils.bias_variable([n_classes], filename="biases_out.txt")
}
# Define loss and optimizer
#learning rate
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
learning_rate = tf.train.exponential_decay(
    0.02*0.01,           # Base learning rate. #0.002
    batch * batch_size,  # Current index into the dataset.
    X_train.shape[0],    # Decay step.
    0.96,                # Decay rate.
    staircase=True)
# Construct model
pred = tf.nn.relu(multilayer_perceptron(x, weights, biases))
#L2 regularization
l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()])
#Softmax loss
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
#Total cost
cost = cost + (regularization * 0.5 * l2_loss)
# Adam Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost,global_step=batch)
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Initializing the variables
init = tf.initialize_all_variables()
print "Network Initialized!"
ERROR DETAILS
The tf.matmul() op does not perform automatic type conversions, so both of its inputs must have the same element type. The error message you are seeing indicates that you have a call to tf.matmul() where the first argument has type tf.float32, and the second argument has type tf.float64. You must convert one of the inputs to match the other, for example using tf.cast(x, tf.float32).
Looking at your code, I don't see anywhere that a tf.float64 tensor is explicitly created (the default dtype for floating-point values in the TensorFlow Python API—e.g. for tf.constant(37.0)—is tf.float32). I would guess that the errors are caused by the np.loadtxt(filename) calls, which might be loading an np.float64 array. You can explicitly change them to load np.float32 arrays (which are converted to tf.float32 tensors) as follows:
initial = np.loadtxt(filename).astype(np.float32)
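Applied to the loader above, the one-line change looks like this (a sketch of the same weight_variable, with only the np.loadtxt line changed):
def weight_variable(fan_in, fan_out, filename):
    stddev = np.sqrt(2.0/fan_in)
    if filename == "":
        initial = tf.random_normal([fan_in, fan_out], stddev=stddev)
    else:
        initial = np.loadtxt(filename).astype(np.float32)  # force float32
    return tf.Variable(initial)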
Although it's an old question, I would like to add that I came across the same problem. I resolved it by using dtype=tf.float64 for parameter initialization and for creating the X and Y placeholders as well.
Here is a snippet of my code.
X = tf.placeholder(shape=[n_x, None],dtype=tf.float64)
Y = tf.placeholder(shape=[n_y, None],dtype=tf.float64)
and
parameters['W' + str(l)] = tf.get_variable('W' + str(l), [layers_dims[l],layers_dims[l-1]],dtype=tf.float64, initializer = tf.contrib.layers.xavier_initializer(seed = 1))
parameters['b' + str(l)] = tf.get_variable('b' + str(l), [layers_dims[l],1],dtype=tf.float64, initializer = tf.zeros_initializer())
Declaring all placeholders and parameters with the float64 datatype will resolve this issue.
For TensorFlow 2
You can cast one of the tensors, for example:
_X = tf.cast(_X, dtype='float64')
You can get rid of this error by setting all layers to have a default dtype of float64:
tf.keras.backend.set_floatx('float64')
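A minimal illustration of the effect (a toy Dense layer, just as a sketch):
import tensorflow as tf

tf.keras.backend.set_floatx('float64')
layer = tf.keras.layers.Dense(1)
print(layer(tf.zeros((1, 3))).dtype)  # <dtype: 'float64'>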