I'm writing a very simple network:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
training_data = np.array([[1, 1, 1], [2, 3, 1], [0, -1, 4], [0, 3, 0], [10, -6, 8], [-3, -12, 4]])
testing_data = np.array([6, 11, 1, 9, 10, -38])
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units = 1, activation = tf.keras.activations.relu, input_shape = (3, )))
model.compile(optimizer = tf.keras.optimizers.RMSprop(0.001), loss = tf.keras.losses.mean_squared_error, metrics = tf.keras.metrics.mean_squared_error)
model.summary()
model.fit(training_data, testing_data, epochs = 1, verbose = False)
print("Training completed.")
model.predict(np.array([1, 1, 1]))
The goal is to learn weights a, b, c such that aX + bY + cZ = output.
But I get the error:
ValueError: Input 0 of layer sequential_54 is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape [None, 1]
I can't make sense of the dimensions; there is something I'm doing wrong. Any help?
In Keras, the batch size is not part of the input shape you specify; please refer here for more details. Your declaration of input_shape = (3,) is correct, but when you do inference you also need to account for the batch dimension by adding an extra axis, so instead of np.array([1, 1, 1]) you need np.array([[1, 1, 1]]).
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# 6 training samples with 3 features each, and their target values
training_data = np.array([[1, 1, 1], [2, 3, 1], [0, -1, 4], [0, 3, 0], [10, -6, 8], [-3, -12, 4]])
testing_data = np.array([6, 11, 1, 9, 10, -38])

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=1, activation=tf.keras.activations.relu, input_shape=(3,)))
model.compile(optimizer=tf.keras.optimizers.RMSprop(0.001), loss=tf.keras.losses.mean_squared_error, metrics=[tf.keras.metrics.mean_squared_error])
model.summary()

model.fit(training_data, testing_data, epochs=1, verbose=False)
print("Training completed.")

# note the extra batch dimension: shape (1, 3) instead of (3,)
model.predict(np.array([[1, 2, 1]]))
array([[0.08026636]], dtype=float32)
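The same batch axis applies to any single sample you want to predict on; a small illustration of the point using plain NumPy reshaping (nothing specific to this model):

sample = np.array([1, 1, 1])
model.predict(sample[np.newaxis, :])           # shape (1, 3): one sample in a batch of 1
model.predict(np.expand_dims(sample, axis=0))  # equivalent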
I understand why this error usually occurs: some input index is >= the number of embeddings (the vocabulary size).
However, in my case torch.max(inputs) equals the number of embeddings minus 1.
print('inputs: ', src_seq)
print('input_shape: ', src_seq.shape)
print(self.src_word_emb)
inputs: tensor([[10, 6, 2, 4, 9, 14, 6, 2, 5, 0],
[12, 6, 3, 8, 13, 2, 0, 1, 1, 1],
[13, 8, 12, 7, 2, 4, 0, 1, 1, 1]])
input_shape: [3, 10]
Embedding(15, 512, padding_idx=1)
emb = self.src_word_emb(src_seq)
I'm trying to get a transformer model to work, and for some reason the encoder embedding only accepts inputs smaller than the decoder's embedding size, which does not make sense, right?
Found the error source! In the transformer model, the encoder and decoder can be set up to share the same embedding weights. However, I had a translation task with one embedding for the encoder and one for the decoder. The code initializes the weights via:
if emb_src_trg_weight_sharing:
    self.encoder.src_word_emb.weight = self.decoder.trg_word_emb.weight
Setting emb_src_trg_weight_sharing to False solved the issue!
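For illustration, here is a minimal sketch (with made-up vocabulary sizes, not taken from the original code) of how sharing the decoder's embedding weights shrinks the encoder's table and triggers the index error:

import torch
import torch.nn as nn

src_vocab, trg_vocab, d_model = 15, 12, 512  # hypothetical sizes

enc_emb = nn.Embedding(src_vocab, d_model, padding_idx=1)
dec_emb = nn.Embedding(trg_vocab, d_model, padding_idx=1)

# Weight sharing replaces the encoder's (15, 512) table with the
# decoder's (12, 512) table, so source ids 12..14 are now out of range.
enc_emb.weight = dec_emb.weight

src_seq = torch.tensor([[10, 6, 2, 4, 9, 14, 6, 2, 5, 0]])
enc_emb(src_seq)  # IndexError: index out of range in self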
I want to train a neural network using gradient descent on batches that contain N training points each. I would like these batches to only contain points with the same label, instead of being randomly sampled from the training set.
For example, if I'm training using MNIST, I would like to have batches that look like the following:
batch_1 = {0,0,0,0,0,0,0,0}
batch_2 = {3,3,3,3,3,3,3,3}
batch_3 = {7,7,7,7,7,7,7,7}
.....
and so on.
How can I do this using PyTorch?
One way to do it is to create a Subset and a DataLoader for each class, then iterate by randomly switching between the DataLoaders at each step:
import torch
from torch.utils.data import DataLoader, Subset
from torchvision.datasets import MNIST
from torchvision import transforms
import numpy as np

dataset = MNIST('path/to/mnist_root/',
                transform=transforms.ToTensor(),
                download=True)

# indices of the samples belonging to each class
class_inds = [torch.where(dataset.targets == class_idx)[0]
              for class_idx in dataset.class_to_idx.values()]

# one DataLoader per class
dataloaders = [
    DataLoader(
        dataset=Subset(dataset, inds),
        batch_size=8,
        shuffle=True,
        drop_last=False)
    for inds in class_inds]

epochs = 1

for epoch in range(epochs):
    iterators = list(map(iter, dataloaders))
    while iterators:
        # pick a random class iterator and draw its next batch
        iterator = np.random.choice(iterators)
        try:
            images, labels = next(iterator)
            print(labels)
            # do_more_stuff()
        except StopIteration:
            # this class is exhausted for the epoch
            iterators.remove(iterator)
This will work with any dataset (not just MNIST).
Here's the result of printing the labels at each iteration:
tensor([6, 6, 6, 6, 6, 6, 6, 6])
tensor([3, 3, 3, 3, 3, 3, 3, 3])
tensor([0, 0, 0, 0, 0, 0, 0, 0])
tensor([5, 5, 5, 5, 5, 5, 5, 5])
tensor([8, 8, 8, 8, 8, 8, 8, 8])
tensor([0, 0, 0, 0, 0, 0, 0, 0])
...
tensor([1, 1, 1, 1, 1, 1, 1, 1])
tensor([1, 1, 1, 1, 1, 1])
Note that with drop_last=False there will occasionally be batches with fewer than batch_size elements. With drop_last=True all batches will be of equal size, but some data points will be dropped.
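If you prefer to keep a single DataLoader, here is a sketch of an alternative using a custom batch sampler (the class name and chunking logic are my own, not part of torchvision):

import numpy as np
import torch
from torch.utils.data import DataLoader, Sampler
from torchvision import transforms
from torchvision.datasets import MNIST

class SameClassBatchSampler(Sampler):
    # yields batches of indices that all share a single label
    def __init__(self, labels, batch_size):
        self.batch_size = batch_size
        # dataset indices grouped by class label
        self.class_inds = [torch.where(labels == c)[0].tolist()
                           for c in labels.unique()]

    def __iter__(self):
        batches = []
        for inds in self.class_inds:
            np.random.shuffle(inds)
            # split each class into fixed-size chunks (the last chunk may be smaller)
            batches += [inds[i:i + self.batch_size]
                        for i in range(0, len(inds), self.batch_size)]
        np.random.shuffle(batches)
        return iter(batches)

    def __len__(self):
        return sum((len(inds) + self.batch_size - 1) // self.batch_size
                   for inds in self.class_inds)

dataset = MNIST('path/to/mnist_root/', transform=transforms.ToTensor(), download=True)
loader = DataLoader(dataset, batch_sampler=SameClassBatchSampler(dataset.targets, 8))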
I'm using this library to create a model to learn graphs. Here is the code (from the repository):
import numpy as np
from keras_gcn.backend import keras
from keras_gcn import GraphConv
# feature matrix
input_data = np.array([[[0, 1, 2],
[2, 3, 4],
[4, 5, 6],
[7, 7, 8]]])
# adjacency matrix
input_edge = np.array([[[1, 1, 1, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[0, 0, 0, 1]]])
labels = np.array([[[1],
[0],
[1],
[0]]])
data_layer = keras.layers.Input(shape=(None, 3), name='Input-Data')
edge_layer = keras.layers.Input(shape=(None, None), dtype='int32', name='Input-Edge')
conv_layer = GraphConv(units=4, step_num=1, kernel_initializer='ones',
bias_initializer='ones', name='GraphConv')([data_layer, edge_layer])
model = keras.models.Model(inputs=[data_layer, edge_layer], outputs=conv_layer)
model.compile(optimizer='adam', loss='mae', metrics=['mae'])
model.fit([input_data, input_edge], labels)
However, when I run the code I get the following error:
ValueError: Error when checking target: expected GraphConv to have 3 dimensions, but got array with shape (4, 1)
while the shape of labels is (1, 4, 1)
You should one-hot encode your labels, something like the following:
labels = np.array([[[0, 1],
                    [1, 0],
                    [0, 1],
                    [1, 0]]])
Also, the number of units in the GraphConv layer should equal the number of unique labels, which is 2 in your case.
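If you don't want to write the one-hot array by hand, here is a small sketch of the same idea (it assumes the raw labels are the 0/1 integers from the question and reuses the data_layer/edge_layer defined there):

raw_labels = np.array([[1, 0, 1, 0]])                           # shape (1, 4)
labels = keras.utils.to_categorical(raw_labels, num_classes=2)  # shape (1, 4, 2)

# the output width must match the number of classes
conv_layer = GraphConv(units=2, step_num=1, name='GraphConv')([data_layer, edge_layer])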
I think the issue is a mismatch between the shapes of your edge_layer and data_layer.
When you call keras.layers.Input, you're giving data_layer a shape of (None, 3) while edge_layer gets a shape of (None, None).
Match the shapes and let me know how it goes.
I have a problem building a forecasting (LSTM) model with TensorFlow.
I'm trying to use embedding_lookup with multiple categorical columns.
This is my data (shape [100, 12, 16]):
[[[1, 1, 1, 7, 1, -1, 26, 9],
[1, 5, 2, -5, 2, 20, 25, 8],
[1, 4, 3, 1, 1, 32, 1, 7],
...]]
For forecasting, each row has multiple features, which I converted to categorical values (0, 1, 2, ...). I'd then like to apply an embedding layer to each column and concatenate the results to use as the input to the LSTM.
How can I apply an embedding layer per column and concatenate the results?
This is my code:
def get_var(self, name='', shape=None, dtype=tf.float32):
    return tf.get_variable(name, shape, dtype=dtype, initializer=self.initializer)

def get_embedding(self, data='', name='', shape=[], dtype=tf.float32):
    return tf.nn.embedding_lookup(self.get_var(name=name, shape=shape), data)

emb_concat = self.get_embedding(tf.cast(data[:, :, 0], 'int32'), name='emb_ap2id'+type, shape=[3, 2])
emb = self.get_embedding(tf.cast(data[:, :, 1], 'int32'), name='emb_month'+type, shape=[12, 11])
emb_concat = tf.concat(emb_concat, emb, axis=2)
It isn't working, and I get an error message.
[screenshot of the error message]
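One thing worth noting about the last line of the snippet: tf.concat expects a list of tensors as its first argument. A sketch of how the concatenation would likely look (assuming the two embeddings agree in their first two dimensions):

emb_ap2id = self.get_embedding(tf.cast(data[:, :, 0], 'int32'), name='emb_ap2id' + type, shape=[3, 2])
emb_month = self.get_embedding(tf.cast(data[:, :, 1], 'int32'), name='emb_month' + type, shape=[12, 11])
# concatenate along the last (feature) axis: shape [batch, time, 2 + 11]
emb_concat = tf.concat([emb_ap2id, emb_month], axis=-1)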
I use slim.conv2d to set up a VGG net:
with slim.arg_scope([slim.conv2d, slim.max_pool2d], padding='SAME'):
    conv1_1 = slim.conv2d(img, 64, [3, 3], scope='conv1')
    conv1_2 = slim.conv2d(conv1_1, 64, [3, 3], scope='conv1_1')
    pool1 = slim.max_pool2d(conv1_2, [2, 2], 2, scope='pool1_2')
    conv2_1 = slim.conv2d(pool1, 128, [3, 3], 1, scope='conv2_1')
    conv2_2 = slim.conv2d(conv2_1, 128, [3, 3], 1, scope='conv2_2')
    pool2 = slim.max_pool2d(conv2_2, [2, 2], 2, scope='pool2')
    conv3_1 = slim.conv2d(pool2, 256, [3, 3], 1, scope='conv3_1')
    conv3_2 = slim.conv2d(conv3_1, 256, [3, 3], 1, scope='conv3_2')
    conv3_3 = slim.conv2d(conv3_2, 256, [3, 3], 1, scope='conv3_3')
    pool3 = slim.max_pool2d(conv3_3, [2, 2], 2, scope='pool3')
    conv4_1 = slim.conv2d(pool3, 512, [3, 3], scope='conv4_1')
    # print conv4_1.shape
    conv4_2 = slim.conv2d(conv4_1, 512, [3, 3], scope='conv4_2')
    conv4_3 = slim.conv2d(conv4_2, 512, [3, 3], scope='conv4_3')  # 38
If I want to initialize the variables of conv1 or conv2 from an existing VGG model, how can I do it?
You can also use assign_from_values as suggested here:
Github - Initialize layers.convolution2d from numpy array
import pathlib
import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim

sess = tf.Session()
with sess.as_default():
    init = tf.global_variables_initializer()
    sess.run(init)
    path = pathlib.Path('./assets/classifier_weights.npz')
    if path.is_file():
        print("Initialize weights from numpy array")
        init_weights = np.load(path)
        # map variable names in the graph to the arrays loaded from disk
        assign_op, feed_dict_init = slim.assign_from_values({
            'conv1/weights': init_weights['conv1_w'],
        })
        sess.run(assign_op, feed_dict_init)
I'm assuming you have a checkpoint of your existing VGG model.
One way to do this using TF Slim is to restore from a checkpoint, but specify a custom mapping between the variable names in the checkpoint and variables in your model. See the comments here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/learning.py#L146
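A rough sketch of that idea (the checkpoint path and the checkpoint-side variable names below are assumptions, not taken from the question):

import tensorflow as tf
import tensorflow.contrib.slim as slim

# map names as they appear in the pretrained checkpoint to the variables built above
var_mapping = {
    'vgg_16/conv1/conv1_1/weights': slim.get_unique_variable('conv1/weights'),
    'vgg_16/conv1/conv1_1/biases': slim.get_unique_variable('conv1/biases'),
}

init_fn = slim.assign_from_checkpoint_fn('/path/to/vgg_16.ckpt', var_mapping)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    init_fn(sess)  # restores only the mapped variables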
I've replaced the slim.conv2d with tf.nn.conv2d(input, kernel...), where the kernel was created using tf.get_variable, and assigned using tf.assign.
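A minimal sketch of that approach (the .npz file and array names are hypothetical; img is the input tensor from the code above):

import numpy as np
import tensorflow as tf

pretrained = np.load('vgg_weights.npz')  # hypothetical file holding the pretrained kernels

# build conv1_1 by hand instead of with slim.conv2d
kernel = tf.get_variable('conv1_1/weights', shape=[3, 3, 3, 64], dtype=tf.float32)
bias = tf.get_variable('conv1_1/biases', shape=[64], dtype=tf.float32)
conv1_1 = tf.nn.relu(tf.nn.conv2d(img, kernel, strides=[1, 1, 1, 1], padding='SAME') + bias)

# copy the pretrained values into the variables
assign_kernel = tf.assign(kernel, pretrained['conv1_1_w'])
assign_bias = tf.assign(bias, pretrained['conv1_1_b'])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run([assign_kernel, assign_bias])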