Incompatible shapes in Stellargraph - python

I'm trying to implement a small prototype of the GCN model using the StellarGraph library. My StellarGraph graph object is ready, and I'm trying to solve a multi-class, multi-label classification problem. This means I'm trying to predict more than one column (19, to be exact), where each column is encoded as either 0 or 1.
Here is what I've done:
from sklearn.model_selection import train_test_split
from stellargraph.mapper import FullBatchNodeGenerator
from stellargraph.layer import GCN
from tensorflow.keras import layers, optimizers, losses, metrics, Model
from tensorflow.keras.metrics import Precision
from tensorflow.keras.callbacks import EarlyStopping

train_subjects, test_subjects = train_test_split(nodelist, test_size=.25)
generator = FullBatchNodeGenerator(graph, method="gcn")
train_gen = generator.flow(train_subjects['ID'], train_subjects.drop(['ID'], axis=1))

gcn = GCN(layer_sizes=[16, 16], activations=["relu", "relu"], generator=generator, dropout=0.5)
x_inp, x_out = gcn.in_out_tensors()
predictions = layers.Dense(units=1, activation="sigmoid")(x_out)

model = Model(inputs=x_inp, outputs=predictions)
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.01),
    loss=losses.categorical_crossentropy,
    metrics=[Precision()])

val_gen = generator.flow(test_subjects['ID'], test_subjects.drop(['ID'], axis=1))
es_callback = EarlyStopping(monitor="val_precision", patience=200, restore_best_weights=True)

history = model.fit(
    train_gen,
    epochs=200,
    validation_data=val_gen,
    verbose=2,
    shuffle=False,
    callbacks=[es_callback])
I have 271045 edges and 16354 nodes in total, including 12265 training nodes. The issue I'm getting is a shape mismatch from Keras, which reports the error below. I suspect it's due to passing multiple columns as target columns; I've tried the model using only one column (class) and it worked perfectly.
InvalidArgumentError: Incompatible shapes: [1,12265] vs. [1,233035]
[[node LogicalAnd_1 (defined at tmp/ipykernel_52/2745570431.py:7) ]] [Op:__inference_train_function_1405]
It's worth mentioning that 233035 = 12265 (number of training nodes) times 19 (number of classes). Any idea what is going wrong here?

I figured out the problem.
It was a newbie mistake: I initialized the final Dense classification layer with 1 unit instead of 19 (the number of classes).
I just needed to change that line to:
predictions = layers.Dense(units=19, activation="sigmoid")(x_out)
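As an aside, since this is a multi-label problem with one sigmoid output per class, binary cross-entropy is the usual pairing; here is a minimal sketch of the corrected head (the loss change is my suggestion, not part of the original fix):

# 19 independent binary labels, one sigmoid unit each
predictions = layers.Dense(units=19, activation="sigmoid")(x_out)
model = Model(inputs=x_inp, outputs=predictions)
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.01),
    loss=losses.binary_crossentropy,     # binary cross-entropy suits multi-label targets
    metrics=[Precision()])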
Have a nice day!

Related

Bert prediction shape not equal to num_samples

I have a text classification task that I am trying to do using BERT. The model training code (below) works fine, but I am facing an issue with the prediction part.
from transformers import TFBertForSequenceClassification
import tensorflow as tf
# recommended learning rate for Adam 5e-5, 3e-5, 2e-5
learning_rate = 5e-5
nlabels = 26
# we will do just 1 epoch for illustration, though multiple epochs might be better as long as we will not overfit the model
number_of_epochs = 1
# model initialization
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=nlabels,
                                                        output_attentions=False,
                                                        output_hidden_states=False)
# optimizer Adam
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, epsilon=1e-08)
# we do not have one-hot vectors, so we can use sparse categorical cross-entropy and accuracy
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
bert_history = model.fit(ds_tr_encoded, epochs=number_of_epochs)
I am getting the output using the following
preds = model.predict(ds_te_encoded)
pred_labels_idx = np.argmax(preds['logits'], axis=1)
The issue I am facing is that the length of pred_labels_idx is not the same as the cardinality of ds_te_encoded:
len(pred_labels_idx) #426820
tf.data.experimental.cardinality(ds_te_encoded) #<tf.Tensor: shape=(), dtype=int64, numpy=21341>
Not sure why this is happening.
Since ds_te_encoded is of type tf.data.Dataset and you call cardinality(...), the cardinality in your case is simply the number of batches, not the number of samples. So I am assuming you are using a batch size of 20, because 426820 / 20 = 21341. That is probably what is causing the confusion.
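A quick way to check this (a sketch that assumes ds_te_encoded is a batched tf.data.Dataset) is to count elements before and after unbatching:

import tensorflow as tf

# cardinality() counts dataset elements; after batch() each element is one batch
num_batches = tf.data.experimental.cardinality(ds_te_encoded).numpy()   # e.g. 21341
# unbatch() restores per-sample elements, so this counts individual samples
num_samples = sum(1 for _ in ds_te_encoded.unbatch())                   # e.g. 426820
print(num_batches, num_samples)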

Keras model consistently underestimates the target

I don't understand why my Keras model is underestimating the target. I include a minimal example below. If I simplify the model architecture, the predictions are closer to the true values. But what confuses me is this: if the complex model overfits, why aren't the predictions extremely close to the training targets instead of systematically off like that? (The plot is for the training data.)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from sklearn.metrics import mean_squared_error
def create_dataset(num_series=1, num_steps=1000, period=500, mu=1, sigma=0.3):
    noise = np.random.normal(mu, sigma, size=(num_series, num_steps))
    sin_minumPi_Pi = np.sin(np.tile(np.linspace(-np.pi, np.pi, period), int(num_steps / period)))
    sin_Zero_2Pi = np.sin(np.tile(np.linspace(0, 2 * np.pi, period), int(num_steps / period)))
    pattern = np.concatenate((np.tile(sin_minumPi_Pi.reshape(1, -1),
                                      (int(np.ceil(num_series / 2)), 1)),
                              np.tile(sin_Zero_2Pi.reshape(1, -1),
                                      (int(np.floor(num_series / 2)), 1))),
                             axis=0)
    target = noise + pattern
    return target[0]

avail = create_dataset(mu=5)
window_size = 7

def getdata(data, window_size):
    X, y = np.array([1] * window_size), np.array([])
    for i in range(window_size, len(data)):
        X = np.vstack((X, data[i - window_size:i]))
        y = np.append(y, data[i:i + 1])
    return X[1:], y

X, y = getdata(avail, window_size)

def train_model(X, y, a_dim=100, epoch=50, batch_size=32, d=0.2):
    model = Sequential()
    model.add(Dense(a_dim, activation='relu', input_dim=X.shape[1]))
    model.add(Dropout(d))
    model.add(Dense(a_dim, activation='relu'))
    model.add(Dropout(d))
    model.add(Dense(a_dim, activation='relu'))
    model.add(Dropout(d))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(X, y, epochs=epoch, batch_size=batch_size, verbose=2)
    return model

model = train_model(X, y)
plt.plot(model.predict(X)[:, 0])
plt.plot(y)
plt.show()
It is because your model has been built for this behaviour.
The main purpose of a model is to obtain an optimized function that fits unseen data as well as possible. Aiming for this, your model starts to learn from the training samples. The problem is that during the training process your model tends to overfit the training samples, losing its generalization ability, i.e. the ability to perform well on unseen data.
To avoid this we use several techniques, one of them being dropout.
You used it as well:
model.add(Dropout(d))
with the default parameter d=0.2:
def train_model(X,y,a_dim=100,epoch=50,batch_size=32,d=0.2):
As I wrote above, this is meant to avoid overfitting your training data, and that is why your model systematically misses the training targets (here, under-estimating them) but gets a better estimation on unseen data.
Passing a different dropout value to your train_model() function, you will get a better fit to your training data:
model = train_model(X, y, d=0.0)
plt.plot(model.predict(X)[:,0])
plt.plot(y)
plt.show()
Out: (plot omitted; with d=0.0 the predictions track the training targets much more closely)

Improving the accuracy of Iris ML model using Tensorflow

I'm a beginner in Python and ML. I was practising with the Iris dataset to create an ML model using TensorFlow 2.0.
I parsed the CSV and trained the model on the dataset. I'm able to get 90% training accuracy and 91% validation accuracy.
import tensorflow as tf
import numpy as np
from sklearn import preprocessing
csv_data = np.loadtxt('iris_training.csv',delimiter=',')
target_all = csv_data[:,-1]
csv_data = csv_data[:,0:-1]
# Shuffling the input
shuffled_indices = np.arange(csv_data.shape[0])
np.random.shuffle(shuffled_indices)
shuffled_inputs = csv_data[shuffled_indices]
shuffled_targets = target_all[shuffled_indices]
# Standardize the Inputs
shuffled_inputs = preprocessing.scale(shuffled_inputs)
# Split data into train, validation and test
total_count = shuffled_inputs.shape[0]
train_data_count = int(0.8*total_count)
validation_data_count = int(0.1*total_count)
test_data_count = total_count - train_data_count - validation_data_count
train_inputs = shuffled_inputs[:train_data_count]
train_targets = shuffled_targets[:train_data_count]
validation_inputs = shuffled_inputs[train_data_count:train_data_count+validation_data_count]
validation_targets = shuffled_targets[train_data_count:train_data_count+validation_data_count]
test_inputs = shuffled_inputs[train_data_count+validation_data_count:]
test_targets = shuffled_targets[train_data_count+validation_data_count:]
print(len(train_inputs))
print(len(validation_inputs))
print(len(test_inputs))
# Model Creation
input_size = 4
hidden_layer_size = 100
output_size = 3
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(hidden_layer_size, input_dim=input_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(hidden_layer_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(output_size, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_inputs,train_targets, epochs=10, validation_data=(validation_inputs, validation_targets), verbose=2)
prediction = model.predict(test_inputs)
Please point out anything in my code that I could change to improve the accuracy of my model for this simple Iris dataset.
File used for training my model: Iris CSV
As for your model, you can try hyperparameter tuning; a rough sketch of these adjustments is shown after this list:
- Set the learning rate to a lower value.
- Increase the number of epochs.
- Add more training data, since you have a small dataset; a neural network shines when there is a good amount of data for training.
- Add more layers to the model, add dropout to avoid overfitting, and try different activation functions.
These are the common factors that affect model performance.
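A minimal sketch of those adjustments, reusing the variable names from the question (the specific learning rate, dropout rate, layer count, and epoch count are illustrative guesses, not tuned results):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dropout(0.3),                    # dropout to reduce overfitting
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),   # lower learning rate
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
model.fit(train_inputs, train_targets,
          epochs=100,                                # more epochs than the original 10
          validation_data=(validation_inputs, validation_targets),
          verbose=2)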

Keras model predicting experience from hours

I am very new to Keras, neural networks and machine learning, having just started learning yesterday. I decided to try predicting the experience a user would earn in a given hour (0 to 23), for a game and my own generated dataset. Currently, the predictions from what I have seem very low and very poor. I tried a relu activation, which produced all-zero predictions, and, after a bit of research, LeakyReLU.
This is the code I have for the prediction model so far:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LeakyReLU
import numpy
numpy.random.seed(7)
dataset = numpy.loadtxt("experience.csv", delimiter=",")
X = dataset[:, 0]
Y = dataset[:, 1]
model = Sequential()
model.add(Dense(12, input_dim = 1, activation=LeakyReLU(0.3)))
model.add(Dense(8, activation=LeakyReLU(0.3)))
model.add(Dense(1, activation=LeakyReLU(0.3)))
model.compile(loss = 'mean_absolute_error', optimizer='adam', metrics = ['accuracy'])
model.fit(X, Y, epochs=120, batch_size=10, verbose = 0)
predictions = model.predict(X)
rounded = [round(x[0]) for x in predictions]
print(rounded)
I have also tried playing around with the hidden layers of the network, but honestly have no idea how many there should be or a good way to justify an amount.
If it helps, here is the dataset I have been using:
https://raw.githubusercontent.com/NightShadeII/xpPredictor/master/experience.csv
Thank you for any help.
Looking at your data, it does not seem like a classification problem.
You have two options:
-> Look at the second column, bucket the values into ranges, and make classes that can be predicted, for instance 0, 1, 2, etc. Right now the model tries to train but does not have enough examples for the millions of classes it thinks you are trying to predict.
-> If you want real-valued output and not classes, try using linear regression (a rough sketch follows below).
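A minimal regression-style sketch along those lines, reusing the question's data loading (the linear output layer and the mean-squared-error loss are my assumptions, not the answerer's exact code):

import numpy
from keras.models import Sequential
from keras.layers import Dense

dataset = numpy.loadtxt("experience.csv", delimiter=",")
X = dataset[:, 0]
Y = dataset[:, 1]

model = Sequential()
model.add(Dense(12, input_dim=1, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))         # linear output for a real-valued target

# mean squared error is the usual regression loss; 'accuracy' is not meaningful here
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, Y, epochs=120, batch_size=10, verbose=0)
predictions = model.predict(X)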

How to decide the size of layers in Keras' Dense method?

Below is a simple example of a multi-class classification task with the Iris data.
import seaborn as sns
import numpy as np
from sklearn.cross_validation import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.regularizers import l2
from keras.utils import np_utils
#np.random.seed(1335)
# Prepare data
iris = sns.load_dataset("iris")
iris.head()
X = iris.values[:, 0:4]
y = iris.values[:, 4]
# Make test and train set
train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0)
################################
# Evaluate Keras Neural Network
################################
# Make ONE-HOT
def one_hot_encode_object_array(arr):
    '''One hot encode a numpy array of objects (e.g. strings)'''
    uniques, ids = np.unique(arr, return_inverse=True)
    return np_utils.to_categorical(ids, len(uniques))
train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)
model = Sequential()
model.add(Dense(16, input_shape=(4,),
                activation="tanh",
                W_regularizer=l2(0.001)))
model.add(Dropout(0.5))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# Actual modelling
# If you increase the epochs, the accuracy will increase until it drops at a
# certain point: epoch 50 gives accuracy 0.99, which then drops to 0.977 by
# epoch 70
hist = model.fit(train_X, train_y_ohe, verbose=0, nb_epoch=100, batch_size=1)
score, accuracy = model.evaluate(test_X, test_y_ohe, batch_size=16, verbose=0)
print("Test fraction correct (NN-Score) = {:.2f}".format(score))
print("Test fraction correct (NN-Accuracy) = {:.2f}".format(accuracy))
My question is: how do people usually decide the size of the layers?
For example, based on the code above we have:
model.add(Dense(16, input_shape=(4,),
                activation="tanh",
                W_regularizer=l2(0.001)))
model.add(Dense(3, activation='sigmoid'))
where the first parameter of Dense is 16 and the second is 3.
Why do the two layers use two different values for Dense?
How do we choose the best value for Dense?
Basically it is just trial and error. These are called hyperparameters and should be tuned on a validation set (split your original data into train/validation/test).
Tuning just means trying different combinations of parameters and keeping the one with the lowest loss value or the best accuracy on the validation set, depending on the problem.
There are two basic methods:
Grid search: For each parameter, decide a range and steps within that range, like 8 to 64 neurons in powers of two (8, 16, 32, 64), and try each combination of the parameters. This obviously requires an exponential number of models to be trained and tested, and takes a lot of time.
Random search: Do the same, but just define a range for each parameter and try random sets of parameters drawn from a uniform distribution over each range. You can try as many parameter sets as you want, for as long as you can. This is just an informed random guess.
Unfortunately there is no other way to tune such parameters. As for layers having different numbers of neurons, that could come from the tuning process, or you can also see it as dimensionality reduction, like a compressed version of the previous layer. A rough sketch of a random search over the hidden-layer size is shown below.
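A minimal random-search sketch, reusing the Iris variables from the question (the candidate sizes, the number of trials, and the softmax output layer are illustrative assumptions; ideally you would score each trial on a separate validation split rather than the test set):

import random
from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_model(units):
    # one hidden layer whose size is the hyperparameter under search
    model = Sequential()
    model.add(Dense(units, input_shape=(4,), activation='tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

best_units, best_acc = None, 0.0
for _ in range(10):                                   # 10 random trials
    units = random.choice([8, 16, 32, 64, 128])       # candidate layer sizes (assumed)
    model = build_model(units)
    model.fit(train_X, train_y_ohe, epochs=50, batch_size=1, verbose=0)
    _, acc = model.evaluate(test_X, test_y_ohe, verbose=0)
    if acc > best_acc:
        best_units, best_acc = units, acc
print(best_units, best_acc)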
There is no known way to determine a good network structure just from the number of inputs or outputs. It depends on the number of training examples, the batch size, the number of epochs, basically, on every significant parameter of the network.
Moreover, a high number of units can introduce problems like overfitting and exploding gradients. On the other hand, a lower number of units can cause a model to have high bias and low accuracy. Once again, it depends on the size of the data used for training.
Sadly, it comes down to trying different values and seeing which give the best results. You may choose the combination that gives you the lowest loss and validation loss values, as well as the best accuracy for your dataset, as said in the previous post.
You could do some proportion on your number of units value, something like:
# Build the model
model = Sequential()
model.add(Dense(num_classes * 8, input_shape=(shape_value,), activation = 'relu' ))
model.add(Dropout(0.5))
model.add(Dense(num_classes * 4, activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes * 2, activation = 'relu'))
model.add(Dropout(0.2))
#Output layer
model.add(Dense(num_classes, activation = 'softmax'))
The model above shows an example of a categorisation system. num_classes is the number of different categories the system has to choose from. For instance, in the Iris dataset, we have:
Iris Setosa
Iris Versicolour
Iris Virginica
num_classes = 3
However, this could lead to worse results than other values. We need to adjust the parameters to the training dataset by making several different tries and then analysing the results, seeking the best combination of parameters.
My suggestion is to use EarlyStopping(), then check the number of epochs and the accuracy together with the test loss.
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
lrd = ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=1, factor=0.8, min_lr=1e-6)
es = EarlyStopping(verbose=1, patience=2)
his = classifier.fit(X_train, y_train, epochs=500, batch_size=128, validation_split=0.1, verbose=1, callbacks=[lrd, es])
