I'm a beginner in Python and ML. I was practising on the Iris dataset to create an ML model using TensorFlow 2.0.
I parsed the CSV and trained the model on the dataset. I'm able to get 90% training accuracy and 91% validation accuracy during model creation.
import tensorflow as tf
import numpy as np
from sklearn import preprocessing
csv_data = np.loadtxt('iris_training.csv',delimiter=',')
target_all = csv_data[:,-1]
csv_data = csv_data[:,0:-1]
# Shuffling the input
shuffled_indices = np.arange(csv_data.shape[0])
np.random.shuffle(shuffled_indices)
shuffled_inputs = csv_data[shuffled_indices]
shuffled_targets = target_all[shuffled_indices]
# Standardize the Inputs
shuffled_inputs = preprocessing.scale(shuffled_inputs)
# Split data into train, validation and test sets
total_count = shuffled_inputs.shape[0]
train_data_count = int(0.8*total_count)
validation_data_count = int(0.1*total_count)
test_data_count = total_count - train_data_count - validation_data_count
train_inputs = shuffled_inputs[:train_data_count]
train_targets = shuffled_targets[:train_data_count]
validation_inputs = shuffled_inputs[train_data_count:train_data_count+validation_data_count]
validation_targets = shuffled_targets[train_data_count:train_data_count+validation_data_count]
test_inputs = shuffled_inputs[train_data_count+validation_data_count:]
test_targets = shuffled_targets[train_data_count+validation_data_count:]
print(len(train_inputs))
print(len(validation_inputs))
print(len(test_inputs))
# Model Creation
input_size = 4
hidden_layer_size = 100
output_size = 3
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(hidden_layer_size, input_dim=input_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(hidden_layer_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(output_size, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_inputs,train_targets, epochs=10, validation_data=(validation_inputs, validation_targets), verbose=2)
prediction = model.predict(test_inputs)
Please point out anything in my code that I could change to improve the accuracy of my model on this simple Iris dataset.
File used for training my model: Iris CSV
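For reference, a minimal sketch (reusing the variables from the code above) of how the held-out test split could be scored; model.predict returns softmax probabilities, which argmax turns into class labels:
# score the untouched test split (np is already imported above)
test_loss, test_accuracy = model.evaluate(test_inputs, test_targets, verbose=0)
print("Test accuracy: {:.2%}".format(test_accuracy))
# one probability vector per test sample -> predicted class index (0, 1 or 2)
predicted_classes = np.argmax(prediction, axis=1)
print(predicted_classes[:10])
print(test_targets[:10])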
As for your model, you can try hyperparameter tuning:
Set the learning rate to a lower value.
Increase the number of epochs.
Add more training data, since you currently have a small dataset; neural networks shine when there is a good amount of data for training.
You can also add more layers to the model, add dropout to avoid overfitting,
as well as try different activation functions.
These are the common factors that affect model performance.
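A minimal sketch of what those tweaks could look like on the model from the question; the specific values (1e-4 learning rate, 0.3 dropout rate, 100 epochs) are illustrative assumptions, not tuned settings:
import tensorflow as tf

# same architecture as in the question, with dropout added between the hidden layers
model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, input_dim=4, activation='relu'),
    tf.keras.layers.Dropout(0.3),   # randomly drops 30% of activations to curb overfitting
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='softmax')
])

# lower learning rate than Adam's default 1e-3, and more epochs than the original 10
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_inputs, train_targets, epochs=100,
          validation_data=(validation_inputs, validation_targets), verbose=2)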
I am trying to predict GPU prices. For this I have prepared a custom dataset of shape (135, 39). I am using a simple multi-layer neural network.
I give my network a feature vector of size (39, 1), which includes today's crypto prices and GPU prices. My y vector consists only of GPU prices one month from now.
When I train the network on the shuffled data, the predictions I get turn out to be really good. Even when I predict prices 12 months from now, I still get great results. This does not make any sense at all.
Here are some of the results:
Here's what the dataset looks like
Input matrix X
Output y
And now here's the code
import tensorflow as tf
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split

# df is the (135, 39) DataFrame described above, loaded elsewhere;
# keep the mean/std of the target columns so predictions can be de-normalized later
y_data = df.iloc[:, 23:39]
y_data_mean = y_data.mean()
y_data_std = y_data.std()

# normalize every column of the DataFrame
norm_df = (df - df.mean()) / df.std()

future_months = 1
# inputs are today's rows; targets are the GPU-price columns shifted 5 rows (one month) ahead
X = norm_df[:len(df) - (5 * future_months)]
y = norm_df.iloc[(5 * future_months):, 23:39]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(39,)))
model.add(Dense(25, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Dense(16))
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss=tf.keras.losses.MeanSquaredError())
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=2000)
Here's the train/test MSE loss. As per this graph I am not overfitting on the data. I should have used a separate validation set, but due to the small dataset I only kept train and test sets.
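One thing worth checking, given the time-ordered data described above, is whether the shuffled split lets rows from later dates leak into training. A minimal sketch of a chronological split for comparison (an assumption on my part, not part of the original post; with shuffling off, train_test_split simply takes the last 30% of rows as the test set):
from sklearn.model_selection import train_test_split

# chronological split: the test set is the most recent 30% of rows,
# so the model is evaluated only on dates it never saw during training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=False)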
I'm trying to implement a small prototype of the GCN model using the StellarGraph library. I have my StellarGraph graph object ready and I'm trying to solve a multi-class, multi-label classification problem. This means I'm trying to predict more than one column (19, to be exact), where each column is encoded as either 0 or 1.
Here is what I've done:
from sklearn.model_selection import train_test_split
from stellargraph.mapper import FullBatchNodeGenerator
from stellargraph.layer import GCN
from tensorflow.keras import layers, optimizers, losses, metrics, Model
from tensorflow.keras.metrics import Precision
from tensorflow.keras.callbacks import EarlyStopping

train_subjects, test_subjects = train_test_split(nodelist, test_size=0.25)
generator = FullBatchNodeGenerator(graph, method="gcn")
train_gen = generator.flow(train_subjects['ID'], train_subjects.drop(['ID'], axis=1))
gcn = GCN(layer_sizes=[16, 16], activations=["relu", "relu"], generator=generator, dropout=0.5)
x_inp, x_out = gcn.in_out_tensors()
predictions = layers.Dense(units=1, activation="sigmoid")(x_out)
model = Model(inputs=x_inp, outputs=predictions)
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.01),
    loss=losses.categorical_crossentropy,
    metrics=[Precision()])
val_gen = generator.flow(test_subjects['ID'], test_subjects.drop(['ID'], axis=1))
es_callback = EarlyStopping(monitor="val_precision", patience=200, restore_best_weights=True)
history = model.fit(
    train_gen,
    epochs=200,
    validation_data=val_gen,
    verbose=2,
    shuffle=False,
    callbacks=[es_callback])
I have 271045 edges and 16354 nodes in total, including 12265 training nodes. The issue I'm getting is a shape mismatch from Keras, which states the following. I suspect it's due to passing multiple columns as target columns. I've tried the model using only one column (class) and it worked perfectly.
InvalidArgumentError: Incompatible shapes: [1,12265] vs. [1,233035]
[[node LogicalAnd_1 (defined at tmp/ipykernel_52/2745570431.py:7) ]] [Op:__inference_train_function_1405]
It's worth mentioning that 233035 = 12265 (the number of training nodes) times 19 (the number of classes). Any idea what is going wrong here?
I figured out the problem.
It was a newbie mistake: I had initialized the Dense classification layer with 1 unit instead of 19 (the number of classes).
I just needed to fix that line to:
predictions = layers.Dense(units=19, activation="sigmoid")(x_out)
Have a nice day!
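As a side note (not part of the original fix), with 19 independent 0/1 target columns a sigmoid output layer is usually paired with binary cross-entropy rather than categorical cross-entropy. A minimal sketch of that variant, reusing x_inp and x_out from the question:
from tensorflow.keras import layers, losses, optimizers, Model
from tensorflow.keras.metrics import Precision

# one sigmoid unit per label column, so the output matches the 19 target columns
predictions = layers.Dense(units=19, activation="sigmoid")(x_out)
model = Model(inputs=x_inp, outputs=predictions)
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.01),
    # binary cross-entropy scores each of the 19 labels independently,
    # which is the usual choice for a multi-label sigmoid head
    loss=losses.binary_crossentropy,
    metrics=[Precision()])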
I have a text classification task that I am trying to do using BERT. Below is the code I am using. The model training code (below) works fine, but I am facing an issue with the prediction part.
from transformers import TFBertForSequenceClassification
import tensorflow as tf
# recommended learning rate for Adam 5e-5, 3e-5, 2e-5
learning_rate = 5e-5
nlabels = 26
# we will do just 1 epoch for illustration, though multiple epochs might be better as long as we do not overfit the model
number_of_epochs = 1
# model initialization
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased',
                                                        num_labels=nlabels,
                                                        output_attentions=False,
                                                        output_hidden_states=False)
# optimizer Adam
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, epsilon=1e-08)
# we do not have one-hot vectors, so we can use sparse categorical cross entropy and accuracy
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
bert_history = model.fit(ds_tr_encoded, epochs=number_of_epochs)
I am getting the output using the following
preds = model.predict(ds_te_encoded)
pred_labels_idx = np.argmax(preds['logits'], axis=1)
The issue I am facing is that the length of pred_labels_idx is not the same as the cardinality of ds_te_encoded:
len(pred_labels_idx) #426820
tf.data.experimental.cardinality(ds_te_encoded) #<tf.Tensor: shape=(), dtype=int64, numpy=21341>
Not sure why this is happening.
Since ds_te_encoded is of type tf.data.Dataset and you call cardinality(...), the cardinality in your case is simply the rounded number of batches and not the number of samples. So I am assuming you are using a batch size of 20, because 426820/20 = 21341. That is probably what is causing the confusion.
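A small, self-contained illustration of the difference between batches and samples (a toy dataset, not your actual pipeline):
import tensorflow as tf

# 100 samples, batched in groups of 20 -> the dataset yields 5 elements (batches)
ds = tf.data.Dataset.range(100).batch(20)

# cardinality counts the dataset's elements, which after .batch() are batches, not samples
print(tf.data.experimental.cardinality(ds).numpy())  # 5

# summing the rows of every batch recovers the number of samples
n_samples = sum(int(batch.shape[0]) for batch in ds)
print(n_samples)  # 100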
I have a question regarding the evaluation of an LSTM model. I have trained an LSTM model and stored it with model.save(...). Now I want to load the model and evaluate it on the validation dataset. Since neural networks are stochastic, I run the evaluation several times and compute the mean and variance of the different metrics I am interested in.
Now I am shocked that after the first run all consecutive runs have the same performance on every metric. I don't think that is right, but I don't know where the error occurs.
So my question is:
what is my mistake in setting up the validation of my model?
and how can I fix that?
Here are the code snippets that should explain what I am doing:
Compile and fit the Model
def compile_and_fit(hparams, MAX_EPOCHS, model_path):
    window = WindowGenerator(input_width=hparams[HP_WINDOW_SIZE],
                             label_width=hparams[HP_WINDOW_SIZE], shift=1,
                             label_columns=['q_MARI'], batch_size=hparams[HP_BATCH_SIZE])
    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_1"),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT], name="Dropout_1"),
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_2"),
        tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))
    ])
    learning_rate = hparams[HP_LEARNING_RATE]
    model.compile(loss=tf.losses.MeanSquaredError(),
                  optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
                  metrics=get_metrics())
    history = model.fit(window.train,
                        epochs=MAX_EPOCHS,
                        validation_data=window.val,
                        callbacks=get_callbacks(model_path))
    _, a, _, _, _, _ = model.evaluate(window.val)
    return a, model, history
Train and save it
a, model, history = compile_and_fit( hparams = hparams, MAX_EPOCHS = MAX_EPOCHS, model_path = run_path)
model.save(run_path)
Load and evaluate it
model = tf.keras.models.load_model(os.path.join(hparam_path, model_name),
                                   custom_objects={"max_error": max_error, "median_absolute_error": median_absolute_error, "rev_metric": rev_metric, "nse_metric": nse_metric})
model.compile(loss=tf.losses.MeanSquaredError(), optimizer="adam", metrics=get_metrics())
metric_values = np.empty(shape=(nr_runs, len(metrics)), dtype=float)
for j in range(nr_runs):
    window = WindowGenerator(input_width=hparam_vals[i], label_width=hparam_vals[i], shift=1,
                             label_columns=['q_MARI'])
    metric_values[j] = np.array(model.evaluate(window.val))
means = metric_values.mean(axis=0)
varis = metric_values.var(axis=0)
print(f'means: {means}, varis: {varis}')
The results I am getting
For setting up the Training I follow those two guides:
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams
An LSTM is not stochastic at inference time, so evaluation results should be the same for the same data and the same weights.
There are two steps here. When you train the model, randomness (weight initialization, data shuffling, dropout) influences the model you end up with. However, once you have saved that model, the prediction results will be the same every time you evaluate it on the same data with the same model.
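A small sketch of the distinction (a hypothetical toy model, not the LSTM from the question): two freshly built models differ because their weights are initialized randomly, but once the weights are fixed, the forward pass is deterministic.
import numpy as np
import tensorflow as tf

x = np.random.rand(8, 5, 3).astype("float32")  # (batch, timesteps, features)

def build():
    return tf.keras.Sequential([tf.keras.layers.LSTM(4, return_sequences=True),
                                tf.keras.layers.Dense(1)])

m1, m2 = build(), build()
print(np.allclose(m1(x), m2(x)))   # almost surely False: different random initial weights

m2.set_weights(m1.get_weights())   # copy the weights, as loading a saved model effectively does
print(np.allclose(m1(x), m2(x)))   # True: same weights + same inputs -> same outputs
print(np.allclose(m1(x), m1(x)))   # True: repeated evaluation of one model is deterministic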
I am training an LSTM model with embedding input layer with a vocabulary size of approximately 100,000. While profiling the training via tensorboard, I discovered that most of the training time is spent on "Kernel Launch" (58%), followed by "All Others" (36%). In other words the GPU is idle most of the time due to overhead. The high kernel launch time seems to be driven by the size of the embedding layer.
My question is: how can I improve the training speed? Is it inevitable that most of the training time is spent on kernel launch when working with a large-ish embedding? Increasing the batch size (currently at 128) would help since the kernel launch time doesn't depend on the batch size, but 128 is already on the high side.
I'm also not sure what exactly falls under "All Others".
I am working on a Tesla T4 GPU with Tensorflow 2.2.0, but I see the same behavior using the nightly build.
Following the RNN tutorial on tensorflow.org (https://www.tensorflow.org/tutorials/text/text_classification_rnn), here is an example that highlights the performance issues:
import tensorflow_datasets as tfds
import tensorflow as tf
from datetime import datetime
from tqdm.auto import tqdm
### retrieve data ###
# use imdb_reviews dataset from TFDS
dataset = tfds.load('imdb_reviews',as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']
### get encoder ###
# initialize tokenizer
tokenizer = tfds.features.text.Tokenizer()
# build vocabulary
def addOrUpdate(d, token):
    d[token] = d.get(token, 0) + 1

vocab = dict()
dataset_iter = iter(train_dataset)
for el in tqdm(dataset_iter):
    text = el[0].numpy().decode("utf-8")
    for token in tokenizer.tokenize(text):
        addOrUpdate(vocab, token)
# shrink vocabulary (MIN_COUNT>1 significantly reduces model dimension)
MIN_COUNT = 1
vocab_subset = set([k for k,v in vocab.items() if v >= MIN_COUNT])
print("Using vocabulary subset with min_count={:}: {:,} words, ".format(MIN_COUNT,len(vocab_subset)))
# create encoder
encoder = tfds.features.text.TokenTextEncoder(vocab_subset)
### Prepare the data for training ###
def encode(text_tensor, label):
    encoded_text = encoder.encode(text_tensor.numpy())
    return encoded_text, label

def encode_map_fn(text, label):
    # encode
    encoded_text, label = tf.py_function(encode,
                                         inp=[text, label],
                                         Tout=(tf.int64, tf.int64))
    # set shapes
    encoded_text.set_shape([None])
    label.set_shape([])
    return encoded_text, label
train_dataset = train_dataset.map(encode_map_fn)
test_dataset = test_dataset.map(encode_map_fn)
BUFFER_SIZE = 25000
BATCH_SIZE = 128
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE)
### create the model ###
model = tf.keras.Sequential([
tf.keras.layers.Embedding(encoder.vocab_size, 256, mask_zero=True),
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(1)
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
optimizer=tf.keras.optimizers.Adam(1e-4),
metrics=['accuracy'])
### Train the model ###
# create tensorboard callback
log_path = 'logs_'+datetime.now().strftime("%Y%m%d_%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_path,
profile_batch = '10,20')
history = model.fit(train_dataset, epochs=1, steps_per_epoch=30,
callbacks=[tensorboard_callback])
Same code in a Colab Notebook: https://colab.research.google.com/drive/1WoAShXR2cGOYWPQoKdh4IGlhZh4FAK7o?usp=sharing
I haven't tried your code, but from looking at it, I guess the following issue might be related:
If a GPU is present but eager execution is enabled, Embedding layers are still placed on the CPU.
See https://github.com/tensorflow/tensorflow/issues/44194 (it includes a workaround).
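I can't reproduce the exact workaround from that issue here, but a quick way to confirm where the embedding lookup actually runs is TensorFlow's device-placement logging (a general facility, shown as a sketch):
import tensorflow as tf

# log the device every op is assigned to; embedding lookups that land on the CPU
# show up as /device:CPU:0 in the output
tf.debugging.set_log_device_placement(True)

embedding = tf.keras.layers.Embedding(100000, 256)
out = embedding(tf.constant([[1, 2, 3]]))
print(out.device)  # e.g. '/job:localhost/replica:0/task:0/device:CPU:0'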