I am building a TensorFlow implementation of an autoencoder for time series. I have 2000 time series, each consisting of 501 time components. These time series are stored in a '.mat' file, which I read in as input using scipy.
I then build the autoencoder and train it using batches of the 2000 time series. Finally, I would like to visualize the predictions of the trained autoencoder on the 2000 time series given as input, and compare them with the original series, so that I can see whether the autoencoder is doing a good job of compressing the data.
I use a double-layer autoencoder, with 250 and 100 nodes in the first and second hidden layer, respectively.
My problem is that when I compare the predicted time series with the original ones, the predicted ones have only positive values, while the original time series have both negative and positive values.
Here is the code I have been using:
import scipy.io
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# loadmat returns a dict keyed by the MATLAB variable names; 'data' below is a
# placeholder for whatever the array is actually called inside the .mat file
mat = scipy.io.loadmat('input_time_series.mat')
input_data = mat['data']  # shape (2000, 501)

tf.reset_default_graph()
num_inputs = 501  # number of components in each original time series
num_hid1 = 250
num_hid2 = 100
num_hid3 = num_hid1
num_output = num_inputs
lr = 0.01
actf = tf.nn.relu  # the same activation is used in every layer, output included

X = tf.placeholder(tf.float32, shape=[None, num_inputs])
initializer = tf.variance_scaling_initializer()
w1 = tf.Variable(initializer([num_inputs, num_hid1]), dtype=tf.float32)
w2 = tf.Variable(initializer([num_hid1, num_hid2]), dtype=tf.float32)
w3 = tf.Variable(initializer([num_hid2, num_hid3]), dtype=tf.float32)
w4 = tf.Variable(initializer([num_hid3, num_output]), dtype=tf.float32)
b1 = tf.Variable(tf.zeros(num_hid1))
b2 = tf.Variable(tf.zeros(num_hid2))
b3 = tf.Variable(tf.zeros(num_hid3))
b4 = tf.Variable(tf.zeros(num_output))
hid_layer1 = actf(tf.matmul(X, w1) + b1)
hid_layer2 = actf(tf.matmul(hid_layer1, w2) + b2)
hid_layer3 = actf(tf.matmul(hid_layer2, w3) + b3)
output_layer = actf(tf.matmul(hid_layer3, w4) + b4)
loss = tf.reduce_mean(tf.square(output_layer - X))
optimizer = tf.train.AdamOptimizer(lr)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()
num_epoch = 5000
batch_size = 150
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(num_epoch):
        num_batches = 2000 // batch_size
        for iteration in range(num_batches):
            # slice out the current batch (previously the full dataset was fed here)
            X_batch = input_data[iteration * batch_size:(iteration + 1) * batch_size]
            sess.run(train, feed_dict={X: X_batch})
        train_loss = loss.eval(feed_dict={X: input_data})
        print("epoch {} loss {}".format(epoch, train_loss))
    results = output_layer.eval(feed_dict={X: input_data})
I also include an example comparing one input time series (in blue) with the corresponding one predicted by the autoencoder (in orange).
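For reference, this is roughly how I produce that comparison plot (a sketch using input_data and results from the code above; the series index is arbitrary):
series_idx = 0  # any of the 2000 series
plt.plot(input_data[series_idx], label='original')         # blue
plt.plot(results[series_idx], label='autoencoder output')  # orange
plt.legend()
plt.show()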
I have a GRU model which has 12 features as inputs and I'm trying to predict output power. I really do not understand, though, how to choose between:
1 layer or 5 layers
50 neurons or 512 neurons
10 epochs with a small batch size or 100 epochs with a large batch size
Different optimizers and activation functions
Dropout and L2 regularization
Adding more dense layers
Increasing or decreasing the learning rate
My results are always the same and don't make any sense; my loss and val_loss drop very steeply in the first 2 epochs and then stay essentially constant for the rest, with only small fluctuations in val_loss.
Here are my code, a figure of the losses, and my dataframes if needed:
Dataframe1: https://drive.google.com/file/d/1I6QAU47S5360IyIdH2hpczQeRo9Q1Gcg/view
Dataframe2: https://drive.google.com/file/d/1EzG4TVck_vlh0zO7XovxmqFhp2uDGmSM/view
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from google.colab import files
from tensorboardcolab import TensorBoardColab, TensorBoardColabCallback
tbc=TensorBoardColab() # Tensorboard
from keras.layers.core import Dense
from keras.layers.recurrent import GRU
from keras.models import Sequential
from keras.callbacks import EarlyStopping
from keras import regularizers
from keras.layers import Dropout
df10=pd.read_csv('/content/drive/My Drive/Isolation Forest/IF 10 PERCENT.csv',index_col=None)
df2_10= pd.read_csv('/content/drive/My Drive/2019 Dataframe/2019 10minutes IF 10 PERCENT.csv',index_col=None)
X10_train= df10[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X10_train=X10_train.values
y10_train= df10['Power_kW']
y10_train=y10_train.values
X10_test= df2_10[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X10_test=X10_test.values
y10_test= df2_10['Power_kW']
y10_test=y10_test.values
# scaling values for the model; fit the scalers on the training data only
x_scale = MinMaxScaler()
y_scale = MinMaxScaler()
X10_train = x_scale.fit_transform(X10_train)
y10_train = y_scale.fit_transform(y10_train.reshape(-1,1))
X10_test = x_scale.transform(X10_test)  # transform only: reuse the fit from the training set
y10_test = y_scale.transform(y10_test.reshape(-1,1))
X10_train = X10_train.reshape((-1,1,12))
X10_test = X10_test.reshape((-1,1,12))
Early_Stop=EarlyStopping(monitor='val_loss', patience=3 , mode='min',restore_best_weights=True)
# creating model using Keras
model10 = Sequential()
model10.add(GRU(units=200, return_sequences=True, input_shape=(1,12),activity_regularizer=regularizers.l2(0.0001)))
model10.add(GRU(units=100, return_sequences=True))
model10.add(GRU(units=50))
#model10.add(GRU(units=30))
model10.add(Dense(units=1, activation='linear'))
model10.compile(loss=['mse'], optimizer='adam',metrics=['mse'])
model10.summary()
history10=model10.fit(X10_train, y10_train, batch_size=1500,epochs=100,validation_split=0.1, verbose=1, callbacks=[TensorBoardColabCallback(tbc),Early_Stop])
score = model10.evaluate(X10_test, y10_test)
print('Score: {}'.format(score))
y10_predicted = model10.predict(X10_test)
y10_predicted = y_scale.inverse_transform(y10_predicted)
y10_test = y_scale.inverse_transform(y10_test)
plt.scatter( df2_10['WindSpeed_mps'], y10_test, label='Measurements',s=1)
plt.scatter( df2_10['WindSpeed_mps'], y10_predicted, label='Predicted',s=1)
plt.legend()
plt.savefig('/content/drive/My Drive/Figures/we move on curve6 IF10.png')
plt.show()
I think the number of GRU units is very high there. Too many GRU units might cause a vanishing gradient problem. To start, I would choose 30 to 50 GRU units. Also, a slightly higher learning rate, e.g. 0.001.
If the dataset is publicly available, can you please give me the link so that I can experiment on it and let you know?
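For instance, a minimal sketch of what I mean (assuming the same (1, 12) input shape and single regression output as in your code):
from keras.models import Sequential
from keras.layers import GRU, Dense
from keras.optimizers import Adam

model = Sequential()
model.add(GRU(units=50, input_shape=(1, 12)))   # start with 30-50 units
model.add(Dense(units=1, activation='linear'))
model.compile(loss='mse', optimizer=Adam(lr=0.001))  # explicit learning rate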
I made it slightly better. I used an L2 regularizer of 0.0001, I added two more dense layers with 3 and 5 nodes and no activation functions, added dropout=0.1 for the 2nd and 3rd GRU layers, reduced the batch size to 1000, and also set the loss function to MAE.
Important note: I discovered that my TEST dataframe was extremely small compared to the train one, and that is the main reason it gave me very bad results.
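For anyone interested, the revised stack looks roughly like this (a sketch; exactly where the new dense layers sit is approximate):
from keras.models import Sequential
from keras.layers import GRU, Dense
from keras import regularizers

model10 = Sequential()
model10.add(GRU(units=200, return_sequences=True, input_shape=(1,12),
                activity_regularizer=regularizers.l2(0.0001)))
model10.add(GRU(units=100, return_sequences=True, dropout=0.1))  # dropout on the 2nd GRU
model10.add(GRU(units=50, dropout=0.1))                          # and on the 3rd
model10.add(Dense(units=5))  # no activation, i.e. linear
model10.add(Dense(units=3))  # no activation, i.e. linear
model10.add(Dense(units=1, activation='linear'))
model10.compile(loss='mae', optimizer='adam')  # trained with batch_size=1000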
I am trying to create a CNN in Keras (Python 3.7) which ingests a 2D matrix input (much like a grayscale image) and outputs a 1-dimensional vector. So far I did manage to get results, but I am not sure whether what I am doing is correct (or whether my intuition is).
I input a 100x50 array into my convolutional layer. This 2D array holds the peak information at every position (i.e. the x-axis pertains to the position, the y-axis to the frequency, and each cell gives the intensity). The 3D graph of this shows something akin to the one given in this link.
From all of the literature I have read, I learned that a CNN accepts image data: the image is converted into pixel values and then repeatedly convolved and pooled to get the output. However, I am using a MATLAB simulator to get my input data, and I have access to the raw 2D array containing information on the peak frequency at each point.
My intuition is this: if we normalize each cell and feed the information to the CNN, it will be as if I fed the normalized pixel values of an image to the CNN, since my raw 2D array also has height, width and depth=1, like an image.
Please enlighten me as to whether my thinking is correct or wrong.
My code is as follows:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
import keras
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Reshape
'''load sample input'''
BGS1 = pd.read_csv("C:/Users/strain1_input.csv")
BGS2 = pd.read_csv("C:/Users/strain2_input.csv")
BGS3 = pd.read_csv("C:/Users/strain3_input.csv")
BGS_ = np.array([BGS1, BGS2, BGS3]) #3x100x50 array
BGS_normalized = BGS_/np.amax(BGS_)
'''load sample output'''
BFS1 = pd.read_csv("C:/Users/strain1_output.csv")
BFS2 = pd.read_csv("C:/Users/strain2_output.csv")
BFS3 = pd.read_csv("C:/Users/strain3_output.csv")
BFS_ = np.array([BFS1, BFS2, BFS3]) #3x100
BFS_normalized = BFS_/50 #since the max value for each cell is 50
#after splitting data into training, validation and testing sets,
output_nodes = 100
n_classes = 1
batch_size_ = 8 #so far, optimized for 8 batch size
epoch = 100
input_layer = Input(shape=(45,300,1))
conv1 = Conv2D(16, 3, padding="same", activation="relu", input_shape=(45,300,1))(input_layer)
pool1 = MaxPooling2D(pool_size=(2,2),padding="same")(conv1)
flat = Flatten()(pool1)
hidden1 = Dense(10, activation='softmax')(flat) #relu
batchnorm1 = BatchNormalization()(hidden1)
output_layer = Dense(output_nodes*n_classes, activation="softmax")(batchnorm1)
output_layer2 = Dense(output_nodes*n_classes, activation="relu")(output_layer)
output_reshape = Reshape((output_nodes, n_classes))(output_layer2)
model = Model(inputs=input_layer, outputs=output_reshape)
print(model.summary())
model.compile(loss='mean_squared_error', optimizer='adam', sample_weight_mode='temporal')
model.fit(train_X,train_label,batch_size=batch_size_,epochs=epoch)
predictions = model.predict(train_X)
What you did is exactly the strategy used to feed non-image data into 2D convolutional layers. As long as the model predicts correctly, what you did is correct. It's just that CNNs can perform very poorly on non-image data, or there may be a higher chance of overfitting. But then again, as long as it performs well, it's good.
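For illustration, a small sketch of that idea with made-up array sizes (not your actual data):
import numpy as np

arrays = np.random.rand(3, 100, 50)   # 3 samples of raw 100x50 peak data
arrays = arrays / np.amax(arrays)     # normalize, like pixel intensities
images = arrays[..., np.newaxis]      # add a depth-1 channel: shape (3, 100, 50, 1)
# images can now be fed to Conv2D exactly like a batch of grayscale images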
I have trained a neural net on the MNIST dataset from Kaggle. I am having trouble getting the neural net to predict the number it is receiving.
I don't know what to try to fix this issue.
import pandas as pd
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np

mnist = pd.read_csv(r"C:\Users\Chandrasang\python projects\digit-recognizer\train.csv").values
xtest = pd.read_csv(r"C:\Users\Chandrasang\python projects\digit-recognizer\test.csv").values
ytrain = mnist[:, 0]
xtrain = mnist[:, 1:]
x_train = keras.utils.normalize(xtrain, axis=1)
x_test = keras.utils.normalize(xtest, axis=1)

# reshape the flat 784-pixel rows into 28x28 images
train = x_train.reshape(-1, 28, 28).astype(np.float32)  # (42000, 28, 28)
test = x_test.reshape(-1, 28, 28).astype(np.float32)    # (28000, 28, 28)

model = keras.models.Sequential()
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(256, activation=keras.activations.relu))
model.add(keras.layers.Dense(256, activation=keras.activations.relu))
model.add(keras.layers.Dense(10, activation=keras.activations.softmax))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train, ytrain, epochs=10)
ans = model.predict(test)  # predict on the reshaped test set, matching the training input shape
print(ans[3])
I expect the output to be a whole number; instead it gives me the following array:
[2.7538205e-02 1.0337318e-11 2.9973364e-03 5.7095995e-06 1.6916725e-07
6.9060135e-08 1.3406207e-09 1.1861910e-06 1.4758119e-06 9.6945578e-01]
Your output is normal: it is a vector of probabilities. You have 10 classes (digits from 0 to 9), and your network computes the probability that your image belongs to each class. Looking at your results, your network classified your input as a 9, with a probability of roughly 0.96.
If you want to see just the predicted class, use predict_classes, as Chris A. said.
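For example, with the arrays from the code above:
import numpy as np

digits = np.argmax(ans, axis=1)  # predicted class per test image; digits[3] would be 9
# or, on an older Keras Sequential model:
digits = model.predict_classes(test)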
I am trying to solve a classification problem using a sequential Keras model.
In Keras, model.fit requires two numpy arrays to train on - data, labels.
This works correctly if each row of the data has one corresponding label.
However, for my use, I have more than one classification possible for a given data point.
Can this be handled in keras? If so, what should be the format of my data and labels numpy array?
Sample inputs could look like this:
data[0] = ['What is the colour of the shirt?']
#This text is converted to a vector using a 300 dimension GloVe embedding layer and then processed.
label[0] = ['Red','Orange','Brown']
I require my model to train such that any of the 3 classes can be correct for the given question asked.
Any help would be great.
You can do this with MultiLabelBinarizer:
from sklearn.preprocessing import MultiLabelBinarizer
lb = MultiLabelBinarizer()
label = lb.fit_transform(label)
You can then pass the labels to the fit function with a 'categorical_crossentropy' loss.
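For example, with labels like the ones in your question:
from sklearn.preprocessing import MultiLabelBinarizer

label = [['Red','Orange','Brown'], ['Red']]  # a made-up second sample
lb = MultiLabelBinarizer()
print(lb.fit_transform(label))  # [[1 1 1], [0 0 1]]
print(lb.classes_)              # ['Brown' 'Orange' 'Red']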
If you want to do it with Keras:
from keras.utils import to_categorical
import numpy as np
unique_labels, new_labels = np.unique(label, return_inverse=True)
to_categorical(new_labels, num_classes=None)
import pandas as pd
import numpy as np
from pandas import DataFrame
from random import shuffle
import tensorflow as tf
Taking data from a CSV file (IMDB dataset):
data = pd.read_csv('imdb.csv')
data = data.fillna(-1)  # fillna returns a new DataFrame, so assign it back
features = data.loc[:,['actor_1_facebook_likes','actor_2_facebook_likes','actor_3_facebook_likes','movie_facebook_likes']].values
labels = data.loc[:,['imdb_score']].values
learning_rate = .01
training_epochs = 2000
display_steps = 50
n_samples = features.shape[0]  # number of rows; .size would count every element
Defining placeholders for features and labels:
inputX = tf.placeholder(tf.float32,[None,4])
inputY = tf.placeholder(tf.float32,[None,1])
Defining weights and biases.
The weights and biases are coming out as NaN.
w = tf.Variable(tf.zeros([4,4]))
b = tf.Variable(tf.zeros([4]))
y_values = tf.add(tf.matmul(inputX,w),b)
Applying the neural network:
y = tf.nn.softmax(y_values)
# parenthesize the denominator: the original "/2*n_samples" divided by 2
# and then multiplied by n_samples
cost = tf.reduce_sum(tf.pow(inputY - y, 2)) / (2 * n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(training_epochs):
        sess.run(optimizer, feed_dict={inputX: features, inputY: labels})
        if i % display_steps == 0:
            cc = sess.run(cost, feed_dict={inputX: features, inputY: labels})
            print("cost:", cc)  # display the cost, not just the weights
            print(sess.run(w, feed_dict={inputX: features, inputY: labels}))
Your learning rate is too big (try starting with 1e-3).
Also, your neural network won't learn anything, because you're starting from a condition in which your weights can't change: you have initialized your weights to zero, which is wrong.
Change your weight initialization to random values like this:
w = tf.Variable(tf.truncated_normal([4,4]))
and you'll be able to train your network. (Biases initialized to 0 are OK.)
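If you want a gentler start, a smaller initial scale (just a suggestion, not required) keeps the initial softmax outputs close to uniform:
w = tf.Variable(tf.truncated_normal([4,4], stddev=0.1))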
Use add_check_numerics_ops from the TensorFlow library to check which operation is giving you the NaN values.
https://www.tensorflow.org/api_docs/python/tf/add_check_numerics_ops
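A minimal usage sketch (TF 1.x graph mode, reusing the variable names from the question's code):
check_op = tf.add_check_numerics_ops()  # asserts every float tensor in the graph is finite
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # running the check op alongside the optimizer raises an error that names
    # the first operation producing NaN or Inf
    sess.run([optimizer, check_op], feed_dict={inputX: features, inputY: labels})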