MinMax inverse_transform after LSTM + Dense(1) layers - python

My initial data is:
[[ 8375.5 0. 8374.14285714 8374.14285714]
[ 8354.5 0. 8383.39285714 8371.52380952]
...
[11060. 0. 11055.21428571 11032.53702732]
[11076.5 0. 11061.60714286 11038.39875701]]
I create a MinMaxScaler to transform the data to values from 0 to 1:
scaler = MinMaxScaler(feature_range = (0, 1))
T = scaler.fit_transform(T)
So the data is now:
[[0.5186697 , 0. , 0.46812344, 0.46950912],
[0.5161844 , 0. , 0.46935928, 0.46915412],
...,
[0.72264636, 0. , 0.6767292 , 0.6807525 ],
[0.7198651 , 0. , 0.6785377 , 0.6833385 ]]
I do some magic to prepare this data for the LSTM layer, and the result is an X_train variable of shape (6989, 4, 200):
[[[0.5186697 0. 0.46812344 ... 0. 0.45496237 0.45219505]
[0.48742527 0. 0.45273864 ... 0. 0.43144143 0.431924 ]
[0.4800284 0. 0.43054438 ... 0. 0.425362 0.4326681 ]
[0.5007989 0. 0.4290794 ... 0. 0.4696839 0.47831726]]
...
[[0.61240304 0. 0.57254803 ... 0. 0.5749577 0.57792616]
[0.61139715 0. 0.5746571 ... 0. 0.5971378 0.6017289 ]
[0.6365465 0. 0.59772 ... 0. 0.62671924 0.63145673]
[0.65719867 0. 0.62684333 ... 0. 0.6757128 0.6772785 ]]]
I process the data using this model, with a Dense(1) layer at the end:
model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', #return_sequences = True,
               input_shape = (X_train.shape[1], window_size)))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'linear'))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
When I set return_sequences to False, the shape of the predictions after fitting is (6989, 1), and when I call scaler.inverse_transform(train_predict) with this scaler I get an error:
ValueError: non-broadcastable output operand with shape (6989,1) doesn't match the broadcast shape (6989,4)
When I set return_sequences to True, the new shape is (6989, 4, 1), and when I call inverse_transform I get another error:
ValueError: Found array with dim 3. None expected <= 2.
============
I think I know why I get these errors: the scaler requires a shape of (6989, 4). But what can I do to transform this data so that I am able to inverse_transform it?
How can I inverse_transform the data of new shape of (6989, 1)?
How can I inverse_transform the data of new shape of (6989, 4, 1)?
Is it doable? Can my scaler be used, or should I create a new one? Can you suggest something? What am I missing?
I will appreciate any help, thanks!
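A common workaround (a minimal sketch, not a definitive fix): the fitted scaler expects four columns, so either pad the (6989, 1) predictions back into a (6989, 4) array before calling inverse_transform, or fit a second scaler on the target column only. For the (6989, 4, 1) output, reshape it to two dimensions first (e.g. train_predict.reshape(-1, 1)). The column index below is an assumption; use whichever column your prediction corresponds to.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Option 1: pad the (6989, 1) predictions into the (6989, 4) shape the fitted scaler expects.
# Assumption: the predicted value corresponds to column 3 of the original data.
padded = np.zeros((train_predict.shape[0], 4))
padded[:, 3] = train_predict.ravel()
unscaled = scaler.inverse_transform(padded)[:, 3]

# Option 2: fit a dedicated scaler on the target column only and reuse it for predictions.
# `T_raw` is a hypothetical name for the data before scaling.
target_scaler = MinMaxScaler(feature_range=(0, 1))
target_scaler.fit(T_raw[:, [3]])
unscaled = target_scaler.inverse_transform(train_predict)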


predict() values are zero or NaN

I have a GRU model
GRU = keras.models.Sequential([keras.layers.GRU(32),
                               keras.layers.Dense(32, activation='relu'),
                               keras.layers.Dense(1, activation=None)])
GRU.compile(loss="mae", optimizer="adam")
resultsGRU = GRU.fit_generator(generator=train, validation_data=train, epochs=3, verbose=1, shuffle=False)
If I convert the training data to a numpy array, I can see that I don't have any zero or NaN values (I also drop NaN values beforehand):
trainArray= np.array(train)
print(trainArray)
I copied only part of the array, just so you can see the values:
[[array([[[-0.86286026, 0.51805955, 1.0427724 , ..., 0.27464896,
0.08823532, -1.1183959 ],
[-0.3186916 , 0.00295895, 0.740636 , ..., 0.27464896,
0.08823532, -1.1304985 ],
[-0.31057638, 0.00295895, 0.5593542 , ..., 0.27464896,
-0.5521559 , -1.1183959 ],
...,
If I print resultsGRU
print(resultsGRU.history.values())
I get
dict_values([[0.597104012966156, 0.5652544498443604, 0.5574262142181396], [0.6241905093193054, 0.6183988451957703, 0.6134349703788757]])
Then I use predict, but the returned values are all 0:
predictGRU = GRU.predict(test)
print(predictGRU)
[0. 0. 0. ... 0. 0. 0.]
I then save this model and use it in an API, and the values are NaN.
What is the problem here? How do I get the model to predict a different, reasonable value?
I also use metrics later on
print(metrics.mean_absolute_error(test, predictGRU))
print(metrics.mean_squared_error(test, predictGRU))
print(metrics.explained_variance_score(test, predictGRU))
And I get normal numbers
0.6471065
0.50334525
0.23076766729354858
I don't know how to fix this on my own.
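For debugging, a minimal sketch (this assumes test is a keras.utils.Sequence-style generator, e.g. a TimeseriesGenerator, so indexing it returns an (x_batch, y_batch) pair): pull one batch out of the generator, check it for NaNs, and predict on it directly so the input shape is explicit.
import numpy as np

x_batch, y_batch = test[0]                          # first batch from the generator
print(x_batch.shape, x_batch.dtype)
print("NaNs in inputs:", np.isnan(x_batch).any())

preds = GRU.predict(x_batch)
print(preds[:5])                                    # should be non-zero, non-NaN values

# Also check whether the trained weights themselves contain NaNs
print("NaNs in weights:", any(np.isnan(w).any() for w in GRU.get_weights()))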

Multi class confusion matrix Keras tool

I'm building a neural network designed to classify between 10 different compounds. The data set is something like:
array([[400. , 23. , 52.38, ..., 1. , 0. , 0. ],
[400. , 21.63, 61.61, ..., 0. , 0. , 0. ],
[400. , 21.49, 61.95, ..., 0. , 0. , 0. ],
...,
[400. , 21.69, 41.98, ..., 0. , 0. , 0. ],
[400. , 22.48, 65.2 , ..., 0. , 0. , 0. ],
[400. , 22.02, 58.91, ..., 0. , 0. , 1. ]])
where the last 10 numbers are the one-hot encoding of the compounds I want to identify. This is the code I'm using:
dataset = numpy.asfarray(dataset[1:, 0:], float)
x = dataset[0:, 0:30]
y = dataset[0:, 30:40]
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.20, random_state=1)  # it has always been 42
standard = preprocessing.StandardScaler().fit(x_train)
x_train = standard.transform(x_train)
x_test = standard.transform(x_test)
dump(standard, 'std_modelo_400.bin', compress=True)

model = Sequential()
model.add(Dense(50, input_dim=x_test.shape[1], activation='relu', kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(30, input_dim=x_test.shape[1], activation='relu', kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(15, input_dim=x_test.shape[1], activation='relu', kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(10, activation='softmax', kernel_initializer='normal', bias_initializer=keras.initializers.Constant(value=0)))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), verbose=2, epochs=epochs, batch_size=batch_size)  # callbacks=[monitor], verbose=2
I try to get the confusion matrix using multilabel_confusion_matrix(y_test, pred) and I get it in this form:
array([[[929681, 158],
[ 308, 102180]],
[[930346, 407],
[ 6677, 94897]],
[[930740, 38],
[ 477, 101072]],
[[929287, 1522],
[ 69, 101449]],
[[929703, 8843],
[ 12217, 81564]],
[[902624, 474],
[ 1565, 127664]],
[[931152, 2236],
[ 12140, 86799]],
[[929085, 10],
[ 0, 103232]],
[[911158, 22378],
[ 5362, 93429]],
[[930412, 689],
[ 617, 100609]]], dtype=int64)
When I use multilabel_confusion_matrix(y_test,pred,labels=["Comp1","Comp2","Comp3", "Comp4", "Comp5", "Comp6", "Comp7", "Comp8", "Comp9", "Comp10",]) I get an error:
elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask &= (ar1 != a)
Traceback (most recent call last):
File "<ipython-input-18-00af06ffcbef>", line 1, in <module>
multilabel_confusion_matrix(y_test,pred,labels=["Comp1","Comp2","Comp3", "Comp4", "Comp5", "Comp6", "Comp7", "Comp8", "Comp9", "Comp10",])
File "C:\Users\fmarin\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 485, in multilabel_confusion_matrix
if np.max(labels) > np.max(present_labels):
I have no idea how to fix it. I'd also like to get a graphical version of the confusion matrix; I'm using the scikit-learn toolbox.
Thank you!
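A hedged sketch for both points (assuming pred = model.predict(x_test) holds the softmax outputs, and scikit-learn >= 0.22): y_test is one-hot encoded, so string labels like "Comp1" never appear in it, which is what makes the labels argument fail. Converting both arrays to class indices gives a single 10x10 confusion matrix that can be plotted with ConfusionMatrixDisplay.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

class_names = ['Comp1', 'Comp2', 'Comp3', 'Comp4', 'Comp5',
               'Comp6', 'Comp7', 'Comp8', 'Comp9', 'Comp10']

pred = model.predict(x_test)                # softmax probabilities, shape (n_samples, 10)
y_true = np.argmax(y_test, axis=1)          # one-hot rows -> class indices
y_pred = np.argmax(pred, axis=1)

cm = confusion_matrix(y_true, y_pred)       # single 10x10 matrix
ConfusionMatrixDisplay(cm, display_labels=class_names).plot(xticks_rotation=45)
plt.show()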

calling Keras predict on CNN simply returns the input

Based on a matrix, I am trying to approximate a value (regression). However, the CNN always predicts a matrix which is identical to the input of predict.
I am not getting any errors.
The data (matrices) used for training are stored in a numpy array but I only have around 9000 samples available. The values for each matrix are stored in a one dimensional array (one value for each matrix).
This is my model:
model = keras.Sequential([
    layers.Conv2D(64, kernel_size=3, activation='selu', input_shape=(8, 8, 1)),
    layers.Conv2D(64, kernel_size=3, activation='selu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=2, activation='selu'),
    layers.Flatten(),
    layers.Dense(1, activation='linear')
])
optimizer = keras.optimizers.RMSprop(0.001)
model.compile(optimizer=optimizer,
              loss='mean_squared_error',
              metrics=['mean_squared_error'])
model.fit(matrices, values, epochs=10)
test_loss = model.evaluate(test_boards, test_values, verbose=2)
Example output from calling prediction = model.predict(some_matrix) can be found below; in this case some_matrix is identical to the output.
[[ 51. 0. 33. 0. 100. 33. 0. 51.]
[ 10. 10. 10. 0. 0. 10. 10. 10.]
[ 0. 0. 32. 0. 0. 32. 0. 0.]
[ 0. 0. 0. 88. 10. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. -10. 0. -32. 0. 0.]
[ -10. -10. -10. 0. 0. -10. -10. -10.]
[ -51. -32. -33. -88. -100. -33. 0. -51.]]
What am I missing to get a single value as output? Or at least a modified version of the input?
Edit:
My matrix data (did not fit in a free pastebin account, sorry)
My values
An example google colab file
I did not find a way to upload the data to Google Colab and include it in the link; I'm sorry for the inconvenience.
I did get an error this time which I did not get when running the code in my own environment. This is definitely the issue, but I am still unsure how to fix it.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-595f98617fa0> in <module>()
97 [ -51, -32, -33, -88, -100, -33, 0, -51,]])
98 print(test_boards[0])
---> 99 prediction = model.predict(test_boards[0])
100 print("Prediction:")
101 print(prediction)
3 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
561 ': expected ' + names[i] + ' to have ' +
562 str(len(shape)) + ' dimensions, but got array '
--> 563 'with shape ' + str(data_shape))
564 if not check_batch_axis:
565 data_shape = data_shape[1:]
ValueError: Error when checking input: expected conv2d_12_input to have 4 dimensions, but got array with shape (8, 8, 1)
You need to add the batch size dimension to the test sample.
some_matrix = some_matrix[np.newaxis,:,:,np.newaxis]
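For clarity, a minimal sketch of the full call (assuming some_matrix is a single 8x8 board; the reshape matches the model's expected (batch, 8, 8, 1) input):
import numpy as np

some_matrix = np.asarray(some_matrix, dtype='float32')    # shape (8, 8)
batch = some_matrix[np.newaxis, :, :, np.newaxis]         # shape (1, 8, 8, 1)
prediction = model.predict(batch)                         # shape (1, 1)
print(prediction[0, 0])                                   # the single regressed value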

How to parse the data after Bert embedding?

I am doing binary classification on news title sentences (to determine whether the news is politically biased).
I am using the BERT embedding from https://pypi.org/project/bert-embedding/ to embed the training sentences (one row is one title sentence) in DataFrames and then feed the vectorised data into logistic regression, but the output shape from the BERT embedding doesn't fit the logistic regression model. How can I parse it so that it does?
Before, I used TfidfVectorizer; it works perfectly and the output is a numpy array like
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
Each row is the vectorised data for one sentence, and it is an array of size 1903. I have 516 titles in the training data.
The output shapes are:
train_x.shape: (516, 1903) test_x.shape (129, 1903)
train_y.shape: (516,) test_y.shape (129,)
But after I switched to BertEmbedding, the output for ONE row is a list of numpy arrays like
[list([array([ 9.79349554e-01, -7.06475616e-01 ...... ]dtype=float32),
array([ ........ ],dtype=float32), ......................
array([ ........ ],dtype=float32)]
the output shape is like:
train_x.shape: (516, 1) test_x.shape (129, 1)
train_y.shape: (516,) test_y.shape (129,)
def transform_to_Bert(articles_file: str, classified_articles_file: str):
    df = get_df_from_articles_file(articles_file, classified_articles_file)
    df_train, df_test, _, _ = train_test_split(df, df.label, stratify=df.label, test_size=0.2)
    bert_embedding = BertEmbedding()
    df_titles_values = df_train.title.values.tolist()
    result_train = bert_embedding(df_titles_values)
    result_test = bert_embedding(df_test.title.values.tolist())
    train_x = pd.DataFrame(result_train, columns=['A', 'Vector'])
    train_x = train_x.drop(columns=['A'])
    test_x = pd.DataFrame(result_test, columns=['A', 'Vector'])
    test_x = test_x.drop(columns=['A'])
    test_x = test_x.values
    train_x = train_x.values
    print(test_x)
    print(train_x)
    train_y = df_train.label.values
    test_y = df_test.label.values
    return {'train_x': train_x, 'test_x': test_x, 'train_y': train_y, 'test_y': test_y, 'input_length': train_x.shape[1], 'vocab_size': train_x.shape[1]}
Column A is the original title string in the result. So I just drop it.
Below is the code where I use the TF-IDF vectoriser, which works for the logistic model.
def transform_to_tfid(articles_file: str, classified_articles_file: str):
    df = get_df_from_articles_file(articles_file, classified_articles_file)
    df_train, df_test, _, _ = train_test_split(df, df.label, stratify=df.label, test_size=0.2)
    vectorizer = TfidfVectorizer(stop_words='english')
    vectorizer.fit(df_train.title)
    train_x = vectorizer.transform(df_train.title)
    train_x = train_x.toarray()
    print(type(train_x))
    print(train_x)
    test_x = vectorizer.transform(df_test.title)
    test_x = test_x.toarray()
    print(test_x)
    train_y = df_train.label.values
    test_y = df_test.label.values
    return {'train_x': train_x, 'test_x': test_x, 'train_y': train_y, 'test_y': test_y, 'input_length': train_x.shape[1], 'vocab_size': train_x.shape[1]}
model=LogisticRegression(solver='lbfgs')
model.fit(train_x, train_y)
The error is: ValueError: setting an array element with a sequence.
I expected the output shape from BERT (train_x.shape: (516, 1), test_x.shape: (129, 1)) to be like that from TF-IDF (train_x.shape: (516, 1903), test_x.shape: (129, 1903)), so that it fits the logistic model.
OK, that was my mistake (or maybe it's a bad convention from the library author). The output is not:
[list([array([ 9.79349554e-01, -7.06475616e-01 ...... ]dtype=float32),
array([ ........ ],dtype=float32), ......................
array([ ........ ],dtype=float32)]
it is actually:
[[list([array([ 9.79349554e-01, -7.06475616e-01 ...... ]dtype=float32),
array([ ........ ],dtype=float32), ......................
array([ ........ ],dtype=float32)]]
So you have to index 0 to reach it.
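Building on that, a minimal sketch of one way to get a fixed-length feature matrix out of bert_embedding's output (assumptions: each element of the result is a (tokens, token_vectors) pair and the vectors are 768-dimensional, as for bert-base): average the per-token vectors of each title, which yields the (n_samples, n_features) shape LogisticRegression expects.
import numpy as np
from sklearn.linear_model import LogisticRegression

def to_feature_matrix(bert_result):
    # Each element is assumed to be (tokens, token_vectors), where token_vectors
    # is a list of 768-dimensional arrays; averaging gives one vector per title.
    return np.vstack([np.mean(np.vstack(token_vecs), axis=0)
                      for _, token_vecs in bert_result])

train_x = to_feature_matrix(result_train)   # shape (516, 768)
test_x = to_feature_matrix(result_test)     # shape (129, 768)

model = LogisticRegression(solver='lbfgs')
model.fit(train_x, train_y)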

Many-to-many lstm model on varying samples

I have recently started learning how to build an LSTM model for multivariate time series data. I have looked here and here at how to pad sequences and implement a many-to-many LSTM model. I have created a dataframe to test the model, but I keep getting an error (below).
d = {'ID': ['a12', 'a12', 'a12', 'a12', 'a12', 'b33', 'b33', 'b33', 'b33', 'v55', 'v55', 'v55', 'v55', 'v55', 'v55'],
     'Exp_A': [2.2, 2.2, 2.2, 2.2, 2.2, 3.1, 3.1, 3.1, 3.1, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5],
     'Exp_B': [2.4, 2.4, 2.4, 2.4, 2.4, 1.2, 1.2, 1.2, 1.2, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5],
     'A': [0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
     'B': [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1],
     'Time_Interval': ['11:00:00', '11:10:00', '11:20:00', '11:30:00', '11:40:00',
                       '11:00:00', '11:10:00', '11:20:00', '11:30:00',
                       '11:00:00', '11:10:00', '11:20:00', '11:30:00', '11:40:00', '11:50:00']}
df = pd.DataFrame(d)
df.set_index('Time_Interval', inplace=True)
I tried to pad using brute force:
from keras.preprocessing.sequence import pad_sequences

x1 = df['A'][df['ID'] == 'a12']
x2 = df['A'][df['ID'] == 'b33']
x3 = df['A'][df['ID'] == 'v55']
mx = df.groupby('ID').size().max()  # size of the largest group
seq1 = [x1, x2, x3]
padded1 = np.array(pad_sequences(seq1, maxlen=6, dtype='float32')).reshape(-1, mx, 1)
In similar ways I have created padded2, padded3 and padded4 for each feature:
padded_data = np.dstack((padded1, padded1, padded3, padded4))
padded_data.shape  # (3, 6, 4)
padded_data
array([[[0. , 0. , 0. , 0. ],
[0. , 0. , 2.2, 2.4],
[0. , 0. , 2.2, 2.4],
[1. , 1. , 2.2, 2.4],
[0. , 0. , 2.2, 2.4],
[1. , 1. , 2.2, 2.4]],
[[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 3.1, 1.2],
[1. , 1. , 3.1, 1.2],
[0. , 0. , 3.1, 1.2],
[1. , 1. , 3.1, 1.2]],
[[0. , 0. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[0. , 0. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5]]], dtype=float32)
Edit:
#split into train/test
train = pad_1[:2] # train on the 1st two samples.
test = pad_1[-1:]
train_X = train[:,:-1] # one step ahead prediction.
train_y = train[:,1:]
test_X = test[:,:-1] # test on the last sample
test_y = test[:,1:]
# check shapes
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
#(2, 5, 4) (2, 5, 4) (1, 5, 4) (1, 5, 4)
# design network
model = Sequential()
model.add(Masking(mask_value=0., input_shape=(train.shape[1], train.shape[2])))
model.add(LSTM(32, input_shape=(train.shape[1], train.shape[2]), return_sequences=True))
model.add(Dense(4))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()
# fit network
history = model.fit(train, test, epochs=300, validation_data=(test_X, test_y), verbose=2, shuffle=False)
(screenshot of the error message)
So my questions are:
Surely, there must be an efficient way of transforming the data?
Say I want a single time-step prediction for a future sequence, and I have:
first time-step = array([[[0.5, 0.9, 2.5, 3.5]]], dtype=float32)
where the first time-step is a single 'frame' of the sequence.
How do I adjust the model to incorporate this?
To resolve the error, remove return_sequences=True from the LSTM layer arguments (since with the architecture you have defined you only need the output of the last layer), and use train[:, -1] and test[:, -1] (instead of train[:, -1:] and test[:, -1:]) to extract the labels (removing the : drops the second axis and therefore makes the labels' shape consistent with the output shape of the model).
As a side note, wrapping a Dense layer inside a TimeDistributed layer is redundant, since the Dense layer is applied on the last axis.
Update: As for the new question, either pad the input sequence which has only one timestep to make it have a shape of (5,4), or alternatively set the input shape of the first layer (i.e. Masking) to input_shape=(None, train.shape[2]) so the model can work with inputs of varying length.
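Putting those points together, a minimal sketch of the adjusted network (variable names follow the question; the None input length is the update's suggestion for handling varying sequence lengths):
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense

train_X, train_y = train[:, :-1], train[:, -1]   # labels = last timestep, shape (2, 4)
test_X, test_y = test[:, :-1], test[:, -1]       # shape (1, 4)

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(None, train.shape[2])))  # None allows varying lengths
model.add(LSTM(32))                              # no return_sequences: only the last output is needed
model.add(Dense(4))
model.compile(loss='mae', optimizer='adam')
history = model.fit(train_X, train_y, epochs=300, validation_data=(test_X, test_y), verbose=2, shuffle=False)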
