How can I build an LSTM AutoEncoder with PyTorch?

I have my data as a DataFrame:
dOpen dHigh dLow dClose dVolume day_of_week_0 day_of_week_1 ... month_6 month_7 month_8 month_9 month_10 month_11 month_12
639 -0.002498 -0.000278 -0.005576 -0.002228 -0.002229 0 0 ... 0 0 1 0 0 0 0
640 -0.004174 -0.005275 -0.005607 -0.005583 -0.005584 0 0 ... 0 0 1 0 0 0 0
641 -0.002235 0.003070 0.004511 0.008984 0.008984 1 0 ... 0 0 1 0 0 0 0
642 0.006161 -0.000278 -0.000281 -0.001948 -0.001948 0 1 ... 0 0 1 0 0 0 0
643 -0.002505 0.001113 0.005053 0.002788 0.002788 0 0 ... 0 0 1 0 0 0 0
644 0.004185 0.000556 -0.000559 -0.001668 -0.001668 0 0 ... 0 0 1 0 0 0 0
645 0.002779 0.003056 0.003913 0.001114 0.001114 0 0 ... 0 0 1 0 0 0 0
646 0.000277 0.004155 -0.002227 -0.002782 -0.002782 1 0 ... 0 0 1 0 0 0 0
647 -0.005540 -0.007448 -0.003348 0.001953 0.001953 0 1 ... 0 0 1 0 0 0 0
648 0.001393 -0.000278 0.001960 -0.003619 -0.003619 0 0 ... 0 0 1 0 0 0 0
My input will be 10 rows (already one-hot encoded). I want to create an n-dimensional auto-encoded representation, so as I understand it, my input and output should be the same.
I've seen some examples of how to construct this, but I am still stuck on the first step. Is my training data just a lot of those 10-row samples stacked into a matrix? What then?
I apologize for the general nature of the question. Any questions, just ask and I will clarify in the comments.
Thank you.

It isn't quite clear from the question what you are trying to achieve. Based on what you wrote, you want to create an autoencoder with the same input and output, and that doesn't quite make sense to me when I look at your data set. In the common case, the encoder part of the autoencoder creates a model which, given a large set of input features, produces a small output vector, and the decoder performs the inverse operation, reconstructing plausible input features from that compressed representation. The result of using an autoencoder is an enhanced (in some sense, e.g. with noise removed) version of the input.
You can find a few examples here, with the 3rd use case providing code for sequence data (learning a random-number-generation model). Here is another example, which looks closer to your application: a sequential model is constructed to encode a large data set with information loss. If that is what you are trying to achieve, you'll find the code there.
If the goal is sequence prediction (like future stock prices), this and that example seem more appropriate, as you likely only want to predict a handful of values in your data sequence (say dHigh and dLow), and you don't need to predict day_of_week_n or month_n (even though that part of the autoencoder model will probably train much more reliably, as the pattern is pretty clear). This approach will allow you to predict a single subsequent output feature value (tomorrow's dHigh and dLow).
If you want to predict a sequence of future outputs, you can use a sequence of outputs, rather than a single one, in your model.
In general, the structure of inputs and outputs is totally up to you.
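If the autoencoder route is what you want, a minimal sketch could look like the following. This assumes your training data is every consecutive 10-row window of the DataFrame stacked into a float tensor of shape (num_windows, 10, num_features); the hidden_dim value is an arbitrary placeholder:
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, num_features, hidden_dim=16):
        super().__init__()
        # Encoder: compress each 10-step window into a single hidden vector
        self.encoder = nn.LSTM(num_features, hidden_dim, batch_first=True)
        # Decoder: expand that vector back into a 10-step sequence
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.output_layer = nn.Linear(hidden_dim, num_features)

    def forward(self, x):
        seq_len = x.size(1)
        _, (h_n, _) = self.encoder(x)                 # h_n: (1, batch, hidden_dim)
        latent = h_n[-1]                              # (batch, hidden_dim)
        repeated = latent.unsqueeze(1).repeat(1, seq_len, 1)
        decoded, _ = self.decoder(repeated)
        return self.output_layer(decoded)             # same shape as x

# Hypothetical usage, with windows of shape (num_windows, 10, num_features):
# model = LSTMAutoencoder(num_features=windows.size(2))
# loss = nn.MSELoss()(model(windows), windows)        # reconstruct the input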

Related

Classification based on categorical data

I have a dataset
Inp1 Inp2 Output
A,B,C AI,UI,JI Animals
L,M,N LI,DO,LI Noun
X,Y AI,UI Extras
For these values, I need to apply an ML algorithm. Which algorithm would be best suited to find relations between these groups and assign an output class to them?
Assuming each cell is a list (as you have multiple strings stored in each) and that you are not looking for a specific encoding, the following should work. It can also be adjusted to suit different encodings.
import pandas as pd

A = [["Inp1", "Inp2", "Inp3", "Output"],
     [["A","B","C"], ["AI","UI","JI"], ["Apple","Bat","Dog"], ["Animals"]],
     [["L","M","N"], ["LI","DO","LI"], ["Lawn","Moon","Noon"], ["Noun"]]]
dataframe = pd.DataFrame(A[1:], columns=A[0])

def my_encoding(row):
    encoded_row = []
    for ls in row:
        encoded_ls = []
        for s in ls:
            sbytes = s.encode('utf-8')
            sint = int.from_bytes(sbytes, 'little')
            encoded_ls.append(sint)
        encoded_row.append(encoded_ls)
    return encoded_row

print(dataframe.apply(my_encoding))
output:
Inp1 ... Output
0 [65, 66, 67] ... [32488788024979009]
1 [76, 77, 78] ... [1853189966]
If my assumptions are incorrect or this is not what you're looking for, let me know.
Since you mention that you are going to apply an ML algorithm (say, classification), I think one-hot encoding (OHE) is what you are looking for.
Requested format:
Inp1 Inp2 Inp3 Output
7,44,87 4,65,2 47,36,20 45
This format can't help you train your model, as it keeps multiple labels in a single cell; you would have to pre-process it again, e.g. with OHE.
Suggesting format:
A B C L M N X Y AI DO JI LI UI Apple Bat Dog Lawn Moon Noon Yemen Zombie
1 1 1 0 0 0 0 0 1 0 1 0 1 1 1 1 0 0 0 0 0
0 0 0 1 1 1 0 0 0 1 0 1 0 0 0 0 1 1 1 0 0
0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 1 1
After that you can label-encode / one-hot encode the output field as your model requires.
Happy learning!
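For reference, a small sketch of producing that kind of multi-hot layout with scikit-learn's MultiLabelBinarizer (the values are taken from the question; merging Inp1 and Inp2 into one bag of tokens per row is an assumption):
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

df = pd.DataFrame({
    "Inp1": [["A","B","C"], ["L","M","N"], ["X","Y"]],
    "Inp2": [["AI","UI","JI"], ["LI","DO","LI"], ["AI","UI"]],
    "Output": ["Animals", "Noun", "Extras"],
})

# Merge the token lists of each row into one bag, then multi-hot encode them
tokens = df["Inp1"] + df["Inp2"]
mlb = MultiLabelBinarizer()
X = pd.DataFrame(mlb.fit_transform(tokens), columns=mlb.classes_)
print(X)

# The output column has a single label per row, so a plain label encoding is enough
y = df["Output"].astype("category").cat.codes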
BCE is for multi-label classification, whereas categorical CE is for multi-class classification where each example belongs to a single class. For your task you need to decide whether a single example ends up in only one class (CE) or may end up in multiple classes (BCE). Probably the second is true, since an animal can be a noun. ;)
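As a quick illustration of the difference (a sketch in PyTorch with made-up logits and targets): CrossEntropyLoss expects one class index per example, while BCEWithLogitsLoss expects a multi-hot target of the same shape as the logits.
import torch
import torch.nn as nn

logits = torch.randn(4, 3)                     # 4 examples, 3 classes

# Multi-class (one label per example): targets are class indices
ce_targets = torch.tensor([0, 2, 1, 2])
ce_loss = nn.CrossEntropyLoss()(logits, ce_targets)

# Multi-label (possibly several labels per example): targets are multi-hot vectors
bce_targets = torch.tensor([[1., 0., 1.],
                            [0., 1., 0.],
                            [1., 1., 0.],
                            [0., 0., 1.]])
bce_loss = nn.BCEWithLogitsLoss()(logits, bce_targets)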

Warning Message in binary classification model Gaussian Naive Bayes?

I am using a multiclass classification-ready dataset with 14 continuous variables and classes from 1 to 10.
This is the data file:
https://drive.google.com/file/d/1nPrE7UYR8fbTxWSuqKPJmJOYG3CGN5y9/view?usp=sharing
My goal is to apply the scikit-learn Gaussian NB model to the data, but in a binary classification task where only class 2 is the positive label and the remainder of the classes are all negatives. For that, I did the following code:
from sklearn.naive_bayes import GaussianNB, CategoricalNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, matthews_corrcoef
import numpy as np
import pandas as pd
dataset = pd.read_csv("PD_21_22_HA1_dataset.txt", index_col=False, sep="\t")
x_d = dataset.values[:, :-1]
y_d = dataset.values[:, -1]
### train_test_split to split the dataframe into train and test sets
## with a partition of 20% for the test https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
X_TRAIN, X_IVS, y_TRAIN, y_IVS = train_test_split(x_d, y_d, test_size=0.20, random_state=23)
yc_TRAIN=np.array([int(i==2) for i in y_TRAIN])
mdl = GaussianNB()
mdl.fit(X_TRAIN, yc_TRAIN)
preds = mdl.predict(X_IVS)
# binarization of "y_true" array
yc_IVS=np.array([int(i==2) for i in y_IVS])
print("The Precision is: %7.4f" % precision_score(yc_IVS, preds))
print("The Matthews correlation coefficient is: %7.4f" % matthews_corrcoef(yc_IVS, preds))
But I get the following warning message when calculating precision:
UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples.
The matthews_corrcoef function also outputs 0 and gives a "RuntimeWarning: invalid value encountered in double_scalars" message.
Furthermore, by inspecting preds, I got that the model predicts only negatives/zeros.
I've tried increasing the 20% test partition as some forums suggested but it didn't do anything.
Is this simply a problem of the model not being able to fit against the data or am I doing something wrong that may be inputting the wrong data format/type into the model?
Edit: yc_TRAIN is the result of turning all cases from class 2 into my true positive cases "1" and the remaining classes into negatives/0, so it's a 1-d array of length 9450 (which matches my total number of prediction cases) with over 8697 0s and 753 1s, so its aspect would be something like this:
[0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
Your code looks fine; this is a classic problem with imbalanced datasets, and it actually means you do not have enough training data to correctly classify the rare positive class.
The only thing you could improve in the given code is to set stratify=y_d in train_test_split, in order to get a stratified training set; decreasing the size of the test set (i.e. leaving more samples for training) may also help:
X_TRAIN, X_IVS, y_TRAIN, y_IVS = train_test_split(x_d, y_d, test_size=0.10, random_state=23, stratify=y_d)
If this does not work, you should start thinking of applying class imbalance techniques (or different models); but this is not a programming question any more but a theory/methodology one, and it should be addressed at the appropriate SE sites and not here (see the intro and NOTE in the machine-learning tag info).
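If you do want to stay with GaussianNB, one simple thing to try (a sketch, not a guaranteed fix) is to give the rare positive class more weight during fitting via sample weights, which GaussianNB.fit supports:
from sklearn.utils.class_weight import compute_sample_weight

# Weight each training sample inversely to its class frequency
weights = compute_sample_weight(class_weight="balanced", y=yc_TRAIN)
mdl = GaussianNB()
mdl.fit(X_TRAIN, yc_TRAIN, sample_weight=weights)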

Predicting a Chess Board Position using Pytorch

I want to predict the current chess board using pytorch/keras. (Let's not worry about the input for now.)
How would I go about that?
A chess board has 8x8 positions (64); on each position there could be a black or white piece (12 options) or no piece at all (1 option). I am planning on using this representation for the chess board (other suggestions are welcome!):
https://en.wikipedia.org/wiki/Board_representation_(computer_chess)#Square_list
For example:
2 3 4 5 6 4 3 2
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
-1 -1 -1 -1 -1 -1 -1 -1
-2 -3 -4 -5 -6 -4 -3 -2
As far as I know, it is not possible to predict something like this directly, because the number of classes my final layer would have to predict is 448 (64x7) and I don't feel like a NN could do that. Additionally, there is the problem that softmax wouldn't work (imo), and the loss function might become a problem as well.
Does someone have an idea of how to do that, or could you point me in the right direction? Multi-class classification isn't really the right term for this task. I was thinking about creating 6 networks that produce a classification for each piece type, i.e. an 8x8 array that looks like this (for rooks):
 1 0 0 0 0 0 0  1
 0 0 0 0 0 0 0  0
 0 0 0 0 0 0 0  0
 0 0 0 0 0 0 0  0
 0 0 0 0 0 0 0  0
-1 0 0 0 0 0 0 -1
But the problem is still quite similar.
I think creating 64 NNs that take care of one position each would simplify the problem a bit. But that would be a pain to train.
Looking forward to hearing your suggestions!
For anyone wondering how to do this, I think I figured it out: you build a softmax over the third dimension of an 8x8x13 array and get an 8x8 matrix with all the chess figures.
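A sketch of that idea in PyTorch (the network producing the logits is left out; treating the 13 channels as 6 white pieces, 6 black pieces and the empty square is an assumption about the encoding):
import torch
import torch.nn as nn

batch = 4
logits = torch.randn(batch, 13, 8, 8)            # raw network output: 13 class scores per square
targets = torch.randint(0, 13, (batch, 8, 8))    # true piece index for each square

# CrossEntropyLoss applies the per-square softmax internally over the class dimension
loss = nn.CrossEntropyLoss()(logits, targets)

# At inference time, pick the most likely piece for every square
board = logits.argmax(dim=1)                     # shape (batch, 8, 8)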
Thanks to #Prune. I will adapt my questions in the future.

Keras predict(..) output interpretation

I currently use a keras model for text classification. Calling the evaluate method, I often get accuracies around 90 percent. However, calling the predict function and printing the output does not seem interpretable to me. I am using binary_crossentropy. I do not know which value will trigger the neurons to be active, or how to see that at all.
I attached some outputs (the binary ones are the actual classes). How does evaluate compute the accuracy?
[0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
[0.02632797 0.02205164 0.00884359 0.00948936 0.21821289 0.02533042
0.07450009 0.01911888 0.22753781 0.00904192 0.0023979 0.03065717
0.0049532 0.09980826 0.0047154 ]
[1 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
[0.17915486 0.1063956 0.05139401 0.01718497 0.06058983 0.11605757
0.11845534 0.03865225 0.6665891 0.01648878 0.02570258 0.14659531
0.01044943 0.04226198 0.02007598]
[1 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
[0.07659172 0.07020403 0.00733146 0.01322867 0.43747708 0.02796873
0.03419256 0.03095324 0.15433209 0.02747604 0.01686232 0.0165229
0.0226498 0.01947697 0.07312528]
Use 'categorical_crossentropy' instead of 'binary_crossentropy'.
Check if you are normalizing the training data (for example X_train/255) and not normalizing the test data.
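A sketch of what that suggestion looks like in code (the model below is a hypothetical stand-in for the original text classifier; the input size of 300 and layer width are arbitrary):
import numpy as np
from tensorflow import keras

num_classes = 15

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(300,)),
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# With a softmax output every prediction row sums to 1, so the predicted
# class is simply the index of the largest probability:
# probs = model.predict(X_test)
# predicted_classes = np.argmax(probs, axis=1)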

Is there any easy way to rotate the values of a matrix/array?

So, let's say I have the following matrix/array -
[0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
It would be fairly trivial to write something that would translate these values up and down. What if I wanted to rotate it by an angle that isn't a multiple of 90 degrees? I know that it is obviously impossible to get exactly the same shape (made of 1s), because of the nature of the grid. The idea that comes to mind is converting each value of 1 to a coordinate vector. Then it would amount to rotating the coordinates (which should be simpler) about a point. One could then write something which would take those rotated coordinates, compare them to the matrix grid, and fill a cell if a point lands in it. I know I'll also have to find a center around which to rotate.
Does this seem like a reasonable way to do this? If anyone has a better idea, I'm all ears. I know that with a small grid like this the shape would probably end up entirely different; however, if I had a large shape represented by 1s in a large grid, the difference between representations would be smaller.
First of all, rotating a shape like that, made only of 1s and 0s, at non-90-degree angles is not really going to look much like the original at all when it's done at such a low "resolution". However, I would recommend looking into rotation matrices. Like you said, you would probably want to treat each value as a coordinate pair and rotate it around the center. It would probably be easier if you made this a two-dimensional array. Good luck!
I think this should work:
from math import sin, cos, atan2, radians
import numpy as np

i0, j0 = 0, 0            # point around which you'll rotate
alpha = radians(3)       # 3 degrees
B = np.zeros(A.shape)    # A is the 0/1 grid from the question
for i, j in np.swapaxes(np.where(A == 1), 0, 1):
    di = i - i0
    dj = j - j0
    dist = (di**2 + dj**2)**0.5
    ang = atan2(dj, di)
    # di corresponds to cos(ang) and dj to sin(ang), since ang = atan2(dj, di)
    pi = round(cos(ang + alpha)*dist) + i0
    pj = round(sin(ang + alpha)*dist) + j0
    B[pi][pj] = 1
But, please, don't forget that the rotated indices can fall outside the array!
The B array should be much bigger than A, and the origin should (optimally) be in the middle of the array.
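For completeness, here is one way to wire that up following the advice above (a sketch: A is the grid from the question, the padding amount and the 30-degree angle are arbitrary choices, and the rotation origin is placed at the shape's centroid):
from math import sin, cos, atan2, radians
import numpy as np

A = np.array([
    [0,0,0,0,0,0,0,0,0,0,0,0],
    [0,0,0,0,1,1,1,0,0,0,0,0],
    [0,0,0,1,1,1,1,0,0,0,0,0],
    [0,0,1,1,1,1,1,1,0,0,0,0],
    [0,0,1,1,1,1,1,1,0,0,0,0],
    [0,0,0,1,1,1,1,0,0,0,0,0],
    [0,0,0,0,1,1,0,0,0,0,0,0],
    [0,0,0,0,0,0,0,0,0,0,0,0],
])

alpha = radians(30)                      # arbitrary rotation angle
pad = 10                                 # extra border so rotated points stay inside B
B = np.zeros((A.shape[0] + 2 * pad, A.shape[1] + 2 * pad))

ones = np.argwhere(A == 1)               # coordinates of every filled cell
i0, j0 = ones.mean(axis=0)               # rotate around the shape's centroid

for i, j in ones:
    di, dj = i - i0, j - j0
    dist = (di**2 + dj**2) ** 0.5
    ang = atan2(dj, di)
    pi = int(round(cos(ang + alpha) * dist + i0)) + pad
    pj = int(round(sin(ang + alpha) * dist + j0)) + pad
    B[pi, pj] = 1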
