I want to predict the current chess board using PyTorch/Keras. (Let's not worry about the input for now.)
How would I go about that?
A chess board has 8x8 positions (64). Each position can hold a black or white piece (12 possibilities) or no piece at all (1). I am planning on using this representation for the chess board (other suggestions are welcome!):
https://en.wikipedia.org/wiki/Board_representation_(computer_chess)#Square_list
For example:
2 3 4 5 6 4 3 2
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
-1 -1 -1 -1 -1 -1 -1 -1
-2 -3 -4 -5 -6 -4 -3 -2
As far as I know, it is not possible to predict something like this, because my final layer would have to predict 448 classes (64x7), and I don't feel like a NN could do that. Additionally, there is the problem that softmax wouldn't work (imo). The loss function might become a problem as well.
Does someone have an idea on how to do that, or could someone point me in the right direction? Multi-class classification isn't really the right term for this task. I was thinking about creating 6 networks that each produce a classification for one piece type, so an 8x8 array that looks like this (for rooks):
1 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
-1 0 0 0 0 0 0 -1
But the problem is still quite similar.
I think creating 64 NNs that take care of one position each would simplify the problem a bit. But that would be a pain to train.
Looking forward to hearing your suggestions!
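For anyone wondering how to do this, I think I figured it out:
You apply a softmax over the third dimension of an 8x8x13 array (one score per square for each of the 13 possible states). Taking the most likely class per square then gives an 8x8 matrix with all the chess pieces.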
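A minimal sketch of that idea, using NumPy rather than PyTorch/Keras so it stays self-contained. The random `logits` stand in for a real network output, and the class numbering (0 = empty, 1..6 = white pieces, 7..12 = black pieces) is an assumption made for illustration, not something fixed by the question. The loss shown is an ordinary categorical cross-entropy applied per square and averaged over the 64 squares.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the raw network output: one score per square and per class.
logits = rng.normal(size=(8, 8, 13))

# Numerically stable softmax over the last (class) axis.
shifted = logits - logits.max(axis=-1, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=-1, keepdims=True)

# Most likely class per square, mapped back to the signed piece codes.
# Assumed mapping: 0 = empty, 1..6 = white pieces, 7..12 = black pieces (-1..-6).
predicted_classes = probs.argmax(axis=-1)
predicted_board = np.where(predicted_classes > 6,
                           6 - predicted_classes,
                           predicted_classes)

# Target board: the starting position from the question, in signed codes.
back_rank = [2, 3, 4, 5, 6, 4, 3, 2]
target_board = np.zeros((8, 8), dtype=int)
target_board[0] = back_rank
target_board[1] = 1
target_board[6] = -1
target_board[7] = [-code for code in back_rank]
target_classes = np.where(target_board < 0, 6 - target_board, target_board)

# Categorical cross-entropy per square, averaged over all 64 squares.
true_class_probs = np.take_along_axis(probs, target_classes[..., None], axis=-1)
loss = -np.log(true_class_probs).mean()
```

In a deep learning framework the same structure would be a final layer with 8x8x13 outputs and a cross-entropy loss over the class axis; the exact layer and loss names depend on the library.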
Thanks to #Prune. I will adapt my questions in the future.
I have a dataset
Inp1   Inp2      Output
A,B,C  AI,UI,JI  Animals
L,M,N  LI,DO,LI  Noun
X,Y    AI,UI     Extras
For these values, I need to apply an ML algorithm. Which algorithm would be best suited to find relations between these groups to assign an output class to them?
Assuming each cell is a list (as you have multiple strings stored in each) and that you are not looking for a specific encoding, the following should work. It can also be adjusted to suit different encodings.
import pandas as pd

A = [["Inp1", "Inp2", "Inp3", "Output"],
     [["A","B","C"], ["AI","UI","JI"], ["Apple","Bat","Dog"], ["Animals"]],
     [["L","M","N"], ["LI","DO","LI"], ["Lawn", "Moon", "Noon"], ["Noun"]]]
dataframe = pd.DataFrame(A[1:], columns=A[0])

def my_encoding(row):
    # DataFrame.apply passes one column at a time by default (axis=0),
    # so "row" holds the list-valued cells of a single column.
    encoded_row = []
    for ls in row:
        encoded_ls = []
        for s in ls:
            # Encode each string as the integer value of its UTF-8 bytes.
            sbytes = s.encode('utf-8')
            sint = int.from_bytes(sbytes, 'little')
            encoded_ls.append(sint)
        encoded_row.append(encoded_ls)
    return encoded_row

print(dataframe.apply(my_encoding))
output:
Inp1 ... Output
0 [65, 66, 67] ... [32488788024979009]
1 [76, 77, 78] ... [1853189966]
If my assumptions are incorrect or this is not what you're looking for, let me know.
As you mentioned, you are going to apply an ML algorithm (say, classification), so I think one-hot encoding is what you are looking for.
Requested format:
Inp1 Inp2 Inp3 Output
7,44,87 4,65,2 47,36,20 45
This format won't help you train your model, because it puts multiple labels in a single cell. You would have to pre-process it again anyway, for example with one-hot encoding (OHE).
Suggested format:
A B C L M N X Y AI DO JI LI UI Apple Bat Dog Lawn Moon Noon Yemen Zombie
1 1 1 0 0 0 0 0 1  0  1  0  1  1     1   1   0    0    0    0     0
0 0 0 1 1 1 0 0 0  1  0  1  0  0     0   0   1    1    1    0     0
0 0 0 0 0 0 1 1 1  0  0  0  1  0     0   0   0    0    0    1     1
After that, you can label encode or one-hot encode the output field, as your model requires.
Happy learning!
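The suggested format can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses the three rows and the two input columns from the question, treats every token as one binary column regardless of which input it came from, and the variable names (`rows`, `vocabulary`, `features`, `labels`) are made up for the example.

```python
# Rows from the question: (Inp1 tokens, Inp2 tokens, output class).
rows = [
    (["A", "B", "C"], ["AI", "UI", "JI"], "Animals"),
    (["L", "M", "N"], ["LI", "DO", "LI"], "Noun"),
    (["X", "Y"], ["AI", "UI"], "Extras"),
]

# One column per distinct token seen anywhere in the inputs.
vocabulary = sorted({token for inp1, inp2, _ in rows for token in inp1 + inp2})

# Multi-hot encoding: 1 if the token occurs in the row, else 0.
features = []
for inp1, inp2, _ in rows:
    present = set(inp1 + inp2)
    features.append([1 if token in present else 0 for token in vocabulary])

# Label-encode the output column (index into the sorted class names).
class_names = sorted({output for _, _, output in rows})
labels = [class_names.index(output) for _, _, output in rows]

for vector, label in zip(features, labels):
    print(vector, class_names[label])
```

If the same token can appear in both input columns with a different meaning, prefixing the token with its column name before building the vocabulary keeps the two apart.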
BCE (binary cross-entropy) is for multi-label classification, whereas categorical CE is for multi-class classification where each example belongs to a single class. For your task, you need to decide whether a single example ends up in exactly one class (CE) or may belong to several classes (BCE). Probably the second is true, since an animal can be a noun. ;)
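To make the difference concrete, here is a small sketch of both losses in plain Python. The probabilities and targets are invented numbers for illustration; in practice they would come from a softmax output (for CE) or from independent sigmoid outputs (for BCE).

```python
import math

def categorical_cross_entropy(class_probs, true_index):
    # Exactly one correct class: only its predicted probability matters.
    return -math.log(class_probs[true_index])

def binary_cross_entropy(label_probs, targets):
    # Every label is an independent yes/no decision; average over labels.
    total = 0.0
    for p, t in zip(label_probs, targets):
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(label_probs)

# Multi-class: probabilities sum to 1, the example belongs to class 0 only.
ce_loss = categorical_cross_entropy([0.7, 0.2, 0.1], true_index=0)

# Multi-label: the example is both "Animals" and "Noun", but not "Extras".
bce_loss = binary_cross_entropy([0.9, 0.8, 0.1], targets=[1, 1, 0])

print(ce_loss, bce_loss)
```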
I have my data as a DataFrame:
dOpen dHigh dLow dClose dVolume day_of_week_0 day_of_week_1 ... month_6 month_7 month_8 month_9 month_10 month_11 month_12
639 -0.002498 -0.000278 -0.005576 -0.002228 -0.002229 0 0 ... 0 0 1 0 0 0 0
640 -0.004174 -0.005275 -0.005607 -0.005583 -0.005584 0 0 ... 0 0 1 0 0 0 0
641 -0.002235 0.003070 0.004511 0.008984 0.008984 1 0 ... 0 0 1 0 0 0 0
642 0.006161 -0.000278 -0.000281 -0.001948 -0.001948 0 1 ... 0 0 1 0 0 0 0
643 -0.002505 0.001113 0.005053 0.002788 0.002788 0 0 ... 0 0 1 0 0 0 0
644 0.004185 0.000556 -0.000559 -0.001668 -0.001668 0 0 ... 0 0 1 0 0 0 0
645 0.002779 0.003056 0.003913 0.001114 0.001114 0 0 ... 0 0 1 0 0 0 0
646 0.000277 0.004155 -0.002227 -0.002782 -0.002782 1 0 ... 0 0 1 0 0 0 0
647 -0.005540 -0.007448 -0.003348 0.001953 0.001953 0 1 ... 0 0 1 0 0 0 0
648 0.001393 -0.000278 0.001960 -0.003619 -0.003619 0 0 ... 0 0 1 0 0 0 0
My input will be 10 rows (already one-hot encoded). I want to create an n-dimensional auto encoded representation. So as I understand it, my input and output should be the same.
I've seen some examples of how to construct this, but I am still stuck on the first step. Is my training data just a lot of those samples stacked to make a matrix? What then?
I apologize for the general nature of the question. Any questions, just ask and I will clarify in the comments.
Thank you.
It isn't quite clear from the question what you are trying to achieve. Based on what you wrote, you want to create an autoencoder with the same input and output, and that doesn't quite make sense to me when I see your data set. In the common case, the encoder part of the autoencoder is a model that takes a large set of input features and produces a small output vector, and the decoder performs the inverse operation: it reconstructs plausible input features from the encoder's output. The result of using an autoencoder is an enhanced version of the input (in some sense, e.g. with noise removed).
You can find a few examples here with the 3rd use case providing code for the sequence data, learning random number generation model. Here is another example, which looks closer to your application. A sequential model is constructed to encode a large data set with information loss. If that is what you are trying to achieve, you'll find the code there.
If the goal is sequence prediction (like future stock prices), this and that example seem to be more appropriate, as you likely only want to predict a handful of values in your data sequence (say dHigh and dLow) and you don't need to predict day_of_week_n or month_n (even though that part of an autoencoder model would probably train much more reliably, as the pattern is pretty clear). This approach will allow you to predict a single subsequent output value per feature (tomorrow's dHigh and dLow).
If you want to predict a sequence of future outputs, you can use a sequence of outputs in your model, rather than a single one.
In general, the structure of inputs and outputs is totally up to you.
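As a rough sketch of what "a lot of those samples as a matrix" could look like, the snippet below slices a table into overlapping 10-row windows with NumPy. The random `table` stands in for the DataFrame values (for a real DataFrame, something like `df.to_numpy()` would supply them), and the column positions 1 and 2 for dHigh and dLow are an assumption based on the column order shown in the question.

```python
import numpy as np

def make_windows(data, window=10):
    # Each sample is `window` consecutive rows flattened into one vector.
    data = np.asarray(data, dtype=float)
    n_samples = data.shape[0] - window + 1
    return np.stack([data[start:start + window].ravel()
                     for start in range(n_samples)])

rng = np.random.default_rng(0)
table = rng.normal(size=(50, 6))   # stand-in: 50 rows, 6 feature columns
window = 10

windows = make_windows(table, window)   # shape (41, 60)

# Autoencoder setup: the target is the input itself.
autoencoder_inputs = windows
autoencoder_targets = windows

# Prediction setup: each window predicts the next row's dHigh and dLow
# (assumed to be columns 1 and 2).
prediction_inputs = windows[:-1]
prediction_targets = table[window:, [1, 2]]

print(windows.shape, prediction_inputs.shape, prediction_targets.shape)
```

Either pair of arrays can then be passed to whatever model is being trained; only the choice of target changes between the two setups.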
I'm trying to match up the elements in 2 different arrays. Array_A is a 3d map of A_Clouds, Array_B is a 3d map of B_Clouds. Each "cloud" is continuous, i.e. any isolated pixels would define a new cloud. The values of the pixels are a single, unique integer for each cloud. Non-cloud values are 0. Here's a 2D example:
[[0 0 0 0 0 0 0 0 0]
[0 0 0 1 1 1 0 0 0]
[0 0 1 1 1 1 1 1 0]
[0 0 0 1 1 1 1 1 0]
[0 0 0 0 0 1 0 0 0]
[0 0 0 0 0 0 0 0 0]]
The output I need is simply the IDs (for both clouds) of each A_Cloud which is overlapping with a B_Cloud, and the number (locations not needed) of pixels which are overlapping between those clouds.
The problem is that these are both very large 3 dimensional arrays (~2000x2000x200, both are the same size). I'm basically doing a bunch of nested for loops, which is of course very slow. Is there a faster way that I could approach this problem? Thanks in advance.
This is what I have right now (simplified to 2d):
import collections

final_matches = []
for Acloud_id in ACloud_list:
    # Collect every pixel location that belongs to this A cloud.
    Acloud_locs = list(set([(i,j) for j, line in enumerate(Array_A) for i,pix in enumerate(line) if pix == Acloud_id]))
    matches = []
    for loc in Acloud_locs:
        Bcloud_pix = Array_B[loc[0]][loc[1]]
        if Bcloud_pix:
            matches.append(Bcloud_pix)
    # Count how many overlapping pixels each B cloud contributes.
    counter = collections.Counter(matches)
    final_matches.append([Acloud_id, counter])
Thanks in advance!
Some considerations here:
for Acloud_id in ACloud_list:
    Acloud_locs = list(set([(i,j) for j, line in enumerate(Array_A) for i,pix in enumerate(line) if pix == Acloud_id]))
If I've read that right, this needs to check every pixel in the array in order to generate the set, and it repeats that for every cloud in A. So if you have 500 clouds, you're checking every pixel 500 times. This is not going to scale well!
It might be more efficient to store the overlap counts in a dict, and just go through the arrays once:
overlaps = dict()
for i in possible_x_coords:  # define these however you like
    for j in possible_y_coords:
        if Array_A[i][j] and Array_B[i][j]:
            # Key is the (A cloud ID, B cloud ID) pair; value is the pixel count.
            key = (Array_A[i][j], Array_B[i][j])
            overlaps[key] = 1 + overlaps.get(key, 0)
(apologies for any errors, I'm on the road and can't test my code)
update: You've clarified that the arrays are about 80% sparse. If that figure was a lot higher, and if you had control over the format of your inputs, I'd suggest looking into sparse array formats - if your input only stores the non-zero values for A, this can save you the trouble of checking for zero values in A. However, for something that's only 80% sparse, I'm not sure how much efficiency this would add.
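The single-pass idea can also be written without Python-level loops. The sketch below is a NumPy variant, not the code from the answer above: it assumes both maps are integer NumPy arrays of identical shape, and the function name `overlap_counts` is made up for the example. Boolean masking flattens the arrays, so the same function works for 2D and 3D inputs.

```python
import numpy as np

def overlap_counts(array_a, array_b):
    array_a = np.asarray(array_a)
    array_b = np.asarray(array_b)
    # Pixels that belong to some cloud in both maps.
    both = (array_a != 0) & (array_b != 0)
    # One (A cloud ID, B cloud ID) pair per overlapping pixel.
    pairs = np.stack([array_a[both], array_b[both]], axis=1)
    # Count identical pairs; each distinct pair is one overlapping cloud pair.
    unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)
    return {(int(a), int(b)): int(n)
            for (a, b), n in zip(unique_pairs, counts)}

small_a = np.array([[1, 1, 0],
                    [2, 2, 0]])
small_b = np.array([[7, 7, 7],
                    [0, 8, 8]])
result = overlap_counts(small_a, small_b)
print(result)
```

For arrays of the size mentioned in the question, the intermediate `pairs` array holds one row per overlapping pixel, so memory use grows with the amount of overlap rather than with the number of clouds.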
I'm trying to write an algorithm to do a triangulation on a 2D sampling of grid points. The idea is similar to Delaunay triangulation but with a few custom rules.
To represent the vertices and their coordinates, the input is a sparse 2D array of 0's and 1's. A given element is 1 if it is a vertex, and 0 if it is not. So basically, if it is 1, that point was sampled and the triangulation needs to include it; if it is 0, it should not be involved in the triangulation.
1) Unlike Delaunay, in my case, all the triangles must be right triangles with horizontal or vertical orientation, e.g.:
0 0 0 0
0 1 0 1
0 1 0 0
has a right triangle that can be formed by connecting the 1's. And it has a vertical/horizontal orientation since the 2 non-hypotenuse edges are horizontal and vertical.
2) No 2 triangles can share a hypotenuse, but it's ok if they share an edge that is not a hypotenuse.
3) No vertex can be the apex of a right triangle and also a non-apex of a different right triangle. In other words,
0 1 0
0 (1) 1
0 1 0
is ok because the central 1, marked with (), is the apex of both right triangles.
But in the following case:
0 0 0 1 1
1 0 0 1 1
it would be ok to do:
0 0 0 X B
A 0 0 A B
meaning 2 triangles (AAX and BBX), but the following would not be allowed:
0 0 0 A B
A 0 0 X B
since now the vertex "X" would be an apex in triangle A, but would be a non-apex in triangle B.
I'm interested in any thoughts / outline for how to develop this algorithm. The matrices are pretty big, but very sparse, so the algorithm doesn't have to be too efficient; any conceptually simple approach should work fine. The output should be a list of lists:
[[(x1a,y1a),(x1b,y1b),(x1c,y1c)], [(x2a,y2a),(x2b,y2b),(x2c,y2c)], ..., [(xNa,yNa),(xNb,yNb),(xNc,yNc)]]
for the coordinates of the 3 vertices of all of the N different triangles.
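As a possible starting point, rule 1 alone can be sketched as an enumeration of candidate triangles: every vertex that has another vertex in its row and another in its column is a potential apex. This is only a partial sketch, not a full triangulation: it does not enforce rules 2 and 3, the function name is made up, and coordinates are written as (row, column), which is an assumption about how (x, y) maps onto the array.

```python
def candidate_right_triangles(grid):
    # All sampled points as (row, column) pairs.
    vertices = [(r, c)
                for r, row in enumerate(grid)
                for c, value in enumerate(row)
                if value == 1]
    triangles = []
    for apex in vertices:
        # Possible ends of the horizontal leg and of the vertical leg.
        same_row = [v for v in vertices if v[0] == apex[0] and v != apex]
        same_col = [v for v in vertices if v[1] == apex[1] and v != apex]
        for horizontal_end in same_row:
            for vertical_end in same_col:
                triangles.append([apex, horizontal_end, vertical_end])
    return triangles

rule_1_example = [[0, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 1, 0, 0]]
print(candidate_right_triangles(rule_1_example))
```

A full solution would then need to select a subset of these candidates that satisfies the hypotenuse and apex rules, for example by filtering or by a search over the candidates.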
So, let's say I have the following matrix/array -
[0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
It would be fairly trivial to write something that would translate these values up and down. What if I wanted to rotate it by an angle that isn't a multiple of 90 degrees? I know that it is obviously impossible to get the exact same shape (made of 1s), because of the nature of the grid. The idea that comes to mind is converting each value of 1 to a coordinate vector. Then it would amount to rotating the coordinates (which should be simpler) about a point. One could then write something which would take the rotated coordinates and compare them to the matrix grid, and if a point falls in a given box, that box will be filled. I know I'll also have to find a center around which to rotate.
Does this seem like a reasonable way to do this? If anyone has a better idea, I'm all ears. I know with a small grid like this, the shape would probably be entirely different, however if I had a large shape represented by 1s, in a large grid, the difference between representations would be smaller.
First of all, rotating a shape like that with only 1's and 0's at non 90 degree angles is not really going to look much like the original at all, when it's done at such a low "resolution". However, I would recommend looking into rotation matrices. Like you said, you would probably want to find each value as a coordinate pair, and rotate it around the center. It would probably be easier if you made this a two-dimensional array. Good luck!
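A minimal sketch of the rotation-matrix approach with NumPy, under a few assumptions: the grid is a 2D array of 0's and 1's, the default center is the geometric middle of the grid, rotated points are snapped to the nearest cell, and points that land outside the grid are simply dropped. The function name `rotate_binary_grid` is made up for the example.

```python
import numpy as np

def rotate_binary_grid(grid, degrees, center=None):
    grid = np.asarray(grid)
    if center is None:
        # Geometric middle of the grid (may fall between cells).
        center = ((grid.shape[0] - 1) / 2.0, (grid.shape[1] - 1) / 2.0)
    center = np.array(center, dtype=float)

    theta = np.radians(degrees)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])

    # Offsets of every filled cell from the center, one (di, dj) per row.
    offsets = np.argwhere(grid == 1) - center
    # Rotate each offset, move back to grid coordinates, snap to cells.
    rotated = np.rint(offsets @ rotation.T + center).astype(int)

    # Keep only the points that still fall inside the grid.
    inside = ((rotated[:, 0] >= 0) & (rotated[:, 0] < grid.shape[0]) &
              (rotated[:, 1] >= 0) & (rotated[:, 1] < grid.shape[1]))
    result = np.zeros_like(grid)
    result[rotated[inside, 0], rotated[inside, 1]] = 1
    return result

shape = np.array([[0, 0, 0, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 1, 0, 0, 0],
                  [0, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0]])
print(rotate_binary_grid(shape, 30))
```

Because several source cells can round to the same target cell, a rotated shape may contain small holes or lose cells; mapping each target cell back to its source position instead is a common way to avoid the holes.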
I think this should work:
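import numpy as np
from math import sin, cos, atan2, radians

i0, j0 = 0, 0  # point around which you'll rotate
alpha = radians(3)  # 3 degrees
B = np.zeros(A.shape)
for i, j in np.argwhere(A == 1):
    di = i - i0
    dj = j - j0
    dist = (di**2 + dj**2)**0.5
    # Angle measured so that di = dist*sin(ang) and dj = dist*cos(ang).
    ang = atan2(di, dj)
    pi = int(round(sin(ang + alpha) * dist)) + i0
    pj = int(round(cos(ang + alpha) * dist)) + j0
    # Skip points that land outside B (negative indices would wrap around).
    if 0 <= pi < B.shape[0] and 0 <= pj < B.shape[1]:
        B[pi][pj] = 1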
But please don't forget about out-of-bounds indices!
The B array should be much bigger than A, and the origin should (optimally) be in the middle of the array.