expand a tensor in Keras - python

My hare-brained idea is to create a custom layer that allows me to programmatically add features to a model's output.
EDIT - My "O" output values (see image below) are ASCII values. I want the "F"eature nodes to be 1 if the corresponding "O" nodes are alphabetic and 0 otherwise. In a previous experiment, the additional information made training much much better.
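For concreteness, here is the kind of per-node test I have in mind for the "F" nodes, as a rough standalone sketch (the thresholds are just the ASCII ranges for A-Z and a-z):

import tensorflow as tf

# Rough sketch: o holds (rounded) ASCII output values, e.g. "A", "!", "9".
o = tf.constant([[65.0, 33.0, 57.0]])
codes = tf.round(o)
is_upper = tf.logical_and(codes >= 65.0, codes <= 90.0)     # A-Z
is_lower = tf.logical_and(codes >= 97.0, codes <= 122.0)    # a-z
f = tf.cast(tf.logical_or(is_upper, is_lower), tf.float32)  # [[1., 0., 0.]]

My attempt at the layer so far: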
class Unpack_and_Categorize(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(Unpack_and_Categorize, self).__init__(units, **kwargs)
        self.units = units
        self.trainable = False

    def build(self, input_shape):
        self.weight = self.add_weight(
            shape=(input_shape[-1], self.units),
            trainable=True,
        )
        self.bias = self.add_weight(
            shape=(self.units,), trainable=True, dtype="float32"
        )

    def call(self, inputs):
        batch_size = inputs.shape[0]
        one_hot_size = self.units
        c = tf.constant([0] * (one_hot_size * batch_size), shape=(batch_size, one_hot_size))
        base_out = tf.tensordot(inputs, self.weight, axes=1) + self.bias
        return tf.concat(base_out, c, shape=(batch_size, 2 * one_hot_size))
This image shows what I am trying to accomplish. My custom layer (right side) has 3 values that are densely connected to the previous layer. But now I want to add 3 more output values that are totally derived from O1..3. For example, I might set Fx to 1 if Ox was an even number. This would be done in the call method.
So the challenge is that I don't want to hardcode the number of outputs. That is, if the input layer has 10 inputs, then the custom layer will have 20 values. (The challenge that follows is 'will it back-prop, or simply explode...')
Here is an example where we see the "A" is categorized with a 1, while the punctuation and numeric are categorized with a 0.

It's a bit difficult to say exactly what you want to do without seeing all your code, but you need to make sure that all dimensions except the one you want to concatenate along are the same:
import tensorflow as tf

def call(inputs):
    w_init = tf.random_normal_initializer()
    w = tf.Variable(
        initial_value=w_init(shape=(10, 10), dtype="float32"),
        trainable=True,
    )
    # Build the constant block from the dynamic batch size so its first
    # dimension always matches the inputs.
    batch_size = tf.shape(inputs)[0]
    c = tf.ones((batch_size, 15))
    print('Inputs: ', inputs.shape)
    print('C: ', c.shape)
    return tf.concat((tf.tensordot(inputs, w, axes=1), c), 1)

batch_size = 5
print('Result: ', call(tf.random.normal((batch_size, 10))).shape)
Inputs: (5, 10)
C: (5, 15)
Result: (5, 25)
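Combining this with the alphabetic-feature goal from the question, a minimal end-to-end layer could look something like the sketch below (the rounding-based ASCII test and the initializers are my assumptions, untested):

import tensorflow as tf
from tensorflow import keras

class UnpackAndCategorize(keras.layers.Layer):
    # Dense projection followed by derived 0/1 feature columns (a sketch).
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        base_out = tf.tensordot(inputs, self.w, axes=1) + self.b
        codes = tf.round(base_out)
        # 1.0 where the rounded output is an ASCII letter, else 0.0.
        alpha = tf.logical_or(
            tf.logical_and(codes >= 65.0, codes <= 90.0),
            tf.logical_and(codes >= 97.0, codes <= 122.0))
        features = tf.cast(alpha, base_out.dtype)
        return tf.concat([base_out, features], axis=-1)  # (batch, 2 * units)

Note that the derived features go through tf.round, which has no useful gradient, so the training signal flows only through the dense half; the feature half is effectively a fixed function of the outputs rather than something that back-propagates.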

Related

Concatenate along last dimension with custom layer with Tensorflow

I'm trying to concatenate a number to the last dimension of a (None, 10, 3) tensor to make it a (None, 10, 4) tensor using a custom layer. It seems impossible, because to concatenate, all the dimensions except for the one being merged on must be equal and we can't initialize a tensor with 'None' as the first dimension.
For example, the code below gives me this error:
ValueError: Shape must be rank 3 but is rank 2 for '{{node position_embedding_concat_37/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](Placeholder, position_embedding_concat_37/concat/values_1, position_embedding_concat_37/concat/axis)' with input shapes: [?,10,3], [10,1], []
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Flatten

class PositionEmbeddingConcat(tf.keras.layers.Layer):
    def __init__(self, sequence_length, **kwargs):
        super(PositionEmbeddingConcat, self).__init__(**kwargs)
        self.positional_embeddings_array = np.arange(sequence_length).reshape(sequence_length, 1)

    def call(self, inputs):
        outp = tf.concat([inputs, self.positional_embeddings_array], axis=2)
        return outp

seq_len = 10
input_layer = Input(shape=(seq_len, 3))
embedding_layer = PositionEmbeddingConcat(sequence_length=seq_len)
embeddings = embedding_layer(input_layer)
dense_layer = Dense(units=1)
output = dense_layer(Flatten()(embeddings))
modelT = tf.keras.Model(input_layer, output)
Is there another way to do this?
You will have to make sure you respect the batch dimension. Maybe something like this:
outp = tf.concat([inputs, tf.cast(tf.repeat(self.positional_embeddings_array[None, ...], repeats=tf.shape(inputs)[0], axis=0), dtype=tf.float32)], axis = 2)
Also, tf.shape gives you the dynamic shape of a tensor.
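For completeness, a sketch of the layer with that fix applied (it matches the shapes in the question, but is otherwise untested):

import numpy as np
import tensorflow as tf

class PositionEmbeddingConcat(tf.keras.layers.Layer):
    def __init__(self, sequence_length, **kwargs):
        super(PositionEmbeddingConcat, self).__init__(**kwargs)
        self.positional_embeddings_array = np.arange(sequence_length).reshape(sequence_length, 1)

    def call(self, inputs):
        # Tile the (seq_len, 1) position column across the dynamic batch
        # dimension, then concatenate on the feature axis:
        # (None, 10, 3) -> (None, 10, 4).
        positions = tf.repeat(self.positional_embeddings_array[None, ...],
                              repeats=tf.shape(inputs)[0], axis=0)
        return tf.concat([inputs, tf.cast(positions, inputs.dtype)], axis=2)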
Your question concatenates tensors of shapes [10, 3] and [10, 1], but you also want to apply a Dense transformation with a specific number of units. You can either drop the matrix multiplication and use tf.concat() only, or keep it and set the Dense transformation to the desired number of units.
Sample: the position embedding is not done by the concatenate function alone; the tails are concatenated onto the existing dimension and the combined result is then propagated through the kernel.
import tensorflow as tf

class MyPositionEmbeddedLayer(tf.keras.layers.Concatenate):
    def __init__(self, units):
        super(MyPositionEmbeddedLayer, self).__init__(units)
        self.num_units = units

    def build(self, input_shape):
        self.kernel = self.add_weight("kernel",
                                      shape=[int(input_shape[-1]), self.num_units])

    def call(self, inputs, tails):
        ### area to perform pre-calculation or custom algorithms ###
        #                                                          #
        #                                                          #
        ############################################################
        temp = tf.keras.layers.Concatenate(axis=2)([inputs, tails])
        temp = tf.matmul(temp, self.kernel)
        temp = tf.squeeze(temp)
        return temp

#####################################################
start = 3
limit = 93
delta = 3
sample = tf.range(start, limit, delta)
sample = tf.cast(sample, dtype=tf.float32)
sample = tf.constant(sample, shape=(10, 1, 3, 1))

start = 3
limit = 33
delta = 3
tails = tf.range(start, limit, delta)
tails = tf.cast(tails, dtype=tf.float32)
tails = tf.constant(tails, shape=(10, 1, 1, 1))

layer = MyPositionEmbeddedLayer(10)
print(layer(sample, tails))
Output: you can see the concatenated values after multiplication by the Dense kernel (truncated):
...
[[-26.67632 35.44779 23.239683 20.374893 -12.882696
54.963055 -18.531412 -4.589509 -21.722694 -43.44675 ]
[-27.629044 36.713783 24.069672 21.102568 -13.3427925
56.92602 -19.193249 -4.7534204 -22.498507 -44.99842 ]
[-28.58177 37.979774 24.89966 21.830242 -13.802889
58.88899 -19.855083 -4.917331 -23.274317 -46.55009 ]
[ -9.527256 12.6599245 8.299887 7.276747 -4.600963
19.629663 -6.6183615 -1.6391104 -7.7581053 -15.516697 ]]], shape=(10, 4, 10), dtype=float32)

PyTorch: Sizes of tensors must match on 2 input neural network

I am attempting to recreate a two-input neural network from this article: https://towardsdatascience.com/moving-from-keras-to-pytorch-f0d4fff4ce79
I have copied the network described in the post and adjusted it so that it fits my data. The first input is GloVe word embeddings, while the other is numerical features about the text data.
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, hidden_size, lin_size, embedding_matrix=embedding_weights):
        super(Net, self).__init__()

        # Initialize some parameters for your model
        self.hidden_size = hidden_size
        drp = 0.1

        # Layer 1: Embeddings.
        self.embedding = nn.Embedding(size_of_vocabulary, pretrained_embedding_dim)
        self.embedding.weight = nn.Parameter(torch.tensor(embedding_matrix, dtype=torch.float32))
        self.embedding.weight.requires_grad = False

        # Layer 2: Dropout1D(0.1)
        self.embedding_dropout = nn.Dropout2d(0.1)

        # Layer 3: Bidirectional CuDNNLSTM
        self.lstm = nn.LSTM(pretrained_embedding_dim, hidden_size, bidirectional=True, batch_first=True)

        # Layer 4: Bidirectional CuDNNGRU
        self.gru = nn.GRU(hidden_size * 2, hidden_size, bidirectional=True, batch_first=True)

        # Layer 7: A dense layer
        self.linear = nn.Linear(hidden_size * 6 + X2_train.shape[1], lin_size)
        self.relu = nn.ReLU()

        # Layer 8: A dropout layer
        self.dropout = nn.Dropout(drp)

        # Layer 9: Output dense layer with one output for our Binary Classification problem.
        self.out = nn.Linear(lin_size, 1)

    def forward(self, x):
        '''
        here x[0] represents the first element of the input that is going to be passed.
        We are going to pass a tuple where the first one contains the sequences (x[0])
        and the second one is an additional feature vector (x[1])
        '''
        h_embedding = self.embedding(x[0].long())
        h_embedding = torch.squeeze(self.embedding_dropout(torch.unsqueeze(h_embedding, 0)))
        #print("emb", h_embedding.size())

        h_lstm, _ = self.lstm(h_embedding)
        # print("lst", h_lstm.size())

        h_gru, hh_gru = self.gru(h_lstm)
        hh_gru = hh_gru.view(-1, 2 * self.hidden_size)
        print("gru", h_gru.size())
        print("h_gru", hh_gru.size())

        # Layer 5: is defined dynamically as an operation on tensors.
        avg_pool = torch.mean(h_gru, 1)
        max_pool, _ = torch.max(h_gru, 1)
        print("avg_pool", avg_pool.size())
        print("max_pool", max_pool.size())

        # the extra features you want to give to the model
        f = torch.tensor(x[1], dtype=torch.float).cuda()
        print("f", f.size())

        # Layer 6: A concatenation of the last state, maximum pool, average pool and
        # additional features
        conc = torch.cat((hh_gru, avg_pool, max_pool, f), 1)
        #print("conc", conc.size())

        # passing conc through linear and relu ops
        conc = self.relu(self.linear(conc))
        conc = self.dropout(conc)
        out = self.out(conc)

        # return the final output
        return out
And during runtime I get an error on the concatenation line:
RuntimeError: Sizes of tensors must match except in dimension 0. Got 33164 and 20 (The offending index is 0)
From the dimensions of the outputs, I can see where the problem lies, but I am not sure how I can fix it.
The data inputs to the network is:
torch.Size([20, 150])
torch.Size([33164, 40])
The sizes of each layer output is:
gru torch.Size([20, 150, 80])
h_gru torch.Size([20, 80])
avg_pool torch.Size([20, 80])
max_pool torch.Size([20, 80])
f torch.Size([33164, 40])
For the example above the batch size is 20, hidden_size is 40, the number of rows in numerical data features is 33164 and its feature size is 40.
Thanks for any help in advance
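From the sizes printed above, the concatenation mixes per-batch tensors (first dimension 20) with the entire numerical feature matrix (first dimension 33164). A sketch of one way to line them up, assuming each sequence has one matching row of numerical features; X2_train is from the question, while X1_train, y_train, and the DataLoader wiring are hypothetical names of my own:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Pair each padded sequence with its own feature row, so a batch yields
# x[0] of shape (20, 150) and x[1] of shape (20, 40).
dataset = TensorDataset(
    torch.tensor(X1_train, dtype=torch.long),   # sequences, (N, 150)
    torch.tensor(X2_train, dtype=torch.float),  # numerical features, (N, 40)
    torch.tensor(y_train, dtype=torch.float),   # labels, (N,)
)
loader = DataLoader(dataset, batch_size=20, shuffle=True)

for seqs, feats, labels in loader:
    out = model((seqs, feats))  # torch.cat now sees matching batch dimensions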

Retrieving attention weights for sentences? Most attentive sentences are zero vectors

I have a document classification task that classifies documents as good (1) or bad (0), and I use some sentence embeddings for each document to classify the documents accordingly.
What I would like to do is retrieve the attention scores for each document, to obtain the most "relevant" sentences (i.e., those with high attention scores).
I padded each document to the same length (i.e., 1000 sentences per document). So my tensor for 5000 documents looks like this X = np.ones(shape=(5000, 1000, 200)) (5000 documents with each having a 1000 sequence of sentence vectors and each sentence vector consisting of 200 features).
My network looks like this:
import numpy as np
from tensorflow.keras.layers import Input, Dense, GRU, Bidirectional
from tensorflow.keras.models import Model
from tensorflow.keras.metrics import TruePositives, TrueNegatives, FalsePositives, FalseNegatives

no_sentences_per_doc = 1000
sentence_embedding = 200

sequence_input = Input(shape=(no_sentences_per_doc, sentence_embedding))
gru_layer = Bidirectional(GRU(50, return_sequences=True))(sequence_input)
sent_dense = Dense(100, activation='relu', name='sent_dense')(gru_layer)
sent_att, sent_coeffs = AttentionLayer(100, return_coefficients=True,
                                       name='sent_attention')(sent_dense)
preds = Dense(1, activation='sigmoid', name='output')(sent_att)
model = Model(sequence_input, preds)

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=[TruePositives(name='true_positives'),
                       TrueNegatives(name='true_negatives'),
                       FalseNegatives(name='false_negatives'),
                       FalsePositives(name='false_positives')])

history = model.fit(X, y, validation_data=(x_val, y_val), epochs=10, batch_size=32)
After training I retrieved the attention scores as follows:
sent_att_weights = Model(inputs=sequence_input,outputs=sent_coeffs)
## load a single sample
## from file with 150 sentences (one sentence per line)
## each sentence consisting of 200 features
x_sample = np.load(x_sample)
## and reshape to (1, 1000, 200)
x_sample = x_sample.reshape(1,1000,200)
output_array = sent_att_weights.predict(x_sample)
However, if I show the top 3 attention scores for the sentences, I also obtain sentence indices that are, for example, [432, 434, 999] for a document that has only 150 sentences (the rest is padded, i.e., just zeros).
Does that make sense, or am I doing something wrong here? (Is there a mistake in my attention layer? Or is it due to a low F-score?)
The attention layer I use is the following:
from tensorflow.keras.layers import Layer
from tensorflow.keras import initializers
import tensorflow.keras.backend as K

class AttentionLayer(Layer):
    """
    https://humboldt-wi.github.io/blog/research/information_systems_1819/group5_han/
    """
    def __init__(self, attention_dim=100, return_coefficients=False, **kwargs):
        # Initializer
        self.supports_masking = True
        self.return_coefficients = return_coefficients
        self.init = initializers.get('glorot_uniform')  # initializes values with uniform distribution
        self.attention_dim = attention_dim
        super(AttentionLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Builds all weights
        # W = Weight matrix, b = bias vector, u = context vector
        assert len(input_shape) == 3
        self.W = K.variable(self.init((input_shape[-1], self.attention_dim)), name='W')
        self.b = K.variable(self.init((self.attention_dim,)), name='b')
        self.u = K.variable(self.init((self.attention_dim, 1)), name='u')
        self.trainable_weights = [self.W, self.b, self.u]
        super(AttentionLayer, self).build(input_shape)

    def compute_mask(self, input, input_mask=None):
        return None

    def call(self, hit, mask=None):
        # Here, the actual calculation is done
        uit = K.bias_add(K.dot(hit, self.W), self.b)
        uit = K.tanh(uit)

        ait = K.dot(uit, self.u)
        ait = K.squeeze(ait, -1)
        ait = K.exp(ait)

        if mask is not None:
            ait *= K.cast(mask, K.floatx())

        ait /= K.cast(K.sum(ait, axis=1, keepdims=True) + K.epsilon(), K.floatx())
        ait = K.expand_dims(ait)
        weighted_input = hit * ait

        if self.return_coefficients:
            return [K.sum(weighted_input, axis=1), ait]
        else:
            return K.sum(weighted_input, axis=1)

    def compute_output_shape(self, input_shape):
        if self.return_coefficients:
            return [(input_shape[0], input_shape[-1]), (input_shape[0], input_shape[-1], 1)]
        else:
            return input_shape[0], input_shape[-1]
Note that I use Keras with TensorFlow backend, version 2.1; the attention layer was originally written for Theano, but I use import tensorflow.keras.backend as K.
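One possible explanation (my reading, not verified): compute_mask returns None and no upstream layer emits a mask, so mask is None inside call, the "if mask is not None" branch never runs, and padded positions compete in the softmax; that would explain high scores at padded indices. Since the layer already knows how to apply a mask, a hedged fix is to generate one at the input with a Masking layer and let the intermediate layers pass it through (a sketch, assuming padded sentence vectors are exactly zero and that your TF version lets GRU and Dense propagate masks):

from tensorflow.keras.layers import Masking

sequence_input = Input(shape=(no_sentences_per_doc, sentence_embedding))
# Mark all-zero sentence vectors as padding; downstream layers propagate the mask.
masked_input = Masking(mask_value=0.0)(sequence_input)
gru_layer = Bidirectional(GRU(50, return_sequences=True))(masked_input)
sent_dense = Dense(100, activation='relu', name='sent_dense')(gru_layer)
# AttentionLayer sets supports_masking = True, so mask should now be non-None
# in its call() and padded positions get zero weight before normalization.
sent_att, sent_coeffs = AttentionLayer(100, return_coefficients=True,
                                       name='sent_attention')(sent_dense)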

Hierarchical transformer for document classification: model implementation error, extracting attention weights

I am trying to implement a hierarchical transformer for document classification in Keras/tensorflow, in which:
(1) a word-level transformer produces a representation of each sentence, and attention weights for each word, and,
(2) a sentence-level transformer uses the outputs from (1) to produce a representation of each document, and attention weights for each sentence, and finally,
(3) the document representations produced by (2) are used to classify documents (in the following example, as belonging or not belonging to a given class).
I am attempting to model the classifier on Yang et al.'s approach here (https://www.cs.cmu.edu/~./hovy/papers/16HLT-hierarchical-attention-networks.pdf), but replacing the GRU and attention layers with transformers.
I am using Apoorv Nandan's transformer implementation from https://keras.io/examples/nlp/text_classification_with_transformer/.
I have two issues for which I would be grateful for the community's help:
(1) I get an error in the upper (sentence) level model that I can't resolve (details and code below)
(2) I don't know how to extract the word- and sentence-level attention weights, and would value advice on how best to do this.
I am new to both Keras and this forum, so apologies for obvious mistakes and thank you in advance for any help.
Here is a reproducible example, indicating where I encounter errors:
First, establish the multi-head attention, transformer, and token/position embedding layers, after Nandan.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import pandas as pd
import numpy as np

class MultiHeadSelfAttention(layers.Layer):
    def __init__(self, embed_dim, num_heads=8):
        super(MultiHeadSelfAttention, self).__init__()
        self.embed_dim = embed_dim
        self.num_heads = num_heads
        if embed_dim % num_heads != 0:
            raise ValueError(
                f"embedding dimension = {embed_dim} should be divisible by number of heads = {num_heads}"
            )
        self.projection_dim = embed_dim // num_heads
        self.query_dense = layers.Dense(embed_dim)
        self.key_dense = layers.Dense(embed_dim)
        self.value_dense = layers.Dense(embed_dim)
        self.combine_heads = layers.Dense(embed_dim)

    def attention(self, query, key, value):
        score = tf.matmul(query, key, transpose_b=True)
        dim_key = tf.cast(tf.shape(key)[-1], tf.float32)
        scaled_score = score / tf.math.sqrt(dim_key)
        weights = tf.nn.softmax(scaled_score, axis=-1)
        output = tf.matmul(weights, value)
        return output, weights

    def separate_heads(self, x, batch_size):
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.projection_dim))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, inputs):
        # x.shape = [batch_size, seq_len, embedding_dim]
        batch_size = tf.shape(inputs)[0]
        query = self.query_dense(inputs)  # (batch_size, seq_len, embed_dim)
        key = self.key_dense(inputs)      # (batch_size, seq_len, embed_dim)
        value = self.value_dense(inputs)  # (batch_size, seq_len, embed_dim)
        query = self.separate_heads(query, batch_size)  # (batch_size, num_heads, seq_len, projection_dim)
        key = self.separate_heads(key, batch_size)      # (batch_size, num_heads, seq_len, projection_dim)
        value = self.separate_heads(value, batch_size)  # (batch_size, num_heads, seq_len, projection_dim)
        attention, weights = self.attention(query, key, value)
        attention = tf.transpose(attention, perm=[0, 2, 1, 3])  # (batch_size, seq_len, num_heads, projection_dim)
        concat_attention = tf.reshape(attention, (batch_size, -1, self.embed_dim))  # (batch_size, seq_len, embed_dim)
        output = self.combine_heads(concat_attention)  # (batch_size, seq_len, embed_dim)
        return output

class TransformerBlock(layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, dropout_rate, name=None):
        super(TransformerBlock, self).__init__(name=name)
        self.att = MultiHeadSelfAttention(embed_dim, num_heads)
        self.ffn = keras.Sequential(
            [layers.Dense(ff_dim, activation="relu"), layers.Dense(embed_dim)]
        )
        self.layernorm1 = layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = layers.Dropout(dropout_rate)
        self.dropout2 = layers.Dropout(dropout_rate)

    def call(self, inputs, training):
        attn_output = self.att(inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)

class TokenAndPositionEmbedding(layers.Layer):
    def __init__(self, maxlen, vocab_size, embed_dim, name=None):
        super(TokenAndPositionEmbedding, self).__init__(name=name)
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

    def call(self, x):
        maxlen = tf.shape(x)[-1]
        positions = tf.range(start=0, limit=maxlen, delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions
For the purpose of this example, the data are 10,000 documents, each truncated to 15 sentences, each sentence with a maximum of 60 words, which are already converted to integer tokens 1-1000.
X is a 3-D tensor (10000, 15, 60) containing these tokens. y is a 1-D tensor containing the classes of the documents (1 or 0). For the purpose of this example there is no relation between X and y.
The following produces the example data:
max_docs = 10000
max_sentences = 15
max_words = 60
X = tf.random.uniform(shape=(max_docs, max_sentences, max_words), minval=1, maxval=1000, dtype=tf.dtypes.int32, seed=1)
y = tf.random.uniform(shape=(max_docs,), minval=0, maxval=2, dtype=tf.dtypes.int32, seed=1)
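A quick shape check (these match the description above):

print(X.shape)  # (10000, 15, 60)
print(y.shape)  # (10000,)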
Here I attempt to construct the word level encoder, after https://keras.io/examples/nlp/text_classification_with_transformer/:
# Lower level (produce a representation of each sentence):
embed_dim = 100       # Embedding size for each token
num_heads = 2         # Number of attention heads
ff_dim = 64           # Hidden layer size in feed forward network inside transformer
L1_dense_units = 100  # Size of the sentence-level representations output by the word-level model
dropout_rate = 0.1
vocab_size = 1000

word_input = layers.Input(shape=(max_words,), name='word_input')
word_embedding = TokenAndPositionEmbedding(maxlen=max_words, vocab_size=vocab_size,
                                           embed_dim=embed_dim, name='word_embedding')(word_input)
word_transformer = TransformerBlock(embed_dim=embed_dim, num_heads=num_heads, ff_dim=ff_dim,
                                    dropout_rate=dropout_rate, name='word_transformer')(word_embedding)
word_pool = layers.GlobalAveragePooling1D(name='word_pooling')(word_transformer)
word_drop = layers.Dropout(dropout_rate, name='word_drop')(word_pool)
word_dense = layers.Dense(L1_dense_units, activation="relu", name='word_dense')(word_drop)
word_encoder = keras.Model(word_input, word_dense)

word_encoder.summary()
It looks as though this word encoder works as intended to produce a representation of each sentence. Here, run on the 1st document, it produces a tensor of shape (15, 100), containing the vectors representing each of 15 sentences:
word_encoder(X[0]).shape
My problem is in connecting this to the higher (sentence) level model, to produce document representations.
I get error "NotImplementedError" when trying to apply the word encoder to each sentence in a document. I would be grateful for any help in fixing this issue, since the error message is not informative as to the specific problem.
After applying the word encoder to each sentence, the goal is to apply another transformer to produce attention weights for each sentence, and a document-level representation with which to perform classification. I can't determine whether this part of the model will work because of the error above.
Finally, I would like to extract word- and sentence-level attention weights for each document, and would be grateful for advice on how to do so.
Thank you in advance for any insight.
# Upper level (produce a representation of each document):
L2_dense_units = 100

sentence_input = layers.Input(shape=(max_sentences, max_words), name='sentence_input')

# This is the line producing "NotImplementedError":
sentence_encoder = tf.keras.layers.TimeDistributed(word_encoder, name='sentence_encoder')(sentence_input)

sentence_transformer = TransformerBlock(embed_dim=L1_dense_units, num_heads=num_heads, ff_dim=ff_dim,
                                        dropout_rate=dropout_rate, name='sentence_transformer')(sentence_encoder)
sentence_dense = layers.TimeDistributed(layers.Dense(int(L2_dense_units)), name='sentence_dense')(sentence_transformer)
sentence_out = layers.Dropout(dropout_rate)(sentence_dense)
preds = layers.Dense(1, activation='sigmoid', name='sentence_output')(sentence_out)

model = keras.Model(sentence_input, preds)
model.summary()
I got NotImplementedError as well while trying to do the same thing as you. The thing is, Keras's TimeDistributed layer needs to know its inner custom layers' output shapes, so you should add a compute_output_shape method to your custom layers.
In your case MultiHeadSelfAttention, TransformerBlock and TokenAndPositionEmbedding layers should include:
class MultiHeadSelfAttention(layers.Layer):
    ...
    def compute_output_shape(self, input_shape):
        # it does not change the shape of its input
        return input_shape

class TransformerBlock(layers.Layer):
    ...
    def compute_output_shape(self, input_shape):
        # it does not change the shape of its input
        return input_shape

class TokenAndPositionEmbedding(layers.Layer):
    ...
    def compute_output_shape(self, input_shape):
        # it changes the shape from (batch_size, maxlen) to (batch_size, maxlen, embed_dim)
        return input_shape + (self.pos_emb.output_dim,)
After you add these methods you should be able to run your code.
As for your second question, I am not sure, but maybe you can return the "weights" variable that is returned from MultiHeadSelfAttention's attention method in the call methods of both MultiHeadSelfAttention and TransformerBlock, so that you can access it where you build your model.
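A sketch of what that change could look like (untested; the ... elisions stand for the unchanged code above, and note that returning a tuple changes the blocks' calling convention everywhere they are used):

class MultiHeadSelfAttention(layers.Layer):
    ...
    def call(self, inputs):
        ...
        attention, weights = self.attention(query, key, value)
        ...
        # expose the per-head softmax weights alongside the projected output
        return output, weights

class TransformerBlock(layers.Layer):
    ...
    def call(self, inputs, training):
        attn_output, weights = self.att(inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        # pass the attention weights through so a keras.Model can output them
        return self.layernorm2(out1 + ffn_output), weights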

Proper way to create a custom layer/function to rearrange layer values (Keras with Tensorflow)

I need to rearrange a tensor's values and then reshape it in Keras; however, I am struggling with the proper way to rearrange a tensor in Keras with the TensorFlow backend.
This custom layer/function will iterate through the values and then rearrange them via a stride formula.
This doesn't seem to have weights, so I am assuming it is stateless and won't affect back-propagation.
It requires list slicing though:
out_array[b,channel_out, row_out, column_out] = in_array[b,i,j,k]
and this is just one of the components I am struggling with.
Here is the function/layer:
def reorg(tensor, stride):
    batch, channel, height, width = tensor.get_shape()

    out_channel = channel * (stride * stride)
    out_height = height // stride
    out_width = width // stride

    # create new empty tensor
    out_array = K.zeros((batch, out_channel, out_height, out_width))

    for b in range(batch):
        for i in range(channel):
            for j in range(height):
                for k in range(width):
                    channel_out = i + (j % stride) * (channel * stride) + (k % stride) * channel
                    row_out = j // stride
                    column_out = k // stride
                    out_array[b, channel_out, row_out, column_out] = K.slice(tensor, [b, i, j, k], size=(1, 1, 1, 1))

    return out_array.astype("int")
I don't have much experience creating custom functions/layers in Keras, so I'm not quite sure if I am on the right track.
Here is what the code bit is doing depending on the stride (here it's 2):
https://towardsdatascience.com/training-object-detection-yolov2-from-scratch-using-cyclic-learning-rates-b3364f7e4755
When you say re-arrange, do you mean change the order of your axes? There is a function called tf.transpose which you can use inside a custom layer. There is also tf.keras.layers.Permute which can be used without any custom code to re-order a tensor.
If you are asking how you can create a custom layer, there are some methods you'll need to implement. The docs explain it pretty well here: Custom Layers
from tensorflow.keras import layers
import tensorflow as tf

class Linear(layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
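As an aside, if the stride formula in the question is the YOLOv2-style "reorg"/passthrough rearrangement, the loops may not be needed at all: tf.nn.space_to_depth performs a space-to-depth rearrangement natively. A sketch (note its channel ordering may differ from the exact formula in the question, so verify against a small example):

import tensorflow as tf

def reorg_space_to_depth(tensor, stride):
    # The question's tensors are NCHW; tf.nn.space_to_depth defaults to NHWC,
    # so transpose in and out.
    x = tf.transpose(tensor, [0, 2, 3, 1])          # NCHW -> NHWC
    x = tf.nn.space_to_depth(x, block_size=stride)  # (N, H/s, W/s, C*s*s)
    return tf.transpose(x, [0, 3, 1, 2])            # NHWC -> NCHW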
