I'm trying to make an array of one-hot vector of integers into an array of one-hot vector that keras will be able to use to fit my model. Here's the relevant part of the code:
Y_train = np.hstack(np.asarray(dataframe.output_vector)).reshape(len(dataframe),len(output_cols))
dummy_y = np_utils.to_categorical(Y_train)
Below is an image showing what Y_train and dummy_y actually are.
I couldn't find any documentation for to_categorical that could help me.
Thanks in advance.
np_utils.to_categorical is used to convert array of labeled data(from 0 to nb_classes - 1) to one-hot vector.
The official doc with an example.
In [1]: from keras.utils import np_utils # from keras import utils as np_utils
Using Theano backend.
In [2]: np_utils.to_categorical?
Signature: np_utils.to_categorical(y, num_classes=None)
Docstring:
Convert class vector (integers from 0 to nb_classes) to binary class matrix, for use with categorical_crossentropy.
# Arguments
y: class vector to be converted into a matrix
nb_classes: total number of classes
# Returns
A binary matrix representation of the input.
File: /usr/local/lib/python3.5/dist-packages/keras/utils/np_utils.py
Type: function
In [3]: y_train = [1, 0, 3, 4, 5, 0, 2, 1]
In [4]: """ Assuming the labeled dataset has total six classes (0 to 5), y_train is the true label array """
In [5]: np_utils.to_categorical(y_train, num_classes=6)
Out[5]:
array([[ 0., 1., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 1.],
[ 1., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0.]])
from keras.utils.np_utils import to_categorical
UPDATED --- keras.utils.np_utils doesn't work in newer versions; if so use:
from tensorflow.keras.utils import to_categorical
In both cases
to_categorical(0, max_value_of_array)
It assumes the class values were in string and you will be label encoding them, hence starting everytime from 0 to n-classes.
for the same example:- consider an array of {1,2,3,4,2}
The output will be [zero value, one value, two value, three value, four value]
array([[ 0., 1., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 1.],
[ 0., 0., 1., 0., 0.]],
Let's look at another example:-
Again, for an array having 3 classes, Y = {4, 8, 9, 4, 9}
to_categorical(Y) will output
array([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0. ],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 0. ],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1. ],
[0., 0., 0., 0., 1., 0., 0., 0., 0., 0. ],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1. ]]
Related
I am trying to reverse the output of LRP as heatmap back to model's weights. I have decieded to minimize the loss of the between the relevence of the untrained model weights and the desired heatmap relevence score so in theory it should make the weights reach the values that produce the desired heatmap and I am doing this via GraidentTape. I am following this tutorial implementation of simple LRP for a fully connected model for minst dataset here is the model diagram
And this is the model implementation
num_classes = 10
input_layer = Input(shape=(img_width * img_height,))
X = Dense(300, activation='relu', kernel_regularizer='l2')(input_layer)
X = Dense(100, activation='relu',kernel_regularizer='l2')(X)
X = Dense(10, activation='relu',kernel_regularizer='l2')(X)
model = Model(inputs=input_layer, outputs=X)
And this is the LRP function to get revelence where arguments are
W -> weights of the model and in form of list of weights of each layer in order
B -> Biases of the model and in form of list of weights of each layer in order
img -> input image in shape of (1,28*28) where 28 is the height & width of image
pred -> one hot encoded array of which number is the input image
and returns R the revelence of each neuron in each layer
def get_relevance_tf(W,B,img,pred):
L = len(W)
A = [img]+[None]*L
for l in range(L):
A[l+1] = tf.nn.relu(tf.matmul(A[l],W[l])+B[l])
R = [0.0]*L + [A[L]*(pred)]
for l in range(1,L)[::-1]:
w = W[l]
b = B[l]
z = tf.matmul(A[l],w)+b # step 1
s = R[l+1] / z # step 2
c = tf.matmul(s,w,transpose_b=True) # step 3
R[l] = A[l]*c # step 4
w = W[0]
wp = tf.math.maximum(0,w)
wm = tf.math.minimum(0,w)
lb = A[0]*0-1
hb = A[0]*0+1
z = tf.matmul(A[0],w)-tf.matmul(lb,wp)-tf.matmul(hb,wm)+1e-9 # step 1
s = R[1]/z # step 2
c,cp,cm = tf.matmul(s,w,transpose_b=True),tf.matmul(s,wp,transpose_b=True),tf.matmul(s,wm,transpose_b=True) # step 3
R[0] = A[0]*c-lb*cp-hb*cm # step 4
return R
And this is gradient tape function where pred_R is the relevence score of the desired heatmap and model.10.hdf5 is the untrained model
model = tf.keras.models.load_model("model.10.hdf5")
img = tf.convert_to_tensor(X_train[index].reshape(1,784),dtype=tf.float32)
pred = tf.convert_to_tensor(y_train_one_hot[index],dtype=tf.float32)
W = [tf.Variable(i,dtype=tf.float32,trainable=True) for i in model.get_weights()[::2]]
B = [tf.Variable(i,dtype=tf.float32,trainable=True) for i in model.get_weights()[1::2]]
with tf.GradientTape() as tape:
R = get_relevance_tf(W,B,img,pred)
loss = tf.math.reduce_sum(tf.math.abs(R[0]-pred_R[0]))
grads = tape.gradient(loss, [W,B])
print(grads)
This is the output as you can see all the grads are zeros
[[<tf.Tensor: shape=(784, 300), dtype=float32, numpy=
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)>, <tf.Tensor: shape=(300, 100), dtype=float32, numpy=
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)>, <tf.Tensor: shape=(100, 10), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
...
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
dtype=float32)>, <tf.Tensor: shape=(10,), dtype=float32, numpy=array([-0., -0., -0., -0., -0., -0., -0., -0., -0., -0.], dtype=float32)>]]
I can't understand why the gradients are zeros while there is direct correlation between the weights and the relevence
Things I have tried
Used autograd same results
Tried to make it as a custom model where the forward propagation is calculating the relevance score and used optimizer with custom loss function and also same results
I have solved this issue apparently the untrained model's weights were local minima in the loss function thus giving 0 for all weights
Note to future me, start with randomized values
This question already has an answer here:
Finding connected components in a pixel-array
(1 answer)
Closed 6 months ago.
I have the following problem that I wanted to solve using opencv or scikit-image.
Suppose I have a "map" in the following form:
1 is ground 0 is water
map = np.array([
[ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
How many islands in the map? considering 4 neighbors. In this example there is 2
Given an (i,j) position, return the number of ground neighbors.
example: (2,2) -> 4
Solving question no. 1 with Scikit-Image: The measure module will be your friend. Please checkout the documentation for it.
import numpy as np
from skimage import measure
import matplotlib.pyplot as plt
img = np.array([ [ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
imglabeled,island_count = measure.label(img,background=0,return_num=True,connectivity=1)
plt.imshow(imglabeled)
I have a tensor
import torch
torch.zeros((5,10))
>>> tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
How can I replace the values of X random cells in each row with random inputs (torch.rand())?
That is, if X = 2, in each row, 2 random cells should be replaced with torch.rand().
Since I need it to not break backpropagation I found here that replacing the .data attribute of the cells should work.
The only familiar thing to me is using a for loop but it's not efficient for a large tensor
You can try tensor.scatter_().
x = torch.zeros(3,4)
n_replace = 3 # number of cells to be replaced with random number
src = torch.randn(x.size())
index = torch.stack([torch.randperm(x.size()[1]) for _ in range(x.size()[0])])[:,:n_replace]
x.scatter_(1, index, src)
Out[22]:
tensor([[ 0.0000, 0.5769, 0.7432, -0.1776],
[-2.1673, -1.0802, 0.0000, 0.6241],
[-0.6421, 0.1315, 0.0000, -2.7224]])
To avoid repetition,
perm = torch.randperm(tensor.size(0))
idx = perm[:k]
samples = tensor[idx]
So lets say I have a (4,10) array initialized to zeros, and I have an input array in the form [2,7,0,3]. The input array will modify the zeros matrix to look like this:
[[0,0,1,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,1,0,0],
[1,0,0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0,0]]
I know I can do that by looping through the input target and indexing the matrix array with something like matrix[i][target in input target], but I tried to do it without a loop doing something like:
matrix[:, input_target] = 1, but that sets me the entire matrix to all 1.
Apparently the way to do it is:
matrix[range(input_target.shape[0]), input_target], the question is why this works and not using the colon ?
Thanks!
You only wish to update one column for each row. Therefore, with advanced indexing you must explicitly provide those row identifiers:
A = np.zeros((4, 10))
A[np.arange(A.shape[0]), [2, 7, 0, 3]] = 1
Result:
array([[ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]])
Using a colon for the row indexer will tell NumPy to update all rows for the specified columns:
A[:, [2, 7, 0, 3]] = 1
array([[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.]])
In pandas or numpy, I can do the following to get one-hot vectors:
>>> import numpy as np
>>> import pandas as pd
>>> x = [0,2,1,4,3]
>>> pd.get_dummies(x).values
array([[ 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 1.],
[ 0., 0., 0., 1., 0.]])
>>> np.eye(len(set(x)))[x]
array([[ 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 1.],
[ 0., 0., 0., 1., 0.]])
From text, with gensim, I can do:
>>> from gensim.corpora import Dictionary
>>> sent1 = 'this is a foo bar sentence .'.split()
>>> sent2 = 'this is another foo bar sentence .'.split()
>>> texts = [sent1, sent2]
>>> vocab = Dictionary(texts)
>>> [[vocab.token2id[word] for word in sent] for sent in texts]
[[3, 4, 0, 6, 1, 2, 5], [3, 4, 7, 6, 1, 2, 5]]
Then I'll have to do the same pd.get_dummies or np.eyes to get the one-hot vector but I get an error where there's one dimension missing from my one-hot vector I have 8 unique words but the one-hot vector lengths are only 7:
>>> [pd.get_dummies(sent).values for sent in texts_idx]
[array([[ 0., 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 1.],
[ 0., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 1., 0.]]), array([[ 0., 0., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 1.],
[ 0., 0., 0., 0., 0., 1., 0.],
[ 1., 0., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 0., 0.]])]
It seems like it's doing one-hot vector individually as it iterates through each sentence, instead of using the global vocabulary.
Using np.eye, I do get the right vectors:
>>> [np.eye(len(vocab))[sent] for sent in texts_idx]
[array([[ 0., 0., 0., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 1., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 1., 0., 0.]]), array([[ 0., 0., 0., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 1.],
[ 0., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 1., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 1., 0., 0.]])]
Also, currently, I have to do several things from using gensim.corpora.Dictionary to converting the words to their ids then getting the one-hot vector.
Are there other ways to achieve the same one-hot vector from texts?
There are various packages that will do all the steps in a single function such as http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html.
Alternatively, if you have your vocabulary and text indexes for each sentence already, you can create a one-hot encoding by preallocating and using smart indexing. In the following text_idx is a list of integers and vocab is a list relating integers indexes to words.
import numpy as np
vocab_size = len(vocab)
text_length = len(text_idx)
one_hot = np.zeros(([vocab_size, text_length])
one_hot[text_idx, np.arange(text_length)] = 1
to create one_hot_vector, you need to create unique vocabulary from text
vocab=set(vocab)
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(vocab)
one_hot_encoder = OneHotEncoder(sparse=False)
doc = "dog"
index=vocab.index(doc)
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
one_hot_encoder=one_hot_encoder.fit_transform(integer_encoded)[index]
The 7th value is the "."(Dot) in your sentences separated by a " "(space) and split() counts it as a word !!