I am trying to make a Lambda layer use a custom function. I want to replace the last 10 entries along the last axis of the input with a computed value, but Keras tensors are immutable. What is the most efficient way to do this?
My function looks like this:
def custom_function(y_pred):
    temp1 = tf.ones_like(y_pred[...,0,-10:])
    temp2 = tf.ones_like(y_pred[...,1,-10:])
    temp3 = tf.ones_like(y_pred[...,2,-10:])
    a = K.greater_equal(K.sum(y_pred[...,0,-10:]),K.sum(y_pred[...,1,1:11]))
    temp1 = temp1 * (K.sum(y_pred[...,0,-10:]) / 10)
    b = K.greater_equal(K.sum(y_pred[...,1,-10:]),K.sum(y_pred[...,2,1:11]))
    temp2 = temp2 *K.sum(y_pred[...,1,-10:]) / 10
    c = K.greater_equal(K.sum(y_pred[...,2,-10:]),K.sum(y_pred[...,3,1:11]))
    temp3 = temp3 *K.sum(y_pred[...,2,-10:]) / 10
    y_pred[...,0,-10:] = K.switch(a,temp1,y_pred[...,0,-10:])
    y_pred[...,1,-10:] = K.switch(b,temp2,y_pred[...,1,-10:])
    y_pred[...,2,-10:] = K.switch(c,temp3,y_pred[...,2,-10:])
    return y_pred
It's not working right now because I can't assign values to the tensor y_pred. What would be the most efficient way to do this?
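One possible way around the immutability is to build a new tensor from slices instead of assigning into y_pred. The following is only a sketch, assuming TensorFlow 2.x and an input of shape (..., rows, T) with at least 4 rows; the parameters k and n_rows are names introduced here for illustration. It mirrors the sums in the function above (which run over the whole batch) rather than correcting them.

```python
import tensorflow as tf

def custom_function(y_pred, k=10, n_rows=3):
    """Return a NEW tensor; y_pred itself is never modified."""
    new_rows = []
    for r in range(n_rows):
        head = y_pred[..., r, :-k]          # entries that stay as they are
        tail = y_pred[..., r, -k:]          # last k entries, candidates for replacement
        tail_sum = tf.reduce_sum(tail)      # scalar, summed over the batch as in the question
        cond = tail_sum >= tf.reduce_sum(y_pred[..., r + 1, 1:k + 1])
        fill = tf.ones_like(tail) * (tail_sum / k)
        new_tail = tf.where(cond, fill, tail)
        new_rows.append(tf.concat([head, new_tail], axis=-1))
    rebuilt = tf.stack(new_rows, axis=-2)   # shape (..., n_rows, T)
    # Re-attach the rows that were not touched.
    return tf.concat([rebuilt, y_pred[..., n_rows:, :]], axis=-2)

# Toy input (hypothetical values): row 0 is all 2s, the other rows are all 1s.
y = tf.concat([2.0 * tf.ones((2, 1, 12)), tf.ones((2, 3, 12))], axis=1)
out = custom_function(y)
print(out.shape)
```

Slicing, tf.where, and tf.concat are all differentiable, so this form should also be usable where gradients need to flow through the layer.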
I am trying to create my own RNN with PyTorch and am following some simple tutorials on the .backward function. When I run my code, I get "None" as the result for .grad and I cannot figure out why. From this post, it looks like it may be because I am setting up the inputs as tensors, so they are getting detached? If so, I am not sure how to correct this while ensuring they can still be multiplied in the matrices.
import math
import numpy as np
import torch
from collections import deque
#set up the inputs
lists = deque()
for i in range(0, 13, 1):
    lists.append(range(i, i + 4))
x = np.array(lists)
# set up the y vector
y = []
for i in range(len(x)):
    y.append((x[i,3])+1)
#set up the validation input
lists = deque()
for i in range(13, 19, 1):
    lists.append(range(i, i + 4))
x_val = np.array(lists)
#set up the validation y vector
y_val = []
for i in range(len(x_val)):
    y_val.append((x_val[i,3])+1)
#set params
input_dimension = len(x[0])
hidden_dimension = 100
output_dimension = 1
#set up the weighted matrices
np.random.seed(99)
Wxh = np.random.uniform(0, 1, (hidden_dimension, input_dimension)) # weights from input to hidden layer
Whh = np.random.uniform(0, 1, (hidden_dimension, hidden_dimension)) # weights inside cell - recurrant
Why = np.random.uniform(0, 1, (output_dimension, hidden_dimension)) # weights from hidden to output layer
#set up the input tensor
xt = torch.tensor(x[[0]], dtype=torch.float) #do I want to keep a float here? or force an int? think: float - understand why
Wxh_t = torch.tensor(Wxh, requires_grad = True).float()
Whh_t = torch.tensor(Whh, requires_grad = True).float()
Why_t = torch.tensor(Why, requires_grad = True).float()
loss = 0
for i in range(len(x)):
    xt = torch.tensor(x[[i]], dtype=torch.float)
    print(xt)
    current_affine_3 = torch.mm(xt,Wxh_t.T)
    hidden_t = torch.mm(h_prev_t, Whh_t.T)
    ht_t = torch.tanh(current_affine_3 + hidden_t)
    y_hat_t = torch.mm(ht_t, Why_t.T)
    loss += (y[i] - y_hat_t)**2
    print(y[i])
    print(loss)
    h_prev_t = ht_t
loss.backward
print(Wxh_t.grad)
loss.backward returns <bound method Tensor.backward of tensor([[18672.0215]], grad_fn=<AddBackward0>)>
And if I view the weighted matrices, I notice something different than the tutorials. Instead of grad_fn=<AddBackward0> after calculating with a requires_grad = True tensor, I get grad_fn=<MmBackward0>. I assume it's because I am using torch.mm, but I'm not sure if this matters. This is an example of some code I was using for a tutorial:
x = torch.tensor(2., requires_grad = False)
w = torch.tensor(3., requires_grad = True)
b = torch.tensor(1., requires_grad = True)
print("x:", x)
print("w:", w)
print("b:", b)
# define a function of the above defined tensors
y = w * x + b
print("y:", y)
# take the backward() for y
y.backward()
# print the gradients w.r.t. above x, w, and b
print("x.grad:", x.grad)
print("w.grad:", w.grad)
print("b.grad:", b.grad)
Thank you!
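Two things in the RNN snippet look likely to explain the None gradient. First, loss.backward without parentheses only references the method and never runs it. Second, calling .float() after torch.tensor(..., requires_grad=True) on float64 data returns a new, non-leaf tensor, so the gradient accumulates on the hidden float64 original instead of on Wxh_t. The sketch below illustrates both points on a small stand-in matrix; the names W, W_nonleaf, and W_leaf are introduced here for illustration only.

```python
import numpy as np
import torch

np.random.seed(99)
W = np.random.uniform(0, 1, (3, 4))   # float64 NumPy array, like Wxh in the question

# Pattern from the question: the dtype conversion creates a non-leaf tensor.
W_nonleaf = torch.tensor(W, requires_grad=True).float()

# Alternative: set the dtype at creation so the tensor itself is the leaf.
W_leaf = torch.tensor(W, dtype=torch.float32, requires_grad=True)

x = torch.ones(1, 4)
loss = (torch.mm(x, W_leaf.T) ** 2).sum()
loss.backward()                        # note the parentheses

print(W_nonleaf.is_leaf, W_leaf.is_leaf)
print(W_leaf.grad.shape)
```

The difference between grad_fn=<AddBackward0> and grad_fn=<MmBackward0> only reflects which operation produced the tensor last, so it is not a problem in itself.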
I want to write a batched pairwise bivariate Moran's I. The formula can be found here.
If X and Y are both (n,n) then the weight matrix W has dimension (n^2, n^2).
I think I have a vectorization for a dumb toy example with 1 pair as follows. Note that you have to flatten and standardize x,y.
n_elts = 4
x = torch.FloatTensor([0,1,1,0])
y = torch.FloatTensor([0,1,1,0])
w = torch.FloatTensor(np.identity(n_elts))
x = x - torch.mean(x)
y = y - torch.mean(y)
ans = torch.sum(torch.outer(x,y) * w)/(torch.norm(x)**2) * (n_elts/torch.sum(w)) # = 1
I'm having a hard time extending this to the batched pairwise case. That is, x has shape (B,n,n) and y has shape (C,n,n). You can assume they get flattened to (B,n^2) and (C, n^2), respectively. The output should have shape (B,C). Here B is batch size and C is some number that will generally be different than B.
So far all I can figure out is that, again if x is (B,n^2) and y is (C,n^2), I can get a broadcasted outer product as follows:
xt = x[:,None,:,None]
yt = y[None,:,None,:]
outer = xt*yt # has shape (B,C,n^2,n^2)
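One way to avoid materializing the (B,C,n^2,n^2) tensor is to note that summing outer(x,y) * w over both trailing axes equals the bilinear form x @ w @ y.T. The sketch below assumes the inputs are already flattened to (B,n^2) and (C,n^2) and copies the normalization of the single-pair line above, dividing by the squared norm of x; if your definition of the bivariate statistic normalizes by both x and y, that line would need to change. The function name is made up for this example.

```python
import torch

def batched_bivariate_moran(x, y, w):
    """x: (B, m), y: (C, m), w: (m, m) with m = n^2. Returns a (B, C) tensor."""
    x = x - x.mean(dim=1, keepdim=True)          # center each flattened map
    y = y - y.mean(dim=1, keepdim=True)
    num = x @ w @ y.T                            # (B, C): sum_ij x_bi * w_ij * y_cj
    denom = (x ** 2).sum(dim=1, keepdim=True)    # (B, 1): squared norm of each x row
    m = x.shape[1]
    return num / denom * (m / w.sum())

# Reuses the toy values from the single-pair example, plus their complements.
x = torch.tensor([[0., 1., 1., 0.], [1., 0., 0., 1.]])
y = torch.tensor([[0., 1., 1., 0.], [1., 0., 0., 1.], [0., 1., 1., 0.]])
w = torch.eye(4)
result = batched_bivariate_moran(x, y, w)
print(result.shape)
```

A constant row in x gives a zero denominator, so such rows would need separate handling.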
I am trying to filter out all non-gray values within a given tolerance with the following code. It gives the expected results but runs too slowly to use in practice. Is there a way to do the following using numpy operations?
for i in range(height):
    for j in range(width):
        r, g, b = im_arr[i][j]
        r = (r + 150) / 2
        g = (g + 150) / 2
        b = (b + 150) / 2
        mean = (r + g + b) / 3
        diffr = abs(mean - r)
        diffg = abs(mean - g)
        diffb = abs(mean - b)
        maxdev = 2
        if (diffr + diffg + diffb) > maxdev:
            im_arr[i][j][0] = 0
            im_arr[i][j][1] = 0
            im_arr[i][j][2] = 0
Looping in plain Python is slow: one of the advantages of numpy is that traversing arrays is highly optimized. Without commenting on the algorithm, you can get the same results using only numpy, which will be much faster.
Since im_arr is an image, it is very likely that the dtype is np.uint8.
That is only 8 bits, so you have to be careful about overflow. In your code, when you add 150 to a number, the result will be of type np.int64. But if you add 150 to an 8-bit np.ndarray, the result will still be of type np.uint8 and it can overflow.
You can either change the array type (using astype) or add a float, which will automatically promote the array to float:
mod_img = (im_arr + 150.)/2 # the decimal point in "150." is important
signed_dif = mod_img - np.mean(mod_img, axis=2, keepdims=True)
collapsed_dif = np.sum(np.abs(signed_dif), axis=2)
maxdev = 2
im_arr[collapsed_dif > maxdev] = 0
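To make the overflow point concrete, here is a small sketch on a hand-made 1x3 image (the pixel values and the function name filter_gray are chosen for illustration). It shows uint8 arithmetic wrapping around, the float addition promoting the array, and the vectorized filter zeroing only the clearly non-gray pixel.

```python
import numpy as np

# Three pixels: exact gray, almost gray, pure red.
im_arr = np.array([[[200, 200, 200],
                    [100, 101, 100],
                    [255,   0,   0]]], dtype=np.uint8)

wrapped = im_arr + np.uint8(150)   # stays uint8 and wraps modulo 256
promoted = im_arr + 150.0          # promoted to float64, no wraparound
print(wrapped.dtype, promoted.dtype)

def filter_gray(arr, maxdev=2):
    """Return a copy of arr with pixels zeroed when they deviate from gray."""
    out = arr.copy()
    mod_img = (arr + 150.0) / 2
    dev = np.abs(mod_img - mod_img.mean(axis=2, keepdims=True)).sum(axis=2)
    out[dev > maxdev] = 0          # 2-D boolean mask selects whole pixels
    return out

filtered = filter_gray(im_arr)
print(filtered)
```

Working on a copy keeps the original image intact, which makes it easier to compare against the loop version.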
This can be done without any loop. I'll try to break out every step into a dedicated line:
import numpy as np
im_arr = np.random.rand(300,400,3) # Assuming this is what your image looks like
img_shifted = (im_arr + 150) / 2 # This can be done in one go
mean_v = np.mean(img_shifted, axis=2) # Compute the mean along the channel axis
diff_img = np.abs(mean_v[:,:,None] - img_shifted) # Broadcasting to subtract n x m from n x m x k
maxdev = 2
selection = np.sum(diff_img, axis=2) > maxdev
im_arr[selection] = 0 # Using fancy indexing with booleans
Below I implemented the gradient descent function found here. (Method: batch_gradient_descent) I wanted to try and implement it without numpy because I don't know numpy arrays well and I want to intuitively understand the required operations.
I created a simple linear relation in lists x and y such that y0 = 5x0+1. x is a list of n two-element lists and y is a list of n elements. The second element of each entry of x is 1 to account for the bias weight. The weights are initialized to [0,0].
I think my issue is in GetGradient(), as I have tested GetHypothesis() (a dot product between the weights and each 2-element input list) and Subtract1D(), both of which are easy enough to verify on run 1 of the loop when the weights are 0.
GetGradient() calculates X.T.dot(loss) / m, as found in their tutorial, which is what this method implements. But I don't know where I am going wrong. The gradient shuffles off to nan-land pretty soon after 10 iterations.
First ever posted question, so I appreciate the help.
def dot(K, L):
    if len(K) != len(L):
        return False
    return sum(i[0] * i[1] for i in zip(K, L))
def GetHypothesis(input,weights):
    hyp = []
    for i in range(len(input)):
        d = dot(input[i],weights)
        hyp.append(d)
    return hyp
def Subtract1D(hyp,act):
    loss = []
    for i in range (len(hyp)):
        loss.append(hyp[i]-act[i])
    return loss
def GetGradient(x,loss):
    col1 = []
    col2 = []
    for ele in x:
        col1.append(ele[0])
        col2.append(ele[1])
    w1 = dot(col1,loss)
    w2 = dot(col2,loss)
    w1 = w1/len(x)
    w2 = w2/len(x)
    return [w1, w2]
def GradientDescent(x,y,w,lr=0.1):
    for i in range(10):
        # hypothesis
        h = GetHypothesis(x,w)
        loss = Subtract1D(h,y)
        gradientVector = []
        grad = GetGradient(x,loss)
        grad[0] = grad[0]*lr
        grad[1] = grad[1]*lr
        w[0] = (w[0] - grad[0])
        w[1] = (w[1] - grad[1])
        print(grad)
    return w
x = list(range(1,50))
y = [i*5+1 for i in x]
for i in range(len(x)):
    x[i] = [x[i],1]
w = [0,0]
GradientDescent(x,y,w)
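The gradient computation above looks consistent with X.T.dot(loss) / m, so a likely culprit is the step size rather than GetGradient(). With x running from 1 to 49, the mean of x squared is 825, and the largest eigenvalue of X.T.dot(X) / m is about 826, so plain gradient descent should only be stable for a learning rate below roughly 2/826, or about 0.0024. The sketch below is a compact pure-Python rewrite (function names are my own) that compares lr=0.1 with lr=0.002; the iteration count is a guess that happens to be enough for this data.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gradient_descent(x, y, w, lr, n_iter):
    """Batch gradient descent for y ~ w[0]*x + w[1], without numpy."""
    w = list(w)
    m = len(x)
    cols = list(zip(*x))                      # columns of X, i.e. the rows of X.T
    for _ in range(n_iter):
        loss = [dot(row, w) - target for row, target in zip(x, y)]
        grad = [dot(col, loss) / m for col in cols]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

xs = list(range(1, 50))
y = [5 * v + 1 for v in xs]
x = [[v, 1] for v in xs]

diverged = gradient_descent(x, y, [0.0, 0.0], lr=0.1, n_iter=10)
converged = gradient_descent(x, y, [0.0, 0.0], lr=0.002, n_iter=20000)
print(diverged)    # huge magnitudes: the step overshoots on every iteration
print(converged)   # approaches the true weights [5, 1]
```

Convergence of the bias weight is slow because the two columns have very different scales; rescaling x (for example dividing by its maximum) would allow a larger learning rate and far fewer iterations.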
I have to implement least square fitting algorithm for this model function
Y = a_0 * e^(a_1*x_1+a_2*x_2+...+a_n*x_n)
The approach I found was to define a function that calculates the residuals and pass it to scipy.optimize.leastsq or lmfit. Yet I cannot make it work with multidimensional data, where the parameters are a vector rather than single values.
def residual(variables,X,y):
    a_0 = variables[0]
    a = variables[1]
    return (y - a_0 * np.exp(X.dot(a)))**2
X = np.random.randn(100,5)
y = np.random.randint(low=0,high=2,size=100)
a_0 = 1
a = np.random.randn(X.shape[1])
leastsq(residual,[a_0,a],args=(X,y))
I get this error.
ValueError: setting an array element with a sequence.
Can you point me to the right course of action from here?
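I think something like this should do the job:
import numpy as np
import scipy.optimize
def residual(variables,X,y):
    a_0 = variables[0]
    a = variables[1:]  # remaining entries are the coefficients a_1..a_n
    # leastsq squares and sums the residuals itself, so return them unsquared
    return y - a_0 * np.exp(X.dot(a))
X = np.random.randn(100,5)
y = np.random.randint(low=0,high=2,size=100)
a = np.random.randn(X.shape[1]+1)  # one flat vector: [a_0, a_1, ..., a_n]
a[0] = 1
res = scipy.optimize.leastsq(residual,a,args=(X,y))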
Regards
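As a quick sanity check of the flat-parameter-vector idea, the sketch below fits noise-free synthetic data generated from known parameters and compares the result. The true parameter values, the scaling of X, and the starting point are all made up for this illustration, and the residual function returns unsquared residuals because leastsq minimizes the sum of their squares itself.

```python
import numpy as np
from scipy.optimize import leastsq

def residual(params, X, y):
    # params is one flat vector: [a_0, a_1, ..., a_n]
    a_0, a = params[0], params[1:]
    return y - a_0 * np.exp(X.dot(a))

rng = np.random.default_rng(0)
X = 0.5 * rng.normal(size=(100, 2))
true_params = np.array([2.0, 0.5, -0.3])      # hypothetical ground truth
y = true_params[0] * np.exp(X.dot(true_params[1:]))

start = np.array([1.0, 0.0, 0.0])             # a_0 = 1, all exponents 0
fitted, ier = leastsq(residual, start, args=(X, y))
print(fitted, ier)
```

With real, noisy data the fit can depend on the starting point, so it may be worth trying a few initial guesses.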