Converting numpy equation to Keras backend loss function equation - python

I'm working on a model to generate music. All of my training data is in the same key and mode, C Major. I have a numpy array keyspace with shape (n,), where n is the total number of keys on my keyboard (in a chromatic scale). The slots in that array with a 1 are keys in C Major; the slots with 0s are not in C Major.
The model predicts which keys should be pressed as an array y_pred. I want to add a term to my loss function that penalizes the model for pressing keys that aren't in C Major. That said, I don't want to penalize my model for failing to press keys in the keyspace (as not every beat uses every key in the scale!). In numpy, I can do this like so:
import numpy as np

keyspace = np.array([0, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])

loss_term = 0
for idx, i in enumerate(y_pred):
    if i:
        if not keyspace[idx]:
            loss_term += 1
loss_term
I'd now like to convert this to Keras backend functions, which means vectorizing this. Does anyone see a good way to do so? Any pointers would be very helpful!

Your code is basically:
((1-keyspace) * y_pred).sum()
Test:
def loop_loss(keyspace, y_pred):
    loss_term = 0
    for idx, i in enumerate(y_pred):
        if i and not keyspace[idx]:
            loss_term += 1
    return loss_term

keyspace, y_pred = np.random.choice([0, 1], (2, 10))

loop_loss(keyspace, y_pred) == ((1 - keyspace) * y_pred).sum()
# True
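Since the question asks for Keras backend functions, here is a minimal sketch of the same term as a custom loss component (my own, not from the answer; it assumes the keyspace can be baked in as a constant tensor and that y_pred holds per-key probabilities):

from tensorflow.keras import backend as K

def c_major_penalty(y_true, y_pred):
    # hypothetical helper: keyspace marks the in-scale keys with 1s
    keyspace = K.constant([0, 1, 0, 1, 0, 1], dtype='float32')
    # (1 - keyspace) is 1 exactly on the out-of-scale keys, so the product
    # keeps only the "wrong" presses, matching the numpy loop above
    return K.sum((1.0 - keyspace) * y_pred, axis=-1)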

Related

Issue with Python scipy optimize minimize fmin_slsqp solver

I start with the optimization function from scipy.
I tried to create my code by copying the Find optimal vector that minimizes function solution.
I have an array that contains series in columns. I need to multiply each column by a weight so that the sum of the last row of these columns, multiplied by the weights, gives a given number (constraint).
The sum of the series multiplied by the weights gives a new series, from which I extract the max draw-down, and I want to minimize this mdd.
I wrote my code as best I can (2 months of Python and 3 hours of scipy) and can't solve the error message on the function used to solve the problem.
Here is my code; any help would be much appreciated:
import numpy as np
from scipy.optimize import fmin_slsqp

# based on: https://stackoverflow.com/questions/41145643/find-optimal-vector-that-minimizes-function
# the number of columns (and so of weights) can vary; it should be generic, regardless of the number of columns

def mdd(serie):  # finding the max draw-down of a series (put aside so as not to create add'l problems)
    min = np.nanargmax(np.fmax.accumulate(serie) - serie)
    max = np.nanargmax((serie)[:min])
    return serie[np.nanargmax((serie)[:min])] - serie[min]  # max draw-down

# defining the input data
# mat is an array of 5 columns containing series of independent data
mat = np.array([[1, 0, 0, 1, 1],
                [2, 0, 5, 3, 4],
                [3, 2, 4, 3, 7],
                [4, 1, 3, 3.1, -6],
                [5, 0, 2, 5, -7],
                [6, -1, 4, 1, -8]]).astype('float32')
w = np.ndarray(shape=(5)).astype('float32')  # 1D vector of weights used to multiply the columns
w0 = np.array([1/5, 1/5, 1/5, 1/5, 1/5]).astype('float32')  # initial weights (all equal as a starting point)
fixed_value = 4.32  # as a result of constraint nb 1

# testing the operations that are going to be used in the minimization
series = np.sum(mat * w0, axis=1)

# objective:
# minimize the mdd of the series by modifying the weights (w)
def test(w, mat):
    series = np.sum(mat * w, axis=1)
    return mdd(series)

# constraints:
def cons1(last, w, fixed_value):  # fixed_value = 4.32
    # the sum of the weights multiplied by the last value of each column must equal this fixed_value
    return np.sum(mat[-1, :] * w) - fixed_value

def cons2(w):  # the sum of the weights must be equal to 1
    return np.sum(w) - 1

# solution:
# looking for the optimal set of weights (w) that minimizes the mdd,
# with the two constraints and the bounds respected
# (all w values must be between 0 and 1)
result = fmin_slsqp(test, w0, f_eqcons=[cons1, cons2], bounds=[(0.0, 1.0)]*len(w), args=(mat, fixed_value, w0), full_output=True)
weights, fW, its, imode, smode = result
print(weights)
You weren't that far off the mark. The biggest problem lies in the mdd function: in case there is no draw-down, your function produces an empty array as an intermediate result, which the argmax function can then no longer cope with.
def mdd(serie):  # finding the max draw-down of a series
    i = np.argmax(np.maximum.accumulate(serie) - serie)  # end of the period
    start = serie[:i]
    # check if there is a draw-down at all
    if not start.any():
        return 0
    j = np.argmax(start)  # start of period
    return serie[j] - serie[i]  # max draw-down
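For instance (illustrative values of my own, not from the answer), the guard now returns 0 for a series without any draw-down instead of crashing on an empty slice:

import numpy as np

rising = np.array([1.0, 2.0, 3.0])   # strictly increasing: no draw-down
falling = np.array([3.0, 1.0, 2.0])  # drops from 3 to 1: draw-down of 2

print(mdd(rising))   # 0
print(mdd(falling))  # 2.0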
In addition, you must make sure that the parameter list is the same for all functions involved (cost function and constraints).
# objective:
# minimize the mdd of the series by modifying the weights (w)
def test(w, mat, fixed_value):
    series = mat @ w
    return mdd(series)

# constraints:
def cons1(w, mat, fixed_value):  # fixed_value = 4.32
    # the sum of the weights multiplied by the last value of each column must equal this fixed_value
    return mat[-1, :] @ w - fixed_value

def cons2(w, mat, fixed_value):  # the sum of the weights must be equal to 1
    return np.sum(w) - 1

# solution:
# looking for the optimal set of weights (w) that minimizes the mdd,
# with the two constraints and the bounds respected
# (all w values must be between 0 and 1)
result = fmin_slsqp(test, w0, eqcons=[cons1, cons2], bounds=[(0.0, 1.0)]*len(w), args=(mat, fixed_value), full_output=True)
One more remark: you can make the matrix-vector multiplications much leaner with the @ operator.
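As a quick illustration (my own, not from the answer), mat @ w produces the same per-row weighted sums as the question's np.sum(mat * w, axis=1):

import numpy as np

mat = np.arange(12, dtype='float32').reshape(4, 3)  # 4 rows, 3 columns
w = np.array([0.2, 0.3, 0.5], dtype='float32')      # one weight per column

# the matrix-vector product computes sum(row * w) for each row
assert np.allclose(mat @ w, np.sum(mat * w, axis=1))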

Random weight initialisation influence on a simple neural network

I am following a book which has the following code:
import numpy as np

np.random.seed(1)

streetlights = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 1], [1, 1, 1]])
walk_vs_stop = np.array([[1, 1, 0, 0]]).T

def relu(x):
    return (x > 0) * x

def relu2deriv(output):
    return output > 0

alpha = 0.2
hidden_layer_size = 4

# random weights from the first layer to the second
weights_0_1 = 2 * np.random.random((3, hidden_layer_size)) - 1
# random weights from the second layer to the output
weights_1_2 = 2 * np.random.random((hidden_layer_size, 1)) - 1

for iteration in range(60):
    layer_2_error = 0
    for i in range(len(streetlights)):
        layer_0 = streetlights[i : i + 1]
        layer_1 = relu(np.dot(layer_0, weights_0_1))
        layer_2 = relu(np.dot(layer_1, weights_1_2))
        layer_2_error += np.sum((layer_2 - walk_vs_stop[i : i + 1])) ** 2
        layer_2_delta = layer_2 - walk_vs_stop[i : i + 1]
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
        weights_1_2 -= alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 -= alpha * layer_0.T.dot(layer_1_delta)
    if iteration % 10 == 9:
        print(f"Error: {layer_2_error}")
Which outputs:
# Error: 0.6342311598444467
# Error: 0.35838407676317513
# Error: 0.0830183113303298
# Error: 0.006467054957103705
# Error: 0.0003292669000750734
# Error: 1.5055622665134859e-05
I understand everything, but this part is not explained and I am not sure why it is the way it is:
weights_0_1 = 2*np.random.random((3, hidden_layer_size)) - 1
weights_1_2 = 2*np.random.random((hidden_layer_size, 1)) - 1
I don't understand:
Why the whole matrix is multiplied by 2, and why 1 is subtracted afterwards.
If I change the 2 to 3, my error becomes much lower: # Error: 5.616513576418916e-13
I tried changing the 2 to many other numbers, along with changing the -1 to many other numbers; I get # Error: 2.0 most of the time, or an error much worse than with the combination of 3 and -1.
I can't seem to grasp the relationship and the purpose of multiplying the random weights by a number and subtracting a number afterwards.
P.S. The idea of the network is to learn a streetlight pattern: when people should go and when they should stop, depending on which combination of the lights in the streetlight is on / off.
There are a lot of ways to initialize a neural network, and it's a current research subject, as it can have a great impact on performance and training time. Some rules of thumb:
avoid having only one value for all weights, as they would all update the same way
avoid having too-large weights that could make your gradient too high
avoid having too-small weights that could make your gradient vanish
In your case, the goal is just to have something in [-1, 1):
np.random.random gives you a float in [0, 1)
multiplying by 2 gives you something in [0, 2)
subtracting 1 gives you a number in [-1, 1)
2*np.random.random((3, 4)) - 1 is a way to generate 3*4 = 12 random numbers from a uniform distribution over the half-open interval [-1, +1), i.e. including -1 but excluding +1.
This is equivalent to the more readable
np.random.uniform(-1, 1, (3, 4))
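A quick sanity check (my own sketch, not from the answers) that both forms stay inside the half-open interval [-1, 1):

import numpy as np

a = 2 * np.random.random((3, 4)) - 1  # scale [0, 1) to [0, 2), then shift to [-1, 1)
b = np.random.uniform(-1, 1, (3, 4))  # draw directly from [-1, 1)

assert (a >= -1).all() and (a < 1).all()
assert (b >= -1).all() and (b < 1).all()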

Simple neural network gives wrong output after training

I've been working on a simple neural network.
It takes in a data set with 3 columns; if the first column's value is a 1, then the output should be a 1.
I've provided comments so it is easier to follow.
Code is as follows:
import numpy as np
import random

def sigmoid_derivative(x):
    return x * (1 - x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def think(weights, inputs):
    sum = (weights[0] * inputs[0]) + (weights[1] * inputs[1]) + (weights[2] * inputs[2])
    return sigmoid(sum)

if __name__ == "__main__":
    # Assign random weights
    weights = [-0.165, 0.440, -0.867]

    # Training data for the network.
    training_data = [
        [0, 0, 1],
        [1, 1, 1],
        [1, 0, 1],
        [0, 1, 1]
    ]

    # The answers correspond to the training_data by position,
    # so the first element of training_answers is the answer to the first element of training_data
    # NOTE: The pattern is: if there's a 1 in the first place, the result should be a one
    training_answers = [0, 1, 1, 0]

    # Train the neural network
    for iteration in range(50000):
        # Pick a random piece of training_data
        selected = random.randint(0, 3)
        training_output = think(weights, training_data[selected])
        # Calculate the error
        error = training_output - training_answers[selected]
        # Calculate the adjustments that need to be applied to the weights
        adjustments = np.dot(training_data[selected], error * sigmoid_derivative(training_output))
        # Apply adjustments, maybe something wrong is going on here?
        weights += adjustments

    print("The Neural Network has been trained!")
    # Result of print below should be close to 1
    print(think(weights, [1, 0, 0]))
The result of the last print should be close to 1, but it is not.
I have a feeling that I'm not adjusting the weights correctly.

How to subtract two tensors with a mask in TensorFlow?

I am implementing the YOLO network with a self-defined loss.
Say there are two tensors, GT and PD (ground truth and predictions). Both are 2-dim matrices of 4x4.
Assume GT is:
0,0,0,0
0,1,0,0
0,0,1,0
0,0,0,0
PD has the same size, with some random numbers.
Here I need to calculate the Mean Squared Error separately: the MSE over the ones in GT, and the MSE over the zeros in GT.
I prefer to use a mask to cover the unrelated elements, so that the calculation only involves the related elements. I already implemented this in numpy, but I don't know how to do this with tf (v1.14):
import numpy as np
import numpy.ma as ma

conf = y_true[..., 0]
conf = np.expand_dims(conf, -1)
conf_pred = y_pred[..., 0]
conf_pred = np.expand_dims(conf_pred, -1)

noobj_conf = ma.masked_equal(conf, 1)  # cover grid cells with objects
obj_conf = ma.masked_equal(conf, 0)    # cover grid cells without objects

loss_obj = np.sum(np.square(obj_conf - conf_pred))
loss_noobj = np.sum(np.square(noobj_conf - conf_pred))
Any suggestions about how to implement this in tensorflow?
If I understand you correctly, you want to calculate the mean squared errors of the 0's and 1's separately.
You can do something like below:
import tensorflow as tf

y_true = tf.constant([[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]], dtype=tf.float32)
y_pred = tf.random.uniform([4, 4], minval=0, maxval=1)

# find indices where 0 is present in y_true
indices0 = tf.where(tf.equal(y_true, tf.zeros([1])))
# find indices where 1 is present in y_true
indices1 = tf.where(tf.equal(y_true, tf.ones([1])))

# find all values in y_pred which are present at indices0
y_pred_indices0 = tf.gather_nd(y_pred, indices0)
# find all values in y_pred which are present at indices1
y_pred_indices1 = tf.gather_nd(y_pred, indices1)

# mse loss calculations
mse0 = tf.losses.mean_squared_error(labels=tf.gather_nd(y_true, indices0), predictions=y_pred_indices0)
mse1 = tf.losses.mean_squared_error(labels=tf.gather_nd(y_true, indices1), predictions=y_pred_indices1)
# mse0 = tf.reduce_sum(tf.squared_difference(tf.gather_nd(y_true, indices0), y_pred_indices0))
# mse1 = tf.reduce_sum(tf.squared_difference(tf.gather_nd(y_true, indices1), y_pred_indices1))

with tf.Session() as sess:
    y_, loss0, loss1 = sess.run([y_pred, mse0, mse1])
    print(y_)
    print(loss0, loss1)
output:
[[0.12770343 0.43467927 0.9362457 0.09105921]
[0.46243036 0.8838414 0.92655015 0.9347118 ]
[0.14018488 0.14527774 0.8395766 0.14391887]
[0.1209656 0.7793218 0.70543754 0.749542 ]]
0.341359 0.019614244
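As a side note (my own sketch, not part of the answer), the same split can also be written with tf.boolean_mask, which avoids the explicit index gathering:

mask1 = tf.equal(y_true, 1.0)   # True where GT marks an object
mask0 = tf.logical_not(mask1)   # True where GT is empty

# keep only the relevant elements, then average the squared differences
mse0_alt = tf.reduce_mean(tf.square(tf.boolean_mask(y_true - y_pred, mask0)))
mse1_alt = tf.reduce_mean(tf.square(tf.boolean_mask(y_true - y_pred, mask1)))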

Equivalent of numpy.digitize in tensorflow

I am working on a customised loss function that uses numpy.digitize() internally. The loss is minimised over a set of parameters that are the bin values used in the digitize method. In order to use the tensorflow optimisers, I would like to know if there is an equivalent implementation of digitize in tensorflow? If not, is there a good way to implement a workaround?
Here is a numpy version:
def fom_func(b, n):
    return np.where((b > 0) & (n > 0), np.sqrt(2*(n*np.log(np.divide(n, b)) + b - n)), 0)

def loss(param, X, y):
    param = np.sort(np.asarray(param))
    nbins = param.shape[0]
    score = 0
    y_pred = np.digitize(X, param)
    for c in np.arange(nbins):
        b = np.where((y == 0) & (y_pred == c), 1, 0).sum()
        n = np.where((y_pred == c), 1, 0).sum()
        score += fom_func(b, n)**2
    return -np.sqrt(score)
The equivalent of the np.digitize method is called bucketize in TensorFlow; quoting from this api doc:
Bucketizes 'input' based on 'boundaries'.
Summary
For example, if the inputs are boundaries = [0, 10, 100] and input = [[-5, 10000], [150, 10], [5, 100]], then the output will be output = [[0, 3], [3, 2], [1, 3]].
Arguments:
scope: A Scope object
input: Any shape of Tensor containing int or float values.
boundaries: A sorted list of floats giving the boundaries of the buckets.
Returns:
Output: Same shape as 'input', with each value of input replaced by its bucket index.
(numpy) Equivalent to np.digitize.
I'm not sure why, but this method is hidden in TensorFlow (see the hidden_ops.txt file), so I wouldn't count on it, even though you can import it by doing:
from tensorflow.python.ops import math_ops
math_ops._bucketize
This has helped me; you only have to pay attention that the assignment does not happen to the right or to the left of a boundary, but with regard to the intervals in between the bins:
import tensorflow_probability as tfp
tfp.stats.find_bins()
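A minimal usage sketch (my own, with illustrative edge values, not from the answer):

import tensorflow as tf
import tensorflow_probability as tfp

x = tf.constant([0.5, 3.2, 7.9])
edges = tf.constant([0.0, 2.0, 5.0, 10.0])  # three bins: [0, 2), [2, 5), [5, 10]

bins = tfp.stats.find_bins(x, edges)  # expected bin indices: [0., 1., 2.]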
