I recently learned about Q-Learning with the example of the Gym environment "CartPole-v1".
The predict function of said model always returns a vector that looks like [[ 0.31341377 -0.03776223]]. I created my own little game, where the AI has to move left or right with output 0 and 1. I just show a list [0, 0, 1, 0, 0] to the network; if it outputs 0 it goes left, if it outputs 1 it goes right. Reach the left 0 and you win, reach the right 0 and you lose. Really easy. When I print my output vector, however, I always get something like this:
[[0.01347399 0.04450664]
[0.01347399 0.04450664]
[0.01347399 0.04450664]
[0.1216775 0.38299465]
[0.01347399 0.04450664]]
This messes with the learning function, because np.argmax() over the flattened array then returns an index like 4 or 5, and the network cannot handle this given that there are only 2 actions to begin with.
This is the init of my model:
from tensorflow.keras.layers import Activation, Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def __init__(self, state_shape, num_actions, lr):
    super(DQN, self).__init__()
    self.state_shape = state_shape  # (1,)
    self.num_actions = num_actions  # 2
    self.lr = lr                    # 1e-3

    input_state = Input(shape=state_shape)
    x = Dense(20)(input_state)
    x = Activation('relu')(x)
    x = Dense(20)(x)
    x = Activation('relu')(x)
    output_pred = Dense(self.num_actions)(x)

    self.model = Model(inputs=input_state, outputs=output_pred)
    self.model.compile(loss="mse", optimizer=Adam(lr=self.lr))
Full code is available at https://www.mediafire.com/file/rq7ogjxpr990e51/dqn.py/file.
How do I crop the output vector? Or how would I have to change my inputs to get a useful output?
Edit:
I've experimented a little more, and increasing num_actions from currently 2 to, for example, 4 does widen the vector horizontally, so it looks like this:
[[ 0.00109814 0.01464381 -0.00270887 -0.00422738]
[ 0.00109814 0.01464381 -0.00270887 -0.00422738]
[-0.01450843 0.10628925 -0.06114068 -0.10908635]
[ 0.00109814 0.01464381 -0.00270887 -0.00422738]
[ 0.00109814 0.01464381 -0.00270887 -0.00422738]]
This means num_actions being 2 is not the problem; it's rather that the network outputs 5 rows instead of 1.
So after even more experiments, I have found a solution.
The input is still the list [0, 0, 1, 0, 0], which has len() = 5.
This explains the five rows: Keras treats the first dimension of the input as the batch dimension, so a plain length-5 list is read as 5 samples with 1 feature each. If it's changed to [[0, 0, 1, 0, 0]] and the state_shape is changed to (5,), it works and I get a single vector with 2 values.
*All the other functions that access the list have to be changed from board[idx] to board[0][idx].
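A minimal sketch of that fix, assuming model refers to the compiled Keras model from the __init__ above (the board name is taken from the question):

import numpy as np

# One sample with 5 features, i.e. shape (1, 5), not 5 samples with 1 feature each.
board = np.array([[0, 0, 1, 0, 0]])

q_values = model.predict(board)  # shape (1, 2): one row of Q-values, one per action
action = np.argmax(q_values[0])  # 0 (left) or 1 (right)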
I'm trying to create a 2-layer neural network. To do that, I first initialize the weights and biases to random floats between 0 and 1 using numpy.random.rand. However, for some reason this process produces floats bigger than 1 for W1 (weight 1), whereas it works correctly for all the other weights and biases. I can't understand why this happens. I thought maybe something affects the function from the outside, but I couldn't detect any part of the function that could be affected from outside it.
import numpy as np

### CONSTANTS DEFINING THE MODEL ####
n_x = 12288  # num_px * num_px * 3
n_h = 7
n_y = 1
layers_dims = (n_x, n_h, n_y)

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2"
    """
    np.random.seed(1)
    parameters = {}
    parameters["W1"] = np.random.rand(n_h, n_x)  # (7, 12288)
    parameters["b1"] = np.random.rand(n_h, 1)    # (7, 1)
    parameters["W2"] = np.random.rand(n_y, n_h)  # (1, 7)
    parameters["b2"] = np.random.rand(n_y, 1)    # (1, 1)
    return parameters

parameters = initialize_parameters_deep(layers_dims)
print(parameters)
Output:
{'W1': array([[4.17022005e-01, 7.20324493e-01, 1.14374817e-04, ...,
3.37562919e-01, 1.12292153e-01, 5.37047221e-01],
[7.07934286e-01, 3.37726007e-01, 7.07954162e-01, ...,
4.22040811e-01, 7.78593215e-01, 3.49866021e-01],
[9.01338451e-01, 7.95132845e-03, 1.03777034e-01, ...,
2.78602449e-01, 5.05813021e-02, 8.26828833e-01],
...,
[5.62717083e-03, 6.58208224e-01, 3.88407263e-01, ...,
5.56312618e-01, 8.69650932e-01, 1.00112287e-01],
[4.16278934e-01, 4.56060621e-01, 9.33378848e-01, ...,
9.52798385e-01, 9.41894584e-01, 4.44342962e-01],
[8.89254832e-01, 6.42558949e-01, 2.29427262e-01, ...,
8.05884494e-01, 1.80676088e-01, 6.12694420e-01]]), 'b1': array([[0.11933315],
[0.50073416],
[0.21336813],
[0.14223935],
[0.60809243],
[0.41994954],
[0.43137737]]), 'W2': array([[0.81360697, 0.44638382, 0.41794085, 0.08649817, 0.29957473,
0.33706742, 0.24721952]]), 'b2': array([[0.92363097]])}
It's not generating floats bigger than 1; it's just displaying them in scientific notation.
4.17022005e-01 is the same as 0.417022005, and 1.14374817e-04 is the same as 0.000114374817.
The e-01, e-02, e-03, etc. at the end of the W1 numbers just mean that the numbers are written in exponential format. So if you have, for example, 2.786e-01, that is the same as 2.786/10, which is 0.2786. The same goes for 2.786e-03 == 2.786/1000 == 0.002786. In general, e+2 means times 10^2 and e-2 means times 1/(10^2).
Pay attention to the final few characters printed when you print your weights array, e.g. e-01. This represents base-10 exponentiation, meaning that the value of a given weight is the number printed times 10 raised to the given power.
All of the powers are negative, meaning the weights have small but positive values in the range [0, 1).
For example, 4.17022005e-01 equals 0.417022005.
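If the scientific notation is distracting, NumPy can print plain decimals instead; a quick demonstration (not tied to the code above):

import numpy as np

x = np.array([4.17022005e-01, 1.14374817e-04])
print(x)      # printed in scientific notation because of the tiny value
print(x < 1)  # [ True  True] -- both values really are below 1

np.set_printoptions(suppress=True)  # switch to plain decimal display
print(x)      # [0.41702201 0.00011437]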
I have the following bit of code that is designed to find the first and last pixels with a pixel intensity >= 25 and then crop to that bounding box:
white_pixels = tf.where(input_img >= 25)
first_white_pixel = white_pixels[:, 0]
first_white_pixel = tf.cast(first_white_pixel, dtype=tf.int32)
last_white_pixel = white_pixels[:, -1]
last_white_pixel = tf.cast(last_white_pixel, dtype=tf.int32)
cropped = tf.image.crop_to_bounding_box(input_img, first_white_pixel[0], 0, last_white_pixel[0] - first_white_pixel[0], 299)
However, I keep getting an error saying that the target_height in tf.image.crop_to_bounding_box must be above 0. In all my images, the result of last_white_pixel[0] - first_white_pixel[0] is definitely above 0. The code is executed as a symbolic tensor in TensorFlow 2.3 and works fine in a non-symbolic setting (for lack of a better term).
My mistake seems to have been a slicing error. Instead of writing
first_white_pixel = white_pixels[:, 0]
last_white_pixel = white_pixels[:, -1]
I should have written
first_white_pixel = white_pixels[0, :]
last_white_pixel = white_pixels[-1, :]
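Putting that fix back into the original snippet gives something like the sketch below. It assumes, as in the question, that input_img is an image tensor 299 pixels wide and that tf.where returns the matching coordinates in row-major order; the + 1 on the height is my addition so the crop keeps the last matching row (and stays above 0 even if only one row matches):

import tensorflow as tf

white_pixels = tf.where(input_img >= 25)                   # (N, rank) coordinates
first_white_pixel = tf.cast(white_pixels[0, :], tf.int32)  # coords of first match
last_white_pixel = tf.cast(white_pixels[-1, :], tf.int32)  # coords of last match

cropped = tf.image.crop_to_bounding_box(
    input_img,
    first_white_pixel[0],                            # offset_height
    0,                                               # offset_width
    last_white_pixel[0] - first_white_pixel[0] + 1,  # target_height
    299)                                             # target_width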
I am following a book which has the following code:
import numpy as np

np.random.seed(1)

streetlights = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 1], [1, 1, 1]])
walk_vs_stop = np.array([[1, 1, 0, 0]]).T

def relu(x):
    return (x > 0) * x

def relu2deriv(output):
    return output > 0

alpha = 0.2
hidden_layer_size = 4

# random weights from the first layer to the second
weights_0_1 = 2*np.random.random((3, hidden_layer_size)) -1
# random weights from the second layer to the output
weights_1_2 = 2*np.random.random((hidden_layer_size, 1)) -1

for iteration in range(60):
    layer_2_error = 0
    for i in range(len(streetlights)):
        layer_0 = streetlights[i : i + 1]
        layer_1 = relu(np.dot(layer_0, weights_0_1))
        layer_2 = relu(np.dot(layer_1, weights_1_2))

        layer_2_error += np.sum((layer_2 - walk_vs_stop[i : i + 1])) ** 2

        layer_2_delta = layer_2 - walk_vs_stop[i : i + 1]
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)

        weights_1_2 -= alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 -= alpha * layer_0.T.dot(layer_1_delta)

    if iteration % 10 == 9:
        print(f"Error: {layer_2_error}")
Which outputs:
# Error: 0.6342311598444467
# Error: 0.35838407676317513
# Error: 0.0830183113303298
# Error: 0.006467054957103705
# Error: 0.0003292669000750734
# Error: 1.5055622665134859e-05
I understand everything except this part, which is not explained, and I am not sure why it is the way it is:
weights_0_1 = 2*np.random.random((3, hidden_layer_size)) -1
weights_1_2 = 2*np.random.random((hidden_layer_size, 1)) -1
I don't understand:
Why the whole matrix is multiplied by 2, and why there is a -1.
If I change the 2 to 3, my error becomes much lower: # Error: 5.616513576418916e-13.
I tried changing the 2 to many other numbers, along with changing the -1 to many other numbers; I get # Error: 2.0 most of the time, or the error is much worse than with the combination of 3 and -1.
I can't seem to grasp the relationship and the purpose of multiplying the random weights by a number and subtracting a number afterwards.
P.S. The idea of the network is to learn a streetlight pattern: when people should go and when they should stop, depending on which combination of lights in the streetlight is on / off.
There are a lot of ways to initialize a neural network, and it's an active research subject, as initialization can have a great impact on performance and training time. Some rules of thumb:
avoid having only one value for all weights, as they would all update the same way
avoid having weights that are too large, which could make your gradient blow up
avoid having weights that are too small, which could make your gradient vanish
In your case, the goal is just to have something in [-1; 1):
np.random.random gives you a float in [0; 1)
multiplying by 2 gives you something in [0; 2)
subtracting 1 gives you a number in [-1; 1)
2*np.random.random((3, 4)) -1 is a way to generate 3*4 = 12 random numbers from a uniform distribution over the half-open interval [-1, +1), i.e. including -1 but excluding +1.
This is equivalent to the more readable
np.random.uniform(-1, 1, (3, 4))
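A quick sanity check of the claimed range (a small sketch; the (3, 4) shape is just the example from the answer):

import numpy as np

w = 2 * np.random.random((100000,)) - 1
print(w.min() >= -1, w.max() < 1)  # True True -- samples stay in [-1, 1)

u = np.random.uniform(-1, 1, (3, 4))  # same distribution, stated directly
print(u.shape)  # (3, 4)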
I'm working on a model to generate music. All of my training data is in the same key and mode, C Major. I have a numpy array keyspace with shape (n,), where n is the total number of keys on my keyboard (in a chromatic scale). The slots in that array with a 1 are keys that are in C Major; the slots with 0s are not in C Major.
The model predicts which keys should be pressed as an array y_pred. I want to add a term to my loss function that penalizes the model for pressing keys that aren't in C Major. That said, I don't want to penalize my model for failing to press keys in the keyspace (as not every beat uses every key in the scale!). In numpy, I can do this like so:
import numpy as np
keyspace = np.array( [0, 1, 0, 1, 0, 1] )
y_pred = np.array( [1, 0, 0, 1, 0, 1] )
loss_term = 0
for idx, i in enumerate(y_pred):
    if i:
        if not keyspace[idx]:
            loss_term += 1
loss_term
I'd now like to convert this to Keras backend functions, which means vectorizing this. Does anyone see a good way to do so? Any pointers would be very helpful!
Your code is basically:
((1-keyspace) * y_pred).sum()
Test:
def loop_loss(keyspace, y_pred):
    loss_term = 0
    for idx, i in enumerate(y_pred):
        if i and not keyspace[idx]:
            loss_term += 1
    return loss_term
keyspace, y_pred = np.random.choice([0,1], (2,10))
loop_loss(keyspace, y_pred) == ((1-keyspace) * y_pred).sum()
# True
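Since the question asks for Keras backend functions, the same expression carries over directly. A sketch (assuming keyspace has been turned into a constant tensor and y_pred is the model's output; the function name is mine):

from tensorflow.keras import backend as K

def c_major_penalty(keyspace, y_pred):
    # For 0/1 predictions this counts out-of-key presses;
    # for soft predictions it weighs each press by its probability.
    return K.sum((1 - keyspace) * y_pred)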
I've been working on a simple neural network.
It takes in a data set with 3 columns; if the first column's value is a 1, then the output should be a 1.
I've provided comments so it is easier to follow.
Code is as follows:
import numpy as np
import random

def sigmoid_derivative(x):
    return x * (1 - x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def think(weights, inputs):
    sum = (weights[0] * inputs[0]) + (weights[1] * inputs[1]) + (weights[2] * inputs[2])
    return sigmoid(sum)

if __name__ == "__main__":
    # Assign random weights
    weights = [-0.165, 0.440, -0.867]

    # Training data for the network.
    training_data = [
        [0, 0, 1],
        [1, 1, 1],
        [1, 0, 1],
        [0, 1, 1]
    ]

    # The answers correspond to the training_data by place,
    # so the first element of training_answers is the answer to the first element of training_data.
    # NOTE: The pattern is: if there's a 1 in the first place, the result should be a 1.
    training_answers = [0, 1, 1, 0]

    # Train the neural network
    for iteration in range(50000):
        # Pick a random piece of training_data
        selected = random.randint(0, 3)
        training_output = think(weights, training_data[selected])

        # Calculate the error
        error = training_output - training_answers[selected]

        # Calculate the adjustments that need to be applied to the weights
        adjustments = np.dot(training_data[selected], error * sigmoid_derivative(training_output))

        # Apply adjustments, maybe something wrong is going on here?
        weights += adjustments

    print("The Neural Network has been trained!")

    # Result of the print below should be close to 1
    print(think(weights, [1, 0, 0]))
The result of the last print should be close to 1, but it is not.
I have a feeling that I'm not adjusting the weights correctly.
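For what it's worth, two things stand out in the update step; the sketch below is my reading, not a verified answer. First, with error defined as output minus target, gradient descent should subtract the adjustment. Second, weights is a plain Python list, so weights += adjustments extends the list with three new elements every iteration instead of adding element-wise; keeping it as a NumPy array avoids that:

import numpy as np
import random

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

weights = np.array([-0.165, 0.440, -0.867])  # NumPy array: -= updates element-wise
training_data = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_answers = np.array([0, 1, 1, 0])

for iteration in range(50000):
    selected = random.randint(0, 3)
    output = sigmoid(np.dot(weights, training_data[selected]))
    error = output - training_answers[selected]
    # Descend the gradient: subtract the adjustment instead of adding it.
    weights -= training_data[selected] * error * sigmoid_derivative(output)

print(sigmoid(np.dot(weights, [1, 0, 0])))  # should now be close to 1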