I am trying to use pybrain to output RGB values. The input layer takes an array of RGB values, and all hidden layers are linear layers. I would have expected the network to output RGB values, but the output of this network turns out to be an array of values nowhere near the 0:255 range.
The images are about 25 different .jpg images of a bull. Each image is a flattened array of length 575280. I was hoping the network would converge on an image that ends up resembling a bull.
import numpy as np
from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, GaussianLayer, TanhLayer
from pybrain.structure import FullConnection, BiasUnit
import testabull
bull_x = 510
bull_y = 398
bull_flat = 575280
n = FeedForwardNetwork()
bias_unit = BiasUnit()
in_layer = LinearLayer(bull_flat)
hidden_A = LinearLayer(5)
hidden_B = LinearLayer(10)
out_layer = LinearLayer(bull_flat)
n.addInputModule(in_layer)
n.addModule(hidden_A)
n.addModule(hidden_B)
n.addOutputModule(out_layer)
n.addModule(bias_unit)
in_to_hidden = FullConnection(in_layer, hidden_A)
hidden_to_hidden = FullConnection(hidden_A, hidden_B)
hidden_to_out = FullConnection(hidden_B, out_layer)
bias_to_hidden = FullConnection(bias_unit, hidden_B)
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_hidden)
n.addConnection(bias_to_hidden)
n.addConnection(hidden_to_out)
n.sortModules()
bull_img_array = testabull.crop_the_bull_images('../../imgs/thebull/')
trainable_array = [] ## an array of flattened images
for im in bull_img_array:
    flat_im = np.array(im).flatten()
    trainable_array.append(flat_im)
print n
print n.activate(trainable_array[0])
output = None
for a in trainable_array:
    output = n.activate(a)
print output, len(output)
If anyone has any tips I would be very grateful.
First off, there are two issues here. One: you need to scale your outputs to lie between 0 and 255. You can do this with a transformation afterwards, by taking the max and min values and rescaling the outputs into the 0:255 range.
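For example, a minimal min-max rescaling sketch (reusing the n and trainable_array names from the question):
out = n.activate(trainable_array[0])
# map the raw outputs onto [0, 255]
scaled = (out - out.min()) / (out.max() - out.min()) * 255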
On the other hand, this network will likely not learn what you'd like it to: your hidden layers are linear layers. This is not very useful, since the composition of linear transformations is itself a linear transformation, so you'll essentially end up with a linear function. See ftp://ftp.sas.com/pub/neural/FAQ2.html#A_act
I would recommend using a SigmoidLayer for your hidden layers; this of course squashes the values between 0 and 1. You can correct this at the output by multiplying by 255, either via a fixed layer or by transforming the values afterwards.
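A rough sketch of what that could look like (sigmoid hidden and output layers, then a fixed scale back to 0:255; this is an illustration of the idea rather than a tuned setup):
from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
n = FeedForwardNetwork()
in_layer = LinearLayer(bull_flat)
hidden_A = SigmoidLayer(5)
hidden_B = SigmoidLayer(10)
out_layer = SigmoidLayer(bull_flat)   # outputs squashed into [0, 1]
n.addInputModule(in_layer)
n.addModule(hidden_A)
n.addModule(hidden_B)
n.addOutputModule(out_layer)
n.addConnection(FullConnection(in_layer, hidden_A))
n.addConnection(FullConnection(hidden_A, hidden_B))
n.addConnection(FullConnection(hidden_B, out_layer))
n.sortModules()
rgb_out = n.activate(trainable_array[0]) * 255   # scale back to the 0:255 range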
For future use, I wanted to test a multivariate multilayer perceptron.
In order to test it, I made a simple python program.
Here's the code.
import tensorflow as tf
import pandas as pd
import numpy as np
import random
input = []
result = []
for i in range(0,10000):
    x = random.random()*100
    y = random.random()*100
    input.append([x,y])
    result.append(x*y)
input = np.array(input,dtype=float)
result = np.array(result,dtype = float)
activation_func = "relu"
unit_count = 256
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1,input_dim=2),
    tf.keras.layers.Dense(unit_count,activation=activation_func),
    tf.keras.layers.Dense(unit_count,activation=activation_func),
    tf.keras.layers.Dense(unit_count,activation=activation_func),
    tf.keras.layers.Dense(unit_count,activation=activation_func),
    tf.keras.layers.Dense(1)])
model.compile(optimizer="adam",loss="mse")
model.fit(input,result,epochs=10)
predict_input = np.array([[7,3],[5,4],[8,8]]);
print(model.predict(predict_input))
I tried this code, but the result was not good. The loss value seems to stop decreasing at some point.
I also tried with smaller x and y values, but that made the model inaccurate for bigger numbers.
I've changed the activation function, added more dense layers, and increased the number of units, but it didn't get better.
Neural networks are not able to adapt themselves (without additional training) to a different domain; this means you should train on a domain and run inference on that same domain.
With images, we often just scale the input from [0,255] to [-1,1] and let the network learn from values in this range (and during inference we always rescale the input values into the [-1,1] range).
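For instance, a common way to do that scaling (assuming img is a uint8 image array) is:
img_scaled = img.astype(np.float32) / 127.5 - 1.0   # [0, 255] -> [-1, 1]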
To solve your task you should bring the problem into a restricted domain.
In practice, if you're interested in training a model only for multiplying positive numbers, you can squash them into the [0,1] range: the multiplication of two values in this range always gives an output value in the same range.
I slightly modified your code and added some comments in the source code.
import random
import numpy as np
import pandas as pd
import tensorflow as tf
input = []
result = []
# We want to train our network to work in a fixed domain
# the [0,1] range.
# Let's also increase the training set -> more data is always better
for i in range(0, 100000):
    x = random.random()
    y = random.random()
    input.append([x, y])
    result.append(x * y)
input = np.array(input, dtype=float)
result = np.array(result, dtype=float)
activation_func = "relu"
unit_count = 256
# no need for a tons of layers
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(unit_count, input_dim=2, activation=activation_func),
        tf.keras.layers.Dense(unit_count, activation=activation_func),
        tf.keras.layers.Dense(1, use_bias=False),
    ]
)
model.compile(optimizer="adam", loss="mse")
model.fit(input, result, epochs=10)
# Bring our input values in the [0,1] range
max_value = 10
predict_input = np.array([[7, 3], [5, 4], [8, 8]]) / max_value
print(predict_input)
# Back to the original domain.
# Multiplying by max_value**2 is required because dividing both inputs
# by max_value divides their product by max_value**2.
print(model.predict(predict_input) * max_value ** 2)
Example output:
[[0.7 0.3]
[0.5 0.4]
[0.8 0.8]]
[[21.04468 ]
[20.028284]
[64.05521 ]]
I recently started learning Python and am trying to implement my first neural network. My goal is to write a function that generates a neural net with a variable number of layers and nodes. All necessary information for that is stored in layerStructure (e.g.: the first layer has four nodes, the third layer has three nodes).
import numpy as np
#Vector of input layer
input = np.array([1,2,3,4])
#Amount of nodes in each layer
layerStructure = np.array([len(input),2,3])
#Generating empty weight matrix container
weightMatrix_arr = np.array([])
#Initialising random weights matrices
for ii in range(len(layerStructure[0:-1])):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    print(randmatrix)
The code above generates the following output:
[[0.6067148 0.66445212 0.54061231 0.19334004]
[0.22385007 0.8391435 0.73625366 0.86343394]]
[[0.61794333 0.9114799 ]
[0.10626486 0.95307027]
[0.50567023 0.57246852]]
My first attempt was to store each random weight matrix in a container array called weightMatrix_arr. However, since the shape of individual matrices varies, I cannot use np.append() to store them all in the matrix container. How can I save these matrices in order to access them during the backpropagation?
You can use a list instead of an np.array:
#Generating empty weight LIST container
weightMatrixes = []
#Initialising random weights matrices
for ii in range(len(layerStructure[0:-1])):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    weightMatrixes.append(randmatrix)
    print(randmatrix)
Otherwise you can make the container a NumPy array with dtype=object (note that assigning per index is needed here, since np.append would flatten each matrix into scalars):
#Generating empty weight container (dtype=object)
weightMatrixes = np.empty(len(layerStructure) - 1, dtype=object)
#Initialising random weights matrices
for ii in range(len(layerStructure[0:-1])):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    weightMatrixes[ii] = randmatrix
    print(randmatrix)
Note that either way you can't access the inner indices without first indexing the layer matrix:
weightMatrixes[layer, 0, 3] # ERROR
weightMatrixes[layer][0, 3] # OK
If memory consumption is not a problem, you can shape all layers as the largest one and just ignore the extra cells according to the layerStructure values.
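A rough sketch of that padding idea (reusing the layerStructure array from the question; unused cells simply stay zero):
max_rows = max(layerStructure[1:])
max_cols = max(layerStructure[:-1])
weightMatrix_arr = np.zeros((len(layerStructure) - 1, max_rows, max_cols))
for ii in range(len(layerStructure) - 1):
    rows, cols = layerStructure[ii + 1], layerStructure[ii]
    weightMatrix_arr[ii, :rows, :cols] = np.random.rand(rows, cols)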
I used a Python dictionary to store the weights for each layer, with the layer number as the key. That way the weights are easy to retrieve, and the approach is simple and clean: the dictionary stores the model weights and the shape of each weight matrix doesn't matter. Below is a snippet of code.
"""def generate_weights(layers):
Weights={}
for i in range(1,len(layers)):
w0=2*np.random.random((layers[i-1],layers[i]))-1
Weights[i-1] = w0
return Weights
generate_weights([3,4,2])"""
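For example, the returned dictionary can then be indexed by layer number:
Weights = generate_weights([3, 4, 2])
print(Weights[0].shape)  # (3, 4): weights between the 3-node and 4-node layers
print(Weights[1].shape)  # (4, 2): weights between the 4-node and 2-node layers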
I am trying to create a CNN in Keras (Python 3.7) which ingests a 2D matrix input (much like a grayscale image) and outputs a 1-dimensional vector. So far I have managed to get results, but I am not sure whether what I am doing is correct (or whether my intuition is).
I input a 100x50 array into my convolutional layer. This 2D array holds the peak information at every position (i.e. the x-axis pertains to the position, the y-axis pertains to the frequency, and each cell gives the intensity). The 3D graph of this shows something akin to the one given in this link.
From all of the literature I have read, I learned that a CNN accepts image data: the image is converted into pixel values and then repeatedly convolved and pooled to get the output. However, I am using a MATLAB simulator to get my input data, and I have access to the raw 2D array containing information on the peak frequency at each point.
My intuition is this: if we normalize each cell and feed the information to the CNN, it will be as if I fed the normalized pixel values of the image to the CNN, since my raw 2D array also has height, width and depth=1, like an image.
Please enlighten me if my thinking is correct or wrong.
My code is as follows:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
import keras
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Reshape
'''load sample input'''
BGS1 = pd.read_csv("C:/Users/strain1_input.csv")
BGS2 = pd.read_csv("C:/Users/strain2_input.csv")
BGS3 = pd.read_csv("C:/Users/strain3_input.csv")
BGS_ = np.array([BGS1, BGS2, BGS3]) #3x100x50 array
BGS_normalized = BGS_/np.amax(BGS_)
'''load sample output'''
BFS1 = pd.read_csv("C:/Users/strain1_output.csv")
BFS2 = pd.read_csv("C:/Users/strain2_output.csv")
BFS3 = pd.read_csv("C:/Users/strain3_output.csv")
BFS_ = np.array([BFS1, BFS2, BFS3]) #3x100
BFS_normalized = BFS_/50 #since max value for each cell is 50
#after splitting data into training, validation and testing sets,
output_nodes = 100
n_classes = 1
batch_size_ = 8 #so far, optimized for 8 batch size
epoch = 100
input_layer = Input(shape=(45,300,1))
conv1 = Conv2D(16,3,padding="same",activation="relu", input_shape=(45,300,1))(input_layer)
pool1 = MaxPooling2D(pool_size=(2,2),padding="same")(conv1)
flat = Flatten()(pool1)
hidden1 = Dense(10, activation='softmax')(flat) #relu
batchnorm1 = BatchNormalization()(hidden1)
output_layer = Dense(output_nodes*n_classes, activation="softmax")(batchnorm1)
output_layer2 = Dense(output_nodes*n_classes, activation="relu")(output_layer)
output_reshape = Reshape((output_nodes, n_classes))(output_layer2)
model = Model(inputs=input_layer, outputs=output_reshape)
print(model.summary())
model.compile(loss='mean_squared_error', optimizer='adam', sample_weight_mode='temporal')
model.fit(train_X,train_label,batch_size=batch_size_,epochs=epoch)
predictions = model.predict(train_X)
What you did is exactly the strategy used to feed non-image data into 2D convolutional layers. As long as the model predicts correctly, what you did is correct. It's just that CNNs can perform poorly on non-image data, or there may be a risk of overfitting. But then again, as long as it performs correctly, it's fine.
I have a dataset with each data point having 4 images (different pixel sizes for each) that are correlated to each other. I want to do convolutions on them separately, and then combine the information for the 4 images and feed it to 1 dense network. How can I do this in keras functional API?
I also have 10 other features that are not images. I plan to feed it directly to the dense end of the network.
So what I want is:
4 independent conv layers
flatten
concatenate
Dense layers
1 Output
How can I provide the input to keras in such a way?
According to the description you provided, I think this is what you are looking for:
input_im1 = Input(...)
input_im2 = Input(...)
input_im3 = Input(...)
input_im4 = Input(...)
conv_im1 = Conv2D(...)(input_im1)
conv_im2 = Conv2D(...)(input_im2)
conv_im3 = Conv2D(...)(input_im3)
conv_im4 = Conv2D(...)(input_im4)
concat_conv = concatenate([conv_im1,conv_im2,conv_im3,conv_im4])
flatten_conv = Flatten()(concat_conv)
input_feat = Input(...)
concat_conv_feat = concatenate([flatten_conv, input_feat])
output = Dense(...)(concat_conv_feat)
model = Model([input_im1,input_im2,input_im3,input_im4,input_feat], output)
Though, I am not aware of the sizes of the input images and the parameters for each of the convolution layers. So you may need to modify the code above to adjust it to your exact requirements.
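When fitting or predicting, the inputs are then passed as a list in the same order as the Input layers; the arrays below (X_im1 ... X_im4, X_feat, y) are hypothetical placeholders for your data:
model.compile(optimizer='adam', loss='mse')
model.fit([X_im1, X_im2, X_im3, X_im4, X_feat], y, epochs=10, batch_size=32)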
I am using Python Caffe, and I am confused about net.layers[layer_index].blobs and net.params[layer_type]. If I understand well, net.params contains all the network parameters. Take LeNet for example: net.params['conv1'] represents the network coefficients for the 'conv1' layer. Then net.layers[layer_index].blobs should represent the same. However, what I found is that they are not exactly the same. I use the following code to test it:
def _differ_square_sum(self, blobs):
    import numpy as np
    gradients = np.sum(np.multiply(blobs[0].diff, blobs[0].diff)) + np.sum(np.multiply(blobs[1].diff, blobs[1].diff))
    return gradients

def _calculate_objective(self, iteration, solver):
    net = solver.net
    params = net.params
    params_value_list = list(params.keys())
    [print(k, v.data.shape) for k, v in net.blobs.items()]
    layer_num = len(net.layers)
    j = 0
    for layer_index in range(layer_num):
        if len(net.layers[layer_index].blobs) > 0:
            cur_gradient = self._differ_square_sum(net.layers[layer_index].blobs)
            key = params_value_list[j]
            cur_gradient2 = self._differ_square_sum(params[key])
            print([cur_gradient, cur_gradient2])
            assert cur_gradient == cur_gradient2
            j += 1  # move to the next entry in net.params
Any ideas on the difference between them? Thanks.
You are mixing the trainable net parameters (stored in net.params) with the data flowing through the net (stored in net.blobs):
Once you are done training the model, net.params are fixed and will not change. However, for each new input example you are feeding to the net, net.blobs will store the different layers' response to that particular input.
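To make the distinction concrete, a minimal sketch (the prototxt/caffemodel paths are placeholders):
import caffe
net = caffe.Net('lenet.prototxt', 'lenet.caffemodel', caffe.TEST)
# Trainable parameters: fixed once training is done
conv1_weights = net.params['conv1'][0].data   # weight blob of 'conv1'
conv1_bias = net.params['conv1'][1].data      # bias blob of 'conv1'
# Layer responses: recomputed for every input fed through the net
net.forward()                                  # run the currently loaded input batch
conv1_response = net.blobs['conv1'].data       # activations of 'conv1' for that input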