Python - matrix multiplication code problem

I have this exercise where I build a simple neural network with one input layer and one hidden layer. I wrote the code below to perform a simple matrix multiplication, but it doesn't produce the same result as when I do the multiplication by hand. What am I doing wrong in my code?
#toes %win #fans
ih_wgt = ([0.1, 0.2, -0.1],   # hid[0]
          [-0.1, 0.1, 0.9],   # hid[1]
          [0.1, 0.4, 0.1])    # hid[2]

#hid[0] #hid[1] #hid[2]
ho_wgt = ([0.3, 1.1, -0.3],   # hurt?
          [0.1, 0.2, 0.0],    # win?
          [0.0, 1.3, 0.1])    # sad?

weights = [ih_wgt, ho_wgt]

def w_sum(a, b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += (a[i] * b[i])
    return output

def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
        return output

def neural_network(input, weights):
    hid = vect_mat_mul(input, weights[0])
    pred = vect_mat_mul(hid, weights[1])
    return pred

toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]

input = [toes[0], wlrec[0], nfans[0]]
pred = neural_network(input, weights)
print(pred)
The output of my code is:
[0.258, 0, 0]
The way I attempted to solve it by hand is as follows:
I multiplied the input vector [8.5, 0.65, 1.2] by the input weight matrix

ih_wgt = ([0.1, 0.2, -0.1],   # hid[0]
          [-0.1, 0.1, 0.9],   # hid[1]
          [0.1, 0.4, 0.1])    # hid[2]

which gives the hidden-layer vector:

[0.86, 0.295, 1.23]
This output vector is then fed forward as the input to the next layer and multiplied by the hidden weight matrix

ho_wgt = ([0.3, 1.1, -0.3],   # hurt?
          [0.1, 0.2, 0.0],    # win?
          [0.0, 1.3, 0.1])    # sad?

which gives the correct output prediction:

[0.2135, 0.145, 0.5065]
Your help would be much appreciated!

You're almost there! The culprit is a simple indentation mistake:
def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
    return output  # <-- this one was inside the for loop
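As a quick check, here is a minimal NumPy sketch (assuming NumPy is available; it's not part of the exercise's code) that reproduces your hand calculation with two matrix-vector products:

import numpy as np

ih_wgt = np.array([[0.1, 0.2, -0.1],
                   [-0.1, 0.1, 0.9],
                   [0.1, 0.4, 0.1]])
ho_wgt = np.array([[0.3, 1.1, -0.3],
                   [0.1, 0.2, 0.0],
                   [0.0, 1.3, 0.1]])

x = np.array([8.5, 0.65, 1.2])
hid = ih_wgt @ x      # hidden layer: [0.86, 0.295, 1.23]
pred = ho_wgt @ hid   # prediction:   [0.2135, 0.145, 0.5065]
print(hid, pred)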

Related

Python: Optimize weights in portfolio

I have the following dataframe with weights:
df = pd.DataFrame({'a': [0.1, 0.5, 0.1, 0.3], 'b': [0.2, 0.4, 0.2, 0.2], 'c': [0.3, 0.2, 0.4, 0.1],
                   'd': [0.1, 0.1, 0.1, 0.7], 'e': [0.2, 0.1, 0.3, 0.4], 'f': [0.7, 0.1, 0.1, 0.1]})
and then I normalize each row using:
df = df.div(df.sum(axis=1), axis=0)
I want to optimize the normalized weights of each row such that no weight is less than 0 or greater than 0.4.
If the weight is greater than 0.4, it will be clipped to 0.4 and the additional weight will be distributed to the other entries in a pro-rata fashion (meaning the second largest weight will receive more weight so it gets close to 0.4, and if there is any remaining weight, it will be distributed to the third and so on).
Can this be done using the "optimize" function?
Thank you.
UPDATE: I would also like to set a minimum bound for the weights. In my original question, the minimum weight bound was automatically considered as zero; however, I would like to set a constraint such that the minimum weight is at least 0.05, for example.
Unfortunately, I could only find a loop solution to this problem. When you trim off the excess weight and redistribute it proportionally, another weight may go over the limit; it then has to be trimmed in turn, and the cycle keeps repeating until no value is overweight. The same goes for underweight values.
import numpy as np
import pandas as pd

# The original data frame. No normalization yet
df = pd.DataFrame(
    {
        "a": [0.1, 0.5, 0.1, 0.3],
        "b": [0.2, 0.4, 0.2, 0.2],
        "c": [0.3, 0.2, 0.4, 0.1],
        "d": [0.1, 0.1, 0.1, 0.7],
        "e": [0.2, 0.1, 0.3, 0.4],
        "f": [0.7, 0.1, 0.1, 0.1],
    }
)

def ensure_min_weight(row: np.ndarray, min_weight: float):
    # Raise every underweight entry to min_weight, taking the missing
    # weight pro-rata from the entries that are not underweight.
    while True:
        underweight = row < min_weight
        if not underweight.any():
            break
        missing_weight = min_weight * underweight.sum() - row[underweight].sum()
        row[~underweight] -= missing_weight / row[~underweight].sum() * row[~underweight]
        row[underweight] = min_weight

def ensure_max_weight(row: np.ndarray, max_weight: float):
    # Clip every overweight entry to max_weight, spreading the excess
    # weight pro-rata over the entries that are not overweight.
    while True:
        overweight = row > max_weight
        if not overweight.any():
            break
        excess_weight = row[overweight].sum() - (max_weight * overweight.sum())
        row[~overweight] += excess_weight / row[~overweight].sum() * row[~overweight]
        row[overweight] = max_weight

values = df.to_numpy()
normalized = values / values.sum(axis=1)[:, None]

min_weight = 0.15  # just for fun
max_weight = 0.4

for i in range(len(values)):
    row = normalized[i]
    ensure_min_weight(row, min_weight)
    ensure_max_weight(row, max_weight)

# Normalized weights
assert np.isclose(normalized.sum(axis=1), 1).all(), "Normalized weight must sum up to 1"
assert ((min_weight <= normalized) & (normalized <= max_weight)).all(), \
    f"Normalized weight must be between {min_weight} and {max_weight}"
print(pd.DataFrame(normalized, columns=df.columns))

# Raw values
# values = normalized * values.sum(axis=1)[:, None]
# print(pd.DataFrame(values, columns=df.columns))
Note that this algorithm will run into an infinite loop if your min_weight and max_weight are infeasible: try min_weight = 0.4 and max_weight = 0.5. You should handle these errors in the two ensure functions.
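One possible up-front guard (a sketch of mine, not part of the algorithm above): with n columns, the weights can only sum to 1 when n * min_weight <= 1 <= n * max_weight, so the bounds can be validated before looping:

def check_bounds(n_cols: int, min_weight: float, max_weight: float) -> None:
    # Infeasible bounds would make the redistribution loops spin forever.
    if n_cols * min_weight > 1 or n_cols * max_weight < 1:
        raise ValueError(
            f"No valid allocation: {n_cols} weights in "
            f"[{min_weight}, {max_weight}] cannot sum to 1"
        )

check_bounds(df.shape[1], min_weight, max_weight)  # raises for 0.4 / 0.5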

How to build OneHot Decoder in python

I have encoded my images (masks) with dimensions (img_width x img_height x 1) using one-hot encoding in this way:
import numpy as np

def OneHotEncoding(im, n_classes):
    one_hot = np.zeros((im.shape[0], im.shape[1], n_classes), dtype=np.uint8)
    for i, unique_value in enumerate(np.unique(im)):
        one_hot[:, :, i][im == unique_value] = 1
    return one_hot
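(For illustration, a tiny hypothetical usage of this encoder: a 2x2 mask with labels 0, 1, 2 becomes a 2x2x3 array with one channel per label.)

mask = np.array([[0, 1],
                 [2, 1]])
encoded = OneHotEncoding(mask, n_classes=3)
print(encoded.shape)     # (2, 2, 3)
print(encoded[:, :, 1])  # 1 exactly where the mask equals label 1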
After doing some data manipulation with deep learning, the softmax activation function produces probabilities instead of 0 and 1 values, so in my decoder I wanted to implement the following approach:
Threshold the output to obtain 0 or 1 only.
Multiply each channel by a weight equal to the channel index.
Take the max between labels along the channels axis.
import numpy as np

arr = np.array([
    [[0.1,0.2,0,5],[0.2,0.4,0.7],[0.3,0.5,0.8]],
    [[0.3,0.6,0 ],[0.4,0.9,0.1],[0 ,0 ,0.2]],
    [[0.7,0.1,0.1],[0,6,0.1,0.1],[0.6,0.6,0.3]],
    [[0.6,0.2,0.3],[0.4,0.5,0.3],[0.1,0.2,0.7]]
])
# print(arr.dtype,arr.shape)

def oneHotDecoder(img):
    # Thresholding
    img[img<0.5]=0
    img[img>=0.5]=1
    # weights of the labels
    img = [i*img[:,:,i] for i in range(img.shape[2])]
    # take the max label
    img = np.amax(img,axis=2)
    print(img.shape)
    return img

arr2 = oneHotDecoder(arr)
print(arr2)
My questions are:
How do I get rid of this error:
line 15, in oneHotDecoder
    img[img<0.5]=0
TypeError: '<' not supported between instances of 'list' and 'float'
Are there any other issues in my implementation that you would suggest improving?
Thanks in advance.
You have typos with commas and dots in some of your items (e.g. your first list should be [0.1, 0.2, 0.5] instead of [0.1, 0.2, 0, 5]). Because of them the inner lists have unequal lengths, so NumPy builds an object array of lists, and comparing a list with a float is what raises the TypeError.
The fixed list is:
l = [
    [[0.1,0.2,0.5],[0.2,0.4,0.7],[0.3,0.5,0.8]],
    [[0.3,0.6,0 ],[0.4,0.9,0.1],[0 ,0 ,0.2]],
    [[0.7,0.1,0.1],[0.6,0.1,0.1],[0.6,0.6,0.3]],
    [[0.6,0.2,0.3],[0.4,0.5,0.3],[0.1,0.2,0.7]]
]
Then you could do:
np.array(l)  # note: np.dstack(l) would also build an array, but stacked along a different axis
Which would yield:
array([[[0.1, 0.2, 0.5],
        [0.2, 0.4, 0.7],
        [0.3, 0.5, 0.8]],

       [[0.3, 0.6, 0. ],
        [0.4, 0.9, 0.1],
        [0. , 0. , 0.2]],

       [[0.7, 0.1, 0.1],
        [0.6, 0.1, 0.1],
        [0.6, 0.6, 0.3]],

       [[0.6, 0.2, 0.3],
        [0.4, 0.5, 0.3],
        [0.1, 0.2, 0.7]]])
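As for the second question: once arr is a proper float array, the whole threshold / weight / max pipeline can usually be replaced by a single argmax along the channel axis, which picks the most probable label directly. A sketch (my suggestion, not byte-for-byte equivalent: it differs when no channel reaches 0.5, since argmax always picks the largest one):

def oneHotDecoderArgmax(img):
    # For softmax outputs, the predicted label is simply the index
    # of the most probable channel; no thresholding is needed.
    return np.argmax(img, axis=2)

print(oneHotDecoderArgmax(np.array(l)))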

How to L2 Normalize a list of lists in Python using Sklearn

s2 = [[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]]
from sklearn.preprocessing import normalize
X = normalize(s2)
this throws the error:
ValueError: setting an array element with a sequence.
How can I L2-normalize a list of lists in Python using sklearn?
Since I don't have enough reputation to comment, I'm posting this as an answer.
Let's quickly look at your data.
I converted the given data into a NumPy array. Since the inner lists don't all have the same length, it looks like this:
>>> n2 = np.array([[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]])
>>> n2
array([list([0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]),
       list([0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831]),
       list([0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925]),
       list([0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194])],
      dtype=object)
You can see that the converted values do not form a proper sequence of rows. To get one, every internal list must have the same length (it looks like 0.16666666666666666 was copied one time too many in your third list; if not, fix the length some other way). With equal lengths it will look like:
>>> n3 = np.array([[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.319381788645692], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]])
>>> n3
array([[0.2       , 0.2       , 0.2       , 0.30216512, 0.24462871],
       [0.2       , 0.48925742, 0.2       , 0.2       , 0.38325815],
       [0.31938179, 0.16666667, 0.16666667, 0.16666667, 0.31938179],
       [0.2       , 0.2       , 0.2       , 0.30216512, 0.24462871]])
As you can see, n3 has now become a proper 2-D sequence of values.
And if you use the normalize function, it simply works:
>>> X = normalize(n3)
>>> X
array([[0.38408524, 0.38408524, 0.38408524, 0.58028582, 0.46979139],
       [0.28108867, 0.6876236 , 0.28108867, 0.28108867, 0.53864762],
       [0.59581303, 0.31091996, 0.31091996, 0.31091996, 0.59581303],
       [0.38408524, 0.38408524, 0.38408524, 0.58028582, 0.46979139]])
For how to avoid this issue with NumPy arrays, please have a look at this SO question: ValueError: setting an array element with a sequence.
Important: I removed one element from the 3rd list so that all the lists have the same length.
I did that because I really believe it's a copy-paste error. If not, comment below and I will modify my answer.
import numpy as np
from sklearn.preprocessing import normalize

s2 = [[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
      [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831],
      [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925],
      [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]]
X = normalize(np.array(s2))
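For reference, normalize defaults to norm='l2', i.e. it divides each row by its Euclidean length, so an equivalent pure-NumPy sketch (assuming s2 already has equal-length rows, as above) is:

arr = np.array(s2)
X_manual = arr / np.linalg.norm(arr, axis=1, keepdims=True)
# each row of X_manual now has unit L2 norm, matching normalize(arr)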

python matplotlib: How can I add a point mark to curve knowing only the x value?

For example, in matplotlib I plot a simple curve based on a few points:
from matplotlib import pyplot as plt
import numpy as np
x = [0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2,
     1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5,
     2.6, 2.7, 2.8, 2.9]
y = [0.0, 0.19, 0.36, 0.51, 0.64, 0.75, 0.8400000000000001, 0.91, 0.96, 0.99, 1.0,
     0.99, 0.96, 0.9099999999999999, 0.8399999999999999, 0.75, 0.6399999999999997,
     0.5099999999999998, 0.3599999999999999, 0.18999999999999995, 0.0,
     -0.20999999999999996, -0.4400000000000004, -0.6900000000000004,
     -0.9600000000000009, -1.25, -1.5600000000000005, -1.8900000000000006,
     -2.240000000000001, -2.610000000000001]
plt.plot(x,y)
plt.show()
Hypothetically, say I want to highlight the point on the curve where the x value is 0.25, but I don't know the y value at that point. What should I do?
The easiest solution is to perform a linear interpolation between the neighboring points for the provided x value. Here is some sample code to show the general principle:
X = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2,
     1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5,
     2.6, 2.7, 2.8, 2.9]
Y = [0.0, 0.19, 0.36, 0.51, 0.64, 0.75, 0.8400000000000001, 0.91, 0.96,
     0.99, 1.0, 0.99, 0.96, 0.9099999999999999, 0.8399999999999999, 0.75,
     0.6399999999999997, 0.5099999999999998, 0.3599999999999999,
     0.18999999999999995, 0.0, -0.20999999999999996, -0.4400000000000004,
     -0.6900000000000004, -0.9600000000000009, -1.25, -1.5600000000000005,
     -1.8900000000000006, -2.240000000000001, -2.610000000000001]

def interpolate(X, Y, xval):
    for n, x in enumerate(X):
        if x > xval:
            break
    else:
        return None                # xval > last x value
    if n == 0:
        return None                # xval < first x value
    xa, xb = X[n-1], X[n]          # get surrounding x values
    ya, yb = Y[n-1], Y[n]          # get surrounding y values
    if xb == xa:
        return ya                  # avoid division by zero
    return ya + (xval - xa) * (yb - ya) / (xb - xa)  # compute yval by interpolation

print(interpolate(X, Y, 0.25))  # --> 0.435
print(interpolate(X, Y, 0.85))  # --> 0.975
print(interpolate(X, Y, 2.15))  # --> -0.3259999999999997
print(interpolate(X, Y, -1.0))  # --> None (out of bounds)
print(interpolate(X, Y, 3.33))  # --> None (out of bounds)
Note: when the provided xval is not within the range of x values, the function returns None.
You could do the linear interpolation manually like this:

def get_y_val(p):
    lower_i = max(i for (i, v) in enumerate(x) if v <= p)
    upper_i = min(i for (i, v) in enumerate(x) if v >= p)
    d = x[upper_i] - x[lower_i]
    if d == 0:
        return y[lower_i]
    y_pt = (y[lower_i] * (x[upper_i] - p) / d
            + y[upper_i] * (p - x[lower_i]) / d)
    return y_pt
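Both answers implement the interpolation by hand; NumPy also ships it as a one-liner, np.interp. A minimal sketch for actually marking the point on the plot (the red-circle marker and annotation are just one styling choice):

import numpy as np
from matplotlib import pyplot as plt

xval = 0.25
yval = np.interp(xval, x, y)   # linear interpolation --> 0.435

plt.plot(x, y)
plt.plot(xval, yval, 'ro')     # highlight the interpolated point
plt.annotate(f"({xval}, {yval:.3f})", (xval, yval))
plt.show()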

Sample from a 2d probability numpy array?

Say that I have a 2d array ar like this:
0.9, 0.1, 0.3
0.4, 0.5, 0.1
0.5, 0.8, 0.5
And I want to sample from [1, 0] according to this probability array.
rdchoice = lambda x: numpy.random.choice([1, 0], p=[x, 1-x])
I have tried two methods:
1) reshape it into a 1d array first and use numpy.random.choice and then reshape it back to 2d:
np.array(list(map(rdchoice, ar.reshape((-1,))))).reshape(ar.shape)
2) use the vectorize function.
func = numpy.vectorize(rdchoice)
func(ar)
But these two ways are both too slow, and I learned that vectorize is essentially a for-loop under the hood; in my experiments, map was no faster than vectorize either.
I think this can be done faster: if the 2d array is large, these approaches are unbearably slow.
You should be able to do this like so:
>>> p = np.array([[0.9, 0.1, 0.3], [0.4, 0.5, 0.1], [0.5, 0.8, 0.5]])
>>> (np.random.rand(*p.shape) < p).astype(int)
This works because each uniform draw in [0, 1) falls below p with probability exactly p, so every entry independently becomes 1 with its own probability in one vectorized operation.
Actually I can use np.random.binomial:

import numpy as np

p = [[0.9, 0.1, 0.3],
     [0.4, 0.5, 0.1],
     [0.5, 0.8, 0.5]]

np.random.binomial(1, p)
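Both approaches are fully vectorized, so either should scale to large arrays. A quick sanity-check sketch (the array size and seed here are arbitrary choices, not from the question):

import numpy as np

rng = np.random.default_rng(0)
p = rng.random((1000, 1000))  # a large matrix of probabilities

sample_cmp = (rng.random(p.shape) < p).astype(int)
sample_binom = rng.binomial(1, p)

# both empirical means should be close to the mean probability
print(p.mean(), sample_cmp.mean(), sample_binom.mean())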
