Computing weighted average on a deque using numpy - python

I have a deque object whose weighted average I would like to find. The deque holds (60 * 60 * 3) NumPy arrays (they are actually images stored in the deque), and I would like to find the weighted average of all the elements (i.e. images) in it.
motion_buffer = deque(maxlen = 5)
motion_weights = [5./15, 4./15, 3./15, 2./15, 1./15]
# After adding a few elements (i.e. images) to motion_buffer, the following is done:
motion_avg = np.average(motion_buffer, weights=motion_weights)
I get the error:
TypeError: Axis must be specified when shapes of a and weights differ.
I understand there is a mismatch somewhere, but supplying axis values (as per the docs) did not help me. I have tested it in the following manner:
>>> A = np.random.randn(4,4)
>>> weights = [1, 4, 6, 7]
>>> buf = deque(maxlen=5)
>>> buf.appendleft(A)
>>> c = np.average(buf, weights=weights)
Traceback (most recent call last):
...
"Axis must be specified when shapes of a and weights "
TypeError: Axis must be specified when shapes of a and weights differ.
I have tried using np.average on a deque of 1-D elements and it works.
How exactly should I modify my code? I experimented, but it didn't work for me.

According to the np.average documentation:
weights : array_like, optional
    An array of weights associated with the values in a. Each value in
    a contributes to the average according to its associated weight.
    The weights array can either be 1-D (in which case its length must be
    the size of a along the given axis) or of the same shape as a.
you can't: a 1-D weights array only lines up with a along one axis, and that axis must be specified.
You can implement a workaround:
av_average = np.average(np.average(your_deque, axis=(1,2,3)), weights=(5,4,3,2,1))
where the inner average reduces each 60×60×3 matrix to a single number (by specifying the axes over which to average), and the outer call then uses the weights to compute the weighted average of those averages.
The OP really wants this:
average = np.average(the_deque, axis=0, weights=(…))
where (…) is a sequence whose length equals the current length of the deque.
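For concreteness, here is a minimal sketch of that axis=0 form, with random arrays standing in for images; it assumes the deque is already full, so its length matches the number of weights:
import numpy as np
from collections import deque

motion_buffer = deque(maxlen=5)
for _ in range(5):                          # fill the buffer with fake "images"
    motion_buffer.append(np.random.rand(60, 60, 3))

motion_weights = [5./15, 4./15, 3./15, 2./15, 1./15]
# axis=0 runs along the deque's elements, so the 5 weights line up with them
motion_avg = np.average(np.array(motion_buffer), axis=0, weights=motion_weights)
print(motion_avg.shape)  # (60, 60, 3): one weighted-average image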

One way which worked is converting the deque to a NumPy array first and then averaging:
my_array = np.array(the_deque)
np.average(my_array, axis=0, weights=weights)
The only problem is the increase in computational time from the conversion.

In [1]: import numpy as np
   ...: from collections import deque
In [2]: A = np.random.randn(4,4)
   ...: weights = [1, 4, 6, 7]
   ...: buf = deque(maxlen=5)
   ...: buf.appendleft(A)
In [3]: buf
Out[3]:
deque([array([[ 1.10651806, -0.50125715, -0.35877456,  1.31969932],
       [-0.4674734 ,  0.25144544, -1.5392525 ,  0.09607722],
       [ 2.24245413, -1.09636901,  1.97502862, -0.90069983],
       [ 0.61917197, -0.13276115, -0.1103521 ,  0.56556319]])])
In [4]: np.array(buf)
Out[4]:
array([[[ 1.10651806, -0.50125715, -0.35877456,  1.31969932],
        [-0.4674734 ,  0.25144544, -1.5392525 ,  0.09607722],
        [ 2.24245413, -1.09636901,  1.97502862, -0.90069983],
        [ 0.61917197, -0.13276115, -0.1103521 ,  0.56556319]]])
In [5]: np.average(buf, weights=weights, axis=0)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-8a8c3dba2415> in <module>
----> 1 np.average(buf, weights=weights, axis=0)
<__array_function__ internals> in average(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in average(a, axis, weights, returned)
    399         if wgt.shape[0] != a.shape[axis]:
    400             raise ValueError(
--> 401                 "Length of weights not compatible with specified axis.")
    402
    403         # setup wgt to broadcast along axis
ValueError: Length of weights not compatible with specified axis.
In [6]: _4.shape
Out[6]: (1, 4, 4)
Oops, I should have checked the shape of the array. Out[4] has an initial size-1 dimension because the deque contains a single (4,4) array: np.array stacks the deque's elements along a new leading axis, giving shape (1, 4, 4).
In [7]: np.average(buf, weights=weights, axis=1)
Out[7]: array([[ 0.94586406, -0.38905653, 0.25344014, 0.01437508]])

Related

Padding new axis in 3d matrix returns error

What I need to do is to extend a 2D matrix to 3D and fill the 3rd axis with an arbitrary number of zeros. The error returned is:
all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 0 dimension(s)
What should I correct?
import numpy as np
kernel = np.ones((3,3)) / 9
kernel = kernel[..., None]
print(type(kernel))
print(np.shape(kernel))
print(kernel)
i = 1
for i in range(27):
    np.append(kernel, 0, axis = 2)
print(kernel)
What should I use instead of np.append()?
Use concatenate():
import numpy as np
kernel = np.ones((3,3)) / 9
kernel = kernel[..., None]
print(type(kernel))
print(np.shape(kernel))
print(kernel)
print('-----------------------------')
append_values = np.zeros((3,3))
append_values = append_values[..., None]
i = 1
for i in range(2):
    kernel = np.concatenate((kernel, append_values), axis=2)
print(kernel.shape)
print(kernel)
But it's best to generate the append_values array with the required shape in the third dimension from the start, avoiding the loop entirely:
append_values = np.zeros((3,3,2)) # or (3,3,27)
kernel = np.concatenate((kernel, append_values), axis=2)
print(kernel.shape)
print(kernel)
Look at the output and error - full error, not just a piece!
<class 'numpy.ndarray'>
(3, 3, 1)
...
In [94]: np.append(kernel, 0, axis = 2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [94], in <cell line: 1>()
----> 1 np.append(kernel, 0, axis = 2)
File <__array_function__ internals>:5, in append(*args, **kwargs)
File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:4817, in append(arr, values, axis)
4815 values = ravel(values)
4816 axis = arr.ndim-1
-> 4817 return concatenate((arr, values), axis=axis)
File <__array_function__ internals>:5, in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 0 dimension(s)
As your shape printout shows, kernel is 3d, (3,3,1). np.append takes the scalar 0, makes an array from it, np.array(0), and calls concatenate. concatenate, if you take the time to read its docs, requires matching numbers of dimensions.
But my main beef with your code is that you used np.append without capturing the result. Again, if you take time to read the docs, you'll realize that np.append does not work in-place. It does NOT modify kernel. When it works, it returns a new array. And doing that repeatedly in a loop is inefficient.
It looks like you took the list append model and applied it, without much thought, to arrays. That's not how to code numpy.
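To see the not-in-place behavior concretely (a small aside, not from the original post):
import numpy as np

kernel = np.ones((3, 3, 1))
appended = np.append(kernel, np.zeros((3, 3, 1)), axis=2)  # result must be captured
print(kernel.shape)    # (3, 3, 1): unchanged
print(appended.shape)  # (3, 3, 2)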
As the other answer shows, doing one concatenate with a (3,3,27) array of 0s is the way to go if you want to make a (3,3,28) array.
Alternatively make a (3,3,28) array of 0s, and copy the one (3,3,1) array to the appropriate column.
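A sketch of that alternative, using the shapes from the example above:
import numpy as np

kernel = np.ones((3, 3)) / 9   # the original 2D kernel

out = np.zeros((3, 3, 28))     # preallocate the full result
out[:, :, 0] = kernel          # copy the kernel into the first slice
print(out.shape)               # (3, 3, 28)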

ValueError: could not broadcast input array from shape (15,15) into shape (15)

import numpy
HL1_neurons = 15
input_HL1_weights = numpy.random.uniform(low=-0.1, high=0.1,size=(15, HL1_neurons))
output_neurons = 1
HL2_output_weights = numpy.random.uniform(low=-0.1, high=0.1,size=(HL1_neurons, 1))
weights = numpy.array([input_HL1_weights,HL2_output_weights])
While executing the code, HL1_neurons accepts any number other than 15; if it is 15, it shows the following error. Please help me in this regard:
ValueError Traceback (most recent call last)
<ipython-input-15-97fe596c0407> in <module>
4 output_neurons = 1
5 HL2_output_weights = numpy.random.uniform(low=-0.1, high=0.1,size=(HL1_neurons, 1))
----> 6 weights = numpy.array([input_HL1_weights.astype(object),HL2_output_weights.astype(object)])
ValueError: could not broadcast input array from shape (15,15) into shape (15)
input_HL1_weights is (15,15) shape. HL2_output_weights is (15,1).
Trying to make an object dtype array from these two results in this kind of error. If their shapes differed in the first dimension (e.g. their transposes), the result would be a (2,) object dtype array. If the shapes matched, the shape would be (2,n,m). This is a known problem with using np.array(...) to create an object dtype array.
What exactly do you want to produce?
A more reliable way to produce a (2,) object array is:
weights = np.empty(2, object)
weights[0] = input_HL1_weights
weights[1] = HL2_output_weights
This removes the ambiguity about what to do if the shapes match, or match partially.
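For illustration, a quick check that this produces the intended ragged container (shapes as in the question):
import numpy as np

input_HL1_weights = np.random.uniform(low=-0.1, high=0.1, size=(15, 15))
HL2_output_weights = np.random.uniform(low=-0.1, high=0.1, size=(15, 1))

weights = np.empty(2, object)
weights[0] = input_HL1_weights
weights[1] = HL2_output_weights

print(weights.shape)     # (2,)
print(weights[0].shape)  # (15, 15)
print(weights[1].shape)  # (15, 1)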

Is there an equivalent to R apply function in Python?

I am trying to find the Python equivalent to R's apply function but with multidimensional arrays.
For example, when called the following code:
z <- array(1, dim = 2:4)
apply(z, 1, sum)
The result is:
[1] 12 12
and when called with two values for margin:
apply(z, c(1,2), sum)
The result is:
     [,1] [,2] [,3]
[1,]    4    4    4
[2,]    4    4    4
I found that the sum function in numpy can be used, but not in the same consistent way:
For example:
import numpy as np
xx= np.ones((2,3,4))
np.sum(xx,axis=(1,2))
The result is:
array([12., 12.])
but I can't find a function that is equivalent to apply in this respect, specifically when dealing with margin=c(1,2). Could anyone help?
The equivalent in NumPy is:
xx.sum(axis=2)
That is, you are summing over axis 2 (the last dimension), which has length 4, leaving the other two dimensions (2, 3) as the shape of the result:
array([[4., 4., 4.],
       [4., 4., 4.]])
Perhaps a more literal translation of your R code would be:
np.apply_over_axes(np.sum, xx, 2)
which gives a similar result, except that the reduced axis is kept as a length-1 dimension (shape (2, 3, 1) rather than (2, 3)). This is likely to be slower, however, and is not idiomatic unless the actual operation you're performing is something more complicated than sum.
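A quick check of the two forms, where the shapes are the point (a small aside, not from the original answer):
import numpy as np

xx = np.ones((2, 3, 4))
print(xx.sum(axis=2).shape)                     # (2, 3)
print(np.apply_over_axes(np.sum, xx, 2).shape)  # (2, 3, 1): reduced axis kept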
np.apply_over_axes is different from R's apply in several ways.
First, np.apply_over_axes needs collapsing axes to be specified,
whereas R's apply needs remaining axes to be specified.
Secondly, np.apply_over_axes applies the function iteratively, as the documentation quoted below states. The result is the same for np.sum, but it could differ for other functions.
func is called as res = func(a, axis), where axis is the first element of axes. The result res of the function call must have either the same dimensions as a or one less dimension. If res has one less dimension than a, a dimension is inserted before axis. The call to func is then repeated for each axis in axes, with res as the first argument.
And func needs to have a particular signature, and its return value a particular shape, for np.apply_over_axes to perform correctly.
Here's an example of how np.apply_over_axes fails:
>>> arr.shape
(5, 4, 3, 2)
>>> np.apply_over_axes(np.mean, arr, (0,1))
array([[[[ 0.05856732, -0.14844212],
         [ 0.34214183,  0.24319846],
         [-0.04807454,  0.04752829]]]])
>>> np_mean = lambda x: np.mean(x)
>>> np.apply_over_axes(np_mean, arr, (0,1))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<__array_function__ internals>", line 5, in apply_over_axes
File "/Users/kwhkim/opt/miniconda3/envs/rtopython2-pip/lib/python3.8/site-packages/numpy/lib/shape_base.py", line 495, in apply_over_axes
res = func(*args)
TypeError: <lambda>() takes 1 positional argument but 2 were given
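For what it's worth, the failure above is purely the signature mismatch; a lambda that accepts the axis argument works (an aside, not in the original answer):
import numpy as np

arr = np.random.normal(0, 1, (5, 4, 3, 2))
np_mean2 = lambda a, axis: np.mean(a, axis)             # matches func(a, axis)
print(np.apply_over_axes(np_mean2, arr, (0, 1)).shape)  # (1, 1, 3, 2)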
Since there seems to be no equivalent function in Python,
I made a function that is similar to R's apply
def np_apply(arr, axes_remain, fun, *args, **kwargs):
    axes_remain = tuple(set(axes_remain))
    arr_shape = arr.shape
    # the axes not listed in axes_remain are the ones to collapse
    axes_to_move = set(range(len(arr.shape)))
    for axis in axes_remain:
        axes_to_move.remove(axis)
    axes_to_move = tuple(axes_to_move)
    # move the collapsing axes to the end, then flatten them into a single axis
    arr2 = np.moveaxis(arr, axes_to_move,
                       [-x for x in range(1, len(axes_to_move) + 1)]).copy()
    arr2 = arr2.reshape([arr_shape[x] for x in axes_remain] + [-1])
    # apply fun along the flattened last axis
    return np.apply_along_axis(fun, -1, arr2, *args, **kwargs)
It works fine, at least for the sample example above (not exactly the same as the results above, but math.isclose() returns True for nearly all elements):
>>> np_apply(arr, (2,3), np.mean)
array([[ 0.05856732, -0.14844212],
       [ 0.34214183,  0.24319846],
       [-0.04807454,  0.04752829]])
>>> np_apply(arr, (2,3), np_mean)
array([[ 0.05856732, -0.14844212],
       [ 0.34214183,  0.24319846],
       [-0.04807454,  0.04752829]])
For the function to work smoothly on large multidimensional arrays, it needs to be optimized; for instance, the copy of the array should be avoided.
Anyway it seems to work as a proof-of-concept and I hope it helps.
PS)
arr is generated by arr = np.random.normal(0,1,(5,4,3,2))

Converting OpenCV SURF features to float32 arrays in Python

I extract the features with the compute() function and add them to a list. I then try to convert all the features to float32 using NumPy so that they can be used with OpenCV for classification. The error I am getting is:
ValueError: setting an array element with a sequence.
Not really sure what I can do about this. I am following a book and doing the same steps, except they use HOS to extract the features. I am extracting the features and getting back matrices of inconsistent sizes, and I am not sure how I can make them all equal. Related code (which might have minor syntax errors because I truncated it from the original code):
def get_SURF_feature_vector(area_of_interest, surf):
    # Detect the key points in the image
    key_points = surf.detect(area_of_interest)
    # Create array of zeros with the same shape and type as a given array
    image_key_points = np.zeros_like(area_of_interest)
    # Draw key points on the image
    image_key_points = cv2.drawKeypoints(area_of_interest, key_points, image_key_points, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    # Create feature descriptors
    key_points, feature_descriptors = surf.compute(area_of_interest, key_points)
    # Plot image and descriptors
    # plt.imshow(image_key_points)
    # Return computed feature descriptor matrix
    return feature_descriptors

for x in range(0, len(data)):
    feature_list.append(get_SURF_feature_vector(area_of_interest[x], surf))
list_of_features = np.array(list_of_features, dtype=np.float32)
The error isn't specific to OpenCV at all, just numpy.
Your list feature_list contains different length arrays. You can't make a 2d array out of arrays of different sizes.
For e.g. you can reproduce the error really simply:
>>> np.array([[1], [2, 3]], dtype=np.float32)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
I'm assuming what you're expecting from the operation is to input [1], [2, 3] and be returned np.array([1, 2, 3]), i.e., concatenation (actually this is not what OP wants, see the comments under this post). You can use np.hstack() or np.vstack() for those operations, depending on the shape of your input. You can use np.concatenate() too, with the axis argument, but the stacking operations are more explicit for 2D/3D arrays.
>>> a = np.array([1], dtype=np.float32)
>>> b = np.array([2, 3, 4], dtype=np.float32)
>>> np.hstack([a, b])
array([1., 2., 3., 4.], dtype=float32)
Descriptors are listed vertically though, so they should be stacked vertically, not horizontally as above. Thus you can simply do:
list_of_features = np.vstack(list_of_features)
You don't need to specify dtype=np.float32 as the descriptors are np.float32 by default (also, vstack doesn't have a dtype argument so you'd have to convert it after the stacking operation).
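A quick demonstration with hypothetical descriptor arrays (64-dimensional, as SURF produces by default):
import numpy as np

des1 = np.random.rand(500, 64).astype(np.float32)
des2 = np.random.rand(200, 64).astype(np.float32)

stacked = np.vstack([des1, des2])
print(stacked.shape, stacked.dtype)  # (700, 64) float32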
If you instead want a 3D array, then you need the same number of features across all images so that it's an evenly filled 3D array. You could pad your feature vectors with placeholder values, like 0s or np.nan, so that they're all the same length, and then you can group them together as you did originally.
>>> des1 = np.random.rand(500, 64).astype(np.float32)
>>> des2 = np.random.rand(200, 64).astype(np.float32)
>>> des3 = np.random.rand(400, 64).astype(np.float32)
>>> feature_descriptors = [des1, des2, des3]
So here each image's feature descriptors have a different number of features. You can find the largest one:
>>> max_des_length = max([len(d) for d in feature_descriptors])
>>> max_des_length
500
You can use np.pad() to pad each feature array with however many more values it needs to be the same size as your maximum size descriptor set.
Now this is a little unnecessary to do it all in one line, but whatever.
>>> feature_descriptors = [np.pad(d, ((0, (max_des_length - len(d))), (0, 0)), 'constant', constant_values=np.nan) for d in feature_descriptors]
The annoying argument here, ((0, (max_des_length - len(d))), (0, 0)), is just saying to pad with 0 elements on the top, max_des_length - len(d) elements on the bottom, 0 on the left, and 0 on the right.
As you can see here, I'm adding np.nan values to the arrays. If you left out the constant_values argument it defaults to 0. Lastly all you have to do is cast as a numpy array:
>>> feature_descriptors = np.array(feature_descriptors)
>>> feature_descriptors.shape
(3, 500, 64)

One dimensional Mahalanobis Distance in Python

I've been trying to validate my code for calculating the Mahalanobis distance, written in Python (and to double-check it by comparing the result with OpenCV).
My data points are of 1 dimension each (5 rows x 1 column).
In OpenCV (C++), I successfully calculated the Mahalanobis distance for data points of the above dimensions.
The following Python code is unsuccessful when the matrix is 5 rows x 1 column, but it works when the matrix has more than one column:
import numpy
import scipy.spatial.distance
s = numpy.array([[20],[123],[113],[103],[123]])
covar = numpy.cov(s, rowvar=0)
invcovar = numpy.linalg.inv(covar)
print scipy.spatial.distance.mahalanobis(s[0],s[1],invcovar)
I get the following error:
Traceback (most recent call last):
File "/home/abc/Desktop/Return.py", line 6, in <module>
invcovar = numpy.linalg.inv(covar)
File "/usr/lib/python2.6/dist-packages/numpy/linalg/linalg.py", line 355, in inv
return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
IndexError: tuple index out of range
One-dimensional Mahalanobis distance is really easy to calculate manually:
import numpy as np
s = np.array([[20], [123], [113], [103], [123]])
std = s.std()
print np.abs(s[0] - s[1]) / std
(reducing the formula to the one-dimensional case).
But the problem with the scipy.spatial.distance approach is that np.cov returns a scalar, i.e. a zero-dimensional array, when given a set of 1d variables (your (5,1) array with rowvar=0 is five observations of a single variable, so the 1x1 covariance matrix gets squeezed down to a scalar), and np.linalg.inv can't invert that. You want to pass in a 2d array:
>>> covar = np.cov(s, rowvar=0)
>>> covar.shape
()
>>> invcovar = np.linalg.inv(covar.reshape((1,1)))
>>> invcovar.shape
(1, 1)
>>> mahalanobis(s[0], s[1], invcovar)
2.3674720531046645
Covariance needs 2 arrays to compare. Both np.cov() and OpenCV's calcCovarMatrix accept the two arrays stacked on top of each other (use vstack); to get an element-by-element covariance matrix from that layout, set rowvar to false in numpy (OpenCV has the analogous COVAR_ROWS/COVAR_COLS flags). If your arrays are multidimensional, just flatten() them first.
So if I want to compare two 24x24 images, I flatten each into a 1x576 row, stack the two to get a 2x576 array, and pass that as the first argument of np.cov() with rowvar=False.
You should then get a large square matrix that shows the result of comparing each element in array1 with each element in array2. In this example it will be 576x576. THAT is what you pass into your invert function.
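A minimal sketch of that flow, with random data standing in for the images (note: with only two samples this covariance is rank-deficient, so in practice np.linalg.pinv is often used instead of inv):
import numpy as np

# two hypothetical 24x24 "images", flattened to length-576 vectors
img1 = np.random.rand(24, 24).ravel()
img2 = np.random.rand(24, 24).ravel()

stacked = np.vstack([img1, img2])      # shape (2, 576): one sample per row
covar = np.cov(stacked, rowvar=False)  # shape (576, 576)
print(stacked.shape, covar.shape)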
