How to merge arrays of different dimensions in Python?

I am analyzing image datasets using Keras and I am stuck because my images come in two different shapes: features holds 14637 images of shape (10, 10, 3) and features2 holds 14637 images of shape (10, 10, 100).
Is there any way I can merge/concatenate these two datasets?

If features and features2 contain the features of the same batch of images, that is, features[i] and features2[i] describe the same image for each i, then it makes sense to group the features into a single array using the NumPy function concatenate():
newArray = np.concatenate((features, features2), axis=3)
where 3 is the axis along which the arrays will be concatenated. In this case, you'll end up with a new array of shape (14637, 10, 10, 103).
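As a quick sanity check, here is a minimal sketch with random stand-in data (shapes taken from the question, with the batch shrunk to 5 images to keep it light):
import numpy as np
# small stand-in batch (5 images instead of 14637) just to keep the check quick
features = np.random.rand(5, 10, 10, 3).astype(np.float32)
features2 = np.random.rand(5, 10, 10, 100).astype(np.float32)
newArray = np.concatenate((features, features2), axis=3)
print(newArray.shape)  # (5, 10, 10, 103)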
However, if they refer to two completely different batches of images and you would like to merge them along the first axis, so that the 14637 images of features2 come after the first 14637 images, then there is no way to end up with a single array, since NumPy arrays are structured as matrices, not as lists of arbitrary objects.
For instance, if you try to execute:
>>> a = np.array([[0, 1, 2]])  # shape = (1, 3)
>>> b = np.array([[0, 1]])     # shape = (1, 2)
>>> c = np.concatenate((a, b), axis=0)
Then, you'll get:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
since you are concatenating along axis = 0 but axis 1's dimensions differ.

If you are dealing with NumPy arrays, you should be able to use the concatenate() function and specify the axis along which the data should be merged. Basically: np.concatenate((array_a, array_b), axis=2)
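For a single pair of images (shapes assumed from the question), a quick sketch of what axis=2 does:
import numpy as np
# hypothetical single-image shapes from the question above
img_a = np.random.rand(10, 10, 3)
img_b = np.random.rand(10, 10, 100)
merged = np.concatenate((img_a, img_b), axis=2)  # last axis of a 3D image
print(merged.shape)  # (10, 10, 103)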

I think it would be better if you use a class.
class your_class:
    array_1 = None
    array_2 = None

final_array = []
for x in range(len(your_previous_one_array)):
    temp_class = your_class()                        # instantiate the class (note the parentheses)
    temp_class.array_1 = your_previous_one_array[x]  # this image's first feature block
    temp_class.array_2 = your_previous_two_array[x]  # this image's second feature block
    final_array.append(temp_class)
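A hypothetical usage of the resulting list, assuming the two source arrays hold per-image feature blocks of shapes (10, 10, 3) and (10, 10, 100):
# each entry of final_array pairs the two feature sets for one image
first = final_array[0]
print(first.array_1.shape, first.array_2.shape)  # (10, 10, 3) (10, 10, 100)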

Related

Advanced Indexing in 3 Dimensional Numpy ndarray In Python

I have an ndarray of shape (68, 64, 64) called 'prediction'. These dimensions correspond to image_number, height, width. For each image, I have a tuple of length two containing coordinates that correspond to a particular location in each 64x64 image, for example (12, 45). I can stack these coordinates into another NumPy ndarray of shape (68, 2) called 'locations'.
How can I construct a slice object, or the necessary advanced-indexing indices, to access these locations without using a loop? I'm looking for help with the syntax; using pure NumPy arrays without loops is the goal.
Working loop structure
import numpy as np
# example code with just ones... The real arrays have 'real' data.
prediction = np.ones((68, 64, 64), dtype='float32')
locations = np.ones((68, 2), dtype='uint32')
selected_location_values = np.empty(prediction.shape[0], dtype='float32')
for index, (image, coordinates) in enumerate(zip(prediction, locations)):
    selected_location_values[index] = image[tuple(coordinates)]
Desired approach
selected_location_values = np.empty(prediction.shape[0], dtype='float32')
correct_indexing = some_function_here(locations)  # ?????
selected_location_values = prediction[correct_indexing]
A straightforward indexing should work:
img = np.arange(locations.shape[0])
r = locations[:, 0]
c = locations[:, 1]
selected_location_values = prediction[img, r, c]
Fancy indexing works by selecting elements of the indexed array that correspond to the shape of the broadcasted indices. In this case, the indices are quite straightforward. You just need the range to tell you what image each location corresponds to.
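A minimal sketch with random stand-in data, confirming the fancy-indexed result matches the loop from the question:
import numpy as np
prediction = np.random.rand(68, 64, 64).astype('float32')
locations = np.random.randint(0, 64, size=(68, 2)).astype('uint32')
# loop version from the question
looped = np.empty(prediction.shape[0], dtype='float32')
for index, (image, coordinates) in enumerate(zip(prediction, locations)):
    looped[index] = image[tuple(coordinates)]
# fancy-indexed version
img = np.arange(locations.shape[0])
r = locations[:, 0]
c = locations[:, 1]
print(np.allclose(looped, prediction[img, r, c]))  # True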

Converting OpenCV SURF features to float32 arrays in Python

I extract the features with the compute() function and add them to a list. I then try to convert all the features to float32 using NumPy so that they can be used with OpenCV for classification. The error I am getting is:
ValueError: setting an array element with a sequence.
Not really sure what I can do about this. I am following a book and doing the same steps, except they use HOS to extract the features. I am extracting the features and getting back matrices of inconsistent sizes, and I am not sure how I can make them all equal. Related code (which might have minor syntax errors because I truncated it from the original code):
def get_SURF_feature_vector(area_of_interest, surf):
    # Detect the key points in the image
    key_points = surf.detect(area_of_interest)
    # Create an array of zeros with the same shape and type as the input image
    image_key_points = np.zeros_like(area_of_interest)
    # Draw key points on the image
    image_key_points = cv2.drawKeypoints(area_of_interest, key_points, image_key_points, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    # Compute the feature descriptors
    key_points, feature_descriptors = surf.compute(area_of_interest, key_points)
    # Plot image and descriptors
    # plt.imshow(image_key_points)
    # Return the computed feature descriptor matrix
    return feature_descriptors

feature_list = []
for x in range(0, len(data)):
    feature_list.append(get_SURF_feature_vector(area_of_interest[x], surf))
list_of_features = np.array(feature_list, dtype=np.float32)
The error isn't specific to OpenCV at all, just numpy.
Your list feature_list contains arrays of different lengths. You can't make a 2D array out of arrays of different sizes.
For example, you can reproduce the error really simply:
>>> np.array([[1], [2, 3]], dtype=np.float32)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
I'm assuming what you expect from the operation is to input [1] and [2, 3] and get back np.array([1, 2, 3]), i.e. concatenation (actually this is not what the OP wants; see the comments under this post). You can use np.hstack() or np.vstack() for that, depending on the shape of your input. You could also use np.concatenate() with the axis argument, but the stacking operations are more explicit for 2D/3D arrays.
>>> a = np.array([1], dtype=np.float32)
>>> b = np.array([2, 3, 4], dtype=np.float32)
>>> np.hstack([a, b])
array([1., 2., 3., 4.], dtype=float32)
Descriptors are listed vertically though, so they should be stacked vertically, not horizontally as above. Thus you can simply do:
list_of_features = np.vstack(feature_list)
You don't need to specify dtype=np.float32 as the descriptors are np.float32 by default (also, vstack doesn't have a dtype argument so you'd have to convert it after the stacking operation).
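For instance, a minimal sketch with made-up descriptor counts (SURF descriptors here assumed to be 64 floats wide):
import numpy as np
des_a = np.random.rand(500, 64).astype(np.float32)  # hypothetical image with 500 keypoints
des_b = np.random.rand(200, 64).astype(np.float32)  # hypothetical image with 200 keypoints
stacked = np.vstack([des_a, des_b])
print(stacked.shape, stacked.dtype)  # (700, 64) float32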
If you instead want a 3D array, then you need the same number of features across all images so that it's an evenly filled 3D array. You could just pad your feature arrays with placeholder values, like 0s or np.nan, so that they're all the same length, and then you can group them together as you did originally.
>>> des1 = np.random.rand(500, 64).astype(np.float32)
>>> des2 = np.random.rand(200, 64).astype(np.float32)
>>> des3 = np.random.rand(400, 64).astype(np.float32)
>>> feature_descriptors = [des1, des2, des3]
So here each image's feature descriptors have a different number of features. You can find the largest one:
>>> max_des_length = max([len(d) for d in feature_descriptors])
>>> max_des_length
500
You can use np.pad() to pad each feature array with however many more values it needs to be the same size as your maximum size descriptor set.
Now this is a little unnecessary to do it all in one line, but whatever.
>>> feature_descriptors = [np.pad(d, ((0, (max_des_length - len(d))), (0, 0)), 'constant', constant_values=np.nan) for d in feature_descriptors]
The slightly awkward argument here, ((0, (max_des_length - len(d))), (0, 0)), just says to pad with 0 elements on the top, max_des_length - len(d) elements on the bottom, 0 on the left, and 0 on the right.
As you can see here, I'm adding np.nan values to the arrays. If you left out the constant_values argument it defaults to 0. Lastly all you have to do is cast as a numpy array:
>>> feature_descriptors = np.array(feature_descriptors)
>>> feature_descriptors.shape
(3, 500, 64)

Numpy Append Matrix to Tensor

I am trying to build a list of matrices using numpy, but when I try to append a matrix to an empty tensor, I get the error:
ValueError: all the input arrays must have same number of dimensions
Concatenate and append both seem to fail. I tried calling:
tensor = np.concatenate((tensor, matrix), axis=0)
and
tensor = np.append(tensor, matrix, axis=0)
but I get the same error either way.
The tensor starts with a size of [0, h, w], and the matrix is of size [h, w]. The matrix is the correct shape in the direction I want to append to, but it won't seem to attach.
It seems matrix represents the incoming array, while you accumulate those into tensor. So, to solve it, add a new leading axis to matrix with None/np.newaxis and then concatenate with tensor -
np.concatenate((tensor, matrix[None]),axis=0)
If you are accumulating, store it back into tensor.
Or use np.vstack((tensor, matrix[None])).
Sample run -
In [16]: h,w = 3,4
...: a = np.random.rand(0,h,w)
...: b = np.random.rand(h,w)
In [17]: np.concatenate((a, b[None]),axis=0).shape
Out[17]: (1, 3, 4)
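The np.vstack() alternative mentioned above, continuing the same session, gives the same result:
In [18]: np.vstack((a, b[None])).shape
Out[18]: (1, 3, 4)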

Appending matrices into a single matrix with numpy

I have a function in Python that returns a numpy.mat of shape (100, 1). I am calling this function 4 times in a loop and would like to take the resulting 4 matrices and create a matrix of shape (100, 4). I have looked for some time at numpy.append, numpy.concatenate, and numpy.insert but have not been able to get this working.
Here is a short SSCCE of my issue
zeros = np.zeros(shape=(100, 4))
for i in range(1, 5):
    np.append(zeros, np.empty(shape=(100, 1)))
print(zeros)
Here zeros should end up as a matrix of shape (100, 4) holding the "junk" values from each of the calls to numpy.empty, not all 0s.
Do something along these lines -
zeros = np.zeros(shape=(100, 4))
for i in range(1, 5):
    data = np.random.rand(100, 1)  # func that returns a (100, 1) shaped array
    zeros[:, i-1] = data.ravel()
In place of ravel(), we could also use data[:, 0] or np.squeeze(data); the basic idea is to feed a 1D array there, because the LHS zeros[:, i-1] expects a 1D array.
As an alternative, inside the loop, we could also do -
zeros[:, [i-1]] = data
Thus, with the list of column indices [i-1] instead of the scalar i-1, we keep the dimensionality of the slot that data is assigned into (it stays 2D), and that allows us to feed in data, which is also 2D, without any change.
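A minimal sketch of both assignment styles side by side, with a random stand-in for the function's output:
import numpy as np
zeros = np.zeros((100, 4))
data = np.random.rand(100, 1)  # stand-in for the (100, 1) result
zeros[:, 0] = data.ravel()     # feed a 1D array into the 1D slot
zeros[:, [1]] = data           # or keep both sides 2D
print(np.array_equal(zeros[:, 0], zeros[:, 1]))  # True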

numpy insert 2D array into 4D structure

I have a 4D array: array = np.random.rand(3432,1,30,512)
I also have 5 sets of 2D arrays, each with shape (30, 512).
I want to insert these into the 4D structure along axis 1 so that my final shape is (3432, 6, 30, 512) (5 new arrays + the original 1). I need to insert this set for each of the 3432 elements.
What's the most effective way to do this?
I've tried reshaping the 2D to 4D and then inserting along axis 1. I'm expecting axis 1 to never exceed a size of 6, but the 2D arrays just keep getting added, rather than a set for each of the 3432 elements. I think my problem lies in not fully understanding the obj param for the insert method:
all_data = np.reshape(all_data, (-1, 1, 30, 512))
for i in range(all_data.shape[0]):
    num_band = 1
    for band in range(5):
        temp_trial = np.zeros((30, 512))  # just an example; the values aren't actually 0
        temp_trial = np.reshape(temp_trial, (1, 1, 30, 512))
        all_data = np.insert(all_data, num_band, temp_trial, 1)
        num_band += 1
Create an array with the final shape first and insert the elements later:
final = np.zeros((3432, 6, 30, 512))
for i in range(3432):  # note: this will take a while
    for j in range(6):
        final[i, j, :, :] = np.ones((30, 512))  # insert your array here
or if you actually want to broadcast this over the zeroth axis, assuming each of the 3432 should be the same for each "band":
for i in range(6):
    final[:, i, :, :] = np.ones((30, 512))  # insert your array here
As long as you aren't running many loops, there is no need to vectorize it.
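If you do want to avoid the Python loops entirely, one possible vectorised sketch (assuming, as above, that the same 5 arrays are repeated for every one of the 3432 elements) stacks the bands once, broadcasts them across the batch, and concatenates in a single call; shrink the numbers if you just want to try it out:
import numpy as np
original = np.random.rand(3432, 1, 30, 512)          # the existing 4D structure
bands = [np.random.rand(30, 512) for _ in range(5)]  # the 5 extra 2D arrays
# (5, 30, 512) broadcast to a (3432, 5, 30, 512) view, no per-element copying
band_block = np.broadcast_to(np.stack(bands), (3432, 5, 30, 512))
final = np.concatenate((original, band_block), axis=1)
print(final.shape)  # (3432, 6, 30, 512)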
