Iteratively appending ndarray arrays using numpy in Python - python

I am trying to figure out how to iteratively append 2D arrays to generate a single larger array. On each iteration a new 16x200 ndarray is generated, and I would like to append it to the previously generated array, for a total of N iterations.
For example, with two iterations the first generated array would be 16x200, and on the second iteration the newly generated 16x200 array would be appended to the first, creating a 16x400 array. The loop looks like this:
train = np.array([])
for i in [1, 2, 1, 2]:
    spike_count = [0, 0, 0, 0]
    img = cv2.imread("images/" + str(i) + ".png", 0)  # Read the associated image to be classified
    k = np.array(temporallyEncode(img, 200, 4))
    # Somehow append k to train on each iteration
In the case of the above embedded code the loop iterates 4 times so the final train array is expected to be 16x800 in size. Any help would be greatly appreciated, I have drawn a blank on how to successfully accomplish this. The code below is a general case:
import numpy as np
totalArray = np.array([])
for i in range(1, 3):
    arrayToAppend = np.zeros((4, 200))
    # Append arrayToAppend to totalArray somehow

While it is possible to perform a concatenate (or one of the 'stack' variants) at each iteration, it is generally faster to accumulate the arrays in a list, and perform the concatenate once. List append is simpler and faster.
import numpy as np
alist = []
for i in range(0, 3):
    arrayToAppend = np.zeros((4, 200))  # stand-in for the array produced on each iteration
    alist.append(arrayToAppend)
arr = np.concatenate(alist, axis=1)  # shape (4, 600)
# np.hstack(alist) does the same thing
# np.vstack(alist) is the same, but with axis=0: shape (12, 200)
# np.stack(alist, axis=...) creates a new dimension: shape (3, 4, 200), (4, 3, 200), etc.
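Applied to the 16x200 case from the question, the list-then-concatenate pattern might look like the sketch below. The function fake_encode is a hypothetical stand-in for temporallyEncode (whose implementation is not shown in the question), so only the shapes are meaningful here.

```python
import numpy as np


def fake_encode(seed, n_rows=16, n_steps=200):
    """Hypothetical stand-in for temporallyEncode(img, 200, 4): returns a 16x200 array."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_rows, n_steps))


blocks = []
for i in [1, 2, 1, 2]:
    # Collect each 16x200 block in a plain Python list.
    blocks.append(fake_encode(i))

# Join all blocks along the column axis in a single call.
train = np.concatenate(blocks, axis=1)
print(train.shape)  # (16, 800)
```

The concatenate call copies the data once, instead of re-copying the growing array on every iteration.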

Try using numpy hstack. From the documentation, hstack takes a sequence of arrays and stacks them horizontally to make a single array.
For example:
import numpy as np
x = np.zeros((16, 200))
y = x.copy()
for i in range(5):
    y = np.hstack([y, x])
    print(y.shape)
Gives:
(16, 400)
(16, 600)
(16, 800)
(16, 1000)
(16, 1200)
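When the number of iterations is known in advance, another option (not mentioned in the answers above) is to preallocate the full array and write each block into its column slice. This is a minimal sketch with placeholder data; the names n_iter, rows and cols are assumptions chosen to match the 16x200 blocks in the question.

```python
import numpy as np

n_iter = 4            # assumed number of iterations
rows, cols = 16, 200  # block shape from the question

# Allocate the final array once; its contents are overwritten below.
train = np.empty((rows, cols * n_iter))

for step in range(n_iter):
    # Placeholder block: every entry equals the iteration number.
    block = np.full((rows, cols), float(step))
    # Write the block into its own column range.
    train[:, step * cols:(step + 1) * cols] = block

print(train.shape)  # (16, 800)
```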

Related

Manipulating values of a 2D array in python using a loop (using numpy)

import numpy as np
import random as rand
weights = np.zeros((3, 4))
print(weights, end = "\n\n")
for i in weights:
    for j in i:
        j = rand.randint(1, 10)
print(weights)
So I was playing around with numpy and decided to make a 2D array filled with 0s. I then wanted to change the values of the elements within the array to random numbers between 1 and 10. Using the above code, the 2D array does not change and remains filled with zeros. Why does Python behave this way when it comes to arrays?
To get my desired result, I had to do the following:
import numpy as np
import random as rand
weights = np.zeros((3, 4))
print(weights, end = "\n\n")
for i in range(len(weights)):
    for j in range(len(weights[i])):
        weights[i][j] = rand.randint(1, 9)
print(weights)
Is there any better way to do this so that I don't have to resort to using range() and len()?
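A better way is to generate the random values when the array is created:
# args: low (inclusive), high (exclusive), shape
weights = np.random.randint(1, 11, (3, 4))  # integers from 1 to 10
See the numpy.random.randint documentation for details.
As for why Python behaves this way: in the nested loop, j is just a name bound to the value of each element. Assigning to j rebinds that name to a new object and never writes back into the array, so the array is left unchanged. To modify the array you have to assign through an index, as in weights[i][j] = ...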
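The difference between rebinding a loop variable and assigning into the array can be seen in a short sketch. The use of numpy.random.default_rng and np.ndenumerate here is an illustrative choice, not something taken from the answer above.

```python
import numpy as np

weights = np.zeros((3, 4))

# Assigning to the loop variable only rebinds the local name.
for row in weights:
    for value in row:
        value = 7

unchanged = weights.copy()  # still all zeros

rng = np.random.default_rng(0)

# Option 1: fill the existing array in place, no Python loop needed.
weights[...] = rng.integers(1, 11, size=weights.shape)  # integers 1..10

# Option 2: an explicit loop over indices without range(len(...)).
for (r, c), _ in np.ndenumerate(weights):
    weights[r, c] = rng.integers(1, 11)

print(weights)
```

np.ndenumerate yields an (index, value) pair for every element, so the index tuple can be used for assignment directly.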

Append Value to 3D array numpy

I iterate over a 3D numpy array and at every step want to append a float value to the array along the 3rd dimension (axis=2).
Something like the following (I know the code doesn't work as written; for simplicity latIndex, lonIndex and data are random):
import numpy as np
import random
GridData = np.ones((121, 201, 1000))
data = np.random.rand(4800, 4800)
for row in range(4800):
    for column in range(4800):
        latIndex = random.randrange(0, 121, 1)
        lonIndex = random.randrange(0, 201, 1)
        GridData = np.append(GridData[latIndex, lonIndex, :], data[column, row], axis = 2)
The 3rd dimension of GridData is arbitrary in this example of size 1000.
How can I achieve this?
Addition:
It might be possible without np.append but then I don't know how to do this since the 3rd index is different for every combination of latIndex and lonIndex.
You can allocate extra space for your array grid_data, fill it with NaN, and keep track of the next index to be filled in another array while iterating through and filling with values from data. If you completely fill the third dimension for some lat_idx, lon_idx with non-NaN values, you just allocate more space. Since appending is expensive with numpy, it's best to make this extra space fairly large so you only do it once or twice (below I allocate twice the original space).
Once the array is filled, you can remove the unused added space with numpy.isnan(). This solution does what you want but is very slow (for the example values you gave it took about two minutes); the slow execution comes from the Python-level iteration rather than from the numpy operations.
Here's the code:
import random
import numpy as np
grid_data = np.ones(shape=(121, 201, 1000))
data = np.random.rand(4800, 4800)
# keep track of next index to fill for all the arrays in axis 2
next_to_fill = np.full(shape=(grid_data.shape[0], grid_data.shape[1]),
                       fill_value=grid_data.shape[2],
                       dtype=np.int32)
# allocate more space
double_shape = (grid_data.shape[0], grid_data.shape[1], grid_data.shape[2] * 2)
extra_space = np.full(shape=double_shape, fill_value=np.nan)
grid_data = np.append(grid_data, extra_space, axis=2)
for row in range(4800):
    for col in range(4800):
        lat_idx = random.randint(0, 120)
        lon_idx = random.randint(0, 200)
        # allocate more space if needed
        if next_to_fill[lat_idx, lon_idx] >= grid_data.shape[2]:
            grid_data = np.append(grid_data, extra_space, axis=2)
        grid_data[lat_idx, lon_idx, next_to_fill[lat_idx, lon_idx]] = data[row, col]
        next_to_fill[lat_idx, lon_idx] += 1
# remove unnecessary nans that were appended
not_all_nan_idxs = ~np.isnan(grid_data).all(axis=(0, 1))
grid_data = grid_data[:, :, not_all_nan_idxs]
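If a regular 3D array is not strictly required, a different technique from the one above is to collect the values per grid cell in Python lists and convert them to arrays at the end, which avoids padding with NaN altogether. The following is a small-scale sketch under that assumption; the grid size, the data size and the name cells are hypothetical and chosen only to keep the example fast.

```python
from collections import defaultdict

import numpy as np

rng = np.random.default_rng(0)
n_lat, n_lon = 4, 5          # small hypothetical grid (the question uses 121 x 201)
data = rng.random((30, 30))  # small hypothetical data (the question uses 4800 x 4800)

# Map each (lat_idx, lon_idx) cell to the list of values assigned to it.
cells = defaultdict(list)
for value in data.ravel():
    lat_idx = int(rng.integers(0, n_lat))
    lon_idx = int(rng.integers(0, n_lon))
    cells[(lat_idx, lon_idx)].append(value)

# Convert each list to a 1D array once all values are collected.
cell_arrays = {key: np.array(values) for key, values in cells.items()}
total = sum(len(values) for values in cell_arrays.values())
print(total)  # 900
```

Appending to a Python list is cheap, so each cell can grow to whatever length it needs without reallocating a large array.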

Numpy array with arrays of different size inside

I want to create a 3D np.array named output of varying size. An array of size (5,a,b); with a and b varying (b decreasing):
(a,b) = (1000,20)
(a,b) = (1000,19)
(a,b) = (1000,18)
(a,b) = (1000,17)
(a,b) = (1000,16)
I could create an array of arrays in order to do so, but later on I want to get the first column of all the arrays (without a loop) then I cannot use:
output[:,:,0]
Concatenating them won't work either; it requires the arrays to have the same size...
Any alternatives to be able to have a varying single array instead of an array of arrays?
Thanks!
Like @Divakar said, create an empty array with dtype object and assign the different-sized arrays to their respective indices.
import numpy as np
arrs = [np.ones((5, i, 10 - i)) for i in range(10)]
arrs[0].shape
(5, 0, 10)
arrs[1].shape
(5, 1, 9)
out = np.empty(len(arrs), dtype=object)
out[:] = arrs
out[0].shape
(5, 0, 10)
out[1].shape
(5, 1, 9)
Maybe you could make a list and add these 5 arrays to it.
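For the shapes in the question, two ways of getting the first column of every array can be sketched as follows. Neither comes from the answers above: the first still loops (inside a list comprehension), and the second pads each array with NaN to a common width so that plain slicing works, at the cost of some wasted memory. The placeholder contents of arrs are an assumption.

```python
import numpy as np

shapes = [(1000, 20), (1000, 19), (1000, 18), (1000, 17), (1000, 16)]
# Placeholder data: array k is filled with the value k.
arrs = [np.full(shape, float(k)) for k, shape in enumerate(shapes)]

# Option 1: take the first column of each array and stack the results.
first_cols = np.stack([a[:, 0] for a in arrs])  # shape (5, 1000)

# Option 2: pad every array with NaN up to the widest one.
max_b = max(a.shape[1] for a in arrs)
output = np.full((len(arrs), 1000, max_b), np.nan)
for k, a in enumerate(arrs):
    output[k, :, :a.shape[1]] = a

print(output[:, :, 0].shape)  # (5, 1000)
```

With the padded array, output[:, :, 0] works directly, and functions such as np.nanmean can be used to ignore the padding.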

How to stack multiple 2D numpy arrays in a 3D numpy array

I am extracting features from Audio clips. In doing so for 1 clip a matrix of 20x2 dimension is obtained. I have around 1000 of such clips. I want to store all the data in 1 numpy array of dimension 20x2x1000. Please suggest a method for the same.
The function you're looking for is np.stack. It's used to stack multiple NumPy arrays along a new axis.
import numpy as np
# Generate 1000 features
original_features = [np.random.rand(20, 2) for i in range(1000)]
# Stack them into one array
stacked_features = np.stack(original_features, axis=2)
assert stacked_features.shape == (20, 2, 1000)
There is a convenient function for this and that is numpy.dstack. Below is a snippet of code for depth stacking of arrays:
# whatever the number of arrays that you have
In [4]: tuple_of_arrs = tuple(np.random.randn(20, 2) for _ in range(10))
# stack each of the arrays along third axis
In [7]: depth_stacked = np.dstack(tuple_of_arrs)
In [8]: depth_stacked.shape
Out[8]: (20, 2, 10)
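For 2D inputs the two suggestions give the same result, which can be checked with a short sketch. The third variant, building a (1000, 20, 2) array and moving the first axis to the end, is an additional option not mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder features: 1000 clips, each a 20x2 matrix.
clips = [rng.random((20, 2)) for _ in range(1000)]

via_stack = np.stack(clips, axis=2)              # new axis inserted at position 2
via_dstack = np.dstack(clips)                    # depth-wise stacking
via_array = np.moveaxis(np.array(clips), 0, -1)  # (1000, 20, 2) -> (20, 2, 1000)

print(via_stack.shape)  # (20, 2, 1000)
```

In all three results, the features of clip number k are at [:, :, k].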

how to randomly sample in 2D matrix in numpy

I have a 2D array/matrix like the one below. How would I randomly pick a value from this 2D matrix, for example getting a value like [-62, 29.23]? I looked at numpy.random.choice, but it is built for 1D arrays.
The following is my example with 4 rows and 8 columns
Space_Position=[
[[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23]],
[[-62,11.28],[-49.73,11.28],[-31.82,11.28],[-14.2,11.28],[3.51,11.28],[21.21,11.28] ,[39.04,11.28],[57.1,11.8]],
[[-62,-5.54],[-49.73,-5.54],[-31.82,-5.54] ,[-14.2,-5.54],[3.51,-5.54],[21.21,-5.54],[39.04,-5.54],[57.1,-5.54]],
[[-62,-23.1],[-49.73,-23.1],[-31.82,-23.1],[-14.2,-23.1],[3.51,-23.1],[21.21,-23.1],[39.04,-23.1] ,[57.1,-23.1]]
]
In the answers the following solution was given:
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1][random_index2]
This indeed works to give me one sample, but how about more than one sample, like what np.random.choice() does?
Another way I am thinking of is to transform the matrix into a flat array of pairs, like
Space_Position=[
[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23], ..... ]
and then use np.random.choice(). However, I could not find a way to do the transformation; flatten() makes the array look like
Space_Position=[-62,29.23,-49.73,29.2, ....]
Just use random indices (two in your case, because the array has 3 dimensions):
import numpy as np
Space_Position = np.array(Space_Position)
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1, random_index2] # get the random element.
The alternative is to actually make it 2D:
Space_Position = np.array(Space_Position).reshape(-1, 2)
and then use one random index:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_index = np.random.randint(0, Space_Position.shape[0]) # generate a random index
Space_Position[random_index] # get the random element.
If you want N samples with replacement:
N = 5
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.random.randint(0, Space_Position.shape[0], size=N) # generate N random indices
Space_Position[random_indices] # get N samples with replacement
or without replacement:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.arange(0, Space_Position.shape[0]) # array of all indices
np.random.shuffle(random_indices) # shuffle the array
Space_Position[random_indices[:N]] # get N samples without replacement
Referring to the numpy.random.choice documentation:
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
The generator documentation is at numpy.random.Generator.choice.
Using this, you can create a generator and then choose from your array:
rng = np.random.default_rng()  # create the generator, e.g. Generator(PCG64)
N = 3  # number of choices
a = np.array(Space_Position)  # make sure a is an ndarray
s = a.shape  # (4, 8, 2)
a = a.reshape((s[0] * s[1], s[2]))  # make the array 2D, keeping the last dimension separate
a.shape  # (32, 2)
b = rng.choice(a, N, axis=0, replace=False)  # N rows of a, e.g. array([[57.1, 11.8], [21.21, -5.54], [39.04, 11.28]])
# Note: replace=False prevents the same entry from appearing more than once in the result
Space_Position[np.random.randint(0, len(Space_Position))][np.random.randint(0, len(Space_Position[0]))]
gives you what you want
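To draw several distinct positions while keeping the original (rows, columns, 2) layout, one possibility is to sample flat cell indices and convert them back to row and column indices. This is a sketch only: the array grid is a hypothetical regular 4x8 grid standing in for Space_Position, and np.unravel_index is an alternative to the reshape used in the answers above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for Space_Position: a 4x8 grid of (x, y) pairs.
xs = np.linspace(-62.0, 57.1, 8)
ys = np.array([29.23, 11.28, -5.54, -23.1])
grid = np.stack(np.meshgrid(xs, ys), axis=-1)  # shape (4, 8, 2)

n = 5
n_cells = grid.shape[0] * grid.shape[1]

# Sample n distinct flat cell indices, then map them back to (row, col).
flat = rng.choice(n_cells, size=n, replace=False)
rows, cols = np.unravel_index(flat, grid.shape[:2])

samples = grid[rows, cols]  # shape (n, 2), one (x, y) pair per sample
print(samples.shape)  # (5, 2)
```

Because rows and cols are kept, this also records where each sampled pair sits in the grid.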
