Appending numpy array of arrays - python

I am trying to append an array to another array but its appending them as if it was just one array. What I would like to have is have each array appended on its own index, (withoug having to use a list, i want to use np arrays) i.e
temp = np.array([])
for i in my_items
m = get_item_ids(i.color) #returns an array as [1,4,20,5,3] (always same number of items but diff ids
temp = np.append(temp, m, axis=0)
On the second iteration lets suppose i get [5,4,15,3,10]
then i would like to have temp as
array([1,4,20,5,3][5,4,15,3,10])
But instead i keep getting [1,4,20,5,3,5,4,15,3,10]
I am new to python but i am sure there is probably a way to concatenate in this way with numpy without using lists?

You have to reshape m in order to have two dimension with
m.reshape(-1, 1)
thus adding the second dimension. Then you could concatenate along axis=1.
np.concatenate(temp, m, axis=1)

List append is much better - faster and easier to use correctly.
temp = []
for i in my_items
m = get_item_ids(i.color) #returns an array as [1,4,20,5,3] (always same number of items but diff ids
temp = m
Look at the list to see what it created. Then make an array from that:
arr = np.array(temp)
# or `np.vstack(temp)

Related

How to concatenate numpy arrays to create a 2d numpy array

I'm working on using AI to give me better odds at winning Keno. (don't laugh lol)
My issue is that when I gather my data it comes in the form of 1d arrays of drawings at a time. I have different files that have gathered the data and formatted it as well as performed simple maths on the data set. Now I'm trying to get the data into a certain shape for my Neural Network layers and am having issues.
formatted_list = file.readlines()
#remove newline chars
formatted_list = list(filter(("\n").__ne__, formatted_list))
#iterate through each drawing, format the ends and split into list of ints
for i in formatted_list:
i = i[1:]
i = i[:-2]
i = [int(j) for j in i.split(",")]
#convert to numpy array
temp = np.array(i)
#t1 = np.reshape(temp, (-1, len(temp)))
#print(np.shape(t1))
#append to master list
master_list.append(temp)
print(np.shape(master_list))
This gives output of "(292,)" which is correct there are 292 rows of data however they contain 20 columns as well. If I comment in the "#t1 = np.reshape(temp, (-1, len(temp))) #print(np.shape(t1))" it gives output of "(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)", etc. I want all of those rows to be added together and keep the columns the same (292,20). How can this be accomplished?
I've tried reshaping the final list and many other things and had no luck. It either populates each number in the row and adds it to the first dimension, IE (5840,) I was expecting to be able to append each new drawing to a master list, convert to numpy array and reshape it to the 292 rows of 20 columns. It just appears that it want's to keep the single dimension. I've tried numpy.concat also and no luck. Thank you.
You can use vstack to concatenate your master_list.
master_list = []
for array in formatted_list:
master_list.append(array)
master_array = np.vstack(master_list)
Alternatively, if you know the length of your formatted_list containing the arrays and array length you can just preallocate the master_array.
import numpy as np
formatted_list = [np.random.rand(20)]*292
master_array = np.zeros((len(formatted_list), len(formatted_list[0])))
for i, array in enumerate(formatted_list):
master_array[i,:] = array
** Edit **
As mentioned by hpaulj in the comments, np.array(), np.stack() and np.vstack() worked with this input and produced a numpy array with shape (7,20).

How to transform 2D array using values as another array's indices?

I have a 2D array with indices refering to another array:
indexarray = np.array([[0,0,1,1],
[1,2,3,0]])
The array which these indices refer to is:
valuearray = np.array([8,7,6,5])
I would like to get an array with the numbers from valuearray in the shape of indexarray, each item in this array corresponds to the value in valuearray with the index on the same location in indexarray, ie:
targetarray = np.array([[8,8,7,7],
[7,6,5,8]])
How can I do this without iteration?
What I do now to achieve this is:
np.apply_along_axis(func1d = lambda row: valuearray[row],axis=0,arr = indexarray)
If there is a simpler way, I am interested.
One way is to flatten the index array and get the values and reshape it back as follows.
targetarray = valuearray[indexarray.flatten()].reshape(indexarray.shape)

Splitting numpy multidimensional array based on indices stored in another array or list

I have a numpy multidimensional array with shape = (12,2,3,3)
import numpy as np
arr = np.arange(12*2*3*3).reshape((12,2,3,3))
I need to select those elements based on the 2nd dimension where the dindices are stored in another list
indices = [0,1,0,0,1,1,0,1,1,0,1,1]
in one array, and the rest in another array.
the output in either case should be in another array of shape (12,3,3)
arr2 = np.empty((arr.shape[0],*arr.shape[-2:]))
I could do it using a for loop
for i, ii in enumerate(indices):
arr2[i] = arr[i, indices[ii],...]
However, I am searching for a one liner.
When I try indexing using the list as indices
test = arr[:,indices,...]
I get test of shape (12,12,3,3) instead of (12,3,3). Could you help me please?
You can use np.arange for indexing the first dimension:
test = arr[np.arange(arr.shape[0]),indices,...]
or just the python range function:
test = arr[range(arr.shape[0]),indices,...]

Numpy vectorize sum over indices

I have a list of indices (list(int)) and a list of summing indices (list(list(int)). Given a 2D numpy array, I need to find the sum over indices in the second list for each column and add them to the corresponding indices in the first column. Is there any way to vectorize this?
Here is the normal code:
indices = [1,0,2]
summing_indices = [[5,6,7],[6,7,8],[4,5]]
matrix = np.arange(9*3).reshape((9,3))
for c,i in enumerate(indices):
matrix[i,c] = matrix[summing_indices[i],c].sum()+matrix[i,c]
Here's an almost* vectorized approach using np.add.reduceat -
lens = np.array(map(len,summing_indices))
col = np.repeat(indices,lens)
row = np.concatenate(summing_indices)
vals = matrix[row,col]
addvals = np.add.reduceat(vals,np.append(0,lens.cumsum()[:-1]))
matrix[indices,np.arange(len(indices))] += addvals[indices.argsort()]
Please note that this has some setup overhead, so it would be best suited for 2D input arrays with a good number of columns as we are iterating along the columns.
*: Almost because of the use of map() at the start, but computationally that should be negligible.

Fastest way to get elements from a numpy array and create a new numpy array

I have numpy array called data of dimensions 150x4
I want to create a new numpy array called mean of dimensions 3x4 by choosing random elements from data.
My current implementation is:
cols = (data.shape[1])
K=3
mean = np.zeros((K,cols))
for row in range(K):
index = np.random.randint(data.shape[0])
for col in range(cols):
mean[row][col] = data[index][col]
Is there a faster way to do the same?
You can specify the number of random integers in numpy.randint (third argument). Also, you should be familiar with numpy.array's index notations. Here, you can access all the elements in one row by : specifier.
mean = data[np.random.randint(0,len(data),3),:]

Categories