I have an array like this:
array = np.array([[[[ 2, -3],[ 3, 2]],[[-4, -1],[-5, 1]],
[[-7, -5],[-1, 6]],[[-5, 0],[-4, 2]]],
[[[-1, 4],[ 6, 1]],[[-2, -3],[-5, 5]],
[[-2, -8],[-1, 7]],[[-1, 8],[-4, 2]]]])
If I call sum(array), I get the element-wise sum of the two (4x2x2) blocks.
How can I instead sum the elements inside the innermost arrays, the opposite of what sum() did? For example, (2-3) = -1 for the first pair, (3+2) = 5 for the second, etc.
Thanks
Summing along the last axis (axis=3) should do what you want:
res = np.sum(array, axis=3)
# or:
# res = array.sum(axis=3)
which produces
[[[ -1 5]
[ -5 -4]
[-12 5]
[ -5 -2]]
[[ 3 7]
[ -5 0]
[-10 6]
[ 7 -2]]]
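A quick way to see which axis is being reduced is to compare shapes before and after. The sketch below reuses the array from the question and assumes the last axis is the one wanted; the names pair_sums and builtin_sum are illustrative only.

```python
import numpy as np

array = np.array([[[[ 2, -3], [ 3, 2]], [[-4, -1], [-5, 1]],
                   [[-7, -5], [-1, 6]], [[-5, 0], [-4, 2]]],
                  [[[-1, 4], [ 6, 1]], [[-2, -3], [-5, 5]],
                   [[-2, -8], [-1, 7]], [[-1, 8], [-4, 2]]]])

print(array.shape)              # (2, 4, 2, 2)

# axis=-1 is the last axis, which is the same as axis=3 for this 4-D array
pair_sums = array.sum(axis=-1)
print(pair_sums.shape)          # (2, 4, 2)

# the built-in sum() iterates over the first axis and adds the two blocks
builtin_sum = sum(array)
print(builtin_sum.shape)        # (4, 2, 2)
```

Using axis=-1 instead of axis=3 keeps the call valid if the number of leading dimensions changes.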
For a project I need to get, from an array with shape (k, m), the indices of the N greatest values of each row that are also greater than a fixed threshold.
For example, if k=3, m=5, N=3, the threshold is 5 and the array is:
[[3 2 6 7 0],
[4 1 6 4 0],
[7 10 6 9 8]]
I should get the result (or the flattened version, I don't care) :
[[2, 3],
[2],
[1, 3, 4]]
The indexes don't have to be sorted.
My code is currently :
indexes = []
for row, inds in enumerate(np.argsort(results, axis=1)[:, -N:]):
    for index in inds:
        if results[row, index] > threshold:
            indexes.append(index)
but I feel like I am not using NumPy to its full capacity.
Does anybody know a better, more elegant solution?
How about this method:
import numpy as np
arr = np.array(
[[3, 2, 6, 7, 0],
[4, 1, 6, 4, 0],
[7, 10, 6, 9, 8]]
)
t = 5
n = 3
sorted_idxs = arr.argsort(1)[:, -n:]
sorted_arr = np.sort(arr, 1)[:, -n:]
item_nums = np.cumsum((sorted_arr > t).sum(1))
masked_idxs = sorted_idxs[sorted_arr > t]
# drop the last boundary, otherwise np.split appends a trailing empty array
idx_lists = np.split(masked_idxs, item_nums[:-1])
output:
[array([2, 3]), array([2]), array([4, 3, 1])]
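The same idea can be wrapped in a small helper that sorts only once and returns one index array per row. This is a sketch, not the answerer's code: the function name top_n_above is hypothetical, and it assumes NumPy 1.15 or later for np.take_along_axis.

```python
import numpy as np

def top_n_above(arr, n, threshold):
    """Per row, indices of the n largest values that exceed threshold."""
    # indices of the n largest values in each row (ascending order)
    idx = np.argsort(arr, axis=1)[:, -n:]
    # gather the matching values without sorting a second time
    vals = np.take_along_axis(arr, idx, axis=1)
    # keep only the indices whose value passes the threshold
    return [row_idx[row_vals > threshold] for row_idx, row_vals in zip(idx, vals)]

arr = np.array([[3, 2, 6, 7, 0],
                [4, 1, 6, 4, 0],
                [7, 10, 6, 9, 8]])
result = top_n_above(arr, n=3, threshold=5)
print(result)
```

The result is a list because rows can keep different numbers of indices, so the output is ragged and does not fit a regular 2D array.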
I'm working with a m x n numpy 2D-array which holds some integer values. The dimensions are unknown before executing the script, but n (the width) is always even. Something like:
[[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]]
What I need is to group the columns in pairs and concatenate them along the first axis:
[[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[ 3 4]
[ 3 4]
[ 3 4]
[ 3 4]
[ 3 4]]
I tried using reshape, but that doesn't produce the expected result. I'm not very experienced with Python; I could implement it using loops and if statements, but I'm sure there's a more elegant way to do it. Any help is welcome!
You need to transpose between the two reshapes:
# sample
a = np.stack([[1,2,3,4, 5, 6]]*2)
a.reshape(a.shape[0], -1, 2).transpose(1,0,2).reshape(-1,2)
Output:
array([[1, 2],
[1, 2],
[3, 4],
[3, 4],
[5, 6],
[5, 6]])
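To make the reshape/transpose chain reusable for any m x n input with even n, it can be wrapped in a function. The sketch below is one way to do that; the name stack_column_pairs is hypothetical, and the np.hsplit/np.vstack line is only an independent cross-check.

```python
import numpy as np

def stack_column_pairs(a):
    """Group columns in pairs and stack the pairs along the first axis."""
    m, n = a.shape
    # (m, n) -> (m, n//2, 2): split every row into consecutive column pairs
    pairs = a.reshape(m, n // 2, 2)
    # (m, n//2, 2) -> (n//2, m, 2): put the pair index first
    pairs = pairs.transpose(1, 0, 2)
    # (n//2, m, 2) -> (n//2 * m, 2): stack the pairs vertically
    return pairs.reshape(-1, 2)

a = np.tile(np.array([1, 2, 3, 4]), (5, 1))   # the 5 x 4 example from the question
out = stack_column_pairs(a)
print(out)

# cross-check: split into 2-column blocks, then stack them
check = np.vstack(np.hsplit(a, a.shape[1] // 2))
```

The np.vstack(np.hsplit(...)) form is easier to read, while the reshape chain makes every intermediate shape explicit.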
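With reshape you can also read the elements in column-major order using order='F'. On its own that pairs column 0 with column 2 and column 1 with column 3, so for this 4-column example reorder the columns first:
a=np.array([[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]])
a[:, [0,2,1,3]].reshape((8,2),order='F')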
I have an example 2 x 2 x 2 array:
np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
I want the nansum of the array across the first index as follows:
Sum all values in:
[[ 1, 2],
[ 3, 4]]
and
[[ 5, 6],
[ 7 , 8]]
The sum of the first array would be 10 and the sum of the second would be 26,
i.e.
array([10, 26])
I think you are looking for this
a = np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
np.nansum(a,axis=(1,2))
# array([10, 26])
This sums over axes 1 and 2 only, leaving one value per entry along axis 0.
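Since the question asks for nansum specifically, it may help to see the same call with a missing value present. In this sketch the array is converted to float and one entry is replaced by NaN purely for illustration; that modification is an assumption, not part of the original data.

```python
import numpy as np

a = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 8]]], dtype=float)
a[0, 0, 0] = np.nan   # illustrative missing value

# NaN entries are treated as zero; axes 1 and 2 are reduced together
block_sums = np.nansum(a, axis=(1, 2))
print(block_sums)

# plain sum propagates the NaN into the first block's total
plain_sums = np.sum(a, axis=(1, 2))
print(plain_sums)
```

Passing a tuple of axes reduces over all of them in one call, so no intermediate reshape is needed.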
I have a NumPy array with 4 columns and want to select columns 1, 3 and 4 for the rows where the value of the second column meets a certain condition (i.e. equals a fixed value). I first tried to select only the rows, keeping all 4 columns, via:
I = A[A[:,1] == i]
which works. Then I tried (similarly to MATLAB, which I know very well):
I = A[A[:,1] == i, [0,2,3]]
which doesn't work. How can I do it?
EXAMPLE DATA:
>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
[6 1 3 4]
[3 2 5 6]]
>>> i = 2
# I want to get the columns 1, 3 and 4
# for every row which has the value i in the second column.
# In this case, this would be row 1 and 3 with columns 1, 3 and 4:
[[1 3 4]
[3 5 6]]
I am now currently using this:
I = A[A[:,1] == i]
I = I[:, [0,2,3]]
But I thought there had to be a nicer way of doing it (I am used to MATLAB).
>>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> a[a[:,0] > 3] # select rows where first column is greater than 3
array([[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> a[a[:,0] > 3][:,np.array([True, True, False, True])] # select columns
array([[ 5, 6, 8],
[ 9, 10, 12]])
# fancier equivalent of the previous
>>> a[np.ix_(a[:,0] > 3, np.array([True, True, False, True]))]
array([[ 5, 6, 8],
[ 9, 10, 12]])
For an explanation of the obscure np.ix_(), see https://stackoverflow.com/a/13599843/4323
Finally, we can simplify by giving the list of column numbers instead of the tedious boolean mask:
>>> a[np.ix_(a[:,0] > 3, (0,1,3))]
array([[ 5, 6, 8],
[ 9, 10, 12]])
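Applied to the data from the question, the np.ix_ approach might look like the sketch below. np.ix_ converts the boolean row mask to integer positions and returns index arrays shaped so that they broadcast against each other, which is what lets rows and columns be selected in a single indexing step. The variable names rows, cols and selected are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2

# rows where the second column equals i, combined with columns 0, 2 and 3
rows, cols = np.ix_(A[:, 1] == i, [0, 2, 3])
print(rows.shape, cols.shape)   # (2, 1) (1, 3)

selected = A[rows, cols]
print(selected)
# [[1 3 4]
#  [3 5 6]]
```

The (2, 1) and (1, 3) index arrays broadcast to a (2, 3) result, one entry per row/column combination.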
If you would rather select the columns by index than by boolean mask, you can write it this way:
A[:, [0, 2, 3]][A[:, 1] == i]
Going back to your example:
>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
[6 1 3 4]
[3 2 5 6]]
>>> i = 2
>>> print(A[:, [0, 2, 3]][A[:, 1] == i])
[[1 3 4]
[3 5 6]]
Simply chain the row selection and the column selection:
>>> a=np.array([[1,2,3], [1,3,4], [2,2,5]])
>>> a[a[:,0]==1][:,[0,1]]
array([[1, 2],
[1, 3]])
>>>
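This also works, using a list comprehension:
col = 1  # zero-based index of the column to test (the second column)
I = np.array([row[[x for x in range(A.shape[1]) if x != col]] for row in A if row[col] == i])
print(I)
Edit: Since indexing starts from 0, the second column has index 1. Keep the column index separate from the value i being matched.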
I hope this answers your question. A pattern I have used with pandas is:
df_targetrows = df.loc[df[col2filter]*somecondition*, [col1,col2,...,coln]]
For example,
targets = stockdf.loc[stockdf['rtns'] > .04, ['symbol','date','rtns']]
This returns a DataFrame containing only the columns ['symbol','date','rtns'] from stockdf, for the rows where stockdf['rtns'] > .04.
Hope this helps.
I have the following code to calculate the first difference of a matrix, i.e. the i-th element minus the (i-1)-th element.
How can I (easily) calculate the difference for each element both horizontally and vertically? With a transpose?
inputarr = np.arange(12)
inputarr.shape = (3,4)
inputarr+=1
#shift one position
newarr = []
for x in inputarr:
    # prepend a zero and drop the last element of each row
    newarr.append(np.hstack((np.array([0]), x[:-1])))
z = np.array(newarr)
print(inputarr)
print('first differences')
print(inputarr - z)
Output
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
first differences
[[1 1 1 1]
[5 1 1 1]
[9 1 1 1]]
Check out numpy.diff.
From the documentation:
Calculate the n-th order discrete difference along given axis.
The first order difference is given by out[n] = a[n+1] - a[n] along
the given axis, higher order differences are calculated by using diff
recursively.
An example:
>>> import numpy as np
>>> a = np.arange(12).reshape((3,4))
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> np.diff(a, axis=1) # horizontal: differences along each row
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
>>> np.diff(a, axis=0) # vertical: differences down each column
array([[4, 4, 4, 4],
[4, 4, 4, 4]])
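Note that np.diff returns one fewer element along the chosen axis, whereas the loop in the question keeps the original shape by subtracting a zero from the first element. The sketch below reproduces that behaviour with the prepend argument, assuming NumPy 1.16 or later; the names horizontal and vertical are illustrative.

```python
import numpy as np

inputarr = np.arange(12).reshape(3, 4) + 1

# prepend=0 inserts a zero before the first element along the axis,
# so the output keeps the input's shape
horizontal = np.diff(inputarr, axis=1, prepend=0)
vertical = np.diff(inputarr, axis=0, prepend=0)

print(horizontal)
# [[1 1 1 1]
#  [5 1 1 1]
#  [9 1 1 1]]
print(vertical)
# [[1 2 3 4]
#  [4 4 4 4]
#  [4 4 4 4]]
```

Changing the axis argument is enough to switch direction, so no transpose is needed.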