My question is very similar to
Indexing tensor with index matrix in theano?
except that I have 3 dimensions. First I want to get it working in NumPy. With 2 dimensions there is no problem:
>>> idx = np.random.randint(3, size=(4, 2, 3))
>>> d = np.random.rand(4*2*3).reshape((4, 2, 3))
>>> d[1]
array([[ 0.37057415, 0.73066383, 0.76399376],
[ 0.12155831, 0.12552545, 0.87648523]])
>>> idx[1]
array([[2, 0, 1],
[2, 2, 2]])
>>> d[1][np.arange(d.shape[1])[:, np.newaxis], idx[1]]
array([[ 0.76399376, 0.37057415, 0.73066383],
[ 0.87648523, 0.87648523, 0.87648523]]) #All correct
But I have no idea how to make it work with all 3 dimensions. Example of a failed attempt:
>>> d[np.arange(d.shape[0])[:, np.newaxis], np.arange(d.shape[1]), idx]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (4,1) (2,) (4,2,3)
Does this work?
d[
np.arange(d.shape[0])[:, np.newaxis, np.newaxis],
np.arange(d.shape[1])[:, np.newaxis],
idx
]
You need the index arrays to collectively have broadcastable shapes. Here they are (4, 1, 1), (2, 1) and (4, 2, 3), which broadcast together to (4, 2, 3).
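A minimal sketch of what that indexing expression computes, using seeded random arrays as stand-ins for the question's data: element [i, j, k] of the result is d[i, j, idx[i, j, k]]. The comparison with np.take_along_axis (available in NumPy 1.15 and later) is only a cross-check added here, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.random((4, 2, 3))
idx = rng.integers(0, 3, size=(4, 2, 3))

i = np.arange(d.shape[0])[:, np.newaxis, np.newaxis]  # shape (4, 1, 1)
j = np.arange(d.shape[1])[:, np.newaxis]              # shape (2, 1)
picked = d[i, j, idx]                                 # shapes broadcast to (4, 2, 3)

# Cross-check: gather along the last axis with take_along_axis
same_as_take = np.array_equal(picked, np.take_along_axis(d, idx, axis=2))
print(picked.shape, same_as_take)
```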
I'm trying to use advanced indexing but I cannot get it to work with this simple array
arr = np.array([[[ 1, 10, 100,1000],[ 2, 20, 200,2000]],[[ 3, 30, 300,3000],[ 4,40,400,4000]],[[5, 50, 500,5000],[6, 60,600,6000]]])
d1=np.array([0])
d2=np.array([0,1])
d3=np.array([0,1,2])
arr[d1,d2,d3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,) (2,) (3,)
and
arr[d1[:,np.newaxis],d2[np.newaxis,:],d3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,1) (1,2) (3,)
Expected output:
array([[[ 1, 10, 100],
[ 2, 20, 200]]])
You can use np.ix_ to combine several one-dimensional index arrays of different lengths to index a multidimensional array. For example:
arr[np.ix_(d1,d2,d3)]
To add more context, np.ix_ returns a tuple of N-dimensional arrays, one per input, shaped so that they broadcast against each other. The same can be achieved "by hand" by adding np.newaxis in the appropriate dimensions:
xs, ys, zs = np.ix_(d1,d2,d3)
# xs.shape == (1, 1, 1) == (len(d1), 1, 1 )
# ys.shape == (1, 2, 1) == (1, len(d2), 1 )
# zs.shape == (1, 1, 3) == (1, 1, len(d3))
result_ix = arr[xs, ys, zs]
# using newaxis:
result_newaxis = arr[
d1[:, np.newaxis, np.newaxis],
d2[np.newaxis, :, np.newaxis],
d3[np.newaxis, np.newaxis, :],
]
assert (result_ix == result_newaxis).all()
Indexing with d1 alone selects the first block but keeps all four columns, so slice the last axis as well:
>>> arr[d1, :, :3]
array([[[ 1, 10, 100],
[ 2, 20, 200]]])
Very simple but had no clue about it.
How do I add 4 to the end of the param_array?
param_array = np.array([[1,2,3]])
print(param_array)
print(param_array.shape)
print()
param_array = np.append(param_array, 4)
print(param_array)
print(param_array.shape)
[[1 2 3]]
(1, 3)
[1 2 3 4]
(4,)
I need the array of
[[1,2,3,4]]
shape should be (1,4)
Concatenate a (1,1) array to a (1,3) to make a (1,4):
In [168]: arr = np.array([[1,2,3]])
In [169]: arr1 = np.concatenate((arr, np.array([[4]])), axis=1)
In [170]: arr1
Out[170]: array([[1, 2, 3, 4]])
Your use of np.append produced a (4,) because, according to the docs:
If `axis` is None, `out` is a flattened array.
If I specify the axis in append:
In [172]: np.append(arr, 4, axis=1)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-172-ca20005ded46> in <module>
----> 1 np.append(arr, 4, axis=1)
<__array_function__ internals> in append(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in append(arr, values, axis)
4698 values = ravel(values)
4699 axis = arr.ndim-1
-> 4700 return concatenate((arr, values), axis=axis)
4701
4702
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 0 dimension(s)
The error is raised by the concatenate call inside np.append. The value to append must itself be 2D, just as in the concatenate example above:
In [173]: np.append(arr, [[4]], axis=1)
Out[173]: array([[1, 2, 3, 4]])
np.append(A, B, axis) is just another way of writing np.concatenate((A,B), axis). With both you have to pay attention to dimensions.
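A short sketch comparing the three calls discussed above on the same (1, 3) array. The variable names flat, kept and same are only illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3]])                            # shape (1, 3)

flat = np.append(arr, 4)                               # axis=None: inputs are flattened first
kept = np.append(arr, [[4]], axis=1)                   # value passed as a (1, 1) array
same = np.concatenate((arr, np.array([[4]])), axis=1)  # equivalent concatenate call

print(flat.shape)  # (4,)
print(kept.shape)  # (1, 4)
```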
If you would rather not deal with the axis parameter, there is a shortcut: reshape the flattened result after appending the new element.
param_array = np.append(param_array, 4).reshape((1,4))
This gives the (1, 4) array you asked for (at least I think so) and is somewhat simpler. Using reshape(1, -1) instead avoids hard-coding the new length.
header
output:
array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
'sample_ID','cortisol_value', 'Group'], dtype='<U14')
body
output:
array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)
testing = np.concatenate((header, body), axis=0)
ValueError Traceback (most recent call last)
<ipython-input-302-efb002602b4b> in <module>()
1 # Merge names and the rest of the data in np array
2
----> 3 testing = np.concatenate((header, body), axis=0)
ValueError: all the input arrays must have same number of dimensions
Might someone be able to troubleshoot this?
I have tried different commands to merge the two (including stack) and am getting the same error. The dimensions (columns) do seem to be the same though.
You're right to use numpy.concatenate(), but you have to promote the first array to 2D before concatenating. Here's a simple example:
In [1]: import numpy as np
In [2]: arr1 = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
...: 'sample_ID','cortisol_value', 'Group'], dtype='<U14')
...:
In [3]: arr2 = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
...: ['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)
...:
In [4]: arr1.shape
Out[4]: (7,)
In [5]: arr2.shape
Out[5]: (2, 7)
In [8]: concatenated = np.concatenate((arr1[None, :], arr2), axis=0)
In [9]: concatenated.shape
Out[9]: (3, 7)
And the resultant concatenated array would look like:
In [10]: concatenated
Out[10]:
array([['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 'sample_ID',
'cortisol_value', 'Group'],
['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)
Explanation:
You were getting the ValueError because one of the arrays is 1D while the other is 2D, and numpy.concatenate expects all inputs to have the same number of dimensions. That's why we promoted arr1 to 2D by indexing with None. You can also use numpy.newaxis in place of None; the two are equivalent.
You need to align array dimensions first. You are currently trying to combine 1-dimensional and 2-dimensional arrays. After alignment, you can use numpy.vstack.
Note np.array([A]).shape returns (1, 7), while B.shape returns (2, 7). A more efficient alternative would be to use A[None, :].
Also note that the result will have dtype object, since object arrays accept arbitrary / mixed types.
A = np.array(['Subject_ID', 'tube_label', 'sample_#', 'Relabel',
'sample_ID','cortisol_value', 'Group'], dtype='<U14')
B = np.array([['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC'],], dtype=object)
res = np.vstack((np.array([A]), B))
print(repr(res))
array([['Subject_ID', 'tube_label', 'sample_#', 'Relabel', 'sample_ID',
'cortisol_value', 'Group'],
['STM002', '170714_STM002_1', 1, 1, 1, 1.98, 'HC'],
['STM002', '170714_STM002_2', 2, 2, 2, 2.44, 'HC']], dtype=object)
Look at numpy.vstack and hstack, as well as the axis argument in np.append. Here it looks like you want vstack (i.e. the output array will have 3 rows, each with the same number of columns). You can also look into numpy.reshape to change the shape of the input arrays so you can concatenate them.
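A small sketch of both suggestions, using made-up three-column data in place of the question's header and body. np.vstack treats a 1D input of shape (N,) as a single row of shape (1, N), so it gives the same result as reshaping the header yourself and concatenating along axis 0.

```python
import numpy as np

header = np.array(["id", "label", "value"])                    # shape (3,)
body = np.array([["a", 1, 1.5], ["b", 2, 2.5]], dtype=object)  # shape (2, 3)

stacked = np.vstack((header, body))                            # 1D header becomes one row
reshaped = np.concatenate((header.reshape(1, -1), body), axis=0)

print(stacked.shape)  # (3, 3)
print(stacked.dtype)  # object
```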
I am trying to concatenate 4 arrays: one 1D array of shape (78427,) and three 2D arrays of shape (78427, 375/81/103). Basically these are 4 arrays with features for 78427 images, in which the 1D array has only 1 value for each image.
I tried concatenating the arrays as follows:
>>> print X_Cscores.shape
(78427, 375)
>>> print X_Mscores.shape
(78427, 81)
>>> print X_Tscores.shape
(78427, 103)
>>> print X_Yscores.shape
(78427,)
>>> np.concatenate((X_Cscores, X_Mscores, X_Tscores, X_Yscores), axis=1)
This results in the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: all the input arrays must have same number of dimensions
The problem seems to be the 1D array, but I can't really see why (it also has 78427 values). I tried to transpose the 1D array before concatenating it, but that also didn't work.
Any help on what's the right method to concatenate these arrays would be appreciated!
Try concatenating X_Yscores[:, None] (or X_Yscores[:, np.newaxis] as imaluengo suggests). This creates a 2D array out of a 1D array.
Example:
import numpy as np
A = np.array([1, 2, 3])
print(A.shape)
print(A[:, None].shape)
Output:
(3,)
(3, 1)
I am not sure if you want something like:
a = np.array( [ [1,2],[3,4] ] )
b = np.array( [ 5,6 ] )
c = a.ravel()
con = np.concatenate( (c,b ) )
array([1, 2, 3, 4, 5, 6])
OR
np.column_stack( (a,b) )
array([[1, 2, 5],
[3, 4, 6]])
np.row_stack( (a,b) )
array([[1, 2],
[3, 4],
[5, 6]])
You can try this one-liner, where dim is the shared first dimension (78427 here):
concat = numpy.hstack([a.reshape(dim,-1) for a in [X_Cscores, X_Mscores, X_Tscores, X_Yscores]])
The "secret" here is to reshape using the known, common dimension for one axis and -1 for the other, so that NumPy infers the remaining size (creating a new axis of length 1 for the 1D array).
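A runnable sketch of that one-liner on small stand-in arrays. The names n, c_scores, m_scores and y_scores are hypothetical placeholders for the question's data, with n playing the role of dim.

```python
import numpy as np

n = 5                              # stand-in for the 78427 images
c_scores = np.zeros((n, 4))        # 2D feature block
m_scores = np.ones((n, 2))         # 2D feature block
y_scores = np.arange(n)            # 1D, one value per image

# reshape(n, -1) leaves the 2D arrays unchanged and turns the 1D array into (n, 1)
features = np.hstack([a.reshape(n, -1) for a in (c_scores, m_scores, y_scores)])
print(features.shape)  # (5, 7)
```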
Hi, I have a 2x4 array called mi_reshaped. I used argmax to find the indices of the largest elements in my array. Now I want to convert these indices to x,y coordinates, so I used numpy.unravel_index. I get this error:
Traceback (most recent call last):
File "CAfeb.py", line 273, in <module>
analyzeCA('full', im)
File "CAfeb.py", line 80, in analyzeCA
bg_params = parameterSearch( im, [3, 2], roi, ew, hist_sz, w_data);
File "CAfeb.py", line 185, in parameterSearch
ix = np.unravel_index(max_ix, mi_reshaped.shape)#(mi.size)
File "/usr/lib/pymodules/python2.7/numpy/lib/index_tricks.py", line 64, in unravel_index
if x > _nx.prod(dims)-1 or x < 0:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
mi_reshaped=mi.reshape(2,4)
max_ix = np.argmax(mi_reshaped, axis=1)
ix = np.unravel_index(max_ix, mi_reshaped.shape)#(mi.size)
Thank you
You should skip the axis=1 for this. If you do a numpy.argmax(array) it will look for max in the flattened array, and then you can do the unravel_index with the array shape to find the actual index. When you pass the axis, numpy will look for the maximum for that axis for each entry in the array. For example:
>>> data = numpy.array(range(8)).reshape(2, 4)
>>> data
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
>>> max_ix = numpy.argmax(data, axis=1)
>>> max_ix
array([3, 3])
>>> numpy.unravel_index(max_ix, data.shape)
(array([0, 0]), array([3, 3]))
Now if you skip the axis:
>>> max_ix = numpy.argmax(data)
>>> max_ix
7
>>> numpy.unravel_index(max_ix, data.shape)
(1, 3)
What happened is that you told numpy to give you the indices of the maxima along dimension 1, and it found the maxima 3 and 7, both at column index 3, hence [3, 3]. In a NumPy version where unravel_index accepts arrays, as in the session above, your code would not raise an error, but it would give the wrong final result.
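The two cases can be put side by side in a short sketch on the same 2x4 example array. The conversion to plain Python ints is only there to keep the printed output independent of the NumPy version.

```python
import numpy as np

data = np.arange(8).reshape(2, 4)

# Global maximum: argmax without axis indexes the flattened array
flat_ix = np.argmax(data)
coords = tuple(int(c) for c in np.unravel_index(flat_ix, data.shape))

# Per-row maxima: argmax with axis=1 already returns column indices
row_max_cols = np.argmax(data, axis=1)
row_coords = list(zip(range(data.shape[0]), row_max_cols.tolist()))

print(coords)      # (1, 3)
print(row_coords)  # [(0, 3), (1, 3)]
```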
In the NumPy version from your traceback, np.unravel_index expects an integer as its first argument, but max_ix is an array.
Moreover, each value in max_ix is an index with respect to the second axis (axis = 1) of mi_reshaped, not an index into the flattened array.
Try instead:
ix = [(row, ix) for row, ix in enumerate(max_ix)]
For example,
In [89]: mi_reshaped = np.array(range(8)).reshape(2, 4)
In [90]: mi_reshaped
Out[90]:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
In [91]: max_ix = np.argmax(mi_reshaped, axis=1)
In [92]: max_ix
Out[92]: array([3, 3])
In [93]: ix = [(row, ix) for row, ix in enumerate(max_ix)]
In [94]: ix
Out[94]: [(0, 3), (1, 3)]
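As a possible vectorized variant of the list comprehension (a sketch, not part of the original answer), the row numbers can be paired with max_ix using np.column_stack, and the same two arrays can be used to pull out the per-row maxima.

```python
import numpy as np

mi_reshaped = np.arange(8).reshape(2, 4)
max_ix = np.argmax(mi_reshaped, axis=1)

rows = np.arange(mi_reshaped.shape[0])
coords = np.column_stack((rows, max_ix))  # one (row, column) pair per row
row_maxima = mi_reshaped[rows, max_ix]    # values at those coordinates

print(coords.tolist())      # [[0, 3], [1, 3]]
print(row_maxima.tolist())  # [3, 7]
```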