Extract specific elements from numpy array by column [duplicate] - python

This question already has answers here:
NumPy selecting specific column index per row by using a list of indexes
(7 answers)
Closed 8 years ago.
I would like to extract specific elements from a 2d-array by index.
The index specifies the element in the column.
Example:
14, 7, 30
44, 76, 65
42, 87, 11
indices = (0, 1, 2) or (0, 1, 1)
=> result = [14,76,11] or [14, 76, 65]
I don't want to use any loop, just numpy functions and slicing and stuff.
I thought about masking but again I don't know how to generate a mask-2d-array
from the indices-array without a direct loop.

You can use row and column index vectors directly:
import numpy as np
A = np.array([[14, 7, 30],
              [44, 76, 65],
              [42, 87, 11]])
print(A[[0, 1, 2], range(A.shape[1])])
print(A[[0, 1, 1], range(A.shape[1])])
(Since you want exactly one item per column, the column index vector is range(A.shape[1]), i.e. one entry per column.)
Output:
[14 76 11]
[14 76 65]
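The same selection can also be written with np.take_along_axis, which may read more clearly when the array is not square. The following is a minimal sketch; the names row_idx, picked and picked_alt are introduced here for illustration and do not come from the question.

```python
import numpy as np

A = np.array([[14, 7, 30],
              [44, 76, 65],
              [42, 87, 11]])

# One row index per column (hypothetical name, chosen for this sketch).
row_idx = np.array([0, 1, 1])

# Fancy indexing: pair each row index with its column index.
picked = A[row_idx, np.arange(A.shape[1])]

# Equivalent with take_along_axis: the index array must have the same
# number of dimensions as A, so it is reshaped to (1, n_columns).
picked_alt = np.take_along_axis(A, row_idx[np.newaxis, :], axis=0)[0]

print(picked)      # [14 76 65]
print(picked_alt)  # [14 76 65]
```

Both forms pick A[row_idx[j], j] for every column j, so no explicit loop or mask is needed.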

Related

Add Row Numbers To an array [duplicate]

This question already has answers here:
Adding a column in front of a numpy array
(3 answers)
Closed 2 months ago.
How do you add row numbers to an array using numpy?
I wish to print an array to look like the following:
[1, 39, 41, 43],
[2, 38, 32, 18],
[3, 27, 14, 17],
[4, 22, 21, 22],
[5, 20, 28, 23]
With 1-5 being the row numbers
I can only print the array without row numbers.
np.insert(array, 0, np.arange(1, array.shape[0] + 1), axis=1)
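A self-contained sketch of the idea, numbering rows from 1 as in the desired output. The data are taken from the question; the names row_numbers and numbered are assumptions made for this example, and np.column_stack is used here as an alternative to np.insert.

```python
import numpy as np

array = np.array([[39, 41, 43],
                  [38, 32, 18],
                  [27, 14, 17],
                  [22, 21, 22],
                  [20, 28, 23]])

# Row numbers 1..n_rows (np.arange excludes the stop value, hence the +1).
row_numbers = np.arange(1, array.shape[0] + 1)

# Put the row numbers in front of the existing columns.
numbered = np.column_stack((row_numbers, array))

# np.insert with axis=1 builds the same result.
numbered_alt = np.insert(array, 0, row_numbers, axis=1)

print(numbered)
```

Neither call modifies array; both return a new array with one extra column.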

How to shuffle the element chosen by user input?

I have two lists:
arr_1 = [2, 4, 6, 32]
arr_2 = [56, 45, 12, 65]
I am trying to take a user input that is an element of arr_1.
For example, if the user enters 32, it should be moved to a random position in arr_1, and the element at the same position in arr_2 (here 65) should be moved along with it. I tried many ways, but random.shuffle shuffles all the elements of the list.
from random import randint

def shuffle_them(arr_1, arr_2, element_to_remove):
    # get the index of the element to be moved
    index_to_remove = arr_1.index(element_to_remove)
    # remove that element
    arr_1.remove(element_to_remove)
    # randomly generate the new index (randint includes both endpoints)
    new_index = randint(0, len(arr_1))
    # insert the removed element into that position in array 1
    arr_1.insert(new_index, element_to_remove)
    # also change the position of elements in array 2 accordingly
    arr_2[new_index], arr_2[index_to_remove] = arr_2[index_to_remove], arr_2[new_index]
We find the index of the element the user wants to move, then remove it. Next we generate a new index and insert the element there. Finally, we use the original index and the new index to swap the corresponding values in the second list.
usage
# before
arr_1 = [2, 4, 6, 32]
arr_2 = [56, 45, 12, 65]
# shuffling
shuffle_them(arr_1, arr_2, element_to_remove=32)
# after (32 and 65 places changed in same way)
> arr_1
[2, 32, 4, 6]
> arr_2
[56, 65, 12, 45]
another round
# before
arr_1 = [2, 4, 6, 32]
arr_2 = [56, 45, 12, 65]
# shuffling
shuffle_them(arr_1, arr_2, element_to_remove=6)
# after (6 and 12 places changed in same way)
> arr_1
[2, 4, 32, 6]
> arr_2
[56, 45, 65, 12]
Note: the function mutates arr_1 and arr_2 in place. It doesn't return new lists.
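One thing to be aware of: the swap in arr_2 keeps the chosen pair together, but it can change which arr_2 value sits beside the other arr_1 elements (in the first run above, 4 ends up beside 12 instead of 45). If every pair should stay aligned, one option is to pop and insert at the same index in both lists. This is a sketch of that variant, not the answer's method; move_pair is a hypothetical name.

```python
from random import randint

def move_pair(arr_1, arr_2, element):
    """Move `element` of arr_1 and its partner in arr_2 to one random index."""
    old_index = arr_1.index(element)
    # Take the pair out of both lists at the same position.
    value_1 = arr_1.pop(old_index)
    value_2 = arr_2.pop(old_index)
    # randint includes both endpoints, so len(arr_1) means "append at the end".
    new_index = randint(0, len(arr_1))
    # Re-insert both values at the same new position.
    arr_1.insert(new_index, value_1)
    arr_2.insert(new_index, value_2)
    return new_index

arr_1 = [2, 4, 6, 32]
arr_2 = [56, 45, 12, 65]
new_index = move_pair(arr_1, arr_2, 32)
print(arr_1, arr_2)
```

Like shuffle_them, this mutates both lists in place. The new index may equal the old one, in which case nothing visibly changes.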

How to sort 2D array column by ascending and row by descending in Python 3

This is my array:
import numpy as np
boo = np.array([
    [10, 55, 12],
    [0, 81, 33],
    [92, 11, 3]
])
If I print it, I get:
[[10 55 12]
 [ 0 81 33]
 [92 11  3]]
How can I sort each column in ascending order and then arrange the rows in descending order, like this:
[[33 81 92]
 [10 12 55]
 [ 0  3 11]]
# import the necessary packages.
import numpy as np
# create the array.
boo = np.array([
    [10, 55, 12],
    [0, 81, 33],
    [92, 11, 3]
])
# we use numpy's 'sort' function to do the sorting.
# we first sort along axis 0, i.e. the values within each column.
boo = np.sort(boo, axis=0)
# we print to observe the intermediate result.
print(boo)
# we thereafter sort the result again, this time along axis 1, i.e. within each row.
boo = np.sort(boo, axis=1)
# we thereafter reverse the order of the rows.
print(boo[::-1])
# the final print shows:
[[33 81 92]
 [10 12 55]
 [ 0  3 11]]
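The three steps can also be chained into a single expression. This is only a compact restatement of the approach above, with result as an assumed name:

```python
import numpy as np

boo = np.array([[10, 55, 12],
                [0, 81, 33],
                [92, 11, 3]])

# Sort within each column, then within each row, then reverse the row order.
result = np.sort(np.sort(boo, axis=0), axis=1)[::-1]

print(result)
```

np.sort returns a sorted copy, so boo itself is left unchanged here, whereas the step-by-step version rebinds boo at each stage.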

From a 2D array, create 2nd 2D array of Unique(non-repeated) random selected values from 1st array (values not shared among rows) without using a loop

This is a follow up on this question.
From a 2d array, create another 2d array composed of randomly selected values from original array (values not shared among rows) without using a loop
I am looking for a way to create a 2D array whose rows are randomly selected unique (non-repeating) values from the corresponding row of another array, without using a loop.
Here is a way to do it using a loop.
import numpy as np
pool = np.random.randint(0, 30, size=[4,5])
seln = np.empty([4,3], int)
for i in range(0, pool.shape[0]):
    seln[i] = np.random.choice(pool[i], 3, replace=False)
print('pool = ', pool)
print('seln = ', seln)
>pool = [[ 1 11 29 4 13]
[29 1 2 3 24]
[ 0 25 17 2 14]
[20 22 18 9 29]]
seln = [[ 8 12 0]
[ 4 19 13]
[ 8 15 24]
[12 12 19]]
Here is a method that does not use a loop; however, it can select the same value multiple times in each row.
pool = np.random.randint(0, 30, size=[4,5])
print(pool)
array([[ 4, 18, 0, 15, 9],
[ 0, 9, 21, 26, 9],
[16, 28, 11, 19, 24],
[20, 6, 13, 2, 27]])
# New array shape
new_shape = (pool.shape[0],3)
# Indices where to randomly choose from
ix = np.random.choice(pool.shape[1], new_shape)
array([[0, 3, 3],
[1, 1, 4],
[2, 4, 4],
[1, 2, 1]])
ixs = (ix.T + range(0,np.prod(pool.shape),pool.shape[1])).T
array([[ 0, 3, 3],
[ 6, 6, 9],
[12, 14, 14],
[16, 17, 16]])
pool.flatten()[ixs].reshape(new_shape)
array([[ 4, 15, 15],
[ 9, 9, 9],
[11, 24, 24],
[ 6, 13, 6]])
I am looking for a method that does not use a loop and in which a value selected from a row cannot be selected again.
Here is a way without explicit looping. However, it requires generating an array of random numbers the size of the original array. That said, the generation is done in compiled code, so it should be pretty fast. Two identical random floats in one row are essentially impossible, and even then argsort still returns a valid permutation of the column indices, so the mask keeps exactly new_width entries per row.
import numpy as np
m, n = 4, 5
pool = np.random.randint(0, 30, size=[m, n])
new_width = 3
mask = np.argsort(np.random.rand(m, n)) < new_width
pool[mask].reshape(m, new_width)
How it works:
We generate a random array of floats and argsort it. By default, argsort works along the last axis (axis 1 for a 2D array), so each row of the result is a permutation of 0,...,n-1: entry (i, j) is the column index of the j-th smallest value in row i. Because the floats are random, each row is a random permutation.
We then mark the entries of this array that are less than new_width. Each row contains the numbers 0,...,n-1 in a random order, so exactly new_width of them are less than new_width. Each row of mask therefore has exactly new_width entries that are True, and the rest are False (a comparison between an ndarray and a scalar is applied element-wise).
Finally, the boolean mask is applied to the original data to grab new_width many entries from each row.
You could also use np.vectorize for your loop solution, although that is just shorthand for a loop.
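The mask technique can be checked with a short script. The sketch below uses np.random.default_rng with an arbitrary seed (an assumption made only for repeatability) and adds a second variant, not part of the answer, that indexes with the permuted column positions directly. Note that the boolean mask returns the chosen values in their original column order, while the second variant returns them in random order. In both cases it is column positions that are never repeated within a row; if a row of pool already contains duplicate values, those can still both be picked.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable
m, n, new_width = 4, 5, 3
pool = rng.integers(0, 30, size=(m, n))

# Each row of `order` is a random permutation of the column indices 0..n-1.
order = np.argsort(rng.random((m, n)), axis=1)

# Variant 1: boolean mask. Exactly new_width entries per row are True,
# and the chosen values keep their original left-to-right order.
mask = order < new_width
seln_mask = pool[mask].reshape(m, new_width)

# Variant 2: take the first new_width permuted column indices of each row,
# so the chosen values also come out in random order.
cols = order[:, :new_width]
seln_idx = np.take_along_axis(pool, cols, axis=1)

print(seln_mask)
print(seln_idx)
```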

How can I find the final cumulative sum across numpy axis? [duplicate]

This question already has answers here:
How to calculate the sum of all columns of a 2D numpy array (efficiently)
(6 answers)
Closed 4 years ago.
I have a numpy array
np.array(data).shape
(50,50)
Now, I want to find the cumulative sums across axis=1. The problem is cumsum creates an array of cumulative sums, but I just care about the final value of every row.
This is incorrect of course:
np.cumsum(data, axis=1)[-1]
Is there a succinct way of doing this without looping through the array?
You are almost there, but as you have it now, you are selecting just the final row. What you need is to select all rows from the last column, so your indexing at the end should be: [:,-1].
Example:
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> a.cumsum(axis=1)[:,-1]
array([ 10, 35, 60, 85, 110])
Note, I'm leaving this up as I think it explains what was going wrong with your attempt, but admittedly, there are more effective ways of doing this in the other answers!
The final cumulative sum of every row is simply the sum of that row (the row-wise sum), so we can implement this as:
>>> a.sum(axis=1)
array([ 10, 35, 60, 85, 110])
So for every row we calculate the sum over all the columns. We do not need to generate the intermediate sums first: numpy will likely keep a running total in an accumulator, but the partial sums are never "emitted" into an array.
You can use numpy.ufunc.reduce if you don't need the intermediary accumulated results of any ufunc.
>>> a = np.arange(9).reshape(3,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.add.reduce(a, axis=1)
array([ 3, 12, 21])
However, in the case of sum, Willem's answer is clearly superior and to be preferred. Just keep in mind that in the general case, there's ufunc.reduce.
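As a quick check that the three approaches agree, here is a small sketch using the 5x5 example array from above. The variable names are assumptions for this example; the last two lines illustrate the general accumulate/reduce relationship with a different ufunc, np.maximum.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

# Last column of the running sums, the plain row sum, and ufunc.reduce.
last_cumsum = a.cumsum(axis=1)[:, -1]
row_sum = a.sum(axis=1)
reduced = np.add.reduce(a, axis=1)

assert (last_cumsum == row_sum).all()
assert (row_sum == reduced).all()

# The same pattern holds for other ufuncs: reduce gives the final
# value of accumulate without building the intermediate results.
running_max = np.maximum.accumulate(a, axis=1)[:, -1]
final_max = np.maximum.reduce(a, axis=1)
assert (running_max == final_max).all()

print(row_sum)
```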
