Form a numpy array from indices and fill with zeroes - python

I have a numpy array a = np.array([483, 39, 18, 999, 20, 48])
I have an array of indices indices = np.array([2, 3])
I would like to keep the values at those indices and fill every other position with 0, so that the result is:
np.array([0, 0, 18, 999, 0, 0])
Thank you for your answer.

Create an all zeros array and copy the values at the desired indices:
import numpy as np
a = np.array([483, 39, 18, 999, 20, 48])
indices = np.array([2, 3])
b = np.zeros_like(a)
b[indices] = a[indices]
# a = b # if needed
print(a)
print(indices)
print(b)
Output:
[483 39 18 999 20 48]
[2 3]
[ 0 0 18 999 0 0]
Hope that helps!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.1
NumPy: 1.18.1
----------------------------------------
EDIT: Even better, use np.setdiff1d:
import numpy as np
a = np.array([483, 39, 18, 999, 20, 48])
indices = np.array([2, 3])
print(a)
print(indices)
a[np.setdiff1d(np.arange(a.shape[0]), indices, True)] = 0
print(a)
Output:
[483 39 18 999 20 48]
[2 3]
[ 0 0 18 999 0 0]

What about using list comprehension?
a = np.array([n if i in indices else 0 for i, n in enumerate(a)])
print(a) #array([ 0, 0, 18, 999, 0, 0])

You can create a function that uses the input array and the index array to do this, as in the following:
import numpy as np
def remove_by_index(input_array, indexes):
    for i, _ in enumerate(input_array):
        if i not in indexes:
            input_array[i] = 0
    return input_array
input_array = np.array([483, 39, 18, 999, 20, 48])
indexes = np.array([2, 3])
new_out = remove_by_index(input_array, indexes)
expected_out = np.array([0, 0, 18, 999, 0, 0])
print(new_out == expected_out) # to check if it's correct
Edit
You can also use a list comprehension inside the function, which is more concise (note that this version returns a list, not an array):
def remove_by_index(input_array, indexes):
    return [input_array[i] if (i in indexes) else 0 for i, _ in enumerate(input_array)]
As pointed out in the comments, this is not the most efficient approach, since it iterates at the Python level instead of the C level, but it works and is adequate for casual use.
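For larger arrays, the same result can be obtained without a Python-level loop. The following is a minimal sketch, not taken from the answers above, that builds a boolean mask with np.isin and selects with np.where. The helper name keep_only is made up for illustration, and a 1-d input is assumed.

```python
import numpy as np

def keep_only(values, indices):
    """Return a copy of a 1-d array with every position not in indices set to 0."""
    values = np.asarray(values)
    # True at the positions listed in indices, False everywhere else
    keep = np.isin(np.arange(values.size), indices)
    # Take the original value where keep is True, otherwise 0
    return np.where(keep, values, 0)

a = np.array([483, 39, 18, 999, 20, 48])
indices = np.array([2, 3])
print(keep_only(a, indices))
```

Unlike the loop version, this leaves the input array untouched and returns a new array.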

Related

replace numbers from one list to another

I have a matrix (3-d array).
"a" and "b" have shape (5, 3, depth), where the depth can vary: in this example it is 2, but it could be 3, 4 or 6. I am looking for a function that works with different depths.
a= [[[10,15,10,9,45], [2,21,78,14,96], [2,2,78,14,96], [3,34,52,87,21], [52,14,45,85,74] ], [[52,14,45,85,74], [2,2,78,14,96], [15,41,48,48,74], [3,34,52,87,21], [14,71,84,85,41]]]
b= [[[0,1,0,1,1], [2,2,1,0,0], [2,2,1,1,0], [0,0,0,0,1], [0,0,1,1,1] ], [[0,0,0,1,1], [0,1,1,1,2], [2,2,2,2,0], [0,0,0,1,1], [1,0,0,0,1]]]
I want a matrix "c" that is a copy of "a", except that wherever a value in "b" is 0, the corresponding value in "c" is also 0.
c= [[[0,15,0,9,45], [2,21,78,0,0], [2,21,78,14,0], [0,0,0,0,21], [0,0,45,85,74] ], [[0,0,0,85,74], [0,2,78,14,96], [15,41,48,48,0], [0,0,0,87,21], [14,0,0,0,41]]]
Thanks for your help.
Use numpy arrays and element-wise multiplication:
import numpy as np
a= [[[10,15,10,9,45], [2,21,78,14,96], [2,2,78,14,96], [3,34,52,87,21], [52,14,45,85,74] ], [[52,14,45,85,74], [2,2,78,14,96], [15,41,48,48,74], [3,34,52,87,21], [14,71,84,85,41]]]
b= [[[0,1,0,1,1], [2,2,1,0,0], [2,2,1,1,0], [0,0,0,0,1], [0,0,1,1,1] ], [[0,0,0,1,1], [0,1,1,1,2], [2,2,2,2,0], [0,0,0,1,1], [1,0,0,0,1]]]
result = np.array(a) * np.array(b)
result = result.tolist() # if you want the final result as a list
print(result)
[[[0, 15, 0, 9, 45], [4, 42, 78, 0, 0], [4, 4, 78, 14, 0], [0, 0, 0, 0, 21], [0, 0, 45, 85, 74]], [[0, 0, 0, 85, 74], [0, 2, 78, 14, 192], [30, 82, 96, 96, 0], [0, 0, 0, 87, 21], [14, 0, 0, 0, 41]]]
Note: In the question you mention 5x3 blocks, but your example is 5x5. I'll assume the correct format is 5x5xDepth.
Without using numpy
Let's define our depth and a result block:
depth = len(a)
result = []
Then we can iterate through our block:
for x in range(depth):
    # This is a 5 x 5 array
    block_2d = []
    for y in range(5):
        # This is our final dimension, an array of length 5
        block_1d = []
        for z in range(5):
            # Keep a's value unless b is 0
            if b[x][y][z] == 0:
                block_1d.append(0)
            else:
                block_1d.append(a[x][y][z])
        # Add our row to the current 2D block
        block_2d.append(block_1d)
    # Add our 2D block to the result
    result.append(block_2d)
Recursive alternative
A more advanced solution
def convert_list(a, b):
    if isinstance(a, list):
        # Recursive call on all sublists
        return [convert_list(a[i], b[i]) for i in range(len(a))]
    else:
        # When we reach an element, return a if b is not 0
        return a if b != 0 else 0
This is a recursive function, so you don't need to worry about the dimensions of your block (as long as a and b have the same shape).
Using numpy
Inspired by msamsami's answer, with an extra step that converts all non-zero numbers in b to 1, so the multiplication does not scale the values of a (zeros stay 0, so they still filter out the values of a):
import numpy as np
# Convert b to nested booleans (False where b is 0), to avoid multiplying by 2
def toBinary(item):
    if isinstance(item, list):
        return [toBinary(x) for x in item]
    else:
        return item != 0
mask = toBinary(b)  # named mask to avoid shadowing the built-in filter
result = np.array(a) * np.array(mask)
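If the goal is only to zero out the positions where b is 0, a boolean condition avoids the multiplication entirely. This is a minimal sketch using np.where, assuming a and b are nested lists of the same shape as in the question; it works for any depth because np.where operates element-wise.

```python
import numpy as np

a = [[[10, 15, 10, 9, 45], [2, 21, 78, 14, 96], [2, 2, 78, 14, 96],
      [3, 34, 52, 87, 21], [52, 14, 45, 85, 74]],
     [[52, 14, 45, 85, 74], [2, 2, 78, 14, 96], [15, 41, 48, 48, 74],
      [3, 34, 52, 87, 21], [14, 71, 84, 85, 41]]]
b = [[[0, 1, 0, 1, 1], [2, 2, 1, 0, 0], [2, 2, 1, 1, 0],
      [0, 0, 0, 0, 1], [0, 0, 1, 1, 1]],
     [[0, 0, 0, 1, 1], [0, 1, 1, 1, 2], [2, 2, 2, 2, 0],
      [0, 0, 0, 1, 1], [1, 0, 0, 0, 1]]]

a_arr = np.array(a)
b_arr = np.array(b)
# Keep a's value where b is non-zero, otherwise write 0
c = np.where(b_arr != 0, a_arr, 0)
print(c.tolist())
```

Because the values of a are copied rather than multiplied, entries where b is 2 are left unscaled.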

How to efficiently shuffle some values of a numpy array while keeping their relative order?

I have a numpy array and a mask specifying which entries from that array to shuffle while keeping their relative order. Let's have an example:
In [2]: arr = np.array([5, 3, 9, 0, 4, 1])
In [4]: mask = np.array([True, False, False, False, True, True])
In [5]: arr[mask]
Out[5]: array([5, 4, 1]) # These entries shall be shuffled inside arr, while keeping their order.
In [6]: np.where(mask==True)
Out[6]: (array([0, 4, 5]),)
In [7]: shuffle_array(arr, mask) # I'm looking for an efficient realization of this function!
Out[7]: array([3, 5, 4, 9, 0, 1]) # See how the entries 5, 4 and 1 haven't changed their order.
I've written some code that can do this, but it's really slow.
import numpy as np
def shuffle_array(arr, mask):
    perm = np.arange(len(arr)) # permutation array
    n = mask.sum()
    if n > 0:
        old_true_pos = np.where(mask == True)[0] # old positions for which mask is True
        old_false_pos = np.where(mask == False)[0] # old positions for which mask is False
        new_true_pos = np.random.choice(perm, n, replace=False) # draw new positions
        new_true_pos.sort()
        new_false_pos = np.setdiff1d(perm, new_true_pos)
        new_pos = np.hstack((new_true_pos, new_false_pos))
        old_pos = np.hstack((old_true_pos, old_false_pos))
        perm[new_pos] = perm[old_pos]
    return arr[perm]
To make things worse, I actually have two large matrices A and B with shape (M,N). Matrix A holds arbitrary values, while each row of matrix B is the mask which to use for shuffling one corresponding row of matrix A according to the procedure that I outlined above. So what I want is shuffled_matrix = row_wise_shuffle(A, B).
The only way I have so far found to do it is via my shuffle_array() function and a for loop.
Can you think of any numpy'onic way to accomplish this task avoiding loops? Thank you so much in advance!
For 1d case:
import numpy as np
a = np.arange(8)
b = np.array([1,1,1,1,0,0,0,0])
# Get ordered values
ordered_values = a[np.where(b==1)]
# We'll shuffle both arrays
shuffled_ix = np.random.permutation(a.shape[0])
a_shuffled = a[shuffled_ix]
b_shuffled = b[shuffled_ix]
# Replace the values with correct order
a_shuffled[np.where(b_shuffled==1)] = ordered_values
a_shuffled # Notice that 0, 1, 2, 3 preserves order.
>>>
array([0, 1, 2, 6, 3, 4, 7, 5])
For the 2d case, column-wise shuffle (along axis=1):
import numpy as np
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,0,0,1,1], [1,1,1,0,0,0], [1,1,1,1,0,0], [0,0,1,1,0,0]])
# The code below works for column shuffle (i.e. axis=1).
# Get ordered values
i,j = np.where(b==1)
values = a[i, j]
values
# We'll shuffle both arrays for axis=1
# taken from https://stackoverflow.com/questions/5040797/shuffling-numpy-array-along-a-given-axis
idx = np.random.rand(*a.shape).argsort(axis=1)
a_shuffled = np.take_along_axis(a,idx,axis=1)
b_shuffled = np.take_along_axis(b,idx,axis=1)
# Replace the values with correct order
a_shuffled[np.where(b_shuffled==1)] = values
# Get the result
a_shuffled # see that 4,5 | 6,7,8 | 12,13,14,15 | 20, 21 preserves order
>>>
array([[ 4, 1, 0, 3, 2, 5],
[ 9, 6, 7, 11, 8, 10],
[12, 13, 16, 17, 14, 15],
[23, 20, 19, 22, 21, 18]])
For the 2d case with a row-wise shuffle (along axis=0), we can use the same code: first transpose the arrays, shuffle, and then transpose back:
import numpy as np
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,0,0,1,1], [1,1,1,0,0,0], [1,1,1,1,0,0], [0,0,1,1,0,0]])
# The code below works for column shuffle (i.e. axis=1).
# As you said rowwise, we first transpose
at = a.T
bt = b.T
# Get ordered values
i,j = np.where(bt==1)
values = at[i, j]
values
# We'll shuffle both arrays for axis=1
# taken from https://stackoverflow.com/questions/5040797/shuffling-numpy-array-along-a-given-axis
idx = np.random.rand(*at.shape).argsort(axis=1)
at_shuffled = np.take_along_axis(at,idx,axis=1)
bt_shuffled = np.take_along_axis(bt,idx,axis=1)
# Replace the values with correct order
at_shuffled[np.where(bt_shuffled==1)] = values
# Get the result
a_shuffled = at_shuffled.T
a_shuffled # see that 6,12 | 7, 13 | 8,14,20 | 15, 21 preserves order
>>>
array([[ 6, 7, 2, 3, 10, 17],
[18, 19, 8, 15, 16, 23],
[12, 13, 14, 21, 4, 5],
[ 0, 1, 20, 9, 22, 11]])
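The axis=1 variant above can be wrapped into the row_wise_shuffle(A, B) function the question asks for. The following is a sketch under a few assumptions: B is convertible to a boolean array of the same shape as A, each row of B is the mask for the matching row of A, and the optional rng parameter (not part of the original code) is a numpy Generator added here for reproducibility.

```python
import numpy as np

def row_wise_shuffle(A, B, rng=None):
    """Shuffle each row of A; entries flagged in B keep their relative order."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A)
    B = np.asarray(B, dtype=bool)
    # Masked values in row-major order, i.e. row by row in original order
    ordered = A[B]
    # Independent random permutation of the column positions for every row
    idx = rng.random(A.shape).argsort(axis=1)
    A_shuffled = np.take_along_axis(A, idx, axis=1)
    B_shuffled = np.take_along_axis(B, idx, axis=1)
    # Each row keeps its count of True entries, so a row-major assignment
    # puts every row's masked values back in their original order
    A_shuffled[B_shuffled] = ordered
    return A_shuffled

A = np.arange(24).reshape(4, 6)
B = np.array([[0, 0, 0, 0, 1, 1], [1, 1, 1, 0, 0, 0],
              [1, 1, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0]], dtype=bool)
print(row_wise_shuffle(A, B, np.random.default_rng(0)))
```

The unmasked entries of each row are fully shuffled, while the masked entries land in random positions but in their original sequence.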

Value to Assign to Missing Values in uint Numpy Array

A numpy array z is constructed from 2 Python lists x and y where values of y can be 0 and values of x are not continuously incrementing (i.e. values can be skipped).
Since y values can also be 0, it will be confusing to assign missing values in z to be 0 as well.
What is the best practice to avoid this confusion?
import numpy as np
# Construct `z`
x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
z = np.zeros(max(x)+1, dtype=np.uint32) # missing values become 0 (np.ndarray would leave them uninitialized)
for i in range(len(x)):
    z[x[i]] = y[i]
print(z) # [ 0 12 34 56 0 0 0 0 78 0 0 0 0 0]
print(z[4]) # missing value but is assigned 0
print(z[13]) # non-missing value but also assigned 0
Solution
You would typically assign np.nan, or some other sentinel value, to the indices that do not exist in x.
Also, there is no need for the for loop: you can assign all the values of y in one line, as shown in the code below.
However, since you are casting to uint32, you cannot use np.nan, because NaN is a floating-point value and has no integer representation. Instead, you could use a large number of your choice (for example, 999999) that, by design, will never show up in y. For more details, please refer to the links in the References section below.
import numpy as np
x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
# cannot use np.nan with uint32 as np.nan is treated as a float
# choose some large value instead: 999999
z = np.ones(max(x)+1).astype(np.uint32) * 999999
z[x] = y
z
# array([999999, 12, 34, 56, 999999, 0, 999999, 999999,
# 78, 999999, 999999, 999999, 999999, 0], dtype=uint32)
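One possible variation, offered as a sketch rather than as part of the answer above, is to take the sentinel from the dtype itself instead of picking an arbitrary number. This assumes the largest uint32 value never occurs as real data in y; the names MISSING and present are made up for illustration.

```python
import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Largest value representable in uint32, used here as the "missing" marker
MISSING = np.iinfo(np.uint32).max
# np.full creates the array already filled with the sentinel
z = np.full(max(x) + 1, MISSING, dtype=np.uint32)
z[x] = y

# Boolean array that is True only where a real value was assigned
present = z != MISSING
print(z[4] == MISSING)   # index 4 was never assigned
print(z[13], present[13])  # index 13 holds a genuine 0
```

Keeping a separate boolean array such as present is another way to track missing entries if no value can be reserved as a sentinel.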
References
Numpy integer nan
Maximum and minimum value of C types integers from Python

Python - mutate row elements using array of indices

given the following array:
import numpy as np
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
I can create an array of indices:
b = np.array([0, 2, 0, 1])
and mutate one element from each row using the indices:
a[np.arange(4),b] += 10
which yields:
[[11 2 3]
[ 4 5 16]
[17 8 9]
[10 21 12]]
Is there a more readable way to achieve the same result instead of a[np.arange(4),b] += 10?
Maybe writing it out more explicitly would help for "readability":
x = np.array([0, 2, 0, 1])
y = np.arange(x.size)
a[y, x] += 10
Otherwise, you are doing it in a very clear and succinct way, in my opinion.
Another option is to use a ufunc:
np.add.at(a, (y, x), 10)
Or if you prefer not to use np.arange:
y = np.indices((x.size,))[0]
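To compare the options side by side, here is a small self-contained sketch. The first two forms come from the answers above; the third, a broadcasted one-hot mask, is an additional alternative not mentioned in the thread, and the variable names are chosen here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
cols = np.array([0, 2, 0, 1])      # column to modify in each row
rows = np.arange(a.shape[0])       # one row index per entry of cols

# 1) Fancy indexing with in-place addition
via_index = a.copy()
via_index[rows, cols] += 10

# 2) Unbuffered ufunc call
via_add_at = a.copy()
np.add.at(via_add_at, (rows, cols), 10)

# 3) Broadcasted mask: True exactly at (row, cols[row])
mask = np.arange(a.shape[1]) == cols[:, None]
via_mask = a + 10 * mask

print(via_index)
```

The first two forms differ only when the same (row, column) pair appears more than once: np.add.at accumulates every occurrence, whereas the indexed += applies the addition once per distinct position.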

Combine list of numpy arrays and reshape

I'm hoping anybody could help me with the following.
I have 2 lists of arrays, which should be linked to each-other. Each list stands for a certain object. arr1 and arr2 are the attributes of that object.
For example:
import numpy as np
arr1 = [np.array([1, 2, 3]), np.array([1, 2]), np.array([2, 3])]
arr2 = [np.array([20, 50, 30]), np.array([50, 50]), np.array([75, 25])]
The arrays are linked element by element: for example, the 1 in the first array of arr1 belongs to the 20 in the first array of arr2. The result I'm looking for in this example is a numpy array of shape (3, 4). The columns stand for 0, 1, 2, 3 (the numbers in arr1, plus 0) and the rows are filled with the corresponding values of arr2. Where there is no corresponding value, the cell should be 0.
Example:
array([[ 0, 20, 50, 30],
[ 0, 50, 50, 0],
[ 0, 0, 75, 25]])
How would I link these two list of arrays and reshape them in the desired format as shown in the above example?
Many thanks!
Here's an almost* vectorized approach -
lens = np.array([len(i) for i in arr1])
N = len(arr1)
row_idx = np.repeat(np.arange(N),lens)
col_idx = np.concatenate(arr1)
M = col_idx.max()+1
out = np.zeros((N,M),dtype=int)
out[row_idx,col_idx] = np.concatenate(arr2)
*: Almost, because of the list comprehension at the start, but that should be negligible as it only collects the lengths and involves no real computation.
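To make the indexing step concrete, here is the same approach run on the question's data with the intermediate index arrays printed. This is only a walk-through of the code above, with the imports and inputs added so it runs on its own.

```python
import numpy as np

arr1 = [np.array([1, 2, 3]), np.array([1, 2]), np.array([2, 3])]
arr2 = [np.array([20, 50, 30]), np.array([50, 50]), np.array([75, 25])]

lens = np.array([len(i) for i in arr1])
N = len(arr1)
# Row number repeated once per element of the matching arr1 entry
row_idx = np.repeat(np.arange(N), lens)
# Column numbers are simply the values stored in arr1
col_idx = np.concatenate(arr1)
M = col_idx.max() + 1

out = np.zeros((N, M), dtype=int)
# Each (row, column) pair receives the matching value from arr2
out[row_idx, col_idx] = np.concatenate(arr2)

print(row_idx)  # [0 0 0 1 1 2 2]
print(col_idx)  # [1 2 3 1 2 2 3]
print(out)
```

Each position k pairs row_idx[k] with col_idx[k], so the flattened values of arr2 are scattered into the zero matrix in a single assignment.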
Here is a solution with for-loops, showing each step in detail.
import numpy as np
arr1 = [np.array([1, 2, 3]), np.array([1, 2]), np.array([2, 3])]
arr2 = [np.array([20, 50, 30]), np.array([50, 50]), np.array([75, 25])]
# Find the largest column index that appears in arr1
maxi = []
for i in range(len(arr1)):
    maxi.append(np.max(arr1[i]))
maxi = np.max(maxi)
# One row per array, one column per index 0..maxi
output = np.zeros((len(arr2), maxi + 1), dtype=int)
for i in range(len(arr1)):
    for k in range(len(arr1[i])):
        # Place each value in the column given by arr1
        output[i][arr1[i][k]] = arr2[i][k]
This is a straightforward approach, with only one level of iteration:
In [261]: res=np.zeros((3,4),int)
In [262]: for i,(idx,vals) in enumerate(zip(arr1, arr2)):
     ...:     res[i,idx]=vals
     ...:
In [263]: res
Out[263]:
array([[ 0, 20, 50, 30],
[ 0, 50, 50, 0],
[ 0, 0, 75, 25]])
I suspect it is faster than @Divakar's approach for this example. And it should remain competitive as long as the number of columns is quite a bit larger than the number of rows.
