Python: adding the row index as a new column to a 2D array

Suppose I have an np.array like the one below:
dat = array([[0, 1, 0],
             [1, 0, 0],
             [0, 0, 1]])
What I want to do is add (row index + 1) as a new column to this array, like this:
newdat = array([[0, 1, 0, 1],
                [1, 0, 0, 2],
                [0, 0, 1, 3]])
How can I achieve this?

You can use np.append(). There is more on what [..., None] does after the output below.
import numpy as np

dat = np.array([
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1]
])
a = np.array(range(1, 4))[..., None]  # None keeps the (n, 1) shape
dat = np.append(dat, a, 1)
print(dat)
The output of this will be:
[[0 1 0 1]
 [1 0 0 2]
 [0 0 1 3]]
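For reference, a minimal sketch of what the [..., None] indexing does (the variable names here are just for illustration):
import numpy as np
v = np.arange(1, 4)        # shape (3,): a 1-D array, not yet a column
col = v[..., None]         # shape (3, 1): a 2-D column vector that can be appended
print(v.shape, col.shape)  # (3,) (3, 1)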
Or you can use np.hstack():
a = np.array(range(1, 4))[..., None]  # None keeps the (n, 1) shape
dat = np.hstack((dat, a))
And, as hpaulj mentioned, np.concatenate also works; see the concatenate documentation and the additional concatenate examples on Stack Overflow:
dat = np.concatenate([dat, a], 1)
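Put together, a minimal runnable sketch of the concatenate route (using np.arange for the index column is just a stylistic choice):
import numpy as np
dat = np.array([[0, 1, 0],
                [1, 0, 0],
                [0, 0, 1]])
idx = np.arange(1, dat.shape[0] + 1)[:, None]  # (n, 1) column holding row index + 1
newdat = np.concatenate([dat, idx], axis=1)
print(newdat)
# [[0 1 0 1]
#  [1 0 0 2]
#  [0 0 1 3]]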

Use numpy.column_stack:
newdat = np.column_stack([dat, range(1,dat.shape[0] + 1)])
print(newdat)
#[[0 1 0 1]
# [1 0 0 2]
# [0 0 1 3]]

Try something like this using numpy.insert():
import numpy as np
dat = np.array([
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1]
])
dat = np.insert(dat, 3, values=[range(1, 4)], axis=1)
print(dat)
Output:
[[0 1 0 1]
 [1 0 0 2]
 [0 0 1 3]]
More generally, you can make use of numpy.ndarray.shape for the appropriate sizing:
dat = np.insert(dat, dat.shape[1], values=[range(1, dat.shape[0] + 1)], axis=1)
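As a quick illustration, the shape-based version applied to a differently sized array (the values below are made up):
import numpy as np
dat = np.array([[5, 6],
                [7, 8],
                [9, 0],
                [1, 2]])
dat = np.insert(dat, dat.shape[1], values=[range(1, dat.shape[0] + 1)], axis=1)
print(dat)
# [[5 6 1]
#  [7 8 2]
#  [9 0 3]
#  [1 2 4]]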

Related

How to change diagonal elements in a matrix from 1 to 0, 0 to 1

Can someone please help with flipping the elements on the diagonal of the matrix rmat: 1 to 0 where the element is 1, and 0 to 1 where it is 0?
mat = np.random.binomial(1,.5,4)
rmat = np.array([mat,]*4)
Thank you
You can use numpy.fill_diagonal.
NB: the operation is in place.
diagonal = rmat.diagonal()
np.fill_diagonal(rmat, 1-diagonal)
input:
array([[1, 1, 1, 0],
       [1, 1, 1, 0],
       [1, 1, 1, 0],
       [1, 1, 1, 0]])
output:
array([[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 0],
       [1, 1, 1, 1]])
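Since fill_diagonal modifies its argument in place, here is a minimal sketch if you want to keep the original matrix untouched (working on a copy is my addition, not part of the answer above):
import numpy as np
rmat = np.array([[1, 1, 1, 0],
                 [1, 1, 1, 0],
                 [1, 1, 1, 0],
                 [1, 1, 1, 0]])
flipped = rmat.copy()  # leave rmat unchanged
np.fill_diagonal(flipped, 1 - flipped.diagonal())
print(flipped)
# [[0 1 1 0]
#  [1 0 1 0]
#  [1 1 0 0]
#  [1 1 1 1]]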
Try this.
Unlike np.fill_diagonal, this method is not in place and doesn't need an explicit copy of the input rmat matrix.
n = rmat.shape[0]
output = np.where(np.eye(n, dtype=bool), np.logical_not(rmat), rmat)
output
# Original
[[0 1 0 0]
 [0 1 0 0]
 [0 1 0 0]
 [0 1 0 0]]
# diagonal inverted
[[1 1 0 0]
 [0 0 0 0]
 [0 1 1 0]
 [0 1 0 1]]
Another way to do this is to use np.diag_indices along with np.logical_not:
n = rmat.shape[0]
idx = np.diag_indices(n)
rmat[idx] = np.logical_not(rmat[idx])
print(rmat)
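A self-contained sketch of the diag_indices approach (the example matrix below is made up for illustration):
import numpy as np
rmat = np.array([[0, 1, 0, 0],
                 [0, 1, 0, 0],
                 [0, 1, 0, 0],
                 [0, 1, 0, 0]])
idx = np.diag_indices(rmat.shape[0])
rmat[idx] = np.logical_not(rmat[idx])  # booleans are cast back to 0/1 on assignment
print(rmat)
# [[1 1 0 0]
#  [0 0 0 0]
#  [0 1 1 0]
#  [0 1 0 1]]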

NumPy indexing problem: advanced indexing, what is X[0] doing here?

import numpy as np
X = np.array([[0, 1, 0, 1], [1, 0, 1, 1], [0, 0, 0, 1], [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])
counts = {}
print(X[y == 0])
# prints [[0 1 0 1]
#         [0 0 0 1]]
I want to know why X[y == 0] prints two data points. Shouldn't it print only [0 1 0 1], because of X[0]?
y == 0 gives an array with the same shape as y, with elements True where the corresponding element in y is 0 and False otherwise.
Here, y has 0 elements at indices 0 and 2. So, X[y == 0] gives you an array containing X[0] and X[2].
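A minimal sketch making the boolean mask explicit:
import numpy as np
X = np.array([[0, 1, 0, 1], [1, 0, 1, 1], [0, 0, 0, 1], [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])
mask = (y == 0)
print(mask)     # [ True False  True False]
print(X[mask])  # same as X[[0, 2]]: rows 0 and 2
# [[0 1 0 1]
#  [0 0 0 1]]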

How can I find the value with the minimum MSE with a numpy array?

My possible values are:
0: [0 0 0 0]
1: [1 0 0 0]
2: [1 1 0 0]
3: [1 1 1 0]
4: [1 1 1 1]
I have some values:
[[0.9539342  0.84090066 0.46451256 0.09715253],
 [0.9923432  0.01231235 0.19491441 0.09715253]
 ....
I want to figure out which of my possible values each of my new values is closest to. Ideally I want to avoid a for loop; is there some sort of vectorized way to search for the minimum mean squared error?
I want it to return an array that looks like: [2, 1 ....
You can use np.argmin to get the index of the lowest error, which can be computed with np.linalg.norm (the Euclidean norm has the same argmin as the RMSE):
import numpy as np
a = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0],[1, 1, 1, 0], [1, 1, 1, 1]])
b = np.array([0.9539342, 0.84090066, 0.46451256, 0.09715253])
np.argmin(np.linalg.norm(a-b, axis=1))
# outputs 2, which corresponds to the value [1, 1, 0, 0]
As mentioned in the edit, b can have multiple rows. The OP wants to avoid a for loop; here is a list comprehension, though there may be a better way:
[np.argmin(np.linalg.norm(a-i, axis=1)) for i in b]
#Outputs [2, 1]
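For the multi-row case, a fully vectorized sketch using broadcasting (this is my own suggestion, not part of the answer above):
import numpy as np
a = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]])
b = np.array([[0.9539342, 0.84090066, 0.46451256, 0.09715253],
              [0.9923432, 0.01231235, 0.19491441, 0.09715253]])
# (2, 1, 4) - (1, 5, 4) broadcasts to (2, 5, 4); the norm over the last axis gives (2, 5)
dists = np.linalg.norm(b[:, None, :] - a[None, :, :], axis=2)
print(dists.argmin(axis=1))  # [2 1]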
Let's assume your input data is a dictionary. You can then use NumPy for a vectorized solution: first convert the input lists to a NumPy array, then use the axis=1 argument to get the RMSE per row.
# Input data
dicts = {0: [0, 0, 0, 0], 1: [1, 0, 0, 0], 2: [1, 1, 0, 0], 3: [1, 1, 1, 0],4: [1, 1, 1, 1]}
new_value = np.array([0.9539342, 0.84090066, 0.46451256, 0.09715253])
# Convert values to array
values = np.array(list(dicts.values()))
# Compute the RMSE and get the index for the least RMSE
rmse = np.mean((values-new_value)**2, axis=1)**0.5
index = np.argmin(rmse)
print ("The closest value is %s" %(values[index]))
# The closest value is [1 1 0 0]
Pure numpy:
val1 = np.array([
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1]
])
print(val1)
val2 = np.array([0.9539342, 0.84090066, 0.46451256, 0.09715253], float)
val3 = np.round(val2, 0)
print(val3)
print(np.where((val1 == val3).all(axis=1)))  # shows a match on row 2: (array([2]),)

How to create a numpy array from itertools.combinations without looping

Is there a way to get this result without a loop? I've made a couple of attempts at fancy indexing with W[range(W.shape[0]),... but have so far been unsuccessful.
import itertools
import numpy as np

n = 4
ct = 2
one_index_tuples = list(itertools.combinations(range(n), r=ct))
W = np.zeros((len(one_index_tuples), n), dtype='int')
for row_index, col_index in enumerate(one_index_tuples):
    W[row_index, col_index] = 1
print(W)
Result:
[[1 1 0 0]
 [1 0 1 0]
 [1 0 0 1]
 [0 1 1 0]
 [0 1 0 1]
 [0 0 1 1]]
You can use fancy indexing (advanced indexing) as follows:
# reshape the row index to 2d since your column index is also 2d so that the row index and
# column index will broadcast properly
W[np.arange(len(one_index_tuples))[:, None], one_index_tuples] = 1
W
#array([[1, 1, 0, 0],
# [1, 0, 1, 0],
# [1, 0, 0, 1],
# [0, 1, 1, 0],
# [0, 1, 0, 1],
# [0, 0, 1, 1]])
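A small sketch of the shapes involved, to make the broadcasting concrete (reusing the variables defined in the question):
rows = np.arange(len(one_index_tuples))[:, None]  # shape (6, 1)
cols = np.array(one_index_tuples)                 # shape (6, 2)
# rows broadcasts against cols to a (6, 2) grid of index pairs,
# so W[rows, cols] = 1 sets two entries in every row at once
print(rows.shape, cols.shape)  # (6, 1) (6, 2)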
Try this:
[[1 if i in x else 0 for i in range(n)] for x in itertools.combinations(range(n), ct)]

Select two rows from a bit array based on an int array in Python

I have two arrays, one int and one bit:
s = [[1]        x = [[1 0 0 0 0]
     [4]             [1 1 1 1 0]
     [9]             [0 1 1 1 0]
     [0]             [0 0 1 0 0]
     [3]]            [0 1 1 0 0]]
I want to find the two smallest elements in s (randomly generated) and then select and print the two corresponding rows from x (also randomly generated). For example, the two smallest elements here are s[3] = 0 and s[0] = 1, so I want to select x[3] = [0 0 1 0 0] and x[0] = [1 0 0 0 0].
import numpy as np
np.set_printoptions(threshold=np.nan)
s = np.random.randint(5, size=(5))
x = np.random.randint(2, size=(5, 5))
print (s)
print (x)
I tried my best using a for loop but had no luck; any advice would be appreciated.
You can use numpy.argpartition to find the indices of the two smallest elements of s and use them as row indices to subset x:
s
# array([3, 0, 0, 1, 2])
x
# array([[1, 0, 0, 0, 1],
# [1, 0, 1, 1, 1],
# [0, 0, 1, 0, 0],
# [1, 0, 0, 1, 1],
# [0, 0, 1, 0, 1]])
x[s.argpartition(2)[:2], :]
# array([[1, 0, 1, 1, 1],
# [0, 0, 1, 0, 0]])
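Note that argpartition only guarantees that the two smallest indices come first, not that those two are sorted between themselves. If you want the rows ordered by ascending s value, a sketch using np.argsort instead (my addition, not part of the answer above):
x[np.argsort(s)[:2], :]
# array([[1, 0, 1, 1, 1],
#        [0, 0, 1, 0, 0]])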
