Matrix of labels to adjacency matrix - python

Just wondering if there is an off-the-shelf function to perform the following operation; given a matrix X, holding labels (that can be assumed to be integer numbers 0-to-N) in each entry e.g.:
X = [[0 1 1 2 2 3 3 3],
[0 1 1 2 2 3 3 4],
[0 1 5 5 5 5 3 4]]
I want its adjacency matrix G i.e. G[i,j] = 1 if i,j are adjacent in X and 0 otherwise.
For example G[1,2] = 1, because 1,2 are adjacent in (X[0,2],X[0,3]), (X[1,2],X[1,3]) etc..
The naive solution is to loop through all entries and check each one's neighbors, but I'd rather avoid loops for performance reasons.

You can use fancy indexing to assign the values of G directly from your X array:
import numpy as np
X = np.array([[0,1,1,2,2,3,3,3],
[0,1,1,2,2,3,3,4],
[0,1,5,5,5,5,3,4]])
G = np.zeros([X.max() + 1]*2)
# left-right pairs
G[X[:, :-1], X[:, 1:]] = 1
# right-left pairs
G[X[:, 1:], X[:, :-1]] = 1
# top-bottom pairs
G[X[:-1, :], X[1:, :]] = 1
# bottom-top pairs
G[X[1:, :], X[:-1, :]] = 1
print(G)
#array([[ 1., 1., 0., 0., 0., 0.],
# [ 1., 1., 1., 0., 0., 1.],
# [ 0., 1., 1., 1., 0., 1.],
# [ 0., 0., 1., 1., 1., 1.],
# [ 0., 0., 0., 1., 1., 0.],
# [ 0., 1., 1., 1., 0., 1.]])
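The four assignments above can be wrapped in a small helper. This is only a sketch, not part of the original answer: the name label_adjacency and the keep_self flag are hypothetical, and it assumes X is a 2-D integer array whose labels run from 0 to N. Clearing the diagonal is optional, since a label counts as adjacent to itself whenever it covers two neighboring cells.

```python
import numpy as np

def label_adjacency(X, keep_self=True):
    """Hypothetical helper: adjacency matrix of the labels in a 2-D integer array."""
    X = np.asarray(X)
    n = X.max() + 1
    G = np.zeros((n, n), dtype=int)
    # horizontal neighbors: (left cell, right cell)
    left, right = X[:, :-1].ravel(), X[:, 1:].ravel()
    # vertical neighbors: (top cell, bottom cell)
    top, bottom = X[:-1, :].ravel(), X[1:, :].ravel()
    src = np.concatenate([left, top])
    dst = np.concatenate([right, bottom])
    G[src, dst] = 1
    G[dst, src] = 1  # write each pair in both orders so G is symmetric
    if not keep_self:
        np.fill_diagonal(G, 0)
    return G

X = np.array([[0, 1, 1, 2, 2, 3, 3, 3],
              [0, 1, 1, 2, 2, 3, 3, 4],
              [0, 1, 5, 5, 5, 5, 3, 4]])
G = label_adjacency(X)
print(G)
```

With keep_self=False the same call returns the matrix with a zero diagonal, which is often what graph libraries expect.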


Generate 3D "matrix" with Pandas, based on comparing two dataframes [Python]

Good morning everyone. I am working with Python and Pandas.
I have two DataFrames, of the following type:
df_C = pd.DataFrame(data=[[-3,-1,-1], [5,3,3], [3,3,1], [-1,-1,-3], [-3,-1,-1], [2,3,1], [1,1,1]], columns=['C1','C2','C3'])
C1 C2 C3
0 -3 -1 -1
1 5 3 3
2 3 3 1
3 -1 -1 -3
4 -3 -1 -1
5 2 3 1
6 1 1 1
df_F = pd.DataFrame(data=[[-1,1,-1,-1,-1],[1,1,1,1,1],[1,1,1,-1,1],[1,-1,-1,-1,1],[-1,0,0,-1,-1],[1,1,1,-1,0],[1,1,-1,1,-1]], columns=['F1','F2','F3','F4','F5'])
F1 F2 F3 F4 F5
0 -1 1 -1 -1 -1
1 1 1 1 1 1
2 1 1 1 -1 1
3 1 -1 -1 -1 1
4 -1 0 0 -1 -1
5 1 1 1 -1 0
6 1 1 -1 1 -1
I would like to be able to "cross" these two DataFrames to generate a new 3D one, as follows:
The new data that is generated must compare the values of the df_F with the values of the df_C, taking into account the following:
If both values are positive, generate 1
If both values are negative, generate 1
If one value is positive and the other negative, it generates 0
If any of the values is zero, it generates None (NaN)
Truth table
Comparison of the data df_C vs df_F
df_C   df_F   3D
+      +      1
+      -      0
+      0      None
-      +      0
-      -      1
-      0      None
0      +      None
0      -      None
0      0      None
Could you please guide me on how to generate this matrix by comparing the values? I would like to do it with Pandas. I have done it with loops (for) and conditions (if), but the result is visually unpleasant, and I think Pandas would be more efficient and elegant.
Thank you.
Numpy broadcasting and np.select
Broadcast the values of df_C against the values of df_F and multiply them so that the product has shape (3, 7, 5): one 7 x 5 slice per column of df_C. Then test where the product is positive, negative or zero, and assign 1, 0 and NaN respectively.
a = df_C.values.T[:, :, None] * df_F.values
a = np.select([a > 0, a < 0], [1, 0], np.nan)
array([[[ 1., 0., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 0., 1.],
[ 0., 1., 1., 1., 0.],
[ 1., nan, nan, 1., 1.],
[ 1., 1., 1., 0., nan],
[ 1., 1., 0., 1., 0.]],
[[ 1., 0., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 0., 1.],
[ 0., 1., 1., 1., 0.],
[ 1., nan, nan, 1., 1.],
[ 1., 1., 1., 0., nan],
[ 1., 1., 0., 1., 0.]],
[[ 1., 0., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 0., 1.],
[ 0., 1., 1., 1., 0.],
[ 1., nan, nan, 1., 1.],
[ 1., 1., 1., 0., nan],
[ 1., 1., 0., 1., 0.]]])
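To see the broadcasting step end to end, here is a minimal runnable sketch. The two small frames are made up for illustration (they are not the question's data), and slice_C1 is a hypothetical name for one 2-D slice turned back into a DataFrame.

```python
import numpy as np
import pandas as pd

# Small made-up frames with the same layout as df_C and df_F.
df_C = pd.DataFrame([[-3, 2], [5, 0]], columns=['C1', 'C2'])
df_F = pd.DataFrame([[-1, 1, 0], [1, -1, 1]], columns=['F1', 'F2', 'F3'])

# (n_C_cols, n_rows, 1) * (n_rows, n_F_cols) -> (n_C_cols, n_rows, n_F_cols)
prod = df_C.to_numpy().T[:, :, None] * df_F.to_numpy()

# same sign -> 1, opposite sign -> 0, any zero -> NaN
result = np.select([prod > 0, prod < 0], [1, 0], default=np.nan)
print(result.shape)

# One 2-D slice per column of df_C, labelled with the columns of df_F.
slice_C1 = pd.DataFrame(result[0], columns=df_F.columns)
print(slice_C1)
```

The sign of a product is positive exactly when both factors share a sign, which is why a single multiplication covers the whole truth table.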

how to fix "index 3 is out of bounds for axis 1 with size 3" in one-hot encoding? [duplicate]

This question already has answers here:
Convert array of indices to one-hot encoded array in NumPy
I am working on one-hot encoding in Python, but I hit a problem when I run this function:
def one_hot_encode(labels):
    n_labels = len(labels)
    n_unique_labels = len(np.unique(labels))
    one_hot_encode = np.zeros((n_labels,n_unique_labels))
    one_hot_encode[np.arange(n_labels), labels] = 1
    return one_hot_encode
This is the function I use for one-hot encoding, and the data looks like this:
[3 3 3 3 3 2 2 2 2 2 1 1 1 1 1]
It raises this error:
"index 3 is out of bounds for axis 1 with size 3"
I also tried another approach, changing one line of the code:
one_hot_encode = np.zeros((n_labels,n_unique_labels+1))
This runs, but the result has 4 columns instead of one per class (3).
The result looks like this:
array([[0., 0., 0., 1.],
[0., 0., 0., 1.],
[0., 0., 0., 1.],
[0., 0., 0., 1.],
[0., 0., 0., 1.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 1., 0., 0.],
[0., 1., 0., 0.],
[0., 1., 0., 0.],
[0., 1., 0., 0.],
[0., 1., 0., 0.]])
How do I fix this problem?
The error comes from the labels [3 3 3 3 3 2 2 2 2 2 1 1 1 1 1]. They contain the value 3, so for some rows you try to set column index 3 to 1, but the one-hot array has only 3 columns, so the largest valid column index is 2.
def one_hot_encode(labels):
    n_labels = len(labels) # this will give 15
    n_unique_labels = len(np.unique(labels)) # this will give 3
    one_hot_encode = np.zeros((n_labels,n_unique_labels)) # will create a 15x3 matrix
    one_hot_encode[np.arange(n_labels), labels] = 1 # error here: column index 3 does not exist
    return one_hot_encode
Simply change your label array from [3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1] to [2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0], so that the labels run from 0 to 2.
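If relabelling by hand is impractical, np.unique with return_inverse=True produces the 0-based codes directly. A minimal sketch, assuming the labels form a 1-D array; the function name one_hot_from_labels is hypothetical:

```python
import numpy as np

def one_hot_from_labels(labels):
    # classes: sorted unique labels; codes: position of each label in classes
    classes, codes = np.unique(labels, return_inverse=True)
    codes = codes.ravel()
    one_hot = np.zeros((len(codes), len(classes)))
    one_hot[np.arange(len(codes)), codes] = 1
    return one_hot, classes

labels = np.array([3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1])
one_hot, classes = one_hot_from_labels(labels)
print(one_hot.shape)
print(classes)
```

Returning classes alongside the matrix keeps the mapping from column index back to the original label, so column 2 can still be read as label 3.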

set all neighboring array items equal to 1 if adjacent to nonzero value

I am looking for an efficient way to set all array indexes neighboring a non-zero value equal to 1. So for example, if I have an array called arr that looks like the following:
import numpy as np
arr = np.zeros((5,5))
arr[1,1] = arr[2,2] = arr[3,3] = arr[4,0] = 1
arr
# array([[ 0., 0., 0., 0., 0.],
# [ 0., 1., 0., 0., 0.],
# [ 0., 0., 1., 0., 0.],
# [ 0., 0., 0., 1., 0.],
# [ 1., 0., 0., 0., 0.]])
Is there an easy way to get an array called arr2 that keeps all non-zero values, but also sets the left/right and up/down neighbors of non-zero values to 1? So in this small reproducible example, I would like a result that looks like:
arr2
# array([[ 0., 1., 0., 0., 0.],
# [ 1., 1., 1., 0., 0.],
# [ 0., 1., 1., 1., 0.],
# [ 1., 0., 1., 1., 1.],
# [ 1., 1., 0., 1., 0.]])
Diagonal neighbors aren't considered.
Based on the comments (@Code-Apprentice), here is what I was able to work out. It doesn't seem very 'pythonic' (i.e., elegant), but it gets the job done. I was hoping there was a simpler alternative to this brute-force method for much larger 'real world' applications.
import numpy as np
arr = np.zeros((5,5))
arr[1,1] = arr[2,2] = arr[3,3] = arr[4,0] = 1
arr2 = arr.copy()
for i in np.arange(0, arr.shape[0]):
    for j in np.arange(0, arr.shape[1]):
        # indices of the four direct neighbors
        im1 = i - 1
        ip1 = i + 1
        jm1 = j - 1
        jp1 = j + 1
        # set each in-bounds neighbor of a non-zero cell to 1
        if(im1 >= 0):
            if(arr[i, j] == 1 and arr[im1, j] == 0):
                arr2[im1, j] = 1
        if(ip1 < arr.shape[0]):
            if(arr[i, j] == 1 and arr[ip1, j] == 0):
                arr2[ip1, j] = 1
        if(jm1 >= 0):
            if(arr[i, j] == 1 and arr[i, jm1] == 0):
                arr2[i, jm1] = 1
        if(jp1 < arr.shape[1]):
            if(arr[i, j] == 1 and arr[i, jp1] == 0):
                arr2[i, jp1] = 1
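The double loop above can also be expressed with shifted slices, which avoids Python-level iteration. This is a sketch of one possible vectorized alternative, not the original poster's method; it uses the array from the question (with the fourth 1 in the bottom-left corner), and the names mask and near are illustrative.

```python
import numpy as np

arr = np.zeros((5, 5))
arr[1, 1] = arr[2, 2] = arr[3, 3] = arr[4, 0] = 1

mask = arr != 0               # cells that already hold a value
near = mask.copy()
near[1:, :] |= mask[:-1, :]   # cell directly below a non-zero value
near[:-1, :] |= mask[1:, :]   # cell directly above
near[:, 1:] |= mask[:, :-1]   # cell directly to the right
near[:, :-1] |= mask[:, 1:]   # cell directly to the left

# keep the original non-zero values, set newly reached neighbors to 1
arr2 = np.where(mask, arr, near.astype(arr.dtype))
print(arr2)
```

Each shifted slice stops at the array border, so no explicit bounds checks are needed and diagonal neighbors are never touched.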

Keras np_utils.to_categorical behaves differently

Why does Keras to_categorical behave differently on [1, -1] and [2, -2]?
y = [1, -1, -1]
y_ = np_utils.to_categorical(y)
array([[ 0., 1.],
[ 0., 1.],
[ 0., 1.]])
y = [2, -2, -2]
y_ = np_utils.to_categorical(y)
array([[ 0., 0., 1.],
[ 0., 1., 0.],
[ 0., 1., 0.]])
to_categorical does not handle negative values. If your dataset has negative values, you can pass y - y.min() to to_categorical so that it works as you would expect:
>>> y = numpy.array([2, -2, -2])
>>> to_categorical(y)
array([[ 0., 0., 1.],
[ 0., 1., 0.],
[ 0., 1., 0.]])
>>> to_categorical(y - y.min())
array([[ 0., 0., 0., 0., 1.],
[ 1., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0.]])
y = np.array(y, dtype='int').ravel()
if not num_classes:
    num_classes = np.max(y) + 1
n = y.shape[0]
categorical = np.zeros((n, num_classes))
categorical[np.arange(n), y] = 1
Above is the implementation of to_categorical.
In the [1, -1, -1] case, this is what happens:
num_classes = 2 (np.max(y) + 1)
the shape of categorical becomes [3, 2]
When -1 comes up, NumPy's negative indexing selects the last column (index 1) and sets it to 1. The value 1 also selects index 1 (indices start from 0).
That is why the final output becomes
array([[ 0., 1.],
[ 0., 1.],
[ 0., 1.]])
In the [2, -2, -2] case, this is what happens:
num_classes = 3 (np.max(y) + 1)
the shape of categorical becomes [3, 3]
When -2 comes up, negative indexing selects the second-to-last column (index 1) and sets it to 1. The value 2 selects index 2 (indices start from 0).
That is why the final output becomes
array([[ 0., 0., 1.],
[ 0., 1., 0.],
[ 0., 1., 0.]])
So if you try something like [2, -4, -4], you get an error, because index -4 does not exist along an axis of size 3 (the shape of categorical is [3, 3]).
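The wrap-around can be reproduced with plain NumPy. The function below simply mirrors the snippet quoted above; the name to_categorical_sketch is hypothetical, and it is not claimed to match the current Keras implementation.

```python
import numpy as np

def to_categorical_sketch(y, num_classes=None):
    # Mirrors the quoted snippet; an illustration, not the Keras source.
    y = np.array(y, dtype='int').ravel()
    if not num_classes:
        num_classes = np.max(y) + 1
    n = y.shape[0]
    categorical = np.zeros((n, num_classes))
    categorical[np.arange(n), y] = 1  # negative y counts columns from the end
    return categorical

print(to_categorical_sketch([1, -1, -1]))  # -1 lands in the last column
print(to_categorical_sketch([2, -2, -2]))  # -2 lands in the second-to-last column

try:
    to_categorical_sketch([2, -4, -4])     # only 3 columns, so -4 is out of bounds
except IndexError as exc:
    print("IndexError:", exc)

# Shifting the labels so the minimum is 0 avoids the wrap-around.
y = np.array([2, -2, -2])
shifted = to_categorical_sketch(y - y.min())
print(shifted)
```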

Numpy: Affect diagonal elements of matrix prior to 1.10

I would like to change diagonal elements of a 2D matrix. These are both main and non-main diagonals.
numpy.diagonal()
In NumPy 1.10, it will return a read/write view. Writing to the returned
array will alter your original array.
numpy.fill_diagonal(), numpy.diag_indices()
Only works with main-diagonal elements
Here is my use case: I want to recreate a matrix of the following form, which is very trivial using diagonal notation given that I have all the x, y, z as arrays.
Try this:
>>> A = np.zeros((6,6))
>>> i,j = np.indices(A.shape)
>>> z = [1, 2, 3, 4, 5]
Now you can intuitively access any diagonal:
>>> A[i==j-1] = z
>>> A
array([[ 0., 1., 0., 0., 0., 0.],
[ 0., 0., 2., 0., 0., 0.],
[ 0., 0., 0., 3., 0., 0.],
[ 0., 0., 0., 0., 4., 0.],
[ 0., 0., 0., 0., 0., 5.],
[ 0., 0., 0., 0., 0., 0.]])
In the same way you can assign arrays to A[i==j], etc.
You could always use indexing to assign a value or array to the diagonals.
Passing in a list of row indices and a list of column indices lets you access the locations directly (and efficiently). For example:
>>> z = np.zeros((5,5))
>>> z[np.arange(5), np.arange(5)] = 1 # diagonal is 1
>>> z[np.arange(4), np.arange(4) + 1] = 2 # first upper diagonal is 2
>>> z[np.arange(4) + 1, np.arange(4)] = [11, 12, 13, 14] # first lower diagonal values
changes the array of zeros z to:
array([[ 1., 2., 0., 0., 0.],
[ 11., 1., 2., 0., 0.],
[ 0., 12., 1., 2., 0.],
[ 0., 0., 13., 1., 2.],
[ 0., 0., 0., 14., 1.]])
In general for a k x k array called z, you can set the ith upper diagonal with
z[np.arange(k-i), np.arange(k-i) + i]
and the ith lower diagonal with
z[np.arange(k-i) + i, np.arange(k-i)]
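Note: if you want to avoid calling np.arange several times, you can simply write ix = np.arange(k) once and then slice that range as needed:
np.arange(k-i) == ix[:k-i]
(ix[:-i] gives the same result for i > 0, but ix[:-0] is empty, so it fails for the main diagonal.)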
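The two index formulas above can be folded into one helper. This is a sketch under the assumption that z is a square k x k array; the name set_offset_diagonal and its signed offset argument are hypothetical (positive for upper diagonals, negative for lower ones).

```python
import numpy as np

def set_offset_diagonal(z, offset, values):
    """Hypothetical helper: assign values to one diagonal of a square array, in place."""
    k = z.shape[0]
    ix = np.arange(k - abs(offset))
    if offset >= 0:
        z[ix, ix + offset] = values   # main or upper diagonal
    else:
        z[ix - offset, ix] = values   # lower diagonal (row index shifted down)
    return z

z = np.zeros((5, 5))
set_offset_diagonal(z, 0, 1)                   # main diagonal is 1
set_offset_diagonal(z, 1, 2)                   # first upper diagonal is 2
set_offset_diagonal(z, -1, [11, 12, 13, 14])   # first lower diagonal values
print(z)
```

Because the assignment goes through index arrays, values can be either a scalar or a sequence whose length matches the diagonal.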
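Here is another approach, just for fun. You can write your own diagonal function that returns a view of the diagonal you need.
import numpy as np
def diag(a, k=0):
    # drop leading columns (k > 0) or rows (k < 0) so the wanted diagonal starts at [0, 0]
    if k > 0:
        a = a[:, k:]
    elif k < 0:
        a = a[-k:, :]
    shape = (min(a.shape),)
    # one step along the diagonal is one row stride plus one column stride
    strides = (sum(a.strides),)
    return np.lib.stride_tricks.as_strided(a, shape, strides)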
a = np.arange(20).reshape((4, 5))
diag(a, 2)[:] = 88
diag(a, -2)[:] = 99
print(a)
# [[ 0 1 88 3 4]
# [ 5 6 7 88 9]
# [99 11 12 13 88]
# [15 99 17 18 19]]
