I'm trying to efficiently map an N x 1 numpy array of ints to an N x 3 numpy array of floats using a ufunc.
What I have so far:
map = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
ufunc = numpy.frompyfunc(lambda x: numpy.array(map[x], numpy.float32), 1, 1)
input = numpy.array([1, 2, 3], numpy.int32)
ufunc(input) gives an array with dtype object, whose elements are the length-3 arrays. I'd like the 3 x 3 result, but with dtype float32.
You could use np.hstack:
import numpy as np
mapping = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
ufunc = np.frompyfunc(lambda x: np.array(mapping[x], np.float32), 1, 1)
data = np.array([1, 2, 3], np.int32)
result = np.hstack(ufunc(data))
print(result)
# [ 0. 0. 0. 0.5 0.5 0.5 1. 1. 1. ]
print(result.dtype)
# float32
print(result.shape)
# (9,)
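If you need the N x 3 shape from the question rather than a flat vector, the hstack result can simply be reshaped; a small follow-up reusing result from above:
result = result.reshape(-1, 3)
print(result.shape)
# (3, 3)
print(result.dtype)
# float32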
If your mapping is a numpy array, you can just use fancy indexing for this:
>>> valmap = numpy.array([(0, 0, 0), (0.5, 0.5, 0.5), (1, 1, 1)])
>>> input = numpy.array([1, 2, 3], numpy.int32)
>>> valmap[input-1]
array([[ 0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ]])
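If the keys are not a contiguous range starting at 1, a lookup table still works; a sketch using np.searchsorted with hypothetical, non-contiguous keys:
>>> keys = numpy.array([2, 7, 11])          # sorted keys of the mapping
>>> valmap = numpy.array([(0, 0, 0), (0.5, 0.5, 0.5), (1, 1, 1)], numpy.float32)
>>> data = numpy.array([2, 11, 7, 2], numpy.int32)
>>> valmap[numpy.searchsorted(keys, data)]  # row of each key, then fancy indexing
array([[0. , 0. , 0. ],
       [1. , 1. , 1. ],
       [0.5, 0.5, 0.5],
       [0. , 0. , 0. ]], dtype=float32)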
You can use ndarray fancy indexing to get the same result; I think it should be faster than frompyfunc:
map_array = np.array([[0,0,0],[0,0,0],[0.5,0.5,0.5],[1,1,1]], dtype=np.float32)
index = np.array([1,2,3,1])
map_array[index]
Or you can just use a list comprehension:
map = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
np.array([map[i] for i in [1,2,3,1]], dtype=np.float32)
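The claim that fancy indexing beats frompyfunc is easy to check with a rough benchmark; a sketch (timings vary with machine and input size):
import timeit
import numpy as np

mapping = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
map_array = np.array([[0, 0, 0], [0, 0, 0], [0.5, 0.5, 0.5], [1, 1, 1]], dtype=np.float32)
ufunc = np.frompyfunc(lambda x: np.array(mapping[x], np.float32), 1, 1)
data = np.random.randint(1, 4, size=100000)

print(timeit.timeit(lambda: np.hstack(ufunc(data)), number=10))  # Python call per element
print(timeit.timeit(lambda: map_array[data], number=10))         # one vectorized lookup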
Unless I misread the docs, the output of np.frompyfunc on a scalar is indeed an object: when using an ndarray as input, you'll get an ndarray with dtype=object.
A workaround is to use the np.vectorize function:
F = np.vectorize(lambda x: mapper.get(x), 'fff')
Here, we force F to produce three float outputs (hence the 'fff').
>>> mapper = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
>>> inp = [1, 2, 3]
>>> F(inp)
(array([ 0. , 0.5, 1. ], dtype=float32), array([ 0., 0.5, 1.], dtype=float32), array([ 0. , 0.5, 1. ], dtype=float32))
OK, not quite what we want: it's a tuple of three float arrays (as we gave 'fff'), the first array being equivalent to [mapper[i][0] for i in inp]. So, with a bit of manipulation:
>>> np.array(F(inp)).T
array([[ 0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ]], dtype=float32)
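With a recent numpy (>= 1.12) you can also pass np.vectorize a signature instead of otypes, so each per-element result is kept as a row and no transpose is needed; a sketch, with an explicit cast since the output dtype may otherwise be upcast:
>>> F = np.vectorize(lambda x: np.array(mapper[x], np.float32), signature='()->(n)')
>>> np.asarray(F(inp), dtype=np.float32)
array([[ 0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ]], dtype=float32)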
Suppose I have multiple 4x4 matrices that I want to add into a final 6x6 zero matrix, each contributing its values at designated coordinates. How would I do this? I thought of adding slices to an np.zeros((6, 6)) matrix, but I believe this may be quite tedious.
Matrix 1 would go into the first position and matrix 2 into the second position; the two contributions would then be added together to form the final 6x6 matrix.
import numpy as np
from math import sqrt
# Element 1
C_1= 3/5
S_1= 4/5
matrix_1 = np.matrix([[ C_1**2,   C_1*S_1, -C_1**2,  -C_1*S_1],
                      [ C_1*S_1,  S_1**2,  -C_1*S_1, -S_1**2],
                      [-C_1**2,  -C_1*S_1,  C_1**2,   C_1*S_1],
                      [-C_1*S_1, -S_1**2,   C_1*S_1,  S_1**2]])
empty_mat1 = np.zeros((6,6))
empty_mat1[0:4 , 0:4] = empty_mat1[0:4 ,0:4] + matrix_1
#print(empty_mat1)
# Element 2
C_2 = 0
S_2 = 1
matrix_2 = 1.25*np.matrix([[ C_2**2,   C_2*S_2, -C_2**2,  -C_2*S_2],
                           [ C_2*S_2,  S_2**2,  -C_2*S_2, -S_2**2],
                           [-C_2**2,  -C_2*S_2,  C_2**2,   C_2*S_2],
                           [-C_2*S_2, -S_2**2,   C_2*S_2,  S_2**2]])
empty_mat2 = np.zeros((6,6))
empty_mat2[0:2,0:2] = empty_mat2[0:2,0:2] + matrix_2[0:2,0:2]
empty_mat2[4:6,0:2] = empty_mat2[4:6,0:2] + matrix_2[2:4,0:2]
empty_mat2[0:2,4:6] = empty_mat2[0:2,4:6] + matrix_2[0:2,2:4]
empty_mat2[4:6,4:6] = empty_mat2[4:6,4:6] + matrix_2[2:4,2:4]
print(empty_mat1+empty_mat2)
Adding two arrays of different dimensions is a little bit tricky with numpy.
However, with slice assignment, you can do it with the following "rustic" method.
Suppose M1 and M2 are your two input arrays, M3 (from M1) and M4 (from M2) are temporary arrays, and M5 is the final array:
# Initialisation
M1 = np.array([[ 0.36, 0.48, -0.36, -0.48], [ 0.48, 0.64, -0.48, -0.64], [ -0.36, -0.48, 0.36, 0.48], [-0.48, -0.64, 0.48, 0.64]])
M2 = np.array([[ 0, 0, 0, 0], [ 0, 1.25, 0, -1.25], [ 0, 0, 0, 0], [ 0, -1.25, 0, 1.25]])
M3, M4 = np.zeros((6, 6)), np.zeros((6, 6))
#M3 and M4 operations
M3[0:4, 0:4] = M1[0:4, 0:4] + M3[0:4, 0:4]
M4[0:2, 0:2] = M2[0:2, 0:2]
M4[0:2, 4:6] = M2[0:2, 2:4]
M4[4:6, 0:2] = M2[2:4, 0:2]
M4[4:6, 4:6] = M2[2:4, 2:4]
#Final operation
M5 = M3+M4
print(M5)
Output :
[[ 0.36  0.48 -0.36 -0.48  0.    0.  ]
 [ 0.48  1.89 -0.48 -0.64  0.   -1.25]
 [-0.36 -0.48  0.36  0.48  0.    0.  ]
 [-0.48 -0.64  0.48  0.64  0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.  ]
 [ 0.   -1.25  0.    0.    0.    1.25]]
Have a good day.
You will need to encode where your 4x4 matrices end up in the final 6x6 matrix. Suppose you have N (=2 in your case) such 4x4 matrices. You can then define two new arrays (shape Nx4) that hold the row and column indices of the final 6x6 matrix where you want your 4x4 matrices to end up. Finally, you use fancy indexing and broadcasting to build up an Nx6x6 array which you can sum over. Your example:
import numpy as np
N = 2
arr = np.array([[
    [0.36, 0.48, -0.36, -0.48],
    [0.48, 0.64, -0.48, -0.64],
    [-0.36, -0.48, 0.36, 0.48],
    [-0.48, -0.64, 0.48, 0.64],
], [
    [0, 0, 0, 0],
    [0, 1.25, 0, -1.25],
    [0, 0, 0, 0],
    [0, -1.25, 0, 1.25],
]])
rows = np.array([
    [0, 1, 2, 3],
    [0, 1, 4, 5]
])
cols = np.array([
    [0, 1, 2, 3],
    [0, 1, 4, 5]
])
i = np.arange(N)
out = np.zeros((N, 6, 6))
out[
    i[:, None, None],
    rows[:, :, None],
    cols[:, None, :]
] = arr
out = out.sum(axis=0)
Gives as output:
array([[ 0.36,  0.48, -0.36, -0.48,  0.  ,  0.  ],
       [ 0.48,  1.89, -0.48, -0.64,  0.  , -1.25],
       [-0.36, -0.48,  0.36,  0.48,  0.  ,  0.  ],
       [-0.48, -0.64,  0.48,  0.64,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  , -1.25,  0.  ,  0.  ,  0.  ,  1.25]])
If you want even more control over where each row/col ends up, you can pull off some more trickery as follows:
rows = np.array([
    [1, 2, 3, 4, 0, 0],
    [1, 2, 0, 0, 3, 4]
])
cols = np.array([
    [1, 2, 3, 4, 0, 0],
    [1, 2, 0, 0, 3, 4]
])
i = np.arange(N)
out = np.pad(arr, ((0, 0), (1, 0), (1, 0)))[
    i[:, None, None],
    rows[:, :, None],
    cols[:, None, :]
].sum(axis=0)
which has the same output. This would allow you to shuffle the rows/cols of arr by shuffling the values 1-4 in the rows, cols arrays. I would prefer option 1 though.
I probably should wait for you to correct your question, but I'll go ahead and give you some code - yes, in the most tedious form - based on your images
res = np.zeros((6,6))
# arr1, arr2 are (4,4) arrays
res[:4, :4] += arr1
idx = np.array([0,1,4,5])
res[idx[:,None], idx] += arr2
The first is a contiguous block, so the two slices are enough.
The second is split up, so I'm using advanced indexing.
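One caveat with the fancy-indexed +=: if the same global index appeared more than once in a single update, the buffered += would keep only one of the contributions. np.add.at accumulates without buffering; a small sketch with hypothetical element data:
import numpy as np

res = np.zeros((6, 6))
arr2 = np.ones((4, 4))            # hypothetical 4x4 element matrix
idx = np.array([0, 1, 4, 5])      # global rows/cols this element maps to

# accumulates correctly even if idx contained repeated entries
np.add.at(res, (idx[:, None], idx), arr2)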
I have data in the following format:
[('user_1', 2, 1.0),
('user_2', 6, 2.5),
('user_3', 9, 3.0),
('user_4', 1, 3.0)]
And I want to use this information to create a NumPy array that has the value 1.0 in position 2, the value 2.5 in position 6, etc. All positions not listed in the above should be zeroes. Like this:
array([0. , 3. , 1. , 0. , 0. , 0. , 2.5, 0. , 0. , 3. ])
First reformat the data:
data = [
    ("user_1", 2, 1.0),
    ("user_2", 6, 2.5),
    ("user_3", 9, 3.0),
    ("user_4", 1, 3.0),
]
usernames, indices, values = zip(*data)
And then create the array:
length = max(indices) + 1
arr = np.zeros(shape=(length,))
arr[list(indices)] = values
print(arr) # array([0. , 3. , 1. , 0. , 0. , 0. , 2.5, 0. , 0. , 3. ])
Note that you need to convert indices to a list; otherwise, when using it for indexing, numpy will think it is trying to index multiple dimensions.
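To see the pitfall, a small sketch of what happens if the tuple is used directly:
import numpy as np

arr = np.zeros(10)
indices = (2, 6, 9, 1)        # a tuple, as returned by zip(*data)
values = (1.0, 2.5, 3.0, 3.0)

# arr[indices] = values      # IndexError: a tuple is read as a 4-D index
arr[list(indices)] = values   # a list (or array) triggers fancy indexing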
I've come up with this solution:
import numpy as np
a = [('user_1', 2, 1.0),
     ('user_2', 6, 2.5),
     ('user_3', 9, 3.0),
     ('user_4', 1, 3.0)]
res = np.zeros(max(x[1] for x in a) + 1)
for i in range(len(a)):
    res[a[i][1]] = a[i][2]
res
# array([0. , 3. , 1. , 0. , 0. , 0. , 2.5, 0. , 0. , 3. ])
First I create a zero-filled array whose length is the maximum of the numbers at index 1 of each tuple in list a, plus 1, since an array that has to hold index 9 needs a length of 10.
Then I do a simple loop and assign the values according to the positions given in the tuples.
I have a numpy 2D array (50x50) filled with values. I would like to flatten the 2D array into one column (2500x1), but the locations of these values are very important. The indices can be converted to spatial coordinates, so I also want two (2500x1) arrays of x and y indices so I can retrieve the spatial coordinate of each corresponding value.
For example:
My 2D array:
         --------x-------
[[0.5  0.1  0.   0. ]   |
 [0.   0.   0.2  0.8]   y
 [0.   0.   0.   0. ]]  |
My desired output:
# Values
[[0.5]
 [0.1]
 [0. ]
 [0. ]
 [0. ]
 [0. ]
 [0.2]
 [0.8]
 ...],
# Corresponding x index, where I will retrieve the x spatial coordinate from
[[0]
 [1]
 [2]
 [3]
 [0]
 [1]
 [2]
 [3]
 ...],
# Corresponding y index, where I will retrieve the y spatial coordinate from
[[0]
 [0]
 [0]
 [0]
 [1]
 [1]
 [1]
 [1]
 ...],
Any clues on how to do this? I've tried a few things but they have not worked.
For simplicity, let's reproduce your array with this chunk of code:
value = np.arange(6).reshape(2, 3)
First, we create variables x and y which contain the indices for each dimension:
x = np.arange(value.shape[0])
y = np.arange(value.shape[1])
np.meshgrid is the method suited to the issue you described; indexing='ij' keeps the order consistent with how value flattens:
xx, yy = np.meshgrid(x, y, sparse=False, indexing='ij')
Finally, transform everything into the shape you want with these lines:
xx = xx.reshape(-1, 1)
yy = yy.reshape(-1, 1)
value = value.reshape(-1, 1)
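To check that the three flattened columns line up, they can be placed side by side; a small sketch reusing the names above:
table = np.hstack([value, xx, yy])  # columns: value, x (dim 0) index, y (dim 1) index
print(table)
# [[0 0 0]
#  [1 0 1]
#  [2 0 2]
#  [3 1 0]
#  [4 1 1]
#  [5 1 2]]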
According to your example, with np.indices:
data = np.arange(2500).reshape(50, 50)
y_indices, x_indices = np.indices(data.shape)
Reshaping your data:
data = data.reshape(-1,1)
x_indices = x_indices.reshape(-1,1)
y_indices = y_indices.reshape(-1,1)
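A quick way to verify the correspondence on a small hypothetical array before applying it to the 50x50 data:
check = np.arange(6).reshape(2, 3)
rows, cols = np.indices(check.shape)
# columns: value, x (column) index, y (row) index
print(np.hstack([check.reshape(-1, 1), cols.reshape(-1, 1), rows.reshape(-1, 1)]))
# [[0 0 0]
#  [1 1 0]
#  [2 2 0]
#  [3 0 1]
#  [4 1 1]
#  [5 2 1]]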
Assuming you want to flatten and reshape into a single column, use reshape:
a = np.array([[0.5, 0.1, 0. , 0. ],
              [0. , 0. , 0.2, 0.8],
              [0. , 0. , 0. , 0. ]])
a.reshape((-1, 1)) # 1 column, as many rows as necessary (-1)
output:
array([[0.5],
       [0.1],
       [0. ],
       [0. ],
       [0. ],
       [0. ],
       [0.2],
       [0.8],
       [0. ],
       [0. ],
       [0. ],
       [0. ]])
getting the coordinates
y,x = a.shape
np.tile(np.arange(x), y)
# array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3])
np.repeat(np.arange(y), x)
# array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
or simply using unravel_index:
Y, X = np.unravel_index(range(a.size), a.shape)
# (array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),
# array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]))
I am sorry if this is obvious, but I am having trouble understanding why np.meshgrid produces arrays whose shape has an extra dimension compared to the inputs:
grid = np.meshgrid(
np.linspace(-1, 1, 5),
np.linspace(-1, 1, 4),
np.linspace(-1, 1, 3), indexing='ij')
np.shape(grid)
(3, 5, 4, 3)
To me it should have been: (5, 4, 3)
or
grid = np.meshgrid(
np.linspace(-1, 1, 5),
np.linspace(-1, 1, 4), indexing='ij')
np.shape(grid)
(2, 5, 4)
To me it should have been: (5, 4)
I would be very grateful if somebody could explain that to me. Thanks a lot!
In [92]: grid = np.meshgrid(
...: np.linspace(-1, 1, 5),
...: np.linspace(-1, 1, 4), indexing='ij')
...:
In [93]: grid
Out[93]:
[array([[-1. , -1. , -1. , -1. ],
        [-0.5, -0.5, -0.5, -0.5],
        [ 0. ,  0. ,  0. ,  0. ],
        [ 0.5,  0.5,  0.5,  0.5],
        [ 1. ,  1. ,  1. ,  1. ]]),
 array([[-1.        , -0.33333333,  0.33333333,  1.        ],
        [-1.        , -0.33333333,  0.33333333,  1.        ],
        [-1.        , -0.33333333,  0.33333333,  1.        ],
        [-1.        , -0.33333333,  0.33333333,  1.        ],
        [-1.        , -0.33333333,  0.33333333,  1.        ]])]
grid is a list with two arrays. The first array has numbers from the first argument (the one with 5 elements). The second has numbers from the second argument.
Why should np.shape(grid) be (5, 4)? What layout were you expecting?
np.shape(grid) actually does np.array(grid).shape, which is why there's an added dimension.
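A short sketch that makes the extra dimension visible:
import numpy as np

grid = np.meshgrid(np.linspace(-1, 1, 5),
                   np.linspace(-1, 1, 4), indexing='ij')
print(type(grid), len(grid))         # <class 'list'> 2
print(grid[0].shape, grid[1].shape)  # (5, 4) (5, 4)
print(np.shape(grid))                # (2, 5, 4): the list of 2 arrays gets stacked
print(np.stack(grid).shape)          # (2, 5, 4), the explicit equivalent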
Given a numpy array that can be subset to the indices of elements meeting a given criterion, how do I create tuples of triplets (or quadruplets, quintuplets, ...) from the resulting pairs of indices?
In the example below, pairs_tuples is equal to [(1, 0), (3, 0), (3, 1), (3, 2)]. triplets_tuples should be [(0, 1, 3)] because all of its elements (i.e. (1, 0), (3, 0), (3, 1)) have pairwise values meeting the condition, whereas (3, 2) does not.
a = np.array([[0.        , 0.        , 0.        , 0.        , 0.        ],
              [0.96078379, 0.        , 0.        , 0.        , 0.        ],
              [0.05498203, 0.0552454 , 0.        , 0.        , 0.        ],
              [0.46005028, 0.45468466, 0.11167813, 0.        , 0.        ],
              [0.1030161 , 0.10350956, 0.00109096, 0.00928037, 0.        ]])
pairs = np.where((a >= .11) & (a <= .99))
pairs_tuples = list(zip(pairs[0].tolist(), pairs[1].tolist()))
# [(1, 0), (3, 0), (3, 1), (3, 2)]
How to get to the below?
triplets_tuples = [(0, 1, 3)]
quadruplets_tuples = []
quintuplets_tuples = []
This has an easy part and an NP part. Here's the solution to the easy part.
Let's assume you have the full correlation matrix:
>>> c = a + a.T
>>> c
array([[0.        , 0.96078379, 0.05498203, 0.46005028, 0.1030161 ],
       [0.96078379, 0.        , 0.0552454 , 0.45468466, 0.10350956],
       [0.05498203, 0.0552454 , 0.        , 0.11167813, 0.00109096],
       [0.46005028, 0.45468466, 0.11167813, 0.        , 0.00928037],
       [0.1030161 , 0.10350956, 0.00109096, 0.00928037, 0.        ]])
What you're doing is converting this into an adjacency matrix:
>>> adj = (c >= .11) & (c <= .99)
>>> adj.astype(int) # for readability below - False and True take a lot of space
array([[0, 1, 0, 1, 0],
       [1, 0, 0, 1, 0],
       [0, 0, 0, 1, 0],
       [1, 1, 1, 0, 0],
       [0, 0, 0, 0, 0]])
This now represents a graph where rows and columns correspond to nodes, and a 1 is an edge between them. We can use networkx to visualize this:
import networkx
g = networkx.from_numpy_array(adj)
networkx.draw(g)
You're looking for maximal fully-connected subgraphs, or "cliques", within this graph. This is the Clique problem, and is the NP part. Thankfully, networkx can solve that too:
>>> list(networkx.find_cliques(g))
[[3, 0, 1], [3, 2], [4]]
Here [3, 0, 1] is one of your triplets.
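If you also want every k-tuple, including those contained in larger cliques (and the empty quadruplet/quintuplet lists from the question), one way is to enumerate combinations inside each maximal clique; a sketch reusing g from above:
from itertools import combinations

def k_tuples(graph, k):
    found = set()
    for clique in networkx.find_cliques(graph):
        found.update(combinations(sorted(clique), k))
    return sorted(found)

print(k_tuples(g, 3))  # [(0, 1, 3)]
print(k_tuples(g, 4))  # []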