Using values in 2d matrix as lookups - python

I have a 2-D numpy array that I would like to use as a series of indexes. The values in the array are all integer values:
array([[3, 3, 3, 2],
[1, 5, 2, 3],
[4, 2, 3, 2],
[2, 3, 1, 3]])
I also have a 1D array with 5 values
array([x,y,z,q,p])
I'd like to use the cells of the 2D array as lookup values into the 1D array. In other words, where the 2D array is equal to 1, return x. Where it's equal to 2, return y, and so on.
This is simple enough to do in MATLAB, but I'd like a NumPy solution that doesn't involve looping.
Thoughts?

Assuming x, y, z, q, p are variables (and not characters), define the first array as, say, src:
src = np.array([[3, 3, 3, 2],
[1, 5, 2, 3],
[4, 2, 3, 2],
[2, 3, 1, 3]])
and the second as (say, lookup):
lookup = np.array([x,y,z,q,p])
You can get the desired output using:
lookup[src - 1]
Or if you want individual outputs:
lookup[src[i, j] - 1]
where (i, j) is the 2-D index of the value that you want to look up.
Note that the -1 accounts for the offset: the values in src start at 1, while NumPy indexes start at 0 (as mentioned in the comment by slothrop).
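As a concrete check of the lookup above, here is a minimal sketch in which x, y, z, q, p are replaced by made-up numeric values (10.0 to 50.0 are placeholders chosen for illustration, not values from the question):

```python
import numpy as np

src = np.array([[3, 3, 3, 2],
                [1, 5, 2, 3],
                [4, 2, 3, 2],
                [2, 3, 1, 3]])

# Hypothetical stand-ins for x, y, z, q, p
lookup = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# src holds 1-based codes, so shift by one before indexing
result = lookup[src - 1]
print(result)
```

The result has the same shape as src, because indexing a 1D array with an integer array returns one looked-up value per index.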

Related

Applying torch.combinations on multidimensional tensor or tuple of tensors in PyTorch?

Using PyTorch, torch.combinations will only take a 1D tensor as input but I would like to apply it to each 1D tensor in a multidimensional tensor.
inp = torch.tensor([[1, 2, 3],
[2, 3, 4]])
torch.combinations((inp), r=2)
The result is an error saying I can't apply it to that shape but I want to apply it to [1, 2, 3] and [2, 3, 4] individually. I can't do it one by one because the idea is to apply this to large sets of data.
inp = torch.tensor([[1,2,3],[2,3,4]])
inp_tuple = torch.unbind(inp)
print(inp_tuple)
(tensor([1, 2, 3]), tensor([2, 3, 4]))
torch.combinations((inp_tuple), r=2)
I also tried unbinding the tensor and applying it to the tuple of tensors but it gives an error saying it can't be applied to a tuple.
Is there any way that I can get torch.combinations to automatically apply to each individual 1D tensor in a multidimensional tensor or each tensor in a tuple of tensors? If not are there any alternatives to achieve all combinations of each individual part of a multidimensional tensor?
The function torch.combinations returns all possible combinations of size r of the elements contained in the 1D input vector. The reason why multi-dimensional inputs are not supported is probably that there is no guarantee that the different vectors in your input have the same number of unique elements. If one of the vectors had a duplicate element, you would end up with one set of combinations bigger than another, which cannot be represented with a homogeneous PyTorch tensor.
From here on, I will assume that the input tensor inp is a 2D tensor shaped (N, C) where each of its N vectors contains C unique elements. The example you gave fits this requirement, since both vectors have three unique elements each: {1, 2, 3} and {2, 3, 4}.
>>> inp = torch.tensor([[1,2,3],[2,3,4]])
The idea is to apply torch.combinations to a tensor of positions (torch.arange) whose length equals that of our vectors. We can then use the resulting combinations as indices to gather values from each vector in the input tensor.
We can retrieve all combinations of the positions with the following:
>>> c = torch.combinations(torch.arange(inp.size(1)), r=2)
tensor([[0, 1],
[0, 2],
[1, 2]])
Then we need to reshape and expand both inp and c such that they match in number of dimensions:
>>> x = inp[:,None].expand(-1,len(c),-1)
tensor([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[2, 3, 4],
[2, 3, 4],
[2, 3, 4]]])
>>> idx = c[None].expand(len(x), -1, -1)
tensor([[[0, 1],
[0, 2],
[1, 2]],
[[0, 1],
[0, 2],
[1, 2]]])
Finally we can apply torch.gather to x and idx along dim=2. This returns a 3D tensor out such that:
out[i][j][k] = x[i][j][idx[i][j][k]]
Let's make our call on torch.gather:
>>> x.gather(dim=2, index=idx)
tensor([[[1, 2],
[1, 3],
[2, 3]],
[[2, 3],
[2, 4],
[3, 4]]])
Which is the desired result.
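For comparison, the same combine-the-positions-then-index idea can be sketched in NumPy, with itertools.combinations standing in for torch.combinations. This is an illustrative analogue rather than part of the answer above; the names c and out simply mirror the ones used there.

```python
from itertools import combinations

import numpy as np

inp = np.array([[1, 2, 3],
                [2, 3, 4]])

# All pairs of column positions: [[0, 1], [0, 2], [1, 2]]
c = np.array(list(combinations(range(inp.shape[1]), 2)))

# Indexing axis 1 with the (n_pairs, 2) index array picks every pair
# from every row at once; the result has shape (N, n_pairs, 2)
out = inp[:, c]
print(out)
```

Because the combinations are taken over column positions rather than over the values themselves, every row yields the same number of pairs.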

How to apply a function on jagged Numpy arrays (unequal row lengths) without using np.apply_along_axis()?

I'm trying to speed up a process, and I think this might be possible using numpy's apply_along_axis. The problem is that not all my rows have the same length.
When I do:
a = np.array([[1, 2, 3],
[2, 3, 4],
[4, 5, 6]])
b = np.apply_along_axis(sum, 1, a)
print(b)
This works fine. But I would like to do something similar to (please note that the first row has 4 elements and the rest have 3):
a = np.array([[1, 2, 3, 4],
[2, 3, 4],
[4, 5, 6]])
b = np.apply_along_axis(sum, 1, a)
print(b)
But this fails because:
numpy.AxisError: axis 1 is out of bounds for array of dimension 1
I've looked around and the only 'solution' I've found is to add zeros to make all the arrays the same length, which would probably defeat the purpose of performance improvement.
Is there any way to use np.apply_along_axis on a non-regular shaped numpy array?
You can transform your initial array of iterable-objects to ndarray by padding them with zeros in a vectorized manner:
import numpy as np
a = np.array([[1, 2, 3, 4],
[2, 3, 4],
[4, 5, 6]], dtype=object) # ragged rows need dtype=object in recent NumPy versions
max_len = max(len(row) for row in a) # max length of iterable-objects contained in array
cust_func = np.vectorize(pyfunc=lambda x: np.pad(array=x,
pad_width=(0,max_len),
mode='constant',
constant_values=(0,0))[:max_len], otypes=[list])
a_pad = np.stack(cust_func(a))
output:
array([[1, 2, 3, 4],
[2, 3, 4, 0],
[4, 5, 6, 0]])
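If the goal is only a per-row reduction such as the sum from the question, padding can be avoided altogether. The following is a minimal sketch of a different technique (flattening the rows and reducing segments with np.add.reduceat), assuming the rows are kept as a plain Python list and that no row is empty; the names rows, flat and offsets are illustrative.

```python
import numpy as np

rows = [[1, 2, 3, 4], [2, 3, 4], [4, 5, 6]]

# Length of each row and the start offset of each row in the flat array
lengths = np.array([len(r) for r in rows])
flat = np.concatenate([np.asarray(r) for r in rows])
offsets = np.concatenate(([0], np.cumsum(lengths)[:-1]))

# reduceat sums flat[offsets[i]:offsets[i + 1]] for each i
# (the last segment runs to the end); empty rows are not handled here
row_sums = np.add.reduceat(flat, offsets)
print(row_sums)  # row sums: 10, 9, 15
```

This works for any ufunc with a reduceat method (for example np.maximum), but not for arbitrary Python functions.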
It depends.
Do you know the size of the vectors before or are you appending to a list?
see e.g. http://stackoverflow.com/a/58085045/7919597
You could for example pad the arrays
import numpy as np
a1 = [1, 2, 3, 4]
a2 = [2, 3, 4, np.nan] # pad with nan
a3 = [4, 5, 6, np.nan] # pad with nan
b = np.stack([a1, a2, a3], axis=0)
print(b)
# you can apply the normal numpy operations on
# arrays with nan, they usually just result in a nan
# in a resulting array
c = np.diff(b, axis=-1)
print(c)
Afterwards you can apply a moving window on each row over the columns.
Have a look at https://stackoverflow.com/a/22621523/7919597 which is only 1d, but can give you an idea of how it could work.
It is possible to use a 2d array with only one row as kernel (shape e.g. (1, 3)) with scipy.signal.convolve2d and use the idea above.
This is a workaround to get a "row-wise 1D convolution":
from scipy import signal
krnl = np.array([[0, 1, 0]])
d = signal.convolve2d(c, krnl, mode='same')
print(d)
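For simple reductions on the NaN-padded array b from above, NumPy's NaN-aware functions skip the padded cells, so the padding does not distort the result. A minimal sketch, assuming the real data contains no NaN values of its own:

```python
import numpy as np

b = np.array([[1, 2, 3, 4],
              [2, 3, 4, np.nan],   # padded with nan
              [4, 5, 6, np.nan]])  # padded with nan

# nansum / nanmean treat nan entries as missing
row_sums = np.nansum(b, axis=1)
row_means = np.nanmean(b, axis=1)
print(row_sums)
print(row_means)
```

Note that b is a float array, since NaN has no integer representation.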

How to reshape a Numpy array from (x,y,z) to (y,z,x)

I have an array of dimension (3,120,100) and I want to convert it into an array of dimensions (120,100,3). The array I have is
arr1 = np.ones((120,100), dtype = int)
arr2 = arr1*2
arr3 = arr1*3
arr = np.stack((arr1,arr2,arr3))
arr
It contains three 120x100 arrays of 1's, 2's, and 3's. When I use reshape on it, I get 120x100 arrays of 1's, 2's, or 3's.
I want to get an array of 120x100 where each element is [1,2,3]
If you want a big array containing 1, 2 and 3 as you describe, user3483203's answer would be the recommendable option. If you have, in general, an array with shape (X, Y, Z) and you want to have it as (Y, Z, X), you would normally use np.transpose:
import numpy as np
arr = ... # Array with shape (3, 120, 100)
arr_reshaped = np.transpose(arr, (1, 2, 0))
print(arr_reshaped.shape)
# (120, 100, 3)
EDIT: The question title says you want to reshape an array from (X, Y, Z) to (Z, Y, X), but the text seems to suggest you want to reshape from (X, Y, Z) to (Y, Z, X). I followed the text, but for the one in the title it would simply be np.transpose(arr, (2, 1, 0)).
I'll answer this assuming it's part of a larger problem, and this is just example data to demonstrate what you want to do. Otherwise the broadcasting solution works just fine.
When you use reshape, it doesn't change the order in which numpy reads the individual elements. It only changes how numpy views the shape. So if you have elements a, b, c, d stored in that order, they can be interpreted as an array of shape (4,), or shape (2, 2), or shape (1, 4), and so on.
What it seems you're looking for is transpose, which allows swapping how numpy interprets the axes. In your case:
>>> arr.transpose(1, 2, 0)
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]])
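The difference between reshape and transpose is easier to see on a small array. Here is a minimal sketch using a (3, 2, 2) stand-in for the (3, 120, 100) array from the question (the smaller size is only for readability):

```python
import numpy as np

# Three 2x2 layers filled with 1s, 2s and 3s: shape (3, 2, 2)
arr = np.stack([np.full((2, 2), v) for v in (1, 2, 3)])

# reshape keeps the flat element order and only regroups it
reshaped = arr.reshape(2, 2, 3)

# transpose moves axis 0 to the end, so each cell collects one value per layer
transposed = np.transpose(arr, (1, 2, 0))

print(reshaped[0, 0])    # [1 1 1]
print(transposed[0, 0])  # [1 2 3]
```

np.moveaxis(arr, 0, -1) gives the same result as the transpose call here and can be easier to read when only one axis moves.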
You don't need to create a very large array and reshape. Since you know what you want each element to be, and the final shape, you can just use numpy.broadcast_to. This requires a setup of just creating a shape (3,) array.
Setup
arr = np.array([1,2,3])
np.broadcast_to(arr, (120, 100, 3))
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]])
To get a writable (non-read-only) version of this output, you can call copy():
out = np.broadcast_to(arr, (120, 100, 3)).copy()
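A short sketch of why the copy is needed: the broadcast result is a read-only view that reuses the three original elements, which can be checked through its flags and strides.

```python
import numpy as np

arr = np.array([1, 2, 3])
view = np.broadcast_to(arr, (120, 100, 3))

print(view.flags.writeable)  # False: the view is read-only
print(view.strides[:2])      # (0, 0): the first two axes reuse the same memory

# copy() materializes a real (120, 100, 3) array that can be modified
out = view.copy()
out[0, 0, 0] = 99
```

Modifying out leaves arr untouched, since the copy owns its own memory.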

How would you reshuffle this array efficiently?

I have an array arr_val that stores the values of a certain function at a large number of locations (for illustration, let's take just 4). I also have another array, loc_array, that stores the locations; assume it also has 4 entries. However, loc_array is multidimensional: each location index holds 4 sub-locations, and each sub-location is a pair of coordinates. To illustrate:
arr_val = np.array([1, 2, 3, 4])
loc_array = np.array([[[1,1],[2,3],[3,1],[3,2]],[[1,2],[2,4],[3,4],[4,1]],
[[2,1],[1,4],[1,3],[3,3]],[[4,2],[4,3],[2,2],[4,4]]])
The meaning of the two arrays above is that the value of some parameter of interest at, for example, locations [1,1], [2,3], [3,1], [3,2] is 1, and so on. However, I would like to re-express the same information in a different form: instead of having the points in arbitrary order, I would like the coordinates in the following tractable form
coord = [[[1,1],[1,2],[1,3],[1,4]],[[2,1],[2,2],[2,3],[2,4]],[[3,1],[3,2],
[3,3],[3,4]],[[4,1],[4,2],[4,3],[4,4]]]
and the values at respective coordinates given as
val = [[1, 2, 3, 3],[3, 4, 1, 2],[1, 1, 3, 2], [2, 4, 4, 4]]
What would be a very efficient way to achieve the above for large numpy arrays?
You can use lexsort like so:
>>> order = np.lexsort(loc_array.reshape(-1, 2).T[::-1])
>>> arr_val.repeat(4)[order].reshape(4, 4)
array([[1, 2, 3, 3],
[3, 4, 1, 2],
[1, 1, 3, 2],
[2, 4, 4, 4]])
If you know for sure that loc_array is a permutation of all possible locations then you can avoid the sort:
>>> out = np.empty((4, 4), arr_val.dtype)
>>> out.ravel()[np.ravel_multi_index((loc_array-1).reshape(-1, 2).T, (4, 4))] = arr_val.repeat(4)
>>> out
array([[1, 2, 3, 3],
[3, 4, 1, 2],
[1, 1, 3, 2],
[2, 4, 4, 4]])
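The permutation case can also be written with plain integer-array indexing instead of ravel_multi_index. A minimal sketch, assuming the coordinates are 1-based and cover every cell of the grid exactly once; the helper name to_grid is made up for illustration:

```python
import numpy as np

arr_val = np.array([1, 2, 3, 4])
loc_array = np.array([[[1, 1], [2, 3], [3, 1], [3, 2]],
                      [[1, 2], [2, 4], [3, 4], [4, 1]],
                      [[2, 1], [1, 4], [1, 3], [3, 3]],
                      [[4, 2], [4, 3], [2, 2], [4, 4]]])

def to_grid(arr_val, loc_array, shape):
    """Scatter each value to all of its (1-based) coordinates."""
    idx = loc_array.reshape(-1, 2) - 1               # 0-based (row, col) pairs
    values = np.repeat(arr_val, loc_array.shape[1])  # one value per coordinate
    out = np.empty(shape, dtype=arr_val.dtype)
    out[idx[:, 0], idx[:, 1]] = values
    return out

grid = to_grid(arr_val, loc_array, (4, 4))
print(grid)
```

If some cells were missing from loc_array, np.empty would leave them uninitialized, so a fill value (for example via np.full) would be safer in that case.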
This may not be the answer you want, but it works anyway.
val = [[1, 2, 3, 3],[3, 4, 1, 2],[1, 1, 3, 2], [2, 4, 4, 4]]
temp = ""
int_list = []
for element in val:
    # join the digits of a row into one integer, e.g. [1, 2, 3, 3] -> 1233
    temp_int = temp.join(map(str, element))
    int_list.append(int(temp_int))
int_list.sort()
print(int_list)
## result ##
[1132, 1233, 2444, 3412]
The steps are:
Convert each row into an int and build int_list
Sort int_list
Construct a 2D np.array from int_list
I skipped the last step. You can find how to do it on the web.

Fill several parts of NumPy array, given a list of indexes

I want to fill a numpy.ndarray with data (32x32 pixel integer pictures==arrays)
From the name of the file of the picture I know where in my ndarray I want my values to be stored.
I would like to give my ndarray a list but also some slice(0) in it, because the picture is stored in the last two dimensions. How do I do that?
I would like to do something like this (pseudocode):
data=numpy.ndarray(dim1,dim2,dim3,32,32)
list=function(filename)
data[list,slice(0),slice(0)]=read_image(filename)
Is that possible?
My list has entries specifying the positions of the ndarray [int,int,int] and my read image is a 32 times 32 integer array (filling the last two dimension of my ndarray).
To perform this assignment, pass a suitable array in each of the first three dimensions, and : (meaning entire index range) in the last two dimensions.
If your list is, for example,
list = [[1, 2, 3], [4, 2, 0], [5, 3, 4], [2, 2, 2]]
then the array to pass as the first index is [1, 4, 5, 2], and similarly for the other two: [2, 2, 3, 2] and [3, 0, 4, 2]. Complete example with a fake (random) image:
data = np.zeros((6, 7, 8, 32, 32))
list = [[1, 2, 3], [4, 2, 0], [5, 3, 4], [2, 2, 2]]
image = np.random.uniform(size=(32, 32))
ix = np.array(list)
data[ix[:, 0], ix[:, 1], ix[:, 2], :, :] = image
Here ix[:, 0] is [1, 4, 5, 2], ix[:, 1] is [2, 2, 3, 2], and so on.
Reference: NumPy indexing and broadcasting.
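The same assignment can be written more compactly by turning the columns of the index array into a tuple. A minimal sketch with a check that every listed position received the image; the name positions is used here instead of list only to avoid shadowing the Python built-in:

```python
import numpy as np

data = np.zeros((6, 7, 8, 32, 32))
positions = [[1, 2, 3], [4, 2, 0], [5, 3, 4], [2, 2, 2]]
image = np.random.uniform(size=(32, 32))

ix = np.array(positions)

# tuple(ix.T) is (ix[:, 0], ix[:, 1], ix[:, 2]); the trailing two
# dimensions are taken in full, so the (32, 32) image is broadcast
# to each of the four selected positions
data[tuple(ix.T)] = image

for i, j, k in positions:
    assert np.array_equal(data[i, j, k], image)
```

When each file maps to a single position, data[i, j, k] = image (or data[tuple(pos)] = image for a three-element pos) fills that one 32x32 slot.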
