Elements arrangement in a numpy array - python

import numpy as np
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
How can I do the following?
Within 2 by 2 patch:
if any element is 2: put 2
if any element is 1: put 1
if all elements are 0: put 0
The expected result is:
np.array([[1, 1, 2],
[1, 1, 2]])

Using extract_patches from scikit-learn you can write this as follows (copy and paste-able code):
import numpy as np
from sklearn.feature_extraction.image import extract_patches
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
patches = extract_patches(data, patch_shape=(2, 2), extraction_step=(2, 2))
output = patches.max(axis=-1).max(axis=-1)
Explanation: extract_patches gives you a view on patches of your array, of size patch_shape and lying on a grid of extraction_step. The result is a 4D array where the first two axes index the patch and the last two axes index the pixels within the patch. We then evaluate the maximum over the last two axes to obtain the maximum per patch.
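Note that extract_patches appears to have been deprecated and later dropped from scikit-learn's public API, so the import above may fail on recent releases. As a library-free sketch of the same idea (a swapped-in technique, not the scikit-learn function), a plain reshape exposes the same 4D patch layout, assuming both array dimensions are exact multiples of the patch size. The names ph, pw and blocks are only illustrative:

```python
import numpy as np

data = np.array([[0, 0, 1, 1, 2, 2],
                 [1, 0, 0, 1, 2, 2],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 2, 0]])

ph, pw = 2, 2  # patch height and width (assumed to divide the array shape exactly)
rows, cols = data.shape

# axes: (patch row, row inside patch, patch column, column inside patch)
blocks = data.reshape(rows // ph, ph, cols // pw, pw)

# the maximum over the two "inside patch" axes gives one value per patch
output = blocks.max(axis=(1, 3))
print(output)
```

Taking the maximum matches the rule in the question only because the values are limited to 0, 1 and 2, with 2 taking priority over 1 and 1 over 0.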
EDIT This is actually very much related to this question

I'm not sure where you get your input from or where you are supposed to leave the output, but you can adapt this.
import numpy as np
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
def patchValue(i, j):
    return max([data[i][j],
                data[i][j+1],
                data[i+1][j],
                data[i+1][j+1]])
result = np.array([[0, 0, 0],
                   [0, 0, 0]])
for (v, i) in enumerate(range(0, 4, 2)):
    for (w, j) in enumerate(range(0, 6, 2)):
        result[v][w] = patchValue(i, j)
print(result)

Here's a rather lengthy one-liner that relies solely on reshaping, transposes, and taking maximums along different axes. It is fairly fast too.
data.reshape((-1,2)).max(axis=1).reshape((data.shape[0],-1)).T.reshape((-1,2)).max(axis=1).reshape((data.shape[1]//2,data.shape[0]//2)).T
Essentially what this does is reshape to take the maximum in pairs of two horizontally, then shuffle things around again and take the maximum in pairs of two vertically, ultimately giving the maximum of each block of 4, matching your desired output.
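To make the reshaping easier to follow, here is the same chain split into named steps. This is only a sketch for 2 by 2 blocks; the intermediate names horizontal and vertical are mine, and it assumes both dimensions of data are even:

```python
import numpy as np

data = np.array([[0, 0, 1, 1, 2, 2],
                 [1, 0, 0, 1, 2, 2],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 2, 0]])
rows, cols = data.shape

# 1) maximum of each horizontal pair -> shape (rows, cols // 2)
horizontal = data.reshape(-1, 2).max(axis=1).reshape(rows, -1)

# 2) transpose so that vertical neighbours become adjacent,
#    pair them up and take the maximum again -> flat array of block maxima
vertical = horizontal.T.reshape(-1, 2).max(axis=1)

# 3) restore the (rows // 2, cols // 2) layout
result = vertical.reshape(cols // 2, rows // 2).T
print(result)
```

Integer division (//) is used for the final shape because reshape does not accept float dimensions in Python 3.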

If the original array is large, and performance is an issue, the loops can be pushed down to numpy C code by manipulating the shape and strides of the original array to create the windows that you are acting on:
import numpy as np
from numpy.lib.stride_tricks import as_strided
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
patch_shape = (2, 2)
data_shape = np.array(data.shape)
# transform data to a 2x3 array of 2x2 patches/windows
# final shape of the computation on the windows:
final_shape = tuple(((data_shape - patch_shape) // patch_shape) + 1)  # (2, 3)
# the shape of the windowed array:
newshape = final_shape + patch_shape  # (2, 3, 2, 2)
# the strides of the windowed array are computed from data.strides because the
# byte counts depend on the dtype: (48, 8, 24, 4) for 4-byte integers,
# (96, 16, 48, 8) for 8-byte integers
newstrides = tuple(np.array(data.strides) * patch_shape) + data.strides
# use as_strided to 'transform' the array
patch_array = as_strided(data, shape=newshape, strides=newstrides)
# flatten the windowed array for iteration - dim of 6x2x2
# the number of windows is the product of the 'first' dimensions of the array
dim = (int(np.prod(newshape[:-len(patch_shape)])),) + newshape[-len(patch_shape):]
patch_array = patch_array.reshape(dim)
# perform computations on the windows and reshape to final dimensions
result = [2 if np.any(patch == 2) else
          1 if np.any(patch == 1) else
          0 for patch in patch_array]
result = np.array(result).reshape(final_shape)
A generalized 1-d function for creating the windowed array can be found at Efficient rolling statistics with NumPy
A generalised multi-dimension function and a nice explanation can be found at Efficient Overlapping Windows with Numpy
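If you would rather not compute strides by hand, NumPy 1.20 and later ship sliding_window_view, which builds the overlapping windows for you. A minimal sketch, assuming that function is available in your NumPy version and that the data only contains 0, 1 and 2 (so that the maximum implements the rule from the question):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([[0, 0, 1, 1, 2, 2],
                 [1, 0, 0, 1, 2, 2],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 2, 0]])
patch_shape = (2, 2)

# all overlapping 2x2 windows: shape (3, 5, 2, 2)
windows = sliding_window_view(data, patch_shape)

# keep every second window along each axis -> non-overlapping patches, shape (2, 3, 2, 2)
patches = windows[::patch_shape[0], ::patch_shape[1]]

# reduce over the two window axes
result = patches.max(axis=(2, 3))
print(result)
```

The windows are read-only views, so no data is copied until the reduction runs.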

Related

How to vectorize this pytorch code over (at least) the batch dimension?

I want to implement a code to build an adjacency matrix such that (for example):
If X[0] : [0, 1, 2, 0, 1, 0], then,
A[0, 1] = 1
A[1, 2] = 1
A[2, 0] = 1
A[0, 1] = 1
A[1, 0] = 1
The following code works fine, however, it's too slow! So, please help me to vectorize this code on the batch (first) dimension at least:
A = torch.zeros((3, 3, 3), dtype = torch.float)
X = torch.tensor([[0, 1, 2, 0, 1, 0], [1, 0, 0, 2, 1, 1], [0, 0, 2, 2, 1, 1]])
for a, x in zip(A, X):
    for i, j in zip(x, x[1:]):
        a[i, j] = 1
Thanks! :)
I am pretty sure that there is a much simpler way of doing this, but I tried to keep within the realm of torch function calls, to make sure that any gradient operation could be properly tracked.
In case this is not required for backpropagation, I strongly suggest you look into solutions that use numpy functions, because I think you are more likely to find something suitable there. But, without further ado, here is the solution I came up with.
It essentially transforms your X vector into a series of tuple entries that correspond to positions in A. For this, we need to align some of the indices: the first dimension is only implicitly given in X, since the first list in X corresponds to A[0,:,:], the second list to A[1,:,:], and so on.
This is also probably where you can start optimizing the code, because I did not find a reasonable description of such a matrix, and therefore had to come up with my own way of creating it.
# Start by "aligning" your shifted view of X
# Essentially, take the all but the last element,
# and put it on top of all but the first element.
X_shift = torch.stack([X[:,:-1], X[:,1:]], dim=2)
# X_shift.shape: (3,5,2) in your example
# To assign this properly, we need to turn it into a "concatenated" list,
# where each entry corresponds to a 2D tuple in the respective dimension of A.
temp_tuples = X_shift.view(-1,2).transpose(0,1)
# temp_tuples.shape: (2,15) in your example. Below are the values:
# tensor([[0, 1, 2, 0, 1, 1, 0, 0, 2, 1, 0, 0, 2, 2, 1],
#         [1, 2, 0, 1, 0, 0, 0, 2, 1, 1, 0, 2, 2, 1, 1]])
# Now we have to create a matrix to indicate the proper "first dimension index"
fix_dims = torch.repeat_interleave(torch.arange(0,3,1), len(X[0])-1, 0).unsqueeze(dim=0)
# fix_dims.shape: (1,15)
# Long story short, this creates the following vector.
# tensor([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]])
# Note that the unsqueeze is necessary to properly concatenate the two matrices:
access_tuples = tuple(torch.cat([fix_dims, temp_tuples], dim=0))
A[access_tuples] = 1
This further assumes that every row of X has the same length, so that each batch entry contributes the same number of index pairs. If that is not the case, you have to create the fix_dims vector manually, repeating each batch index len(X[i]) - 1 times. If the lengths are equal, as in your example, you can safely use the proposed solution.
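For comparison, here is a shorter sketch of the same index-tuple idea written with NumPy broadcasting instead of torch (so it does not track gradients). The name batch is mine; I would expect the same indexing expression to carry over to torch tensors, but I have only checked it against the loop in NumPy:

```python
import numpy as np

X = np.array([[0, 1, 2, 0, 1, 0],
              [1, 0, 0, 2, 1, 1],
              [0, 0, 2, 2, 1, 1]])
A = np.zeros((3, 3, 3), dtype=float)

# column vector of batch indices, shape (3, 1); it broadcasts against
# the (3, 5) arrays of "from" and "to" nodes
batch = np.arange(X.shape[0])[:, None]
A[batch, X[:, :-1], X[:, 1:]] = 1

# reference result from the original double loop
expected = np.zeros_like(A)
for a, x in zip(expected, X):
    for i, j in zip(x, x[1:]):
        a[i, j] = 1

print(np.array_equal(A, expected))
```

Repeated index pairs, such as (0, 1) in the first row, are harmless here because every assignment writes the same value.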
Make X a tuple instead of a tensor:
A = torch.zeros((3, 3, 3), dtype = torch.float)
X = ([0, 1, 2, 0, 1, 0], [1, 0, 0, 2, 1, 1], [0, 0, 2, 2, 1, 1])
A[X] = 1
For example, by casting it like this: A[tuple(X)]

How to select randomly elements from a tensor, with a condition on the elements to choose?

I have a tensor filled with 0 and 1. Now I want to randomly choose e.g. 50% of the elements which are equal to one. How do I do that?
For example I have the following tensor:
tensor = tf.constant([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
Now I want to randomly choose the coordinates of 50% of the elements which are equal to one (in this case, I want to choose 2 elements out of the 4). The resulting tensor could look as follows:
[[0, 0, 1], [0, 0, 0], [0, 1, 0]]
You can use numpy.
import numpy as np
tensor = np.array([0, 1, 0, 1, 0, 1, 0, 1])
percentage = 0.5
ones_indices = np.where(tensor==1)
ones_length = np.shape(ones_indices)[1]
random_indices = np.random.permutation(ones_length)
ones_indices[0][random_indices][:int(ones_length * percentage)]
Edit: With your definition of a tensor I have adjusted the code:
import numpy as np
tensor = np.array([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
percentage = 0.5
indices = np.where(tensor == 1)
length = np.shape(indices)[1]
random_idx = np.random.permutation(length)
random_idx = random_idx[:int(length * percentage)]
random_indices = (indices[0][random_idx], indices[1][random_idx])
z = np.zeros(np.shape(tensor), dtype=np.int64)
z[random_indices] = 1
# output
z
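The same selection can also be sketched with np.argwhere and NumPy's Generator API, which picks the subset without building a full permutation first. The names rng, coords, k and chosen are only illustrative, and the seed is there just to make the run repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for repeatability
tensor = np.array([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
percentage = 0.5

# (row, column) coordinates of every element equal to one, shape (4, 2)
coords = np.argwhere(tensor == 1)

# number of ones to keep
k = int(len(coords) * percentage)

# pick k distinct coordinate rows at random
chosen = coords[rng.choice(len(coords), size=k, replace=False)]

# build the result mask
z = np.zeros_like(tensor)
z[chosen[:, 0], chosen[:, 1]] = 1
print(z)
```

If the result has to go back into TensorFlow, z can be passed to tf.constant like the original input.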

Vectorised non zero groups in numpy array

Say you have 1d numpy array:
[0,0,0,0,0,1,2,3,0,0,0,0,4,5,0,0,0]
How would you create the following groups without using for loop?
[1,2,3], [4,5]
Here's one way using np.split:
import numpy as np
a = np.array([0, 0, 0, 0, 0, 1, 2, 3, 0, 0, 0, 0, 4, 5, 0, 0, 0])
### find nonzeros
z = a!=0
### find switching points
z[1:] ^= z[:-1]
### split at switching points and discard zeros
np.split(a, *np.where(z))[1::2]
# [array([1, 2, 3]), array([4, 5])]
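The steps above can be wrapped in a small helper. This is a sketch rather than a drop-in replacement: the function name nonzero_groups is made up, it leaves the input unmodified, and it handles arrays that start with a non-zero value by choosing which alternating pieces to keep:

```python
import numpy as np

def nonzero_groups(a):
    """Split a 1-d array into its runs of consecutive non-zero values."""
    a = np.asarray(a)
    nz = a != 0
    # positions where the zero / non-zero state changes between neighbours
    edges = np.flatnonzero(nz[1:] != nz[:-1]) + 1
    pieces = np.split(a, edges)
    # pieces alternate between zero runs and non-zero runs;
    # start at 0 if the array begins with a non-zero run, else at 1
    start = 0 if nz.size and nz[0] else 1
    return pieces[start::2]

groups = nonzero_groups([0, 0, 0, 0, 0, 1, 2, 3, 0, 0, 0, 0, 4, 5, 0, 0, 0])
print(groups)
```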

Make every possible combination in 2D array

I'm trying to make an array of 4x4 (16) pixel black and white images with all possible combinations. I made the following array as a template:
template = [[0,0,0,0], # start with all white pixels
[0,0,0,0],
[0,0,0,0],
[0,0,0,0]]
I then want to iterate through the template and changing the 0 to 1 for every possible combination.
I tried to iterate with numpy and itertools but can only get 256 combinations, and by my calculations there should be 32000 (Edit: 65536! don't know what happened there...). Anyone with mad skills that could help me out?
As you said, you can use the itertools module to do this, in particular the product function:
import itertools
import numpy as np
# generate all the combinations as tuples of 16 integers (0 or 1)
seq = itertools.product((0, 1), repeat=16)
for s in seq:
    # convert to numpy array and reshape to 4x4
    arr = np.fromiter(s, np.int8).reshape(4, 4)
    # do something with arr
You would have a total of 65536 such combinations of such a (4 x 4) shaped array. Here's a vectorized approach to generate all those combinations, to give us a (65536 x 4 x 4) shaped multi-dim array -
mask = ((np.arange(2**16)[:,None] & (1 << np.arange(16))) != 0)
out = mask.astype(int).reshape(-1,4,4)
Sample run -
In [145]: out.shape
Out[145]: (65536, 4, 4)
In [146]: out
Out[146]:
array([[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[1, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[0, 1, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
...,
[[1, 0, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[0, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]])
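To see what the bit mask is doing, it helps to run the same trick on a grid small enough to inspect by hand. The sketch below generalizes the two lines above to any grid size; the function name all_binary_grids is made up for this example:

```python
import numpy as np

def all_binary_grids(rows, cols):
    """Return every 0/1 grid of the given shape, one grid per integer."""
    n = rows * cols
    numbers = np.arange(2 ** n)[:, None]   # one row per integer 0 .. 2**n - 1
    bit_values = 1 << np.arange(n)         # 1, 2, 4, ... one value per pixel
    mask = (numbers & bit_values) != 0     # True where that bit is set
    return mask.astype(int).reshape(-1, rows, cols)

grids = all_binary_grids(2, 2)
print(grids.shape)  # (16, 2, 2)
print(grids[5])     # 5 is 0b0101, so pixels 0 and 2 are set
```

Pixel k of grid number i is simply bit k of i, which is why every combination appears exactly once.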
One possibility which relies on a for loop
out = []
for i in range(2**16):
    out.append(np.frombuffer("{:016b}".format(i).encode('utf8')).view(np.uint8).reshape(4,4)-48)
Obviously you could make that a list comprehension if you like.
It takes advantage of Python string formatting which is able to produce the binary representation of integers. The format string instructs it to use 16 places filling with zeros on the left. The string is then encoded to give a bytes object which numpy can interpret as an array.
In the end we subtract the code for the character "0" to get a proper 0. Luckily, "1" sits just above "0", so that's all we need to do.
First I iterate over all numbers from 0 to (2^16)-1 and create a 16-character binary string for each of them, which covers all possible combinations.
After that I convert the string to a list and make a 2d list out of it using a list comprehension and slicing.
all_combinations = []
for i in range(2**16):
    binary = '{0:016b}'.format(i)  # convert the number to a binary string
    binary = list(map(int, binary))  # string to list of ints (Python 3; a bare map() is enough in Python 2)
    template = [binary[j:j+4] for j in range(0, len(binary), 4)]  # create the 2d list
    all_combinations.append(template)

How Do I create a Binomial Array of specific size

I am trying to generate a numpy array of length 100 randomly filled with sets of 5 1s and 0s as such:
[ [1,1,1,1,1] , [0,0,0,0,0] , [0,0,0,0,0] ... [1,1,1,1,1], [0,0,0,0,0] ]
Essentially there should be a 50% chance that at each position there will be 5 1s and a 50% chance there will be 5 0's
Currently, I have been messing about with numpy.random.binomial(), and tried running:
numpy.random.binomial(1, .5 , (100,5))
but this creates an array as such:
[ [0,1,0,0,1] , [0,1,1,1,0] , [1,1,0,0,1] ... ]
I need each set of elements to be consistent and not random. How can I do this?
Use numpy.random.randint to generate a random column of 100 1s and 0s, then use tile to repeat the column 5 times:
>>> numpy.tile(numpy.random.randint(0, 2, size=(100, 1)), 5)
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
...
You want to create a temporary array of zeros and ones, and then randomly index into that array to create a new array. In the code below, the first line creates an array whose 0'th row contains all zeros and whose 1st row contains all ones. The function randint returns a random sequence of zeros and ones, which can be used as indices into the temporary array.
import numpy as np
...
def make_coded_array(n, k=5):
    rows = np.array([[0]*k, [1]*k])
    return rows[np.random.randint(2, size=n)]
import numpy as np
import random
c = random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1)
for i in range(99):
    # axis=0 stacks the new row; without it np.append flattens everything to 1-d
    c = np.append(c, random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1), axis=0)
Not the most efficient way though
Use numpy.ones and numpy.random.binomial
>>> numpy.ones((100, 5), dtype=numpy.int64) * numpy.random.binomial(1, .5, (100, 1))
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
...
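Another way to write the "one random bit per row, repeated across the row" idea is with NumPy's Generator API and np.repeat. This is only a sketch; the names rng, flags and result are mine:

```python
import numpy as np

rng = np.random.default_rng()

# one random 0/1 value per row, as a column of shape (100, 1)
flags = rng.integers(0, 2, size=(100, 1))

# repeat each value 5 times along the second axis -> shape (100, 5)
result = np.repeat(flags, 5, axis=1)
print(result[:5])
```

Each row is either all zeros or all ones, with a 50% chance of each, because the upper bound in rng.integers(0, 2) is exclusive.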
