How Do I create a Binomial Array of specific size

I am trying to generate a numpy array of length 100 randomly filled with sets of 5 1s and 0s as such:
[ [1,1,1,1,1] , [0,0,0,0,0] , [0,0,0,0,0] ... [1,1,1,1,1], [0,0,0,0,0] ]
Essentially, at each position there should be a 50% chance of five 1s and a 50% chance of five 0s.
Currently, I have been messing about with numpy.random.binomial(), and tried running:
numpy.random.binomial(1, .5 , (100,5))
but this creates an array as such:
[ [0,1,0,0,1] , [0,1,1,1,0] , [1,1,0,0,1] ... ]
I need each set of elements to be consistent and not random. How can I do this?

Use numpy.random.randint to generate a random column of 100 1s and 0s, then use tile to repeat the column 5 times:
>>> numpy.tile(numpy.random.randint(0, 2, size=(100, 1)), 5)
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
...

You want to create a temporary array of zeros and ones, and then randomly index into that array to create a new array. In the code below, the first line creates an array whose 0th row contains all zeros and whose 1st row contains all ones. The function randint returns a random sequence of zeros and ones, which is used as row indices into the temporary array.
import numpy as np
...
def make_coded_array(n, k=5):
    # row 0 is all zeros, row 1 is all ones
    _ = np.array([[0]*k,[1]*k])
    # pick one of the two rows at random, n times
    return _[np.random.randint(2, size=n)]

import numpy as np
import random
c = random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1)
for i in range(99):
    # axis=0 appends a new row instead of flattening the array
    c = np.append(c, random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1), axis=0)
Not the most efficient way, though.

Use numpy.ones and numpy.random.binomial:
>>> numpy.ones((100, 5), dtype=numpy.int64) * numpy.random.binomial(1, .5, (100, 1))
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
...
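
The answers above share one idea: draw a single random bit per row, then copy it across the row. Here is a minimal sketch of that idea using np.repeat and NumPy's newer Generator API. The function name random_constant_rows and its seed parameter are made up for illustration and do not come from the answers.

```python
import numpy as np

def random_constant_rows(n=100, k=5, seed=None):
    # Hypothetical helper: one fair coin flip per row, copied k times.
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=(n, 1))  # column of 0s and 1s
    return np.repeat(bits, k, axis=1)       # shape (n, k)

arr = random_constant_rows(100, 5, seed=0)
print(arr.shape)
print(arr[:4])
```

Every row of arr is either all 0s or all 1s, because each row is built from a single draw.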

Related

Efficient way to substitute repeating np.vstack in python?

I am trying to implement this post in python.
import numpy as np
x = np.array([0,0,0])
for r in range(3):
    x = np.vstack((x, np.array([-r, r, -r])))
x gets this value
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
I am concerned about the runtime cost of calling np.vstack repeatedly. Is there a more efficient way to do this?

Build a list of arrays or lists, and apply np.array (or vstack) to that once:
In [598]: np.array([[-r,r,-r] for r in [0,0,1,2]])
Out[598]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
But if the column pattern is consistent, broadcasting two arrays against each other will be faster:
In [599]: np.array([-1,1,-1])*np.array([0,0,1,2])[:,None]
Out[599]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
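
To make the broadcasting answer work for any number of rows, here is a small sketch. The helper name stacked_rows is hypothetical. It is meant to reproduce the loop from the question (a leading row of zeros, then one row for each r = 0 ... n-1) without repeated np.vstack calls.

```python
import numpy as np

def stacked_rows(n):
    # Row multipliers: the initial zero row, then r = 0, 1, ..., n-1.
    r = np.concatenate(([0], np.arange(n)))
    # An (n+1, 1) column times a (3,) pattern broadcasts to (n+1, 3).
    return r[:, None] * np.array([-1, 1, -1])

print(stacked_rows(3))
```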

Would it be useful to use numpy.tile?
N = 3
A = np.array([[0, *range(0, -N, -1)]]).T
B = np.tile(A, (1, N))
B[:,1] = -B[:,1]
The first line sets the expected number of rows after the first row of zeroes. The second creates a NumPy array from an initial value of 0, followed by the linear sequence 0, -1, -2, down to -N + 1. Note the use of the splat operator, which unpacks the range object so that its elements sit in the same list as the leading 0. Wrapping that list in another list and transposing gives a 2D NumPy array that is a column vector. The third line tiles this vector N times horizontally. Finally, the fourth line negates the second column to get your desired output.
Example Run
In [175]: N = 3
In [176]: A = np.array([[0, *range(0, -N, -1)]]).T
In [177]: B = np.tile(A, (1, N))
In [178]: B[:,1] = -B[:,1]
In [179]: B
Out[179]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])

You can use np.block as follows:
First create a block which you are currently doing inside the for loop
Finally, vertically stack a row of zeros using np.vstack to get the final desired answer
import numpy as np
size = 3
sign = (-1) ** np.arange(1, size + 1)  # general sign array of alternating -1, 1
A = np.ones((size, size), int)
B = np.arange(0, size) * A
B = sign * np.block([B.T])
# array([[ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])
answer = np.vstack([B[0], B])
# array([[ 0, 0, 0],
# [ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])

How to select randomly elements from a tensor, with a condition on the elements to choose?

I have a tensor filled with 0 and 1. Now I want to randomly choose e.g. 50% of the elements which are equal to one. How do I do that?
For example I have the following tensor:
tensor = tf.constant([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
Now I want to randomly choose the coordinates of 50% of the elements which are equal to one (in this case, I want to choose 2 elements out of the 4). The resulting tensor could look as follows:
[[0, 0, 1], [0, 0, 0], [0, 1, 0]]

You can use numpy.
import numpy as np
tensor = np.array([0, 1, 0, 1, 0, 1, 0, 1])
percentage = 0.5
ones_indices = np.where(tensor==1)
ones_length = np.shape(ones_indices)[1]
random_indices = np.random.permutation(ones_length)
ones_indices[0][random_indices][:int(ones_length * percentage)]
Edit: With your definition of a tensor I have adjusted the code:
import numpy as np
tensor = np.array([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
percentage = 0.5
indices = np.where(tensor == 1)
length = np.shape(indices)[1]
random_idx = np.random.permutation(length)
random_idx = random_idx[:int(length * percentage)]
random_indices = (indices[0][random_idx], indices[1][random_idx])
z = np.zeros(np.shape(tensor), dtype=np.int64)
z[random_indices] = 1
# output
z
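
The same selection can be written a little more compactly with np.argwhere and Generator.choice. This is a NumPy-only sketch, not a TensorFlow solution. The function name keep_fraction_of_ones and its seed parameter are assumptions made for illustration.

```python
import numpy as np

def keep_fraction_of_ones(mask, fraction=0.5, seed=None):
    # Hypothetical helper: keep a random fraction of the entries equal to 1.
    rng = np.random.default_rng(seed)
    coords = np.argwhere(mask == 1)       # (num_ones, ndim) coordinates
    n_keep = int(len(coords) * fraction)
    picked = rng.choice(len(coords), size=n_keep, replace=False)
    out = np.zeros_like(mask)
    out[tuple(coords[picked].T)] = 1      # set only the chosen positions
    return out

tensor = np.array([[0, 0, 1], [0, 1, 0], [1, 1, 0]])
result = keep_fraction_of_ones(tensor, 0.5, seed=0)
print(result)
```

Sampling without replacement guarantees that exactly int(num_ones * fraction) distinct positions are kept.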

Make every possible combination in 2D array

I'm trying to make an array of 4x4 (16) pixel black and white images with all possible combinations. I made the following array as a template:
template = [[0,0,0,0], # start with all white pixels
[0,0,0,0],
[0,0,0,0],
[0,0,0,0]]
I then want to iterate through the template and change the 0s to 1s for every possible combination.
I tried to iterate with numpy and itertools but can only get 256 combinations, and with my calculations there should be 32000 (Edit: 65536! don't know what happened there...). Any one with mad skills that could help me out?

As you said, you can use the itertools module to do this, in particular the product function:
import itertools
import numpy as np
# generate all the combinations as string tuples of length 16
seq = itertools.product("01", repeat=16)
for s in seq:
    # convert to numpy array and reshape to 4x4
    arr = np.fromiter(s, np.int8).reshape(4, 4)
    # do something with arr

There are 65536 possible combinations in total for a (4 x 4) shaped array. Here's a vectorized approach that generates all of them as a (65536 x 4 x 4) shaped multi-dim array -
mask = ((np.arange(2**16)[:,None] & (1 << np.arange(16))) != 0)
out = mask.astype(int).reshape(-1,4,4)
Sample run -
In [145]: out.shape
Out[145]: (65536, 4, 4)
In [146]: out
Out[146]:
array([[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[1, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[0, 1, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
...,
[[1, 0, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[0, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]])
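
The bit-mask trick is easier to check on a smaller grid. The sketch below applies the same expression to a 2 x 2 image (4 bits, so 16 combinations) and counts the distinct patterns. The variable names are chosen for illustration only.

```python
import numpy as np

bits = 4                                         # a 2 x 2 image has 4 pixels
numbers = np.arange(2**bits)[:, None]            # column holding 0..15
mask = (numbers & (1 << np.arange(bits))) != 0   # test each bit of each number
small = mask.astype(int).reshape(-1, 2, 2)

# Collect the flattened patterns in a set to count the distinct ones.
distinct = {tuple(p.ravel().tolist()) for p in small}
print(small.shape, len(distinct))
```

Each integer has a unique binary representation, which is why no pattern is repeated or missed.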

One possibility which relies on a for loop:
out = []
for i in range(2**16):
    out.append(np.frombuffer("{:016b}".format(i).encode('utf8')).view(np.uint8).reshape(4,4)-48)
Obviously you could make that a list comprehension if you like.
It takes advantage of Python string formatting which is able to produce the binary representation of integers. The format string instructs it to use 16 places filling with zeros on the left. The string is then encoded to give a bytes object which numpy can interpret as an array.
In the end we subtract the code for the character "0" to get a proper 0. Luckily, "1" sits just above "0", so that's all we need to do.

First, iterate over all numbers from 0 to (2^16)-1. For each number, create a 16-character binary string; together these strings cover all possible combinations.
Then convert each string to a list of ints and turn it into a 2D list using a list comprehension and slicing.
all_combinations = []
for i in range(pow(2, 16)):  # use xrange in Python 2
    binary = '{0:016b}'.format(i)  # convert number to binary string
    binary = list(map(int, binary))  # string to list of ints
    template = [binary[j:j+4] for j in range(0, len(binary), 4)]  # create 2D list
    all_combinations.append(template)

Elements arrangement in a numpy array

import numpy as np
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
How can I do the following?
Within 2 by 2 patch:
if any element is 2: put 2
if any element is 1: put 1
if all elements are 0: put 0
The expected result is:
np.array([[1, 1, 2],
[1, 1, 2]])

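Using extract_patches from scikit-learn you can write this as follows (copy and paste-able code; note that extract_patches comes from older scikit-learn releases and may not be importable in current ones):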
import numpy as np
from sklearn.feature_extraction.image import extract_patches
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
patches = extract_patches(data, patch_shape=(2, 2), extraction_step=(2, 2))
output = patches.max(axis=-1).max(axis=-1)
Explanation: extract_patches gives you a view on patches of your array, of size patch_shape and lying on a grid of extraction_step. The result is a 4D array where the first two axes index the patch and the last two axes index the pixels within the patch. We then evaluate the maximum over the last two axes to obtain the maximum per patch.
EDIT This is actually very much related to this question

I'm not sure where you get your input from or where you are supposed to leave the output, but you can adapt this.
import numpy as np
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
def patchValue(i,j):
    return max([data[i][j],
                data[i][j+1],
                data[i+1][j],
                data[i+1][j+1]])
result = np.array([[0, 0, 0],
                   [0, 0, 0]])
for (v,i) in enumerate(range(0,4,2)):
    for (w,j) in enumerate(range(0,6,2)):
        result[v][w] = patchValue(i,j)
print(result)

Here's a rather lengthy one-liner that relies solely on reshaping, transposes, and taking maximums along different axes. It is fairly fast too.
data.reshape((-1,2)).max(axis=1).reshape((data.shape[0],-1)).T.reshape((-1,2)).max(axis=1).reshape((data.shape[1]//2,data.shape[0]//2)).T
Essentially what this does is reshape to take the maximum in pairs of two horizontally, then shuffle things around again and take the maximum in pairs of two vertically, ultimately giving the maximum of each block of 4, matching your desired output.
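
A shorter way to express the same block-wise maximum, assuming both dimensions are divisible by the block size, is to reshape into a 4-D array of blocks and reduce over the two within-block axes. The helper name block_max is hypothetical.

```python
import numpy as np

def block_max(a, bh=2, bw=2):
    # Assumes a.shape is an exact multiple of (bh, bw).
    h, w = a.shape
    # Axes: (block row, row inside block, block column, column inside block)
    blocks = a.reshape(h // bh, bh, w // bw, bw)
    return blocks.max(axis=(1, 3))

data = np.array([[0, 0, 1, 1, 2, 2],
                 [1, 0, 0, 1, 2, 2],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 2, 0]])
print(block_max(data))
```

Taking the maximum works here because the rule in the question (2 beats 1, 1 beats 0) matches the numeric order of the values.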

If the original array is large, and performance is an issue, the loops can be pushed down to numpy C code by manipulating the shape and strides of the original array to create the windows that you are acting on:
import numpy as np
from numpy.lib.stride_tricks import as_strided
data = np.array([[0, 0, 1, 1, 2, 2],
[1, 0, 0, 1, 2, 2],
[1, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 2, 0]])
patch_shape = (2,2)
data_shape = np.array(data.shape)
# transform data to a 2x3 array of 2x2 patches/windows
# final shape of the computation on the windows can be calculated with:
# tuple(((data_shape-patch_shape) // patch_shape) + 1)
final_shape = (2,3)
# the shape of the windowed array can be calculated with:
# final_shape + patch_shape
newshape = (2, 3, 2, 2)
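# the strides of the windowed array are computed from the data's own strides,
# because the byte counts depend on the dtype (e.g. (48, 8, 24, 4) for 4-byte ints):
newstrides = tuple(int(s) * p for s, p in zip(data.strides, patch_shape)) + data.strides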
# use as_strided to 'transform' the array
patch_array = as_strided(data, shape = newshape, strides = newstrides)
# flatten the windowed array for iteration - dim of 6x2x2
# the number of windows is the product of the 'first' dimensions of the array
# which can be calculated with:
# (np.prod(newshape[:-len(patch_shape)]),) + newshape[-len(patch_shape):]
dim = (6,2,2)
patch_array = patch_array.reshape(dim)
# perform computations on the windows and reshape to final dimensions
result = [2 if np.any(patch == 2) else
1 if np.any(patch == 1) else
0 for patch in patch_array]
result = np.array(result).reshape(final_shape)
A generalized 1-d function for creating the windowed array can be found at Efficient rolling statistics with NumPy
A generalised multi-dimension function and a nice explanation can be found at Efficient Overlapping Windows with Numpy
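
On NumPy 1.20 or newer, sliding_window_view builds a windowed view without hand-computed strides. The sketch below is an alternative to the as_strided code, not a copy of it: it takes every second window in each direction to get non-overlapping 2 x 2 patches, and it uses a maximum per patch, which matches the rule in the question only because the values are ordered 0 < 1 < 2.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([[0, 0, 1, 1, 2, 2],
                 [1, 0, 0, 1, 2, 2],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 2, 0]])

windows = sliding_window_view(data, (2, 2))  # overlapping windows, shape (3, 5, 2, 2)
patches = windows[::2, ::2]                  # non-overlapping patches, shape (2, 3, 2, 2)
result = patches.max(axis=(2, 3))            # reduce each 2 x 2 patch to one value
print(result)
```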

matlab find() for nonzero element in python

I have a sparse matrix (numpy.array) and I would like to have the index of the nonzero elements in it.
In Matlab I would write:
[i, j] = find(CM)
and in Python what should I do?
I have tried numpy.nonzero (but I don't know how to take the indices from that) and flatnonzero (but it's not convenient for me, I need both the row and column index).
Thanks in advance!

Assuming that by "sparse matrix" you don't actually mean a scipy.sparse matrix, but merely a numpy.ndarray with relatively few nonzero entries, then I think nonzero is exactly what you're looking for. Starting from an array:
>>> a = (np.random.random((5,5)) < 0.10)*1
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
nonzero returns a tuple of index arrays, one per dimension (here the row indices and the column indices), marking where the nonzero entries live:
>>> a.nonzero()
(array([1, 2, 3]), array([4, 2, 0]))
We can assign these to i and j:
>>> i, j = a.nonzero()
We can also use them to index back into a, which should give us only 1s:
>>> a[i,j]
array([1, 1, 1])
We can even modify a using these indices:
>>> a[i,j] = 2
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 2],
[0, 0, 2, 0, 0],
[2, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
If you want a combined array from the indices, you can do that too:
>>> np.array(a.nonzero()).T
array([[1, 4],
[2, 2],
[3, 0]])
(there are lots of ways to do this reshaping; I chose one almost at random.)
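
One of those other ways is np.argwhere, which returns the coordinates as rows directly. A small sketch on a fixed array (the same values as the example above, written out by hand rather than drawn at random):

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]])

i, j = np.nonzero(a)    # row and column indices (0-based, unlike MATLAB)
pairs = np.argwhere(a)  # one (row, col) pair per nonzero entry
print(i, j)
print(pairs)
```

Keep in mind that the indices are 0-based and listed in row-major order, so the ordering can differ from what MATLAB's find returns.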

This goes slightly beyond what you ask, and I only mention it since I once faced a similar problem. If you want the indices to access some other array, there is some very simple syntax:
import numpy as np
array = np.random.randint(0, 2, size=(3, 3))
data = np.random.random(size=(3, 3))
Now array looks something like
>>> array
array([[0, 1, 0],
[1, 0, 1],
[1, 1, 0]])
while data could be
>>> data
array([[ 0.92824816, 0.43605604, 0.16627849],
[ 0.00301434, 0.94342538, 0.95297402],
[ 0.32665135, 0.03504204, 0.86902492]])
Then if we want the elements of data at the positions where array is zero:
>>> data[array==0]
array([ 0.92824816, 0.16627849, 0.94342538, 0.86902492])
Which is nice and simple.
