Efficient way to substitute repeating np.vstack in python? - python

I am trying to implement this post in python.
import numpy as np
x = np.array([0,0,0])
for r in range(3):
    x = np.vstack((x, np.array([-r, r, -r])))
x gets this value
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
I am concerned about the runtime cost of calling np.vstack repeatedly. Is there a more efficient way to do this?

Build a list of arrays or lists, and apply np.array (or vstack) to that once:
In [598]: np.array([[-r,r,-r] for r in [0,0,1,2]])
Out[598]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
But if the column pattern is consistent, broadcasting two arrays against each other will be faster
In [599]: np.array([-1,1,-1])*np.array([0,0,1,2])[:,None]
Out[599]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
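A minimal sketch of the two suggestions above, with a preallocated version added for comparison. The function names (stack_once, broadcast, preallocate) are illustrative and do not come from the original posts; which variant is fastest will depend on the array size, so it is worth timing them on your own data.

```python
import numpy as np

def stack_once(n):
    # Collect the rows in a plain Python list, then convert a single time.
    rows = [[0, 0, 0]] + [[-r, r, -r] for r in range(n)]
    return np.array(rows)

def broadcast(n):
    # Column of multipliers with shape (n + 1, 1) times the sign pattern (3,).
    mult = np.concatenate(([0], np.arange(n)))
    return mult[:, None] * np.array([-1, 1, -1])

def preallocate(n):
    # Allocate the final array up front and fill it row by row.
    out = np.zeros((n + 1, 3), dtype=int)
    for r in range(n):
        out[r + 1] = (-r, r, -r)
    return out

expected = np.array([[0, 0, 0], [0, 0, 0], [-1, 1, -1], [-2, 2, -2]])
```

All three avoid reallocating the whole array on every iteration, which is what repeated np.vstack does.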

Would it be useful to use numpy.tile?
N = 3
A = np.array([[0, *range(0, -N, -1)]]).T
B = np.tile(A, (1, N))
B[:,1] = -B[:,1]
The first line sets the number of rows that follow the initial row of zeros. The second builds a list that starts with 0 and continues with the sequence 0, -1, -2, ..., -N + 1; the splat operator (*) unpacks the range object into individual list elements. Wrapping that list in a 2D NumPy array and transposing it gives a column vector. The third line tiles this vector N times horizontally, and the fourth negates the second column to produce the desired output.
Example Run
In [175]: N = 3
In [176]: A = np.array([[0, *range(0, -N, -1)]]).T
In [177]: B = np.tile(A, (1, N))
In [178]: B[:,1] = -B[:,1]
In [179]: B
Out[179]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])

You can use np.block as follows:
First, create the block that you are currently building inside the for loop.
Finally, vertically stack a row of zeros on top using np.vstack to get the desired answer.
import numpy as np
size = 3
sign = np.ones(size)*((-1)**np.arange(1, size+1)) # Alternating signs: -1, 1, -1, ...
A = np.ones((size, size), int)
B = np.arange(0, size) * A
B = sign * np.block([B.T])
# array([[ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])
answer = np.vstack([B[0], B])
# array([[ 0, 0, 0],
# [ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])

Related

Iterating through matrices Python

If I have two lists and want to iterate through subtracting one from the other how would I go about this? I was thinking broadcasting. Right now I have:
array1 = [0,2,2,0]
array2 = [2,2,0,1]
I would like to subtract array1 from each value in array2 and make a new matrix of outputs:
output = [2, 0, 0, 2,
2, 0, 0, 2,
0, -2, -2, 0,
1, -1, -1, 1]
so in the end it's a 4x4 matrix.
Is this possible? Is the easiest way to use broadcasting? I was thinking of making each row value in array2 into its own array, subtracting array1 from it using broadcasting, then combining all the arrays at the end into one big array (using NumPy)... is there an easier way?
Broadcasting with numpy:
>>> a1 = np.array([0,2,2,0])
>>> a2 = np.array([2,2,0,1])
>>> a2[:, np.newaxis] - a1
array([[ 2, 0, 0, 2],
[ 2, 0, 0, 2],
[ 0, -2, -2, 0],
[ 1, -1, -1, 1]])
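As a small sketch of an equivalent spelling: NumPy ufuncs have an outer method, so np.subtract.outer should give the same table without the explicit np.newaxis. The variable names by_broadcast and by_outer are only for illustration.

```python
import numpy as np

a1 = np.array([0, 2, 2, 0])
a2 = np.array([2, 2, 0, 1])

# Explicit broadcasting: shape (4, 1) minus shape (4,) gives shape (4, 4).
by_broadcast = a2[:, np.newaxis] - a1

# The ufunc's outer method computes a2[i] - a1[j] for every pair (i, j).
by_outer = np.subtract.outer(a2, a1)
```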
Something like this?
def all_differences(x, y):
    return (a - b for a in y for b in x)
print(list(all_differences([0, 2, 2, 0], [2, 2, 0,1])))
# -> [2, 0, 0, 2, 2, 0, 0, 2, 0, -2, -2, 0, 1, -1, -1, 1]
It just iterates over every item in the first list for every item in the second list, and yields their difference.
This can also be solved with itertools.product and can be generalised for multiple lists:
import itertools
import functools
import operator
difference = functools.partial(functools.reduce, operator.sub)
def all_differences(*lists):
    return map(difference, itertools.product(*reversed(lists)))
print(list(all_differences([0, 2, 2, 0], [2, 2, 0,1])))
Or just handling two lists:
import itertools
def all_differences(x, y):
    # product(y, x) pairs each item of y with every item of x
    return (a - b for (a, b) in itertools.product(y, x))
print(list(all_differences([0, 2, 2, 0], [2, 2, 0,1])))
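The generators above yield a flat sequence of 16 values. If the 4x4 layout from the question is wanted without NumPy, a nested list comprehension is one possible sketch; difference_matrix is a hypothetical helper name, not part of the answers above.

```python
def difference_matrix(x, y):
    # One row per item of y, one column per item of x.
    return [[a - b for b in x] for a in y]

matrix = difference_matrix([0, 2, 2, 0], [2, 2, 0, 1])
for row in matrix:
    print(row)
```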

Numpy: increment elements of an array given the indices required to increment

I am trying to turn a second order tensor into a binary third order tensor. Given a second order tensor as a m x n numpy array: A, I need to take each element value: x, in A and replace it with a vector: v, with dimensions equal to the maximum value of A, but with a value of 1 incremented at the index of v corresponding to the value x (i.e. v[x] = 1). I have been following this question: Increment given indices in a matrix, which addresses producing an array with increments at indices given by 2 dimensional coordinates. I have been reading the answers and trying to use np.ravel_multi_index() and np.bincount() to do the same but with 3 dimensional coordinates, however I keep on getting a ValueError: "invalid entry in coordinates array". This is what I have been using:
def expand_to_tensor_3(array):
    (x, y) = array.shape
    (a, b) = np.indices((x, y))
    a = a.reshape(x*y)
    b = b.reshape(x*y)
    tensor_3 = np.bincount(np.ravel_multi_index((a, b, array.reshape(x*y)), (x, y, np.amax(array))))
    return tensor_3
If you know what is wrong here or know an even better method to accomplish my goal, both would be really helpful, thanks.
You can use (A[:,:,np.newaxis] == np.arange(A.max()+1)).astype(int).
Here's a demonstration:
In [52]: A
Out[52]:
array([[2, 0, 0, 2],
[3, 1, 2, 3],
[3, 2, 1, 0]])
In [53]: B = (A[:,:,np.newaxis] == np.arange(A.max()+1)).astype(int)
In [54]: B
Out[54]:
array([[[0, 0, 1, 0],
[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 1, 0]],
[[0, 0, 0, 1],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]],
[[0, 0, 0, 1],
[0, 0, 1, 0],
[0, 1, 0, 0],
[1, 0, 0, 0]]])
Check a few individual elements of A:
In [55]: A[0,0]
Out[55]: 2
In [56]: B[0,0,:]
Out[56]: array([0, 0, 1, 0])
In [57]: A[1,3]
Out[57]: 3
In [58]: B[1,3,:]
Out[58]: array([0, 0, 0, 1])
The expression A[:,:,np.newaxis] == np.arange(A.max()+1) uses broadcasting to compare each element of A to np.arange(A.max()+1). For a single value, this looks like:
In [63]: 3 == np.arange(A.max()+1)
Out[63]: array([False, False, False, True], dtype=bool)
In [64]: (3 == np.arange(A.max()+1)).astype(int)
Out[64]: array([0, 0, 0, 1])
A[:,:,np.newaxis] is a three-dimensional view of A with shape (3,4,1). The extra dimension is added so that the comparison to np.arange(A.max()+1) will broadcast to each element, giving a result with shape (3, 4, A.max()+1).
With a trivial change, this will work for an n-dimensional array. Indexing a numpy array with the ellipsis ... means "all the other dimensions". So
(A[..., np.newaxis] == np.arange(A.max()+1)).astype(int)
converts an n-dimensional array to an (n+1)-dimensional array, where the last dimension is the binary indicator of the integer in A. Here's an example with a one-dimensional array:
In [6]: a = np.array([3, 4, 0, 1])
In [7]: (a[...,np.newaxis] == np.arange(a.max()+1)).astype(int)
Out[7]:
array([[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0]])
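Another way to get the same indicator tensor, sketched here under the assumption that A holds non-negative integers, is to index the rows of an identity matrix with A. The names depth, by_compare and by_eye are illustrative only.

```python
import numpy as np

A = np.array([[2, 0, 0, 2],
              [3, 1, 2, 3],
              [3, 2, 1, 0]])

depth = A.max() + 1

# Comparison-based version from the answer above.
by_compare = (A[..., np.newaxis] == np.arange(depth)).astype(int)

# Row k of the identity matrix is the indicator vector for the value k,
# so fancy-indexing with A picks one such row per element of A.
by_eye = np.eye(depth, dtype=int)[A]
```

Both results have shape A.shape + (depth,).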
You can make it work this way:
tensor_3 = np.bincount(np.ravel_multi_index((a, b, array.reshape(x*y)),
(x, y, np.amax(array) + 1)))
The difference is that I add 1 to the amax() result, because ravel_multi_index() expects that the indexes are all strictly less than the dimensions, not less-or-equal.
I'm not 100% sure if this is what you wanted; another way to make the code run is to specify mode='clip' or mode='wrap' in ravel_multi_index(), which does something a bit different and I'm guessing is less correct. But you can try it.
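Note that np.bincount returns a flat array, so a reshape is still needed to get the third order tensor. Below is a hedged sketch of the complete function; the minlength argument and the final reshape are additions that are not in the snippet above. minlength matters because bincount otherwise stops at the largest index it sees, which can be shorter than x * y * depth.

```python
import numpy as np

def expand_to_tensor_3(array):
    x, y = array.shape
    depth = int(np.amax(array)) + 1  # indices must be strictly below the dimension
    a, b = np.indices((x, y))
    flat = np.ravel_multi_index(
        (a.ravel(), b.ravel(), array.ravel()), (x, y, depth)
    )
    # Pad the counts to the full size so the reshape always succeeds.
    counts = np.bincount(flat, minlength=x * y * depth)
    return counts.reshape(x, y, depth)

A = np.array([[2, 0, 0, 2],
              [3, 1, 2, 3],
              [3, 2, 1, 0]])
T = expand_to_tensor_3(A)
```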

How Do I create a Binomial Array of specific size

I am trying to generate a numpy array of length 100 randomly filled with sets of 5 1s and 0s as such:
[ [1,1,1,1,1] , [0,0,0,0,0] , [0,0,0,0,0] ... [1,1,1,1,1], [0,0,0,0,0] ]
Essentially there should be a 50% chance that at each position there will be 5 1s and a 50% chance there will be 5 0's
Currently, I have been messing about with numpy.random.binomial(), and tried running:
numpy.random.binomial(1, .5 , (100,5))
but this creates an array as such:
[ [0,1,0,0,1] , [0,1,1,1,0] , [1,1,0,0,1] ... ]
I need each set of elements to be consistent and not random. How can I do this?
Use numpy.random.randint to generate a random column of 100 1s and 0s, then use tile to repeat the column 5 times:
>>> numpy.tile(numpy.random.randint(0, 2, size=(100, 1)), 5)
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 0, 0, 0],
...
You want to create a temporary array of zeros and ones, and then randomly index into that array to create a new array. In the code below, the first line creates an array whose 0'th row contains all zeros and whose 1st row contains all ones. The function randint returns a random sequence of zeros and ones, which can be used as indices into the temporary array.
import numpy as np
...
def make_coded_array(n, k=5):
    # Row 0 is all zeros, row 1 is all ones.
    rows = np.array([[0]*k, [1]*k])
    # Use n random 0/1 values as row indices.
    return rows[np.random.randint(2, size=n)]
import numpy as np
import random
c = random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1)
for i in range(99):
    c = np.append(c, random.sample([[0,0,0,0,0],[1,1,1,1,1]], 1), axis=0)
This is not the most efficient way, though.
Use numpy.ones and numpy.random.binomial
>>> numpy.ones((100, 5), dtype=numpy.int64) * numpy.random.binomial(1, .5, (100, 1))
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
...
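The same idea can be sketched with NumPy's Generator interface and np.repeat. The seed value and the variable names below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# One random 0/1 draw per row, kept as a column of shape (100, 1).
column = rng.integers(0, 2, size=(100, 1))

# Copy that single draw across 5 columns so each row is all 0s or all 1s.
result = np.repeat(column, 5, axis=1)
```

Drawing one value per row and copying it is what keeps each row consistent; drawing a (100, 5) array directly gives independent values in every cell.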

matlab find() for nonzero element in python

I have a sparse matrix (numpy.array) and I would like to have the index of the nonzero elements in it.
In Matlab I would write:
[i, j] = find(CM)
and in Python what should I do?
I have tried numpy.nonzero (but I don't know how to take the indices from that) and flatnonzero (but it's not convenient for me, I need both the row and column index).
Thanks in advance!
Assuming that by "sparse matrix" you don't actually mean a scipy.sparse matrix, but merely a numpy.ndarray with relatively few nonzero entries, then I think nonzero is exactly what you're looking for. Starting from an array:
>>> a = (np.random.random((5,5)) < 0.10)*1
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
nonzero returns the row and column indices where the nonzero entries live:
>>> a.nonzero()
(array([1, 2, 3]), array([4, 2, 0]))
We can assign these to i and j:
>>> i, j = a.nonzero()
We can also use them to index back into a, which should give us only 1s:
>>> a[i,j]
array([1, 1, 1])
We can even modify a using these indices:
>>> a[i,j] = 2
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 2],
[0, 0, 2, 0, 0],
[2, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
If you want a combined array from the indices, you can do that too:
>>> np.array(a.nonzero()).T
array([[1, 4],
[2, 2],
[3, 0]])
(there are lots of ways to do this reshaping; I chose one almost at random.)
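For the combined (row, column) form, np.argwhere is one ready-made option. A short sketch using the same example array:

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]])

# Separate index arrays, closest to MATLAB's [i, j] = find(CM).
i, j = np.nonzero(a)

# One (row, column) pair per nonzero entry.
pairs = np.argwhere(a)
```

Note that the indices come out in row-major order, whereas MATLAB's find walks the matrix column by column.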
This goes slightly beyond what you asked, and I only mention it since I once faced a similar problem. If you want the indices to access some other array, there is some very simple syntax:
import numpy as np
array = np.random.randint(0, 2, size=(3, 3))
data = np.random.random(size=(3, 3))
Now array looks something like
>>> array
array([[0, 1, 0],
[1, 0, 1],
[1, 1, 0]])
while data could be
>>> data
array([[ 0.92824816, 0.43605604, 0.16627849],
[ 0.00301434, 0.94342538, 0.95297402],
[ 0.32665135, 0.03504204, 0.86902492]])
Then if we want the elements of data where array is zero:
>>> data[array==0]
array([ 0.92824816, 0.16627849, 0.94342538, 0.86902492])
Which is nice and simple.
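A small sketch of how the boolean mask relates to explicit indices: indexing with the mask and indexing with the arrays returned by np.nonzero should select the same elements in the same order. The data values here are stand-ins, not the random numbers shown above.

```python
import numpy as np

array = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 0]])
data = np.arange(9.0).reshape(3, 3)  # stand-in values 0.0 .. 8.0

mask = array == 0

# Boolean-mask indexing.
by_mask = data[mask]

# Equivalent indexing with explicit row and column indices.
rows, cols = np.nonzero(mask)
by_index = data[rows, cols]
```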

Concatenating column vectors using numpy arrays

I'd like to concatenate 'column' vectors using numpy arrays but because numpy sees all arrays as row vectors by default, np.hstack and np.concatenate along any axis don't help (and neither did np.transpose as expected).
a = np.array((0, 1))
b = np.array((2, 1))
c = np.array((-1, -1))
np.hstack((a, b, c))
# array([ 0, 1, 2, 1, -1, -1]) ## Noooooo
np.reshape(np.hstack((a, b, c)), (2, 3))
# array([[ 0, 1, 2], [ 1, -1, -1]]) ## Reshaping won't help
One possibility (but too cumbersome) is
np.hstack((a[:, np.newaxis], b[:, np.newaxis], c[:, np.newaxis]))
# array([[ 0, 2, -1], [ 1, 1, -1]]) ##
Are there better ways?
I believe numpy.column_stack should do what you want.
Example:
>>> a = np.array((0, 1))
>>> b = np.array((2, 1))
>>> c = np.array((-1, -1))
>>> np.column_stack((a,b,c))
array([[ 0, 2, -1],
[ 1, 1, -1]])
It is essentially equivalent to
>>> np.vstack((a,b,c)).T
as it says in the documentation.
I tried the following. I hope this is good enough for what you are doing.
>>> np.vstack((a,b,c))
array([[ 0, 1],
[ 2, 1],
[-1, -1]])
>>> np.vstack((a,b,c)).T
array([[ 0, 2, -1],
[ 1, 1, -1]])
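For completeness, np.stack with axis=1 is another spelling that should give the same result for 1-D inputs. A brief sketch comparing the three:

```python
import numpy as np

a = np.array((0, 1))
b = np.array((2, 1))
c = np.array((-1, -1))

# Each 1-D input becomes one column of the result.
by_column_stack = np.column_stack((a, b, c))

# Stack along a new second axis: same layout for 1-D inputs.
by_stack = np.stack((a, b, c), axis=1)

# Stack as rows, then transpose.
by_vstack_t = np.vstack((a, b, c)).T
```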
