Iterating through matrices in Python

If I have two lists and want to iterate through subtracting one from the other how would I go about this? I was thinking broadcasting. Right now I have:
array1 = [0,2,2,0]
array2 = [2,2,0,1]
I would like to subtract array1 from each value in array2 and make a new matrix of outputs:
output = [2, 0, 0, 2,
2, 0, 0, 2,
0, -2, -2, 0,
1, -1, -1, 1]
so in the end it's a 4x4 matrix.
Is this possible? Is the easiest way to use broadcasting? I was thinking of making each row value in array2 into its own array, subtracting array1 from that using broadcasting, then combining all the arrays at the end into one big array (using NumPy)... is there an easier way?

Broadcasting with numpy:
>>> a1 = np.array([0,2,2,0])
>>> a2 = np.array([2,2,0,1])
>>> a2[:, np.newaxis] - a1
array([[ 2,  0,  0,  2],
       [ 2,  0,  0,  2],
       [ 0, -2, -2,  0],
       [ 1, -1, -1,  1]])
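For what it's worth, the same table of pairwise differences can also be produced with the subtract ufunc's outer method; a minimal sketch using the arrays above:
>>> np.subtract.outer(a2, a1)
array([[ 2,  0,  0,  2],
       [ 2,  0,  0,  2],
       [ 0, -2, -2,  0],
       [ 1, -1, -1,  1]])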

Something like this?
def all_differences(x, y):
    return (a - b for a in y for b in x)

print(list(all_differences([0, 2, 2, 0], [2, 2, 0, 1])))
# -> [2, 0, 0, 2, 2, 0, 0, 2, 0, -2, -2, 0, 1, -1, -1, 1]
For every item in the second list, it just iterates over every item in the first list and yields their difference.
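Spelled out as plain nested loops, the generator is roughly equivalent to this sketch:
def all_differences(x, y):
    out = []
    for a in y:        # outer loop: each value of the second list
        for b in x:    # inner loop: each value of the first list
            out.append(a - b)
    return out

print(all_differences([0, 2, 2, 0], [2, 2, 0, 1]))
# -> [2, 0, 0, 2, 2, 0, 0, 2, 0, -2, -2, 0, 1, -1, -1, 1]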
This can also be solved with itertools.product and can be generalised for multiple lists:
import itertools
import functools
import operator
difference = functools.partial(functools.reduce, operator.sub)
def all_differences(*lists):
    return map(difference, itertools.product(*reversed(lists)))

print(list(all_differences([0, 2, 2, 0], [2, 2, 0, 1])))
Or just handling two lists:
import itertools
def all_differences(x, y):
    return (b - a for (b, a) in itertools.product(y, x))

print(list(all_differences([0, 2, 2, 0], [2, 2, 0, 1])))

Related

Find first n non-zero values in a numpy 2D array

I would like to know the fastest way to extract the indices of the first n non zero values per column in a 2D array.
For example, with the following array:
arr = [
    [4, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 4, 0, 0],
    [2, 0, 9, 0],
    [6, 0, 0, 0],
    [0, 7, 0, 0],
    [3, 0, 0, 0],
    [1, 2, 0, 0],
]
With n=2 I would have [0, 0, 1, 1, 2] as xs and [0, 3, 2, 5, 3] as ys: two values in the first and second columns and one in the third.
Here is how it is currently done:
x = []
y = []
n = 3
for i, c in enumerate(arr.T):
    a = c.nonzero()[0][:n]
    if len(a):
        x.extend([i] * len(a))
        y.extend(a)
In practice I have arrays of size (405, 256).
Is there a way to make it faster?
Here is a method that, although a bit dense because it chains several functions, does not require sorting the array (only a linear scan is needed to find the non-zero values):
n = 2
# Get indices of non-zero values, column indices first
nnull = np.stack(np.where(arr.T != 0))
# Split the index positions by unique value of column
cols_ids = np.array_split(range(len(nnull[0])), np.where(np.diff(nnull[0]) > 0)[0] + 1)
# Take at most n in each group and concatenate the whole
np.concatenate([nnull[:, u[:n]] for u in cols_ids], axis=1)
outputs:
array([[0, 0, 1, 1, 2],
       [0, 3, 2, 5, 3]], dtype=int64)
Here is one approach using argsort; it gives a different order, though:
n = 2
m = arr!=0
# non-zero values first
idx = np.argsort(~m, axis=0)
# get first 2 and ensure non-zero
m2 = np.take_along_axis(m, idx, axis=0)[:n]
y,x = np.where(m2)
# slice
x, idx[y,x]
# (array([0, 1, 2, 0, 1]), array([0, 2, 3, 3, 5]))
Use a shifted comparison on the column indices returned by the transposed nonzero: an entry is among the first n of its column exactly when the entry n positions earlier belongs to a different column.
>>> n = 2
>>> i, j = arr.T.nonzero()
>>> mask = np.concatenate([[True] * n, i[n:] != i[:-n]])
>>> i[mask], j[mask]
(array([0, 0, 1, 1, 2], dtype=int64), array([0, 3, 2, 5, 3], dtype=int64))
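Wrapped in a small helper (the function name here is my own), the same shifted comparison reads:
import numpy as np

def first_n_nonzero_per_column(arr, n):
    # i: column indices of arr (sorted), j: row indices of the non-zero entries
    i, j = arr.T.nonzero()
    # keep an entry if the one n positions earlier lies in a different column,
    # i.e. the entry is among the first n of its own column
    mask = np.concatenate([[True] * n, i[n:] != i[:-n]])
    return i[mask], j[mask]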

Efficient way to substitute repeating np.vstack in python?

I am trying to implement this post in python.
import numpy as np
x = np.array([0, 0, 0])
for r in range(3):
    x = np.vstack((x, np.array([-r, r, -r])))
x gets this value
array([[ 0,  0,  0],
       [ 0,  0,  0],
       [-1,  1, -1],
       [-2,  2, -2]])
I am concerned about the runtime efficiency of the repeated np.vstack. Is there a more efficient way to do this?
Build a list of arrays or lists, and apply np.array (or vstack) to that once:
In [598]: np.array([[-r,r,-r] for r in [0,0,1,2]])
Out[598]:
array([[ 0,  0,  0],
       [ 0,  0,  0],
       [-1,  1, -1],
       [-2,  2, -2]])
But if the column pattern is consistent, broadcasting two arrays against each other will be faster:
In [599]: np.array([-1,1,-1])*np.array([0,0,1,2])[:,None]
Out[599]:
array([[ 0,  0,  0],
       [ 0,  0,  0],
       [-1,  1, -1],
       [-2,  2, -2]])
Would it be useful to use numpy.tile?
N = 3
A = np.array([[0, *range(0, -N, -1)]]).T
B = np.tile(A, (1, N))
B[:,1] = -B[:,1]
The first line sets the expected number of rows after the initial row of zeroes. The second line builds a NumPy array from an initial value of 0 followed by the linear sequence 0, -1, -2, down to -N + 1; note the splat operator, which unpacks the range object into individual list elements. These are concatenated with the leading 0, giving a 2D NumPy array that is a column vector. The third line tiles this vector N times horizontally. Finally, the fourth line negates the second column to get your desired output.
Example Run
In [175]: N = 3
In [176]: A = np.array([[0, *range(0, -N, -1)]]).T
In [177]: B = np.tile(A, (1, N))
In [178]: B[:,1] = -B[:,1]
In [178]: B
Out[178]:
array([[ 0,  0,  0],
       [ 0,  0,  0],
       [-1,  1, -1],
       [-2,  2, -2]])
You can use np.block as follows: first create the block that you are currently building inside the for loop, then vertically stack a row of zeros on top of it using np.vstack to get the final desired answer.
import numpy as np
size = 3
sign = np.ones(size, dtype=int) * ((-1)**np.arange(1, size+1))  # General sign array of repeating -1, 1
A = np.ones((size, size), int)
B = np.arange(0, size) * A
B = sign * np.block([B.T])
# array([[ 0,  0,  0],
#        [-1,  1, -1],
#        [-2,  2, -2]])
answer = np.vstack([B[0], B])
# array([[ 0,  0,  0],
#        [ 0,  0,  0],
#        [-1,  1, -1],
#        [-2,  2, -2]])

Make every possible combination in 2D array

I'm trying to make an array of 4x4 (16) pixel black and white images with all possible combinations. I made the following array as a template:
template = [[0,0,0,0],  # start with all white pixels
            [0,0,0,0],
            [0,0,0,0],
            [0,0,0,0]]
I then want to iterate through the template and change the 0 to 1 for every possible combination.
I tried to iterate with numpy and itertools but can only get 256 combinations, and by my calculations there should be 32000 (Edit: 65536! don't know what happened there...). Anyone with mad skills that could help me out?
As you said, you can use the itertools module to do this, in particular the product function:
import itertools
import numpy as np
# generate all the combinations as string tuples of length 16
seq = itertools.product("01", repeat=16)
for s in seq:
    # convert to numpy array and reshape to 4x4
    arr = np.fromiter(s, np.int8).reshape(4, 4)
    # do something with arr
You would have a total of 65536 combinations of such a (4 x 4) shaped array. Here's a vectorized approach to generate all those combinations, giving us a (65536 x 4 x 4) shaped multi-dim array (a scaled-down illustration of the bit trick follows the sample run) -
mask = ((np.arange(2**16)[:,None] & (1 << np.arange(16))) != 0)
out = mask.astype(int).reshape(-1,4,4)
Sample run -
In [145]: out.shape
Out[145]: (65536, 4, 4)
In [146]: out
Out[146]:
array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       ...,

       [[1, 0, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[0, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]])
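To see what the bit-mask line is doing, here is the same idea scaled down to 2 x 2 (4-pixel) images, as a rough illustration:
>>> small = ((np.arange(2**4)[:,None] & (1 << np.arange(4))) != 0).astype(int).reshape(-1,2,2)
>>> small.shape
(16, 2, 2)
>>> small[3]          # 3 = 0b0011, so the first two pixels are set
array([[1, 1],
       [0, 0]])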
One possibility which relies on a for loop
out = []
for i in range(2**16):
    out.append(np.frombuffer("{:016b}".format(i).encode('utf8')).view(np.uint8).reshape(4, 4) - 48)
Obviously you could make that a list comprehension if you like.
It takes advantage of Python string formatting which is able to produce the binary representation of integers. The format string instructs it to use 16 places filling with zeros on the left. The string is then encoded to give a bytes object which numpy can interpret as an array.
In the end we subtract the code for the character "0" to get a proper 0. Luckily, "1" sits just above "0", so that's all we need to do.
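For a single value, the intermediate steps look roughly like this (a sketch that passes dtype=np.uint8 directly instead of using the view trick):
>>> "{:016b}".format(5)
'0000000000000101'
>>> np.frombuffer("{:016b}".format(5).encode('utf8'), dtype=np.uint8) - 48
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1], dtype=uint8)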
First, iterate over all numbers from 0 to (2^16) - 1. Then create a 16-character binary string for each of those numbers, thus covering all possible combinations.
After that, convert the string to a list of ints and build a 2D list out of it using a list comprehension and slicing.
all_combinations = []
for i in xrange(pow(2, 16)):
    binary = '{0:016b}'.format(i)  # convert number to binary string
    binary = map(int, list(binary))  # string to list ## use list(map(int, list(binary))) in Python 3
    template = [binary[i:i+4] for i in xrange(0, len(binary), 4)]  # create the 2D list
    all_combinations.append(template)

1D numpy array which is shifted to the right for each consecutive row in a new 2D array

I am trying to optimise some code by removing for loops and using numpy arrays only as I am working with large data sets.
I would like to take a 1D numpy array, for example:
a = [1, 2, 3, 4, 5]
and produce a 2D numpy array whereby the value in each column shifts along a place, for example in the case above for a I wish to have a function which returns:
[[1 2 3 4 5]
 [0 1 2 3 4]
 [0 0 1 2 3]
 [0 0 0 1 2]
 [0 0 0 0 1]]
I have found examples which use the strides function to do something similar to produce, for example:
[[1 2 3]
 [2 3 4]
 [3 4 5]]
However, I am trying to shift each of my columns in the other direction. Alternatively, one can view the problem as putting the first element of a on the first diagonal, the second element on the second diagonal, and so on. I would like to stress again that I want to avoid using a for, while or if loop entirely. Any help would be greatly appreciated.
Such a matrix is an example of a Toeplitz matrix. You could use scipy.linalg.toeplitz to create it:
In [32]: from scipy.linalg import toeplitz
In [33]: a = range(1,6)
In [34]: toeplitz(a, np.zeros_like(a)).T
Out[34]:
array([[1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4],
       [0, 0, 1, 2, 3],
       [0, 0, 0, 1, 2],
       [0, 0, 0, 0, 1]])
Inspired by #EelcoHoogendoorn's answer, here's a variation that doesn't use as much memory as scipy.linalg.toeplitz:
In [47]: from numpy.lib.stride_tricks import as_strided
In [48]: a
Out[48]: array([1, 2, 3, 4, 5])
In [49]: t = as_strided(np.r_[a[::-1], np.zeros_like(a)], shape=(a.size,a.size), strides=(a.itemsize, a.itemsize))[:,::-1]
In [50]: t
Out[50]:
array([[1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4],
       [0, 0, 1, 2, 3],
       [0, 0, 0, 1, 2],
       [0, 0, 0, 0, 1]])
The result should be treated as a "read only" array. Otherwise, you'll be in for some surprises when you change an element. For example:
In [51]: t[0,2] = 99
In [52]: t
Out[52]:
array([[ 1,  2, 99,  4,  5],
       [ 0,  1,  2, 99,  4],
       [ 0,  0,  1,  2, 99],
       [ 0,  0,  0,  1,  2],
       [ 0,  0,  0,  0,  1]])
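If you want NumPy to enforce the read-only behaviour, recent versions of as_strided accept a writeable flag; a sketch reusing a and as_strided from above:
t = as_strided(np.r_[a[::-1], np.zeros_like(a)],
               shape=(a.size, a.size),
               strides=(a.itemsize, a.itemsize),
               writeable=False)[:, ::-1]
# t[0, 2] = 99 now raises "ValueError: assignment destination is read-only"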
Here is the indexing-tricks based solution. Not nearly as elegant as the toeplitz solution already posted, but should memory consumption or performance be a concern, it is to be preferred. As demonstrated, this also makes it easy to subsequently manipulate the entries of the matrix in a consistent manner.
import numpy as np
a = np.arange(5)+1
def toeplitz_view(a):
    b = np.concatenate((np.zeros_like(a), a))
    i = a.itemsize
    v = np.lib.stride_tricks.as_strided(b,
                                        shape=(len(b),)*2,
                                        strides=(-i, i))
    # return a view on the 'original' data as well, for manipulation
    return v[:len(a), len(a):], b[len(a):]
v, a = toeplitz_view(a)
print v
a[0] = 10
v[2,1] = -1
print v

matlab find() for nonzero element in python

I have a sparse matrix (numpy.array) and I would like to have the index of the nonzero elements in it.
In Matlab I would write:
[i, j] = find(CM)
and in Python what should I do?
I have tried numpy.nonzero (but I don't know how to take the indices from that) and flatnonzero (but it's not convenient for me, I need both the row and column index).
Thanks in advance!
Assuming that by "sparse matrix" you don't actually mean a scipy.sparse matrix, but merely a numpy.ndarray with relatively few nonzero entries, then I think nonzero is exactly what you're looking for. Starting from an array:
>>> a = (np.random.random((5,5)) < 0.10)*1
>>> a
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 1, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
nonzero returns the indices (here x and y) where the nonzero entries live:
>>> a.nonzero()
(array([1, 2, 3]), array([4, 2, 0]))
We can assign these to i and j:
>>> i, j = a.nonzero()
We can also use them to index back into a, which should give us only 1s:
>>> a[i,j]
array([1, 1, 1])
We can even modify a using these indices:
>>> a[i,j] = 2
>>> a
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 2],
       [0, 0, 2, 0, 0],
       [2, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
If you want a combined array from the indices, you can do that too:
>>> np.array(a.nonzero()).T
array([[1, 4],
       [2, 2],
       [3, 0]])
(there are lots of ways to do this reshaping; I chose one almost at random.)
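np.argwhere gives the same (row, column) pairs directly, which may read a little more clearly:
>>> np.argwhere(a)
array([[1, 4],
       [2, 2],
       [3, 0]])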
This goes slightly beyond what you ask, and I only mention it since I once faced a similar problem. If you want to use the indices to access some other array, there is some very simple syntax:
import numpy as np
array = np.random.randint(0, 2, size=(3, 3))
data = np.random.random(size=(3, 3))
Now array looks something like
>>> array
array([[0, 1, 0],
       [1, 0, 1],
       [1, 1, 0]])
while data could be
>>> data
array([[ 0.92824816,  0.43605604,  0.16627849],
       [ 0.00301434,  0.94342538,  0.95297402],
       [ 0.32665135,  0.03504204,  0.86902492]])
Then if we want the elements of data at the positions where array is zero:
>>> data[array==0]
array([ 0.92824816,  0.16627849,  0.94342538,  0.86902492])
Which is nice and simple.
