Elementwise multiplication ignoring certain rows of a matrix - python

Let's say I have a matrix like this:
import numpy as np
a = np.array([[1, 2, 3], [89, 43, 2], [12, -3, 4], [-2, 4, 7]])
array([[ 1,  2,  3],
       [89, 43,  2],
       [12, -3,  4],
       [-2,  4,  7]])
and a vector that looks like this:
b = np.array([1, 2, 3])
If I now want to do an elementwise multiplication, I can simply do
c = a * b
and obtain
array([[ 1,  4,  9],
       [89, 86,  6],
       [12, -6, 12],
       [-2,  8, 21]])
My question is: How can I do this kind of multiplication only for certain rows in my matrix? I currently do it like this:
E = a.copy()
# ignore these rows
ignInd = [1, 3]
for ind in xrange(a.shape[0]):
    if ind not in ignInd:
        E[ind, :] = a[ind, :] * b
The matrix E looks as desired (rows 1 and 3 are the same as in a):
array([[ 1,  4,  9],
       [89, 43,  2],
       [12, -6, 12],
       [-2,  4,  7]])
Can someone come up with a smarter solution than this?

It seems like you could just do the multiplication and then put back the original data for the rows you want to ignore ...
>>> import numpy as np
>>> a = np.array([[1,2,3],[89,43,2],[12, -3, 4], [-2, 4, 7]])
>>> b = np.array([1,2,3])
>>> c = a * b
>>> ignInd = [1,3]
>>> c[ignInd, :]
array([[89, 86,  6],
       [-2,  8, 21]])
>>> c[ignInd, :] = a[ignInd, :]
>>> c
array([[ 1,  4,  9],
       [89, 43,  2],
       [12, -6, 12],
       [-2,  4,  7]])

You can index a NumPy array directly with another NumPy array. In your case, you have the indices of the rows you want to ignore, so you can build an array of the indices to include from this:
In [21]: ignInd = [1,3] #ignore these rows
In [22]: ind = np.array([i for i in range(a.shape[0]) if i not in ignInd])
In [23]: E2 = a.copy()
In [24]: E2[ind,:] = a[ind,:]*b
In [25]: E2
Out[25]:
array([[ 1,  4,  9],
       [89, 43,  2],
       [12, -6, 12],
       [-2,  4,  7]])
EDIT: as @DSM comments, for large arrays it would be more efficient to build the index array using NumPy's vectorized methods, viz. ind = np.setdiff1d(np.arange(len(a)), ignInd), instead of the list comprehension used above.
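A complete version of that suggestion, as a minimal sketch reusing a, b and ignInd from the question:
ind = np.setdiff1d(np.arange(len(a)), ignInd)  # rows to keep: array([0, 2])
E2 = a.copy()
E2[ind, :] = a[ind, :] * b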

You can use boolean indexing with np.in1d to select the rows not in the given indices list. The implementation would look something like this -
E = a.copy()
mask = ~np.in1d(np.arange(a.shape[0]), ignInd)
E[mask] = a[mask] * b
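On newer NumPy releases (1.13 and later), np.isin is documented as the successor to np.in1d, so the same mask can also be built that way (a sketch, again reusing a, b and ignInd from the question):
E = a.copy()
mask = ~np.isin(np.arange(a.shape[0]), ignInd)
E[mask] = a[mask] * b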

For the 1st row in a:
a[0] = a[0] * b
...and so on for the other rows.
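Spelling out the "and so on" without a Python-level loop, the rows to keep can also be indexed directly (a minimal sketch, reusing a, b and ignInd from the question; note this modifies a in place):
keep = [i for i in range(a.shape[0]) if i not in ignInd]  # rows 0 and 2 here
a[keep] = a[keep] * b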

Related

Add a numpy array to the beginning or end of each element of another array

I have an array of arrays:
x = array([array([[1, 2],
                  [3, 4]]),
           array([[22,  4],
                  [ 9, 10],
                  [ 3,  2]])], dtype=object)
And I have a list of arrays of the same length, like:
xa = [array([11, 22]), array([33, 44])]
I would like to add, in pure numpy, each element of xa to the end or beginning of x, as follows:
In the end:
result = array([array([[ 1,  2],
                       [ 3,  4],
                       [11, 22]]),
                array([[22,  4],
                       [ 9, 10],
                       [ 3,  2],
                       [33, 44]])], dtype=object)
In the beginning:
result = array([array([[11, 22],
                       [ 1,  2],
                       [ 3,  4]]),
                array([[33, 44],
                       [22,  4],
                       [ 9, 10],
                       [ 3,  2]])], dtype=object)
NumPy version: 1.9.3
Assuming you imported numpy like this:
import numpy as np
from numpy import array
(1) Not beautiful, but this would work:
result = array([np.vstack((x[0], xa[0])), np.vstack((x[1], xa[1]))])
or respectively:
result = array([np.vstack((xa[0], x[0])), np.vstack((xa[1], x[1]))])
(2) Better, since it works for any number of sub-arrays:
result = array([np.vstack((x[i], xa[i])) for i in range(len(x))])
result = array([np.vstack((xa[i], x[i])) for i in range(len(x))])
(3) A slightly more Pythonic way of handling this:
result = array([np.vstack((x_i, xa_i)) for (x_i, xa_i) in zip(x, xa)])
result = array([np.vstack((xa_i, x_i)) for (x_i, xa_i) in zip(x, xa)])

Fastest method for determining if 2 (vertically or horizontally) adjacent elements of a numpy array have the same value

I am looking for the fastest way of determining if 2 (vertically or horizontally) adjacent elements have the same value.
Let's say I have a numpy array of size 4x4.
array([[8, 7, 4, 3],
       [8, 4, 0, 4],
       [3, 2, 2, 1],
       [9, 8, 7, 6]])
I want to be able to identify that there are two adjacent 8s in the first column and two adjacent 2s in the third row. I could hard-code a check, but that would be ugly, and I want to know if there is a faster way.
All guidance is appreciated. Thank you.
We can look at the difference values along rows and columns; zeros there signal repeated values. Thus, we could do -
(np.diff(a,axis=0) == 0).any() | (np.diff(a,axis=1) == 0).any()
Or with slicing for performance boost -
(a[1:] == a[:-1]).any() | (a[:,1:] == a[:,:-1]).any()
So, (a[1:] == a[:-1]).any() checks the vertical adjacency, whereas the other term checks the horizontal one.
Extending to n adjacent ones (of the same value) along rows or columns -
from scipy.ndimage.filters import convolve1d as conv

def vert_horz_adj(a, n=1):
    k = np.ones(n, dtype=int)
    v = (conv((a[1:] == a[:-1]).astype(int), k, axis=0, mode='constant') >= n).any()
    h = (conv((a[:, 1:] == a[:, :-1]).astype(int), k, axis=1, mode='constant') >= n).any()
    return v | h
Sample run -
In [413]: np.random.seed(0)
     ...: a = np.random.randint(11, 99, (10, 4))
     ...: a[[2, 3, 4, 6, 7, 8], 0] = 1
In [414]: a
Out[414]:
array([[55, 58, 75, 78],
       [78, 20, 94, 32],
       [ 1, 98, 81, 23],
       [ 1, 76, 50, 98],
       [ 1, 92, 48, 36],
       [88, 83, 20, 31],
       [ 1, 80, 90, 58],
       [ 1, 93, 60, 40],
       [ 1, 30, 25, 50],
       [43, 76, 20, 68]])
In [415]: vert_horz_adj(a, n=1)
Out[415]: True # Because of first col
In [416]: vert_horz_adj(a, n=2)
Out[416]: True # Because of first col
In [417]: vert_horz_adj(a, n=3)
Out[417]: False
In [418]: a[-1] = 10
In [419]: vert_horz_adj(a, n=3)
Out[419]: True # Because of last row
You can find the coordinates of the pairs with the following code:
import numpy as np
a = np.array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
vertical = np.where((a == np.roll(a, 1, 0))[1:])      # [1:] drops the wrap-around row introduced by np.roll
print(vertical)  # (0, 0) is the coordinate of the first of the repeating 8's
horizontal = np.where((a == np.roll(a, 1, 1))[:, 1:])  # [:, 1:] drops the wrap-around column
print(horizontal)  # (2, 1) is the coordinate of the first of the repeating 2's
which returns
(array([0], dtype=int64), array([0], dtype=int64))
(array([2], dtype=int64), array([1], dtype=int64))
If you want to locate the first occurrence of each pair:
A = array([[8, 7, 4, 3],
           [8, 4, 0, 4],
           [3, 2, 2, 1],
           [9, 8, 7, 6]])
x=(A[1:]==A[:-1]).nonzero()
y=(A[:,1:]==A[:,:-1]).nonzero()
In [45]: x
Out[45]: (array([0], dtype=int64), array([0], dtype=int64))
In [47]: y
Out[47]: (array([2], dtype=int64), array([1], dtype=int64))
In [48]: A[x]
Out[48]: array([8])
In [49]: A[y]
Out[49]: array([2])
x and y give respectively the locations of the first 8 and the first 2.

Sum nth columns elementwise in Numpy matrix

I have the following numpy matrix:
A = [[a, b, c, d, e, f],
     [g, h, i, j, k, l],
     ...]
I need to sum the matrix in blocks of n columns, element-wise. So, if n is 2, the answer needs to be:
B = [[a+c+e, b+d+f],
     [g+i+k, h+j+l],
     ...]
(like breaking the matrix into 3, each with 2 columns and adding them.)
But if n is 3, the answer needs to be:
C = [[a+d, b+e, c+f],
     [g+j, h+k, i+l],
     ...]
(like breaking the matrix into 2, each with 3 columns and adding them.)
Is there a general case that will accept the value n without resorting to looping?
Reshape to split the last axis into two with the latter of length n and sum along the former -
A.reshape(A.shape[0],-1,n).sum(1)
Sample run -
In [38]: A
Out[38]:
array([[0, 5, 3, 2, 5, 6],
       [6, 1, 0, 8, 4, 0],
       [8, 6, 1, 5, 7, 0]])
In [39]: n = 2
In [40]: A.reshape(A.shape[0],-1,n).sum(1)
Out[40]:
array([[ 8, 13],
       [10,  9],
       [16, 11]])
In [41]: n = 3
In [42]: A.reshape(A.shape[0],-1,n).sum(1)
Out[42]:
array([[ 2, 10,  9],
       [14,  5,  0],
       [13, 13,  1]])
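Note that the reshape assumes the number of columns is a multiple of n; a small wrapper can make that assumption explicit (a sketch):
def sum_column_chunks(A, n):
    # split the columns into chunks of width n and sum the chunks element-wise
    assert A.shape[1] % n == 0, "number of columns must be divisible by n"
    return A.reshape(A.shape[0], -1, n).sum(1)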

Generate and fill 2 dimensional array with Numpy

I would like to generate a 2-dimensional array with NumPy, iterate some variable a few times, and fill the 2-dimensional array with 2 pieces of float data that get calculated inside each iteration. Then export it to .csv.
Technically I would like to do it like this:
max_array = 8000
ARRAY = numpy.zeros([max_array*2], dtype=float)
ARRAY = numpy.arange(max_array*2).reshape((max_array, 2))
for i in range(1, max_array):
    ###### calculations here #######
    array[i,i] = [data1, data2]
numpy.savetxt("output.csv", numpy.asarray(ARRAY), delimiter=",")
Unfortunately it doesn't work, I am very clumsy with syntax, and probably the [,] brackets are the problem. I would be very thankful if somebody would fix my snippet.
import numpy as np
max_array = 4
ARRAY = np.arange(max_array*2).reshape((max_array,2))
You have created a 2-d array
>>> ARRAY
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])
>>>
ARRAY[i,i] indexes a single element in the array
>>> i = 0
>>> ARRAY[i,i]
0
>>> ARRAY[i,i] = 222
>>> ARRAY
array([[222,   1],
       [  2,   3],
       [  4,   5],
       [  6,   7]])
If you want to assign values to a row:
>>> ARRAY[0] = 99, 99
>>> ARRAY
array([[99, 99],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7]])
Or
>>> ARRAY[2,:] = 66, 66
>>> ARRAY
array([[99, 99],
       [ 2,  3],
       [66, 66],
       [ 6,  7]])
The second value in the subscript indexes a column
>>> ARRAY[:, 1]
array([99, 3, 66, 7])
>>> ARRAY[:, 1] = 0
>>> ARRAY
array([[99,  0],
       [ 2,  0],
       [66,  0],
       [ 6,  0]])
>>>
ARRAY[i,i] = data1, data2 tries to assign two things to a single element in the array; that is why you get an error.
NumPy docs: "Indexing, Slicing, Iterating" and "Indexing"
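Putting that together for the loop in the question, a minimal sketch (data1 and data2 are placeholders for whatever the calculation produces on each iteration):
import numpy as np

max_array = 8000
ARRAY = np.zeros((max_array, 2), dtype=float)
for i in range(max_array):
    # ----- calculations here -----
    data1, data2 = float(i), float(i) ** 2   # placeholder values
    ARRAY[i] = data1, data2                  # assign the whole row, not ARRAY[i, i]
np.savetxt("output.csv", ARRAY, delimiter=",")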

Adding a dimension to every element of a numpy.array

I'm trying to transform each element of a numpy array into an array itself (say, to interpret a greyscale image as a color image). In other words:
>>> my_ar = numpy.array((0,5,10))
[0, 5, 10]
>>> transformed = my_fun(my_ar) # In reality, my_fun() would do something more useful
array([[ 0,  0,  0],
       [ 5, 10, 15],
       [10, 20, 30]])
>>> transformed.shape
(3, 3)
I've tried:
def my_fun_e(val):
    return numpy.array((val, val*2, val*3))

my_fun = numpy.frompyfunc(my_fun_e, 1, 3)
but get:
my_fun(my_ar)
(array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object), array([None, None, None], dtype=object), array([None, None, None], dtype=object))
and I've tried:
my_fun = numpy.frompyfunc(my_fun_e, 1, 1)
but get:
>>> my_fun(my_ar)
array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object)
This is close, but not quite right -- I get an array of objects, not an array of ints.
Update 3! OK, I've realized that my original example was too simple: I don't just want to replicate my data in a third dimension, I'd like to transform it at the same time. Maybe this is clearer?
Does numpy.dstack do what you want? The first two indexes are the same as the original array, and the new third index is "depth".
>>> import numpy as N
>>> a = N.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> b = N.dstack((a,a,a))
>>> b
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]],

       [[4, 4, 4],
        [5, 5, 5],
        [6, 6, 6]],

       [[7, 7, 7],
        [8, 8, 8],
        [9, 9, 9]]])
>>> b[1,1]
array([5, 5, 5])
Use map to apply your transformation function to each element in my_ar:
import numpy
my_ar = numpy.array((0,5,10))
print my_ar
transformed = numpy.array(map(lambda x:numpy.array((x,x*2,x*3)), my_ar))
print transformed
print transformed.shape
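For this particular transformation, broadcasting gives the same integer result without a Python-level map (a sketch; the multipliers 1, 2, 3 are taken from my_fun_e above):
transformed = my_ar[:, numpy.newaxis] * numpy.array([1, 2, 3])
print transformed        # shape (3, 3), integer dtype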
I propose:
numpy.resize(my_ar, (3,3)).transpose()
You can of course adapt the shape, e.g. (my_ar.shape[0],)*2, or whatever you need.
Does this do what you want:
tile(my_ar, (1,1,3))
