Python: efficient operations on numpy arrays

Say I have a numpy array x:
x = array([[ 3, 2, 1],
[ 3, 25, 34],
[ 33, 333, 3],
[ 43, 32, 2]])
I want to carry out the following operations without explicitly writing a for loop, i.e. using NumPy's built-in vectorized operations:
1) Replace the 2nd column with a column of all 1s, i.e.
x = array([[ 3, 1, 1],
[ 3, 1, 34],
[ 33, 1, 3],
[ 43, 1, 2]])
2) In the original array, replace the 3rd column with the product of the 2nd and 3rd columns, i.e.
x = array([[ 3, 2, 1*2],
[ 3, 25, 34*25],
[ 33, 333, 3*333],
[ 43, 32, 2*32]])
3) Finally, I would like to replace the 2nd column in the original array based on a condition on the 1st column, i.e. for each row (pseudocode):
x[1] = 0 if x[0] > 5 else 4
so that the array now looks like:
x = array([[ 3, 4, 1],
[ 3, 4, 34],
[ 33, 0, 3],
[ 43, 0, 2]])
Any suggestions ?
Thanks !

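The NumPy documentation is well worth reading, as these are fairly basic indexing operations. Each line below answers one part of the question. Since every part refers to the original array, run each line on a fresh copy of x; run in sequence, the second line would multiply by the column of ones written by the first.
# 1) set the 2nd column to 1
x[:,1] = 1
# 2) multiply the 3rd column by the 2nd, in place
x[:,2] *= x[:,1]
# 3) 0 where the 1st column is greater than 5, otherwise 4
x[:,1] = np.where( x[:,0] > 5, 0, 4 )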
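As a quick check, here is a minimal sketch that applies each of the three operations to its own copy of the original array, so that every step starts from the unmodified data as the question asks. The names x1, x2 and x3 are only illustrative and do not appear in the original answer.

```python
import numpy as np

x = np.array([[ 3,   2,  1],
              [ 3,  25, 34],
              [33, 333,  3],
              [43,  32,  2]])

# 1) Replace the 2nd column with ones.
x1 = x.copy()
x1[:, 1] = 1

# 2) Replace the 3rd column with the product of the 2nd and 3rd columns.
x2 = x.copy()
x2[:, 2] *= x2[:, 1]

# 3) Set the 2nd column to 0 where the 1st column exceeds 5, otherwise 4.
x3 = x.copy()
x3[:, 1] = np.where(x3[:, 0] > 5, 0, 4)

print(x1)
print(x2)
print(x3)
```

Working on copies keeps x itself untouched, which matters here because column slices such as x[:, 1] are views and assignments through them modify the array in place.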

Related

np.dot in NumPy printing the transpose of what should be expected

I'm really new to Python and am wondering why this prints the transpose of what I expected. A (7x4)(4x2)(2x1) multiplication should result in a 7x1 column vector.
import numpy as np
nutrition = np.array([[61, 100, 7, 2.2, 1, 7, 215],
[156, 340, 18, 7, 44, 5, 0],
[19, 110, 9, 3.3, 0, 6, 16],
[27, 60, 2, 0.5, 8, 2, 16]])
meals = np.array([[2, 1, 0, 0],
                  [0, 1, 1, 1]])
M = np.array([40, 10])
print(np.dot(nutrition.T, np.dot(meals.T, M)))
Instead, it is printing a 1x7 row vector:
[13140. 26700. 1570. 564. 2360. 890. 17520.]
Any explanation or problems to look into would be appreciated.

Your array M is of shape (2,) and NOT (2,1):
print(M.shape)
(2,)
Hence, the output shape is (7,) and NOT (7,1). A 1-D array has no row or column orientation, so it is printed on a single line:
print(np.dot(nutrition.T, np.dot(meals.T, M)).shape)
(7,)
If you want a (7,1) output, simply reshape your M to (2,1):
M = M.reshape(-1,1)
#[[40]
# [10]]
And output would be:
[[13140.]
[26700.]
[ 1570.]
[ 564.]
[ 2360.]
[ 890.]
[17520.]]
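The difference between the two shapes can be reproduced with a short sketch. It uses the `@` operator in place of np.dot (equivalent for these 1-D and 2-D inputs), and the names flat, column and as_column are illustrative only.

```python
import numpy as np

nutrition = np.array([[61, 100, 7, 2.2, 1, 7, 215],
                      [156, 340, 18, 7, 44, 5, 0],
                      [19, 110, 9, 3.3, 0, 6, 16],
                      [27, 60, 2, 0.5, 8, 2, 16]])
meals = np.array([[2, 1, 0, 0],
                  [0, 1, 1, 1]])
M = np.array([40, 10])

# With a 1-D M of shape (2,), the result is also 1-D: shape (7,).
flat = nutrition.T @ (meals.T @ M)

# With M reshaped to a (2, 1) column, the result is a (7, 1) column.
column = nutrition.T @ (meals.T @ M.reshape(-1, 1))

# An existing 1-D result can also be turned into a column afterwards.
as_column = flat[:, np.newaxis]

print(flat.shape, column.shape, as_column.shape)
```

The values are the same in every case; only the number of dimensions changes.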

python numpy 3-d matrix times 1-d array

I have a multi-dimensional array named a (its shape is (2,3,2)) and another array named c (its shape is (2,)), as in the following code. How can I get the combination (a[0]*c[0], a[1]*c[1]) without loops? That is, 1 times the first group of a, i.e. [[1,2],[2,-2],[3,-3]], and 10 times the second group of a, namely [[4,-4],[5,-5],[6,-6]]. By the way, I have tried a*c, np.multiply(a,c), etc., but these give 1 times the first column of a and 10 times the second column, which is not what I want. Many thanks.
In [88]: a = np.array([[[1,2],[2,-2],[3,-3]],[[4,-4],[5,-5],[6,-6]]])
In [89]: a
Out[89]:
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 4, -4],
[ 5, -5],
[ 6, -6]]])
In [90]: c = np.array([1,10])
In [91]: c
Out[91]: array([ 1, 10])
In [92]: a*c
Out[92]:
array([[[ 1, 20],
[ 2, -20],
[ 3, -30]],
[[ 4, -40],
[ 5, -50],
[ 6, -60]]])
The output that I want is:
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 40, -40],
[ 50, -50],
[ 60, -60]]])

import numpy as np
a = np.array([[[1,2],
[2,-2],
[3,-3]],
[[4,-4],
[5,-5],
[6,-6]]])
c = np.array([1,10])
print(a*c)
Output:
[[[ 1 20]
[ 2 -20]
[ 3 -30]]
[[ 4 -40]
[ 5 -50]
[ 6 -60]]]
I'm guessing that's what you asked.

What is your question? How to multiply? That you could do like this:
import numpy as np
a = np.array([[[1,2],[2,-2],[3,-3]], [[4,-4],[5,-5],[6,-6]]])
c = np.array([1, 10])
# Note: dot sums over the last axis of a, so the result has shape (2, 3)
print(a.dot(c))
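Neither a*c nor a.dot(c) produces the output the question asks for, because broadcasting lines c up with the last axis of a. One possible approach, sketched below, is to give c two trailing axes of length 1 so that it lines up with the first axis instead. The np.einsum line is an equivalent alternative, and the names scaled and same are illustrative.

```python
import numpy as np

a = np.array([[[1, 2], [2, -2], [3, -3]],
              [[4, -4], [5, -5], [6, -6]]])   # shape (2, 3, 2)
c = np.array([1, 10])                         # shape (2,)

# Reshape c to (2, 1, 1) so that c[i] multiplies the whole block a[i].
scaled = a * c[:, np.newaxis, np.newaxis]

# Same result written as an explicit index expression.
same = np.einsum('ijk,i->ijk', a, c)

print(scaled)
```

Here a[0] is multiplied by 1 and a[1] by 10, which matches the desired output in the question.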

How to find the rows having values between -1 and 1 in a given numpy 2D-array?

I have a np.array of shape (15,3).
final_vals = array([[ 37, -84, -143],
[ 29, 2, -2],
[ -18, -2, 0],
[ -3, 6, 0],
[ 361, -5, 2],
[ -23, 4, 8],
[ 0, -1, 0],
[ -1, 1, 0],
[ 62, 181, 83],
[-193, -14, -2],
[ 42, -154, -92],
[ 16, -13, 1],
[ -10, -3, 0],
[-299, 244, 110],
[ 223, -237, -110]])
I am trying to find the rows whose elements all lie between -1 and 1. In the array printed above, rows 6 and 7 (0-based) are the target rows.
I tried,
result_idx = np.where(np.logical_and(final_vals>=-1, final_vals<=1))
which returns,
result_idx = (array([ 2, 3, 6, 6, 6, 7, 7, 7, 11, 12], dtype=int64),
array([2, 2, 0, 1, 2, 0, 1, 2, 2, 2], dtype=int64))
I want my program to return only the row numbers.

You could take the absolute value of all elements and check which rows' elements are all smaller than or equal to 1. Then use np.flatnonzero to find the indices of the rows where all columns fulfil the condition:
np.flatnonzero((np.abs(final_vals) <= 1).all(axis=1))
Output
array([6, 7], dtype=int64)
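To make the one-liner easier to follow, here is a sketch that splits it into its intermediate steps using the array from the question. The names in_range, row_ok and rows are illustrative.

```python
import numpy as np

final_vals = np.array([[  37,  -84, -143],
                       [  29,    2,   -2],
                       [ -18,   -2,    0],
                       [  -3,    6,    0],
                       [ 361,   -5,    2],
                       [ -23,    4,    8],
                       [   0,   -1,    0],
                       [  -1,    1,    0],
                       [  62,  181,   83],
                       [-193,  -14,   -2],
                       [  42, -154,  -92],
                       [  16,  -13,    1],
                       [ -10,   -3,    0],
                       [-299,  244,  110],
                       [ 223, -237, -110]])

# Element-wise mask, shape (15, 3): True where -1 <= value <= 1.
in_range = np.abs(final_vals) <= 1

# Reduce across columns: True only for rows where every element qualifies.
row_ok = in_range.all(axis=1)

# Indices of the True entries, as a 1-D array of row numbers.
rows = np.flatnonzero(row_ok)
print(rows)  # [6 7]
```

The np.abs shortcut only works because the interval is symmetric around zero; for a general interval, combine two comparisons with & instead.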

Another way to do this based on your approach is to find the truth value of each element and then use numpy.all for each row. Then numpy.where gets you what you want.
mask = (final_vals <= 1) * (final_vals >= -1)
np.where(np.all(mask, axis=1))

How about
np.where(np.all((-1<=final_vals) & (final_vals<=1),axis=1))

You could use np.argwhere:
r = np.logical_and(final_vals <= 1, final_vals >=-1)
result = np.argwhere(r.all(1)).flatten()
print(result)
Output
[6 7]

Another way is to use pandas. You can get the rows with the following code:
import pandas as pd
df = pd.DataFrame(final_vals)
temp = ((df >= -1) & (df <= 1)).product(axis=1)
rows = temp[temp != 0].keys()
rows
It first checks which values are between -1 and +1, and then (with axis=1) keeps the rows in which all values satisfy the condition.
The result is:
Int64Index([ 6, 7], dtype='int64')

Just a simple list comprehension:
[ i for i, row in enumerate(final_vals) if all([ e >= -1 and e <= 1 for e in row ]) ]
#=> [6, 7]

Fastest method for determining if 2 (vertically or horizontally) adjacent elements of a numpy array have the same value

I am looking for the fastest way of determining if 2 (vertically or horizontally) adjacent elements have the same value.
Let's say I have a numpy array of size 4x4.
array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
I want to be able to identify that there are two adjacent 8s in the first column and there are two adjacent 2s in the third row. I could hard code a check but that would be ugly and I want to know if there is a faster way.
All guidance is appreciated. Thank you.

We look for zeros in the differences along rows and columns, since a zero difference signals two equal adjacent values. Thus, we could do -
(np.diff(a,axis=0) == 0).any() | (np.diff(a,axis=1) == 0).any()
Or with slicing for performance boost -
(a[1:] == a[:-1]).any() | (a[:,1:] == a[:,:-1]).any()
So, (a[1:] == a[:-1]).any() is the vertical adjacency, whereas the other one is for horizontal one.
Extending to n adjacent ones (of same value) along rows or columns -
import numpy as np
# the scipy.ndimage.filters namespace is deprecated; import from scipy.ndimage
from scipy.ndimage import convolve1d as conv
def vert_horz_adj(a, n=1):
    # kernel of n ones: sums n consecutive equality flags
    k = np.ones(n,dtype=int)
    v = (conv((a[1:]==a[:-1]).astype(int),k,axis=0,mode='constant')>=n).any()
    h = (conv((a[:,1:]==a[:,:-1]).astype(int),k,axis=1,mode='constant')>=n).any()
    return v | h
Sample run -
In [413]: np.random.seed(0)
...: a = np.random.randint(11,99,(10,4))
...: a[[2,3,4,6,7,8],0] = 1
In [414]: a
Out[414]:
array([[55, 58, 75, 78],
[78, 20, 94, 32],
[ 1, 98, 81, 23],
[ 1, 76, 50, 98],
[ 1, 92, 48, 36],
[88, 83, 20, 31],
[ 1, 80, 90, 58],
[ 1, 93, 60, 40],
[ 1, 30, 25, 50],
[43, 76, 20, 68]])
In [415]: vert_horz_adj(a, n=1)
Out[415]: True # Because of first col
In [416]: vert_horz_adj(a, n=2)
Out[416]: True # Because of first col
In [417]: vert_horz_adj(a, n=3)
Out[417]: False
In [418]: a[-1] = 10
In [419]: vert_horz_adj(a, n=3)
Out[419]: True # Because of last row
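In the sample run above, n counts consecutive equal pairs, so n=2 corresponds to a run of three equal values. The same check can be sketched without SciPy, assuming NumPy 1.20 or later for sliding_window_view. The function name has_run_of_equal_pairs and the test arrays are illustrative, not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def has_run_of_equal_pairs(a, n=1):
    """True if some column or row contains n consecutive equal adjacent
    pairs, i.e. a run of n + 1 identical values."""
    def check(eq, axis):
        # Not enough pairs along this axis to form a window of length n.
        if eq.shape[axis] < n:
            return False
        # Windows of n consecutive equality flags along the given axis;
        # the window dimension is appended as the last axis.
        windows = sliding_window_view(eq, n, axis=axis)
        return bool(windows.all(axis=-1).any())

    vertical = check(a[1:] == a[:-1], axis=0)
    horizontal = check(a[:, 1:] == a[:, :-1], axis=1)
    return vertical or horizontal

a = np.array([[8, 7, 4, 3],
              [8, 4, 0, 4],
              [3, 2, 2, 1],
              [9, 8, 7, 6]])
print(has_run_of_equal_pairs(a, n=1))  # True: the two 8s and the two 2s
print(has_run_of_equal_pairs(a, n=2))  # False: no run of three equal values
```

For n=1 this reduces to the plain slicing comparison, which is likely the cheapest option when only pairs matter.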

You can find the coordinates of the pairs with the following code:
import numpy as np
a = np.array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
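# Row i of np.roll(a, 1, 0) is row i-1 of a; drop row 0, which wraps around
vertical = np.where((a == np.roll(a, 1, 0))[1:])
print(vertical) # (0,0) is the coordinate of the first of the repeating 8's
# Same idea along columns; drop column 0, which wraps around
horizontal = np.where((a == np.roll(a, 1, 1))[:, 1:])
print(horizontal) # (2,1) is the coordinate of the first of the repeating 2's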
which returns
(array([0], dtype=int64), array([0], dtype=int64))
(array([2], dtype=int64), array([1], dtype=int64))

If you want to locate the first occurrence of each pair:
A=array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
x=(A[1:]==A[:-1]).nonzero()
y=(A[:,1:]==A[:,:-1]).nonzero()
In [45]: x
Out[45]: (array([0], dtype=int64), array([0], dtype=int64))
In [47]: y
Out[47]: (array([2], dtype=int64), array([1], dtype=int64))
In [48]: A[x]
Out[48]: array([8])
In [49]: A[y]
Out[49]: array([2])
x and y give respectively the locations of the first 8 and the first 2.

convert separate 1D np.arrays into a list of 2D np.arrays

I'm trying to convert three 1D arrays into a list of 2D arrays. I've managed to do this by creating an empty ndarray and populating it line by line. Could someone show me a more elegant approach?
import numpy as np
import pandas as pd
one=np.arange(1,4,1)
two=np.arange(10,40,10)
three=np.arange(100,400,100)
df=pd.DataFrame({'col1':one,'col2':two,'col3':three})
desired_output=[np.array([[1.,10.],[1.,100.]]),np.array([[2.,20.],[2.,200.]]),np.array([[3.,30.],[3.,300.]])]
current, inelegant approach that works:
output=[]
for i in range(len(df)):
    temp=np.zeros(shape=(2,2))
    temp[0][0]=df.iloc[i,0]
    temp[0][1]=df.iloc[i,1]
    temp[1][0]=df.iloc[i,0]
    temp[1][1]=df.iloc[i,2]
    output.append(temp)

First of all, you can get an array from the DataFrame's values by simply doing the following:
In [61]:
arr = df.values
arr
Out[61]:
array([[ 1, 10, 100],
[ 2, 20, 200],
[ 3, 30, 300]])
then add the first column in the array again
In [73]:
arr_mod = np.hstack((arr , arr[: , 0][:, np.newaxis]))
arr_mod
Out[73]:
array([[ 1, 10, 100, 1],
[ 2, 20, 200, 2],
[ 3, 30, 300, 3]])
swap the column you've just added with the last column in the array
In [74]:
arr_mod[: , [2 , 3]] = arr_mod [: , [3 , 2]]
arr_mod
Out[74]:
array([[ 1, 10, 1, 100],
[ 2, 20, 2, 200],
[ 3, 30, 3, 300]])
Then reshape this 2D array into a 3D array and convert it to a list:
In [78]:
list(arr_mod.reshape( -1, 2 , 2))
Out[78]:
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]])]

Here's one approach using np.column_stack and np.vsplit -
arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
Basically, we use np.column_stack to stack column-1 with column-2 and then again column-1 with column-3 to give us a 2D NumPy array arr2D of shape N x 4. Next, we reshape arr2D to a 2*N X 2 array and split along the rows with np.vsplit to give us the expected list of 2D arrays.
Sample run -
>>> df
col1 col2 col3
0 1 10 100
1 2 20 200
2 3 30 300
3 4 40 400
4 5 50 500
5 6 60 600
>>> arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
>>> out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
>>> print(out_list)
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]]), array([[ 4, 40],
[ 4, 400]]), array([[ 5, 50],
[ 5, 500]]), array([[ 6, 60],
[ 6, 600]])]
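A further option, sketched here as an assumption rather than taken from either answer, is to build the two rows of every 2x2 block separately and join them with np.stack. It starts from the three 1-D arrays in the question; the names top, bottom and stacked are illustrative.

```python
import numpy as np

one = np.arange(1, 4)
two = np.arange(10, 40, 10)
three = np.arange(100, 400, 100)

# Each row i holds the top row of block i: [one[i], two[i]].
top = np.column_stack((one, two))

# Each row i holds the bottom row of block i: [one[i], three[i]].
bottom = np.column_stack((one, three))

# Stack along a new middle axis to get shape (N, 2, 2), then cast to float
# to match the desired output in the question.
stacked = np.stack((top, bottom), axis=1).astype(float)

# Iterating over the first axis yields one 2x2 array per input element.
output = list(stacked)
print(output)
```

If a single 3-D array is acceptable downstream, stacked can be used directly and the final list conversion skipped.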
