How can I use map() instead of the following nested for loop?
The idea is to avoid writing an explicit for loop :)
def f(array):
    r, c = np.shape(array)
    for col in range(0, c):
        for row in range(0, r):
            if array[row][col] >= max(array[row]):
                print("true")
            else:
                print("false")
I tried something similar to the following, but I am stuck:
print(list(map(lambda (x,y): print(x[y]) , A)))
but it does not work.
thank you :)
You can compare every element of a row to that row's maximum with np.where. NumPy also has a built-in function, np.apply_along_axis, which applies a 1D function along a given axis. It is equivalent to looping over the rows of your array but more concise (it still calls the function once per row, so do not expect a large speed-up). Here is a solution on an example matrix with random elements:
import numpy as np
def max_comp(row):
    # True where the element is at least the row maximum, False elsewhere
    return np.where(row >= max(row), True, False)
matrix = np.random.randint(10, size=(5, 5))
output = np.apply_along_axis(max_comp, axis=1, arr=matrix)
print(matrix)
print(output)
Out:
[[2 3 1 4 5]
[9 6 0 1 1]
[9 6 3 4 1]
[3 7 6 1 7]
[2 1 5 7 2]]
[[False False False False True]
[ True False False False False]
[ True False False False False]
[False True False False True]
[False False False True False]]
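For comparison, the same boolean result can be sketched without calling a Python function per row, using broadcasting. This is an alternative technique, not what the answer above does. The sample matrix is the one printed above, and keepdims=True keeps the row maxima as a column so that they broadcast against each row.

```python
import numpy as np

matrix = np.array([[2, 3, 1, 4, 5],
                   [9, 6, 0, 1, 1],
                   [9, 6, 3, 4, 1],
                   [3, 7, 6, 1, 7],
                   [2, 1, 5, 7, 2]])

# Row maxima with shape (5, 1), so they broadcast across each row
row_max = matrix.max(axis=1, keepdims=True)
output = matrix >= row_max
print(output)
```

This should print the same boolean matrix as shown above for this input.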
Without using numpy
If a for inside a list comprehension is still acceptable for your question:
matrix = [[1, 2], [3, 4]]
# map() is lazy in Python 3, so wrap it in list() to actually run the prints
list(map(
    lambda row: [print(c >= max(row), end=" ") for c in row] and print(),
    matrix
))
We can make this statement even a bit more cryptic by also removing the list comprehension:
matrix = [[1, 2], [3, 4]]
a = list(map(
lambda row:
list(map(lambda c: print(c >= max(row), end=" "), row))
and print(),
matrix
))
While this code does not use a for loop, I don't think it improves readability or performance.
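If the goal is to collect the results rather than print them, a minimal sketch with nested map() calls might look like this. The helper name row_flags is hypothetical and not from the question.

```python
matrix = [[1, 2], [3, 4]]

def row_flags(row):
    # Compute the row maximum once, then compare each element to it
    row_max = max(row)
    return list(map(lambda c: c >= row_max, row))

result = list(map(row_flags, matrix))
print(result)  # [[False, True], [False, True]]
```

Computing max(row) once per row also avoids recomputing it for every element, which the printing versions above do.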
Related
Definition: Arranged array
An arranged array is a 2-dimensional square (NxN) array in which, for every cell, A[I,J] > A[I,J+1] and A[I,J] > A[I+1,J].
I have an assignment to write a function that
takes a numpy array and returns:
True if the given array is an arranged array
False otherwise
Note: we CANNOT use loops, list comprehensions, or recursion. The point of the task is to use numpy features.
Assumptions: the array isn't empty, has no NAs, and all of its cells are numeric.
My code isn't very numpy-oriented:
def is_square_ordered_matrix(A):
    # Checking if the dimension is 2
    if A.ndim != 2:
        return False
    # Checking if it is a squared matrix
    if A.shape[0] != A.shape[1]:
        return False
    # Saving the original shape to reshape later
    originalDim = A.shape
    # Making it a dim of 1 to use it as a list
    arrayAsList = list((A.reshape((1,originalDim[0]**2)))[0])
    # Keeping original order before sorting
    originalArray = arrayAsList[:]
    # Using the values of the list as keys to see if there are doubles
    valuesDictionary = dict.fromkeys(arrayAsList, 1)
    # If len is different, means there are doubles and i should return False
    if len(arrayAsList) != len(valuesDictionary):
        return False
    # If sorted list is equal to original list it means the original is already ordered and i should return True
    arrayAsList.sort(reverse=True)
    if originalArray == arrayAsList:
        return True
    else:
        return False
True example:
is_square_ordered_matrix(np.arange(8,-1,-1).reshape((3,3)))
False example:
is_square_ordered_matrix(np.arange(9).reshape((3,3)))
is_square_ordered_matrix(np.arange(5,-1,-1).reshape((3,2)))
Simple comparison:
>>> def is_square_ordered_matrix(a):
...     return a.shape[0] == a.shape[1] and np.all(a[:-1] > a[1:]) and np.all(a[:, :-1] > a[:, 1:])
...
>>> is_square_ordered_matrix(np.arange(8,-1,-1).reshape((3,3)))
True
>>> is_square_ordered_matrix(np.arange(9).reshape((3,3)))
False
>>> is_square_ordered_matrix(np.arange(5,-1,-1).reshape((3,2)))
False
First, compare a[:-1] with a[1:], which compares each row with the next row, then use np.all to check that every comparison holds:
>>> a = np.arange(8,-1,-1).reshape((3,3))
>>> a[:-1] # First and second rows
array([[8, 7, 6],
[5, 4, 3]])
>>> a[1:] # Second and third rows
array([[5, 4, 3],
[2, 1, 0]])
>>> a[:-1] > a[1:]
array([[ True, True, True],
[ True, True, True]])
>>> np.all(a[:-1] > a[1:])
True
Then compare a[:, :-1] with a[:, 1:], which compares each column with the next column:
>>> a[:, :-1] # First and second columns
array([[8, 7],
[5, 4],
[2, 1]])
>>> a[:, 1:] # Second and third columns
array([[7, 6],
[4, 3],
[1, 0]])
>>> a[:, :-1] > a[:, 1:]
array([[ True, True],
[ True, True],
[ True, True]])
Combining the row comparison and the column comparison gives the result you want.
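Putting the two comparisons together, a minimal sketch of the whole check might look like the following. It adds an explicit ndim test (the one-liner above omits it, so a 1D input would raise an IndexError there) and converts the result to a plain bool. The function name is_arranged is hypothetical.

```python
import numpy as np

def is_arranged(a):
    # Must be 2D and square
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        return False
    # Each element must exceed the one directly below it ...
    rows_ok = np.all(a[:-1] > a[1:])
    # ... and the one directly to its right
    cols_ok = np.all(a[:, :-1] > a[:, 1:])
    return bool(rows_ok and cols_ok)

print(is_arranged(np.arange(8, -1, -1).reshape((3, 3))))  # True
print(is_arranged(np.arange(9).reshape((3, 3))))          # False
print(is_arranged(np.arange(5, -1, -1).reshape((3, 2))))  # False
```

No loops, list comprehensions, or recursion are used; the slicing and np.all do all the work.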
I am trying to use np.where() function with nested lists.
I would like to find an index with a given condition of the first layer of the nested list.
For example, if I have the following code
arr = [[1,1], [2,2],[3,3]]
a = np.where(arr == [2,2])
then ideally I would like code to return 'a' as 1.
Since [2,2] is in index 1 of the nested list.
However, I am just getting an empty array back as a result.
Of course, I can make it work easily with an explicit for loop such as
for n in range(len(arr)):
    if arr[n] == [2,2]:
        a = n
but I would like to do this within a single np.where(...) call.
Is there a way to do this?
Well, you can write your own function to do so.
You'll need to:
Find every row equal to what you are looking for
Get the indices of the rows found (you can use where)
numpy comparison
You can use a comparison operator to see whether each element satisfies a condition. For example:
np_arr = np.array(
[1, 2, 3, 4, 5]
)
print(np_arr < 3)
This returns a boolean array in which each element is True where the condition is satisfied and False otherwise:
[ True True False False False]
For a 2D array you'll get a 2D boolean array:
to_find = np.array([2, 2])
np_arr = np.array(
[
[1, 1],
[2, 2],
[3, 3],
[2, 2]
]
)
print(np_arr == to_find)
The result is:
[[False False]
[ True True]
[False False]
[ True True]]
Now we are looking for rows in which all values are True, so we can use the all method of ndarray and pass the axis along which to reduce. With axis=1, all is evaluated across the columns of each row, giving one value per row:
to_find = np.array([2, 2])
np_arr = np.array(
[
[1, 1],
[2, 2],
[3, 3],
[2, 2]
]
)
print((np_arr == to_find).all(axis=1))
The result is:
[False True False True]
Get indices of the True values
Finally, you are looking for the indices where the values are True:
np.where((np_arr == to_find).all(axis=1))
The result would be:
(array([1, 3]),)
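The steps above can be wrapped in a small helper. This is only a sketch: the name find_rows is hypothetical, and it assumes every row of the nested list has the same length as the target.

```python
import numpy as np

def find_rows(rows, target):
    arr = np.asarray(rows)
    # Compare element-wise, then require every column of a row to match
    matches = (arr == np.asarray(target)).all(axis=1)
    # np.where returns a tuple with one index array per dimension
    return np.where(matches)[0]

indices = find_rows([[1, 1], [2, 2], [3, 3], [2, 2]], [2, 2])
print(indices)  # [1 3]
```

If only the first match is wanted, indices[0] gives it, provided the result is not empty.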
The best solution is the one mentioned by @Michael Szczesny, but you can also do this using np.where:
a = np.where(np.array(arr) == [2, 2])[0]
resulted_ind = np.where(np.bincount(a) == 2)[0] # --> [1]
numpy runs in Python, so you can use both the basic Python lists and numpy arrays (which are more like MATLAB matrices)
A list of lists:
In [43]: alist = [[1,1], [2,2],[3,3]]
A list has an index method, which tests against each element of the list (elements here are 2 element lists):
In [44]: alist.index([2,2])
Out[44]: 1
In [45]: alist.index([2,3])
Traceback (most recent call last):
Input In [45] in <cell line: 1>
alist.index([2,3])
ValueError: [2, 3] is not in list
alist==[2,2] returns False, because the list is not the same as the [2,2] list.
If we make an array from that list:
In [46]: arr = np.array(alist)
In [47]: arr
Out[47]:
array([[1, 1],
[2, 2],
[3, 3]])
we can do an == test - but it compares numeric elements.
In [48]: arr == np.array([2,2])
Out[48]:
array([[False, False],
[ True, True],
[False, False]])
Underlying this comparison is the concept of broadcasting, which allows a (3,2) array to be compared with a (2,) array (a 2d with a 1d). Here it's trivial, but it can be much more complicated.
To find rows where all values are True, use:
In [50]: (arr == np.array([2,2])).all(axis=1)
Out[50]: array([False, True, False])
and where finds the True in that array (the result is a tuple with 1 array):
In [51]: np.where(_)
Out[51]: (array([1]),)
In Octave the equivalent is:
>> arr = [[1,1];[2,2];[3,3]]
arr =
1 1
2 2
3 3
>> all(arr == [2,2],2)
ans =
0
1
0
>> find(all(arr == [2,2],2))
ans = 2
I'm having trouble creating a mask without using a for loop.
I've got a numpy array y of size N with my labels, and I want to create a mask of size NxN where mask[i, j] = True if and only if y[i] == y[j].
I've managed to do so using a for loop:
mask = np.asarray([np.where(y==y[k], 1, 0) for k in range(len(y))])
But I'm working on a GPU and this greatly increases the compute time. How can I do it without looping?
This might get you started:
n = 3
a = np.arange(n)
np.equal.outer(a, a)
# this is the same as
a[:,None] == a
Output:
array([[ True, False, False],
[False, True, False],
[False, False, True]])
This compares every pair of elements from the Cartesian product of a with itself: a[0] == a[0], a[0] == a[1], a[0] == a[2], a[1] == a[0], and so forth. With np.arange all values are distinct, which is why only the diagonal is True.
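Applied to a label array, the same broadcasting comparison might look like this. The label values below are made up for illustration, and the array is named y as in the question.

```python
import numpy as np

y = np.array([0, 1, 1, 2])

# y[:, None] has shape (N, 1) and y[None, :] has shape (1, N);
# broadcasting compares every pair, so mask[i, j] is True iff y[i] == y[j]
mask = y[:, None] == y[None, :]
print(mask)
```

The result is a boolean array; use mask.astype(int) if the 0/1 values produced by the original list comprehension are needed.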
You can use np.repeat and .T
a and b are just arbitrary data - the labels in your case.
import numpy as np
size = 4
a = np.arange(size)[:, None]
b = a.T
b[0, 2] = 1
c = np.repeat(a.T, repeats=size, axis=0)
d = np.repeat(b, repeats=size, axis=0).T
print(c)
print(d)
e = np.equal(c, d)
print(e)
out:
[[0 1 1 3]
[0 1 1 3]
[0 1 1 3]
[0 1 1 3]]
[[0 0 0 0]
[1 1 1 1]
[1 1 1 1]
[3 3 3 3]]
[[ True False False False]
[False True True False]
[False True True False]
[False False False True]]
For problems like these, np.indices is your friend:
dims = (len(y), len(y))
inds = np.indices(dims)
# index y with the row-index and column-index grids and compare element-wise
mask = y[inds[0]] == y[inds[1]]
edit:
Kevin's more specific solution is more concise and almost certainly faster than this method.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Li':[[1,2],[5,6],[8,9]],'Tu':[(1,2),(5,6),(8,9)]})
df
Li Tu
0 [1, 2] (1, 2)
1 [5, 6] (5, 6)
2 [8, 9] (8, 9)
It works fine for a tuple:
df.Tu == (1,2)
0 True
1 False
2 False
Name: Tu, dtype: bool
With a list, it raises a ValueError:
df.Li == [1,2]
ValueError: Lengths must match to compare
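The problem is that pandas treats a list on the right-hand side as an array-like to be compared element by element with the column, whereas a tuple is compared as a single value, so one option is to compare tuples: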
print (df.Li.map(tuple) == (1,2))
0 True
1 False
2 False
Name: Li, dtype: bool
Or in list comprehension:
mask = [tuple(x) == (1,2) for x in df.Li]
#alternative
mask = [x == [1,2] for x in df.Li]
print (mask)
[True, False, False]
If all lists have the same lengths:
mask = (np.array(df.Li.tolist()) == [1,2]).all(axis=1)
print (mask)
[ True False False]
The problem is that pandas is considering [1, 2] as a series-like object and trying to compare each element of df.Li with each element of [1, 2], hence the error:
ValueError: Lengths must match to compare
You cannot compare a list of length 2 with a Series of length 3 (df.Li). To verify this, you can do the following:
print(df.Li == [1, 2, 3])
Output
0 False
1 False
2 False
Name: Li, dtype: bool
It doesn't throw any error and works, but returns False for all as expected. In order to compare using list, you can do the following:
# this creates an array where each element is [1, 2]
data = np.empty(3, dtype=object)  # np.object was removed in NumPy 1.24; use the builtin object
data[:] = [[1, 2] for _ in range(3)]
print(df.Li == data)
Output
0 True
1 False
2 False
Name: Li, dtype: bool
All in all, this seems like a bug on the pandas side.
My column 'vectors' contained numpy ndarrays, and I got the same error when I tried to compare it to another ndarray, 'centroid'. The following works for numpy ndarrays:
df['vectors'].apply(lambda x: ((x==centroid).sum() == centroid.shape[0]))
A similar approach also works for lists:
df.Li.apply(lambda x: x==[1,2])
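As a self-contained illustration of the element-wise approaches above, the following sketch rebuilds the example column and compares each stored list individually, so pandas never tries to align the right-hand list with the column. It assumes pandas and NumPy are installed; the names mask_apply and mask_stacked are just for illustration, and the stacked variant assumes all lists have the same length.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Li': [[1, 2], [5, 6], [8, 9]]})

# Compare each stored list with the target list, one row at a time
mask_apply = df['Li'].apply(lambda x: x == [1, 2])

# Alternatively, stack the lists into a 2D array and require all columns to match
mask_stacked = (np.array(df['Li'].tolist()) == [1, 2]).all(axis=1)

print(df[mask_apply])
```

Either mask can be used for boolean indexing, as in the last line.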
I am using numpy to do some calculations. In the following code:
assert(len(A.shape) == 2)  # A is a 2D nparray
d1, d2 = A.shape
# want to initialize G, which has the same dimensions as A, and assign the last column of A to the last column of G
# initialize with value 0
G = np.zeros_like(A)
# assign the last column to that of G
G[:, d2-1] = A[:, d2-1]
# columns [0, d2-1) of G hold the average of the same columns of A, based on the condition on B
for iW in range(d2-1):
    n = 0
    sum = 0.0
    for i in range(d1):
        if B[i, 0] != iW and B[i, 1] == 0:
            sum += A[i, iW]
            n += 1
    for i in range(d1):
        if B[i, 0] != iW and B[i, 1] == 0:
            G[i, iW] = sum / (1.0 * n)
return G
Is there an easier way using "slicing" or "boolean array"?
Thanks!
In case you want G to have the same dimensionality as A and then change the appropriate elements of G, the following code should work:
# create G as a copy of A, otherwise you might change A by changing G
G = A.copy()
# getting the mask for all columns except the last one
m = (B[:,0][:,None] != np.arange(d2-1)[None,:]) & (B[:,1]==0)[:,None]
# getting a matrix with those elements of A which fulfills the conditions
C = np.where(m,A[:,:d2-1],0).astype(float)
# get the 'modified' average you use
avg = np.sum(C,axis=0)/np.sum(m.astype(int),axis=0)
# change the appropriate elements in all the columns except the last one
G[:,:-1] = np.where(m,avg,A[:,:d2-1])
After fiddling for a long time and finding bugs, I ended up with this code. I checked it against several random matrices A and specific choices of B
A = np.random.randint(100,size=(5,10))
B = np.column_stack(([4,2,1,3,4],np.zeros(5)))
and so far your results and mine were in agreement.
Here's a start, focusing on the first inner loop:
In [35]: A=np.arange(12).reshape(3,4)
In [36]: B=np.array([[0,0],[1,0],[2,0]])
In [37]: sum=0; iW=0
In [38]: for i in range(3):
   ....:     if B[i,0]!=iW and B[i,1]==0:
   ....:         sum += A[i,iW]
   ....:         print(i,A[i,iW])
   ....:
1 4
2 8
In [39]: A[(B[:,0]!=iW)&(B[:,1]==0),iW].sum()
Out[39]: 12
I had to provide my own sample data to test this.
The 2nd loop has the same condition (B[:,0]!=iW)&(B[:,1]==0), and should work in the same way.
As one of the comments said, the dimensions of G look funny. To make things work with my sample, let's make a zeros array. It looks like you are assigning the mean of a subset of A (sum/n) to selected elements of G. In the code below, I is assumed to be the boolean row mask (B[:,0]!=iW)&(B[:,1]==0) from above.
In [52]: G=np.zeros_like(A)
In [53]: G[I,iW]=A[I,iW].mean()
Assuming n, the number of terms summed for each iW, varies, it may be difficult to compress the outer loop into a vectorized step. If n were the same, you could pull out the subset of A that matches the condition, e.g. A1, take the mean on one axis, and assign the values to G. With different numbers of terms in the sums, you still have to loop.
It just occurred to me that masked arrays might work. Mask off the terms of A that don't meet the condition, and then take the mean.
In [91]: I=(B[:,[0]]!=np.arange(4))&(B[:,[1]]==0)
In [92]: I
Out[92]:
array([[False, True, True, True],
[ True, False, True, True],
[ True, True, False, True]], dtype=bool)
In [93]: A1=np.ma.masked_array(A, ~I)
In [94]: A1
Out[94]:
masked_array(data =
[[-- 1 2 3]
[4 -- 6 7]
[8 9 -- 11]],
mask =
[[ True False False False]
[False True False False]
[False False True False]],
fill_value = 999999)
In [95]: A1.mean(0)
Out[95]:
masked_array(data = [6.0 5.0 4.0 7.0],
mask = [False False False False],
fill_value = 1e+20)
Or with plonser's where:
In [111]: np.where(I,A,0).sum(0)/I.sum(0)
Out[111]: array([ 6., 5., 4., 7.])
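To tie the pieces together, here is a sketch that checks a vectorized version against a loop equivalent to the one in the question. The function names loop_version and vectorized_version are hypothetical, the sample A and B are the ones used above, and G is created as a float array (an assumption, so that the averages are not truncated). It also assumes every column has at least one row meeting the condition, since otherwise the division would be by zero.

```python
import numpy as np

def loop_version(A, B):
    d1, d2 = A.shape
    G = np.zeros(A.shape, dtype=float)
    G[:, -1] = A[:, -1]
    for iW in range(d2 - 1):
        # Rows that satisfy the condition for this column
        rows = [i for i in range(d1) if B[i, 0] != iW and B[i, 1] == 0]
        for i in rows:
            G[i, iW] = sum(A[j, iW] for j in rows) / len(rows)
    return G

def vectorized_version(A, B):
    d1, d2 = A.shape
    G = np.zeros(A.shape, dtype=float)
    G[:, -1] = A[:, -1]
    # mask[i, iW] is True where B[i, 0] != iW and B[i, 1] == 0
    mask = (B[:, [0]] != np.arange(d2 - 1)) & (B[:, [1]] == 0)
    sub = A[:, :-1]
    # Per-column mean over the rows selected by the mask
    avg = np.where(mask, sub, 0).sum(axis=0) / mask.sum(axis=0)
    # Write the mean where the condition holds, leave 0 elsewhere
    G[:, :-1] = np.where(mask, avg, 0.0)
    return G

A = np.arange(12).reshape(3, 4)
B = np.array([[0, 0], [1, 0], [2, 0]])
print(vectorized_version(A, B))
```

Like the question's code, this leaves zeros where the condition fails; the answer that starts from G = A.copy() keeps the original values of A there instead.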