apply a boolean operator among all elements of an array - python

Is there a practical way to apply the same boolean operator (say or) to all elements of an array without using a for loop ?
I will clarify what I need with an example:
import numpy as np
a=np.array([[1,0,0],[1,0,0],[0,0,1]])
b=a[0] | a[1] | a[2]
print(b)
What is the concise way to apply the or boolean operator across all rows of a matrix, as I have done above?

The usual way to do this would be to apply numpy.any along an axis:
numpy.any(a, axis=0)
That said, there is also a way to do this through the operator more directly. NumPy ufuncs have a reduce method that can be used to apply them along an axis of an array, or across all elements of an array. Using numpy.logical_or.reduce, we can express this as
numpy.logical_or.reduce(a, axis=0)
This doesn't come up much, because most ufuncs you'd want to call reduce on already have equivalent helper functions defined. add has sum, multiply has prod, logical_and has all, logical_or has any, maximum has amax, and minimum has amin.
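As a quick check of the pairing described above, here is a minimal sketch using the array from the question. The variable names via_reduce and via_any are only illustrative.

```python
import numpy as np

a = np.array([[1, 0, 0], [1, 0, 0], [0, 0, 1]])

# Reduce with the ufunc directly, column by column.
via_reduce = np.logical_or.reduce(a, axis=0)
# The equivalent helper function.
via_any = np.any(a, axis=0)

# The same pairing holds for other ufuncs and their helpers.
assert np.array_equal(via_reduce, via_any)
assert np.array_equal(np.add.reduce(a, axis=0), a.sum(axis=0))
assert np.array_equal(np.maximum.reduce(a, axis=0), a.max(axis=0))
```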

try either:
np.any(arr, axis=0)
or
np.apply_along_axis(any, 0, arr)
or if you want to use pandas for some reason,
df.any(axis=0)

You can use the reduce function for that:
from functools import reduce
import numpy as np
a = np.array([[1,0,0],[1,0,0],[0,0,1]])
reduce(np.bitwise_or, a)
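One thing worth knowing about this approach: bitwise_or combines integer bit patterns, so it only matches a logical "or" when the values are 0/1 or booleans. The sketch below uses made-up values (2 and 4 are not from the question) to show the difference.

```python
from functools import reduce
import numpy as np

# Hypothetical data with values other than 0 and 1.
a = np.array([[2, 0, 0], [1, 0, 0], [0, 0, 4]])

# ORs the integer bit patterns row by row; the result stays an integer array.
bitwise = reduce(np.bitwise_or, a)
# Treats any nonzero value as True; the result is a boolean array.
logical = np.logical_or.reduce(a, axis=0)

print(bitwise)
print(logical)
```

Converting the bitwise result with astype(bool) gives the same truth values as the logical version.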

NOTE: I'm not a NumPy expert, so I am making an assumption below.
In your example, b is comparing the arrays with each other, so it is asking:
"Are there any items in a[0] OR any items in a[1] OR any items in a[2]?" Is that your goal? In which case, you could use the builtin any(),
for example (changed the NumPy array to a simple list of lists):
a=[[1,0,0],[1,0,0],[0,0,1]]
b=any(a)
print(b)
b will be True (each inner list is non-empty, so it counts as truthy regardless of its contents)
If, however, you want to know whether any element in it is true, so, for example, you want to know a[0][0] | a[0][1] | a[0][2] | a[1][0] | ...
you could use the builtin map command, so something like:
a=[[1,0,0],[1,0,0],[0,0,1]]
b=any(map(any, a))
print(b)
b will still be True
Note: below is based on looking at the NumPy docs, not actual experience.
For NumPy, you could also use the NumPy any() option something like
a=np.array([[1,0,0],[1,0,0],[0,0,1]])
b=a.any()
print(b)
or, if you're working with numbers anyway, you could sum the array and see if it != 0
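A small sketch of the options above, plus one caveat about the sum test: it only works when values cannot cancel out. The array named mixed is a made-up example, not from the question.

```python
import numpy as np

a = np.array([[1, 0, 0], [1, 0, 0], [0, 0, 1]])

any_overall = a.any()            # one bool for the whole array
any_per_column = a.any(axis=0)   # one bool per column, like a[0] | a[1] | a[2]

# Hypothetical data with a negative entry: the sum is zero although
# nonzero elements are present, so the "sum != 0" test gives the wrong answer.
mixed = np.array([1, -1, 0])
sum_test = mixed.sum() != 0
any_test = mixed.any()
```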

The sum of the products of a two-dimensional array python

I have 2 arrays of a million elements (created from an image with the brightness of each pixel)
I need to get a number that is the sum of the products of the array elements at the same position. That is, A(1,1) * B(1,1) + A(1,2) * B(1,2)...
In my loop, Python runs the innermost variable (j1) through all its values, then adds 1 to the next-to-last variable and runs through the last one again, and so on. How can I make it pair up elements at the same position?
res1, res2 - arrays (specifically - numpy.ndarray)
Perhaps there is a ready-made function for this, but I need to make it as explicit as possible, without a ready-made one.
sum = 0
for i in range(len(res1)):
    for j in range(len(res2[i])):
        for i1 in range(len(res2)):
            for j1 in range(len(res1[i1])):
                sum += res1[i][j]*res2[i1][j1]
In the first part of my answer I'll explain how to fix your code directly. Your code is almost correct but contains one big mistake in logic. In the second part of my answer I'll explain how to solve your problem using numpy. numpy is the standard python package to deal with arrays of numbers. If you're manipulating big arrays of numbers, there is no excuse not to use numpy.
Fixing your code
Your code uses 4 nested for-loops, with indices i and j to iterate on the first array, and indices i1 and j1 to iterate on the second array.
Thus you're multiplying every element res1[i][j] from the first array, with every element res2[i1][j1] from the second array. This is not what you want. You only want to multiply every element res1[i][j] from the first array with the corresponding element res2[i][j] from the second array: you should use the same indices for the first and the second array. Thus there should only be two nested for-loops.
s = 0
for i in range(len(res1)):
    for j in range(len(res1[i])):
        s += res1[i][j] * res2[i][j]
Note that I called the variable s instead of sum. This is because sum is the name of a builtin function in python. Shadowing the name of a builtin is heavily discouraged. Here is the list of builtins: https://docs.python.org/3/library/functions.html ; do not name a variable with a name from that list.
Now, in general, in Python, we dislike using range(len(...)) in a for-loop. The official tutorial's section on for loops shows that a for-loop can iterate over elements directly, rather than over indices.
For instance, here is how to iterate on one array, to sum the elements on an array, without using range(len(...)) and without using indices:
# sum the elements in an array
s = 0
for row in res1:
    for x in row:
        s += x
Here row is a whole row, and x is an element. We don't refer to indices at all.
Useful tools for looping are the builtin functions zip and enumerate:
enumerate can be used if you need access both to the elements, and to their indices;
zip can be used to iterate on two arrays simultaneously.
I won't show an example with enumerate, but zip is exactly what you need since you want to iterate on two arrays:
s = 0
for row1, row2 in zip(res1, res2):
    for x, y in zip(row1, row2):
        s += x * y
You can also use builtin function sum to write this all without += and without the initial = 0:
s = sum(x * y for row1,row2 in zip(res1, res2) for x,y in zip(row1, row2))
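For completeness, here is a hedged sketch of how enumerate could be used for the same computation. The small lists stand in for the real arrays and are purely illustrative.

```python
# Hypothetical small inputs standing in for res1 and res2.
res1 = [[1, 2], [3, 4]]
res2 = [[5, 6], [7, 8]]

s = 0
for i, row in enumerate(res1):
    for j, x in enumerate(row):
        # x is res1[i][j]; the indices are only needed to reach into res2.
        s += x * res2[i][j]

print(s)
```

The zip version above is usually tidier here, since the indices serve no other purpose.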
Using numpy
As I mentioned in the introduction, numpy is a standard python package to deal with arrays of numbers. In general, operations on arrays using numpy are much, much faster than loops on arrays in core python. Plus, code using numpy is usually easier to read than code using core python only, because there are a lot of useful functions and convenient notations. For instance, here is a simple way to achieve what you want:
import numpy as np
# convert to numpy arrays
res1 = np.array(res1)
res2 = np.array(res2)
# multiply elements with corresponding elements, then sum
s = (res1 * res2).sum()
Relevant documentation:
sum: .sum() or np.sum();
pointwise multiplication: np.multiply() or *;
dot product: np.dot.
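One caveat, assuming the pixel brightness arrays are stored as uint8 (an assumption; the question does not state the dtype): the element-wise product is then computed in uint8 and wraps around. A minimal sketch with made-up values:

```python
import numpy as np

# Hypothetical uint8 data, as image brightness values often are.
a = np.array([[200, 100]], dtype=np.uint8)
b = np.array([[2, 3]], dtype=np.uint8)

# The product is computed in uint8, so 400 and 300 wrap modulo 256.
wrapped = (a * b).sum()

# Casting to a wider integer type first avoids the wraparound.
safe = (a.astype(np.int64) * b.astype(np.int64)).sum()

print(wrapped, safe)
```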
Solution 1:
import numpy as np
a,b = np.array(range(100)), np.array(range(100))
print((a * b).sum())
Solution 2 (more open, because of use of pd.DataFrame):
import pandas as pd
import numpy as np
a,b = np.array(range(100)), np.array(range(100))
df = pd.DataFrame(dict({'col1': a, 'col2': b}))
df['vect_product'] = df.col1 * df.col2
print(df['vect_product'].sum())
Two simple and fast options using numpy are: (A*B).sum() and np.dot(A.ravel(),B.ravel()). The first method sums all elements of the element-wise multiplication of A and B. np.sum() defaults to sum(axis=None), so we will get a single number. In the second method, you create a 1D view into the two matrices and then apply the dot-product method to get a single number.
import numpy as np
A = np.random.rand(1000,1000)
B = np.random.rand(1000,1000)
s = (A*B).sum() # method 1
s = np.dot(A.ravel(),B.ravel()) # method 2
The second method should be very fast: for contiguous arrays, ravel() returns a view into A and B rather than a copy, so no extra memory is allocated for the flattened inputs.
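The following sketch checks that the two methods agree and that ravel() really does return a view for a contiguous array. The einsum line is an extra alternative that is not part of the answer above, and the array shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 300))
B = rng.random((200, 300))

s1 = (A * B).sum()                   # method 1: temporary product array, then sum
s2 = np.dot(A.ravel(), B.ravel())    # method 2: dot product of flattened views
s3 = np.einsum("ij,ij->", A, B)      # alternative: multiply and sum in one call

# ravel() is a view for a C-contiguous array, but a copy for its transpose.
ravel_is_view = np.shares_memory(A, A.ravel())
transpose_ravel_is_view = np.shares_memory(A.T, A.T.ravel())
```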

vectorize recursive function of numpy array where each element depend on all of the previous ones

Let a be an ndarray, e.g.:
a = np.random.randn(Size)
Where Size >> 1. Is it possible to define an array b s.t. its i-th element depends on all of the elements of a up to i (excluded or included is not the problem) without a for loop?
b[i] = function(a[:i])
So if function was simply np.sum(a[:i]) my desired output would be:
for i in range(1, Size):
    b[i] = np.sum(a[:i])
The only solution I was able to think of was to write the corresponding C code and wrap it, but is there a native Python solution that avoids it?
I stress that the sum is a mere example; I'm looking for a generalization to an arbitrary function that can, however, be expressed elementwise by means of NumPy mathematical functions (np.exp(), e.g.).
Many ufuncs have an accumulate method; np.cumsum is basically np.add.accumulate. But if you can't use one of those, or some clever combination, and you still want speed, you will need to write some sort of compiled code. numba seems to be the preferred tool these days.
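As an illustration of accumulate and of a "clever combination", here is a small sketch. The running mean is just one example of combining accumulated results; the input values are made up.

```python
import numpy as np

a = np.array([3.0, 1.0, 4.0, 1.0, 5.0])

running_sum = np.add.accumulate(a)       # same result as np.cumsum(a)
running_max = np.maximum.accumulate(a)   # largest value seen so far

# A combination: the mean of a[:i+1] for every i, without a Python loop.
running_mean = running_sum / np.arange(1, a.size + 1)
```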
For your example, just use NumPy's cumsum operation: https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html.
Edit:
For example, if you create a = np.ones(10), with all values equal to 1, then b = np.cumsum(a) will contain [1 2 ... 10].
Or, as you wanted (this loop gives b[i] equal to np.cumsum(a)[i - 1]):
for i in range(1, Size):
    b[i] = np.sum(a[:i])
You can also specify the axis to apply cumsum along, or use numpy.cumprod (the same operation but with a product).
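To match the loop exactly, where b[i] excludes a[i], the cumulative sum can be shifted by one position. This is a sketch under the assumption that b[0] should be 0 (the sum of an empty slice); the input values are made up.

```python
import numpy as np

a = np.arange(1.0, 6.0)  # hypothetical input: [1, 2, 3, 4, 5]

# Reference result from the loop in the question.
b_loop = np.zeros_like(a)
for i in range(1, a.size):
    b_loop[i] = np.sum(a[:i])

# Vectorized: drop the last cumulative sum and prepend 0.
b_vec = np.concatenate(([0.0], np.cumsum(a)[:-1]))
```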

python: avoiding nested for-loop [duplicate]

How can I avoid the nested for-loops in the following Code and using the map-function instead for example?
import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
n = len(A[0])
B = np.zeros((n,n))
for i in range(n):
    for j in range(n):
        B[j,i] = min(A[:,i]/A[:,j])
I don't think map() is a good candidate for this problem because your desired result is nested rather than flattened. It'd be a little more verbose than just using a list comprehension to achieve your desired result. Here's a cleaner way of initializing B.
B = np.array([np.min(A.T/r, axis=1) for r in A.T])
This iterates over every column of A (every row of A.T) and computes the broadcasted division A.T/r. Numpy is able to optimize that far beyond what we can do with raw loops. Then the use of np.min() computes minimums of that matrix we just computed along every row (rows instead of columns because of the parameter axis=1) more quickly than we would be able to with the builtin min() because of internal vectorization.
If you did want to map something, @MartijnPieters points out that you could use itertools with itertools.product(range(len(A[0])), repeat=2) to achieve the same pairs of indices. Otherwise, you could use the generator ((r, s) for r in A.T for s in A.T) to get pairs of rows instead of pairs of indices. In either case you could apply map() across that iterable, but you'd still have to somehow nest the results to initialize B properly.
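A rough sketch of the map() route with itertools.product, using reshape to nest the flat results. The helper name min_ratio is made up for illustration, and A is built as a float array here so the division behaves the same on any Python version.

```python
from itertools import product
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=float)
n = A.shape[1]

def min_ratio(pair):
    # pair is (j, i), matching B[j, i] = min(A[:, i] / A[:, j]).
    j, i = pair
    return np.min(A[:, i] / A[:, j])

# product() yields (0, 0), (0, 1), ... in row-major order,
# so reshaping the flat list puts each value at B[j, i].
flat = list(map(min_ratio, product(range(n), repeat=2)))
B = np.array(flat).reshape(n, n)
```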
Note that this probably isn't the result you would expect on Python 2. The elements of A are integers by default (notice that you used, e.g., 3 instead of 3.). On Python 2, without from __future__ import division, dividing the integer elements of A gives integers again. In your code, that was partially obfuscated by the fact that np.zeros() casts those integers to floats, but the math was wrong regardless. On Python 3, / is true division and the problem does not arise. To be safe either way, pass an additional argument when constructing A:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]], float)
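If removing the Python-level loop entirely is the goal, broadcasting can compute all the column ratios at once. This is a sketch rather than the answer's own method; it builds an intermediate array of shape (rows, n, n), which may be too large for big inputs.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=float)
n = A.shape[1]

# Reference result from the nested loops in the question.
B_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        B_loop[j, i] = min(A[:, i] / A[:, j])

# ratios[k, j, i] = A[k, i] / A[k, j]; the minimum over k gives B[j, i].
ratios = A[:, None, :] / A[:, :, None]
B_vec = ratios.min(axis=0)
```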

Piecewise Operation on List of Numpy Arrays

My question is: can I make a function or variable that can perform an operation or NumPy method on each np.array element within a list in a more succinct way than what I have below (preferably by just calling one function or variable)?
Generating the list of arrays:
import numpy as np
array_list = [np.random.rand(3,3) for x in range(5)]
array_list
Current Technique of operating on each element:
My current method (as seen below) involves unpacking it and doing something to it:
[arr.std() for arr in array_list]
[arr + 2 for arr in array_list]
Goal:
My hope is to get something that could perform the operations above by simply typing:
x.std()
or
x + 2
Yes - use an actual NumPy array and perform your operations over the desired axes, instead of having them stuffed in a list.
actual_array = np.array(array_list)
actual_array.std(axis=(1, 2))
# array([0.15792346, 0.25781021, 0.27554279, 0.2693581 , 0.28742179])
If you generally wanted all axes except the first, this could be something like tuple(range(1, actual_array.ndim)) instead of explicitly specifying the tuple.
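Put together, a minimal sketch of this approach might look as follows. It assumes all arrays in the list share the same shape; np.stack is used in place of np.array only to make that requirement explicit.

```python
import numpy as np

rng = np.random.default_rng(0)
array_list = [rng.random((3, 3)) for _ in range(5)]

# Stack the list into one array of shape (5, 3, 3).
x = np.stack(array_list)

# All axes except the first, computed rather than hard-coded.
inner_axes = tuple(range(1, x.ndim))

per_array_std = x.std(axis=inner_axes)  # one value per original array
shifted = x + 2                         # applies to every element of every array
```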

Applying a function element-wise to multiple numpy arrays

Say I have two numpy arrays of the same dimensions, e.g.:
a = np.ones((4,))
b = np.linspace(0,4,4)
and a function that is supposed to operate on elements of those arrays:
def my_func (x,y):
    # do something, e.g.
    z = x+y
    return z
How can I apply this function to the elements of a and b in an element-wise fashion and get the result back?
It depends, really. For the given function, how about a + b, for instance? Presumably you have something more complex in mind though.
The most general solution is np.vectorize, but it's also the slowest. Depending on what you want to do, more clever solutions may exist. Take a look at numexpr, for example.
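To make the options concrete, here is a small sketch. The function larger is a made-up example of something that only works on scalars, which is where np.vectorize becomes useful; np.where is shown as one possible faster equivalent for that particular case.

```python
import numpy as np

a = np.ones((4,))
b = np.linspace(0, 4, 4)

def my_func(x, y):
    return x + y

# Functions built from NumPy-compatible operations work on arrays directly.
direct = my_func(a, b)

# Hypothetical scalar-only function: the "if" fails on whole arrays.
def larger(x, y):
    return x if x > y else y

via_vectorize = np.vectorize(larger)(a, b)  # general, but loops in Python
via_where = np.where(a > b, a, b)           # array-level equivalent
```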
