I'm very new to numpy. I have an array like this and want to apply some operations on it.
It's easy in 2d array, lots of example are there but won't any such in 3d array.
arr = np.array([[[ True, True, False, True],[False, False, True, False],[False, False, False, False], [False, False, False, False],[False, False, True, False]],[[False, False, False, False], [True, True, True, True], [False, False, False, False], [False, False, False, False], [True, True, True, True] ], [[False, False, False, False], [False, False, False, False], [True, True, True, True], [False, False, False, False], [False, False, False, False] ] ])
If after applying np.all in the inner arrays the result would be: [[False, False, False, False, False], [False, True, False, False, True], [False, False, True, False, False]]
Then after np.any result would be: [False, True, True]
What, I mean is to get the overall index of array where any of the all 1d array values are true. Like in my certain case it should return index first and second, as at first index the sub first index and sub fourth index of array all values true and in second index sub second index all array values are true while in zero'th index not a single array whose all value are true.
What I've done so far is
np.all(np.any(arr, axis=1), axis=1)
OR
np.any(np.all(arr, axis=1), axis=1)
But both are not fruitfull, yes ,I can solve by comprehension but don't want any type looping which is gonna be my last option.
I think the issue comes from confusion in the axis handling.
As a rule of thumb, you can consider that the innermost brackets contain data of the highest dimension.
In your example, if you type arr.shape in your interpreter it will return the following tuple (3, 5, 4) which represent the size of arr in each dimension. If you now do the same with the expected result of the first operation you mentioned in your question you'll get (3, 5), so it looks like you (sort of speak) want to project the data of the highest dimension (i.e. axis=2) to the other axis. Same thing for the next operation except that the highest dimension is now axis=1.
To conclude, the following line would do the job np.any(np.all(arr, axis=2), axis=1).
Related
I'm trying to use a 2d boolean array (ix) to pick elements from a 1d array (c) to create a 2d array (r). The resulting 2d array is also a boolean array. Each column stands for the unique value in c.
Example:
>>> ix
array([[ True, True, False, False, False, False, False],
[False, False, True, False, False, False, True],
[False, False, False, True, False, False, False]])
>>> c
array([1, 2, 3, 4, 8, 2, 4])
Expected result
1, 2, 3, 4, 8
r = [
[ True, True, False, False, False], # c[ix[0][0]] == 1 and c[ix[0][1]] == 2; it doesn't matter that ix[0][5] (pointing to `2` in `c`) is False as ix[0][1] was already True which is sufficient.
[False, False, True, True, False], # [3]
[False, False, False, True, False] # [4] as ix[2][3] is True
]
Can this be done in a vectorised way?
Let us try:
# unique values
uniques = np.unique(c)
# boolean index into each row
vals = np.tile(c,3)[ix.ravel()]
# search within the unique values
idx = np.searchsorted(uniques, vals)
# pre-populate output
out = np.full((len(ix), len(uniques)), False)
# index into the output:
out[np.repeat(np.arange(len(ix)), ix.sum(1)), idx ] = True
Output:
array([[ True, True, False, False, False],
[False, False, True, True, False],
[False, False, False, True, False]])
X = np.arange(1, 26).reshape(5, 5)
X[:,1:2] % 2 == 0
The conditions should only be applied to the second column
I want the whole matrix where the condition is true like
[array([[False, True, False, False, False],
[ False, False, False, False, False],
[False, True, False, False, False],
[ False, False, False, False, False],
[False, True, False, False, False]])]
It's giving the error
IndexError: boolean index did not match indexed array along dimension 1; dimension is 5 but corresponding boolean dimension is 1
Is this what you want?
import numpy as np
X = np.arange(1, 26).reshape(5, 5)
X=[X[::] % 2 == 0]
print(X)
Output
[array([[False, True, False, True, False],
[ True, False, True, False, True],
[False, True, False, True, False],
[ True, False, True, False, True],
[False, True, False, True, False]])]
If you want to get the whole matrix where the condition is true. You can simply do this
X % 2 == 0
If you want to get the first column where condition is true then
X[:, 1:2] % 2 ==0
I want to find up/down patterns in a time series. This is what I use for simple up/down:
diff = np.diff(source, n=1)
encoding = np.where(diff > 0, 1, 0)
Is there a way with Numpy to do that for patterns with a given lookback length without a slow loop? For example up/up/up = 0 down/down/down = 1 up/down/up = 2 up/down/down = 3.....
Thank you for your help.
I learned yesterday about np.lib.stride_tricks.as_strided from one of StackOverflow answers similar to this. This is an awesome trick and not that hard to understand as I expected. Now, if you get it, let's define a function called rolling that lists all the patterns to check with:
def rolling(a, window):
shape = (a.size - window + 1, window)
strides = (a.itemsize, a.itemsize)
return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
compare_with = [True, False, True]
bool_arr = np.random.choice([True, False], size=15)
paterns = rolling(bool_arr, len(compare_with))
And after that you can calculate indexes of pattern matches as discussed here
idx = np.where(np.all(paterns == compare_with, axis=1))
Sample run:
bool_arr
array([ True, False, True, False, True, True, False, False, False,
False, False, False, True, True, False])
patterns
array([[ True, False, True],
[False, True, False],
[ True, False, True],
[False, True, True],
[ True, True, False],
[ True, False, False],
[False, False, False],
[False, False, False],
[False, False, False],
[False, False, False],
[False, False, True],
[False, True, True],
[ True, True, False]])
idx
(array([ 0, 2, 13], dtype=int64),)
I was debugging a code written by someone who has left the organization and came across a line, which uses np.less_equal.outer & np.greater_equal.outer functions. I know that np.outer creates a Cartesian cross product of two 1-dimensional arrays and creates two arrays, and np.less_equal compares the element of two arrays and returns true or false. Can someone please explain how this combined form works.
Thanks!
less_equal and greater_equal are special types of numpy functions called ufuncs, in that they have extendible functionalities, including accumulate, at, and outer.
In this case ufunc.outer extends the function to work similarly to the outer product - but while the actual outer product would be multiply.outer, this instead does the greater or less than comparison.
So you get a 2d array of booleans corresponding to each element of the first array, and whether they are greater or less than each of the elements in the second array.
np.less_equal.outer(range(1,18),range(1,13))
Out[]:
array([[ True, True, True, ..., True, True, True],
[False, True, True, ..., True, True, True],
[False, False, True, ..., True, True, True],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]], dtype=bool)
EDIT: a much more pythonic way of doing this would be:
np.triu(np.ones((18, 13), dtype = bool), 0)
That is, the upper triangle of a boolean array of shape (18, 13)
From the documentation, we have that for one-dimensional arrays A and B, the operation np.less_equal.outer(A, B) is equivalent to:
m = len(A)
n = len(B)
r = empty(m, n)
for i in range(m):
for j in range(n):
r[i,j] = (A[i] <= B[j])
Here's the mathematical representation of the result:
here is an example:
np.less_equal([4, 2, 1], [2, 2, 2])
array([False, True, True])
np.greater_equal([4, 2, 1], [2, 2, 2])
array([ True, True, False], dtype=bool)
and first the outer function
np.outer(range(1,2), range(1,3))
array([[1 2 3],
[2 4 6],
)
hope that helps.
Here is the problem:
Take a numpy boolean array:
a = np.array([False, False, True, True, True, False, True, False])
Which I am using as indexes to panda dataframe. But I need to create 2 new arrays where they each have half the True's as the original array. Note the example arrays are much shorter than actual set.
So like:
1st_half = array([False, False, True, True, False, False, False, False])
2nd_half = array([False, False, False, False, True, False, True, False])
Does anyone have a good way to do this? Thanks.
First find true indices
inds = np.where(a)[0]
cutoff = inds[inds.shape[0]//2]
Set values equivalent before and after cutoff:
b = np.zeros(a.shape,dtype=bool)
c = np.zeros(a.shape,dtype=bool)
c[cutoff:] = a[cutoff:]
b[:cutoff] = a[:cutoff]
Results:
b
Out[65]: array([False, False, True, True, False, False, False, False], dtype=bool)
c
Out[64]: array([False, False, False, False, True, False, True, False], dtype=bool)
There are numerous ways to handle the indexing.