I have a numpy array containing integers and slice objects, e.g.:
x = np.array([0,slice(None)])
How do I retrieve the (logical) indices of the integers or slice objects? I tried np.isfinite(x) (producing an error), np.isreal(x) (all True), np.isscalar(x) (not element-wise), all in vain.
What seems to work though is
ind = x < np.inf  # array([ True, False], dtype=bool)
but I'm reluctant to use a numerical comparison on an object whose numerical value is completely arbitrary (and might change in the future). Is there a better way to achieve this?
You can do this:
import numpy as np
checker = np.vectorize(lambda x: isinstance(x, slice))
x = np.array([0,slice(None),slice(None),0,0,slice(None)])
checker(x)
#array([False, True, True, False, False, True], dtype=bool)
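If you need the positions rather than the mask itself, np.nonzero turns the boolean mask into indices. A small sketch building on the checker above:

```python
import numpy as np

# Vectorized isinstance check, as in the answer above.
checker = np.vectorize(lambda v: isinstance(v, slice))

x = np.array([0, slice(None), slice(None), 0, 0, slice(None)])
mask = checker(x)

slice_indices = np.nonzero(mask)[0]   # positions holding slice objects
int_indices = np.nonzero(~mask)[0]    # positions holding integers

print(slice_indices)  # [1 2 5]
print(int_indices)    # [0 3 4]
```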
Related
Let's say I have two numpy arrays:
>>> v1
array([ True, False, False, False, True])
>>> v2
array([False, False, True, True, True])
I'm trying to retrieve an array that has the same length (5) and contains True in each position where v1==True AND v2==False. That would be:
array([True, False, False, False, False])
Is there a quick way in numpy, something like logical_not() but considering v1 as the reference and v2 as the query?
You just need to use the right bitwise operators:
v1 & ~v2
# array([ True, False, False, False, False])
For boolean values, logical and bitwise operations are the same. It is therefore quite idiomatic to write
v1 & ~v2
However, this is a bitwise operation, and it produces a potentially unnecessary temporary array. You cannot write v1 and not v2, much as you might like to, because Python would try to convert each operand to a single boolean value. Instead, you have to call the logical_and and logical_not ufuncs:
np.logical_and(v1, np.logical_not(v2))
The nice thing is that these ufuncs let you avoid the temporary array, or even write directly to a buffer of your choice:
result = np.empty_like(v1)
np.logical_not(v2, out=result)
np.logical_and(v1, result, out=result)
You can even do the whole thing in-place (in v2):
np.logical_and(v1, np.logical_not(v2, out=v2), out=v2)
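Putting the two variants side by side with the arrays from the question, as a small self-contained check:

```python
import numpy as np

v1 = np.array([True, False, False, False, True])
v2 = np.array([False, False, True, True, True])

# Bitwise form (allocates a temporary for ~v2):
out_bitwise = v1 & ~v2

# Ufunc form writing into a preallocated buffer, no temporary:
result = np.empty_like(v1)
np.logical_not(v2, out=result)
np.logical_and(v1, result, out=result)

print(out_bitwise)  # [ True False False False False]
print(result)       # [ True False False False False]
```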
You can make use of bitwise operators here:
>>> v1 & ~v2
array([ True, False, False, False, False])
I am currently working with NumPy version 1.12.1, and every call to numpy.where() returns an empty list with the following warning:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
I am comparing a string, date_now and a list, dates_list:
np.where(date_now==dates_list)
This causes errors, as the program subsequently calls functions that expect the numpy.where() output to be non-empty. Does anyone have a solution for this?
Thanks in advance.
In your current comparison, you are comparing the entire list object, dates_list, to a string, date_now. Element-wise comparison therefore fails and a single scalar is returned, as if you had compared two scalar values:
date_now = '2017-07-10'
dates_list = ['2017-07-10', '2017-07-09', '2017-07-08']
np.where(dates_list==date_now, True, False)
Out[3]: array(0)
What you want is to declare dates_list as a NumPy array to facilitate element-wise comparison.
np.where(np.array(dates_list)==date_now, True, False)
Out[8]: array([ True, False, False], dtype=bool)
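The usual goal with np.where is the matching indices rather than a True/False array; a short sketch of both on the data above:

```python
import numpy as np

date_now = '2017-07-10'
dates_list = ['2017-07-10', '2017-07-09', '2017-07-08']

# Converting the list to an array first makes the == element-wise.
dates_arr = np.array(dates_list)
mask = dates_arr == date_now

print(mask)               # [ True False False]
print(np.where(mask)[0])  # [0] -- index of the matching date
```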
I had a question about equality comparison with numpy and arrays of strings.
Say I define the following array:
x = np.array(['yes', 'no', 'maybe'])
Then I can test for equality against other strings, and it performs element-wise comparison with the single string (following, I think, the broadcasting rules described at http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html):
'yes' == x
#op : array([ True, False, False], dtype=bool)
x == 'yes'
#op : array([ True, False, False], dtype=bool)
However, if I compare with unicode strings I get different behaviour: element-wise comparison only happens if I compare the array to the string, and only a single comparison is made if I compare the string to the array.
x == u'yes'
#op : array([ True, False, False], dtype=bool)
u'yes' == x
#op : False
I can't find details of this behaviour in the numpy docs and was hoping someone could explain or point me to details of why comparison with unicode strings behaves differently?
The relevant piece of information is this part of the Python's coercion rules:
For objects x and y, first x.__op__(y) is tried. If this is not implemented or returns NotImplemented, y.__rop__(x) is tried.
Using your numpy array x, when the left-hand side is a str ('yes' == x):
'yes'.__eq__(x) returns NotImplemented and
therefore resolves to x.__eq__('yes') – resulting in numpy's element-wise comparison.
However, when the left-hand side is a unicode (u'yes' == x):
u'yes'.__eq__(x) simply returns False.
The reason for the different __eq__ behaviours is that str.__eq__() simply returns NotImplemented if its argument is not a str type, whereas unicode.__eq__() first tries to convert its argument to a unicode, and only returns NotImplemented if that conversion fails. In this case, the numpy array is convertible to a unicode: u'yes' == x is essentially u'yes' == unicode(x).
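For what it's worth, on Python 3 (where str is what Python 2 called unicode) the asymmetry is gone: str.__eq__ returns NotImplemented for a non-str argument, so both orders fall through to numpy's element-wise comparison. A quick check:

```python
import numpy as np

x = np.array(['yes', 'no', 'maybe'])

# On Python 3, both orders broadcast element-wise:
print('yes' == x)  # [ True False False]
print(x == 'yes')  # [ True False False]
```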
I understand that tf.where will return the locations of True values, so that I could use the result's shape[0] to get the number of Trues.
However, when I try and use this, the dimension is unknown (which makes sense as it needs to be computed at runtime). So my question is, how can I access a dimension and use it in an operation like a sum?
For example:
myOtherTensor = tf.constant([[True, True], [False, True]])
myTensor = tf.where(myOtherTensor)
myTensor.get_shape() #=> [None, 2]
sum = 0
sum += myTensor.get_shape().as_list()[0] # Well defined at runtime but considered None until then.
You can cast the values to floats and compute the sum on them:
tf.reduce_sum(tf.cast(myOtherTensor, tf.float32))
Depending on your actual use case you can also compute sums per row/column if you specify the reduce dimensions of the call.
I think this is the easiest way to do it:
In [38]: myOtherTensor = tf.constant([[True, True], [False, True]])
In [39]: if_true = tf.count_nonzero(myOtherTensor)
In [40]: sess.run(if_true)
Out[40]: 3
Rafal's answer is almost certainly the simplest way to count the number of true elements in your tensor, but the other part of your question asked:
[H]ow can I access a dimension and use it in an operation like a sum?
To do this, you can use TensorFlow's shape-related operations, which act on the runtime value of the tensor. For example, tf.size(t) produces a scalar Tensor containing the number of elements in t, and tf.shape(t) produces a 1D Tensor containing the size of t in each dimension.
Using these operators, your program could also be written as:
myOtherTensor = tf.constant([[True, True], [False, True]])
myTensor = tf.where(myOtherTensor)
countTrue = tf.shape(myTensor)[0] # Size of `myTensor` in the 0th dimension.
sess = tf.Session()
sum = sess.run(countTrue)
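The same shape-based counting can be sketched in plain NumPy, where np.argwhere plays the role of tf.where on a boolean tensor; this may help make clear what the graph computes:

```python
import numpy as np

mask = np.array([[True, True], [False, True]])

# np.argwhere, like tf.where on a boolean tensor, returns one row
# of coordinates per True element; its length is the True count.
coords = np.argwhere(mask)
count_true = coords.shape[0]
print(count_true)  # 3
```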
There is a tensorflow function to count non-zero values tf.count_nonzero. The function also accepts an axis and keep_dims arguments.
Here is a simple example:
import numpy as np
import tensorflow as tf
a = tf.constant(np.random.random(100))
with tf.Session() as sess:
    print(sess.run(tf.count_nonzero(tf.greater(a, 0.5))))
I'm trying to do an "&" operation across all the values in a simple bool array. The array I have is as follows:
array([False, False, True], dtype=bool)
The only thing I've come up with is to slice out the values in the array and use "&" to give a "False" result. I feel like there must be a better way but I don't know enough about numpy to use it properly.
Use arr.all(), which is the same as np.all(arr):
import numpy as np
arr = np.array([False, False, True], dtype=bool)
arr.all()
=> False
np.all(arr)
=> False
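If the array is multi-dimensional, all() also accepts an axis argument, and any() is the OR counterpart. A small sketch:

```python
import numpy as np

arr = np.array([[False, False, True],
                [True,  True,  True]])

# all() reduces with AND; any() reduces with OR.
print(arr.all())        # False -- AND over every element
print(arr.all(axis=1))  # [False  True] -- AND across each row
print(arr.any())        # True  -- OR over every element
```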