Evaluate normal cdfs at each of several points - python

I want to evaluate several normal CDFs, defined by a 4x3 grid of points, at each of 5 points.
import numpy as np
import scipy.stats
a = np.array([-1, 0, 1])
b = np.array([1, 2, 3, 4])
x = np.array([-.5, 0, .5, 1, 2])
grid_a, grid_b = np.meshgrid(a,b)
scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)
This raises the following exception:
ValueError                                Traceback (most recent call last)
<ipython-input-46-82423c7451d2> in <module>()
3 x = np.array([-.5, 0, .5, 1, 2])
4 grid_a, grid_b = np.meshgrid(a,b)
----> 5 scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)
~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x)
454
455 def cdf(self, x):
--> 456 return self.dist.cdf(x, *self.args, **self.kwds)
457
458 def logcdf(self, x):
~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x, *args, **kwds)
1733 args = tuple(map(asarray, args))
1734 dtyp = np.find_common_type([x.dtype, np.float64], [])
-> 1735 x = np.asarray((x - loc)/scale, dtype=dtyp)
1736 cond0 = self._argcheck(*args) & (scale > 0)
1737 cond1 = self._open_support_mask(x) & (scale > 0)
ValueError: operands could not be broadcast together with shapes (5,) (4,3)

You have to reshape a, b and x to be compatible for broadcasting. You can do this, for example, by adding one trivial dimension to a and two trivial dimensions to b. That is, use a[:, None] (which has shape (3, 1)) and b[:, None, None] (which has shape (4, 1, 1)). (Instead of None, you might prefer the more explicit np.newaxis, but its value is just None.) Then with x having shape (5,) and the reshaped a and b having shapes (3, 1) and (4, 1, 1), respectively, the shape of the computed result with broadcasting is (4, 3, 5):
In [45]: from scipy.stats import norm
In [46]: a = np.array([-1, 0, 1])
In [47]: b = np.array([1, 2, 3, 4])
In [48]: x = np.array([-.5, 0, .5, 1, 2])
In [49]: c = norm.cdf(x, loc=a[:, None], scale=b[:, None, None])
In [50]: c.shape
Out[50]: (4, 3, 5)
In [51]: c
Out[51]:
array([[[0.69146246, 0.84134475, 0.9331928 , 0.97724987, 0.9986501 ],
[0.30853754, 0.5 , 0.69146246, 0.84134475, 0.97724987],
[0.0668072 , 0.15865525, 0.30853754, 0.5 , 0.84134475]],
[[0.59870633, 0.69146246, 0.77337265, 0.84134475, 0.9331928 ],
[0.40129367, 0.5 , 0.59870633, 0.69146246, 0.84134475],
[0.22662735, 0.30853754, 0.40129367, 0.5 , 0.69146246]],
[[0.56618383, 0.63055866, 0.69146246, 0.74750746, 0.84134475],
[0.43381617, 0.5 , 0.56618383, 0.63055866, 0.74750746],
[0.30853754, 0.36944134, 0.43381617, 0.5 , 0.63055866]],
[[0.54973822, 0.59870633, 0.64616977, 0.69146246, 0.77337265],
[0.45026178, 0.5 , 0.54973822, 0.59870633, 0.69146246],
[0.35383023, 0.40129367, 0.45026178, 0.5 , 0.59870633]]])
It also works to use the cdf() method of the "frozen" distribution norm(loc=a[:, None], scale=b[:, None, None]) like you did in the question:
In [52]: c = norm(loc=a[:, None], scale=b[:, None, None]).cdf(x)
In [53]: c.shape
Out[53]: (4, 3, 5)
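As a rough cross-check of the broadcasting described above, the sketch below rebuilds the same (4, 3, 5) result without SciPy. It assumes the standard identity CDF(x) = 0.5 * (1 + erf((x - loc) / (scale * sqrt(2)))) for the normal distribution; the names z, phi, c and ref are illustrative only and do not come from the answer.

```python
import math
import numpy as np

a = np.array([-1, 0, 1])               # locations (means)
b = np.array([1, 2, 3, 4])             # scales (standard deviations)
x = np.array([-.5, 0, .5, 1, 2])       # evaluation points

# Shapes (5,), (3, 1) and (4, 1, 1) broadcast to (4, 3, 5)
z = (x - a[:, None]) / b[:, None, None]

# Standard normal CDF via the error function, applied elementwise
phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
c = phi(z)

# Reference values computed with explicit loops, for comparison
ref = np.empty((len(b), len(a), len(x)))
for i, scale in enumerate(b):
    for j, loc in enumerate(a):
        for k, xk in enumerate(x):
            ref[i, j, k] = 0.5 * (1.0 + math.erf((xk - loc) / (scale * math.sqrt(2.0))))

print(c.shape)
print(np.allclose(c, ref))
```

The first axis follows b (scale), the second follows a (loc) and the last follows x, which matches the layout of the array printed above.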

Related

Selective meshgrid in Tensorflow

Given the following code:
import tensorflow as tf
def combine(x, y):
    xx, yy = tf.meshgrid(x, y, indexing='ij')
    combo = tf.stack([tf.reshape(xx, [-1]), tf.reshape(yy, [-1])], axis=1)
    print(combo)
x = tf.constant([11, 0, 7, 1])
combine(x, x)
I want to filter the combo tensor so that it contains only the following pairs: [(11, 0), (11, 7), (11, 1), (0, 7), (0, 1), (7, 1)]. Is it possible to do this in TensorFlow?
You can introduce a mask to get the desired result:
def combine(x, y):
    xx, yy = tf.meshgrid(x, y, indexing='ij')
    # create a mask that keeps only the strictly upper triangular part of the grid
    ones = tf.ones_like(xx)
    mask = tf.cast(tf.linalg.band_part(ones, 0, -1) - tf.linalg.band_part(ones, 0, 0), dtype=tf.bool)
    x = tf.boolean_mask(xx, mask)
    y = tf.boolean_mask(yy, mask)
    combo = tf.stack([x, y], axis=1)
    print(combo)
    # return the result so that the assignment below receives it
    return combo
x = tf.constant([11, 0, 7, 1])
a = combine(x, x)
# output
[[11 0]
[11 7]
[11 1]
[ 0 7]
[ 0 1]
[ 7 1]]
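The same masking idea can be sketched in NumPy, which makes it easy to inspect the intermediate arrays. This is only an analogue of the TensorFlow code, not a replacement for it: np.triu with k=1 plays the role of the two band_part calls, and the variable names are illustrative.

```python
import numpy as np

x = np.array([11, 0, 7, 1])

# With indexing='ij', xx[i, j] == x[i] and yy[i, j] == x[j]
xx, yy = np.meshgrid(x, x, indexing='ij')

# Strictly upper triangular mask: True where i < j
mask = np.triu(np.ones_like(xx, dtype=bool), k=1)

# Boolean indexing walks the mask in row-major order
combo = np.stack([xx[mask], yy[mask]], axis=1)
print(combo)
```

Each unordered pair appears exactly once because the mask excludes the diagonal and everything below it.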

In numpy, multiply two structured matrices concisely

I have two matrices. The first has the following structure:
[[1, 0, a],
[0, 1, b],
[1, 0, c],
[0, 1, d]]
where 1, 0, a, b, c, and d are scalars. The matrix is 4 by 3.
The second is just a 2 by 3 matrix:
[[r1],
[r2]]
where r1 and r2 are the first and second rows respectively, each having 3 elements.
I would like the output to be:
[[r1, 0, a*r1],
[0, r1, b*r1],
[r2, 0, c*r2],
[0, r2, d*r2]]
which would be a 4 by 9 matrix.
This is similar to the Kronecker product, except separately for each row of the second matrix. Of course this could be done with cumbersome loops which I want to avoid.
How can I do this concisely?
You can do exactly what you said in the last line: compute a separate Kronecker product for each row of the second matrix and then concatenate the results.
Let's assume that the two matrices are called x (4 by 3) and y (2 by 3). The first thing to do is to split x into two parts, because only half of the matrix participates in each part of the product.
x = x.reshape(2, 2, 3)
Then you can calculate the two products separately:
z0 = np.kron(x[0], y[0])
z1 = np.kron(x[1], y[1])
Finally, concatenate the two results along the first axis:
z = np.concatenate([z0, z1], axis=0)
Or if, like me, you enjoy big ugly one-liners you can do:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(2, 2, 3), y)], axis=0)
In the general case you mentioned in the comments, where x has n rows, it would become:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(n // 2, 2, 3), y)], axis=0)
This gives the same results as the explicit loop below, which I believe can be compiled with numba.jit:
def solve_explicit(x, y):
    # sanity checks
    assert x.shape[0] == 2*y.shape[0]
    assert x.shape[1] == y.shape[1]
    n = x.shape[0]
    z = np.zeros((n, 9))
    for i in range(n):
        for j in range(3):
            for k in range(3):
                # integer division pairs rows 2m and 2m+1 of x with row m of y
                z[i, k + 3 * j] = x[i, j] * y[i // 2, k]
    return z
Using broadcasting, with x.shape (n, 3), and y.shape (n//2, 3):
out = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
I personally would use np.einsum in this situation because I think it's easier to understand than broadcasting.
import numpy as np
(a, b, c, d) = np.random.rand(4)
x = np.array([[1, 0, a], [0, 1, b], [1, 0, c], [0, 1, d]])
y = np.random.rand(2, 3)
z = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# timeit magic commands.
# %timeit -n 50000 np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# %timeit -n 50000 (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
Some good references on Einstein summation in NumPy: [2, 3, 4].
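To check that the approaches above agree, here is a small sketch that runs the per-row Kronecker product, the broadcasting expression, the einsum expression and the explicit loop on the same random input. The size n = 6 and the seeded generator are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                  # rows of x; must be even
x = rng.random((n, 3))
y = rng.random((n // 2, 3))

# 1) One Kronecker product per pair of x rows, then concatenate
z_kron = np.concatenate(
    [np.kron(xr, yr) for xr, yr in zip(x.reshape(n // 2, 2, 3), y)], axis=0)

# 2) Broadcasting: (n/2, 2, 3, 1) * (n/2, 1, 1, 3) -> (n/2, 2, 3, 3)
z_bcast = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)

# 3) einsum: outer product of each 6-element x block with its y row
z_einsum = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)

# 4) Explicit loops as the reference
z_loop = np.zeros((n, 9))
for i in range(n):
    for j in range(3):
        for k in range(3):
            z_loop[i, k + 3 * j] = x[i, j] * y[i // 2, k]

print(np.allclose(z_kron, z_loop),
      np.allclose(z_bcast, z_loop),
      np.allclose(z_einsum, z_loop))
```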

advanced indexing using numpy

I'm trying to use advanced indexing but I cannot get it to work with this simple array
arr = np.array([[[ 1, 10, 100,1000],[ 2, 20, 200,2000]],[[ 3, 30, 300,3000],[ 4,40,400,4000]],[[5, 50, 500,5000],[6, 60,600,6000]]])
d1=np.array([0])
d2=np.array([0,1])
d3=np.array([0,1,2])
arr[d1,d2,d3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,) (2,) (3,)
and
arr[d1[:,np.newaxis],d2[np.newaxis,:],d3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,1) (1,2) (3,)
Expected output:
array([[[ 1, 10, 100],
[ 2, 20, 200]]])
You can use np.ix_ to combine several one-dimensional index arrays of different lengths to index a multidimensional array. For example:
arr[np.ix_(d1,d2,d3)]
To add more context, np.ix_ returns a tuple of N-dimensional arrays, one per input, each with length 1 along every axis except its own. The same can be achieved "by hand" by adding np.newaxis for the appropriate dimensions:
xs, ys, zs = np.ix_(d1,d2,d3)
# xs.shape == (1, 1, 1) == (len(d1), 1, 1 )
# ys.shape == (1, 2, 1) == (1, len(d2), 1 )
# zs.shape == (1, 1, 3) == (1, 1, len(d3))
result_ix = arr[xs, ys, zs]
# using newaxis:
result_newaxis = arr[
d1[:, np.newaxis, np.newaxis],
d2[np.newaxis, :, np.newaxis],
d3[np.newaxis, np.newaxis, :],
]
assert (result_ix == result_newaxis).all()
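The sketch below illustrates why the original attempt failed and why the np.ix_ version works, by comparing the index shapes directly. It assumes NumPy 1.20 or later, where np.broadcast_shapes is available; raw_ok, open_shapes and result_shape are names made up for this example.

```python
import numpy as np

arr = np.array([[[1, 10, 100, 1000], [2, 20, 200, 2000]],
                [[3, 30, 300, 3000], [4, 40, 400, 4000]],
                [[5, 50, 500, 5000], [6, 60, 600, 6000]]])
d1 = np.array([0])
d2 = np.array([0, 1])
d3 = np.array([0, 1, 2])

# The raw 1-D index shapes (1,), (2,) and (3,) cannot be broadcast together
try:
    np.broadcast_shapes(d1.shape, d2.shape, d3.shape)
    raw_ok = True
except ValueError:
    raw_ok = False

# np.ix_ gives each index its own axis, so the shapes become compatible
open_shapes = [ix.shape for ix in np.ix_(d1, d2, d3)]
result_shape = np.broadcast_shapes(*open_shapes)

result = arr[np.ix_(d1, d2, d3)]
print(raw_ok, open_shapes, result_shape)
print(result)
```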
You need only d1 to select the first cell, plus a slice to keep the first three columns:
>>> arr[d1, :, :3]
array([[[  1,  10, 100],
        [  2,  20, 200]]])

N-D indexing with defaults in NumPy

Can I index a NumPy N-D array with a fallback to a default value for out-of-bounds indices? Example code below for some imaginary np.get_with_default(a, indexes, default):
import numpy as np
print(np.get_with_default(
np.array([[1,2,3],[4,5,6]]), # N-D array
[(np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5]))], # N-tuple of indexes along each axis
13, # Default for out-of-bounds fallback
))
should print
[2 3 6 13 13 13]
I'm looking for a built-in function for this. If none exists, then at least a short and efficient implementation.
I arrived at this question because I was looking for exactly the same thing. I came up with the following function, which does what you ask for 2 dimensions. It could likely be generalised to N dimensions.
def get_with_defaults(a, xx, yy, nodata):
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # compute a mask for both x and y, where all invalid index values are set to true
    myy = np.ma.masked_outside(yy, 0, a.shape[0] - 1).mask
    mxx = np.ma.masked_outside(xx, 0, a.shape[1] - 1).mask
    # replace all values in res with NODATA, where either the x or y index are invalid
    np.choose(myy + mxx, [res, nodata], out=res)
    return res
xx and yy are the index arrays; a is indexed as a[y, x].
This gives:
>>> a=np.zeros((3,2),dtype=int)
>>> get_with_defaults(a, (-1, 1000, 0, 1, 2), (0, -1, 0, 1, 2), -1)
array([-1, -1, 0, 0, -1])
As an alternative, the following implementation achieves the same and is more concise:
def get_with_default(a, xx, yy, nodata):
    # convert the indices to arrays so the comparisons below also work for tuples and lists
    xx, yy = np.asarray(xx), np.asarray(yy)
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # replace all values in res with NODATA (gets broadcasted to the result array), where
    # either the x or y index are invalid
    res[(yy < 0) | (yy >= a.shape[0]) | (xx < 0) | (xx >= a.shape[1])] = nodata
    return res
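For reference, here is a self-contained sketch of the clip-then-overwrite approach applied to the example from the question. The function name get_with_default_2d and the keyword names rows and cols are hypothetical, and the function converts its index arguments with np.asarray so that plain lists are accepted. Note the argument order: xx holds column indices and yy holds row indices.

```python
import numpy as np

def get_with_default_2d(a, xx, yy, nodata):
    # Accept lists or tuples as well as arrays
    xx, yy = np.asarray(xx), np.asarray(yy)
    # Clip to valid ranges so the lookup itself never goes out of bounds
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # Overwrite every position whose original index was out of bounds
    res[(yy < 0) | (yy >= a.shape[0]) | (xx < 0) | (xx >= a.shape[1])] = nodata
    return res

a = np.array([[1, 2, 3], [4, 5, 6]])
rows = [0, 0, 1, 1, 2, 2]
cols = [1, 2, 2, 3, 3, 5]
out = get_with_default_2d(a, xx=cols, yy=rows, nodata=13)
print(out)
```

Because advanced indexing returns a copy, overwriting entries of res does not modify a.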
I don't know if there is anything in NumPy to do that directly, but you can always implement it yourself. This is not particularly smart or efficient, as it requires multiple advanced indexing operations, but does what you need:
import numpy as np
def get_with_default(a, indices, default=0):
    # Ensure inputs are arrays
    a = np.asarray(a)
    indices = tuple(np.broadcast_arrays(*indices))
    if len(indices) <= 0 or len(indices) > a.ndim:
        raise ValueError('invalid number of indices.')
    # Make mask of indices out of bounds
    # (the builtin bool replaces np.bool, which was removed in NumPy 1.24)
    mask = np.zeros(indices[0].shape, bool)
    for ind, s in zip(indices, a.shape):
        mask |= (ind < 0) | (ind >= s)
    # Only do masking if necessary
    n_mask = np.count_nonzero(mask)
    # Shortcut for the case where all is masked:
    # the result has the shape of the indices plus any unindexed trailing axes of a
    if n_mask == mask.size:
        return np.full(mask.shape + a.shape[len(indices):], default, dtype=a.dtype)
    if n_mask > 0:
        # Copy the index arrays so the caller's arrays are not modified
        indices = tuple(np.array(ind) for ind in indices)
        for ind in indices:
            # Replace masked indices with zeros
            ind[mask] = 0
    # Get values
    res = a[indices]
    if n_mask > 0:
        # Replace values of masked indices with default value
        res[mask] = default
    return res
# Test
print(get_with_default(
    np.array([[1,2,3],[4,5,6]]),
    (np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5])),
    13
))
# [ 2  3  6 13 13 13]
I also needed a solution to this, but I wanted a solution that worked in N dimensions. I made Markus' solution work for N-dimensions, including selecting from an array with more dimensions than the coordinates point to.
import numpy as np
def get_with_defaults(arr, coords, nodata):
    coords, shp = np.array(coords), np.array(arr.shape)
    # Get values from arr, clipping to valid ranges
    res = arr[tuple(np.clip(c, 0, s-1) for c, s in zip(coords, shp))]
    # Set any output where one of the coords was out of range to nodata
    res[np.any(~((0 <= coords) & (coords < shp[:len(coords), None])), axis=0)] = nodata
    return res
if __name__ == '__main__':
    A = np.array([[1,2,3],[4,5,6]])
    B = np.array([[[1, -9],[2, -8],[3, -7]],[[4, -6],[5, -5],[6, -4]]])
    coords1 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5]]
    coords2 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5], [1, 1, 1, 1, 1, 1]]
    out1 = get_with_defaults(A, coords1, 13)
    out2 = get_with_defaults(B, coords1, 13)
    out3 = get_with_defaults(B, coords2, 13)
    print(out1)
    # [ 2  3  6 13 13 13]
    print(out2)
    # [[ 2 -8]
    #  [ 3 -7]
    #  [ 6 -4]
    #  [13 13]
    #  [13 13]
    #  [13 13]]
    print(out3)
    # [-8 -7 -4 13 13 13]

Split output of a layer in keras

Say, I have a layer with output dims (4, x, y). I want to split this into 4 separate (1, x, y) tensors, which I can use as input for 4 other layers.
What I'm essentially looking for is the opposite of the Merge layer. I know that there's no split layer in keras, but is there a simple way to do this in keras?
Are you looking for something like this?
import keras.backend as K
import numpy as np
val = np.random.random((4, 2, 3))
t = K.variable(value=val)
t1 = t[0, :, :]
t2 = t[1, :, :]
t3 = t[2, :, :]
t4 = t[3, :, :]
print('t1:\n', K.eval(t1))
print('t2:\n', K.eval(t2))
print('t3:\n', K.eval(t3))
print('t4:\n', K.eval(t4))
print('t:\n', K.eval(t))
It gives the following output:
t1:
[[ 0.18787734 0.1085723 0.01127671]
[ 0.06032621 0.14528386 0.21176969]]
t2:
[[ 0.34292713 0.56848335 0.83797884]
[ 0.11579451 0.21607392 0.80680907]]
t3:
[[ 0.1908586 0.48186591 0.23439431]
[ 0.93413448 0.535191 0.16410089]]
t4:
[[ 0.54303145 0.78971165 0.9961108 ]
[ 0.87826216 0.49061012 0.42450914]]
t:
[[[ 0.18787734 0.1085723 0.01127671]
[ 0.06032621 0.14528386 0.21176969]]
[[ 0.34292713 0.56848335 0.83797884]
[ 0.11579451 0.21607392 0.80680907]]
[[ 0.1908586 0.48186591 0.23439431]
[ 0.93413448 0.535191 0.16410089]]
[[ 0.54303145 0.78971165 0.9961108 ]
[ 0.87826216 0.49061012 0.42450914]]]
Note that t1, t2, t3 and t4 now each have shape (2, 3).
print(t1.shape.eval()) # prints [2 3]
So, if you want to keep the 3d shape, you need to do the following:
t1 = t[0, :, :].reshape((1, 2, 3))
t2 = t[1, :, :].reshape((1, 2, 3))
t3 = t[2, :, :].reshape((1, 2, 3))
t4 = t[3, :, :].reshape((1, 2, 3))
Now you get the split tensors with the correct dimensions.
print(t1.shape.eval()) # prints [1 2 3]
Hope that it will help you to solve your problem.
You can define Lambda layers to do the slicing for you:
from keras.layers import Lambda
from keras.backend import slice
.
.
x = Lambda( lambda x: slice(x, START, SIZE))(x)
For your specific example, try:
x1 = Lambda( lambda x: slice(x, (0, 0, 0), (1, -1, -1)))(x)
x2 = Lambda( lambda x: slice(x, (1, 0, 0), (1, -1, -1)))(x)
x3 = Lambda( lambda x: slice(x, (2, 0, 0), (1, -1, -1)))(x)
x4 = Lambda( lambda x: slice(x, (3, 0, 0), (1, -1, -1)))(x)
You can just simply use tf.split.
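tf.split takes the tensor, the number (or sizes) of the pieces, and the axis to split along. The sketch below shows the same idea with np.split, purely as a NumPy analogue that runs without TensorFlow; the variable names are illustrative, and wrapping tf.split in a Lambda layer inside a Keras model is an assumption on my part, not something the answer spells out.

```python
import numpy as np

val = np.random.random((4, 2, 3))

# Split into 4 equal pieces along axis 0; each piece keeps its leading axis
parts = np.split(val, 4, axis=0)

for p in parts:
    print(p.shape)

# Concatenating the pieces along the same axis restores the original array
restored = np.concatenate(parts, axis=0)
print(np.array_equal(restored, val))
```

Unlike indexing with t[0, :, :], splitting this way keeps the 3-D shape (1, 2, 3), so no extra reshape is needed.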
