I am trying to configure a PyMC3 Polynomial kernel with the following hyperpriors:
with pm.Model() as self.model:
    EPSILON = 0.1
    l = pm.Gamma("l", alpha=2, beta=1)
    offset = pm.Gamma("offset", alpha=2, beta=1)
    nu = pm.HalfCauchy("nu", beta=1)
    d = pm.HalfNormal("d", sd=5)
    cov = nu ** 2 * pm.gp.cov.Polynomial(X.shape[1], l, d, offset)
    self.gp = pm.gp.Marginal(cov_func=cov)
    sigma = pm.HalfCauchy("sigma", beta=1)
    y_ = self.gp.marginal_likelihood("y", X=X, y=Y, noise=sigma)
    self.map_trace = [pm.find_MAP()]
However, I'm getting a Cholesky decomposition failure, as follows:
LinAlgError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
902 outputs =\
--> 903 self.fn() if output_subset is None else\
904 self.fn(output_subset=output_subset)
LinAlgError: 7-th leading minor of the array is not positive definite
During handling of the above exception, another exception occurred:
LinAlgError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/scipy/linalg/decomp_cholesky.py in _cholesky(a, lower, overwrite_a, clean, check_finite)
38 if info > 0:
39 raise LinAlgError("%d-th leading minor of the array is not positive "
---> 40 "definite" % info)
41 if info < 0:
42 raise ValueError('LAPACK reported an illegal value in {}-th argument'
LinAlgError: 7-th leading minor of the array is not positive definite
Apply node that caused the error: Cholesky{lower=True, destructive=False, on_error='raise'}(Elemwise{Composite{((sqr(i0) * i1) + i2 + i3)}}[(0, 0)].0)
Toposort index: 11
Inputs types: [TensorType(float64, matrix)]
Inputs shapes: [(40, 40)]
Inputs strides: [(320, 8)]
Inputs values: ['not shown']
Outputs clients: [[Solve{A_structure='lower_triangular', lower=False, overwrite_A=False, overwrite_b=False}(Cholesky{lower=True, destructive=False, on_error='raise'}.0, TensorConstant{[ 69.79 .. 472.83]}), Solve{A_structure='lower_triangular', lower=False, overwrite_A=False, overwrite_b=False}(Cholesky{lower=True, destructive=False, on_error='raise'}.0, Elemwise{Composite{(sqr(i0) * i1)}}[(0, 0)].0)]]
Changing the hyperpriors seems to change the error: instead of the 7th leading minor it will report some other x-th leading minor. But I'm not sure whether this is caused by the hyperpriors or by something else.
Any thoughts are welcome :)
Thanks
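For reference, one common workaround for kernel matrices that are only marginally positive definite is to add a small jitter to the diagonal. Below is a hedged sketch of how that could slot into the model above, reusing the otherwise unused EPSILON constant; it is a numerical band-aid under that assumption, not a confirmed fix for this particular model:

    # Sketch: stabilise the covariance by adding white noise (jitter) on
    # the diagonal; the right magnitude for EPSILON is problem-dependent.
    cov = nu ** 2 * pm.gp.cov.Polynomial(X.shape[1], l, d, offset)
    cov += pm.gp.cov.WhiteNoise(EPSILON)
    self.gp = pm.gp.Marginal(cov_func=cov)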
Related
So I'm working on a problem where I simply need to use SciPy to perform linear regression and get the weights plus the statistics on those weights, but I'm getting the error
"ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 11 and the array at index 1 has size 100"
The code is simply:
from scipy import stats
x = x_copy
y = y_copy
stats.linregress(x, y)
where x is a pandas DataFrame and y is a NumPy array.
When doing x.shape and y.shape I get that x is (100, 11) and y is (100,). Running the exact same matrices in np.linalg.lstsq and sklearn.linear_model.LinearRegression both work fine and output the weights, but as far as I'm aware I need SciPy to get the statistics on the weights themselves.
I've also checked x.dtypes and all variables are float64, and y.dtype also returns float64. I've also tried replacing x in the regression call with x.to_numpy(), in case something was off with the headers/index, but got the same error.
Any suggestions?
Edit:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_28012/197024375.py in <module>
4 y = y_copy
5
----> 6 stats.linregress(x.values, y)
7
8 x.values.shape
~\anaconda3\lib\site-packages\scipy\stats\_stats_mstats_common.py in linregress(x, y, alternative)
153 # ssxm = mean( (x-mean(x))^2 )
154 # ssxym = mean( (x-mean(x)) * (y-mean(y)) )
--> 155 ssxm, ssxym, _, ssym = np.cov(x, y, bias=1).flat
156
157 # R-value
<__array_function__ internals> in cov(*args, **kwargs)
~\anaconda3\lib\site-packages\numpy\lib\function_base.py in cov(m, y, rowvar, bias, ddof, fweights, aweights, dtype)
2426 if not rowvar and y.shape[0] != 1:
2427 y = y.T
-> 2428 X = np.concatenate((X, y), axis=0)
2429
2430 if ddof is None:
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 11 and the array at index 1 has size 100
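For what it's worth, scipy.stats.linregress implements only simple (single-predictor) regression, which is why the (100, 11) design matrix breaks the internal np.cov concatenation. Below is a sketch of one way to get per-weight statistics for a multiple regression, using statsmodels rather than SciPy; the random arrays are hypothetical stand-ins for x_copy and y_copy:

    import numpy as np
    import statsmodels.api as sm

    x = np.random.random((100, 11))  # stand-in for x_copy
    y = np.random.random(100)        # stand-in for y_copy

    # OLS reports the weights together with standard errors,
    # t-statistics and p-values for each coefficient.
    model = sm.OLS(y, sm.add_constant(x)).fit()
    print(model.summary())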
I'm confused by the JAX documentation; here's what I'm trying to do:
from jax import grad

def line(m, x, b):
    return m * x + b

grad(line)(1, 2, 3)
And the error:
---------------------------------------------------------------------------
FilteredStackTrace Traceback (most recent call last)
<ipython-input-48-d14b17620b30> in <module>()
3
----> 4 grad(line)(1,2,3)
FilteredStackTrace: TypeError: grad requires real- or complex-valued inputs (input dtype that is a sub-dtype of np.floating or np.complexfloating), but got int32. If you want to use integer-valued inputs, use vjp or set allow_int to True.
The stack trace above excludes JAX-internal frames.
The following is the original exception that occurred, unmodified.
--------------------
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/jax/api.py in _check_input_dtype_revderiv(name, holomorphic, allow_int, x)
844 elif not allow_int and not (dtypes.issubdtype(aval.dtype, np.floating) or
845 dtypes.issubdtype(aval.dtype, np.complexfloating)):
--> 846 raise TypeError(f"{name} requires real- or complex-valued inputs (input dtype that "
847 "is a sub-dtype of np.floating or np.complexfloating), "
848 f"but got {aval.dtype.name}. If you want to use integer-valued "
TypeError: grad requires real- or complex-valued inputs (input dtype that is a sub-dtype of np.floating or np.complexfloating), but got int32. If you want to use integer-valued inputs, use vjp or set allow_int to True.
I'm referencing the official tutorial code:
import jax.numpy as jnp
from jax import grad, jit, vmap
from jax import random

key = random.PRNGKey(0)

def sigmoid(x):
    return 0.5 * (jnp.tanh(x / 2) + 1)

# Outputs probability of a label being true.
def predict(W, b, inputs):
    return sigmoid(jnp.dot(inputs, W) + b)

# Build a toy dataset.
inputs = jnp.array([[0.52, 1.12, 0.77],
                    [0.88, -1.08, 0.15],
                    [0.52, 0.06, -1.30],
                    [0.74, -2.49, 1.39]])
targets = jnp.array([True, True, False, True])

# Training loss is the negative log-likelihood of the training examples.
def loss(W, b):
    preds = predict(W, b, inputs)
    label_probs = preds * targets + (1 - preds) * (1 - targets)
    return -jnp.sum(jnp.log(label_probs))

# Initialize random model coefficients
key, W_key, b_key = random.split(key, 3)
W = random.normal(W_key, (3,))
b = random.normal(b_key, ())

W_grad = grad(loss, argnums=0)(W, b)
print('W_grad', W_grad)
And the result:
W_grad [-0.16965576 -0.8774648 -1.4901345 ]
What am I doing wrong here? I gather key is being used in some important way, but I can't figure out why or how it's necessary. To answer this question, please adjust the code in the first block as necessary to remove the error.
JAX is telling you it doesn't like integers. grad(line)(1., 2., 3.) (using floats) fixes the problem.
I think the error here is clear:
TypeError: grad requires real- or complex-valued inputs (input dtype that is a sub-dtype of np.floating or np.complexfloating), but got int32. If you want to use integer-valued inputs, use vjp or set allow_int to True.
To use grad(line)(1, 2, 3) with int32 inputs, change it to grad(line, allow_int=True)(1, 2, 3).
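Putting both suggestions together, a minimal runnable sketch of the float-input route (JAX documents that gradients taken with allow_int=True for integer inputs come back with the trivial dtype float0, so casting to floats is usually the more useful fix):

    from jax import grad

    def line(m, x, b):
        return m * x + b

    # Float inputs satisfy grad's dtype check; d(m*x + b)/dm at x=2 is 2.
    print(grad(line)(1., 2., 3.))                     # 2.0

    # Partials with respect to all three arguments via argnums:
    print(grad(line, argnums=(0, 1, 2))(1., 2., 3.))  # (2.0, 1.0, 1.0)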
I am trying to increase the brightness of a grayscale image. To do that I want to create a spline, but when I try to use scipy.interpolate.UnivariateSpline, it raises an error.
Traceback:
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-38-289216dd01e1> in <module>
8 x=[0,128,255]
9 y=[0,190,255]
---> 10 myLUT=spline_to_lookup_table(x,y)
<ipython-input-38-289216dd01e1> in spline_to_lookup_table(spline_breaks, break_values)
1 def spline_to_lookup_table(spline_breaks: list, break_values: list):
----> 2 spl = UnivariateSpline(spline_breaks, break_values)
4 return spl(range(256))
~/anaconda3/envs/computer-vision/lib/python3.8/site-packages/scipy/interpolate/fitpack2.py in __init__(self, x, y, w, bbox, k, s, ext, check_finite)
200
201 # _data == x,y,w,xb,xe,k,s,n,t,c,fp,fpint,nrdata,ier
--> 202 data = dfitpack.fpcurf0(x, y, k, w=w, xb=bbox[0],
203 xe=bbox[1], s=s)
204 if data[-1] == 1:
error: (m>k) failed for hidden m: fpcurf0:m=3
Source code:
import numpy as np
import cv2
from scipy.interpolate import UnivariateSpline

def spline_to_lookup_table(spline_breaks: list, break_values: list):
    spl = UnivariateSpline(spline_breaks, break_values)
    return spl(range(256))

x = [0, 128, 255]
y = [0, 190, 255]
myLUT = spline_to_lookup_table(x, y)
img_curved = cv2.LUT(img_gray, myLUT).astype(np.uint8)
Indeed, you cannot fit a cubic spline through only three points: even a single cubic polynomial has four parameters.
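A minimal sketch of the corresponding fix: either supply more points, or lower the spline degree so that the m > k requirement holds, e.g. a quadratic (k=2) for the three-point lookup curve:

    from scipy.interpolate import UnivariateSpline

    x = [0, 128, 255]
    y = [0, 190, 255]

    # With m=3 points the default cubic (k=3) violates m > k;
    # a quadratic spline (k=2) can be fit through three knots.
    spl = UnivariateSpline(x, y, k=2)
    myLUT = spl(range(256))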
I am trying to change the very simplest getting-started example of PyMC3 (https://docs.pymc.io/notebooks/getting_started.html), the motivating example of linear regression, into fitting a stretched exponential.
The simplest version of the model I tried is y = exp(-x**beta)
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-darkgrid')
# Initialize random number generator
np.random.seed(1234)
# True parameter values
sigma = .1
beta = 1
# Size of dataset
size = 1000
# Predictor variable
X1 = np.random.randn(size)
# Simulate outcome variable
Y = np.exp(-X1**beta) + np.random.randn(size)*sigma
# specify the model
import pymc3 as pm
import theano.tensor as tt
print('Running on PyMC3 v{}'.format(pm.__version__))
basic_model = pm.Model()
with basic_model:
    # Priors for unknown model parameters
    beta = pm.HalfNormal('beta', sigma=1)
    sigma = pm.HalfNormal('sigma', sigma=1)
    # Expected value of outcome
    mu = pm.math.exp(-X1**beta)
    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=Y)

with basic_model:
    # draw 500 posterior samples
    trace = pm.sample(500)
which yields the output
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, beta]
Sampling 4 chains: 0%| | 0/4000 [00:00<?, ?draws/s]/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
Bad initial energy, check any log probabilities that are inf or -inf, nan or very small:
Y_obs NaN
---------------------------------------------------------------------------
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 160, in _start_loop
point, stats = self._compute_point()
File "/opt/conda/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 191, in _compute_point
point, stats = self._step_method.step(self._point)
File "/opt/conda/lib/python3.7/site-packages/pymc3/step_methods/arraystep.py", line 247, in step
apoint, stats = self.astep(array)
File "/opt/conda/lib/python3.7/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 144, in astep
raise SamplingError("Bad initial energy")
pymc3.exceptions.SamplingError: Bad initial energy
"""
The above exception was the direct cause of the following exception:
SamplingError Traceback (most recent call last)
SamplingError: Bad initial energy
The above exception was the direct cause of the following exception:
ParallelSamplingError Traceback (most recent call last)
<ipython-input-310-782c941fbda8> in <module>
1 with basic_model:
2 # draw 500 posterior samples
----> 3 trace = pm.sample(500)
/opt/conda/lib/python3.7/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, **kwargs)
435 _print_step_hierarchy(step)
436 try:
--> 437 trace = _mp_sample(**sample_args)
438 except pickle.PickleError:
439 _log.warning("Could not pickle model, sampling singlethreaded.")
/opt/conda/lib/python3.7/site-packages/pymc3/sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, **kwargs)
967 try:
968 with sampler:
--> 969 for draw in sampler:
970 trace = traces[draw.chain - chain]
971 if (trace.supports_sampler_stats
/opt/conda/lib/python3.7/site-packages/pymc3/parallel_sampling.py in __iter__(self)
391
392 while self._active:
--> 393 draw = ProcessAdapter.recv_draw(self._active)
394 proc, is_last, draw, tuning, stats, warns = draw
395 if self._progress is not None:
/opt/conda/lib/python3.7/site-packages/pymc3/parallel_sampling.py in recv_draw(processes, timeout)
295 else:
296 error = RuntimeError("Chain %s failed." % proc.chain)
--> 297 raise error from old_error
298 elif msg[0] == "writing_done":
299 proc._readable = True
ParallelSamplingError: Bad initial energy
INFO (theano.gof.compilelock): Waiting for existing lock by process '30255' (I am process '30252')
INFO (theano.gof.compilelock): To manually release the lock, delete /home/jovyan/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-buster-sid-x86_64-3.7.3-64/lock_dir
/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
Besides the stretched exponential, I have also tried power laws and sine functions. It seems to me that the problem arises as soon as my model is not injective. Could this be the issue (as is probably apparent, I am a newbie in this field)? Can I restrict sampling to only positive x values? Are there any tricks for this?
So the problem here is that
X1**beta
is only defined when X1 >= 0 or when beta is an integer. When this is fed into your observations, beta will be a float at most sampled points, and so many entries of
mu = pm.math.exp(-X1**beta)
will be nan.
I found this out with
>>> basic_model.check_test_point()
beta_log__ -0.77
sigma_log__ -0.77
Y_obs NaN
Name: Log-probability of test_point, dtype: float64
I am not sure what model you are trying to specify! There are ways to require beta to be an integer, and ways to require that X1 be positive, but I would need more details to help you describe the model.
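As one concrete possibility, here is a sketch under the assumption that the stretched exponential is only meant for non-negative predictors: simulating with x >= 0 keeps X1**beta real for any float beta, so the likelihood stays finite.

    import numpy as np
    import pymc3 as pm

    np.random.seed(1234)
    size = 1000
    sigma_true, beta_true = 0.1, 1.0

    # Restrict the predictor to x >= 0 so x**beta is defined for float beta.
    X1 = np.abs(np.random.randn(size))
    Y = np.exp(-X1**beta_true) + np.random.randn(size) * sigma_true

    with pm.Model() as basic_model:
        beta = pm.HalfNormal('beta', sigma=1)
        sigma = pm.HalfNormal('sigma', sigma=1)
        mu = pm.math.exp(-X1**beta)  # well-defined: X1 is non-negative
        Y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=Y)
        trace = pm.sample(500)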
I am not sure why stats.multivariate_normal.pdf is not working.
At the moment I have
from scipy import stats
stats.multivariate_normal.pdf(X, meanX, covX)
where
X.shape = (150, 2)
meanX.shape = () # just a float
covX.shape = (150,)
The error I get is: "total size of new array must be unchanged"
Now I tried to follow the answer:
meanL = np.float(np.mean(xL))
covL = np.cov(xL)
stats.multivariate_normal.pdf(xL.T, np.full((150,), meanL), covL)
I get the following error:
LinAlgError Traceback (most recent call last)
<ipython-input-77-4c0280512087> in <module>()
2 meanL = np.full((150,), meanL)
3 covL = np.cov(xL)
----> 4 stats.multivariate_normal.pdf(xL.T, meanL, covL)
5
/Users/laura/anaconda/lib/python3.5/site-packages/scipy/stats/_multivariate.py in pdf(self, x, mean, cov, allow_singular)
497 dim, mean, cov = self._process_parameters(None, mean, cov)
498 x = self._process_quantiles(x, dim)
--> 499 psd = _PSD(cov, allow_singular=allow_singular)
500 out = np.exp(self._logpdf(x, mean, psd.U, psd.log_pdet, psd.rank))
501 return _squeeze_output(out)
/Users/laura/anaconda/lib/python3.5/site-packages/scipy/stats/_multivariate.py in __init__(self, M, cond, rcond, lower, check_finite, allow_singular)
148 d = s[s > eps]
149 if len(d) < len(s) and not allow_singular:
--> 150 raise np.linalg.LinAlgError('singular matrix')
151 s_pinv = _pinv_1d(s, eps)
152 U = np.multiply(u, np.sqrt(s_pinv))
LinAlgError: singular matrix
I can't reproduce the exact error you're getting, but the dimensions have to match:
mean and covariance need to have shapes (N,) and (N, N), and X must have width N. Some, but not all, of these requirements may be relaxed by broadcasting. Anyway, the following works for me:
>>> X = np.random.random((150,2))
>>> meanX = 0.5
>>> covX = np.identity(150)
>>> print(stats.multivariate_normal.pdf(X.T, np.full((150,), meanX), covX))
[4.43555177e-63 2.84151145e-63]
Update: From the updated question I suspect you want
>>> X = np.random.random((150,2))
>>>
>>> meanX = np.mean(X, axis=0)
>>> covX = np.cov(X.T)
>>> stats.multivariate_normal.pdf(X, meanX, covX)
array([0.83292328, 0.18944144, 0.37425605, 1.22840732, 0.5089164 ,
1.78568641, 0.31210331, 0.64079837, 1.05805662, 0.66416311,
0.77964264, 0.65744803, 0.53025325, 1.22309949, 1.62169299,
0.84558019, 1.23537247, 0.44383979, 1.45601888, 0.85368635,
...