Error with pymc3 sampler in pypesto: theano.graph.fg MissingInputError - python

I am tackling a Bayesian inference problem and am having trouble using a pymc3 sampler provided by pypesto on my Windows laptop. To make sure the sampler runs, I created a simple dummy objective.
I created a conda environment (I tried both Python 3.7 and 3.8) and installed the pymc3 and theano modules using pip3/pip. I've tried several different versions of both pymc3 and theano and managed to import them successfully. However, I get an error that I cannot figure out how to work around. I looked online for a solution but could not find one. I currently have the latest versions of pymc3 and theano installed (3.11.0 and 1.0.5 respectively). This is the final line of the message:
theano.graph.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(x2_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
Here is the full message:
Sampling 1 chain for 1_000 tune and 100 draw iterations (1_000 + 100 draws total) took 7 seconds.
Traceback (most recent call last):
File "samplingPymc3.py", line 70, in <module>
result2 = sample.sample(problem1, 100, sampler2, x0=np.array([0,0]))
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\pypesto\sample\sample.py", line 68, in sample
sampler.sample(n_samples=n_samples)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\pypesto\sample\pymc3.py", line 102, in sample
**self.options)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\pymc3\sampling.py", line 637, in sample
idata = arviz.from_pymc3(trace, **ikwargs)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\arviz\data\io_pymc3.py", line 559, in from_pymc3
density_dist_obs=density_dist_obs,
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\arviz\data\io_pymc3.py", line 163, in __init__
self.observations, self.multi_observations = self.find_observations()
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\arviz\data\io_pymc3.py", line 176, in find_observations
multi_observations[key] = val.eval() if hasattr(val, "eval") else val
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\graph\basic.py", line 554, in eval
self._fn_cache[inputs] = theano.function(inputs, self)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\compile\function\__init__.py", line 350, in function
output_keys=output_keys,
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\compile\function\pfunc.py", line 532, in pfunc
output_keys=output_keys,
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\compile\function\types.py", line 1978, in orig_function
name=name,
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\compile\function\types.py", line 1584, in __init__
fgraph, additional_outputs = std_fgraph(inputs, outputs, accept_inplace)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\compile\function\types.py", line 188, in std_fgraph
fgraph = FunctionGraph(orig_inputs, orig_outputs, update_mapping=update_mapping)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\graph\fg.py", line 162, in __init__
self.import_var(output, reason="init")
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\graph\fg.py", line 330, in import_var
self.import_node(var.owner, reason=reason)
File "C:\Users\germa\anaconda3\envs\sampling\lib\site-packages\theano\graph\fg.py", line 383, in import_node
raise MissingInputError(error_msg, variable=var)
theano.graph.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(x2_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
I read somewhere that the issue may lie with the version of arviz used, but that does not appear to be the issue in my case.
Here is the script I am running:
import numpy as np
import scipy as sp
import scipy.optimize as so
from scipy.stats import multivariate_normal
import pypesto
import pypesto.sample as sample
from pypesto import Objective
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 1.0])
x_init = np.array([3.4302, 2.915])
x_true = np.array([1.0, 1.0])
temp = lambda x: A.dot(x) - b
f = lambda x: .5 * np.linalg.norm(temp(x))
A_t = A.transpose()
K = np.dot(A_t, A)
df = lambda x: K.dot(x) - A_t.dot(b)
def obj1(x):
    # f_val = f(x)
    # grad = df(x)
    return (f(x), df(x))
objfun = lambda x: obj1(x)
dim_full = 2
lb = -10 * np.ones((dim_full, 1))
ub = 10 * np.ones((dim_full, 1))
x_names = ['x1', 'x2']
# step_fcn = pymc3.step_methods.hmc.hmc.HamiltonianMC
objective = pypesto.Objective(fun=objfun, grad=True, hess=False)
problem1 = pypesto.Problem(objective=objective, lb=lb, ub=ub, x_names=x_names)
sampler = sample.AdaptiveMetropolisSampler()
print('function val: ', objfun(x_init))
sampler2 = sample.Pymc3Sampler()
result2 = sample.sample(problem1, 100, sampler2, x0=np.array([0, 0]))
print('Done sampling!')
Thank you in advance for any help!

pymc3 support in pypesto is limited at the moment, as it was implemented at a time when theano was discontinued in favor of aesara in pymc3. Thus, pypesto only supports specific versions of the involved tools, specifically:
arviz >= 0.8.1, < 0.9.0
theano >= 1.0.4
packaging >= 20.0
pymc3 >= 3.8, < 3.9.2
(see https://github.com/ICB-DCM/pyPESTO/blob/main/setup.cfg#L111).
The switch to full aesara and later pymc3 version support is underway, but not out yet.
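A quick way to see whether an installed version falls inside one of the ranges listed above is to compare version tuples. The sketch below is only an illustration: the helper names parse_version and in_range are made up here and are not part of pypesto, and the simple parser assumes purely numeric versions such as 3.11.0.

```python
def parse_version(text, width=3):
    """Turn '3.11.0' into (3, 11, 0), padding short versions with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def in_range(installed, lower, upper):
    """Return True if lower <= installed < upper."""
    return parse_version(lower) <= parse_version(installed) < parse_version(upper)


# The pymc3 version reported in the question, checked against the pin above.
pymc3_ok = in_range("3.11.0", "3.8", "3.9.2")
print("pymc3 3.11.0 within supported range:", pymc3_ok)
```

Under these assumptions, pymc3 3.11.0 falls outside the supported range, which is consistent with the failure reported in the question.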


Error using np.arange() in t_span with solve_ivp error in Scipy 1.8.0 but not 1.5.0

For the following input
neuron_dict = {'param_set': sb.morris_lecar_defaults(V_3 = 11.96), 'time_range': (0, 10000, 0.0001), 'initial_cond': (-3.06560496e+01, 7.33832272e-03, 8.35251563e-01), 'stretch': 4.2, 'track_event': sb.voltage_passes_threshold ,'location': np.array([20,0,100])}
sb.ivp_solver(sb.morris_lecar, time_range = neuron_dict['time_range'], initial_cond = neuron_dict['initial_cond'], params = neuron_dict['param_set'], track_event = neuron_dict['track_event'])
def ivp_solver(system_of_equations: callable, time_range: tuple, initial_cond: tuple,params: callable = morris_lecar_defaults(), track_event: callable = voltage_passes_threshold, numerical_method = 'BDF', rtol = 1e-8) -> object:
    track_event.direction = 1
    sol = solve_ivp(system_of_equations, time_range, initial_cond, args=(params,), events= track_event, t_eval= np.arange(time_range[0], time_range[1], time_range[2]), method = numerical_method, rtol = rtol)
    return sol
solve_ivp fails with the following output for Scipy version 1.8.0 with traceback:
Traceback (most recent call last):
File "...MEA_foward_model.py", line 438, in <module>
main()
File "...MEA_foward_model.py", line 430, in main
near_synchronous_dual_bursting_example()
File "...MEA_foward_model.py", line 377, in near_synchronous_dual_bursting_example
ts, voltages, currents, time_events, y_events = integrate_neurons(neurons_list)
File "...MEA_foward_model.py", line 185, in integrate_neurons
sol = sb.ivp_solver(sb.morris_lecar, time_range = neuron_dict['time_range'], initial_cond = neuron_dict['initial_cond'], params = neuron_dict['param_set'], track_event = neuron_dict['track_event'])
File "...\Square_bursting_oscillations.py", line 106, in ivp_solver
sol = solve_ivp(system_of_equations, time_range, initial_cond, args=(params,), events= track_event, t_eval= np.arange(time_range[0], time_range[1], time_range[2]), method = numerical_method, rtol = rtol)
File "C:\Anaconda\envs\test\lib\site-packages\scipy\integrate\_ivp\ivp.py", line 512, in solve_ivp
t0, tf = map(float, t_span)
ValueError: too many values to unpack (expected 2)
but in Scipy version 1.5.0 (my base interpreter) it runs without issue. Looking at the line highlighted in the traceback in ivp.py:
Scipy 1.8.0: t0, tf = map(float, t_span) vs Scipy 1.5.0: t0, tf = float(t_span[0]), float(t_span[1])
I am not sure whether this difference explains the failure, but it is odd that Scipy 1.8.0 doesn't accept the same input. I would like to use np.arange(start, finish, interval) for my integration; is there any reason why this is failing?
According to the docs
t_span
2-tuple of floats
Interval of integration (t0, tf). The solver starts with t=t0 and
integrates until it reaches t=tf.
For a 2 element tuple, these are the same:
t0, tf = map(float, t_span)
t0, tf = float(t_span[0]), float(t_span[1])
But the first raises this error when t_span is longer than 2. The second just ignores the additional values. It's the t0,tf=... unpacking that enforces the 2-element requirement.
Maybe you want to give t_eval the longer arange, and t_span just the end points.
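As a minimal sketch of that suggestion, the three-element time_range can be split so that t_span gets only the end points and t_eval gets the full arange. The decay function and the shortened time range below are stand-ins chosen for illustration, not the Morris-Lecar system from the question.

```python
import numpy as np
from scipy.integrate import solve_ivp


def decay(t, y):
    """Simple test system dy/dt = -0.5 * y (a stand-in for the real model)."""
    return -0.5 * y


time_range = (0.0, 10.0, 0.1)      # (start, stop, step), as in the question
t_span = time_range[:2]            # exactly two values: (t0, tf)
t_eval = np.arange(*time_range)    # output times, all inside t_span

sol = solve_ivp(decay, t_span, [1.0], t_eval=t_eval, method="BDF", rtol=1e-8)
print(sol.success, sol.t.shape)
```

With this split, the same time_range tuple works in both Scipy versions, because t_span always has exactly two elements.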

calling Matlab's portfolio optimizer via Python failed

I estimated my asset returns and covariances in Python and want to use Matlab's portfolio optimizer to get a portfolio with the maximum Sharpe ratio.
In Matlab I have a function:
function pwgt= calculate_portfolio(r0,m,C,lb,ub,target)
p = Portfolio('RiskFreeRate', r0);
p = setAssetMoments(p, m, C);
p = setBounds(p,lb,ub);
p = setBudget(p,target,target);
pwgt = estimateMaxSharpeRatio(p);
In Python I have
import matlab
import matlab.engine
eng = matlab.engine.connect_matlab()
test_mean = matlab.double(mat_mean.values.tolist())
test_cov = matlab.double(mat_cov.values.tolist())
result = eng.calculate_portfolio(0.03, test_mean, test_cov, -1, 1, 1)
Where test_mean is a 5*1 pandas series, test_cov is a 5*5 covariance matrix (stored as pandas dataframe).
However, when I call the function from Python, I get the following error:
MatlabExecutionError:
File /Applications/MATLAB_R2016b.app/toolbox/optim/optim/linprog.m, line 158, in linprog
File /Applications/MATLAB_R2016b.app/toolbox/finance/finance/#Portfolio/private/mv_estimate_lower_bound.p, line 0, in mv_estimate_lower_bound
File /Applications/MATLAB_R2016b.app/toolbox/finance/finance/#Portfolio/private/mv_optim_transform.p, line 0, in mv_optim_transform
File /Applications/MATLAB_R2016b.app/toolbox/finance/finance/#Portfolio/estimateFrontierLimits.m, line 70, in estimateFrontierLimits
File /Applications/MATLAB_R2016b.app/toolbox/finance/finance/#Portfolio/estimateMaxSharpeRatio.m, line 33, in estimateMaxSharpeRatio
File /Users/Michael/Documents/MATLAB/calculate_portfolio.m, line 6, in calculate_portfolio
LINPROG requires the following inputs to be of data type double: 'b','LB'.
How do I solve this problem?
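The error message says that the bounds reach linprog as something other than double. One likely cause, assuming the MATLAB Engine API for Python converts Python int arguments to MATLAB int64 scalars as its documentation describes, is that -1, 1, 1 are passed as integers. The sketch below converts them to float before the call; the helper name as_matlab_double is hypothetical, and the engine call is left as a comment because it needs a running MATLAB session.

```python
def as_matlab_double(value):
    """Convert a Python number to float so MATLAB receives a double scalar."""
    return float(value)


# Bounds and budget target from the question, converted from int to float.
lb, ub, target = (as_matlab_double(v) for v in (-1, 1, 1))

# With an engine session and the inputs from the question, the call would be:
# result = eng.calculate_portfolio(0.03, test_mean, test_cov, lb, ub, target)
print(lb, ub, target)
```

Writing the literals as -1.0, 1.0, 1.0 directly in the call has the same effect.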

Fipy Grid3D 'an index can only have a single ellipsis' error

I am interested in solving a differential equation using fipy.
The following code works correctly when I use Grid2D.
from fipy import *
mesh = Grid2D(nx=3, ny=3)
#mesh = Grid3D(nx=3, ny=3, nz=3)
phi = CellVariable(name='solution variable', mesh=mesh, value=0.)
phi.constrain(0, mesh.facesLeft)
phi.constrain(10, mesh.facesRight)
coeff = CellVariable(mesh=mesh, value=1.)
eq = DiffusionTerm(coeff) == 0
eq.solve(var=phi)
When I use Grid3D instead of Grid2D (the commented line), I get the following error:
Traceback (most recent call last):
File "/home/user/Programming/python/fdms/forSo.py", line 11, in <module>
eq.solve(var=phi)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/terms/term.py", line 211, in solve
solver = self._prepareLinearSystem(var, solver, boundaryConditions, dt)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/terms/term.py", line 169, in _prepareLinearSystem
diffusionGeomCoeff=self._getDiffusionGeomCoeff(var),
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/terms/abstractDiffusionTerm.py", line 458, in _getDiffusionGeomCoeff
return self._getGeomCoeff(var)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/terms/term.py", line 465, in _getGeomCoeff
self.geomCoeff = self._calcGeomCoeff(var)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/terms/abstractDiffusionTerm.py", line 177, in _calcGeomCoeff
tmpBop = (coeff * FaceVariable(mesh=mesh, value=mesh._faceAreas) / mesh._cellDistances)[numerix.newaxis, :]
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/variables/variable.py", line 1151, in __mul__
return self._BinaryOperatorVariable(lambda a,b: a*b, other)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/variables/variable.py", line 1116, in _BinaryOperatorVariable
if not v.unit.isDimensionless() or len(v.shape) > 3:
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/variables/variable.py", line 255, in _getUnit
return self._extractUnit(self.value)
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/variables/variable.py", line 538, in _getValue
value = self._calcValue()
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/variables/cellToFaceVariable.py", line 48, in _calcValue
alpha = self.mesh._faceToCellDistanceRatio
File "/home/user/Programs/miniconda2/envs/FipyEnv2/lib/python3.6/site-packages/fipy/meshes/uniformGrid3D.py", line 269, in _faceToCellDistanceRatio
XZdis[..., 0,...] = 1
IndexError: an index can only have a single ellipsis ('...')
I installed fipy using the «Recommended method» from https://www.ctcms.nist.gov/fipy/INSTALLATION.html. I tried installing with Miniconda for both Python 3.6 and Python 2.7 and got the same error.
How can I solve equations using Grid3D?
This is because newer versions of numpy are less tolerant of our sloppy syntax. You can either check out our develop source branch or make this change to your code.
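To illustrate the kind of change involved, here is a small numpy-only sketch. It does not reproduce FiPy's source; the array name is borrowed from the traceback and the shape is an assumption. For a three-dimensional array, an index with two ellipses can be written with explicit slices instead.

```python
import numpy as np

# Stand-in for the array in the traceback; the shape is an assumption.
XZdis = np.full((3, 4, 3), 0.5)

try:
    XZdis[..., 0, ...] = 1      # rejected by recent numpy versions
    double_ellipsis_ok = True
except IndexError:
    double_ellipsis_ok = False

# Explicit slices address the same planes of a 3-D array.
XZdis[:, 0, :] = 1
XZdis[:, -1, :] = 1
print(double_ellipsis_ok, XZdis[:, 0, :].min(), XZdis[:, 1, :].max())
```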

Running deseq2 through rpy2

I am trying to run DEseq2 from Python using rpy2.
How should I pass the design matrix?
My script is as follows:
from numpy import *
from numpy.random import multinomial, random
from rpy2 import robjects
import rpy2.robjects.numpy2ri
robjects.numpy2ri.activate()
from rpy2.robjects.packages import importr
deseq = importr('DESeq2')
# Generate some data. 1000 genes, 10 samples
n = 1000
probabilities = random(n)
probabilities /= sum(probabilities)
data = zeros((n,10), int)
for i in range(10):
    data[:,i] = multinomial(1000000, probabilities)
# Make the data frame
d = {}
categories = ('1','2') * 5
d["key_1"] = robjects.IntVector(categories)
dataframe = robjects.DataFrame(d)
# Create the design matrix, and run DESeqDataSetFromMatrix
design = "~ key_1" # <--- I guess this is wrong
dds = deseq.DESeqDataSetFromMatrix(countData=data, colData=dataframe,design=design)
The error I am getting is
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/rpy2-2.8.5-py3.6-macosx-10.11-x86_64.egg/rpy2/rinterface/__init__.py:186: RRuntimeWarning: Error: $ operator is invalid for atomic vectors
warnings.warn(x, RRuntimeWarning)
Traceback (most recent call last):
File "testrpy.py", line 23, in <module>
dds = deseq.DESeqDataSetFromMatrix(countData=data, colData=dataf,design=design)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/rpy2-2.8.5-py3.6-macosx-10.11-x86_64.egg/rpy2/robjects/functions.py", line 178, in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/rpy2-2.8.5-py3.6-macosx-10.11-x86_64.egg/rpy2/robjects/functions.py", line 106, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error: $ operator is invalid for atomic vectors
My guess is that the design argument is not correct.
Does anybody have an example of running DEseq via rpy2?
Thanks.
Ah ! You were almost there:
# Create the design matrix, and run DESeqDataSetFromMatrix
design = "~ key_1" # <--- I guess this is wrong
design is a string, but I guess that it should be a formula. Formulae are language objects in R.
Try with:
from rpy2.robjects import Formula
design = Formula("~ key_1")

Interpolate a discontinuous function with Scipy

I am having problems interpolating some data points using Scipy. I guess that it might depend on the fact that the function I'm trying to interpolate is discontinuous at x roughly 4.
Here is the code I'm using to interpolate:
from scipy import *
y_interpolated = interp1d(x,y,buonds_error=False,fill_value=0.,kind='cubic')
new_x_array = arange(min(x),max(x),0.05)
plot(new_x_array,x_interpolated(new_x_array),'r-')
The error I get is
File "<stdin>", line 2, in <module>
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/scipy/interpolate/interpolate.py", line 357, in __call__
out_of_bounds = self._check_bounds(x_new)
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/scipy/interpolate/interpolate.py", line 415, in _check_bounds
raise ValueError("A value in x_new is above the interpolation "
ValueError: A value in x_new is above the interpolation range.
These are my data points:
1.56916432074 -27.9998263169
1.76773750527 -27.6198430485
1.98360238449 -27.2397962268
2.25133982943 -26.8596491107
2.49319293195 -26.5518194791
2.77823462692 -26.1896935372
3.07201297519 -25.9540514619
3.46090507092 -25.7362456112
3.65968688527 -25.6453922172
3.84116464506 -25.53652509
3.97070419447 -25.3374215879
4.03087127145 -24.8493356465
4.08217147954 -24.0540196233
4.12470899596 -23.0960856364
4.17612639206 -22.4634289328
4.19318305992 -22.1380894034
4.2708234589 -21.902951035
4.3745696768 -21.9027079759
4.52158254627 -21.9565591238
4.65985875536 -21.8839570732
4.80666329863 -21.6486676004
4.91026629192 -21.4496126386
5.05709528961 -21.2685401725
5.29054655428 -21.2860476871
5.54129211534 -21.3215908912
5.73174988353 -21.6645019816
6.06035782465 -21.772138994
6.30243916407 -21.7715483093
6.59656410998 -22.0238656166
6.86481948673 -22.3665921479
7.01182409559 -22.4385289076
7.17609125906 -22.4200564296
7.37494987052 -22.4376476472
7.60844044988 -22.5093814451
7.79869207061 -22.5812017094
8.00616642549 -22.5445612485
8.17903446593 -22.4899243886
8.29141325457 -22.4715846981
What version of scipy are you using?
The script you posted has some syntax errors (I assume due to wrong copy and paste).
This script works with scipy.__version__ == 0.9.0:
import sys
from scipy import *
from scipy.interpolate import *
from pylab import plot
x = []
y = []
for line in sys.stdin:
    a, b = line.split()
    x.append(float(a))
    y.append(float(b))
y_interpolated = interp1d(x,y,bounds_error=False,fill_value=0.,kind='cubic')
new_x_array = arange(min(x),max(x),0.05)
plot(new_x_array,y_interpolated(new_x_array),'r-')
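For reference, the effect of bounds_error=False together with fill_value can be checked on a few of the data points from the question. This is a minimal sketch: it uses only a subset of the points, and the query values are chosen here for illustration.

```python
import numpy as np
from scipy.interpolate import interp1d

# A subset of the data points from the question.
x = np.array([1.56916432074, 2.49319293195, 3.46090507092,
              4.03087127145, 4.52158254627, 5.54129211534])
y = np.array([-27.9998263169, -26.5518194791, -25.7362456112,
              -24.8493356465, -21.9565591238, -21.3215908912])

y_interpolated = interp1d(x, y, bounds_error=False, fill_value=0.0, kind="cubic")

inside = y_interpolated(x[2])     # at a data point: reproduces y[2]
outside = y_interpolated(10.0)    # beyond max(x): returns fill_value
print(float(inside), float(outside))
```

With bounds_error spelled correctly, a query outside the data range returns fill_value instead of raising the ValueError shown in the question.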
