I think the question is clear enough. I want to make a hidden Markov model in Python and draw a visualization of it, something like this picture:
Is there any module to do that? I've googled it and found nothing.
The DOT language from Graphviz is the best I've found. The syntax is simple, simpler than XML.
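For instance, a minimal sketch using the graphviz Python package (assuming it and the Graphviz binaries are installed; the states and probabilities here are made up):
from graphviz import Digraph
dot = Digraph(comment='HMM')
for state in ('s1', 's2', 's3'):
    dot.node(state)
dot.edge('s1', 's2', label='0.3')  # each transition labeled with its probability
dot.edge('s2', 's3', label='0.8')
dot.edge('s3', 's1', label='0.1')
dot.render('hmm', format='png')    # writes the DOT source 'hmm' and 'hmm.png'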
Though I've never worked with Hidden Markov Models, when I need to visualize a graph (directed, with labels, colors, etc.), I use Gephi, a GUI graph browser/editor, and generate the graphs programmatically as GraphML files, which is an XML-based format. Python has good XML-handling tools (in the standard library and lxml). Gephi recognizes some of the <data> sub-elements as positions, colors, and labels for nodes and edges.
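A rough sketch of that workflow with the standard library's xml.etree.ElementTree (the file name, node ids, and label key are my own choices):
import xml.etree.ElementTree as ET
root = ET.Element('graphml', xmlns='http://graphml.graphdrawing.org/xmlns')
# Declare a 'label' attribute that Gephi can pick up for nodes
ET.SubElement(root, 'key', attrib={'id': 'd0', 'for': 'node', 'attr.name': 'label', 'attr.type': 'string'})
graph = ET.SubElement(root, 'graph', edgedefault='directed')
for node_id, label in [('n0', 's1'), ('n1', 's2')]:
    node = ET.SubElement(graph, 'node', id=node_id)
    ET.SubElement(node, 'data', key='d0').text = label
ET.SubElement(graph, 'edge', source='n0', target='n1')
ET.ElementTree(root).write('hmm.graphml', encoding='utf-8', xml_declaration=True)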
The Python library pomegranate has good support for Hidden Markov Models. It includes functionality for defining such models, learning them from data, doing inference, and visualizing the transition graph (as you request here).
Below is example code for defining a model and plotting the states and transitions. The image output will look like this:
from pomegranate import HiddenMarkovModel, State, DiscreteDistribution
from matplotlib import pyplot as plt
def build_model():
    d1 = DiscreteDistribution({'A': 0.50, 'B': 0.50})
    d2 = DiscreteDistribution({'A': 0.10, 'B': 0.90})
    d3 = DiscreteDistribution({'A': 0.90, 'B': 0.10})

    s1 = State(d1, name="s1")
    s2 = State(d2, name="s2")
    s3 = State(d3, name="s3")

    model = HiddenMarkovModel(name='my model')
    model.add_states(s1, s2, s3)
    model.add_transition(model.start, s1, 1.0)
    model.add_transition(s1, s1, 0.7)
    model.add_transition(s1, s2, 0.3)  # s1 -> s2
    model.add_transition(s2, s2, 0.8)
    model.add_transition(s2, s3, 0.0)  # no transition from s2 to s3
    model.add_transition(s1, s3, 0.1)  # indirect from s1 to s3
    model.add_transition(s3, s1, 0.1)  # indirect from s3 to s1
    model.add_transition(s3, s3, 0.9)
    model.add_transition(s3, model.end, 0.1)
    model.start.name = 'start'
    model.end.name = 'end'
    model.bake()
    return model
model = build_model()
fig, ax = plt.subplots(1)
model.plot(ax=ax, precision=2)
fig.savefig('model.png')
I am currently trying my hand at dynamic time warping, using the dtaidistance library for this. At the moment I only get the DTW graphs as saved images, but I would like to add axis labels to them.
Does anyone know how I can add the axis labels? I already have the following code:
from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
[...]
s1 = array1
s2 = array2
d, paths = dtw.warping_paths(s1, s2)
best_path = dtw.best_path(paths)
dtwvis.plot_warpingpaths(s1, s2, paths, best_path, shownumbers=True)
path = dtw.warping_path(s1, s2)
dtwvis.plot_warping(s1, s2, path, filename="warp.svg")
distance = dtw.distance(s1, s2)
print("DTW distance=",distance)
I am trying to fit the parameters of a transit light curve.
I have observed transit light curve data, and I am using a Python script that, given 4 parameters (period, a (semi-major axis), inclination, planet radius), returns a model transit light curve. I would like to minimize the residual between these two light curves. This is what I am trying to do: first, estimate a maximum likelihood fit using method="L-BFGS-B", and then apply MCMC using emcee to estimate the uncertainties.
The code:
p = lmfit.Parameters()
p.add_many(('per', 2.), ('inc', 90.), ('a', 5.), ('rp', 0.1))
per_b = [1., 3.]
a_b = [4., 6.]
inc_b = [88., 90.]
rp_b = [0.1, 0.3]
bounds = [(per_b[0], per_b[1]), (inc_b[0], inc_b[1]), (a_b[0], a_b[1]), (rp_b[0], rp_b[1])]
def residual(p):
    v = p.valuesdict()
    eclipse.criarEclipse(v['per'], v['a'], v['inc'], v['rp'])
    lc0 = numpy.array(eclipse.getCurvaLuz())    # observed flux data
    ts0 = numpy.array(eclipse.getTempoHoras())  # observed time data
    c = numpy.linspace(min(time_phased[bb]), max(time_phased[bb]), len(time_phased[bb]), endpoint=True)
    nn = interpolate.interp1d(ts0, lc0)
    return nn(c) - smoothed_LC[bb]              # residual between the model and the data
Inside def residual(p) I make sure that the observed data (time_phased[bb] and smoothed_LC[bb]) and the model transit light curve have the same size. I want the fit to give me the best-fit values for the parameters (v['per'], v['a'], v['inc'], v['rp']).
I need your help and I appreciate your time and your attention. Kindest regards, Yuri.
Your example is incomplete, with many partial concepts and some invalid Python. This makes it slightly hard to understand your intention. If the answer below is not sufficient, update your question with a complete example.
It seems pretty clear that you want to model your data smoothed_LC[bb] (I'm not sure what bb is) with a model for some effect of an eclipse. With that assumption, I would recommend using the lmfit.Model approach. Start by writing a function that models the data, just so you can check and plot your model. I'm not entirely sure I understand everything you're doing, but this model function might look like this:
import numpy
from scipy import interpolate
from lmfit import Model
# import eclipse from somewhere....
def eclipse_lc(c, per, a, inc, rp):
    eclipse.criarEclipse(per, a, inc, rp)
    lc0 = numpy.array(eclipse.getCurvaLuz())    # observed flux data
    ts0 = numpy.array(eclipse.getTempoHoras())  # observed time data
    return interpolate.interp1d(ts0, lc0)(c)
With this model function, you can build a Model:
lc_model = Model(eclipse_lc)
and then build parameters for your model. This will automatically name them after the argument names of your model function. Here, you can also give them initial values:
params = lc_model.make_params(per=2, inc=90, a=5, rp=0.1)
You wanted to place upper and lower bounds on these parameters. This is done by setting the min and max attributes of each parameter, not by making an ordered array of bounds:
params['per'].min = 1.0
params['per'].max = 3.0
and so on. But also: setting such tight bounds is usually a bad idea. Set bounds to avoid unphysical parameter values or when it becomes evident that you need to place them.
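By the way, the same can be written more compactly with lmfit's Parameter.set, which assigns several attributes in one call (same effect as the assignments above):
params['per'].set(min=1.0, max=3.0)
params['inc'].set(min=88.0, max=90.0)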
Now, you can fit your data with this model. Well, first you need to get the data you want to model. This seems less clear from your example, but perhaps:
c_data = numpy.linspace(min(time_phased[bb]), max(time_phased[bb]),
len(time_phased[bb]), endpoint=True)
lc_data = smoothed_LC[bb]
Well: why do you need to make this c_data? Why not just use time_phased as the independent variable? Anyway, now you can fit your data to your model with your parameters:
result = lc_model.fit(lc_data, params, c=c_data)
At this point, you can print out a report of the results and/or view or get the best-fit arrays:
print(result.fit_report())
for p in result.params.items(): print(p)
import matplotlib.pyplot as plt
plt.plot(c_data, lc_data, label='data')
plt.plot(c_data, result.best_fit, label='fit')
plt.legend()
plt.show()
Hope that helps...
I am trying to fit a Gaussian to a set of data points using the astropy.modeling package but all I am getting is a flat line. See below:
Here's my code:
%pylab inline
from astropy.modeling import models,fitting
from astropy import modeling
#Fitting a gaussian for the absorption lines
wavelength = linspace(galaxy1_wavelength_extracted_1.min(), galaxy1_wavelength_extracted_1.max(), 200)
g_init = models.Gaussian1D(amplitude=1., mean=5000, stddev=1.)
fit_g = fitting.LevMarLSQFitter()
g = fit_g(g_init, galaxy1_wavelength_extracted_1, galaxy1_flux_extracted_1)
#Plotting
plot(galaxy1_wavelength_extracted_1,galaxy1_flux_extracted_1,".k")
plot(wavelength, g(wavelength))
xlabel("Wavelength ($\\AA$)")
ylabel("Flux (counts)")
What am I doing wrong or missing?
I made some fake data that sort of resembles yours, and tried running your code on it, obtaining similar results. I think the problem is that if you don't adjust your model's initial parameters to at least roughly resemble the data, the fitter won't be able to converge no matter how many rounds of fitting it performs.
If I'm fitting a Gaussian, I like to give the initial model some parameters based on computationally "eyeballing" them, like so (here I named your real data's flux and wavelength orig_flux and orig_wavelength respectively):
>>> an_amplitude = orig_flux.min()
>>> an_mean = orig_wavelength[orig_flux.argmin()]
>>> an_stddev = np.sqrt(np.sum((orig_wavelength - an_mean)**2) / (len(orig_wavelength) - 1))
>>> print(f'mean: {an_mean}, stddev: {an_stddev}, amplitude: {an_amplitude}')
mean: 5737.979797979798, stddev: 42.768052162734605, amplitude: 84.73925092448636
where for the standard deviation I used the unbiased standard deviation estimate.
Plotting this over my fake data shows that these are reasonable values I might have picked if I manually eyeballed the data as well:
>>> plt.plot(orig_wavelength, orig_flux, '.k', zorder=1)
>>> plt.scatter(an_mean, an_amplitude, color='red', s=100, zorder=2)
>>> plt.vlines([an_mean - an_stddev, an_mean + an_stddev], orig_flux.min(), orig_flux.max(),
...            linestyles='dashed', colors='g', zorder=2)
One feature I've wanted to add to astropy.modeling in the past is optional methods that can be attached to some models to give reasonable estimates for their parameters based on some data. For Gaussians, such a method would return estimates much like those I just computed above. I don't know if that's ever been implemented, though.
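Just as an illustration, such a method might look like this hypothetical helper, which simply packages the eyeball estimates computed above:
import numpy as np
def estimate_gaussian1d(wavelength, flux):
    # Rough initial guesses for an absorption-line Gaussian
    # (hypothetical helper, not part of astropy.modeling)
    amplitude = flux.min()
    mean = wavelength[flux.argmin()]
    stddev = np.sqrt(np.sum((wavelength - mean)**2) / (len(wavelength) - 1))
    return amplitude, mean, stddev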
It is also worth noting that your Gaussian is inverted (it has a negative amplitude) and displaced on the flux axis by some 120 points, so I added a Const1D to my model to account for this and subtracted the displacement from the amplitude:
>>> an_disp = orig_flux.max()
>>> g_init = (
... models.Const1D(an_disp) +
... models.Gaussian1D(amplitude=(an_amplitude - an_disp), mean=an_mean, stddev=an_stddev)
... )
>>> fit_g = fitting.LevMarLSQFitter()
>>> g = fit_g(g_init, orig_wavelength, orig_flux)
This results in the following fit which looks much better already:
>>> plt.plot(orig_wavelength, orig_flux, '.k')
>>> plt.plot(orig_wavelength, g(orig_wavelength), 'r-')
I'm not an expert in modeling or statistics, so someone with deeper knowledge could likely improve on this. I've added a notebook with my full analysis of the problem, including how I generated my sample data here.
I apologize for a longer than usual intro, but it is important for the question:
I've recently been assigned to work on an existing project, which uses Keras+Tensorflow to create a Fully Connected Net.
Overall the model has 3 fully connected layers with 500 neurons and 2 output classes. The first layer's 500 neurons are connected to the 82 input features. The model is used in production and is retrained weekly, using that week's information generated by an external source.
The engineer who designed the model is no longer working here, and I'm trying to reverse engineer the model and understand its behavior.
Couple of objectives I have defined for myself are:
Understand the feature selection process and feature importance.
Understand and control the weekly re-training process.
In order to try and answer both of them, I've implemented an experiment where I feed two models into my code: one from the previous week and the other from the current week:
import pickle
import numpy as np
import matplotlib.pyplot as plt
from keras.models import model_from_json
path1 = 'C:/Model/20190114/'
path2 = 'C:/Model/20190107/'
model_name1 = '0_10.1'
model_name2 = '0_10.2'
models = [path1 + model_name1, path2 + model_name2]
features_cum_weight = {}
I then take each feature and sum the absolute values of all the weights connecting it to the first hidden layer.
This way I create two vectors of 82 values:
for model_name in models:
    structure_filename = model_name + "_structure.json"
    weights_filename = model_name + "_weights.h5"
    with open(structure_filename, 'r') as model_json:
        model = model_from_json(model_json.read())
        model.load_weights(weights_filename)
    in_layer_weights = model.layers[0].get_weights()[0]
    in_layer_weights = abs(in_layer_weights)
    features_cum_weight[model_name] = in_layer_weights.sum(axis=1)
I then plot them, using MatplotLib:
# Plot the Evolvement of Input Neuron Weights:
keys = list(features_cum_weight.keys())
weights_1 = features_cum_weight[keys[0]]
weights_2 = features_cum_weight[keys[1]]
fig, ax = plt.subplots(nrows=2, ncols=2)
width = 0.35 # the width of the bars
n_plots = 4
batch = int(np.ceil(len(weights_1)/n_plots))
for i in range(n_plots):
    start = i*(batch+1)
    stop = min(len(weights_1), start + batch + 1)
    cur_w1 = weights_1[start:stop]
    cur_w2 = weights_2[start:stop]
    ind = np.arange(len(cur_w1))
    cur_ax = ax[i//2][i%2]
    cur_ax.bar(ind - width/2, cur_w1, width, color='SkyBlue', label='Current Model')
    cur_ax.bar(ind + width/2, cur_w2, width, color='IndianRed', label='Previous Model')
    cur_ax.set_ylabel('Sum of Weights')
    cur_ax.set_title('Sum of all weights connected by feature')
    cur_ax.set_xticks(ind)
    cur_ax.legend()
    cur_ax.set_ylim(0, 30)
plt.show()
Resulting in the following plot:
[Matplotlib plot of the summed weights per feature for the two models]
I then try to compare the vectors to deduce:
If the vectors have changed drastically, there might be some major change in the training data or some problem while retraining the model.
If some value is close to zero, the model might have recognized that feature as unimportant.
I want your opinion and insights on the following:
The overall approach to this experiment.
Advice on other ideas on reverse engineering on a given model.
Insights on the output I provide here.
Thank you all; I am open to any suggestions and critique!
This type of deduction is not entirely valid. The combination of the features is not linear. It is true that a feature whose weights are strictly 0 does not matter at that layer, but it may be that the feature is recombined in another way in a deeper layer.
The deduction would hold if your model were linear. In fact, this is how PCA works: it searches for linear relationships through the covariance matrix, and each eigenvalue indicates the importance of the corresponding component.
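As a quick numpy sketch of that idea (toy data of my own; the eigenvalues of the covariance matrix rank the directions by explained variance):
import numpy as np
X = np.random.rand(100, 5)        # toy data: 100 samples, 5 features
X -= X.mean(axis=0)               # center before computing the covariance
cov = np.cov(X, rowvar=False)     # 5x5 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)
print(eigenvalues[::-1])          # eigh returns ascending order; reverse for importance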
I think that there are several ways to confirm your suspicions:
Eliminate the features you think are not important, train again, and compare the results. If they are similar, your suspicions are correct.
Apply the current model: take an example to evaluate (call it a pivot), significantly perturb the features you consider irrelevant, and create many such examples. Do this for several pivots. If the results are similar, those features should not matter. Example (I consider the first feature to be irrelevant; see the evaluation sketch after the code):
data = np.array([[0.5, 1, 0.5], [1, 2, 5]])
range_values = 50
new_data = []
for i in range(data.shape[0]):
    sample = data[i]
    # We create new samples by perturbing the (supposedly irrelevant) first feature
    for j in range(1000):
        noise = np.random.rand() * range_values
        new_sample = sample.copy()
        new_sample[0] += noise
        new_data.append(new_sample)
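You would then feed the perturbed samples through the network and compare its outputs with the predictions for the pivots; a rough sketch (assuming model is the loaded Keras model and that it accepts these 3-feature toy samples):
baseline = model.predict(data)                  # predictions for the pivots
perturbed = model.predict(np.array(new_data))   # predictions for the perturbed copies
# A small spread across the perturbed copies suggests the feature is irrelevant
print(perturbed.std(axis=0))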
I'm trying to combine HoloViews' PointDraw functionality with its sample functionality (I couldn't find a specific page, but it is shown in action here: http://holoviews.org/gallery/demos/bokeh/mandelbrot_section.html).
Specifically, I want to have two subplots with interactivity. The one on the left shows a colormap, and the one on the right shows a sample (a linecut) of the colormap, achieved with .sample. Inside this right plot I'd like to have points that can be drawn, moved, and removed, as is typically done with PointDraw. I'd then also like to access their coordinates once I am done moving them, which is possible when following the example from the documentation.
Now, I've got the two working independently, following the examples above. But when combined in the way that I have, this results in a plot that looks like this:
It has the elements I am looking for, except the points cannot be interacted with. This is somehow related to Holoviews' streams, but I am not sure how to solve it. Would anyone be able to help out?
The code that generates the above:
%%opts Points (color='color' size=10) [tools=['hover'] width=400 height=400]
%%opts Layout [shared_datasource=True] Table (editable=True)
import param
import numpy as np
import holoviews as hv
hv.extension('bokeh', 'matplotlib')
from holoviews import streams
def lorentzian(x, x0, gamma):
    return 1/np.pi*1/2*gamma/((x-x0)**2+(1/2*gamma)**2)
xs = np.arange(0,4*np.pi,0.05)
ys = np.arange(0,4*np.pi,0.05)
data = hv.OrderedDict({'x': [2., 2., 2.], 'y': [0.5, 0.4, 0.2], 'color': ['red', 'green', 'blue']})
z = lorentzian(xs.reshape(len(xs),1),2*np.sin(ys.reshape(1,len(ys)))+5,1) + lorentzian(xs.reshape(len(xs),1),-2*np.sin(ys.reshape(1,len(ys)))+5,1)
def dispersions(f0):
    points = hv.Points(data, vdims=['color']).redim.range(x=(xs[0], xs[-1]), y=(np.min(z), np.max(z)))
    point_stream = streams.PointDraw(data=points.columns(), source=points, empty_value='black')
    image = hv.Image(z, bounds=(xs[0], ys[0], xs[-1], ys[-1]))
    return image * hv.VLine(x=f0) + image.sample(x=f0) * points
dmap = hv.DynamicMap(dispersions, kdims=['f0'])
dmap.redim.range(f0=(0, 10)).redim.step(f0=0.1)
I apologize for the weird function that we are plotting; I couldn't immediately come up with a simple one.
Based on your example it's not yet quite clear to me what you will be doing with the points but I do have some suggestions on structuring the code better.
In general it is always better to compose plots from several separate DynamicMaps than to create a single DynamicMap that does everything. Not only is it more composable, but you also get handles on the individual objects, allowing you to set up streams to listen to changes on each component. Most importantly, it's more efficient: only the plots that need to be updated will be updated. In your example I'd split up the code as follows:
def lorentzian(x, x0, gamma):
    return 1/np.pi*1/2*gamma/((x-x0)**2+(1/2*gamma)**2)

xs = np.arange(0, 4*np.pi, 0.05)
ys = np.arange(0, 4*np.pi, 0.05)
data = hv.OrderedDict({'x': [2., 2., 2.], 'y': [0.5, 0.4, 0.2], 'color': ['red', 'green', 'blue']})
# z must be defined before points, since points uses np.min(z) and np.max(z)
z = lorentzian(xs.reshape(len(xs),1),2*np.sin(ys.reshape(1,len(ys)))+5,1) + lorentzian(xs.reshape(len(xs),1),-2*np.sin(ys.reshape(1,len(ys)))+5,1)
points = hv.Points(data, vdims=['color']).redim.range(x=(xs[0], xs[-1]), y=(np.min(z), np.max(z)))
image = hv.Image(z, bounds=(xs[0], ys[0], xs[-1], ys[-1]))
def vline(f0):
    return hv.VLine(x=f0)

def sample(f0):
    return image.sample(x=f0)
dim = hv.Dimension('f0', step=0.1, range=(0,10))
vline_dmap = hv.DynamicMap(vline, kdims=[dim])
sample_dmap = hv.DynamicMap(sample, kdims=[dim])
point_stream = streams.PointDraw(data=points.columns(), source=points, empty_value='black')
(image * vline_dmap + sample_dmap * points)
Since the Image and Points are not themselves dynamic, there is no reason to put them inside the DynamicMap, and the VLine and the sampled Curve are easily split out. The PointDraw stream doesn't do anything yet, but you can now set it up as yet another DynamicMap which you can compose with the rest.
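As a sketch of that last step (adapted from the PointDraw examples in the HoloViews docs; displaying a table is my own choice), you could attach a DynamicMap to the stream to track the drawn points:
def point_table(data):
    return hv.Table(data, ['x', 'y'], 'color')
table_dmap = hv.DynamicMap(point_table, streams=[point_stream])
(image * vline_dmap + sample_dmap * points + table_dmap)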