bayespy - does it work online? - python

How do I define the chain if it is intended to work online, i.e. if the length of the chain grows cumulatively at each iteration?
For example, the BayesPy categorical Markov chain function asks for the number of states in the chain if it cannot infer it (and providing it means the length is fixed). Also, the evidence variables observed with observe() from the bayespy.inference.vmp.nodes.expfamily.ExponentialFamily module need to have the same length as the chain. In my case, the data associated with such variables is known only at runtime.

Related

Deciding the minimum support threshold for the frequent itemset generation in apriori algorithm

I want to find the minimum support threshold for the apriori algorithm. I know it's completely user and dataset dependent, but I found an article where an exponential decay function was used.
http://data-mining.philippe-fournier-viger.com/how-to-auto-adjust-the-minimum-support-threshold-according-to-the-data-size/
This is the link where the formula is derived for the apriori algorithm. I want to know how to decide the values of the constants 'a' and 'b', as they can differ from user to user as well. Also, 'c' is said to be the minimum possible support. I used the Python mlxtend package to generate the frequent itemsets, where min_support is one of the inputs. So how do we decide 'c' before running the apriori method and then computing the minimum support from it?
Having to choose a, b and c is no better than having to choose just one parameter...
I think this equation is meant for adjusting your parameter over time as your data set grows. But it seems you have already found several decent threshold values, which you can use to compute a and b.
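As a rough sketch of that last point: assuming the decay curve has the form minsup(x) = exp(-a*x - b) + c, with x the number of transactions (this form is my reading of the linked article, so check it against the source), two thresholds that already work well at two dataset sizes are enough to solve for a and b once c is fixed. The function names and the example numbers below are hypothetical.

```python
import math

def fit_decay(x1, s1, x2, s2, c):
    """Solve for a and b so that exp(-a*x - b) + c passes through
    (x1, s1) and (x2, s2). Assumed form, not taken from mlxtend."""
    if x1 == x2:
        raise ValueError("the two dataset sizes must differ")
    if s1 <= c or s2 <= c:
        raise ValueError("both supports must be above the floor c")
    a = (math.log(s1 - c) - math.log(s2 - c)) / (x2 - x1)
    b = -math.log(s1 - c) - a * x1
    return a, b

def min_support(x, a, b, c):
    """Relative minimum support for a dataset with x transactions."""
    return math.exp(-a * x - b) + c

# Hypothetical numbers: 5% worked at 1,000 rows, 1% at 10,000 rows,
# and the support should never drop below 0.5%.
c = 0.005
a, b = fit_decay(1_000, 0.05, 10_000, 0.01, c)
threshold = min_support(50_000, a, b, c)
```

The resulting value can then be passed as min_support to the apriori call. The floor c still has to be chosen by hand, for example as the smallest support that keeps the run time acceptable.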

Why are components executed two times for each Gauss-Seidel iteration? (OpenMDAO 2.4.0)

I've been using NonlinearBlockGS as the nonlinear_solver for my MDO system consisting of ExplicitComponents, and this works as expected. At first I was using it with simple mathematical functions (hence runtime << 1s), but now I'm also implementing a system with multiple explicit components that have runtimes of around one minute or more. That's when I noticed that the NonlinearBlockGS solver actually needs to run the tools in the coupled system two times per iteration. These runs originate from the self._iter_execute() and the self._run_apply() calls in the _run_iterator() method of the solver (class Solver in file solver.py).
My main question is, are two runs per iteration really required, and if so, why?
It seems the first component run (self._iter_execute()) uses an initial guess for the feedback variables that need to be converged and then runs the components sequentially while updating any feedforward data. This is the step I would expect for Gauss-Seidel. But then the second component run (self._run_apply()) runs the components again with the updated feedback variables that resulted from the first run, while keeping the feedforwards as they were in that first run. If I'm not mistaken, this information is then (only) used to assess the convergence of that iteration (self._iter_get_norm()).
Instead of having this second run inside the iteration, wouldn't it be more efficient to continue directly to the next iteration? In that iteration we can use the new values of the feedback variables, do another self._iter_execute() with the update of feedforward data, and then assess convergence based on the difference between the results of those two iterations. Of course this means that we need at least two iterations to assess convergence, but it saves one component run per iteration. (This is actually the existing implementation that I have for convergence of these components in MATLAB, and it works as expected: it finds the same converged design, but with half the number of component runs.)
So another way of putting this is: why do we need the self._run_apply() in each iteration when doing Gauss-Seidel convergence? (And could this be turned off?)
There are a couple of different aspects to your question. First, I'll address the details of solve_nonlinear vs apply_nonlinear. In the underlying mathematical algorithms of OpenMDAO, which are based on the MAUD framework, solve_nonlinear computes only the output values (it does not set residuals). apply_nonlinear computes only the residuals (and does not set outputs).
For subclasses of ExplicitComponent, the user only implements a compute method, and the base class implements both solve_nonlinear and apply_nonlinear using compute.
As you described, the current implementation of NonlinearBlockGaussSeidel in OpenMDAO V2.4 performs, for each iteration, one recursive solve_nonlinear call on its group and then calls apply_nonlinear to check the residual and look for convergence.
However, you're also correct that we could be doing this more efficiently. The modification you suggested to the algorithm would work, and we'll put it on the development pipeline for V2.6 (as of the time of this post, we're just about to release V2.5 and there won't be time to add this to that release).
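To make the suggested variant concrete, here is a minimal sketch in plain Python. It is not OpenMDAO code and does not use its API: it is a toy two-component loop in which convergence is judged from the change in the feedback variable between consecutive sweeps, so every component runs once per iteration. The functions f1 and f2 and all other names are hypothetical stand-ins for expensive components.

```python
def gauss_seidel(f1, f2, y2_guess, tol=1e-8, max_iter=100):
    """Block Gauss-Seidel for the coupled pair y1 = f1(y2), y2 = f2(y1).

    Convergence is checked on the change in the feedback variable y2
    between consecutive sweeps, so no extra residual run is needed.
    """
    y2 = y2_guess
    runs = 0
    for iteration in range(1, max_iter + 1):
        y1 = f1(y2)        # uses the current feedback value
        y2_new = f2(y1)    # uses the freshly updated feedforward value
        runs += 2          # one run per component in this sweep
        change = abs(y2_new - y2)
        y2 = y2_new
        if change < tol:
            return y1, y2, iteration, runs
    raise RuntimeError("Gauss-Seidel did not converge")

# Toy linear coupling with fixed point y1 = 8/3, y2 = 10/3.
y1, y2, iterations, runs = gauss_seidel(
    lambda y2: 0.5 * y2 + 1.0,
    lambda y1: 0.5 * y1 + 2.0,
    y2_guess=0.0,
)
```

The trade-off is the one noted in the question: at least two sweeps are needed before convergence can be declared, and the check measures the change between sweeps instead of an explicitly evaluated residual.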

Finding maximum value of unknown target function, given samples

I have a function that takes 4 variables and returns a single float value in the range [0,1].
I want to know which inputs will maximize the function's output. However, this function runs slowly, so I just took 1000 random samples, i.e. 1000 tuples of (input, output).
Is there any good method to predict the values that maximize my function from these tuples? I don't mind running the function a few more times, but not many.
Thanks in advance.
No, there is no general method to do what you're asking.
Global optimization is a collection of techniques (and a whole field of study) that are used to minimize (or, equivalently, maximize) a function based on some of its general properties. Without more information about the underlying function, naive random sampling (as you're doing) is a 'reasonable' approach.
Your best bet is to find additional information about the character of your function mapping (is the output spiky or smoothly varying with the input? Are there lots of minima, or just a few?), or just keep sampling.
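For reference, the "keep sampling" option amounts to something like the sketch below. It assumes each of the four inputs lies in [0, 1), which the question does not state, and it uses a hypothetical toy function in place of the real, slow one.

```python
import random

def random_search(func, n_evals, n_inputs=4, seed=0):
    """Naive random search: sample inputs uniformly, keep the best seen."""
    rng = random.Random(seed)
    best_x, best_y = None, float("-inf")
    for _ in range(n_evals):
        x = tuple(rng.random() for _ in range(n_inputs))  # assumed domain [0, 1)
        y = func(*x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

def toy(a, b, c, d):
    """Hypothetical stand-in for the slow function; peak of 1.0 at 0.5 each."""
    dist2 = (a - 0.5) ** 2 + (b - 0.5) ** 2 + (c - 0.5) ** 2 + (d - 0.5) ** 2
    return max(0.0, 1.0 - dist2)

best_x, best_y = random_search(toy, n_evals=1000)
```

This gives no guarantee of finding the true maximum. It only reports the best point seen so far, which is why knowing more about the function's shape matters.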

tf.constant Vs tf.placeholder

I am going through Andrew Ng's deep learning course and I don't understand the basic purpose of using constants. When placeholders can do the trick, why do we need constants? Suppose I need to calculate a function: the same can be done with constants as well as with placeholders. I am very confused. I would be really grateful if anyone could shed some light on this.
Constants and placeholders are both nodes in the computation graph with zero inputs and one output -- that is, they represent constant values.
The difference is when you as the programmer specify those values. With a constant, the value is a part of the computation graph itself, specified when the constant is created: tf.constant(4), for instance. With a placeholder, every time you run the computation graph, you can feed in a different value in your feed_dict.
In machine learning, placeholders are usually used for nodes that hold data, because we may want to run the same graph again and again, in a loop, with different parts of our dataset. (This would be impossible using constants.) People also use placeholders for parameters that change during training, like the learning rate. (Training generally involves running your computation graph over and over again with different placeholder values.) Constants are used only for things that are actually constant. For those things, we don't want to use placeholders, because we don't want to have to specify them over and over every time we run our graph.
If you're curious, this Jupyter notebook has an in-depth explanation of the computation graph and the role played by placeholders, constants, and variables: https://github.com/kevinjliang/Duke-Tsinghua-MLSS-2017/blob/master/01B_TensorFlow_Fundamentals.ipynb
As their names indicate, a placeholder does not have any fixed value but just 'holds the place' for a tensor that is needed in a computation graph, whereas a constant (which also holds a tensor) has a fixed value. A constant does not change its value during its lifetime (not just during a session). Once defined (at programming time), it is fixed. A placeholder, on the other hand, does not carry any value during graph definition (programming), but has its value fed in when a session run starts. In fact, all placeholders should get their values in this manner.
session.run(a_variable, feed_dict={a_placeholder: [1.0, 2.1]})
You might now wonder how a placeholder differs from a tf.Variable. A placeholder cannot be evaluated by a session on its own (without a value being fed in), whereas a variable can:
session.run(a_tf_variable)
A typical use of placeholders is for input nodes, where we feed in the values for different inputs (and we don't expect them to be evaluated on their own). A typical use for constants is holding values like PI, or the areas of geographical blocks/districts in a population study.
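A minimal sketch of the difference, assuming TensorFlow 1.x-style graph execution (accessed here through tensorflow.compat.v1 so that it also runs on a TensorFlow 2 install). The names scale, x and y are illustrative only.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

scale = tf.constant(2.0)                       # value is baked into the graph
x = tf.placeholder(tf.float32, shape=[None])   # value is supplied at run time
y = scale * x

with tf.Session() as sess:
    # The same graph is run twice with different data fed to the placeholder.
    first = sess.run(y, feed_dict={x: [1.0, 2.1]})
    second = sess.run(y, feed_dict={x: [3.0]})
```

The constant never appears in feed_dict, while running y without feeding x would raise an error.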

How to get a family of independent universal hash function?

I am trying to implement the hyperloglog counting algorithm using stochastic averaging. To do that, I need many independent universal hash functions to hash items in different substreams.
I found that there are only a few hash functions available in hashlib, and there seems to be no way for me to provide a seed or something similar. I am thinking of using different salts for different substreams.
You probably DON'T need different hash functions. A common solution to this problem is to use only part of the hash to compute the HyperLogLog rho statistic, and the other part to select the substream. If you use a good hash function (e.g. murmur3), it effectively behaves as multiple independent ones.
See the "stochastic averaging" section here for an explanation of this:
https://research.neustar.biz/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure/
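The bit-splitting idea can be sketched as follows. This is an illustration, not a tuned implementation: it uses hashlib.blake2b truncated to 64 bits in place of murmur3 (only because it is in the standard library), applies the raw HyperLogLog estimator without the small-range correction, and all function names are hypothetical.

```python
import hashlib

def hash64(item: str) -> int:
    """64-bit hash of a string (blake2b stands in for murmur3 here)."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def bucket_and_rho(item: str, p: int = 10):
    """Split one hash: top p bits pick the substream, the rest give rho."""
    h = hash64(item)
    width = 64 - p
    bucket = h >> width                    # substream index in [0, 2**p)
    rest = h & ((1 << width) - 1)          # remaining 64 - p bits
    rho = width - rest.bit_length() + 1    # position of the leftmost 1-bit
    return bucket, rho

def estimate_cardinality(items, p: int = 10) -> float:
    """Raw HyperLogLog estimate with m = 2**p registers (valid for m >= 128)."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        bucket, rho = bucket_and_rho(item, p)
        registers[bucket] = max(registers[bucket], rho)
    alpha = 0.7213 / (1 + 1.079 / m)
    return alpha * m * m / sum(2.0 ** -r for r in registers)

estimate = estimate_cardinality(str(i) for i in range(100_000))
```

With p = 10 there are 1024 substreams, so the expected relative error is about 1.04 / sqrt(1024), roughly 3%.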
