How can I include categorical distributions as observed data in PyMC3?

I have a dataset where each observation is basically a series of noisy measurements and one of the measurements contains signal.
Raw observed data y:
[[ 1.93542253e-01 1.39657327e-04 7.53918636e-01 5.23994535e-02]
[ 6.44964587e-02 8.50087384e-01 1.09894665e-02 7.44266910e-02]
[ 1.68387463e-02 5.38121456e-01 6.98554551e-02 3.75184342e-01]
...,
[ 5.79786789e-01 1.47417427e-02 3.15395731e-01 9.00757372e-02]
[ 8.66796124e-02 8.66999904e-02 4.47848127e-02 7.81835584e-01]
[ 8.18765043e-01 3.23448859e-03 5.61247840e-04 1.77439220e-01]]
I want to put each observation into an appropriate cluster based on this measurement data. For example the first datapoint above is drawn from the third column and the second datapoint above is drawn from the second column. If I sample from the known original distribution and provide those samples to the model as inputs to the Categorical I can get back the original distribution.
Sampled from original data y_choice:
[ 2. 3. 3. 1. 2. 2. 2. 2. 3. 3. 1. 2. 3. 0. 2. 0. 3. 1.
3. 0. 2. 0. 3. 0. 2. 0. 1. 0. 3. 0. 2. 0. 0. 0. 3. 0.
2. 0. 0. 3. 3. 1. ...
However this seems like I'm losing information because my choice sampler is outside the PyMC model. How can I supply the actual observed data y directly into the model? I'm guessing it has something to do with another model parameter based on the Dirichlet, but I haven't been able to wrap my head around how that works.
The sample code I'm operating from is below. I want to be able to supply y to the model and get the true_probs back out, but I've only managed to get it to work with y_choice so far.
import numpy as np
import pymc3 as pm
print('pymc3 version: ' + pm.__version__)
def generate_noisy_observations():
    # Each row of y mixes a one-hot "true" category (weight 0.9)
    # with squared Gaussian noise (weight 0.1), normalized to sum to 1.
    y = np.ones((sample_size, k))
    for i in range(sample_size):
        #print("Iteration %d" % i)
        true_category = np.random.choice(k, p=true_probs)
        true_distribution = np.zeros(k)
        true_distribution[true_category] = 1
        noise_distribution = np.random.dirichlet(np.ones(k))
        noise = np.random.normal(0, 1, k)
        distribution_weights = [0.9, 0.1]
        raw_distribution = (true_distribution*distribution_weights[0] + noise**2*distribution_weights[1]) / \
            (np.sum(true_distribution*distribution_weights[0]) + np.sum(noise**2*distribution_weights[1]))
        y[i] = raw_distribution
    return y
def generate_choices_from_noisy_observations(y):
    # Draw one category per observation, using that row of y as the probabilities.
    y_choice = np.ones(sample_size)
    for i in range(sample_size):
        y_choice[i] = np.random.choice(k, p=y[i])
    return y_choice
sample_size = 1000
true_probs = [0.2, 0.1, 0.3, 0.4]
k = len(true_probs)
y = generate_noisy_observations()
y_choice = generate_choices_from_noisy_observations(y)
with pm.Model() as multinom_test:
    probs = pm.Dirichlet('a', a=np.ones(k))
    #data = pm.Categorical('data', p=probs, observed=y)
    data = pm.Categorical('data', p=probs, observed=y_choice)
    start = pm.find_MAP()
    trace = pm.sample(50000, pm.Metropolis())
pm.traceplot(trace[500:])
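For the y_choice version of the model above, a quick sanity check is possible without MCMC. A Dirichlet prior with a Categorical likelihood is conjugate, so the posterior is again a Dirichlet whose parameters are the prior parameters plus the per-category counts. The sketch below is a minimal NumPy illustration of that fact; the helper name dirichlet_posterior is hypothetical, and it does not address how to feed y itself into the model.

```python
import numpy as np

def dirichlet_posterior(choices, k, prior=None):
    """Closed-form posterior for a Dirichlet prior with Categorical observations.

    `dirichlet_posterior` is a hypothetical helper name, not part of PyMC3.
    """
    # Default to the flat Dirichlet(1, ..., 1) prior used in the question.
    prior = np.ones(k) if prior is None else np.asarray(prior, dtype=float)
    # Count how often each category 0..k-1 was observed.
    counts = np.bincount(np.asarray(choices, dtype=int), minlength=k)
    alpha_post = prior + counts
    # Return the posterior Dirichlet parameters and the posterior mean.
    return alpha_post, alpha_post / alpha_post.sum()

alpha, mean = dirichlet_posterior([2, 3, 3, 1], k=4)
print(alpha)  # posterior Dirichlet parameters
print(mean)   # posterior mean of the category probabilities
```

With the y_choice array from the question, the posterior mean returned here should be close to the mean of the sampled trace for 'a'.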

Related

How to randomly change some members of a 2D-matrix to zero in python?

I have a 2D matrix of numbers and I want to randomly set a fraction (e.g. 0.2) of the non-zero entries to zero, and then randomly choose an amount equal to that fraction (0.2) among all the zeros and give those entries random numbers. Is there a straightforward way to do that?
for example:
The original matrix is : x = [[1,2,3],[4,0,7],[2,10,0]]
After first step (2 randomly selected numbers change to zero): x = [[1,0,0],[4,0,7],[2,10,0]]
After second step (2 randomly selected zeros change to random numbers): x = [[1,0,5],[4,7,7],[2,10,0]]

One method:
import numpy as np
arr = np.ones((5, 5)) # Your matrix
print("Before Replacement")
print(arr)
# Number of elements to replace
num_replaced = 3
# Random (x, y) coordinates
indices_x = np.random.randint(0, arr.shape[0], num_replaced)
indices_y = np.random.randint(0, arr.shape[1], num_replaced)
arr[indices_x, indices_y] = 0
print("After replacement")
print(arr)
Sample Output:
Before Replacement
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
After replacement
[[0. 1. 1. 1. 1.]
[1. 0. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 0. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
EDIT
You can use np.random.choice instead of np.random.randint as follows:
indices_x = np.random.choice(range(arr.shape[0]), num_replaced, replace=REPLACE)
indices_y = np.random.choice(range(arr.shape[1]), num_replaced, replace=REPLACE)
Here, you can easily switch between sampling with or without replacement.
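The snippets above pick coordinates anywhere in the matrix, so they may hit entries that are already zero. A minimal sketch that restricts step one to non-zero entries and step two to zero entries could look like the following. The function name swap_fraction, the rounding rule for the count, and the integer range for the new values are all assumptions made for illustration; it also assumes that "equal to that fraction" means the same number of entries in both steps, as in the question's example.

```python
import numpy as np

def swap_fraction(x, fraction, rng=None, low=1, high=10):
    """Zero a fraction of the non-zero entries, then refill as many zeros.

    Hypothetical helper: the name, the rounding and the value range
    [low, high] are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(x, dtype=float)   # contiguous copy; the input is untouched
    flat = out.ravel()               # view into `out`
    nonzero_idx = np.flatnonzero(flat)
    n_swap = int(round(fraction * nonzero_idx.size))
    # Step 1: set n_swap distinct non-zero entries to zero.
    to_zero = rng.choice(nonzero_idx, size=n_swap, replace=False)
    flat[to_zero] = 0
    # Step 2: pick n_swap distinct zeros (possibly the ones just created)
    # and fill them with random non-zero integers.
    zero_idx = np.flatnonzero(flat == 0)
    to_fill = rng.choice(zero_idx, size=n_swap, replace=False)
    flat[to_fill] = rng.integers(low, high + 1, size=n_swap)
    return out

x = [[1, 2, 3], [4, 0, 7], [2, 10, 0]]
print(swap_fraction(x, 0.3))
```

Because both steps use the same count and the new values are never zero, the number of non-zero entries is the same before and after.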

I would create a simple function for this, so you can pass in the number of replacements you want.
import pandas as pd
import random
def random_converter(dataframe, k, isZero=True, input_data='random_value'):
    # Copy df
    dataframe_local = dataframe.copy()
    if input_data=='random_value':
        input_data = random.randint(0,10)
    ki = 0
    while ki < k:
        row_selected = dataframe_local.sample(1).T
        # VERIFY CONDITION
        if isZero:
            attributes = row_selected[row_selected.iloc[:, 0] == 0]
        else:
            attributes = row_selected[row_selected.iloc[:, 0] != 0]
        # No matching entry in this row; sample another one
        if attributes.size == 0:
            continue
        column_index = attributes.index
        row_index = attributes.columns
        dataframe_local.iloc[row_index, column_index] = input_data
        ki += 1  # count this replacement
    return dataframe_local

How can I implement locally connected layer in pure Numpy

I would like to build a locally connected weight matrix that represents a locally connected neural network in pure python/numpy without deep learning frameworks like Torch or TensorFlow.
The weight matrix is a non-square 2D matrix with the dimension (number_input, number_output). (an autoencoder in my case; input>hidden)
So the function I would like to build takes the matrix dimensions and the size of the receptive field (number of local connections) and returns the associated weight matrix. I've already created a function like this, but for an input size of 8 and an output size of 4 (and RF = 4) my function outputs:
[[ 0.91822845 0. 0. 0. ]
[-0.24264655 -0.54754138 0. 0. ]
[ 0.55617366 0.12832513 -0.28733965 0. ]
[ 0.27993286 -0.33150324 0.06994107 0.61184121]
[ 0. 0.04286912 -0.20974503 -0.37633903]
[ 0. 0. -0.10386762 0.33553009]
[ 0. 0. 0. 0.09562682]
[ 0. 0. 0. 0. ]]
but I would like :
[[ 0.91822845 0. 0. 0. ]
[-0.24264655 -0.54754138 0. 0. ]
[ 0.55617366 0.12832513 0. 0. ]
[ 0 -0.33150324 0.06994107 0 ]
[ 0. 0.04286912 -0.20974503 0. ]
[ 0. 0. -0.10386762 0.33553009]
[ 0. 0. 0.11581854 0.09562682]
[ 0. 0. 0. 0.03448418]]
Here's my python code :
import numpy as np
def local_weight(input_size, output_size, RF):
    input_range = 1.0 / input_size ** (1/2)
    w = np.zeros((input_size, output_size))
    for i in range(0, RF):
        for j in range(0, output_size):
            w[j+i, j] = np.random.normal(loc=0, scale=input_range, size=1)
    return w
print(local_weight(8, 4, 4))
I look forward to your response!

The trick is to add a small pad, which makes it easier to work with (or control) the boundaries.
Then you must define the step to take along the input (it is essentially the input size divided by the output size). Once this is done, you just have to fill in the gaps and then remove the pad.
import math
import numpy as np
def local_weight(input_size, output_size, RF):
    input_range = 1.0 / input_size ** (1/2)
    padding = ((RF - 1) // 2)
    w = np.zeros(shape=(input_size + 2*padding, output_size))
    step = float(w.shape[0] - RF) / (output_size - 1)
    for i in range(output_size):
        j = int(math.ceil(i * step))
        j_next = j + RF
        w[j:j_next, i] = np.random.normal(loc=0, scale=input_range, size=(j_next - j))
    return w[padding:-padding, :]
I hope that is what you are looking for.
EDIT:
I think the first implementation was misguided, so I reimplemented the function. Step by step:
1. Calculate the radius of the receptive field (the padding).
2. Determine the size of W.
3. Calculate the step after removing the padding area, so that every window stays inside the matrix.
4. Fill in the weights.
5. Remove the padding.
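To see which input rows each output unit ends up connected to, the index arithmetic from the steps above can be traced without any random weights. The sketch below is only an illustration: the helper name receptive_field_slices is hypothetical, and it assumes output_size is greater than 1 (otherwise the step divides by zero).

```python
import math

def receptive_field_slices(input_size, output_size, RF):
    """Return (start, stop) input-row ranges per output unit, in unpadded coordinates.

    Hypothetical helper that mirrors the padding/step arithmetic of local_weight.
    """
    padding = (RF - 1) // 2
    padded_size = input_size + 2 * padding
    # Distance between the starts of consecutive windows in the padded matrix.
    step = float(padded_size - RF) / (output_size - 1)
    slices = []
    for i in range(output_size):
        start = int(math.ceil(i * step))
        stop = start + RF
        # Shift back to unpadded coordinates and clip to the valid row range.
        slices.append((max(start - padding, 0), min(stop - padding, input_size)))
    return slices

print(receptive_field_slices(8, 4, 4))
# [(0, 3), (1, 5), (3, 7), (5, 8)]
```

For the 8-by-4 case with RF = 4, these ranges match the non-zero pattern of the desired matrix in the question: the first and last units lose one row each to the padding.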

Applying softmax to non-zero elements in the matrix across a dimension

Perhaps this is trivial, but perhaps it is not. I have spent way too much time trying to figure out how to make this work. Here is the code:
import numpy as np
import tensorflow as tf
# batch x time x events
batch = 2
time = 3
events = 4
tensor = np.random.rand(batch, time, events)
tensor[0][0][2] = 0
tensor[0][0][3] = 0
tensor[0][1][3] = 0
tensor[0][2][1] = 0
tensor[0][2][2] = 0
tensor[0][2][3] = 0
tensor[1][0][3] = 0
non_zero = ~tf.equal(tensor, 0.)
s = tf.Session()
g = tf.global_variables_initializer()
s.run(g)
s.run(non_zero)
I am trying to apply tf.nn.softmax to the non-zero values across each of the time dimensions. However, when I am using tf.boolean_mask then it actually gathers all of the non-zero values together. That is not what I want. I want to preserve the dimensions.
Here is the screenshot of what the tensor looks like:
So tf.nn.softmax should be applied to only those groups and it should "put them back" into their original positions. Does anyone know how to do this?
EDIT:
I almost found a solution that I need, with your help guys. But I am still missing one step. Assigning the softmax across each time dimension to the non-zero values:
def apply_sparse_softmax(time_vector):
    non_zeros = ~tf.equal(time_vector, 0.)
    sparse_softmax = tf.nn.softmax(tf.boolean_mask(time_vector, non_zeros))
    new_time_vector = sparse_softmax * tf.cast(non_zeros, tf.float64) # won't work because dimensions are different
    return time_vector
Please also note that this solution should handle the cases when you have zeros all across the time dimension. Then it should just stay the same.

possible duplicate: Applying tf.nn.softmax() only to positive elements of a tensor
With the help of tf.map_fn and tf.where
session.run(tf.map_fn(
lambda x : tf.where(x > 0, tf.nn.softmax(x,axis=2,name="pidgeon"), x), tensor))
Tested for np.random.seed(1992)
# tensor
[[[0.86018176 0.42148685 0. 0. ]
[0.64714 0.68271286 0.6449022 0. ]
[0.92037941 0. 0. 0. ]]
[[0.38479139 0.26825327 0.43027759 0. ]
[0.56077674 0.49309016 0.2433904 0.85396874]
[0.1267429 0.1861004 0.92251748 0.67904445]]]
# result
[[[0.34841156, 0.33845624, 0. , 0. ],
[0.28155918, 0.43949257, 0.48794109, 0. ],
[0.37002926, 0. , 0. , 0. ]],
[[0.33727059, 0.31513436, 0.2885575 , 0. ],
[0.40216839, 0.39458556, 0.23936921, 0.44145382],
[0.26056102, 0.29028008, 0.47207329, 0.37060957]]])
0.34841156 == np.exp(0.86018176) / (np.exp(0.86018176) + np.exp(0.64714) + np.exp(0.92037941))

This is my approach using numpy and tensorflow:
> tensor
array([[[0.2891092 , 0.76259227, 0. , 0. ],
[0.93660715, 0.18361367, 0.07234135, 0. ],
[0.23128076, 0. , 0. , 0. ]],
[[0.45708066, 0.76883403, 0.7584804 , 0. ],
[0.51019332, 0.73361557, 0.87442305, 0.66796383],
[0.9297317 , 0.22428208, 0.69184613, 0.06162719]]])
Find the mask of non-zero elements
non_zero = ~tf.equal(tensor, 0.)
# convert to numpy
with tf.Session() as sess:
    non_zero_mask = non_zero.eval()
Retrieve the non-zero values
non_zero_val = tensor[non_zero_mask]
> non_zero_val
array([0.2891092 , 0.76259227, 0.93660715, 0.18361367, 0.07234135,
0.23128076, 0.45708066, 0.76883403, 0.7584804 , 0.51019332,
0.73361557, 0.87442305, 0.66796383, 0.9297317 , 0.22428208,
0.69184613, 0.06162719])
Apply softmax on non-zero values
# apply softmax
soft_max = tf.nn.softmax(non_zero_val)
# convert to numpy
with tf.Session() as sess:
    soft_max_np = soft_max.eval()
> soft_max_np
array([0.04394964, 0.07056453, 0.08397696, 0.03954934, 0.0353846 ,
0.04148019, 0.05198816, 0.07100635, 0.07027497, 0.05482403,
0.06854914, 0.07891397, 0.06419332, 0.08340156, 0.0411909 ,
0.06574485, 0.0350075 ])
Update tensor with softmax applied to non-zero elements
tensor[non_zero_mask] = soft_max_np
tensor
array([[[0.04394964, 0.07056453, 0. , 0. ],
[0.08397696, 0.03954934, 0.0353846 , 0. ],
[0.04148019, 0. , 0. , 0. ]],
[[0.05198816, 0.07100635, 0.07027497, 0. ],
[0.05482403, 0.06854914, 0.07891397, 0.06419332],
[0.08340156, 0.0411909 , 0.06574485, 0.0350075 ]]])

OK, I figured out a solution from tenticon's duplicate link and his answer, although it fails when the whole time vector is zeros, so I still need to fix that. Happy to hear your suggestions. Here is the solution:
def sparse_softmax(T):
    # Creating partition based on condition:
    condition_mask = tf.cast(tf.greater(T, 0.), tf.int32)
    partitioned_T = tf.dynamic_partition(T, condition_mask, 2)
    # Applying the operation to the target partition:
    partitioned_T[1] = tf.nn.softmax(partitioned_T[1])
    # Stitching back together, flattening T and its indices to make things easier:
    condition_indices = tf.dynamic_partition(tf.range(tf.size(T)), tf.reshape(condition_mask, [-1]), 2)
    res_T = tf.dynamic_stitch(condition_indices, partitioned_T)
    res_T = tf.reshape(res_T, tf.shape(T))
    return res_T
my_softmax = tf.map_fn(
    lambda batch: tf.map_fn(lambda time_vector: sparse_softmax(time_vector), batch, dtype=tf.float64),
    tensor, dtype=tf.float64)
Another solution I came up with that still suffers when the whole vector is zeros:
def softmax(tensor):
    # tensor_ = tf.placeholder(dtype=tf.float64, shape=(4,))
    non_zeros = ~tf.equal(tensor, 0.)
    sparse_softmax = tf.nn.softmax(tf.boolean_mask(tensor, non_zeros))
    sparse_softmax_shape = tf.shape(sparse_softmax)[0]
    orig_shape = tf.shape(tensor)[0]
    shape_ = orig_shape-sparse_softmax_shape
    zeros = tf.zeros(shape=shape_, dtype=tf.float64)
    new_vec = tf.concat([sparse_softmax, zeros], axis=0)
    return new_vec
But this does not work: the function below is supposed to return a zero vector when the input vector is all zeros, but instead I get a reshape error for some sort of empty tensor.
def softmax_(tensor):
    zeros = tf.cast(tf.equal(tensor, 0.), tf.float64)
    cond_ = tf.reduce_sum(zeros)
    def true_fn():
        non_zeros = ~tf.equal(tensor, 0.)
        sparse_softmax = tf.nn.softmax(tf.boolean_mask(tensor, non_zeros))
        sparse_softmax_shape = tf.shape(sparse_softmax)[0]
        orig_shape = tf.shape(tensor)[0]
        shape_ = orig_shape-sparse_softmax_shape
        zeros = tf.zeros(shape=shape_, dtype=tf.float64)
        new_vec = tf.concat([sparse_softmax, zeros], axis=0)
        return new_vec
    def false_fn():
        return tf.zeros(shape=tf.shape(tensor), dtype=tf.float64)
    return tf.cond(tf.equal(cond_, tf.cast(tf.shape(tensor)[0], tf.float64)), false_fn, true_fn)
Still can't make it work for the vector of all zeros. Would be glad to hear about your solutions.
EDIT: actually the last code snippet works exactly how I want.
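For comparison, the same idea (softmax over the non-zero entries only, zeros left in place, all-zero vectors left unchanged) can be sketched in plain NumPy without partitioning or concatenation. This is an illustrative sketch, not a TensorFlow solution: the function name masked_softmax and the choice of the last axis as the default are assumptions.

```python
import numpy as np

def masked_softmax(x, axis=-1):
    """Softmax over the non-zero entries along `axis`; zeros stay zero.

    Hypothetical helper. Slices that are entirely zero are returned as zeros.
    """
    x = np.asarray(x, dtype=float)
    mask = x != 0
    # Masked-out entries become -inf so that exp() maps them to exactly 0.
    shifted = np.where(mask, x, -np.inf)
    # Subtract the per-slice maximum for numerical stability; an all-zero
    # slice has maximum -inf, which is replaced by 0 to avoid inf - inf.
    slice_max = np.max(shifted, axis=axis, keepdims=True)
    slice_max = np.where(np.isfinite(slice_max), slice_max, 0.0)
    exps = np.exp(shifted - slice_max)
    denom = exps.sum(axis=axis, keepdims=True)
    # Divide only where the denominator is positive; elsewhere keep zeros.
    return np.divide(exps, denom, out=np.zeros_like(exps), where=denom > 0)

tensor = np.array([[[0.5, 1.5, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.0]]])
print(masked_softmax(tensor))
```

Because the mask is applied before the exponential, the positions are preserved and no reshaping is needed; the shape of the output always equals the shape of the input.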

Predicting missing values in recommender System

I am trying to implement Non-negative Matrix Factorization so as to find the missing values of a matrix for a Recommendation Engine Project. I am using the nimfa library to implement matrix factorization. But can't seem to figure out how to predict the missing values.
The missing values in this matrix are represented by 0.
a=[[ 1. 0.45643546 0. 0.1 0.10327956 0.0225877 ]
[ 0.15214515 1. 0.04811252 0.07607258 0.23570226 0.38271325]
[ 0. 0.14433757 1. 0.07905694 0. 0.42857143]
[ 0.1 0.22821773 0.07905694 1. 0. 0.27105237]
[ 0.06885304 0.47140452 0. 0. 1. 0.13608276]
[ 0.00903508 0.4592559 0.17142857 0.10842095 0.08164966 1. ]]
import numpy as np
import nimfa
model = nimfa.Lsnmf(a, max_iter=100000, rank=4)
# fit the model
fit = model()
# get U and V matrices from fit
U = fit.basis()
V = fit.coef()
print(np.dot(U, V))
But the result is nearly the same as a, so I can't predict the zero values.
Please tell me which method to use or any other implementations possible and any possible resources.
I want to use this function to minimize the error in predicting the values.
error=|| a - UV ||_F + c*||U||_F + c*||V||_F
where _F denotes the Frobenius norm.

I have not used nimfa before, so I cannot say exactly how to do it there, but with sklearn you can use a preprocessing step to impute the missing values, like this (in scikit-learn 0.22 and later, Imputer has been replaced by sklearn.impute.SimpleImputer):
In [28]: import numpy as np
In [29]: from sklearn.preprocessing import Imputer
# prepare a numpy array
In [30]: a = np.array(a)
In [31]: a
Out[31]:
array([[ 1. , 0.45643546, 0. , 0.1 , 0.10327956,
0.0225877 ],
[ 0.15214515, 1. , 0.04811252, 0.07607258, 0.23570226,
0.38271325],
[ 0. , 0.14433757, 1. , 0.07905694, 0. ,
0.42857143],
[ 0.1 , 0.22821773, 0.07905694, 1. , 0. ,
0.27105237],
[ 0.06885304, 0.47140452, 0. , 0. , 1. ,
0.13608276],
[ 0.00903508, 0.4592559 , 0.17142857, 0.10842095, 0.08164966,
1. ]])
In [32]: pre = Imputer(missing_values=0, strategy='mean')
# transform missing_values as "0" using mean strategy
In [33]: pre.fit_transform(a)
Out[33]:
array([[ 1. , 0.45643546, 0.32464951, 0.1 , 0.10327956,
0.0225877 ],
[ 0.15214515, 1. , 0.04811252, 0.07607258, 0.23570226,
0.38271325],
[ 0.26600665, 0.14433757, 1. , 0.07905694, 0.35515787,
0.42857143],
[ 0.1 , 0.22821773, 0.07905694, 1. , 0.35515787,
0.27105237],
[ 0.06885304, 0.47140452, 0.32464951, 0.27271009, 1. ,
0.13608276],
[ 0.00903508, 0.4592559 , 0.17142857, 0.10842095, 0.08164966,
1. ]])
You can read more here.
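The factorization in the question reproduces the zeros because they are treated as real observations. One common alternative is to fit U and V only on the observed entries and read the predictions off U V at the missing positions. The sketch below does this with projected gradient descent in NumPy. It is an illustration under stated assumptions, not nimfa's method: the function name masked_nmf, the learning rate, the iteration count and the use of squared Frobenius norms (instead of the unsquared norms written in the question) are all choices made here for simplicity.

```python
import numpy as np

def masked_nmf(a, rank=2, c=0.01, lr=0.01, n_iter=5000, seed=0):
    """Non-negative factorization fitted on the non-zero entries of `a` only.

    Hypothetical helper minimizing
        0.5 * ||mask * (a - U V)||_F^2 + 0.5 * c * (||U||_F^2 + ||V||_F^2)
    where mask is 1 on observed entries and 0 on missing (zero) entries.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    mask = a != 0                       # 0 marks a missing value
    U = rng.random((a.shape[0], rank))
    V = rng.random((rank, a.shape[1]))
    for _ in range(n_iter):
        err = mask * (U @ V - a)        # residual on observed entries only
        grad_U = err @ V.T + c * U
        grad_V = U.T @ err + c * V
        # Gradient step followed by projection onto the non-negative orthant.
        U = np.maximum(U - lr * grad_U, 0.0)
        V = np.maximum(V - lr * grad_V, 0.0)
    return U, V

a = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.0]])
U, V = masked_nmf(a)
print(np.round(U @ V, 3))  # entries where a == 0 are the predictions
```

How good the predicted entries are depends on the rank, the regularization weight c and how many entries are observed, so these would need tuning on held-out values.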

Does this function compute convolution correctly?

I need to write a basic function that computes a 2D convolution between a matrix and a kernel.
I have recently got into Python, so I'm sorry for my mistakes.
My dissertation teacher said that I should write one by myself so I can handle it better and to be able to modify it for future improvements.
I have found an example of this function on a website, but I don't understand how the returned values are obtained.
This is the code (from http://docs.cython.org/src/tutorial/numpy.html )
from __future__ import division
import numpy as np
def naive_convolve(f, g):
    # f is an image and is indexed by (v, w)
    # g is a filter kernel and is indexed by (s, t),
    # it needs odd dimensions
    # h is the output image and is indexed by (x, y),
    # it is not cropped
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    # smid and tmid are number of pixels between the center pixel
    # and the edge, ie for a 5x5 filter they will be 2.
    #
    # The output size is calculated by adding smid, tmid to each
    # side of the dimensions of the input image.
    vmax = f.shape[0]
    wmax = f.shape[1]
    smax = g.shape[0]
    tmax = g.shape[1]
    smid = smax // 2
    tmid = tmax // 2
    xmax = vmax + 2*smid
    ymax = wmax + 2*tmid
    # Allocate result image.
    h = np.zeros([xmax, ymax], dtype=f.dtype)
    # Do convolution
    for x in range(xmax):
        for y in range(ymax):
            # Calculate pixel value for h at (x,y). Sum one component
            # for each pixel (s, t) of the filter g.
            s_from = max(smid - x, -smid)
            s_to = min((xmax - x) - smid, smid + 1)
            t_from = max(tmid - y, -tmid)
            t_to = min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    v = x - smid + s
                    w = y - tmid + t
                    value += g[smid - s, tmid - t] * f[v, w]
            h[x, y] = value
    return h
I don't know if this function does the weighted sum from input and filter, because I see no sum here.
I applied this with
kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
file = np.ones((5,5))
naive_convolve(file, kernel)
I got this matrix:
[[ 1. 2. 1. 1. 1. 0. -1.]
[ 2. 3. 1. 1. 1. -1. -2.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 2. 1. -1. -1. -1. -3. -2.]
[ 1. 0. -1. -1. -1. -2. -1.]]
I tried to do a manual calculation (on paper) for the first full iteration of the function and I got 'h[0,0] = 0', because of the matrix product: 'filter[0, 0] * matrix[0, 0]', but the function returns 1. I am very confused with this.
If anyone can help me understand what is going on here, I would be very grateful. Thanks! :)

Yes, that function computes the convolution correctly. You can check this using scipy.signal.convolve2d
import numpy as np
from scipy.signal import convolve2d
kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
file = np.ones((5,5))
x = convolve2d(file, kernel)
print(x)
Which gives:
[[ 1. 2. 1. 1. 1. 0. -1.]
[ 2. 3. 1. 1. 1. -1. -2.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 3. 3. 0. 0. 0. -3. -3.]
[ 2. 1. -1. -1. -1. -3. -2.]
[ 1. 0. -1. -1. -1. -2. -1.]]
It's impossible to know how to explain all this to you since I don't know where to start, and I don't know how all the other explanations aren't working for you. I think, though, that you are doing all of this as a learning exercise so you can figure this out for yourself. From what I've seen on SO, asking big questions on SO is not a substitute for working it through yourself.
Your specific question of why does
h[0,0] = 0
in your calculation not match this matrix is a good one. In fact, both are correct. The mismatch arises because the output of the convolution does not carry explicit mathematical indices; they are implied. The center, which is mathematically indicated by the indices [0,0], corresponds to x[3,3] in the matrix above.
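The index shift described above can be checked with a short, self-contained version of the full (uncropped) convolution. This sketch is an illustration written for this purpose, not the Cython tutorial's code: the function name full_convolve2d is hypothetical, and it zero-pads the image and slides the flipped kernel over it.

```python
import numpy as np

def full_convolve2d(f, g):
    """Full 2D convolution of image f with kernel g (output is not cropped).

    Hypothetical helper: zero-pad f, then take weighted sums with the
    kernel flipped in both directions.
    """
    s, t = g.shape
    padded = np.pad(f, ((s - 1, s - 1), (t - 1, t - 1)), mode="constant")
    flipped = g[::-1, ::-1]
    out = np.zeros((f.shape[0] + s - 1, f.shape[1] + t - 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # Weighted sum of the s-by-t window whose top-left corner is (x, y).
            out[x, y] = np.sum(padded[x:x + s, y:y + t] * flipped)
    return out

kernel = np.array([(1, 1, -1), (1, 0, -1), (1, -1, -1)])
image = np.ones((5, 5))
h = full_convolve2d(image, kernel)
print(h[0, 0])  # corner: only kernel[0, 0] * image[0, 0] overlaps
print(h[3, 3])  # center: the whole kernel overlaps, so this is kernel.sum()
```

At the corner only one kernel element overlaps the image, which gives 1; at the center of the output the whole kernel overlaps a block of ones, and the kernel's elements sum to 0.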
