I am trying to write a custom layer in Keras (with a TensorFlow backend) that makes certain positions of the output binary.
For example, suppose I have [0.6, 0.8, 0.9, 0.2] and positions 1 and 3 must be binary; I would like a layer that outputs [0.6, 1, 0.9, 0],
i.e. if output[pos] > 0.5 then output[pos] = 1, else output[pos] = 0.
I wrote this, but it is not working at all...
...Layers of the net...
x = Lambda(self.adjust_positions)(x)
Here are the functions I wrote:
def update_1(self, x, pos):
    with tf.control_dependencies([tf.assign(x[pos], [1])]):
        return tf.identity(x)

def update_0(self, x, pos):
    with tf.control_dependencies([tf.assign(x[pos], [0])]):
        return tf.identity(x)

def adjust_positions(self, x):
    for pos in indexes:
        tf.cond(tf.gather(x, pos) < [0.5], self.update_0(x, pos), self.update_1(x, pos))
    return x
The error I get is:
ValueError: Sliced assignment is only supported for variables
55 def update_0(self, x, pos):
---> 56 with tf.control_dependencies([tf.assign(x[pos],[0])]):
57 return tf.identity(x)
How can I implement this functionality? Is what I have done reasonable?
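One way to get this behaviour without sliced assignment at all - a minimal sketch, assuming TF 1.x, a fixed feature dimension, and that indexes holds the positions to binarize - is to blend a thresholded copy of the tensor back in through a constant 0/1 mask:

import numpy as np
import tensorflow as tf
from keras.layers import Lambda

indexes = [1, 3]      # positions that must become binary (assumption)
n_features = 4        # feature dimension of x (assumption)

mask = np.zeros(n_features, dtype='float32')
mask[indexes] = 1.0   # 1.0 at the binary positions, 0.0 elsewhere

def adjust_positions(x):
    binarized = tf.cast(x > 0.5, x.dtype)        # threshold every position
    return x * (1.0 - mask) + binarized * mask   # keep the original values elsewhere

# x = Lambda(adjust_positions)(x)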
Why am I getting a singular matrix error when I run this as a function, when the same steps worked correctly when done piecewise before? I am using the same matrices for both. I am following along with a video and cannot see what I did differently from what he did.
import numpy as np

# Artificial data for learning purposes, provided by Alex
X = [
    [148, 24, 1385],
    [132, 25, 2031],
    [453, 11, 86],
    [158, 24, 185],
    [172, 25, 201],
    [413, 11, 86],
    [38, 54, 185],
    [142, 25, 431],
    [453, 31, 86]
]
X = np.array(X)
# Add in the bias (default) value of car before calculating features.
ones = np.ones(X.shape[0])
X = np.column_stack([ones, X])
# y values provided by Alex
y = [10000,20000,15000,20050,10000,20000,15000,25000,12000]
XTX = X.T.dot(X)
XTX_inv = np.linalg.inv(XTX)
w_full = XTX_inv.dot(X.T).dot(y)
# bias value
w0 = w_full[0]
# features
w = w_full[1:]
print(w0, w)
#Output: 25844.754055766753, array([ -16.08906468, -199.47254894, -1.22802883])
At this point the code runs as expected. However, this function gives an error:
# Error in function
def train_linear_regression(X, y):
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])

    XTX = X.T.dot(X)
    XTX_inv = np.linalg.inv(XTX)
    w = XTX_inv.dot(X.T).dot(y)

    return w[0], w[1:]
w0, w = train_linear_regression(X, y)
print(w0, w)
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-48-ed5d7ddc40b1> in <module>
----> 1 train_linear_regression(X,y)
2 frames
<__array_function__ internals> in inv(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
86
87 def _raise_linalgerror_singular(err, flag):
---> 88 raise LinAlgError("Singular matrix")
89
90 def _raise_linalgerror_nonposdef(err, flag):
LinAlgError: Singular matrix
It looks like you are adding/stacking the bias column (ones) to X twice: first in the normal flow, and a second time inside the function. Two identical columns of ones make X rank-deficient, so XTX has a determinant of 0 inside the function and cannot be inverted.
So you need to remove the addition of ones from one of the two places.
def train_linear_regression(X, y):
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])  # stacking ones here again - REMOVE
    ....
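A minimal sketch of the corrected call, assuming the stacking stays inside the function, is to pass the raw feature matrix (without the manually added ones column):

X_raw = np.array(X)[:, 1:]   # drop the ones column stacked earlier (assumes X still carries it)
w0, w = train_linear_regression(X_raw, y)
print(w0, w)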
I am implementing this code (found here: https://emukit.readthedocs.io/en/latest/notebooks/Emukit-tutorial-custom-model.html)
import numpy as np

from emukit.experimental_design import ExperimentalDesignLoop
from emukit.core import ParameterSpace, ContinuousParameter
from emukit.core.loop import UserFunctionWrapper

from sklearn.gaussian_process import GaussianProcessRegressor

x_min = -30.0
x_max = 30.0

X = np.random.uniform(x_min, x_max, (10, 1))
Y = np.sin(X) + np.random.randn(10, 1) * 0.05

sklearn_gp = GaussianProcessRegressor()
sklearn_gp.fit(X, Y)

from emukit.core.interfaces import IModel

class SklearnGPModel(IModel):
    def __init__(self, sklearn_model):
        self.model = sklearn_model

    def predict(self, X):
        mean, std = self.model.predict(X, return_std=True)
        return mean[:, None], np.square(std)[:, None]

    def set_data(self, X: np.ndarray, Y: np.ndarray) -> None:
        self.model.fit(X, Y)

    def optimize(self, verbose: bool = False) -> None:
        # There is no separate optimization routine for sklearn models
        pass

    @property
    def X(self) -> np.ndarray:
        return self.model.X_train_

    @property
    def Y(self) -> np.ndarray:
        return self.model.y_train_

emukit_model = SklearnGPModel(sklearn_gp)

p = ContinuousParameter('c', x_min, x_max)
space = ParameterSpace([p])

loop = ExperimentalDesignLoop(space, emukit_model)
loop.run_loop(np.sin, 50)
I am trying to implement this code but with an external data set. To do this, I need to understand whether I can extract the 50 x-values that are propagated through the np.sin function when loop.run_loop(np.sin, 50) is executed. Then, having obtained these 50 inputs (x-values), I need to propagate them through an external piece of software, which saves the result as a .csv file.
The information that I would have, and that needs to be "put through" loop.run_loop(), is as follows:
So, I need to make the loop.run_loop() code work by loading external results data, but I do not know how to implement that.
If I understand your question correctly, passing data does not make sense in this context. The default acquisition function will select the next input (or experiment) based on your model. Your model is updated at each iteration from the outcome of your experiment, and the next experiment depends on previous observations - it is not random.
Passing your samples independently of this loop would be significantly less informative.
In short, you need to define a function similar to np.sin that can be queried.
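As a rough sketch of what that could look like (the tool name and file names below are placeholders, not part of emukit): wrap the external software in a Python function that takes the batch of x-values emukit asks for, runs the external program, and reads the results back from the .csv it writes:

import subprocess
import numpy as np

def external_objective(X: np.ndarray) -> np.ndarray:
    # X has shape (n_points, 1); emukit expects a (n_points, 1) array back
    results = []
    for x in X:
        np.savetxt('query.csv', x.reshape(1, -1), delimiter=',')                    # hypothetical input file
        subprocess.run(['my_external_tool', 'query.csv', 'result.csv'], check=True) # hypothetical CLI
        results.append(float(np.loadtxt('result.csv', delimiter=',')))              # hypothetical output file
    return np.array(results).reshape(-1, 1)

# loop.run_loop(external_objective, 50)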
Hope this makes sense!
I'm trying to implement my own Bernoulli class with its own fit function in order to fit my train and test lists, which contain words (spam detection).
Here's my Bernoulli class:
class BernoulliNB(object):
    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        count_sample = len(X)
        separated = [[x for x, t in zip(X, y) if t == c] for c in np.unique(y)]
        self.class_log_prior_ = [np.log(len(i) / count_sample) for i in separated]
        count = np.array([np.array(i).sum(axis=0) for i in separated]) + self.alpha
        smoothing = 2 * self.alpha
        n_doc = np.array([len(i) + smoothing for i in separated])
        self.feature_prob_ = count / n_doc[np.newaxis].T
        return self

    def predict_log_proba(self, X):
        return [(np.log(self.feature_prob_) * x +
                 np.log(1 - self.feature_prob_) * np.abs(x - 1)
                 ).sum(axis=1) + self.class_log_prior_ for x in X]

    def predict(self, X):
        return np.argmax(self.predict_log_proba(X), axis=1)
And here's my implementation:
nb = BernoulliNB(alpha=1).fit(train_list, test_list)
Expected result:
Been able to fit with my class my train and test lists
But instead I get the following error:
TypeError: cannot perform reduce with flexible type
on the following line:
count = np.array([np.array(i).sum(axis=0) for i in separated]) + self.alpha
I don't know why it fails though; maybe it is because I have plain Python lists instead of NumPy arrays? I'm not even sure how to fix it.
Can someone help me or explain to me how to achieve the fitting?
I get this error message by applying sum to a structured array:
In [754]: np.array([(1,.2),(3,.3)], dtype='i,f')
Out[754]: array([(1, 0.2), (3, 0.3)], dtype=[('f0', '<i4'), ('f1', '<f4')])
In [755]: _.sum(axis=0)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-755-69a91062a784> in <module>()
----> 1 _.sum(axis=0)
/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py in _sum(a, axis, dtype, out, keepdims, initial)
34 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
35 initial=_NoValue):
---> 36 return umr_sum(a, axis, dtype, out, keepdims, initial)
37
38 def _prod(a, axis=None, dtype=None, out=None, keepdims=False,
TypeError: cannot perform reduce with flexible type
I'm guessing your error occurs in
np.array(i).sum(axis=0)
and that i produces, or is, a structured array.
I can't recreate your run just by reading your fit code. You'll need to run it with some diagnostic prints (focus on shape and dtype). A general observation: when running numpy code, never assume that you got things like shape and dtype right. Verify!
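For example, a quick diagnostic along those lines (the word lists here are made up, not your data) shows that a list of strings produces a string dtype, which is also a "flexible" type that sum refuses to reduce:

import numpy as np

separated = [['spam', 'ham'], ['eggs']]   # hypothetical word lists
for i in separated:
    a = np.array(i)
    print(a.shape, a.dtype)               # e.g. (2,) <U4  -> a flexible (string) dtype
    # a.sum(axis=0) raises "cannot perform reduce with flexible type" here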
I have the following dataframe:
ID  Text
1   qwerty
2   asdfgh
I am trying to create an md5 hash for the Text field and remove the ID field from the dataframe above. To achieve that I have created a simple pipeline with custom transformers from sklearn.
Here is the code I have used:
class cust_txt_col(sklearn.base.BaseEstimator, sklearn.base.TransformerMixin):
    def __init__(self, key):
        self.key = key

    def fit(self, x, y=None):
        return self

    def hash_generate(self, txt):
        m = hashlib.md5()
        text = str(txt)
        long_text = ' '.join(text.split())
        m.update(long_text.encode('utf-8'))
        text_hash = m.hexdigest()
        return text_hash

    def transform(self, x):
        return x[self.key].apply(lambda z: self.hash_generate(z)).values

class cust_regression_vals(sklearn.base.BaseEstimator, sklearn.base.TransformerMixin):
    def fit(self, x, y=None):
        return self

    def transform(self, x):
        x = x.drop(['Gene', 'Variation', 'ID', 'Text'], axis=1)
        return x.values

fp = pipeline.Pipeline([
    ('union', pipeline.FeatureUnion([
        ('hash', cust_txt_col('Text')),         # can pass in either a pipeline
        ('normalized', cust_regression_vals())  # or a transformer
    ]))
])
When I run this I receive the following error:
ValueError: all the input arrays must have same number of dimensions
Can you, please, tell me what is wrong with my code?
If I run the classes one by one:
For cust_txt_col I got the output below:
['3e909f222a1e06098ec7ca1ea7e84540' '1691bdba3b75df145169e0501369fce3'
'1691bdba3b75df145169e0501369fce3' ..., 'e11ec9863aaeb93f77a231319021e14d'
'851c517b2af0a46cb9bc9373b748b6ff' '0ffe46fc75d21a5347b1f1a5a84526ad']
For cust_regression_vals I got the output below:
[[qwerty],
[asdfgh]]
cust_txt_col is returning a 1d array. FeatureUnion demands that each constituent transformer returns a 2d array.
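A minimal sketch of a fix, keeping the rest of the class unchanged: reshape the hash column to shape (n_samples, 1) before returning it, so FeatureUnion can stack it next to the other transformer's output:

    def transform(self, x):
        hashed = x[self.key].apply(self.hash_generate).values
        return hashed.reshape(-1, 1)   # 2d: one column of hashes instead of a flat 1d array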
I would like to be able to draw a curve like this sample, and then turn that into a function that approximates the curve. Some pseudo Python code might look like:
>> drawing = file.open('sample_curve.jpg')
>> approx_function = function_from_drawing(drawing, x_scale=10, y_scale=5, y_offset=3)
>> print approx_function(2.2)
5.3
I figure you might be able to pick a pixel in each column that has the line going through it (and decide to use the lowest one if there is more than one) and then smooth that out with Bezier curves. I guess what I'm wondering is: does something like this exist already (of course it does...), and how can I integrate it with Python? Also, how would I go about implementing this in Python if I can't find something that is up to snuff? Would it be easier to use a vector drawing instead?
This is my preliminary, hacky solution:
from PIL import Image
import math
import numpy as np

class Pic_Function():
    def __init__(self, picture_path):
        self.picture = Image.open(picture_path)
        self.pixels = self.picture.load()
        self.columns = []
        # is there really no image method to get a numpy array of pixels?
        for i in range(self.picture.size[0]):
            self.columns.append([self.pixels[i, j] for j in range(self.picture.size[1])])
        self.first_black = []
        for i in self.columns:
            try:
                # measure from the bottom of the image (height is size[1]), so larger values are higher points on the curve
                self.first_black.append(self.picture.size[1] - i.index((0, 0, 0)))
            except ValueError:
                self.first_black.append(None)
        self.max = max([j for j in self.first_black if j is not None])
        self.min = min([j for j in self.first_black if j is not None])

    def at(self, x):
        upper_idx = int(math.ceil(x))
        lower_idx = upper_idx - 1
        try:
            upper = self.first_black[upper_idx]
            lower = self.first_black[lower_idx]
        except IndexError:
            return 0
        if None in [upper, lower]:
            return 0
        # linear interpolation between the two neighbouring columns
        up_weight, low_weight = x - lower_idx, upper_idx - x
        return (up_weight * upper + low_weight * lower) / (up_weight + low_weight)

    def norm_at(self, x, length):
        un_normed = self.at(x * self.picture.size[0] / length)
        return (un_normed - self.min) / self.max
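A quick, hypothetical usage example (the file name and scale arguments are made up):

pf = Pic_Function('sample_curve.jpg')   # the drawing from the question (assumed path)
print(pf.at(2.2))                       # interpolated curve height at column 2.2
print(pf.norm_at(0.5, 1.0))             # value halfway across the image, rescaled by min/max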