I am trying to do a correlated fit of both x and y data, however when I pass in covariance matrices for my x and y measurements, I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-173-273ef42c6f27> in <module>()
----> 1 odrout = theodr.run()
/Users/anaconda/lib/python2.7/site-packages/scipy/odr/odrpack.pyc in run(self)
1098 for attr in kwd_l:
1099 obj = getattr(self, attr)
-> 1100 if obj is not None:
1101 kwds[attr] = obj
1102
ValueError: could not convert we to a suitable array
Here is a minimal example that reproduces this error on my machine:
import numpy as np
import scipy.odr as spodr
# make x and y data for a function
xx = np.linspace(0, 2*np.pi, 100)
yy = 2.*np.sin(3*xx) - 1
# randomize both variables a bit, and make 10 measurements
# of each data point
xdat = xx + np.random.normal(scale=0.3, size=(10,100))
ydat = yy + np.random.normal(scale=0.3, size=(10, 100))
# the function I will fit to
sin = lambda beta, x: beta[0]*np.sin(beta[1] * x) + beta[2]
# the covariance matrices for both data sets, here I summed over
# the 10 measurements I made for both my x and y data
xcov = np.cov(xdat.transpose())
ycov = np.cov(ydat.transpose())
# setup the odr data
odrdat = spodr.RealData(np.mean(xdat, axis=0),
np.mean(ydat, axis=0), covx=xcov, covy=ycov)
# set up the odr model
model = spodr.Model(sin)
# make the odr object
theodr = spodr.ODR(odrdat, model, beta0=[2,3,-1])
# run the odr object
odrout = theodr.run()
I can't seem to see why the matrices I'm passing are not suitable arrays. From the docs:
Covariance of x covx is an array of covariance matrices of x and are converted to weights by performing a matrix inversion on each observation’s covariance matrix.
This makes me think I should be passing a covariance matrix for each data point, but I don't have that type of information, and I don't think I need it. For a correlated fit it should be enough to have the covariances between all the data. For instance, in scipy.optimize.curve_fit you can pass a 2-D array as the covariance matrix for the y-data (the sigma argument); you don't need one for every single point.
Is there a particular way I should be passing these covariance matrices?
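One reading of the quoted docs is that covx and covy describe the covariance between the components of a single multi-dimensional observation, not between different observations, which would explain why a (100, 100) matrix is rejected for 1-D data. Under that assumption, a possible fallback is to keep only the per-point variances (the diagonal of each covariance matrix) and pass them as sx and sy. This is only a sketch: it discards the correlations between data points, and the name sin_model is introduced here for illustration.

```python
import numpy as np
import scipy.odr as spodr

rng = np.random.default_rng(0)

# Same synthetic setup as in the question: 10 noisy measurements of 100 points
xx = np.linspace(0, 2 * np.pi, 100)
yy = 2.0 * np.sin(3 * xx) - 1
xdat = xx + rng.normal(scale=0.3, size=(10, 100))
ydat = yy + rng.normal(scale=0.3, size=(10, 100))

# Full covariance matrices between the 100 data points, shape (100, 100)
xcov = np.cov(xdat.T)
ycov = np.cov(ydat.T)

# Assumption: keep only the per-point standard deviations (diagonal terms),
# which drops all correlations between different data points
sx = np.sqrt(np.diag(xcov))
sy = np.sqrt(np.diag(ycov))

def sin_model(beta, x):
    return beta[0] * np.sin(beta[1] * x) + beta[2]

odrdat = spodr.RealData(xdat.mean(axis=0), ydat.mean(axis=0), sx=sx, sy=sy)
odrout = spodr.ODR(odrdat, spodr.Model(sin_model), beta0=[2, 3, -1]).run()
print(odrout.beta)
```

If the off-diagonal correlations matter for the fit, this fallback is not equivalent to a fully correlated fit.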
I am computing these derivatives with a Monte Carlo approach for a generic call option. I am interested in the mixed derivative (with respect to both S and sigma). When I do this with automatic differentiation, I get the error shown at the end of this post. What could be a possible solution? To explain the code, the "X" below is computed with the formula I attached:
from jax import jit, grad, vmap
import jax.numpy as jnp
from jax import random
import numpy as np  # needed for np.random.RandomState below
Underlying_asset = jnp.linspace(1.1,1.4,100)
volatilities = jnp.linspace(0.5,0.6,100)
def second_derivative_mc(S,vol):
    N = 100
    j,T,q,r,k = 10000,1.,0,0,1.
    S0 = jnp.array([S]).T #(Nx1) vector underlying asset
    C = jnp.identity(N)*vol #matrix of volatilities with 0 outside diagonal
    e = jnp.array([jnp.full(j,1.)])#(1xj) vector of "1"
    Rand = np.random.RandomState()
    Rand.seed(10)
    U= Rand.normal(0,1,(N,j)) #Random number for Brownian Motion
    sigma2 = jnp.array([vol**2]).T #Vector of variance Nx1
    first = jnp.dot(sigma2,e) #First part equation
    second = jnp.dot(C,U) #Second part equation
    X = -0.5*first+jnp.sqrt(T)*second
    St = jnp.exp(X)*S0
    P = jnp.maximum(St-k,0)
    payoff = jnp.average(P, axis=-1)*jnp.exp(-q*T)
    return payoff
greek = vmap(grad(grad(second_derivative_mc, argnums=1), argnums=0))(Underlying_asset,volatilities)
This is the error message:
> UnfilteredStackTrace Traceback (most recent call
> last) <ipython-input-78-0cc1da97ae0c> in <module>()
> 25
> ---> 26 greek = vmap(grad(grad(second_derivative_mc, argnums=1), argnums=0))(Underlying_asset,volatilities)
>
> 18 frames UnfilteredStackTrace: TypeError: Gradient only defined for
> scalar-output functions. Output had shape: (100,).
The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.
The above exception was the direct cause of the following exception:
> TypeError Traceback (most recent call
> last) /usr/local/lib/python3.7/dist-packages/jax/_src/api.py in
> _check_scalar(x)
> 894 if isinstance(aval, ShapedArray):
> 895 if aval.shape != ():
> --> 896 raise TypeError(msg(f"had shape: {aval.shape}"))
> 897 else:
> 898 raise TypeError(msg(f"had abstract value {aval}"))
> TypeError: Gradient only defined for scalar-output functions. Output had shape: (100,).
As the error message indicates, gradients can only be computed for functions that return a scalar. Your function returns a vector:
print(len(second_derivative_mc(1.1, 0.5)))
# 100
For vector-valued functions, you can compute the jacobian (which is similar to a multi-dimensional gradient). Is this what you had in mind?
from jax import jacobian
greek = vmap(jacobian(jacobian(second_derivative_mc, argnums=1), argnums=0))(Underlying_asset,volatilities)
Also, this is not what you asked about, but the function above will probably not work as you intend even if you solve the issue in the question. Numpy RandomState objects are stateful, and thus will generally not work correctly with jax transforms like grad, jit, vmap, etc., which require side-effect-free code (see Stateful Computations In JAX). You might try using jax.random instead; see JAX: Random Numbers for more information.
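To make the jax.random suggestion concrete, here is a minimal sketch, not the asker's exact model: it assumes a single underlying per call, so the function returns a scalar and grad applies directly. The name mc_call_price and its keyword defaults are made up for this illustration; the random draws come from an explicit key, so the function has no side effects.

```python
import jax
import jax.numpy as jnp

def mc_call_price(S, vol, key, n_paths=10_000, T=1.0, r=0.0, k=1.0):
    # Draw the normal shocks from an explicit key instead of a stateful RandomState
    z = jax.random.normal(key, (n_paths,))
    X = -0.5 * vol**2 * T + vol * jnp.sqrt(T) * z
    payoff = jnp.maximum(S * jnp.exp(X) - k, 0.0)
    # Averaging over paths makes the output a scalar, as grad requires
    return jnp.exp(-r * T) * jnp.mean(payoff)

key = jax.random.PRNGKey(10)

# Mixed derivative: first with respect to vol (argnums=1), then S (argnums=0)
mixed_fn = jax.grad(jax.grad(mc_call_price, argnums=1), argnums=0)

S_values = jnp.linspace(1.1, 1.4, 100)
vol_values = jnp.linspace(0.5, 0.6, 100)

# Map over (S, vol) pairs while reusing the same key for every pair
mixed = jax.vmap(mixed_fn, in_axes=(0, 0, None))(S_values, vol_values, key)
print(mixed.shape)
```

This only shows the mechanics. Because the payoff has a kink at the strike, differentiating it twice path by path may give a poor estimate of the true mixed sensitivity.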
I am trying to create a multiple linear regression model from scratch in python. Dataset used: Boston Housing Dataset from Sklearn. Since my focus was on the model building I did not perform any pre-processing steps on the data. However, I used an OLS model to calculate p-values and dropped 3 features from the data. After that, I used a Linear Regression model to find out the weights for each feature.
import pandas as pd
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
X=load_boston()
data=pd.DataFrame(X.data,columns=X.feature_names)
y=X.target
data.head()
#dropping three features
data=data.drop(['INDUS','NOX','AGE'],axis=1)
#new shape of the data (506,10) not including the target variable
#Passed the whole dataset to Linear Regression Model
model_lr=LinearRegression()
model_lr.fit(data,y)
model_lr.score(data,y)
0.7278959820021539
model_lr.intercept_
22.60536462807957 #----- intercept value
model_lr.coef_
array([-0.09649731, 0.05281081, 2.3802989 , 3.94059598, -1.05476566,
0.28259531, -0.01572265, -0.75651996, 0.01023922, -0.57069861]) #--- coefficients
Now I wanted to calculate the coefficients manually in Excel before creating the model in Python. To calculate the weight of each feature I used this formula:
[Image: formula for calculating the weights of the features]
To calculate the intercept I used the formula
b0 = mean(y) - b1*mean(x1) - b2*mean(x2) - ... - bn*mean(xn)
The intercept value from my calculations was 22.63551387 (almost the same as that of the model).
The problem is that the weights of the features from my calculation are far off from that of the sklearn linear model.
-0.002528644 #-- CRIM
-0.001028914 #-- Zn
-0.038663314 #-- CHAS
-0.035026972 #-- RM
-0.014275311 #-- DIS
-0.004058291 #-- RAD
-0.000241103 #-- TAX
-0.015035534 #-- PTRATIO
-0.000318376 #-- B
-0.006411897 #-- LSTAT
Using the first row as a test data to check my calculations, I get 22.73167044199992 while the Linear Regression model predicts 30.42657776. The original value is 24.
But as soon as I check for other rows the sklearn model is having more variation while the predictions made by the weights from my calculations are all showing values close to 22.
I think I am making a mistake in calculating the weights, but I am not sure where the problem is? Is there a mistake in my calculation? Why are all my coefficients from the calculations so close to 0?
Here is my Code for Calculating the coefficients:(beginner here)
x_1=[]
x_2=[]
for i,j in zip(data['CRIM'],y):
    mean_x=data['CRIM'].mean()
    mean_y=np.mean(y)
    c=i-mean_x*(j-mean_y)
    d=(i-mean_x)**2
    x_1.append(c)
    x_2.append(d)
print(sum(x_1)/sum(x_2))
Thank you for reading this long post, I appreciate it.
It seems like the trouble lies in the coefficient calculation. The formula you have given for calculating the coefficients is in scalar form, used for the simplest case of linear regression, namely with only one feature x.
EDIT
Now after seeing your code for the coefficient calculation, the problem is clearer.
You cannot use this equation to calculate the coefficients of each feature independent of each other, as each coefficient will depend on all the features. I suggest you take a look at the derivation of the solution to this least squares optimization problem in the simple case here and in the general case here. And as a general tip stick with matrix implementation whenever you can, as this is radically more efficient.
However, in this case we have a 10-dimensional feature vector, so in matrix notation the solution becomes β = (X^T X)^-1 X^T y.
See derivation here
I suspect you made some computational error here, as implementing this in Python using the scalar formula is more tedious and untidy than the matrix equivalent. But since you haven't shared this piece of your code, it's hard to know.
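To illustrate why the scalar formula cannot be applied feature by feature, here is a small sketch on synthetic data (not the Boston data). The two features are correlated by construction, and simple_slope is a helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two correlated features and a noiseless target with known coefficients
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2

def simple_slope(x, y):
    # Scalar (one-feature) least-squares slope
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Applying the scalar formula to each feature independently
per_feature = np.array([simple_slope(x1, y), simple_slope(x2, y)])

# Joint least-squares fit with an intercept column
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("per-feature slopes:", per_feature)
print("joint fit [b0, b1, b2]:", beta)
```

The joint fit recovers the coefficients used to build y, while the per-feature slopes absorb part of the other feature's effect through the correlation.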
Here's an example of how you would implement it:
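import numpy as np
def calc_coefficients(X,Y):
    # np.asmatrix is used instead of the np.mat alias, which newer NumPy releases no longer provide
    X = np.asmatrix(X)
    Y = np.asmatrix(Y)
    # normal equations: (X^T X)^-1 X^T y
    return np.dot((np.dot(np.transpose(X),X))**(-1),np.transpose(np.dot(Y,X)))
def score_r2(y_pred,y_true):
    ss_tot=np.power(y_true-y_true.mean(),2).sum()
    ss_res = np.power(y_true -y_pred,2).sum()
    return 1 -ss_res/ss_tot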
X = np.ones(shape=(506,11))
X[:,1:] = data.values
B=calc_coefficients(X,y)
##### Coefficients
B[:]
matrix([[ 2.26053646e+01],
[-9.64973063e-02],
[ 5.28108077e-02],
[ 2.38029890e+00],
[ 3.94059598e+00],
[-1.05476566e+00],
[ 2.82595310e-01],
[-1.57226536e-02],
[-7.56519964e-01],
[ 1.02392192e-02],
[-5.70698610e-01]])
#### Intercept
B[0]
matrix([[22.60536463]])
y_pred = np.dot(np.transpose(B),np.transpose(X))
##### First 5 rows predicted
np.array(y_pred)[0][:5]
array([30.42657776, 24.80818347, 30.69339701, 29.35761397, 28.6004966 ])
##### First 5 rows Ground Truth
y[:5]
array([24. , 21.6, 34.7, 33.4, 36.2])
### R^2 score
score_r2(y_pred,y)
0.7278959820021539
Complete Solution - 2020 - Boston dataset
As the other answer said, to compute the coefficients for the linear regression you have to compute
β = (X^T X)^-1 X^T y
This gives you the coefficients (all the β for the features plus the intercept).
Be sure to add a column of all ones to X in order to compute the intercept (more in the code).
Main.py
from sklearn.datasets import load_boston
import numpy as np
from CustomLibrary import CustomLinearRegression
from CustomLibrary import CustomMeanSquaredError
boston = load_boston()
X = np.array(boston.data, dtype="f")
Y = np.array(boston.target, dtype="f")
regression = CustomLinearRegression()
regression.fit(X, Y)
print("Projection matrix sk:", regression.coefficients, "\n")
print("bias sk:", regression.intercept, "\n")
Y_pred = regression.predict(X)
loss_sk = CustomMeanSquaredError(Y, Y_pred)
print("Model performance:")
print("--------------------------------------")
print("MSE is {}".format(loss_sk))
print("\n")
CustomLibrary.py
import numpy as np
class CustomLinearRegression():
    def __init__(self):
        self.coefficients = None
        self.intercept = None
    def fit(self, x , y):
        x = self.add_one_column(x)
        x_T = np.transpose(x)
        inverse = np.linalg.inv(np.dot(x_T, x))
        pseudo_inverse = inverse.dot(x_T)
        coef = pseudo_inverse.dot(y)
        self.intercept = coef[0]
        self.coefficients = coef[1:]
        return coef
    def add_one_column(self, x):
        '''
        Fitting with n feature columns returns n coefficients, so to get the
        intercept plus the n feature coefficients we have to add one column
        (at the beginning) filled with ones.
        '''
        X = np.ones(shape=(x.shape[0], x.shape[1] +1))
        X[:, 1:] = x
        return X
    def predict(self, x):
        predicted = np.array([])
        for sample in x:
            result = self.intercept
            for idx, feature_value_in_sample in enumerate(sample):
                result += feature_value_in_sample * self.coefficients[idx]
            predicted = np.append(predicted, result)
        return predicted
def CustomMeanSquaredError(Y, Y_pred):
    mse = 0
    for idx,data in enumerate(Y):
        mse += (data - Y_pred[idx])**2
    return mse * (1 / len(Y))
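The loops in predict and CustomMeanSquaredError can also be written as array operations. The following is a minimal sketch on toy numbers; the function names predict_vectorized and mean_squared_error are made up for this example and are not part of the answer's CustomLibrary.

```python
import numpy as np

def predict_vectorized(x, intercept, coefficients):
    # One matrix-vector product replaces the nested loops over samples and features
    return intercept + np.asarray(x) @ np.asarray(coefficients)

def mean_squared_error(y_true, y_pred):
    # Mean of the squared residuals
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

# Toy data: 3 samples, 2 features
x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
intercept = 0.5
coefficients = np.array([2.0, -1.0])

y_pred = predict_vectorized(x, intercept, coefficients)
print(y_pred)   # [0.5 2.5 4.5]

y_true = np.array([1.0, 2.0, 5.0])
print(mean_squared_error(y_true, y_pred))   # 0.25
```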
I am trying to build a simple VAR(p) model using pymc3, but I'm getting some cryptic errors about incompatible dimensions. I suspect the issue is that I'm not properly generating random matrices. Here is an attempt at VAR(1), any help would be welcome:
import numpy
import pymc3
# generate some data
y_full = numpy.zeros((2,100))
t = numpy.linspace(0,2*numpy.pi,100)
y_full[0,:] = numpy.cos(5*t)+numpy.random.randn(100)*0.02
y_full[1,:] = numpy.sin(6*t)+numpy.random.randn(100)*0.01
y_obs = y_full[:,1:]
y_lag = y_full[:,:-1]
with pymc3.Model() as model:
    beta= pymc3.MvNormal('beta',mu=numpy.ones((4)),cov=numpy.ones((4,4)),shape=(4))
    mu = pymc3.Deterministic('mu',beta.reshape((2,2)).dot(y_lag))
    y = pymc3.MvNormal('y',mu=mu,cov=numpy.eye(2),observed=y_obs)
The last line should be
y = pymc3.MvNormal('y', mu=mu.T, cov=numpy.eye(2), observed=y_obs.T)
MvNormal interprets the last dimension as the dimension of the multivariate normal vectors. With NumPy indexing, y_obs as written is a sequence of 2 vectors of length 99 (y_obs[i].shape == (99,)), whereas the model needs 99 vectors of length 2, hence the transposes.
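The shape argument can be checked with plain NumPy, without running the model. This sketch uses a constant array in place of the random beta and random numbers in place of the data; only the shapes matter here.

```python
import numpy as np

rng = np.random.default_rng(0)
y_full = rng.normal(size=(2, 100))
y_obs = y_full[:, 1:]
y_lag = y_full[:, :-1]

# Stand-in for the 4 sampled coefficients, reshaped to the 2x2 VAR(1) matrix
beta = np.ones(4)
mu = beta.reshape((2, 2)).dot(y_lag)

print(mu.shape, y_obs.shape)       # (2, 99) (2, 99): last axis is time
print(mu.T.shape, y_obs.T.shape)   # (99, 2) (99, 2): last axis is the 2-vector
```

After transposing, each row is one 2-dimensional observation, which matches the 2x2 covariance numpy.eye(2).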
Using scipy's splrep I can easily fit a test sinewave:
import numpy as np
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
plt.style.use("ggplot")
# Generate test sinewave
x = np.arange(0, 20, .1)
y = np.sin(x)
# Interpolate
tck = splrep(x, y)
x_spl = x + 0.05 # Just to show it works
y_spl = splev(x_spl, tck)
plt.plot(x_spl, y_spl)
The splrep documentation states that the default value for the weight parameter is np.ones(len(x)). However, plotting this results in a totally different plot:
tck = splrep(x, y, w=np.ones(len(x_spl)))
y_spl = splev(x_spl, tck)
plt.plot(x_spl, y_spl)
The documentation also states that the smoothing condition s is different when a weight array is given - but even when setting s=len(x_spl) - np.sqrt(2*len(x_spl)) (the default value without a weight array) the result does not strictly correspond to the original curve as shown in the plot.
What do I need to change in the code listed above in order to make the interpolation with weight array (as listed above) output the same result as the interpolation without the weights?
I have tested this with scipy 0.17.0. Gist with a test IPython notebook
You only have to change one line of your code to get the identical output:
tck = splrep(x, y, w=np.ones(len(x_spl)))
should become
tck = splrep(x, y, w=np.ones(len(x_spl)), s=0)
So, the only difference is that you have to specify s instead of using the default one.
When you look at the source code of splrep you will see why that is necessary:
if w is None:
    w = ones(m, float)
    if s is None:
        s = 0.0
else:
    w = atleast_1d(w)
    if s is None:
        s = m - sqrt(2*m)
which means that, if neither weights nor s are provided, s is set to 0 and if you provide weights but no s then s = m - sqrt(2*m) where m = len(x).
So, in your example above you compare outputs with the same weights but with different s (which are 0 and m - sqrt(2*m), respectively).
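A quick way to confirm this is to evaluate the three variants on the same test sine wave and compare them numerically. This is a small sketch; the variable names y_default, y_w_s0 and y_w_only are chosen for this example.

```python
import numpy as np
from scipy.interpolate import splrep, splev

x = np.arange(0, 20, 0.1)
y = np.sin(x)
x_spl = x + 0.05
m = len(x)

# No weights, no s: s falls back to 0 (pure interpolation)
y_default = splev(x_spl, splrep(x, y))

# Explicit unit weights with s=0: should reproduce the default
y_w_s0 = splev(x_spl, splrep(x, y, w=np.ones(m), s=0))

# Explicit unit weights, no s: s falls back to m - sqrt(2*m) (smoothing)
y_w_only = splev(x_spl, splrep(x, y, w=np.ones(m)))

print(np.allclose(y_default, y_w_s0))
print(np.max(np.abs(y_default - y_w_only)))
```

The first print should report that the two interpolating fits agree, while the second shows how far the smoothed fit moves away from them.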
I am trying to implement this algorithm to find the intercept and slope for a single variable:
Here is my Python code to update the intercept and slope, but it is not converging. The RSS increases with each iteration rather than decreasing, and after some iterations it becomes infinite. I cannot find any error in my implementation of the algorithm. How can I solve this problem? I have attached the CSV file too.
Here is the code.
import pandas as pd
import numpy as np
#Defining gradient_decend
#This Function takes X value, Y value and vector of w0(intercept),w1(slope)
#INPUT FEATURES=X(sq.feet of house size)
#TARGET VALUE=Y (Price of House)
#W=np.array([w0,w1]).reshape(2,1)
#W=[w0,
# w1]
def gradient_decend(X,Y,W):
    intercept=W[0][0]
    slope=W[1][0]
    #Here i will get a list
    #list is like this
    #gd=[sum(predicted_value-(intercept+slope*x)),
    #    sum(predicted_value-(intercept+slope*x)*x)]
    gd=[sum(y-(intercept+slope*x) for x,y in zip(X,Y)),
        sum(((y-(intercept+slope*x))*x) for x,y in zip(X,Y))]
    return np.array(gd).reshape(2,1)
#Defining Residual sum of squares
def RSS(X,Y,W):
    return sum((y-(W[0][0]+W[1][0]*x))**2 for x,y in zip(X,Y))
#Reading Training Data
training_data=pd.read_csv("kc_house_train_data.csv")
#Defining fixed parameters
#Learning Rate
n=0.0001
iteration=1500
#Intercept
w0=0
#Slope
w1=0
#Creating 2,1 vector of w0,w1 parameters
W=np.array([w0,w1]).reshape(2,1)
#Running gradient Decend
for i in range(iteration):
    W=W+((2*n)* (gradient_decend(training_data["sqft_living"],training_data["price"],W)))
    print(RSS(training_data["sqft_living"],training_data["price"],W))
Here is the CSV file.
Firstly, I find that when writing machine learning code, it's best NOT to use complex list comprehensions, because anything that you can iterate over is
easier to read when written with normal loops and indentation, and/or
possible to do with numpy broadcasting.
And using proper variable names can help you better understand the code. Using Xs, Ys, Ws as short hand is nice only if you're good at math. Personally, I don't use them in the code, especially when writing in python. From import this: explicit is better than implicit.
My rule of thumb is to remember that if I write code I can't read 1 week later, it's bad code.
First, let's decide what the input parameters for gradient descent are. You will need:
feature_matrix (The X matrix, type: numpy.array, a matrix of N * D size, where N is the no. of rows/datapoints and D is the no. of columns/features)
output (The Y vector, type: numpy.array, a vector of size N)
initial_weights (type: numpy.array, a vector of size D).
Additionally, to check for convergence you will need:
step_size (the magnitude of change when iterating through to change the weights; type: float, usually a small number)
tolerance (the criterion for stopping the iterations: when the gradient magnitude is smaller than tolerance, assume that your weights have converged; type: float, usually a small number but much bigger than the step size).
Now to the code.
def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False # Set a boolean to check for convergence
    weights = np.array(initial_weights) # make sure it's a numpy array
    while not converged:
        # compute the predictions based on feature_matrix and weights.
        # iterate through the row and find the single scalar predicted
        # value for each weight * column.
        # hint: a dot product can solve this easily
        predictions = [??? for row in feature_matrix]
        # compute the errors as predictions - output
        errors = predictions - output
        gradient_sum_squares = 0 # initialize the gradient sum of squares
        # while we haven't reached the tolerance yet, update each feature's weight
        for i in range(len(weights)): # loop over each weight
            # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
            # compute the derivative for weight[i]:
            # Hint: the derivative is = 2 * dot product of feature_column and errors.
            derivative = 2 * ????
            # add the squared value of the derivative to the gradient magnitude (for assessing convergence)
            gradient_sum_squares += (derivative * derivative)
            # subtract the step size times the derivative from the current weight
            weights[i] -= (step_size * derivative)
        # compute the square-root of the gradient sum of squares to get the gradient magnitude:
        gradient_magnitude = ???
        # Then check whether the magnitude is lower than the tolerance.
        if ???:
            converged = True
    # Once the while loop breaks, return the weights.
    return(weights)
I hope the extended pseudo-code helps you better understand the gradient descent. I won't fill in the ??? so as to not spoil your homework.
Note that your RSS code is also unreadable and unmaintainable. It's easier to do just:
>>> import numpy as np
>>> prediction = np.array([1,2,3])
>>> output = np.array([1,1,5])
>>> residual = output - prediction
>>> RSS = sum(residual * residual)
>>> RSS
5
Going through numpy basics will go a long way to machine learning and matrix-vector manipulation without going nuts with iterations: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.html
I have solved my own problem!
Here is the solution.
import numpy as np
import pandas as pd
import math
from sys import stdout
#function Takes the pandas dataframe, Input features list and the target column name
def get_numpy_data(data, features, output):
    #Adding a constant column with value 1 in the dataframe.
    data['constant'] = 1
    #Adding the name of the constant column in the feature list.
    features = ['constant'] + features
    #Creating Feature matrix(Selecting columns and converting to matrix).
    #.values is used because as_matrix() is not available in newer pandas releases
    features_matrix=data[features].values
    #Target column is converted to the numpy array
    output_array=np.array(data[output])
    return(features_matrix, output_array)
def predict_outcome(feature_matrix, weights):
    weights=np.array(weights)
    predictions = np.dot(feature_matrix, weights)
    return predictions
def errors(output,predictions):
    errors=predictions-output
    return errors
def feature_derivative(errors, feature):
    derivative=np.dot(2,np.dot(feature,errors))
    return derivative
def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False
    #Initial weights are converted to a float numpy array,
    #so that in-place updates are not truncated if integers are passed in
    weights = np.array(initial_weights, dtype=float)
    while not converged:
        # compute the predictions based on feature_matrix and weights:
        predictions=predict_outcome(feature_matrix,weights)
        # compute the errors as predictions - output:
        error=errors(output,predictions)
        gradient_sum_squares = 0 # initialize the gradient
        # while not converged, update each weight individually:
        for i in range(len(weights)):
            # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
            feature=feature_matrix[:, i]
            # compute the derivative for weight[i]:
            #predict=predict_outcome(feature,weights[i])
            #err=errors(output,predict)
            deriv=feature_derivative(error,feature)
            # add the squared derivative to the gradient magnitude
            gradient_sum_squares=gradient_sum_squares+(deriv**2)
            # update the weight based on step size and derivative:
            weights[i]=weights[i] - np.dot(step_size,deriv)
        gradient_magnitude = math.sqrt(gradient_sum_squares)
        stdout.write("\r%d" % int(gradient_magnitude))
        stdout.flush()
        if gradient_magnitude < tolerance:
            converged = True
    return(weights)
#Example of Implementation
#Importing Training and Testing Data
# train_data=pd.read_csv("kc_house_train_data.csv")
# test_data=pd.read_csv("kc_house_test_data.csv")
# simple_features = ['sqft_living', 'sqft_living15']
# my_output= 'price'
# (simple_feature_matrix, output) = get_numpy_data(train_data, simple_features, my_output)
# initial_weights = np.array([-100000., 1., 1.])
# step_size = 7e-12
# tolerance = 2.5e7
# simple_weights = regression_gradient_descent(simple_feature_matrix, output,initial_weights, step_size,tolerance)
# print simple_weights
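Note how small the step size in the example above is (7e-12) compared with the 0.0001 used in the question. With unscaled features such as square footage, the gradient is very large, so the step size has to be tiny or the updates overshoot. Here is a minimal sketch of that effect on synthetic data (not the kc_house CSV); the function names and the sqft/price numbers are invented for illustration.

```python
import numpy as np

def rss_gradient_descent(X, y, step_size, n_iter):
    # Plain batch gradient descent on RSS = sum((X w - y)^2)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        errors = X @ w - y
        w -= step_size * 2 * (X.T @ errors)
    return w

def rss(X, y, w):
    return np.sum((X @ w - y) ** 2)

rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=200)   # synthetic house sizes
price = 50_000 + 250 * sqft               # synthetic, noiseless prices
X = np.column_stack([np.ones_like(sqft), sqft])

rss_start = rss(X, price, np.zeros(2))
rss_small = rss(X, price, rss_gradient_descent(X, price, 1e-10, 50))
rss_large = rss(X, price, rss_gradient_descent(X, price, 1e-4, 5))

print("start:", rss_start)
print("small step:", rss_small)   # lower than the start
print("large step:", rss_large)   # far higher than the start
```

Rescaling the feature (for example, dividing by its standard deviation) is another common way to allow a larger step size.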
It is so simple
def mean(values):
    return sum(values)/float(len(values))
def variance(values, mean):
    return sum([(x-mean)**2 for x in values])
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar+=(x[i]-mean_x) * (y[i]-mean_y)
    return covar
def coefficients(dataset):
    x = []
    y = []
    for line in dataset:
        xi, yi = map(float, line.split(','))
        x.append(xi)
        y.append(yi)
    dataset.close()
    x_mean, y_mean = mean(x), mean(y)
    b1 = covariance(x, x_mean, y, y_mean)/variance(x, x_mean)
    b0 = y_mean-b1*x_mean
    return [b0, b1]
dataset = open('trainingdata.txt')
b0, b1 = coefficients(dataset)
n=float(input())  # on Python 2, use raw_input() instead
print(b0+b1*n)
reference : www.machinelearningmastery.com/implement-simple-linear-regression-scratch-python/