Related
During initialization, I tried to reduce repetition in my code, so instead of:
output = (torch.zeros(2, 3),
          torch.zeros(2, 3))
I wrote:
z = torch.zeros(2, 3)
output = (z, z)
However, I found that the second method is wrong: if I assign the data to variables h and c, any change to h is also applied to c.
h, c = output
print(h, c)
h += torch.ones(2, 3)
print('-----------------')
print(h,c)
results of the test above:
tensor([[0., 0., 0.],
[0., 0., 0.]]) tensor([[0., 0., 0.],
[0., 0., 0.]])
-----------------
tensor([[1., 1., 1.],
[1., 1., 1.]]) tensor([[1., 1., 1.],
[1., 1., 1.]])
Is there a more elegant way to initialize two independent variables?
I agree that your initial version needs no modification, but if you do want an alternative, consider:
z = torch.zeros(2, 3)
output = (z, z.clone())
The reason the other version (output = (z, z)) doesn't work, as you've correctly discovered, is that no copy is made: both entries of the tuple hold the same reference to z.
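A quick check (reusing the names above) shows the two tensors are now independent:
import torch

z = torch.zeros(2, 3)
output = (z, z.clone())

h, c = output
h += torch.ones(2, 3)  # in-place update of h (which is z)
print(h)               # all ones
print(c)               # still all zeros, because c is an independent copy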
Alternatively, assign in a single statement but use a separate tensor for each, as below:
h, c = torch.zeros(2, 3), torch.zeros(2, 3)
I use UnivariateSpline from the scipy module to fit data. It works for almost all cases except for this one, which results in a Process finished with exit code -1073741819 (0xC0000005) error. If I change the smoothing factor s to 0, it also works. Any suggestions to solve this problem would help.
Update1
My working environment is:
python 3.7
scipy 1.3.2
numpy 1.17.4
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import UnivariateSpline, InterpolatedUnivariateSpline
x = np.arange(78)
y = np.asarray([
0., 0., 0., 0., 0., 0.,
0., 0., 5.03989319, 4.03191455, 4.03191455, 3.02393591,
3.02393591, 2.01595727, 2.01595727, 1.00797864, 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0.])
spl = UnivariateSpline(x, y, k=1, s=0.01)
knots = list(map(int, spl.get_knots()))
plt.plot(knots, y[knots], 'rx')
plt.plot(knots, y[knots], 'r-')
plt.plot(x, y, 'b-')
plt.show()
The combination of your s and k parameters is causing the issue.
According to the documentation, the number of knots increases until the condition sum((w[i] * (y[i]-spl(x[i])))**2, axis=0) <= s is met. However, because you have a limited number of non-zero data points, you can only add so many meaningful knots to the data set, and because you are fitting a k=1 (linear) spline rather than, for example, a cubic one, the sum of squared differences between the spline and the data never gets below the prescribed s value.
Your options include increasing k (I tested with k=3 and it worked) or increasing the s value to get a less strict condition (anything above s=0.08 worked for me). Note that your code worked when s=0 because in that case the algorithm does no smoothing at all and simply interpolates between the points (which may be what you want).
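For reference, a minimal sketch of those workarounds on the same x and y arrays as above (the exact s threshold may vary on your setup):
from scipy.interpolate import UnivariateSpline

# Option 1: use a cubic spline so the smoothing condition can be met
spl_cubic = UnivariateSpline(x, y, k=3, s=0.01)

# Option 2: keep k=1 but relax the smoothing condition
spl_linear = UnivariateSpline(x, y, k=1, s=0.1)

# Option 3: s=0 interpolates through every point (no smoothing at all)
spl_interp = UnivariateSpline(x, y, k=1, s=0)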
I have compared many Quadratic Programming (QP) solvers like cvxopt, qpoases and osqp, and found that osqp works faster and better for my application.
Now, I want to minimize an indefinite quadratic function with both equality and inequality constraints that may be violated depending on various factors. So I want to use the l1 penalty method, which penalizes the violating constraints.
For example, I have modified an example so that the constraints are violated:
import osqp
import scipy.sparse as sparse
import numpy as np
# Define problem data
P = sparse.csc_matrix([[4., 1.], [1., 2.]])
q = np.array([1., 1.])
A = sparse.csc_matrix([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])
l = np.array([0., 0., 0.2, 1.1])
u = np.array([1., 1., 0.2, 1.1])
# Create an OSQP object
prob = osqp.OSQP()
# Setup workspace and change alpha parameter
prob.setup(P, q, A, l, u, alpha=1.0)
# Solve problem
res = prob.solve()
print(res.x)
Obviously, this is an infeasible problem, so we need to change the objective function to penalize the error.
So, I need help to formulate this problem that can be solved using osqp's python interface.
Or, please let me know if there is any other Python interface available to solve this kind of constraint-violation problem.
In general, abs functions can be dangerous (they are non-differentiable). A standard way to deal with this is to add slacks. E.g.
g(x) <= 0
becomes
g(x) <= s
s >= 0
Now add a term mu*s to the objective.
For
h(x) = 0
one could do
h(x) = s1 - s2
s1, s2 >= 0
and add mu*(s1+s2) to the objective.
As usual: this is just one approach (there are other formulations).
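For instance, applying this recipe to the infeasible example in the question (the two equality constraints x1 = 0.2 and x2 = 1.1 are the ones being softened), a rough OSQP sketch could look like the following; the penalty weight mu and the slack variable names are assumptions for illustration:
import numpy as np
import scipy.sparse as sparse
import osqp

mu = 10.0  # l1 penalty weight; value chosen only for illustration

# Decision vector: [x1, x2, s1p, s1n, s2p, s2n]
P = sparse.csc_matrix(np.block([
    [np.array([[4., 1.], [1., 2.]]), np.zeros((2, 4))],
    [np.zeros((4, 2)),               np.zeros((4, 4))]]))
q = np.array([1., 1., mu, mu, mu, mu])   # adds mu*(s1p + s1n + s2p + s2n) to the objective

A = sparse.csc_matrix(np.array([
    [1., 0.,  0., 0.,  0., 0.],   # 0 <= x1 <= 1
    [0., 1.,  0., 0.,  0., 0.],   # 0 <= x2 <= 1
    [1., 0., -1., 1.,  0., 0.],   # x1 - (s1p - s1n) = 0.2
    [0., 1.,  0., 0., -1., 1.],   # x2 - (s2p - s2n) = 1.1
    [0., 0.,  1., 0.,  0., 0.],   # s1p >= 0
    [0., 0.,  0., 1.,  0., 0.],   # s1n >= 0
    [0., 0.,  0., 0.,  1., 0.],   # s2p >= 0
    [0., 0.,  0., 0.,  0., 1.],   # s2n >= 0
]))
l = np.array([0., 0., 0.2, 1.1, 0., 0., 0., 0.])
u = np.array([1., 1., 0.2, 1.1, np.inf, np.inf, np.inf, np.inf])

prob = osqp.OSQP()
prob.setup(P, q, A, l, u)
res = prob.solve()
print(res.x[:2])  # the original variables; the slacks absorb the constraint violation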
I had the same problem and this question helped a lot. This is how I solved it using the OSQP interface.
I redefined the example to be:
# Define problem data
P = sparse.csc_matrix([[4., 1.], [1., 2.]])
q = np.array([1., 1.])
A = sparse.csc_matrix([[1., 0.], [0., 1.], [1., 1.]])
l = np.array([0., 0., 3])
u = np.array([1., 1., 3])
Here the first and second variables are constrained to be at most 1, but their sum should equal 3, which makes the problem infeasible.
Now let's transform the inequality constraints as Erwin suggested by adding two slack variables.
# Redefine problem data with 2 slack variables
# Added quadratic penalties to variables s1 and s2 with penalty coefficient == 1
P = sparse.csc_matrix([[4., 1., 0., 0.], [1., 2., 0., 0.], [0., 0., 1., 0.], [0., 0., 0., 1.]])
# Zero linear penalties for s1 and s2.
q = np.array([1., 1., 0., 0.])
# First constraint is x1 <= s1, second is s1 >= 0.
# Third constraint is x2 <= s2, fourth is s2 >= 0.
A = sparse.csc_matrix([[1., 0., -1., 0.], [0., 0., 1., 0.], [0., 1., 0., -1.], [0., 0., 0., 1.], [1., 1., 0., 0.]])
l = np.array([-np.inf, 0., -np.inf, 0., 3])
u = np.array([0., np.inf, 0., np.inf, 3])
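For completeness, the workspace is then set up and solved with the same calls as in the question's example:
prob = osqp.OSQP()
prob.setup(P, q, A, l, u, alpha=1.0)
res = prob.solve()
print(res.x)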
When I run the solver, the problem has a solution, and exceeding the upper bounds is softly penalised.
iter objective pri res dua res rho time
1 -4.9403e-03 3.00e+00 5.99e+02 1.00e-01 8.31e-04s
50 1.3500e+01 1.67e-07 7.91e-08 9.96e-01 8.71e-04s
status: solved
number of iterations: 50
optimal objective: 13.5000
run time: 8.93e-04s
optimal rho estimate: 1.45e+00
[1.00 2.00 1.00 2.00]
Hope this helps somebody.
I am attempting to classify some data with the scikit-learn LDA classifier. I'm not entirely sure what to "expect" from it, but what I am getting is weird. It seems like a good opportunity to learn about either a shortcoming of the technique or a way in which I am applying it wrong. I understand that no line could completely separate this data, but it seems that there are much "better" lines than the one it is finding. I'm just using the default options. Any thoughts on how to do this better?
I'm using LDA because it is linear in the size of my dataset, although I think a linear SVM has similar complexity. Perhaps it would be better suited for such data? I will update when I have tested other possibilities.
The picture: (light blue is what my LDA classifier predicts will be dark blue)
The code:
import numpy as np
from numpy import array
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
import itertools
X = array([[ 0.23125754, 0.79170351],
[ 0.78021491, -0.24999486],
[ 0.00856446, 0.41452734],
[ 0.66381753, -0.09872504],
[-0.03178685, 0.04876317],
[ 0.65574645, -0.68214948],
[ 0.14290684, 0.38256002],
[ 0.05156987, 0.11094875],
[ 0.06843403, 0.19110019],
[ 0.24070898, -0.07403764],
[ 0.03184353, 0.4411446 ],
[ 0.58708124, -0.38838008],
[-0.00700369, 0.07540799],
[-0.01907816, 0.07641038],
[ 0.30778608, 0.30317186],
[ 0.55774143, -0.38017325],
[-0.00957214, -0.03303287],
[ 0.8410637 , 0.158594 ],
[-0.00294113, -0.00380608],
[ 0.26577841, 0.07833684],
[-0.32249375, 0.49290502],
[ 0.11313078, 0.35697211],
[ 0.41153679, -0.4471876 ],
[-0.00313315, 0.30065913],
[ 0.14344143, -0.19127107],
[ 0.04857767, 0.01339191],
[ 0.5865007 , 0.71209886],
[ 0.08157439, 0.40909955],
[ 0.72495202, 0.29583866],
[-0.09391461, 0.17976605],
[ 0.06149141, 0.79323099],
[ 0.52208024, -0.2877661 ],
[ 0.01992141, -0.00435266],
[ 0.68492617, -0.46981335],
[-0.00641231, 0.29699622],
[ 0.2369677 , 0.140319 ],
[ 0.6602586 , 0.11200433],
[ 0.25311836, -0.03085372],
[-0.0895014 , 0.45147252],
[-0.18485667, 0.43744524],
[ 0.94636701, 0.16534406],
[ 0.01887734, -0.07702135],
[ 0.91586801, 0.17693792],
[-0.18834833, 0.31944796],
[ 0.20468328, 0.07099982],
[-0.15506378, 0.94527383],
[-0.14560083, 0.72027034],
[-0.31037647, 0.81962815],
[ 0.01719756, -0.01802322],
[-0.08495304, 0.28148978],
[ 0.01487427, 0.07632112],
[ 0.65414479, 0.17391618],
[ 0.00626276, 0.01200355],
[ 0.43328095, -0.34016614],
[ 0.05728525, -0.05233956],
[ 0.61218382, 0.20922571],
[-0.69803697, 2.16018536],
[ 1.38616732, -1.86041621],
[-1.21724616, 2.72682759],
[-1.26584365, 1.80585403],
[ 1.67900048, -2.36561699],
[ 1.35537903, -1.60023078],
[-0.77289615, 2.67040114],
[ 1.62928969, -1.20851808],
[-0.95174264, 2.51515935],
[-1.61953649, 2.34420531],
[ 1.38580104, -1.9908369 ],
[ 1.53224512, -1.96537012]])
y = array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1.])
classifier = LDA()
classifier.fit(X,y)
xx = np.array(list(itertools.product(np.linspace(-4,4,300), np.linspace(-4,4,300))))
yy = classifier.predict(xx)
b_colors = ['salmon' if yyy==0 else 'deepskyblue' for yyy in yy]
p_colors = ['r' if yyy==0 else 'b' for yyy in y]
plt.scatter(xx[:,0],xx[:,1],s=1,marker='o',edgecolor=b_colors,c=b_colors)
plt.scatter(X[:,0], X[:,1], marker='o', s=5, c=p_colors, edgecolor=p_colors)
plt.show()
UPDATE: Changing from sklearn.discriminant_analysis.LinearDiscriminantAnalysis to sklearn.svm.LinearSVC, also with the default options, gives the following picture:
I think using the zero-one loss instead of the hinge loss would help, but sklearn.svm.LinearSVC doesn't seem to allow custom loss functions.
UPDATE: The loss function of sklearn.svm.LinearSVC approaches the zero-one loss as the parameter C goes to infinity. Setting C = 1000 gives me what I was originally hoping for. I'm not posting this as an answer, because the original question was about LDA.
picture:
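For reference, a minimal sketch of that experiment (reusing the X, y and xx arrays defined above):
from sklearn.svm import LinearSVC

svc = LinearSVC(C=1000)   # large C: margin violations are penalised much more heavily than the margin term
svc.fit(X, y)
yy_svc = svc.predict(xx)  # plot exactly as in the LDA example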
LDA models each class as a Gaussian, so the model for each class is determined by the class's estimated mean vector and covariance matrix.
Judging by eye alone, your blue and red classes have approximately the same mean and approximately the same covariance, which means the two Gaussians will 'sit' on top of each other and the discrimination will be poor. It also means that the separator (the blue/pink border) will be noisy, that is, it will change a lot between random samples of your data.
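A quick way to check this numerically with the X and y arrays from the question:
import numpy as np

for label in (0, 1):
    pts = X[y == label]
    print(label, pts.mean(axis=0))  # per-class mean
    print(np.cov(pts.T))            # per-class covariance matrix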
By the way, your data is clearly not linearly separable, so every linear model will have a hard time discriminating it.
If you must use a linear model, try using LDA with 3 classes, such that the top-left blue blob is classified as '0', the bottom-right blue blob as '1', and the red as '2'. This way you will get a much better linear model. You can do it by preprocessing the blue class with a clustering algorithm with K=2 clusters.
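A sketch of that preprocessing step, reusing X, y and xx from the question (KMeans is one possible clustering choice, and which blue blob gets which label is arbitrary):
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

blue_mask = (y == 1)                                  # the blue class in the plot
blue_labels = KMeans(n_clusters=2, random_state=0).fit_predict(X[blue_mask])

y3 = np.full_like(y, 2)        # the red class becomes class 2
y3[blue_mask] = blue_labels    # the two blue blobs become classes 0 and 1

classifier3 = LDA()
classifier3.fit(X, y3)
pred = classifier3.predict(xx)  # treat predictions 0 and 1 as "blue", 2 as "red" when plotting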
I am trying to compute the coefficients of the kth Chebyshev polynomial. Let's just set k to 5 for this. So far, I have the following:
a = (0, 0, 0, 0, 0, 1)  # selects the 5th Chebyshev polynomial
p = numpy.polynomial.chebyshev.Chebyshev(a)  # type here is Chebyshev
cpoly = numpy.polynomial.chebyshev.cheb2poly(p)  # trying to convert to Poly
print(cpoly.all_coeffs())
After the second line runs, I have an object of type Chebyshev, as expected. However, the third line fails to convert to type Poly and instead returns a numpy.ndarray. Thus, I get an error saying that ndarray has no attribute all_coeffs.
Anyone know how to fix this?
@cel has the right idea in the comments - you need to pass the coefficients of the Chebyshev polynomial to cheb2poly, not the object itself:
import numpy as np
cheb = np.polynomial.chebyshev.Chebyshev((0,0,0,0,0,1))
coef = np.polynomial.chebyshev.cheb2poly(cheb.coef)
print(coef)
# [ 0., 5., 0., -20., 0., 16.]
i.e. 16x^5 - 20x^3 + 5x. You can confirm that these are the correct coefficients here.
To turn these coefficients into a Polynomial object, you just need to pass the array to the Polynomial constructor:
poly = np.polynomial.Polynomial(coef)
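You can sanity-check that the two representations agree by evaluating them at a point, e.g.:
x = 0.5
print(cheb(x))  # evaluates the Chebyshev form
print(poly(x))  # evaluates the power-basis form; both give 0.5, since T5(0.5) = cos(5*arccos(0.5)) = 0.5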
In [1]: import numpy.polynomial
In [2]: p = numpy.polynomial.Chebyshev.basis(5)
In [3]: p
Out[3]: Chebyshev([ 0., 0., 0., 0., 0., 1.], [-1., 1.], [-1., 1.])
In [4]: p.convert(kind=numpy.polynomial.Polynomial)
Out[4]: Polynomial([ 0., 5., 0., -20., 0., 16.], [-1., 1.], [-1., 1.])