I am trying to find the equation of a function using a pandas DataFrame. This has worked in the past on other projects; however, now nothing seems to work.
I am aware that there might be easier ways to solve this, but I need this approach to work somehow.
additional_cols = ['xVerdier','fDer']
fdata = pd.DataFrame({"idx":findex,"x":xVerdier[:-1],"y":fDer})
print(fdata)
fdata = fdata.reindex(fdata.columns.tolist() + additional_cols, axis = 1)
fdata=fdata [[xVerdier[:-1],fDer]]
fdata = mpd.DataFrame(fdata)
train=fdata[:(int((len(fdata))))]
test=fdata[(int((len(fdata)))):]
regr=linear_model.LinearRegression()
train_x=np.array(train[[xVerdier]])
train_y=np.array(train[[fDer]])
regr.fit(train_x,train_y)
xVerdier is a list of x-values of a graph
[0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999, 1.0999999999999999, 1.2, 1.3, 1.4000000000000001, 1.5000000000000002, 1.6000000000000003, 1.7000000000000004, 1.8000000000000005, 1.9000000000000006, 2.0000000000000004, 2.1000000000000005, 2.2000000000000006, 2.3000000000000007, 2.400000000000001, 2.500000000000001, 2.600000000000001, 2.700000000000001, 2.800000000000001, 2.9000000000000012, 3.0000000000000013, 3.1000000000000014, 3.2000000000000015, 3.3000000000000016, 3.4000000000000017, 3.5000000000000018, 3.600000000000002, 3.700000000000002, 3.800000000000002, 3.900000000000002, 4.000000000000002, 4.100000000000001, 4.200000000000001, 4.300000000000001, 4.4, 4.5, 4.6, 4.699999999999999, 4.799999999999999, 4.899999999999999, 4.999999999999998]
fDer is a list of y-values of said graph
[1.2, 1.6000000000000003, 2.0000000000000004, 2.4, 2.799999999999999, 3.1999999999999984, 3.5999999999999988, 3.999999999999999, 4.3999999999999995, 4.8, 5.2, 5.600000000000005, 6.000000000000005, 6.400000000000006, 6.800000000000006, 7.200000000000006, 7.600000000000007, 7.999999999999998, 8.400000000000016, 8.79999999999999, 9.200000000000017, 9.600000000000009, 10.000000000000018, 10.40000000000001, 10.800000000000018, 11.20000000000001, 11.600000000000001, 12.000000000000028, 12.40000000000002, 12.799999999999976, 13.200000000000038, 13.60000000000003, 14.000000000000021, 14.400000000000013, 14.80000000000004, 15.199999999999996, 15.600000000000023, 16.00000000000005, 16.400000000000006, 16.799999999999926, 17.19999999999999, 17.59999999999991, 17.99999999999997, 18.399999999999892, 18.799999999999955, 19.199999999999946, 19.599999999999866, 19.99999999999993, 20.39999999999999, 20.799999999999912]
This is the error message
KeyError: "None of [Index([(0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999, 1.0999999999999999, 1.2, 1.3, 1.4000000000000001, 1.5000000000000002, 1.6000000000000003, 1.7000000000000004, 1.8000000000000005, 1.9000000000000006, 2.0000000000000004, 2.1000000000000005, 2.2000000000000006, 2.3000000000000007, 2.400000000000001, 2.500000000000001, 2.600000000000001, 2.700000000000001, 2.800000000000001, 2.9000000000000012, 3.0000000000000013, 3.1000000000000014, 3.2000000000000015, 3.3000000000000016, 3.4000000000000017, 3.5000000000000018, 3.600000000000002, 3.700000000000002, 3.800000000000002, 3.900000000000002, 4.000000000000002, 4.100000000000001, 4.200000000000001, 4.300000000000001, 4.4, 4.5, 4.6, 4.699999999999999, 4.799999999999999, 4.899999999999999), (1.2, 1.6000000000000003, 2.0000000000000004, 2.4, 2.799999999999999, 3.1999999999999984, 3.5999999999999988, 3.999999999999999, 4.3999999999999995, 4.8, 5.2, 5.600000000000005, 6.000000000000005, 6.400000000000006, 6.800000000000006, 7.200000000000006, 7.600000000000007, 7.999999999999998, 8.400000000000016, 8.79999999999999, 9.200000000000017, 9.600000000000009, 10.000000000000018, 10.40000000000001, 10.800000000000018, 11.20000000000001, 11.600000000000001, 12.000000000000028, 12.40000000000002, 12.799999999999976, 13.200000000000038, 13.60000000000003, 14.000000000000021, 14.400000000000013, 14.80000000000004, 15.199999999999996, 15.600000000000023, 16.00000000000005, 16.400000000000006, 16.799999999999926, 17.19999999999999, 17.59999999999991, 17.99999999999997, 18.399999999999892, 18.799999999999955, 19.199999999999946, 19.599999999999866, 19.99999999999993, 20.39999999999999, 20.799999999999912)], dtype='object')] are in the [columns]"
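The KeyError suggests that the lists of values themselves were passed as column labels, whereas the DataFrame's columns are named "idx", "x" and "y". A minimal sketch of selecting by name follows. It assumes the goal is a straight-line fit of y against x, uses stand-in data that approximates the lists above (y = 4x + 1.2), and swaps in `np.polyfit` for `linear_model.LinearRegression` to keep the example dependency-light.

```python
import numpy as np
import pandas as pd

# Stand-in data approximating the lists in the question (assumption):
# 51 x-values from 0 to 5 and 50 y-values following y = 4x + 1.2.
x_values = [round(0.1 * i, 1) for i in range(51)]
y_values = [4 * x + 1.2 for x in x_values[:-1]]

fdata = pd.DataFrame({"x": x_values[:-1], "y": y_values})

# Select columns by their names, not by the lists of values they hold.
train_x = fdata[["x"]].to_numpy()  # 2-D array, shape (50, 1)
train_y = fdata["y"].to_numpy()    # 1-D array, shape (50,)

# np.polyfit stands in for LinearRegression here (degree-1 least squares).
slope, intercept = np.polyfit(train_x.ravel(), train_y, deg=1)
print(f"y = {slope:.3f} * x + {intercept:.3f}")
```

The same name-based selection (`fdata[["x"]]` and `fdata["y"]`) should also work as input to a scikit-learn regressor, which expects a 2-D feature array.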
From a sorted list of α-values (e.g. 0.01, 0.2, 0.5, 1.1, 1.5, 2.4, 3.1, 4.0, 5.7, 6.3), with a confidence level set at 0.8, I'm trying to obtain the value located 80% of the way through the array. I want to use this alpha score to build prediction intervals.
import numpy as np
alpha_scores = np.array([0.01, 0.2, 0.5, 1.1, 1.5, 2.4, 3.1, 4.0, 5.7, 6.3])
confidence_level = 0.80
confidence_percentile = int(np.floor(confidence_level * (alpha_scores.size + 1))) - 1 # Index of the score at the confidence level
alpha_index = min(max(confidence_percentile, 0), alpha_scores.size - 1) # Clamp the index to the valid range
err_dist = alpha_scores[alpha_index]
Would this be the correct way to obtain this? I get a score, but it does not always match the value I expect.
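For reference, the index computation described above can be sketched as a small function. The name `alpha_at_confidence` is hypothetical, and the formula is the one from the question (floor of confidence level times n + 1, minus 1, clamped to the array bounds); whether that is the right quantile convention for a given prediction-interval method is an assumption to check.

```python
import numpy as np

def alpha_at_confidence(alpha_scores, confidence_level):
    """Return (index, score) at the given confidence level.

    Hypothetical helper: index = floor(confidence_level * (n + 1)) - 1,
    clamped to the valid range [0, n - 1].
    """
    # Sort a copy so the result does not depend on the input order.
    scores = np.sort(np.asarray(alpha_scores, dtype=float))
    n = scores.size
    index = int(np.floor(confidence_level * (n + 1))) - 1
    index = min(max(index, 0), n - 1)
    return index, scores[index]

alpha_scores = [0.01, 0.2, 0.5, 1.1, 1.5, 2.4, 3.1, 4.0, 5.7, 6.3]
index, err_dist = alpha_at_confidence(alpha_scores, 0.80)
print(index, err_dist)  # index 7, score 4.0
```

If the scores are not sorted before indexing, the selected value depends on their order, which may be one reason a result differs between runs.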
I'm not sure what is wrong with this function, but I would appreciate any help I could get on it. I'm new to Python and a bit confused.
def summer(tables):
    """
    MODIFIES the table to add a column summing the previous elements in the row.
    Example: Suppose that a is
        [['First', 'Second', 'Third'], [0.1, 0.3, 0.5], [0.6, 0.2, 0.7], [0.5, 1.1, 0.1]]
    then summer(a) modifies the table a so that it is now
        [['First', 'Second', 'Third', 'Sum'],
         [0.1, 0.3, 0.5, 0.9], [0.6, 0.2, 0.7, 1.5], [0.5, 1.1, 0.1, 1.7]]
    Parameter tables: the nested list to process
    """
    numrows = len(tables)
    sums = []
    for n in range(numrows):
        sums = [sum(item) for item in tables]
        return sums
This is what you are looking for. You don't need to create a new list; you just need to update your variable tables in place. Also, putting a return statement inside the loop makes the function exit during the first iteration. It is worth reviewing how for loops work and what the return statement actually does.
def summer(tables):
    """
    MODIFIES the table to add a column summing the previous elements in the row.
    Example: Suppose that a is
        [['First', 'Second', 'Third'], [0.1, 0.3, 0.5], [0.6, 0.2, 0.7], [0.5, 1.1, 0.1]]
    then summer(a) modifies the table a so that it is now
        [['First', 'Second', 'Third', 'Sum'],
         [0.1, 0.3, 0.5, 0.9], [0.6, 0.2, 0.7, 1.5], [0.5, 1.1, 0.1, 1.7]]
    Parameter tables: the nested list to process
    """
    # The first row holds the column names, so it gets the new header.
    tables[0].append('Sum')
    # Every remaining row gets the sum of its own values appended.
    for i in range(1, len(tables)):
        tables[i].append(sum(tables[i]))
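A quick way to check the function is to run it on the table from the docstring. The sketch below repeats the function body so that it runs on its own; the variable name `a` follows the docstring, and the sums are rounded for printing only because floating-point addition can leave tiny rounding errors.

```python
def summer(tables):
    """Append a 'Sum' header and a per-row sum to the nested list, in place."""
    tables[0].append('Sum')
    for i in range(1, len(tables)):
        tables[i].append(sum(tables[i]))

a = [['First', 'Second', 'Third'],
     [0.1, 0.3, 0.5],
     [0.6, 0.2, 0.7],
     [0.5, 1.1, 0.1]]

result = summer(a)  # the function modifies a and returns None

print(a[0])
for row in a[1:]:
    # Round only for display.
    print([round(value, 10) for value in row])
```

Because the table is modified in place, the function has no return value; the caller keeps using the original variable.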
I am running some code that I originally developed with SciPy 0.18. Now using SciPy 0.19 I often get warning messages like this:
/usr/lib/python3/dist-packages/scipy/linalg/basic.py:223:
RuntimeWarning: scipy.linalg.solve Ill-conditioned matrix detected.
Result is not guaranteed to be accurate. Reciprocal condition number:
1.8700410190617105e-17 ' condition number: {}'.format(rcond), RuntimeWarning)
Here is a small snippet that generates the message above:
from scipy import interpolate
xx = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5]
yy = [2.5, 1.5, 0.5, 2.5, 1.5, 0.5, 2.5, 1.5, 0.5]
vals = [30.0, 20.0, 10.0, 31.0, 21.0, 11.0, 32.0, 22.0, 12.0]
f = interpolate.Rbf(xx, yy, vals, epsilon=100)
In spite of the warning the results are correct. What is causing this warning? Can it be suppressed somehow?
When inspecting the matrix with
numpy.linalg.cond(f.A)
6.213533820748747e+16
you'll find that its condition number is about 6e16, on the order of the reciprocal of double-precision machine epsilon, meaning that your solution may contain no significant digits.
Try, e.g.,
b = numpy.random.rand(f.A.shape[0])
x = numpy.linalg.solve(f.A, b)
print(numpy.dot(f.A, x) - b)
[-0.22342786 -0.06718507 -0.13027724 -0.09972579 -0.16589076 -0.06328093
0.05480577 -0.12606864 0.02067541]
If x were indeed a solution, all those numbers would be close to 0. Take it easy on the epsilon (use a much smaller value) to get something meaningful.
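To see how epsilon drives the conditioning, the interpolation matrix can be rebuilt by hand with NumPy alone. This is a sketch under the assumption that `Rbf` uses its default multiquadric basis, sqrt((r/epsilon)**2 + 1); the helper name `multiquadric_matrix` is made up for illustration.

```python
import numpy as np

xx = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5]
yy = [2.5, 1.5, 0.5, 2.5, 1.5, 0.5, 2.5, 1.5, 0.5]
points = np.column_stack([xx, yy])

# Pairwise Euclidean distances between all nodes, shape (9, 9).
r = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

def multiquadric_matrix(r, epsilon):
    """Assumed default Rbf basis: sqrt((r / epsilon)**2 + 1)."""
    return np.sqrt((r / epsilon) ** 2 + 1)

conds = {}
for epsilon in (100.0, 1.0):
    conds[epsilon] = np.linalg.cond(multiquadric_matrix(r, epsilon))
    print(f"epsilon={epsilon:g}: condition number ~ {conds[epsilon]:.3g}")
```

With epsilon much larger than the node spacing, every entry of the matrix is close to 1, so the rows are nearly identical and the matrix is close to singular. An epsilon comparable to the spacing gives a far better conditioned system.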
I have a one dimensional NumPy array:
a = numpy.array([2,3,3])
I would like to have the product of all elements, 18 in this case.
The only way I could find to do this would be:
from functools import reduce  # needed on Python 3, where reduce is no longer a builtin
b = reduce(lambda x,y: x*y, a)
Which looks pretty, but is not very fast (I need to do this a lot).
Is there a numpy method that does this? If not, what is the most efficient way of doing this? My real world arrays have 39 float elements.
In NumPy you can try:
numpy.prod(a)
For a larger array numpy.arange(1,40) / 10.:
array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1,
1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2,
2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3,
3.4, 3.5, 3.6, 3.7, 3.8, 3.9])
your reduce(lambda x,y: x*y, a) needs 24.2µs,
numpy.prod(a) needs 3.9µs.
EDIT: a.prod() needs 2.67µs. Thanks to J.F. Sebastian!
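The timings above depend on the machine and the NumPy version, so they are worth re-measuring. A minimal sketch using the standard `timeit` module follows; the helper name `time_us` and the repetition count are arbitrary choices, and no particular timings are claimed.

```python
import timeit
from functools import reduce

import numpy as np

a = np.arange(1, 40) / 10.0

def time_us(fn, number=2000):
    """Average time per call in microseconds (hypothetical helper)."""
    return timeit.timeit(fn, number=number) / number * 1e6

candidates = {
    "reduce(lambda x,y: x*y, a)": lambda: reduce(lambda x, y: x * y, a),
    "numpy.prod(a)": lambda: np.prod(a),
    "a.prod()": lambda: a.prod(),
}

# Check first that all three approaches compute the same product.
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    print(f"{name}: {time_us(fn):.2f} µs per call")
```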
Or if the loss of numerical accuracy is not a problem, we can do
>>> numpy.exp(numpy.sum(numpy.log(a)))
17.999999999999996
>>> numpy.prod(a)
18
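One caveat with the logarithm route is that it only makes sense for strictly positive values, since the log of zero or of a negative number is not a usable real value. A small sketch that makes this restriction explicit; the function name `prod_via_logs` is made up for illustration.

```python
import numpy as np

def prod_via_logs(values):
    """Product computed as exp(sum(log(values))); valid only for positive values."""
    values = np.asarray(values, dtype=float)
    if np.any(values <= 0):
        raise ValueError("log-based product requires strictly positive values")
    return float(np.exp(np.sum(np.log(values))))

a = np.array([2, 3, 3])
log_product = prod_via_logs(a)
direct_product = np.prod(a)

print(log_product, direct_product)
print(np.isclose(log_product, direct_product))
```

The log-based form is mainly of interest when the direct product would overflow or underflow, in which case the sum of logs can be kept instead of the product itself.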