I am working in TabPy inside Tableau and want to perform normal statistical calculations.
I am stuck with the Cp calculation. Here is the code that I wrote -
SCRIPT_REAL("
import pandas as pd
import numpy as np
from scipy import stats
# Calculate Cp
def Cp(list, _arg2, _arg3):
    arr = np.array(list)
    arr = arr.ravel()
    sigma = np.std(arr)
    Cp = float(_arg2 - _arg3) / (6*sigma)
    return Cp
",FLOAT([USL - Param]), FLOAT([LSL - Param]))
The error that I am getting is -
No Return Value
although I am clearly returning Cp. What could be the issue?
Please help.
Something like the below should solve some of the issues you're seeing. The root cause of the "No Return Value" error is that the script defines Cp but never calls it or returns anything at the top level, so TabPy has nothing to hand back to Tableau.
I haven't checked the validity of your Cp function, or whether it would work with lists versus single values.
SCRIPT_REAL("
import pandas as pd
import numpy as np
from scipy import stats
# Define Cp
def Cp(argu_1, argu_2):
    arr = np.array(list)
    arr = arr.ravel()
    sigma = np.std(arr)
    Cp_value = float(argu_1 - argu_2) / (6*sigma)
    return Cp_value

# Call the function with the variables passed in from Tableau, and return Cp_value
return Cp(<Argument 1>, <Argument 2>)
",FLOAT([USL - Param]), FLOAT([LSL - Param]))
Related
How do I execute this code?
import numpy as np
import math
x = np.arange(1,9, 0.5)
k = math.cos(x)
print(x)
I got an error like this:
TypeError: only size-1 arrays can be converted to Python scalars
Thank you in advance.
This is happening because math.cos expects a scalar and doesn't accept NumPy arrays larger than size 1. That's why your approach would still work with an array of size 1: it can be converted to a single Python scalar.
A simpler way to achieve the result is to use np.cos(x) directly, which is vectorized and works on the whole array:
import numpy as np
x = np.arange(1,9, 0.5)
k = np.cos(x)
print(x)
print(k)
If you have to use the math module, you can iterate through the array and apply math.cos to each element:
import numpy as np
import math

x = np.arange(1, 9, 0.5)
for item in x:
    k = math.cos(item)
    print(k)  # or collect into a new list, as shown below
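For reference, a small sketch that collects the results into a list instead of just printing them (the list comprehension is an addition, not from the original question):
import numpy as np
import math

x = np.arange(1, 9, 0.5)
k = [math.cos(item) for item in x]  # one cosine per element, in a plain Python list
print(k)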
You're looking for something like this?
import numpy as np
import math
x = np.arange(1,9, 0.5)
for ang in x:
    k = math.cos(ang)
    print(k)
You are trying to pass an ndarray (returned by arange) to a function that expects a single real number. Use np.cos instead.
If you want pure Python, you can apply a math function such as math.cos with map, like below:
import math
x = range(1,9)
print(list(map(math.cos, x)))
Output:
[0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354]
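Note that range(1, 9) only covers the integer angles; if you need the original 0.5 step from the question, the same map idea works directly on the NumPy array, since math.cos accepts each individual float element (a small sketch):
import numpy as np
import math

x = np.arange(1, 9, 0.5)
print(list(map(math.cos, x)))  # math.cos applied element by element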
I have a numpy array of booleans:
import numpy as np
x = np.zeros(100).astype(np.bool)
x[20] = True # say
When I try to insert this (one element per document) as part of an OrderedDict into mongodb, I get the following error:
InvalidDocument: cannot encode object: False, of type: <class 'numpy.bool_'>
This is a serialization issue I have encountered before for singleton numpy booleans.
How do I convert the numpy array into an array of python booleans for serialization?
The following did not work:
y = x.astype(bool)
You can use numpy.ndarray.tolist here.
import numpy as np
x = np.zeros(100).astype(bool)  # plain bool: the np.bool alias is removed in modern NumPy
y = x.tolist()
print(type(x))
# numpy.ndarray
print(type(x[0]))
# numpy.bool_
print(type(y))
# list
print(type(y[0]))
# bool
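To confirm the converted list is something an encoder can handle, a quick check (a sketch using the standard json module rather than MongoDB, since the failure mode is the same numpy.bool_ type):
import json
import numpy as np

x = np.zeros(100).astype(bool)
x[20] = True
print(json.dumps(x.tolist()))  # works: plain Python bools serialize fine
# json.dumps(list(x)) would fail, because the elements are still numpy.bool_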
You can try numpy.asscalar
import numpy as np
x = np.zeros(100).astype(bool)
z = [np.asscalar(x_i) for x_i in x]
print(type(z))
You can also use item(), which is the better option since asscalar is deprecated.
import numpy as np
x = np.zeros(100).astype(bool)
z = [x_i.item() for x_i in x]
print(type(z))
print(z)
For a longer array, tolist() is the better option, since it converts all the elements in a single call rather than looping over them in Python.
import numpy as np
import time
x = np.zeros(100000).astype(bool)
t1 = time.time()
z = [x_i.item() for x_i in x]
t2 = time.time()
print(t2-t1)
t1 = time.time()
z = x.tolist()
t2 = time.time()
print(t2-t1)
0.0519254207611084
0.0015206336975097656
So, I have just this week come across a solution to this (albeit my own) question from two years ago... Thanks SO!
I am going to invoke the brilliant numpyencoder (https://pypi.org/project/numpyencoder) as follows:
# Set up the problem
import numpy as np
x = np.zeros(100).astype(bool) # Note: bool <- np.bool is now deprecated!
x[20] = True
# Let's roll
import json
from numpyencoder import NumpyEncoder
sanitized_json_string = json.dumps(x, cls=NumpyEncoder)
# One could stop there since the payload is now ready to go - but just to confirm:
x_sanitized=json.loads(sanitized_json_string)
print(x_sanitized)
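Tying this back to the original MongoDB use case: the sanitized list (or simply x.tolist()) can go straight into the document, since pymongo encodes plain Python bools without complaint. A minimal sketch, with the collection and field names purely hypothetical:
from collections import OrderedDict
import numpy as np

x = np.zeros(100).astype(bool)
x[20] = True
doc = OrderedDict(flags=x.tolist())  # plain Python bools, safe to encode
# collection.insert_one(doc)         # assuming `collection` is a pymongo collection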
As part of an exercise, I needed to check whether a given sample's true mean is 1.75 by computing the t-value with NumPy and comparing it with the output from SciPy.
Code:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
np.random.seed(seed=42) # make example reproducible
n = 100
x = np.random.normal(loc=1.78, scale=.1, size=n) # the sample is here
tval, pval = stats.ttest_1samp(x, 1.75)
var_x = x.var(ddof=1)
std_x = np.sqrt(var_x)
tval1 = (x.mean() - 1.75)/(std_x*np.sqrt(n))
print("Scipy: ",tval,"\nNumpy: ",tval1)
The output from SciPy is 2.1598800019529265, while the output from NumPy is 0.021598800019529265.
I guess the logic I used is incorrect. Please suggest.
You made a mistake in the denominator. The one-sample t statistic is (sample mean - 1.75) / (s / √n), so it should be
tval1 = (x.mean() - 1.75)/(std_x / np.sqrt(n))  # std_x divided by root n, not multiplied
Multiplying by 1/√n = 1/10 instead of √n = 10 is why there is a factor of 100 difference ((1/10)/10 = 1/100) between your SciPy and NumPy outputs.
See the Wikipedia article on Student's t-test for the formula.
An example using another sample size:
np.random.seed(seed=42)
n = 369
x = np.random.normal(loc=1.78, scale=.1, size=n) # the sample is here
tval, pval = stats.ttest_1samp(x, 1.75)
var_x = x.var(ddof=1)
std_x = np.sqrt(var_x)
tval1 = (x.mean() - 1.75)/(std_x / np.sqrt(n))
print("Scipy: ",tval,"\nNumpy: ",tval1)
# Output:
# Scipy: 6.306500305262841
# Numpy: 6.306500305262841
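As a further check (not part of the original exercise), the p-value reported by ttest_1samp can be reproduced the same way from the t distribution with n - 1 degrees of freedom, assuming the default two-sided test:
import numpy as np
import scipy.stats as stats

np.random.seed(seed=42)
n = 100
x = np.random.normal(loc=1.78, scale=.1, size=n)

tval, pval = stats.ttest_1samp(x, 1.75)
t_manual = (x.mean() - 1.75) / (x.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)  # two-sided p-value

print(pval, p_manual)  # the two should agree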
I'm trying to optimize my model's parameters against data with scipy.optimize, but SciPy fails to minimize the function and find the parameter values. It just returns the initial guess given as input. It also emits the following warning: RuntimeWarning: invalid value encountered in reduce.
import pandas as pd
import numpy as np
from math import log10
import math
import scipy.optimize as op
from scipy.integrate import odeint
df1 = pd.read_csv('dataset1.csv')
z=df1.loc[: , "z"]
za=z.as_matrix(columns=None)
mu=df1.loc[: , "mu"]
mua=mu.as_matrix(columns=None)
si=df1.loc[: , "sig"]
sia=si.as_matrix(columns=None)
c = 299792.458;
H0 = 70;
m_t=0.3
d_t=0.7
mu0 = 25 + 5*log10(c/H0);
def model(x, t, m, d):
    dydt = 1/(math.sqrt((((1+x)**2)*(1+m*x))-(x*d*(2+x))))
    return dydt

def Io(zb, m, d):
    return odeint(model, 0, zb, args=(m, d))

def lnlike(theta, zb, mub, sib):
    m, d = theta
    isia2 = 1.0/np.square(sib)
    return 0.5*(np.sum(((((5*(np.log10((1+zb)*Io(zb,m,d)))+mu0)-mub)**2)*isia2) - np.log(isia2)))

nll = lambda *args: -lnlike(*args)
result = op.minimize(nll, [m_t, d_t], args=(za, mua, sia))
m_ml, d_ml = result["x"]
print(m_ml, d_ml)
I think SciPy is not able to handle the illegal values generated by the square root. If so, how can one bypass the illegal values?
The dataset1 file can be found at the link: https://drive.google.com/file/d/1HDzQ7rz_u9y63ECNkhtB49T2KBvu0qu6/view?usp=sharing
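One common workaround for this kind of failure (a sketch only, reusing nll, m_t, d_t, za, mua and sia from the code above, and not a verified fix for this particular model and dataset) is to wrap the objective so that parameter values producing NaN or infinite results return a large penalty, and to hand the wrapped objective to a derivative-free method such as Nelder-Mead, which tolerates the resulting discontinuity better than a gradient-based solver:
import numpy as np
import scipy.optimize as op

def nll_safe(theta, zb, mub, sib):
    # Replace any NaN/inf coming out of the invalid square-root region with a
    # large penalty, so the minimizer is pushed back toward valid parameters.
    val = nll(theta, zb, mub, sib)
    return 1e10 if not np.isfinite(val) else val

result = op.minimize(nll_safe, [m_t, d_t], args=(za, mua, sia), method='Nelder-Mead')
print(result.x)
This does not silence the RuntimeWarning emitted inside odeint, but it keeps the minimizer from stalling on NaN objective values.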
I have matrices whose elements can be defined as arithmetic expressions, and I have written Python code to optimise parameters in these expressions in order to minimize particular eigenvalues of the matrix. I have used SciPy to do this, but was wondering whether it is possible with NLopt, as I would like to try a few more of the algorithms it offers (derivative-free variants).
In scipy I would do something like this:
import numpy as np
from scipy.linalg import eig
from scipy.optimize import minimize
def my_func(x):
    y, w = x
    arr = np.array([[y+w, -2], [-2, w-2*(w+y)]])
    ev, ew = eig(arr)
    return ev[0]
x0 = np.array([10, 3.45]) # Initial guess
minimize(my_func, x0)
In NLopt I have tried this:
import numpy as np
from scipy.linalg import eig
import nlopt
def my_func(x, grad):
    arr = np.array([[x[0]+x[1], -2], [-2, x[1]-2*(x[1]+x[0])]])
    ev, ew = eig(arr)
    return ev[0]
opt = nlopt.opt(nlopt.LN_BOBYQA, 2)
opt.set_lower_bounds([1.0,1.0])
opt.set_min_objective(my_func)
opt.set_xtol_rel(1e-7)
x = opt.optimize([10.0, 3.5])
minf = opt.last_optimum_value()
print "optimum at ", x[0],x[1]
print "minimum value = ", minf
print "result code = ", opt.last_optimize_result()
This returns:
ValueError: nlopt invalid argument
Is NLopt able to process this problem?
my_func should return a double; the posted sample returns a complex value:
print(type(ev[0]))
# <class 'numpy.complex128'>
print(ev[0])
# (13.607794065928395+0j)
correct version of my_func:
def my_func(x, grad):
    arr = np.array([[x[0]+x[1], -2], [-2, x[1]-2*(x[1]+x[0])]])
    ev, ew = eig(arr)
    return ev[0].real
The updated sample returns:
optimum at [ 1. 1.]
minimum value = 2.7015621187164243
result code = 4
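Since the example matrix is symmetric, another option (a sketch, not taken from the original answer) is scipy.linalg.eigvalsh, which returns real eigenvalues directly, so no .real is needed. Note that eigvalsh sorts the eigenvalues in ascending order, whereas eig guarantees no order, so pick the index of the eigenvalue you actually want to minimize:
import numpy as np
from scipy.linalg import eigvalsh
import nlopt

def my_func(x, grad):
    # Symmetric matrix: eigvalsh gives real eigenvalues, sorted ascending
    arr = np.array([[x[0] + x[1], -2], [-2, x[1] - 2*(x[1] + x[0])]])
    return eigvalsh(arr)[-1]  # index selects which eigenvalue to minimize; [-1] is the largest

opt = nlopt.opt(nlopt.LN_BOBYQA, 2)
opt.set_lower_bounds([1.0, 1.0])
opt.set_min_objective(my_func)
opt.set_xtol_rel(1e-7)
x = opt.optimize([10.0, 3.5])
print("optimum at", x[0], x[1])
print("minimum value =", opt.last_optimum_value())
print("result code =", opt.last_optimize_result())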