Good morning!
I am looking to create a Python analog of some R code.
Basically, I have a function that, among other things, takes a seed provided by the user (default is NULL) and a distribution (default is rnorm), and outputs 9 random numbers saved as a vector "e". This is what it looked like in R:
function (...other variables..., seed=NULL, dist=rnorm)
...other code...
e <- dist(9,...)
Now I'm converting the function to Python, but I can't find an analog that lets the user replace the default seed and distribution.
Here's what I have so far:
def (...other variables..., seed=None, dist=?):
...other code...
e = dist(9)
See the numpy.random.normal function.
For instance:
import numpy as np
np.random.normal(0,1,9)
array([ 0.33593283, -0.18149502, 0.43148566, 1.46831794, -0.72244867,
-1.40048855, 0.52366471, 0.34099135, 0.71654992])
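To cover the seed and distribution arguments from the question as well, one option is sketched below. It assumes NumPy 1.17 or later (for np.random.default_rng) and that the distribution is passed as the name of a Generator method such as "normal" or "uniform". The function name draw_errors and its parameters are illustrative, not part of the original code.

```python
import numpy as np

def draw_errors(seed=None, dist="normal", size=9, **dist_kwargs):
    """Draw `size` random numbers from the Generator method named by `dist`."""
    # seed=None gives fresh entropy, mirroring seed=NULL in the R version
    rng = np.random.default_rng(seed)
    # Look up the sampling method by name, e.g. rng.normal or rng.uniform
    sampler = getattr(rng, dist)
    return sampler(size=size, **dist_kwargs)

e = draw_errors(seed=123)                                   # standard normal draws
u = draw_errors(seed=123, dist="uniform", low=0.0, high=1.0)
```

Calling the function twice with the same seed returns the same numbers, which is usually what a user-supplied seed is for.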
I am trying to find the parametric equations for a certain set of plots. However, when I run the code I only get the first value back. The code is from a website, as I am not very proficient in Python. I am using 3.6.5. Here is the code:
import numpy as np
import scipy as sp
from fractions import Fraction
def trigSeries(x):
    f=sp.fft(x)
    n=len(x)
    A0=abs(f[0])/n
    A0=Fraction(A0).limit_denominator(1000)
    hn=np.ceil(n/2)
    f=f[1:int(hn)]
    A=2*abs(f)/n
    P=sp.pi/2-sp.angle(f)
    A=map(Fraction,A)
    A=map(lambda a:a.limit_denominator(1000),A)
    P=map(Fraction,P)
    P=map(lambda a:a.limit_denominator(1000),P)
    s=map(str,A)
    s=map(lambda a: a+"*np.sin(", s)
    s=map(lambda a,b,c :
          a+str(b)+"-2*sp.pi*t*"+str(c)+")",
          s,P,range(1,len(list(P))+1))
    s="+".join(s)
    s=str(A0)+"+"+s
    return s
x=[5041,4333,3625,3018,2816,2967,3625,4535,5800,6811,7823,8834,8429,7418,6305,5193,4181,3018,3018,3777,4687,5496,6912,7974,9087]
y=[4494,5577,6930,8825,10990,13426,14509,15456,15456,15186,15321,17486,19246,21005,21276,21952,22223,23712,25877,27501,28178,28448,27636,26960,25742]
xf=trigSeries(x)
print(xf)
Any help would be appreciated.
I tried to make the code work, but I could not manage it.
The problem is that calling map(...) creates an iterator, so in order to print its contents you have to loop over it:
for data in iterator:
    print(data)
The catch here is that an iterator can be consumed only once: after it has been looped over (or passed to list), iterating over it again returns nothing.
You could convert all the lambdas into for loops, but the three-argument lambda needs some thought.
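The problem is at this step:
s=map(lambda a,b,c :
      a+str(b)+"-2*sp.pi*t*"+str(c)+")",
      s,P,range(1,len(list(P))+1))
It returns an empty result, because len(list(P)) consumes the iterator P before map reads from it, and map stops as soon as any of its inputs is exhausted. To resolve it, convert s and P to lists just before feeding them to this map call. Add these two lines above it: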
s = list(s)
P = list(P)
Output for your example
134723/25+308794/391*np.sin(-1016/709-2*sp.pi*t*1)+2537094/989*np.sin(641/835-2*sp.pi*t*2)+264721/598*np.sin(-68/241-2*sp.pi*t*3)+285344/787*np.sin(-84/997-2*sp.pi*t*4)+118145/543*np.sin(-190/737-2*sp.pi*t*5)+281400/761*np.sin(-469/956-2*sp.pi*t*6)+1451/8*np.sin(-563/489-2*sp.pi*t*7)+122323/624*np.sin(-311/343-2*sp.pi*t*8)+115874/719*np.sin(-137/183-2*sp.pi*t*9)+171452/861*np.sin(-67/52-2*sp.pi*t*10)+18152/105*np.sin(-777/716-2*sp.pi*t*11)+24049/125*np.sin(-107/76-2*sp.pi*t*12)
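The iterator behaviour behind this bug can be reproduced in a few lines. This is a minimal Python 3 sketch with made-up values, not the original series code:

```python
# In Python 3, map returns a one-shot iterator rather than a list
squares = map(lambda v: v * v, [1, 2, 3])

first_pass = list(squares)    # consumes the iterator
second_pass = list(squares)   # nothing is left to read

# map over several inputs stops at the shortest one, so a single
# exhausted iterator empties the whole result
phases = map(str, [10, 20, 30])
count = len(list(phases))     # exhausts `phases`, as len(list(P)) does
combined = list(map(lambda a, b: a + b, ["x", "y", "z"], phases))

print(first_pass, second_pass, count, combined)
```

Materialising the data once with list(...) and reusing that list avoids both problems.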
I'm looking for a simple way to perform multi-factor ANOVA in Python. A 2-factor nested ANOVA is what I'm after, and the spm1d Python module is one way to do that; however, I am having an issue.
http://www.spm1d.org/doc/Stats1D/anova.html#two-way-nested-anova
For any of the nested examples, no F statistic or p values are ever printed, and I cannot find any way to print them or assign them to a variable.
To go through the motions of running one of their examples, where B is nested inside A, with Y observations:
import numpy as np
from matplotlib import pyplot
import spm1d
dataset = spm1d.data.uv1d.anova2nested.SPM1D_ANOVA2NESTED_3x3()
Y,A,B = dataset.get_data()
#(1) Conduct ANOVA:
alpha = 0.05
FF = spm1d.stats.anova2nested(Y, A, B, equal_var=True)
FFi = FF.inference(0.05)
print( FFi )
#(2) Plot results:
pyplot.close('all')
FFi.plot(plot_threshold_label=True, plot_p_values=True)
pyplot.show()
The only indication of statistical significance provided is whether the h0 hypothesis is rejected or not.
> print( FFi )
SPM{F} inference list
design : ANOVA2nested
nEffects : 2
Effects:
A z=(1x101) array df=(2, 6) h0reject=True
B z=(1x101) array df=(6, 36) h0reject=False
In reality, that should be enough. However, in science, scientists like to think of something as more or less significant, which is actually kind of crap... significance is binary. But that's how they think about it, so I have to play along in order to get work published.
The example code produces a matplotlib plot, and this DOES have the f statistic and p_values on it!
#(2) Plot results:
pyplot.close('all')
FFi.plot(plot_threshold_label=True, plot_p_values=True)
pyplot.show()
But I can't seem to get any output which prints it.
FFi.get_p_values
and
FFi.get_f_values
produce the output:
<bound method SPMFiList.get_p_values <kabammi edit -- or get_f_values> of SPM{F} inference list
design : ANOVA2nested
nEffects : 2
Effects:
A z=(1x101) array df=(2, 6) h0reject=True
B z=(1x101) array df=(6, 36) h0reject=False
So I don't know what to do. Clearly FFi.plot can access the p values (with plot_p_values), but FFi.get_p_values can't!? Can anyone lend a hand?
cheers,
K
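The easiest way to get the p values is to use the get_p_values method that you mention; you just need to call it by adding () to the end. Without the parentheses, the expression only refers to the method, which is why you see "<bound method ...>" instead of the values.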
p = FFi.get_p_values()
print(p)
This yields:
([0.016584151119287904], [])
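The difference between referring to a method and calling it can be seen with any Python object. The class below is a hypothetical stand-in written for illustration; it is not part of spm1d and does not reflect how spm1d stores its results.

```python
class InferenceResult:
    """Hypothetical stand-in for an object that exposes a getter method."""

    def __init__(self, p_values):
        self._p_values = p_values

    def get_p_values(self):
        return self._p_values

result = InferenceResult([0.017])

method_reference = result.get_p_values   # no parentheses: the bound method itself
p_values = result.get_p_values()         # parentheses: the method is executed

print(method_reference)   # prints a "<bound method ...>" description
print(p_values)
```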
To see more detailed information for each effect in 2+-way ANOVA, including p values, use print along with the individual F statistics like this:
print( FFi[0] )
print( FFi[1] )
The first print statement will produce output like this:
SPM{F} inference field
SPM.effect : Main A
SPM.z : (1x101) raw test stat field
SPM.df : (2, 6)
SPM.fwhm : 11.79254
SPM.resels : (1, 8.47993)
Inference:
SPM.alpha : 0.050
SPM.zstar : 24.30619
SPM.h0reject : True
SPM.p_set : 0.017
SPM.p_cluster : (0.017)
You can retrieve the clusters' p values like this:
p = [F.p for F in FFi]
which gives the same result as calling get_p_values.
Note that there are no p values in this case for FFi[1] because the test statistic fails to cross the alpha-defined threshold (see the "Main B" panel in the figure above). If you need to report p values in this case as well, one option is simply to report "p > alpha". More precise p values are available parametrically up to about p = 0.5, but larger p values are not very accurate with parametric methods, so if you need p values for all cases, consider using the nonparametric version: spm1d.stats.nonparam.anova2nested.
In Python, I'm trying to write an algorithm alias_freq(f_signal,f_sample,n), which behaves as follows:
def alias_freq(f_signal,f_sample,n):
    f_Nyquist=f_sample/2.0
    if f_signal<=f_Nyquist:
        return n'th frequency higher than f_signal that will alias to f_signal
    else:
        return frequency (lower than f_Nyquist) that f_signal will alias to
The following is code that I have been using to test the above function (f_signal, f_sample, and n below are chosen arbitrarily just to fill out the code)
import numpy as np
import matplotlib.pyplot as plt
t=np.linspace(0,2*np.pi,500)
f_signal=10.0
y1=np.sin(f_signal*t)
plt.plot(t,y1)
f_sample=13.0
t_sample=np.linspace(0,int(f_sample)*(2*np.pi/f_sample),f_sample)
y_sample=np.sin(f_signal*t_sample)
plt.scatter(t_sample,y_sample)
n=2
f_alias=alias_freq(f_signal,f_sample,n)
y_alias=np.sin(f_alias*t)
plt.plot(t,y_alias)
plt.xlim(-.1,2*np.pi+.1)
plt.show()
My thinking is that if the function works properly, the plots of both y1 and y_alias will hit every scattered point from y_sample. So far I have been completely unsuccessful in getting either the if statement or the else statement in the function to do what I think it should, which makes me believe that either I don't understand aliasing nearly as well as I want to, or my test code is no good.
My questions are: Preliminarily, is the test code I'm using sound for what I'm trying to do? And primarily, what is the alias_freq function that I am looking for?
Also please note: If some Python package has a function just like this already built in, I'd love to hear about it - however, part of the reason I'm doing this is to give myself a device to understand phenomena like aliasing better, so I'd still like to see what my function should look like.
If I understood the question correctly, the frequency of the aliased signal is abs(sampling_rate * n - f_signal), where n is the integer for which sampling_rate * n lies closest to f_signal.
Thus:
n = round(f_signal / float(f_sample))
f_alias = abs(f_sample * n - f_signal)
This should work for frequencies under and over Nyquist.
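As a rough check of this formula, the sketch below wraps it in a function and compares the original and aliased signals at the sample instants. It assumes frequencies in cycles per unit time (hence the factor 2*pi) and samples taken at t = k / f_sample. It uses cosines because a sine can come back with its sign flipped. The helper name samples_match is illustrative.

```python
import math

def alias_freq(f_signal, f_sample):
    """Frequency at or below f_sample / 2 that f_signal shows up as when sampled."""
    n = round(f_signal / f_sample)   # integer multiple of f_sample nearest f_signal
    return abs(f_sample * n - f_signal)

def samples_match(f_signal, f_sample, n_samples=50):
    """True if original and aliased cosines agree at every sample instant."""
    f_alias = alias_freq(f_signal, f_sample)
    for k in range(n_samples):
        t = k / f_sample
        original = math.cos(2 * math.pi * f_signal * t)
        aliased = math.cos(2 * math.pi * f_alias * t)
        if not math.isclose(original, aliased, abs_tol=1e-9):
            return False
    return True

print(alias_freq(13.0, 10.0), samples_match(13.0, 10.0))
print(alias_freq(7.0, 10.0), samples_match(7.0, 10.0))
```

Both 13 and 7 sampled at 10 fold down to 3, one from above the sampling rate and one from between the Nyquist frequency and the sampling rate.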
I figured out the answer to my question and just realized that I forgot to post it here, sorry. It turns out it was something silly: Antii's answer is basically right, but the way I wrote the test code, I need f_sample-1 in the alias_freq function where I just had f_sample (np.linspace includes both endpoints, so f_sample points span only f_sample-1 sampling intervals). There's still a phase shift that shows up sometimes, but plugging in either 0 or pi for the phase shift has worked for me every time; I think it's just due to even or odd folding. The working function and test code are below.
import numpy as np
import matplotlib.pyplot as plt
#Given a sample frequency and a signal frequency, return frequency that signal frequency will be aliased to.
def alias_freq(f_signal,f_sample,n):
    f_alias = np.abs((f_sample-1)*n - f_signal)
    return f_alias
t=np.linspace(0,2*np.pi,500)
f_signal=13
y1=np.sin(f_signal*t)
plt.plot(t,y1)
f_sample=7
t_sample=np.linspace(0,int(f_sample)*(2*np.pi/f_sample),f_sample)
y_sample=np.sin((f_signal)*t_sample)
plt.scatter(t_sample,y_sample)
f_alias=alias_freq(f_signal,f_sample,3)
y_alias=np.sin(f_alias*t+np.pi)#Sometimes with phase shift, usually np.pi for integer f_signal and f_sample, sometimes without.
plt.plot(t,y_alias)
plt.xlim(-.1,2*np.pi+.1)
plt.show()
Here is a Python aliased frequency calculator. It folds f about the Nyquist frequency fs / 2:
def get_aliased_freq(f, fs):
    """
    Return the aliased frequency of f sampled at fs.
    """
    fn = fs / 2.0  # Nyquist frequency
    # Even-numbered Nyquist zones map directly; odd-numbered zones are mirrored
    if int(f / fn) % 2 == 0:
        return f % fn
    else:
        return fn - (f % fn)
I am new to SymPy and Python in general, and I am currently working with Python 2.7 and SymPy 0.7.5 with the objective to:
a) read a system of differential equations from a text file
b) solve the system
I already read this question and this other question, and they are almost what I am looking for, but I have an additional issue: I do not know in advance the form of the system of equations, so I cannot create the corresponding function using def inside the script, as in this example. The whole thing has to be managed at run-time.
So, here are some snippets of my code. Suppose I have a text file system.txt containing the following:
dx/dt = 0.0387*x - 0.0005*x*y
dy/dt = 0.0036*x*y - 0.1898*y
What I do is:
# imports
import sympy
import scipy
import re as regex
# define all symbols I am going to use
x = sympy.Symbol('x')
y = sympy.Symbol('y')
t = sympy.Symbol('t')
# read the file
systemOfEquations = []
with open("system.txt", "r") as fp :
    for line in fp :
        pattern = regex.compile(r'.+?\s+=\s+(.+?)$')
        expressionString = regex.search(pattern, line) # first match ends in group(1)
        systemOfEquations.append( sympy.sympify( expressionString.group(1) ) )
At this point, I am stuck with the two symbolic expressions inside the systemOfEquations list. Assuming I can read the initial conditions for the ODE system from another file, I would have to convert the system into a Python function in order to use scipy.integrate.odeint, something like:
def dX_dt(X, t=0):
    return array([ 0.0387*X[0] - 0.0005*X[0]*X[1] ,
                   -0.1898*X[1] + 0.0036*X[0]*X[1] ])
Is there a nice way to create this at run-time? For example, write the function to another file and then import the newly created file as a function? (maybe I am being stupid here, but remember that I am relatively new to Python :-D)
I've seen that with sympy.utilities.lambdify.lambdify it's possible to convert a symbolic expression into a lambda function, but I wonder if this can help me: lambdify seems to work with one expression at a time, not with systems.
Thank you in advance for any advice :-)
EDIT:
With minimal modifications, Warren's answer worked flawlessly. I have a list of all symbols inside listOfSymbols; moreover, they appear in the same order as the columns of data X that will be used by odeint. So, the function I used is
def dX_dt(X, t):
    vals = dict()
    for index, s in enumerate(listOfSymbols) :
        if s != time :
            vals[s] = X[index]
    vals[time] = t
    return [eq.evalf(subs=vals) for eq in systemOfEquations]
I just make an exception for the variable 'time' in my specific problem. Thanks again! :-)
If you are going to solve the system in the same script that reads the file (so systemOfEquations is available as a global variable), and if the only variables used in systemOfEquations are x, y and possibly t, you could define dX_dt in the same file like this:
def dX_dt(X, t):
    vals = dict(x=X[0], y=X[1], t=t)
    return [eq.evalf(subs=vals) for eq in systemOfEquations]
dX_dt can be used in odeint. In the following ipython session, I have already run the script that creates systemOfEquations and defines dX_dt:
In [31]: odeint(dX_dt, [1,2], np.linspace(0, 1, 5))
Out[31]:
array([[ 1. , 2. ],
[ 1.00947534, 1.90904183],
[ 1.01905178, 1.82223595],
[ 1.02872997, 1.73939226],
[ 1.03851059, 1.66032942]])
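On the lambdify question: sympy.lambdify also accepts a list of expressions, so the whole right-hand side can be compiled once instead of calling evalf at every step. The sketch below is a possible alternative to the evalf version, not the approach used in the answer above. It hard-codes the two equations from system.txt as strings (rhs_strings is an illustrative name) so that it runs on its own.

```python
import numpy as np
import sympy
from scipy.integrate import odeint

x, y, t = sympy.symbols("x y t")

# Right-hand sides as they would come out of the regex step in the question
rhs_strings = ["0.0387*x - 0.0005*x*y", "0.0036*x*y - 0.1898*y"]
system = [sympy.sympify(s) for s in rhs_strings]

# One numeric function returning a list with one value per equation
rhs = sympy.lambdify((x, y, t), system, modules="numpy")

def dX_dt(X, t):
    return rhs(X[0], X[1], t)

solution = odeint(dX_dt, [1.0, 2.0], np.linspace(0, 1, 5))
print(solution)
```

With the same initial conditions and time grid as the session above, this should reproduce that output up to integration tolerance.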
I am attempting to extract Weibull distribution parameters (shape 'k' and scale 'lambda') that satisfy a certain mean and variance. In this example, the mean is 4 and the variance is 8. It is a 2-unknowns and 2-equations type of problem.
Since this algorithm works with Excel 2010's GRG Solver, I am certain it is about the way I am framing the problem, or potentially, the libraries I am using. I am not overly familiar with optimization libraries, so please let me know where the error is.
Below is the script:
from scipy.optimize import fmin_cg
import math
def weibull_mu(k, lmda): #Formula can be found on wikipedia
    return lmda*math.gamma(1+1/k)
def weibull_var(k, lmda): #Formula can be found on wikipedia
    return lmda**2*math.gamma(1+2/k)-weibull_mu(k, lmda)**2
def min_function(arggs):
    actual_mean = 4 # specific to this example
    actual_var = 8 # specific to this example
    k = arggs[0]
    lmda = arggs[1]
    output = [weibull_mu(k, lmda)-(var_wei)]
    output.append(weibull_var(k, lmda)-(actual_var)**2-(actual_mean)**2)
    return output
print fmin(min_function, [1,1])
This script gives me the following error:
[...]
File "C:\Program Files\Python27\lib\site-packages\scipy\optimize\optimize.py", line 278, in fmin
fsim[0] = func(x0)
ValueError: setting an array element with a sequence.
As far as I can tell, min_function returns a list, whereas fmin and fmin_cg expect the objective function to return a scalar, if I am not mistaken.
If you are looking for the root of the two-equation system, I suppose it is better to use the root function instead. As far as I have been able to find out, the general-purpose minimizers in scipy only accept scalar-valued objective functions.
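A minimal sketch of the root-finding route, assuming scipy.optimize.root with its default method. The residual function returns one entry per equation (mean and variance). The starting guess [1.5, 4.0] is an assumption chosen close to the target mean, not a value from the question.

```python
import math
from scipy.optimize import root

def weibull_mean(k, lam):
    return lam * math.gamma(1 + 1 / k)

def weibull_var(k, lam):
    return lam**2 * math.gamma(1 + 2 / k) - weibull_mean(k, lam)**2

def residuals(params, target_mean=4.0, target_var=8.0):
    """Two equations in two unknowns: both entries are zero at the solution."""
    k, lam = params
    return [weibull_mean(k, lam) - target_mean,
            weibull_var(k, lam) - target_var]

# Assumed starting point: shape somewhat above 1, scale near the target mean
solution = root(residuals, [1.5, 4.0])
k_fit, lam_fit = solution.x
print(solution.success, k_fit, lam_fit)
```

A poor starting guess can push the shape parameter to non-positive values, where the gamma terms are undefined, so it is worth checking solution.success before using the result.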
I managed to get it to work thanks to Anders Gustafsson's comment (thank you). This script now works if one returns only a scalar (in this case I used something along the lines of least-squares). Also, bounds were added by changing the optimization function to "fmin_l_bfgs_b" (again, thanks to Anders Gustafsson).
I only changed the min_function definition relative to the question.
from scipy.optimize import fmin_l_bfgs_b
import math
def weibull_mu(k, lmda):
    return lmda*math.gamma(1+1/k)
def weibull_var(k, lmda):
    return lmda**2*math.gamma(1+2/k)-weibull_mu(k, lmda)**2
def min_function(arggs):
    actual_mean = 4. # specific to this example
    actual_var = 8. # specific to this example
    k = arggs[0]
    lmda = arggs[1]
    extracted_var = weibull_var(k, lmda)
    extracted_mean = weibull_mu(k, lmda)
    # Sum of squared errors, so the optimizer receives a single scalar
    output = (extracted_var - actual_var)**2 + (extracted_mean - actual_mean)**2
    return output
best_guess = [1., 1.] # starting point; the question used [1,1]
print(fmin_l_bfgs_b(min_function, best_guess, approx_grad = True, bounds = [(.0000001,None),(.0000001,None)], disp = False))
Note: Please feel free to use this script for your own or professional use.