Distribution plot with wrong total value - python

I have made a distribution plot with the code below:
from numpy import *
import numpy as np
import matplotlib.pyplot as plt
sigma = 4.1
x = np.linspace(-6*sigma, 6*sigma, 200)
def distr(n):
    def g(x):
        return (1/(sigma*sqrt(2*pi)))*exp(-0.5*(x/sigma)**2)
    FxSum = 0
    a = list()
    for i in range(n):
        # divide into 200 parts and sum one by one
        numb = g(-6*sigma + (12*sigma*i)/n)
        FxSum += numb
        a.append(FxSum)
    return a
plt.plot(x, distr(len(x)))
plt.show()
This is, of course, a way of getting the result without using hist(), cdf() or any other options from Python libraries.
Why is the total sum not 1? It shouldn't depend on sigma, for example.

Almost right, but in order to integrate you have to multiply the function value g(x) by your tiny interval dx (12*sigma/200). Each product is the area of one thin strip, and those areas are what you sum up:
from numpy import *
import numpy as np
import matplotlib.pyplot as plt
sigma = 4.1
x = np.linspace(-6*sigma, 6*sigma, 200)
def distr(n):
    def g(x):
        return (1/(sigma*sqrt(2*pi)))*exp(-0.5*(x/sigma)**2)
    dx = 12*sigma/n  # width of one strip (12*sigma/200 for n = 200)
    FxSum = 0
    a = list()
    for i in range(n):
        # divide into n parts and sum the strip areas one by one
        numb = g(-6*sigma + (12*sigma*i)/n) * dx
        FxSum += numb
        a.append(FxSum)
    return a
plt.plot(x, distr(len(x)))
plt.show()
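The same running sum can also be written without an explicit loop. The following is only a sketch, assuming the grid from np.linspace is used directly, so dx is taken as the spacing between neighbouring grid points (12*sigma/199 rather than 12*sigma/200); the names pdf and cdf are illustrative and not from the original post:

```python
import numpy as np

sigma = 4.1
x = np.linspace(-6 * sigma, 6 * sigma, 200)
dx = x[1] - x[0]  # spacing between neighbouring grid points

# Gaussian density evaluated on the whole grid at once
pdf = np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# running sum of strip areas g(x) * dx, i.e. a Riemann-sum approximation of the CDF
cdf = np.cumsum(pdf * dx)
```

Plotting it with plt.plot(x, cdf) should give a curve that levels off very close to 1, whatever value sigma has.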

Related

do same calculation over and over in loop with changing variable each time

I want to run this code with several x values and get all the outputs in a list. On the first run x should be 1, on the next loop x should be 2, then 3, and so on. Is there an easy way to implement this in my code?
EDIT: The loop is now working after I added:
for x in range(1, max_value):
Is there a way I can make a list of the outputs for the degrees of freedom for each loop?
https://imgur.com/eQxHzHZ
import numpy as np
import math
from scipy.stats import skew, kurtosis, kurtosistest
import matplotlib.pyplot as plt
from scipy.stats import norm,t
import pandas as pd
data = pd.read_excel(r"filename.xlsx",sheet_name,skiprows=x+5,usecols="C")
ret = np.array(data.values)
from scipy.stats import skew, kurtosis
X = np.random.randn(10000000)
print(skew(X))
print(kurtosis(X, fisher=False))
# N(x; mu, sig) best fit (finding: mu, stdev)
mu_norm, sig_norm = norm.fit(ret)
dx = 0.0001 # resolution
x = np.arange(-0.1, 0.1, dx)
pdf = norm.pdf(x, mu_norm, sig_norm)
print("Integral norm.pdf(x; mu_norm, sig_norm) dx = %.2f" % (np.sum(pdf*dx)))
print("Sample mean = %.5f" % mu_norm)
print("Sample stdev = %.5f" % sig_norm)
print()
df = pd.DataFrame(ret)
# Student t best fit (finding: nu)
x = t.fit(ret)
nu, mu_t, sig_t = x
pdf2 = t.pdf(x, nu, mu_t, sig_t)
print("Integral t.pdf(x; mu, sig) dx = %.2f" % (np.sum(pdf2*dx)))
print("nu = %.2f" % nu)
print()
You can use a for loop:
for x in range(n):
    f(x)
will call the function f on x with x=0, x=1, all the way to x=n-1.
Put the whole code in a for loop that increments x each time.
for x in range(1, max_value):
    #do stuff
    #add value to a list
print(your_list)
Side note: maybe add all your imports at the beginning, before any scripts start
EDIT x2: as x is overwritten, do
my_list = []
for my_var in range(1, max_value):
    x = my_var
    #do stuff with x
    #add value to a list
    my_list.append(x)
print(my_list)
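To collect a computed result rather than the loop variable itself, one option is to wrap the per-iteration work in a function and append its return value. This is a minimal sketch: degrees_of_freedom_for is a hypothetical stand-in for the real analysis (reading the sheet and fitting the Student t distribution), and its formula is made up purely so the example runs:

```python
def degrees_of_freedom_for(x):
    """Hypothetical stand-in for the per-iteration analysis; returns one number."""
    # In the real script this would load the data with skiprows=x+5,
    # fit the distribution, and return the fitted nu.
    return 2.0 + 0.5 * x


max_value = 5
results = []
for skip in range(1, max_value):
    # a separate loop variable name avoids clobbering x inside the analysis
    results.append(degrees_of_freedom_for(skip))

print(results)  # [2.5, 3.0, 3.5, 4.0]
```

The same list can be built in one line with [degrees_of_freedom_for(skip) for skip in range(1, max_value)].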

The Birthday paradox - how to plot

from __future__ import division, print_function
from numpy.random import randint
import random
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
def bday(c):
    trials = 5000
    count = 0
    for trial in range(trials):
        year = [0]*365
        l = False
        for i in range(c):
            # numpy's randint excludes the upper bound, so this draws days 0..364
            bdayp = randint(0,365)
            year[bdayp] = year[bdayp] + 1
            if year[bdayp] > 1:
                l = True
        if l == True:
            count = count + 1
    prob = count / trials
    return prob
for i in range(2,41):
    a = bday(i)
    print(i,a)
As you can see, I generate the number of people in the class along with the probability that they share a birthday. How can I plot this so that I have n (number of people) on the x-axis and probability on the y-axis using matplotlib.pyplot?
Thanks.
I've linked the relevant documentation in the comments. So that you can work out your own solution, the following example may make it clearer how to approach the problem:
def func(x):
    return x * 10
x = []
y = []
for i in range(10):
    x.append(i)
    y.append(func(i))
plt.plot(x, y)
The above can also be achieved by doing the following:
def func(x):
    return x * 10
x = np.arange(10)
plt.plot(x, func(x))
Here is the documentation for np.arange; both will plot the following:
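Applied to the birthday simulation, the same pattern (build a list of x values and a matching list of y values, then plot them) could be sketched as follows. This is an illustration rather than the original poster's code: it uses the standard library random module instead of numpy.random, and plot_birthday_curve is a hypothetical helper name:

```python
import random


def bday(c, trials=5000):
    """Estimate the probability that at least two of c people share a birthday."""
    count = 0
    for _ in range(trials):
        seen = set()
        for _ in range(c):
            day = random.randrange(365)  # days 0..364
            if day in seen:
                count += 1
                break
            seen.add(day)
    return count / trials


def plot_birthday_curve(sizes, probs):
    # imported here so the simulation itself runs without matplotlib installed
    import matplotlib.pyplot as plt
    plt.plot(sizes, probs, marker="o")
    plt.xlabel("n (number of people)")
    plt.ylabel("probability of a shared birthday")
    plt.show()


random.seed(0)
sizes = list(range(2, 41))
probs = [bday(n) for n in sizes]
```

Calling plot_birthday_curve(sizes, probs) then draws n on the x-axis against the estimated probability on the y-axis.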

How to plot confidence intervals for stattools ccf function?

I am computing the cross-correlation function using ccf from statsmodels. It works fine except I can't see how to also plot the confidence intervals. I notice that acf seems to have much more functionality. Here is a toy example just to have something to see:
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.tsa.stattools as stattools
def create(n):
    x = np.zeros(n)
    for i in range(1, n):
        if np.random.rand() < 0.9:
            if np.random.rand() < 0.5:
                x[i] = x[i-1] + 1
        else:
            x[i] = np.random.randint(0,100)
    return x
x = create(4000)
y = create(4000)
plt.plot(stattools.ccf(x, y)[:100])
This gives:
Unfortunately, the statsmodels cross-correlation function (ccf) does not provide the confidence interval. In R, ccf() would also draw the confidence interval.
Here we need to calculate the confidence interval ourselves and plot it afterwards. The confidence bounds are computed here as ±2 / np.sqrt(lags), where lags is set to the length of the series. For basic information on confidence intervals for cross-correlation, refer to:
Stats StackExchange answer by Rob Hyndman: https://stats.stackexchange.com/a/3128/43304
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.tsa.stattools as stattools
def create(n):
    x = np.zeros(n)
    for i in range(1, n):
        if np.random.rand() < 0.9:
            if np.random.rand() < 0.5:
                x[i] = x[i-1] + 1
        else:
            x[i] = np.random.randint(0,100)
    return x
x = create(4000)
y = create(4000)
lags = 4000
sl = 2 / np.sqrt(lags)
# draw the bounds as horizontal lines across the whole plot
plt.axhline(sl, color='r')
plt.axhline(-sl, color='r')
plt.plot(stattools.ccf(x, y)[:100])
This leads to the following plot with the additional red lines:
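To see what the ±2/sqrt(n) bounds mean in practice, the sketch below checks how many cross-correlations of two independent white-noise series fall outside them. It deliberately avoids statsmodels: sample_ccf is a hypothetical NumPy-only estimator that divides by n at every lag (an assumption, and not necessarily the normalisation statsmodels uses), and ccf_bound is likewise an illustrative name:

```python
import numpy as np


def ccf_bound(n_obs, z=2.0):
    """Approximate confidence bound for the cross-correlation of n_obs observations."""
    return z / np.sqrt(n_obs)


def sample_ccf(x, y, max_lag):
    """Cross-correlation of x and y for lags 0..max_lag-1, dividing by n at every lag."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    return np.array([np.sum(x[lag:] * y[:n - lag]) / n for lag in range(max_lag)])


rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
y = rng.standard_normal(4000)

bound = ccf_bound(len(x))      # 2 / sqrt(4000), roughly 0.0316
r = sample_ccf(x, y, 100)
outside = np.abs(r) > bound    # lags whose correlation exceeds the bounds
```

For unrelated series only a small share of lags (on the order of 5%) should land outside the bounds, so outside.mean() gives a quick sanity check before reading much into individual spikes.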

calling args in a loop-function

For MCMC I use the emcee package, following this tutorial. Instead of the equation from this part, which is a simple fraction, I use this form, that is, the matrix form (not its code), and wrote the following code.
To explain my code further:
def new_calculation(n) is the equation for each component of the matrix, and def log_likelihood(theta,hh) is the matrix mentioned above.
The problem is that I need args in soln = minimize(nll, initial, args=(hh)) and in def log_probability(theta,hh):
I pass hh as args, but Python says that hh is not defined. The problem may lie in how the arguments and functions are defined. I do not know how to fix it.
import numpy as np
import emcee
import matplotlib.pyplot as plt
from math import *
import numpy as np
from scipy.integrate import quad
from scipy.integrate import odeint
xx=np.array([0.01,0.012,0.014,0.016])
yy=np.array([32.95388698,33.87900347,33.84214074,34.11856704])
Cov=[[137,168],[28155,-2217]]
rc=0.09
c=0.7
H01 = 70
O_m1 = 0.28
z0=0
M1=10
np.random.seed(123)
def ant(z,O_m,O_D):
    return 1/sqrt(((1+z)**2)*(1+O_m*z))
def new_calculation(n):
    O_D=1-O_m-(1/(2*rc*yyn))
    q=quad(ant,0,xx[n],args=(O_m,O_D))[0]
    h=log10((1+xx[n])*q)
    fn=(yy[n]-M-h)
    return fn
def log_likelihood(theta,hh):
    H0, O_m,M= theta
    f_list = []
    for i in range(2): # the value '2' reflects matrix size
        f_list.append(new_calculation(i))
    rdag=[f_list]
    rmat=[[f] for f in f_list]
    hh=np.linalg.det(np.dot(rdag,Cov),rmat)*0.000001
    return hh
from scipy.optimize import minimize
np.random.seed(42)
nll = lambda *args: -log_likelihood(*args)
initial = np.array([H01, O_m1,M1]) + 0.1*np.random.randn(3)
soln = minimize(nll, initial, args=(hh))
H0_ml, O_m0_ml = soln.x
def log_prior(theta):
    H0, O_D = theta
    if 65 < H0 < 75 and 0.22 < O_m < 0.32 and 0 < M < 12:
        return 0.0
    return -np.inf
def log_probability(theta, mm,zz,hh):
    lp = log_prior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(theta, mm,zz,hh)
y0=H0
pos = soln.x + 1e-4*np.random.randn(200, 3)
nwalkers, ndim = pos.shape
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_probability, args=(rdag, rmat))
sampler.run_mcmc(pos, 500);
fig = plt.figure(2,figsize=(10, 10))
fig.clf()
for j in range(ndim):
    ax = fig.add_subplot(ndim,1,j+1)
    ax.plot(np.array([sampler.chain[:,i,j] for i in range(nwalkers)]),"k", alpha = 0.3)
    ax.set_ylabel(("H0", "O_m")[j], fontsize = 15)
plt.xlabel('Steps', fontsize = 15)
fig.show()
I appreciate your help and your attention.
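As general background on how args works in scipy.optimize.minimize: every entry of the args tuple is passed to the objective after the parameter vector, so the objects placed in args must already exist when minimize is called, and a one-element tuple needs a trailing comma ((hh,) rather than (hh)). The sketch below shows this on a toy straight-line fit; the data and all names in it are hypothetical and unrelated to the cosmological model above:

```python
import numpy as np
from scipy.optimize import minimize


def neg_log_likelihood(theta, x_data, y_data):
    """Toy objective: half the sum of squared residuals of a straight-line model."""
    slope, intercept = theta
    residuals = y_data - (slope * x_data + intercept)
    return 0.5 * np.sum(residuals ** 2)


# The extra arguments are ordinary variables defined before the call.
x_data = np.linspace(0.0, 1.0, 20)
y_data = 2.0 * x_data + 1.0  # noise-free data with slope 2 and intercept 1

initial = np.array([0.0, 0.0])
# minimize calls neg_log_likelihood(theta, x_data, y_data) on each evaluation
soln = minimize(neg_log_likelihood, initial, args=(x_data, y_data))
slope_ml, intercept_ml = soln.x
```

The number of values unpacked from soln.x has to match the length of the initial guess, and the objective's signature has to match the parameter vector plus the entries of args.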

Bifurcation diagram in matplotlib

I'm trying to obtain the bifurcation diagram for the equation below (x is a function of t):
dx/dt = r*x + x^3 - x^5
which should look like this:
And here is my snippet:
import numpy as np
import matplotlib.pyplot as plt
def pitch(r, x):
    return r * x + np.power(x,3)- np.power(x,5)
n = 10000
r = np.linspace(-200, 200, n)
iterations = 1000
last = 100
x = 0
for i in range(iterations):
    x = pitch(r,x)
    if i >= (iterations - last):
        plt.plot(r,x, ',k', alpha=0.02)
plt.title("Bifurcation diagram")
plt.show()
But the generated plot is not what it is supposed to be:
Edit:
Here is my recent attempt:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def pitch(s,x,r):
    x = s[0]
    dxdt = r * x + np.power(x,3)- np.power(x,5)
    return [dxdt]
t = np.linspace(0,100)
s0=[-50]
r = np.linspace(-200, 200)
for i in r:
    s = odeint(pitch,s0,t, args=(i,))
    plt.plot(s,i,',k', alpha=0.02)
plt.title("Bifurcation diagram")
plt.show()
With this error:
raise ValueError("x and y must have same first dimension") ValueError:
x and y must have same first dimension
Could you give me some advice to fix this problem?!
I found a link to this post and decided to add a few remarks that might help someone who stumbles upon it in the future.
I did not analyze the equation in detail, but it is clear at first sight that something interesting happens when r is close to 0.
So we can study the behavior of the system for r in [-10,10].
You are right to use odeint instead of solving the Cauchy problem with a hand-coded Euler method.
This equation has attractors: the solution soon "forgets" the exact initial condition and slides towards an attractor, but which attractor it reaches depends on where we start relative to 0. Initial conditions of large magnitude are pulled back towards the origin, positive ones downwards and negative ones upwards, because - x^5 is the term that defines the behavior at large x.
What we need to do is, for each r in the range and for each initial condition, put a mark at the attractor that the solution slides to.
We first create a canvas to put the marks on:
diagram = np.zeros((200,200))
And then for each combination of (r,s0) we put a point on the canvas at (r,s[-1]).
Here is the complete code
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
from matplotlib import cm
def pitch(s,t,r):
    # right-hand side of dx/dt = r*x + x^3 - x^5; t is unused but required by odeint
    x = s[0]
    dxdt = r * x + np.power(x,3)- np.power(x,5)
    return [dxdt]
t = np.arange(0,100,2)
N = 200 # Number of points along each side of the diagram
diagram = np.zeros((N,N))
rmin,rmax = -10,10
rrange = np.arange(rmin, rmax,(rmax-rmin)/N)
smin,smax = -5.0,5.0
srange = np.arange(smin,smax,2*(smax-smin)/N)
for i in rrange:
    for s0 in srange:
        s = odeint(pitch,[s0],t, args=(i,))
        imgx = int((i-rmin)*N/(rmax-rmin))
        # s has shape (len(t), 1); take the final value as a scalar
        imgy = int((s[-1,0]-smin)/(smax-smin)*N)
        imgx = min(N-1,max(0,imgx)) # make sure we stay
        imgy = min(N-1,max(0,imgy)) # within the diagram bounds
        diagram[imgy,imgx] = 1
plt.title("Bifurcation diagram")
plt.imshow(np.flipud(diagram),cmap=cm.Greys,
           extent=[rmin,rmax,smin,smax],aspect=(rmax-rmin)/(smax-smin))
plt.xlabel("r")
plt.ylabel("x")
plt.show()
And the resulting plot:
When you zoom in on the region around 0 by setting (rmin,rmax) to (-0.5,0.5), you can see that the branches of the diagram do not start at 0.
Instead, as in the diagram drawn in the original post, the branches start at roughly r=-0.25.
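The value r = -0.25 can also be checked by hand, assuming the equation is dx/dt = r*x + x^3 - x^5 as coded in pitch. Besides x = 0, fixed points satisfy x^4 - x^2 - r = 0, so x^2 = (1 ± sqrt(1 + 4r))/2, which has real solutions only for r >= -1/4. A minimal sketch of that calculation (fixed_points and is_stable are illustrative names, not part of the answer's code):

```python
import math


def fixed_points(r):
    """Real fixed points of dx/dt = r*x + x**3 - x**5, sorted in ascending order."""
    points = {0.0}
    disc = 1.0 + 4.0 * r  # discriminant of x^4 - x^2 - r = 0, viewed as a quadratic in x^2
    if disc >= 0:
        for sign in (1, -1):
            x_sq = (1.0 + sign * math.sqrt(disc)) / 2.0
            if x_sq > 0:
                root = math.sqrt(x_sq)
                points.update((root, -root))
    return sorted(points)


def is_stable(r, x):
    """A fixed point is stable when the derivative r + 3x^2 - 5x^4 is negative there."""
    return r + 3 * x**2 - 5 * x**4 < 0


for r in (-0.3, -0.25, -0.1, 2.0):
    print(r, [(round(x, 3), is_stable(r, x)) for x in fixed_points(r)])
```

Below r = -0.25 only x = 0 exists; at r = -0.25 the outer branches appear at x = ±sqrt(0.5), which matches where the branches start in the zoomed plot.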
