While loops for lists? - python

I'm pretty new to programming and I have a quick question. I am trying to make a Gaussian function for a range of stars. However, I want the size of undercurve to be 100 for all the stars. I was thinking of using a while loop that keeps going while the length of undercurve is less than 100. However, I get an error, and I'm guessing it has something to do with it being a list. I'm showing you my code to see if you can help me out. Thanks!
I get a syntax error: can't assign to function call
import numpy
import random
import math
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import scipy
from scipy import stats
from math import sqrt
from numpy import zeros
from numpy import numarray
variance = input("Input variance of the star:")
mean = input("Input mean of the star:")
space=numpy.linspace(-4,1,1000)
sigma = sqrt(variance)
Max = max(mlab.normpdf(space,mean,sigma))
normalized = (mlab.normpdf(space,mean,sigma))/Max
def random_y_pt():
    return random.uniform(0,1)
def random_x_pt():
    return random.uniform(-4,1)
import random
def undercurve(size):
    result = []
    for i in range(0,size):
        y = random_y_pt()
        x = random_x_pt()
        if y < scipy.stats.norm(scale=variance,loc=mean).pdf(x):
            result.append((x))
    return result
size = 1
while len(undercurve(size)) < 100:
    undercurve(size) = undercurve(1)+undercurve(size)
print undercurve(size)
plt.hist(undercurve(size),bins=20)
plt.show()

If your error is something like SyntaxError: can't assign to function call, that's because of the line
undercurve(size) = undercurve(1)+undercurve(size)
which tries to assign the right-hand side to the call undercurve(size). A function call is an expression, not a name, so it cannot appear on the left-hand side of an assignment.
It sounds like you actually want just the first 100 items of the list returned by undercurve(size). For that, use
undercurve(size)[:100]
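Another way to end up with exactly 100 accepted points is to keep drawing until the list is long enough, rather than assigning to a function call. The following is only a sketch under some assumptions: normal_pdf and sample_under_curve are names made up here, the pdf is written out with math instead of scipy, and y is drawn up to the curve's peak so that the comparison is a plain rejection test.

```python
import math
import random

def normal_pdf(x, mean, sigma):
    # Density of a normal distribution with the given mean and standard deviation
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def sample_under_curve(mean, sigma, n_target=100, lo=-4.0, hi=1.0, seed=None):
    """Collect x values of random points that fall under the pdf (hypothetical helper)."""
    rng = random.Random(seed)
    peak = normal_pdf(mean, mean, sigma)  # the pdf is never larger than its value at the mean
    accepted = []
    while len(accepted) < n_target:       # loop on the list length, not on a function call
        x = rng.uniform(lo, hi)
        y = rng.uniform(0.0, peak)
        if y < normal_pdf(x, mean, sigma):
            accepted.append(x)
    return accepted

points = sample_under_curve(mean=-1.5, sigma=0.5, n_target=100, seed=0)
print(len(points))
```

The returned list can be passed to plt.hist in the same way as undercurve(size) in the question.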

Related

How to plot a 2-D graph for a induced velocity function in terms of two variables as shown here?

I am reading a technical paper and trying to reproduce a graph corresponding to the following function:
K(x) and E(x) are Legendre's elliptical integrals of the first and second kind respectively. The plot of V_r looks like the following:
I tried to plot this function in Python and here is my code:
import numpy as np
import scipy
import matplotlib.pyplot as plt
from scipy import special
def x(r,z,R):
    return np.sqrt((-4*r*R)/((R-r)**2+z**2))
def V_z(Gamma,R,r,z):
    return (-Gamma/4*np.pi)*((2/np.sqrt((R-r)**2+z**2))*(special.ellipk(x(r,z,R))+\
        (R**2-r**2-z**2)/((R+r)**2+z**2) * special.ellipe(x(r,z,R))))
def V_r(Gamma,R,r,z):
    return (-Gamma/4*np.pi)*((4*R*z*np.sqrt((R+r)**2+z**2)/((R+r)**4+z**2*(2*(R+r)**2+z**2)) \
        * np.sqrt((R+r)**2+z**2)/((R-r)**2+z**2)))*special.ellipe(x(r,z,R))
r = np.linspace(-5,5,100);
z = np.linspace(-5,5,100);
V_z_list = []
for i in V_z(0.1,1,r,z):
    if np.isnan(i).any() == False:
        print(i)
        V_z_list.append(i)
V_r_list = []
for i in V_r(0.1,1,r,z):
    if np.isnan(i).any() == False:
        print(i)
        V_r_list.append(i)
plt.plot(V_z_list)
However, I'm getting this graph as a result:
Could someone help me fix my code?

Plotting in python using matplotlib?

I have been trying to simulate the first order differential equation using the fourth order Runge-Kutta method, but I am having problems plotting it.
#simulation of ode using 4th order rk method dy/dx=-2y+1.3e^-x,y(0)=5,h=0.01
from sympy import*
import math
import numpy as np
import matplotlib.pyplot as plt
h=0.01;
ti=0;
x=0;
n=0;
y=5;
def f(x,y):
    return 1.3*math.exp(-x)-2*y
while x < 10:
    k1=f(x,5);
    k2=f(x+h/2,y+(h/2)* k1);
    k3=f(x+h/2,y+(h/2)* k2);
    k4=f(x+h,y+h*k3);
    y=y+h/6*(k1+2*(k2+k3)+k4);
    x=x+h;
plt.plot(x,y);
I know that the problem is because of updating the x,y values every time the loop runs, but can somebody explain how to plot all the values of (x,y)?
As suggested in the comment, you can create two lists to store x and y values and plot it after the while loop:
import math
import numpy as np
import matplotlib.pyplot as plt
h=0.01;
ti=0;
x=0;
n=0;
y=5;
def f(x,y):
    return 1.3*math.exp(-x)-2*y
xs = [x] # <<<
ys = [y] # <<<
while x < 10:
    k1=f(x,5);
    k2=f(x+h/2,y+(h/2)* k1);
    k3=f(x+h/2,y+(h/2)* k2);
    k4=f(x+h,y+h*k3);
    y=y+h/6*(k1+2*(k2+k3)+k4);
    x=x+h;
    xs.append(x) # <<<
    ys.append(y) # <<<
plt.plot(xs,ys);
Another source for wrong results is the first line in the RK4 loop. Instead of
k1=f(x,5);
use
k1=f(x,y);
since the value of y does not stay constant at the initial value.
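Putting both points together (store every step, and evaluate k1 at the current y), the loop might look like the sketch below. The function names rk4 and exact are made up for this example, and the closed-form solution y = 1.3*exp(-x) + 3.7*exp(-2x) is included only as a check on the numerical result.

```python
import math

def f(x, y):
    # Right-hand side of dy/dx = -2y + 1.3*exp(-x)
    return 1.3 * math.exp(-x) - 2 * y

def rk4(f, x0, y0, h, x_end):
    """Integrate with classical RK4 and return every (x, y) point."""
    n_steps = int(round((x_end - x0) / h))  # fixed step count avoids float drift in a while test
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for i in range(n_steps):
        k1 = f(x, y)                        # current y, not the initial value 5
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * (k2 + k3) + k4)
        x = x0 + (i + 1) * h
        xs.append(x)
        ys.append(y)
    return xs, ys

def exact(x):
    # Closed-form solution of the same ODE with y(0) = 5
    return 1.3 * math.exp(-x) + 3.7 * math.exp(-2 * x)

xs, ys = rk4(f, 0.0, 5.0, 0.01, 10.0)
max_error = max(abs(y - exact(x)) for x, y in zip(xs, ys))
print(len(xs), max_error)
```

A single call to plt.plot(xs, ys) after the loop then draws the whole curve.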

customizing np.fft.fft function in python

I wish to perform a fourier transform of the function 'stress' from 0 to infinity and extract the real and imaginary parts. I have the following code that does it using a numerical integration technique:
import numpy as np
from scipy.integrate import trapz
import fileinput
import sys,string
window = 200000 # length of the array I wish to transform (number of data points)
time = np.linspace(1,window,window)
freq = np.logspace(-5,2,window)
output = [0]*len(freq)
for index,f in enumerate(freq):
    visco = trapz(stress*np.exp(-1j*f*t),t)
    soln = visco*(1j*f)
    output[index] = soln
print 'f storage loss'
for i in range(len(freq)):
    print freq[i],output[i].real,output[i].imag
This gives me a nice transformation of my input data.
Now I have an array of size 2x10^6, and using the above technique is not feasible (computation time scales as O(N^2)), so I have turned to the built-in fft function in numpy.
There aren't too many arguments that you can specify to change this function, and so I'm finding it difficult to customize it to my needs.
So far I have
import numpy as np
import fileinput
import sys, string
np.set_printoptions(threshold='nan')
N = len(stress)
fvi = np.fft.fft(stress,n=N)
gprime = fvi.real
gdoubleprime = fvi.imag
for i in range(len(stress)):
    print gprime[i], gdoubleprime[i]
And it's not giving me accurate results.
The DFT in python is of the form A_k = summation(a_m * exp(-2*pi*i*m*k/n)) where the summation is from m = 0 to m = n-1 (http://docs.scipy.org/doc/numpy-1.10.1/reference/routines.fft.html). How can I change it to the form that I have mentioned in my first code, i.e. exp(-1j*freq*t) (freq is the frequency and t is the time, both of which have already been defined)? Or is there some post-processing of the data that I have to do?
Thanks in advance for all your help.
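To make the relation between the two exponents concrete, here is a small sketch. It assumes uniformly spaced samples starting at t = 0 with an interval dt (both assumptions, as is the random stand-in for stress). Under those assumptions the DFT sum from the numpy docs is the same as a sum over exp(-1j*omega_k*t_m), with omega_k = 2*pi*np.fft.fftfreq(n, d=dt).

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(16)      # stand-in for the stress samples
n = len(a)
dt = 0.5                         # assumed uniform sampling interval
t = np.arange(n) * dt            # sample times, assumed to start at t = 0

# DFT written out as in the numpy docs: A_k = sum_m a_m * exp(-2*pi*i*m*k/n)
m = np.arange(n)
direct = np.array([np.sum(a * np.exp(-2j * np.pi * m * k / n)) for k in range(n)])

# The same sum written with angular frequencies: exp(-1j * omega_k * t_m)
omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
as_omega_t = np.array([np.sum(a * np.exp(-1j * w * t)) for w in omega])

fft_result = np.fft.fft(a)
print(np.allclose(direct, fft_result), np.allclose(as_omega_t, fft_result))
```

Multiplying fft_result by dt would then give a rectangle-rule approximation of the integral over the sampled window, but only at these evenly spaced omega values. Logarithmically spaced frequencies such as freq in the first code are not on that grid, so some interpolation or a direct sum at the missing frequencies would still be needed.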

I wrote a simple random number generator, how can I graph the distribution of the function I wrote?

This is my first time writing a random number generator and I was just messing around to see what I can do with just random formulas.
I am curious, however, about how biased my function is and about the distribution of its output (between 1 and 9).
Here is my unnecessarily long code:
import time
class Random:
    """random generator"""
    def __init__(self):
        """ Random()-> create a random number generator with a random seed
        a seed is needed in order to generate random numbers"""
        self.seed = time.time()
    def random(self):
        """ Random.random() -> get a random number using a formula"""
        self.seed =(((int(self.seed)*129381249123+2019383)**0.74123)/517247) % 288371
    def get_ran_num(self):
        """ Random.get_ran_num() -> return a random integer from 1 thru 10"""
        self.random()
        return int(list(str(int(self.seed)))[3])
ranNum = Random()
It would be great if there were a tool that could take a random function, run it a few thousand times, and then graph its distribution.
Thank you in advance
P.S.: How can I improve my RNG and make it even random-er?
If you just want a visual representation, you can pretty easily use
import matplotlib.pyplot as plt
# Random class here
ranNum = Random()
randNums = [ranNum.get_ran_num() for _ in range(100)]
nums = list(range(len(randNums)))
plt.plot(nums, randNums, 'ro')
plt.show()
Here it is for 100 random numbers:
However, I'm getting an IndexError when I go to higher ranges. You should probably fix the actual algorithm for whatever is causing that problem, but the way I put a band aid on it was:
    def get_ran_num(self):
        """ Random.get_ran_num() -> return a random integer from 1 thru 10"""
        retval = None
        while True:
            try:
                self.random()
                retval = int(list(str(int(self.seed)))[3])
                break
            except IndexError as e:
                continue
        return retval
Here's a plot for 100,000 random numbers, which is pretty good. Contiguous lines where none is noticeably more dense than the others is what you're after, but you're going to need to do much better entropy analysis to find out anything more useful than a quick visual representation. In your case, it looks like 6 is a little more favored. Also, it looks like it repeats fairly often.
I would try np.random.rand and matplotlib.
import numpy as np
import matplotlib.pyplot as plt
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radiuses
plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()
Something like this?
Edit: Are you trying to generate a pseudo-random number? You'd need a seed for the shift register anyway, so in that respect, I am not sure it would be completely random.
I would prefer a histogram plot to check the uniformity of the distribution of your random number generator.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
myarr = np.random.randint(1000, size=100000)
plt.hist(myarr, bins=40)
plt.show()
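If a plot is not needed, the same uniformity check can be done by counting. The sketch below uses a made-up helper, digit_counts, and Python's own random.randint as a stand-in generator. To test the class from the question, ranNum.get_ran_num could be passed as draw instead.

```python
import random
from collections import Counter

def digit_counts(draw, n_samples):
    """Call draw() n_samples times and count how often each value appears."""
    return Counter(draw() for _ in range(n_samples))

rng = random.Random(0)                      # stand-in generator for the example
n_samples = 10000
counts = digit_counts(lambda: rng.randint(0, 9), n_samples)

expected = n_samples / 10.0                 # count per digit if all ten were equally likely
for digit in sorted(counts):
    # A ratio near 1.0 means the digit shows up about as often as expected
    print(digit, counts[digit], round(counts[digit] / expected, 3))
```

Ratios that stay far from 1.0 as n_samples grows point to a biased generator. This only looks at the frequency of single values, not at repeating sequences.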

Tukey five number summary in Python

I have been unable to find this function in any of the standard packages, so I wrote the one below. Before throwing it toward the Cheeseshop, however, does anyone know of an already published version? Alternatively, please suggest any improvements. Thanks.
def fivenum(v):
    """Returns Tukey's five number summary (minimum, lower-hinge, median, upper-hinge, maximum) for the input vector, a list or array of numbers based on 1.5 times the interquartile distance"""
    import numpy as np
    from scipy.stats import scoreatpercentile
    try:
        np.sum(v)
    except TypeError:
        print('Error: you must provide a list or array of only numbers')
    q1 = scoreatpercentile(v,25)
    q3 = scoreatpercentile(v,75)
    iqd = q3-q1
    md = np.median(v)
    whisker = 1.5*iqd
    return np.min(v), md-whisker, md, md+whisker, np.max(v),
pandas Series and DataFrame have a describe method, which is similar to R's summary:
In [3]: import numpy as np
In [4]: import pandas as pd
In [5]: s = pd.Series(np.random.rand(100))
In [6]: s.describe()
Out[6]:
count 100.000000
mean 0.540376
std 0.296250
min 0.002514
25% 0.268722
50% 0.593436
75% 0.831067
max 0.991971
NaNs are handled correctly.
I would get rid of these two things:
import numpy as np
from scipy.stats import scoreatpercentile
You should be importing at the module level. This means that users will be aware of missing dependencies as soon as they import your module, rather than when they call the function.
try:
    sum(v)
except TypeError:
    print('Error: you must provide a list or array of only numbers')
Several problems with this:
Don't type check in Python. Document what the function takes.
How do you know callers will see this? They might not be running at a console, and even if they are, they might not want your error message interfering with their output.
Don't type check in Python.
If you do want to raise some sort of exception for invalid data (not type checking), either let an existing exception propagate, or wrap it in your own exception type.
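As a sketch of the last point, the print call could be replaced by an exception that wraps the original one. InvalidDataError and total are hypothetical names chosen for this example.

```python
class InvalidDataError(ValueError):
    """Hypothetical exception type for input that is not purely numeric."""

def total(values):
    # Let the failure surface as an exception instead of printing a message
    try:
        return sum(values)
    except TypeError as exc:
        raise InvalidDataError("values must contain only numbers") from exc

try:
    total([1, "a", 3])
except InvalidDataError as err:
    caught = err
    print("caught:", err)
```

Callers can then decide whether to catch the exception, log it, or let it propagate, and nothing is written to their output unless they choose to.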
In case anybody ever needs a version that works with NaN in the data, here is my modification. I didn't want to change the original poster's answer, to avoid confusion.
import numpy as np
from scipy.stats import scoreatpercentile
# scipy.stats.nanmedian has been removed from SciPy; np.nanmedian is used instead
def fivenum(v):
    """Returns Tukey's five number summary (minimum, lower-hinge, median, upper-hinge, maximum) for the input vector, a list or array of numbers based on 1.5 times the interquartile distance"""
    try:
        np.sum(v)
    except TypeError:
        print('Error: you must provide a list or array of only numbers')
    # v must be a numpy array here so that boolean masking works
    q1 = scoreatpercentile(v[~np.isnan(v)],25)
    q3 = scoreatpercentile(v[~np.isnan(v)],75)
    iqd = q3-q1
    md = np.nanmedian(v)
    whisker = 1.5*iqd
    return np.nanmin(v), md-whisker, md, md+whisker, np.nanmax(v),
Try this:
import numpy as np
import numpy.random
from statstools import run
from scipy.stats import scoreatpercentile
data=np.random.randn(5)
return (min(data), md-whisker, md, md+whisker, max(data))
I am new to Python, but the return value is calculated incorrectly: it should be max(min(v), q1-whisker) for the lower bound and min(max(v), q3+whisker) for the upper bound. That is how it's done in R (the summary() function), and that's what shows up on the boxplots in matplotlib.pyplot and in R.
Minimal, but it gets the job done. :)
import numpy as np
[round(np.percentile(results[:,4], i), 1) for i in [1, 2, 5, 10, 25, 50]]
import numpy as np
# np_array = np.array(np.random.random(100))
np.percentile(np_array, [0, 25, 50, 75, 100])
Percentile selection can be configured with the interpolation argument, which is 'linear' by default.
import pandas as pd
def fivenum(x):
    series=pd.Series(x)
    mi = series.min()
    q1 = series.quantile(q=0.25, interpolation='nearest')
    me = series.median()
    q3 = series.quantile(q=0.75, interpolation='nearest')
    ma = series.max()
    return pd.Series([mi, q1, me, q3, ma], index=['min', 'q1', 'median', 'q3', 'max'])
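For comparison, the hinges named in the question's docstring can also be computed directly from the sorted data, without percentiles or whiskers. The sketch below uses the hinge-depth rule that, as far as I know, R's fivenum() follows. The name tukey_fivenum is made up, and only the standard library is assumed.

```python
import math

def tukey_fivenum(values):
    """Minimum, lower hinge, median, upper hinge, maximum of a non-empty sequence."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("values must not be empty")
    n4 = ((n + 3) // 2) / 2.0               # depth of the hinges, counted from 1
    depths = [1.0, n4, (n + 1) / 2.0, n + 1 - n4, float(n)]
    # A half-integer depth means averaging the two neighbouring order statistics
    return [0.5 * (x[math.floor(d) - 1] + x[math.ceil(d) - 1]) for d in depths]

print(tukey_fivenum([1, 2, 3, 4, 5, 6]))
```

With an odd number of points the median is counted in both halves, which is where hinges can differ from the 25% and 75% quantiles returned by the percentile-based versions above.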
