For a project, I am trying to determine the time it would take for a photon to leave the Sun. However, I am having trouble with my code (found below).
More specifically, I set up a for loop with an if statement: if some randomly generated probability is less than the probability of collision, the photon collides and changes direction.
What I am having trouble with is setting up a condition where the for loop stops if the photon escapes (when distance > Sun radius). The one I have set up already doesn't appear to work.
I use a very scaled down measurement of the Sun's radius because if I didn't it would take a long time for the photon to escape in my simulation.
from numpy.random import random as rng  # we want them random numbers
import numpy as np                      # for the math functions
import matplotlib.pyplot as plt         # to make pretty pretty plots

mass_proton = 1.67e-27
mass_electron = 9.11e-31
Thompson_cross = 6.65e-29
Sun_density = 150000
Sun_radius = .005

Mean_Free = (mass_proton + mass_electron)/(Thompson_cross*Sun_density*np.sqrt(2))
time_step = 10**-13  # Chosen specifically so that the path length is < mean free path
path_length = (3e8)*time_step
Probability = 1 - np.exp(-path_length/Mean_Free)  # Probability of the photon colliding

def Random_walk():
    x = 0  # Start at origin (0,0)
    y = 0
    N = 1000
    m = 0  # Counter I have set up for the number of collisions
    for i in range(1, N+1):
        prand = rng(N+1)  # Randomly generated probability
        if prand[i] < Probability:  # If my prand is less than the probability
                                    # of collision, the photon collides and
                                    # changes direction
            x += Mean_Free*np.cos(2*np.pi*prand)
            y += Mean_Free*np.sin(2*np.pi*prand)
            m += 1  # Every time a collision occurs, 1 is added to my collision counter
        distance = np.sqrt(x**2 + y**2)  # Distance the photon covers
        if np.all(distance) > Sun_radius:  # if the distance the photon travels is
            break                          # greater than the radius of the Sun,
                                           # the for loop stops, meaning the
                                           # photon escaped
    print(m)
    return x, y, distance

x, y, d = Random_walk()
plt.plot(x, y, '--')
plt.plot(x[-1], y[-1], 'ro')
Any criticisms of my code are welcome; this is for a grade and I do want to learn how to do this correctly, so please tell me if you notice any other errors.
I don't understand the motivation for the formulas you've implemented. I'll explain my own motivation here, but if your instructor told you to do something else, I guess you should listen to them instead.
If I were going to do this, I would generate a sequence of movements of a photon, stopping when distance of the photon to the center of the sun is greater than the solar radius. Each movement is a sample from a distribution which has two components: one for the distance, and one for the direction. I will assume that these are independent (this may be questioned in a more careful simulation).
It seems plausible that the distribution of distance is an exponential distribution with parameter 1/(mean free path). Then the density is p(d) = (1/MFP) exp(-d/MFP). Its cdf is 1 - exp(-d/MFP) and the inverse of the cdf is -MFP log(1 - p) where p = cdf(d). Now you can sample from the distribution of distances: let p = rand(0, 1) where rand = uniform random and plug it into the inverse cdf to get d. This is called the inverse cdf method of sampling; a web search will find more info about it.
As for the direction, you can let angle = rand(0, 2*pi) and then (x, y) = (cos(angle), sin(angle)).
Now you can construct the series of positions. From an initial location, let the new location = previous + d*(x, y). Stop when distance of location to center is greater than radius.
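Here is a minimal sketch of that procedure (the function and variable names are mine; with the constants in your question, the mean free path works out to roughly 1.2e-4 m):

import math
import random

def walk_to_escape(mean_free_path, sun_radius):
    """Random walk with exponentially distributed step lengths.
    Returns the number of steps taken before escaping."""
    x = y = 0.0
    steps = 0
    while math.hypot(x, y) <= sun_radius:
        # Inverse cdf method: d = -MFP * log(1 - p), with p uniform in [0, 1)
        d = -mean_free_path * math.log(1.0 - random.random())
        angle = random.uniform(0.0, 2.0*math.pi)
        x += d*math.cos(angle)
        y += d*math.sin(angle)
        steps += 1
    return steps

steps = walk_to_escape(1.2e-4, 0.005)

Multiplying steps by the mean time between collisions (mean free path / c) then gives an estimate of the escape time.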
Looks like a great problem! Good luck and have fun. Let me know if you have any questions.
Here is a way of thinking about the problem that you may find helpful. At each moment, the photon has a position (x, y) and a direction (dx, dy). The (dx, dy) variables are the components of a unit vector, so sqrt(dx**2 + dy**2) = 1.0. The displacement of the photon during one step is path_length * (dx, dy).
At each step you do four things (a sketch follows at the end of this answer):
calculate the photon's new position
figure out if the photon has left the sun by computing its distance from the center point
determine, with a single random number, whether or not the photon collides; if it does, randomly generate a new direction
append the photon's current position to a list (you might want to record this as distance from the center rather than x, y)
At the end, you plot the list you have built up.
You should also choose a random direction at the very start.
I don't know how you will terminate the loop, for the photon isn't ever guaranteed to leave the sun - just like in the real world. In principle the program might run forever (or until the sun burns out).
There is a slight inaccuracy in that the photon can collide at any instant, not just at the end of one step. But since the steps are small, so is the error introduced by this simplification.
I will point out that you do not need numpy for any of this except perhaps the final plot. The standard Python library has all the math functions you need. Numpy is of course great for manipulating arrays of data, but the only array you will have here is the one you build, a step at a time, of photon position versus time.
As I pointed out in one of my comments, you are modeling the sun as a 2-dimensional object. If you want to do this calculation in three dimensions, you don't need to change this basic approach.
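To make this concrete, here is a minimal standard-library sketch of the loop described above (the names and the max_steps cap are my own additions; Probability and path_length are computed as in the question):

import math
import random

def simulate(path_length, collision_prob, sun_radius, max_steps=10**7):
    x = y = 0.0
    angle = random.uniform(0.0, 2.0*math.pi)   # random direction at the very start
    dx, dy = math.cos(angle), math.sin(angle)
    radii = []                                 # distance from the center at each step
    for _ in range(max_steps):
        x += path_length*dx                    # 1. new position
        y += path_length*dy
        r = math.hypot(x, y)                   # 2. has the photon left the sun?
        radii.append(r)
        if r > sun_radius:
            return radii                       # escaped
        if random.random() < collision_prob:   # 3. one random number decides a collision
            angle = random.uniform(0.0, 2.0*math.pi)
            dx, dy = math.cos(angle), math.sin(angle)
    return radii                               # cap reached; the photon is still inside

The max_steps cap is only there to guarantee termination, for the reason given above.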
I'm fairly new to programming and am trying to write a program using the built-in function 'integrate.solve_bvp' to determine the trajectory of a projectile subject to boundary conditions.
I'm not by any means a programmer, so my knowledge and understanding is extremely limited. Please explain like I'm 5.
I need to be able to determine the launch and final velocity of a projectile, given its launch angle, and the time taken for it to return to the ground while considering drag.
I've written some code, but it doesn't work and I don't know why. I've tried reading the documentation, but it all goes over my head.
I started by considering the 1-D case (launch angle of 90 degrees), which seemed fairly simple. I used SUVAT to get a guess of the launch velocity (i.e., the launch velocity ignoring drag) and wrote the following code:
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
drag_coef = 0.47 # average drag coefficient of a golf ball
air_density = 1.293
area = 1.45*10**-3 # cross sectional area of a golf ball
mass = 45.9*10**-3 # mass of a golf ball
g = -9.80665
def function(time, height):
    drag_factor = drag_coef * air_density * area / (2*mass)
    return height[1], (g - drag_factor*height[1]*np.abs(height[1]))

def boundary_conditions(height_0, height_end):
    return height_0[0], height_end[0]
time_scale = 10 # time at which you want projectile to hit the ground
velocity_0_guess = 49 # initial velocity guess (t=0)
time = np.linspace(0, time_scale, time_scale*1000+1)
height_0 = np.zeros(len(time))
velocity_0 = velocity_0_guess * np.ones(len(time))
height = np.array((height_0,velocity_0))
res = integrate.solve_bvp(function, boundary_conditions, time, height, max_nodes = time_scale*1000+1)
print(res.y[1][0]) # calculated V_0
print(res.y[1][time_scale*1000]) # calculated V_end
print(res)
plt.plot(time, res.y[0], label="S_z") # calculated S_z
plt.xlabel("time [s]")
plt.ylabel("displacement [m]")
plt.show()
plt.plot(time, res.y[1], label="V_z") # calculated V_z
plt.xlabel("time [s]")
plt.ylabel("velocity [m/s]")
plt.show()
However, even for this seemingly simple case, when I print(res) I get a success result of 'False', along with the following statement:
message: 'The maximum number of mesh nodes is exceeded.'
I don't know why, as I think I've defined the number of nodes to equal the number of points in time being considered.
Clearly, however, this isn't the case: when I halve my 'time_scale' and 'velocity_0_guess' to 5 and 24.5 respectively, I get a successful result, even though both should be equally valid:
message: 'The algorithm converged to the desired accuracy.'
I've tried to google the issue, but I haven't found anything that's been able to help me. I've looked through Reddit and Stack Overflow with no success, and I've even tried using ChatGPT to help fix my code, but that too was to no avail. So my next step is posting a question.
I don't know if this is relevant, but I've been writing this program via the website: repl.it
The solver works iteratively. It alternates between computing an approximate solution of the DE on a fixed time grid and refining the time grid using an error estimate along the approximate solution.
The general idea is to start with an initial guess that roughly gives the shape of the desired solution. This usually should have no more points than necessary to define the shape. The solver then fills in the grid.
The resulting solution is found either in the function table res.x, res.y or with the "dense output" interpolation res.sol.
For your constant guess it is completely sufficient to have a minimal grid of two points
time = [0,time_scale]
height = [[0.0]*2, [velocity_0_guess]*2]
This finishes without complaint, giving res.y[1,[0,-1]] = [ 95.93681148, -30.32436139] and the number of grid points as len(res.x) = 27. Visibly, the time grid is no longer that of time, so you need to use res.x in the plots.
You can get a denser grid and more accurate solution by setting the error tolerance tol lower than its default value 1e-3
res = integrate.solve_bvp(function, boundary_conditions, time, height, tol=1e-6);
giving len(res.x) = 186 and res.y[1,[0,-1]] = [ 95.93702666, -30.32440457].
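Putting the pieces together, a complete script might look like this sketch (same ODE and boundary conditions as in the question; the only changes are the two-point initial mesh and the tighter tolerance):

import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt

drag_coef = 0.47
air_density = 1.293
area = 1.45e-3
mass = 45.9e-3
g = -9.80665

def function(time, height):
    drag_factor = drag_coef * air_density * area / (2*mass)
    return height[1], g - drag_factor*height[1]*np.abs(height[1])

def boundary_conditions(height_0, height_end):
    return height_0[0], height_end[0]

time_scale = 10
velocity_0_guess = 49

# Minimal two-point initial grid; solve_bvp inserts nodes where needed.
time = np.array([0.0, time_scale])
height = np.array([[0.0, 0.0], [velocity_0_guess, velocity_0_guess]])

res = integrate.solve_bvp(function, boundary_conditions, time, height, tol=1e-6)
print(res.message)
print("V_0   =", res.y[1, 0])
print("V_end =", res.y[1, -1])

# Plot against the solver's own grid res.x, not the original guess grid.
plt.plot(res.x, res.y[0], label="S_z")
plt.xlabel("time [s]")
plt.ylabel("displacement [m]")
plt.legend()
plt.show()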
I have a question about the algorithm below. What confuses me is why x = random.random()*2 - 1 and y = random.random()*2 - 1 rather than simply x = random.random() and y = random.random()? The complete code is as follows:
import random

NUMBER_OF_TRIALS = 1000000
numberOfHits = 0
for i in range(NUMBER_OF_TRIALS):
    x = random.random()*2 - 1
    y = random.random()*2 - 1
    if x * x + y * y <= 1:
        numberOfHits += 1
pi = 4 * numberOfHits / NUMBER_OF_TRIALS
print("PI is", pi)
The circle in this simulation is centered at (0, 0) with a radius of 1, so
x = random.random() * 2 - 1
y = random.random() * 2 - 1
will make the range for each [-1, 1).
The interesting thing about this question is that the implementation works just as well, and gives you the same expected answer whether you use random.random() or random.random()*2-1... so the reason why the author chose to use random.random()*2-1 has nothing to do with what the program does.
The author of this code understands the algorithm as follows:
Imagine a circle inscribed in a square; use the unit circle because it's simplest
Choose random points within the square, and see how many are also inside the circle
The circle has area pi and the square has area 4, so the proportion of points that fall in the circle will approach pi/4. Calculate the measured ratio and solve for pi.
Now, the square in which the unit circle is inscribed goes from (-1,-1) to (1,1). Since random() only gives you a number in [0,1), it needs to be multiplied by two and shifted to select a random number in [-1,1), which chooses random points within the square.
If the author had used random(), then he would be selecting points within the first quadrant only. All the quadrants look exactly the same, so the ratio of hits to misses would be the same and the program would still work just fine, but then the program would not be implementing the above-described procedure, and would be more difficult to understand.
One of the most important properties of good code is that it clearly communicates the author's intent.
random() gives you a random float in [0, 1).
random()*2 - 1 gives you a random float in [-1, 1).
The algorithm, as usually explained, is in terms of the proportion of points in the unit square that are in the unit circle being pi/4, which is obvious after a moment's thought, and the second one gives you that directly.
It doesn't take much additional thought to see that using only the upper-right quadrant of the unit square and the unit circle will still give you pi/4 (although it is possible to confuse yourself and get it wrong, as I embarrassingly did in the first version of this answer). But it's not as blindingly obvious. And that might be a good enough reason for a tutorial to not do things that way.
If you were interested in calculating pi as efficiently as possible, it would probably make more sense to just use random(), and add a comment about how you're dividing both the unit square and the unit circle by the same value, so the odds are still pi/4. But if you're interested in showing novice programmers how to design and implement randomized algorithms? Probably better to write it the way it's written.
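For comparison, here is a sketch of that quadrant-only variant (equivalent in expectation, just further from the textbook picture):

import random

NUMBER_OF_TRIALS = 1000000
number_of_hits = 0
for _ in range(NUMBER_OF_TRIALS):
    # Sample only the first quadrant: x, y in [0, 1)
    x = random.random()
    y = random.random()
    if x*x + y*y <= 1:      # inside the quarter circle of area pi/4
        number_of_hits += 1

# quarter-circle area / unit-square area = pi/4, the same ratio as before
pi = 4 * number_of_hits / NUMBER_OF_TRIALS
print("PI is", pi)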
I want to generate random points on the surface of a cylinder such that the distance between the points falls in the range of 230 to 250. I used the following code to generate random points on the surface of a cylinder:
import random, math

H = 300
R = 20
s = random.random()
#theta = random.random()*2*math.pi
for i in range(0, 300):
    theta = random.random()*2*math.pi
    z = random.random()*H
    r = math.sqrt(s)*R
    x = r*math.cos(theta)
    y = r*math.sin(theta)
    print 'C', x, y, z
How can I generate random points such that the distances between them fall within that range (on the surface of the cylinder)?
This is not a complete solution, but an insight that should help. If you "unroll" the surface of the cylinder into a rectangle of width w = 2*pi*r and height h, the task of finding the distance between points is simplified. You have not explained how to measure "distance along the surface" between points on the top of the cylinder and points on the side; this is a slightly tricky bit of geometry.
As for computing the distance along the surface across the artificial "seam" we created, just use both (x1 - x2) and (w - x1 + x2); whichever gives the shorter distance is the one you want.
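As a sketch, assuming points on the side surface are given as (theta, z) pairs, that shortest surface distance could be computed like this (the helper name is mine):

import math

def side_surface_distance(p1, p2, radius):
    """Distance between two points on the side of a cylinder, measured
    along the unrolled (flat) surface; points are (theta, z) pairs."""
    w = 2*math.pi*radius                  # width of the unrolled rectangle
    dx = abs(radius*(p1[0] - p2[0]))      # separation along the circumference
    dx = min(dx, w - dx)                  # take the short way around the seam
    return math.hypot(dx, p1[1] - p2[1])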
I do think that @VincentNivoliers' suggestion to use Poisson disk sampling is very good, but with the constraints of h=300 and r=20 you will get terrible results no matter what.
The basic way of creating a set of random points with constraints on the distances between them is to have a function that modulates the probability of a point being placed at a certain location. This function starts out being a constant, and whenever a point is placed, the forbidden areas surrounding it are set to zero. That is difficult to do with continuous variables, but reasonably easy if you discretize your problem.
The other thing to be careful about is the being-on-a-cylinder part. It may be easier to think of it as random points on a rectangular area that repeats periodically. This can be handled in two different ways:
the simplest is to take into consideration not only the rectangular tile where you are placing the points, but also its neighbouring ones. Whenever you place a point in your main tile, you also place one in the neighboring ones and compute their effect on the probability function inside your tile.
a more sophisticated approach considers the probability function as the convolution of a kernel that encodes forbidden areas with a sum of delta functions corresponding to the points already placed. If this is computed using FFTs, the periodicity is a natural byproduct.
The first approach can be coded as follows:
from __future__ import division
import numpy as np

r, h = 20, 300
w = 2*np.pi*r
int_w = int(np.rint(w))
mult = 10
pdf = np.ones((h*mult, int_w*mult), bool)
points = []
min_d, max_d = 230, 250
available_locs = pdf.sum()
while available_locs:
    new_idx = np.random.randint(available_locs)
    new_idx = np.nonzero(pdf.ravel())[0][new_idx]
    new_point = np.array(np.unravel_index(new_idx, pdf.shape))
    points += [new_point]
    min_mask = np.ones_like(pdf)
    if max_d is not None:
        max_mask = np.zeros_like(pdf)
    else:
        max_mask = True
    for p in [new_point - [0, int_w*mult], new_point + [0, int_w*mult],
              new_point]:
        rows = ((np.arange(pdf.shape[0]) - p[0]) / mult)**2
        cols = ((np.arange(pdf.shape[1]) - p[1]) * 2*np.pi*r/int_w/mult)**2
        dist2 = rows[:, None] + cols[None, :]
        min_mask &= dist2 > min_d*min_d
        if max_d is not None:
            max_mask |= dist2 < max_d*max_d
    pdf &= min_mask & max_mask
    available_locs = pdf.sum()
points = np.array(points) / [mult, mult*int_w/(2*np.pi*r)]
If you run it with your values, the output is usually just one or two points, as the large minimum distance forbids all others. But if you run it with more reasonable values, e.g.
min_d, max_d = 50, 200
Here's how the probability function looks after placing each of the first 5 points:
Note that the points are returned as pairs of coordinates, the first being the height, the second the distance along the cylinder's circumference.
I'm trying to come up with an algorithm that will determine turning points in a trajectory of x/y coordinates. The following figure illustrates what I mean: green indicates the starting point and red the final point of the trajectory (the entire trajectory consists of ~1500 points):
In the following figure, I added by hand the possible (global) turning points that an algorithm could return:
Obviously, the true turning points are always debatable and will depend on the angle one requires between points. Furthermore, a turning point can be defined on a global scale (what I tried to do with the black circles), but could also be defined on a high-resolution local scale. I'm interested in the global (overall) direction changes, but I'd love to see a discussion of the different approaches one would use to tease apart global vs local solutions.
What I've tried so far:
calculate distance between subsequent points
calculate angle between subsequent points
look how distance / angle changes between subsequent points
Unfortunately this doesn't give me any robust results. I probably have to calculate the curvature along multiple points, but that's just an idea.
I'd really appreciate any algorithms / ideas that might help me here. The code can be in any programming language, matlab or python are preferred.
EDIT: here's the raw data (in case somebody wants to play with it):
mat file
text file (x coordinate first, y coordinate in second line)
You could use the Ramer-Douglas-Peucker (RDP) algorithm to simplify the path. Then you could compute the change in directions along each segment of the simplified path. The points corresponding to the greatest change in direction could be called the turning points:
A Python implementation of the RDP algorithm can be found on github.
import matplotlib.pyplot as plt
import numpy as np
import os
import rdp
def angle(dir):
    """
    Returns the angles between vectors.

    Parameters:
    dir is a 2D-array of shape (N,M) representing N vectors in M-dimensional space.

    The return value is a 1D-array of values of shape (N-1,), with each value
    between 0 and pi.

    0 implies the vectors point in the same direction
    pi/2 implies the vectors are orthogonal
    pi implies the vectors point in opposite directions
    """
    dir2 = dir[1:]
    dir1 = dir[:-1]
    return np.arccos((dir1*dir2).sum(axis=1)/(
        np.sqrt((dir1**2).sum(axis=1)*(dir2**2).sum(axis=1))))
tolerance = 70
min_angle = np.pi*0.22
filename = os.path.expanduser('~/tmp/bla.data')
points = np.genfromtxt(filename).T
print(len(points))
x, y = points.T
# Use the Ramer-Douglas-Peucker algorithm to simplify the path
# http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm
# Python implementation: https://github.com/sebleier/RDP/
simplified = np.array(rdp.rdp(points.tolist(), tolerance))
print(len(simplified))
sx, sy = simplified.T
# compute the direction vectors on the simplified curve
directions = np.diff(simplified, axis=0)
theta = angle(directions)
# Select the index of the points with the greatest theta
# Large theta is associated with greatest change in direction.
idx = np.where(theta>min_angle)[0]+1
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, 'b-', label='original path')
ax.plot(sx, sy, 'g--', label='simplified path')
ax.plot(sx[idx], sy[idx], 'ro', markersize = 10, label='turning points')
ax.invert_yaxis()
plt.legend(loc='best')
plt.show()
Two parameters were used above:
The RDP algorithm takes one parameter, the tolerance, which represents the maximum distance the simplified path can stray from the original path. The larger the tolerance, the cruder the simplified path.
The other parameter is the min_angle which defines what is considered a turning point. (I'm taking a turning point to be any point on the original path, whose angle between the entering and exiting vectors on the simplified path is greater than min_angle).
I will be giving numpy/scipy code below, as I have almost no Matlab experience.
If your curve is smooth enough, you could identify your turning points as those of highest curvature. Taking the point index number as the curve parameter, and using a central differences scheme, you can compute the curvature with the following code:
import numpy as np
import matplotlib.pyplot as plt
import scipy.ndimage
def first_derivative(x):
    return x[2:] - x[0:-2]

def second_derivative(x):
    return x[2:] - 2 * x[1:-1] + x[:-2]

def curvature(x, y):
    x_1 = first_derivative(x)
    x_2 = second_derivative(x)
    y_1 = first_derivative(y)
    y_2 = second_derivative(y)
    return np.abs(x_1 * y_2 - y_1 * x_2) / np.sqrt((x_1**2 + y_1**2)**3)
You will probably want to smooth your curve out first, then calculate the curvature, then identify the highest curvature points. The following function does just that:
def plot_turning_points(x, y, turning_points=10, smoothing_radius=3,
                        cluster_radius=10):
    if smoothing_radius:
        weights = np.ones(2 * smoothing_radius + 1)
        new_x = scipy.ndimage.convolve1d(x, weights, mode='constant', cval=0.0)
        new_x = new_x[smoothing_radius:-smoothing_radius] / np.sum(weights)
        new_y = scipy.ndimage.convolve1d(y, weights, mode='constant', cval=0.0)
        new_y = new_y[smoothing_radius:-smoothing_radius] / np.sum(weights)
    else:
        new_x, new_y = x, y
    k = curvature(new_x, new_y)
    turn_point_idx = np.argsort(k)[::-1]
    t_points = []
    while len(t_points) < turning_points and len(turn_point_idx) > 0:
        t_points += [turn_point_idx[0]]
        idx = np.abs(turn_point_idx - turn_point_idx[0]) > cluster_radius
        turn_point_idx = turn_point_idx[idx]
    t_points = np.array(t_points)
    t_points += smoothing_radius + 1
    plt.plot(x, y, 'k-')
    plt.plot(new_x, new_y, 'r-')
    plt.plot(x[t_points], y[t_points], 'o')
    plt.show()
Some explaining is in order:
turning_points is the number of points you want to identify
smoothing_radius is the radius of a smoothing convolution to be applied to your data before computing the curvature
cluster_radius is the distance from a point of high curvature selected as a turning point where no other point should be considered as a candidate.
You may have to play around with the parameters a little, but I got something like this:
>>> x, y = np.genfromtxt('bla.data')
>>> plot_turning_points(x, y, turning_points=20, smoothing_radius=15,
... cluster_radius=75)
Probably not good enough for a fully automated detection, but it's pretty close to what you wanted.
A very interesting question. Here is my solution, which allows for variable resolution. Fine-tuning it may not be simple, though, as it's mostly intended to narrow down the set of candidate points.
Every k points, calculate the convex hull and store it as a set. Go through the at most k points and remove any points that are not in the convex hull, in such a way that the points don't lose their original order.
The purpose here is that the convex hull will act as a filter, removing all of "unimportant points" leaving only the extreme points. Of course, if the k-value is too high, you'll end up with something too close to the actual convex hull, instead of what you actually want.
This should start with a small k, at least 4, then increase it until you get what you seek. You should also probably drop the middle point of every 3 points where the turn angle is below a certain amount, d; this would ensure that all of the turns are at least d degrees (not implemented in the code below, but sketched after it). However, this should probably be done incrementally to avoid loss of information, the same as increasing the k-value. Another possible improvement would be to actually re-run with the points that were removed, and only remove points that were not in both convex hulls, though this requires a higher minimum k-value of at least 8.
The following code seems to work fairly well, but could still use improvements for efficiency and noise removal. It's also rather inelegant in determining when it should stop, thus the code really only works (as it stands) from around k=4 to k=14.
def convex_filter(points, k):
    new_points = []
    for pts in (points[i:i + k] for i in xrange(0, len(points), k)):
        # convex_hull(pts) is assumed to return the hull vertices of pts
        hull = set(convex_hull(pts))
        for point in pts:
            if point in hull:
                new_points.append(point)
    return new_points

# How the points are obtained is a minor point, but they need to be in the right order.
x_coords = [float(x) for x in x.split()]
y_coords = [float(y) for y in y.split()]
points = zip(x_coords, y_coords)

k = 10
prev_length = 0
new_points = points

# Filter using the convex hull until no more points are removed
while len(new_points) != prev_length:
    prev_length = len(new_points)
    new_points = convex_filter(new_points, k)
Here is a screen shot of the above code with k=14. The 61 red dots are the ones that remain after the filter.
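The angle filter mentioned above might look something like this single-pass sketch (the helper name and the atan2-based turn angle are my own choices; applying it repeatedly, as with the k-value, tightens it incrementally):

import math

def angle_filter(points, min_turn_deg):
    # Drop the middle point of any consecutive triple whose turn angle
    # is below min_turn_deg; points is an ordered list of (x, y) tuples.
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        v1 = (cur[0] - prev[0], cur[1] - prev[1])
        v2 = (nxt[0] - cur[0], nxt[1] - cur[1])
        cross = v1[0]*v2[1] - v1[1]*v2[0]
        dot = v1[0]*v2[0] + v1[1]*v2[1]
        turn = abs(math.degrees(math.atan2(cross, dot)))  # 0 = straight ahead
        if turn >= min_turn_deg:
            kept.append(cur)
    kept.append(points[-1])
    return kept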
The approach you took sounds promising but your data is heavily oversampled. You could filter the x and y coordinates first, for example with a wide Gaussian and then downsample.
In MATLAB, you could use x = conv(x, normpdf(-10 : 10, 0, 5)) and then x = x(1 : 5 : end). You will have to tweak those numbers depending on the intrinsic persistence of the objects you are tracking and the average distance between points.
Then, you will be able to detect changes in direction very reliably, using the same approach you tried before, based on the scalar product, I imagine.
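In Python, a rough equivalent of that smoothing and downsampling might be (the sigma and step values are illustrative, just like the MATLAB ones):

import numpy as np
from scipy.ndimage import gaussian_filter1d

# Smooth x and y with a wide Gaussian, then keep every 5th sample
x_f = gaussian_filter1d(np.asarray(x, dtype=float), sigma=5)[::5]
y_f = gaussian_filter1d(np.asarray(y, dtype=float), sigma=5)[::5]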
Another idea is to examine the left and the right surroundings of every point. This may be done by fitting a linear regression to the N points before and after each point. If the two regression lines meet at a sufficiently sharp angle, then you have a corner.
This may be done efficiently by keeping a queue of the points currently in the linear regression and replacing old points with new points, similar to a running average.
You finally have to merge adjacent corners to a single corner. E.g. choosing the point with the strongest corner property.
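A sketch of this idea in Python (all parameter names and values are mine; it assumes x is roughly monotonic so the least-squares fits are well conditioned):

import numpy as np

def corner_indices(x, y, n=20, angle_thresh_deg=30):
    """Flag points where lines fit to the n points before and after
    meet at more than angle_thresh_deg (a sketch, not a full solution)."""
    corners = []
    for i in range(n, len(x) - n):
        # Slopes of the left and right neighbourhoods via least squares
        kl = np.polyfit(x[i-n:i+1], y[i-n:i+1], 1)[0]
        kr = np.polyfit(x[i:i+n+1], y[i:i+n+1], 1)[0]
        # Deviation from a straight line, as the angle between the fits
        ang = abs(np.degrees(np.arctan(kl) - np.arctan(kr)))
        if ang > angle_thresh_deg:
            # adjacent flagged indices should then be merged,
            # e.g. by keeping the one with the largest ang
            corners.append(i)
    return corners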
Operators used to examine the spectrum by eye: knowing the location and width of each peak, they could judge which piece the spectrum belongs to. In the new way, the image is captured by a camera onto a screen, and the width of each band must be computed programmatically.
Old system: spectroscope -> human eye
New system: spectroscope -> camera -> program
What is a good method to compute the width of each band, given its approximate X-axis position? This task used to be performed perfectly by eye and must now be performed by a program.
Sorry if I am short of details, but they are scarce.
Program listing that generated the previous graph; I hope it is relevant:
import Image
from scipy import *
from scipy.optimize import leastsq
# Load the picture with PIL, process if needed
pic = asarray(Image.open("spectrum.jpg"))
# Average the pixel values along vertical axis
pic_avg = pic.mean(axis=2)
projection = pic_avg.sum(axis=0)
# Set the min value to zero for a nice fit
projection /= projection.mean()
projection -= projection.min()
#print projection
# Fit function, two gaussians, adjust as needed
def fitfunc(p, x):
    return p[0]*exp(-(x-p[1])**2/(2.0*p[2]**2)) + \
           p[3]*exp(-(x-p[4])**2/(2.0*p[5]**2))
errfunc = lambda p, x, y: fitfunc(p,x)-y
# Use scipy to fit, p0 is inital guess
p0 = array([0,20,1,0,75,10])
X = xrange(len(projection))
p1, success = leastsq(errfunc, p0, args=(X,projection))
Y = fitfunc(p1,X)
# Output the result
print "Mean values at: ", p1[1], p1[4]
# Plot the result
from pylab import *
#subplot(211)
#imshow(pic)
#subplot(223)
#plot(projection)
#subplot(224)
#plot(X,Y,'r',lw=5)
#show()
subplot(311)
imshow(pic)
subplot(312)
plot(projection)
subplot(313)
plot(X,Y,'r',lw=5)
show()
Given an approximate starting point, you could use a simple algorithm that finds the local maximum closest to this point. Your fitting code may be doing that already (I wasn't sure whether you were using it successfully or not).
Here's some code that demonstrates simple peak finding from a user-given starting point:
#!/usr/bin/env python
from __future__ import division
import numpy as np
from matplotlib import pyplot as plt
# Sample data with two peaks: small one at t=0.4, large one at t=0.8
ts = np.arange(0, 1, 0.01)
xs = np.exp(-((ts-0.4)/0.1)**2) + 2*np.exp(-((ts-0.8)/0.1)**2)
# Say we have an approximate starting point of 0.35
start_point = 0.35
# Nearest index in "ts" to this starting point is...
start_index = np.argmin(np.abs(ts - start_point))
# Find the local maxima in our data by looking for a sign change in
# the first difference
# From http://stackoverflow.com/a/9667121/188535
maxes = (np.diff(np.sign(np.diff(xs))) < 0).nonzero()[0] + 1
# Find which of these peaks is closest to our starting point
index_of_peak = maxes[np.argmin(np.abs(maxes - start_index))]
print "Peak centre at: %.3f" % ts[index_of_peak]
# Quick plot showing the results: blue line is data, green dot is
# starting point, red dot is peak location
plt.plot(ts, xs, '-b')
plt.plot(ts[start_index], xs[start_index], 'og')
plt.plot(ts[index_of_peak], xs[index_of_peak], 'or')
plt.show()
This method will only work if the ascent up the peak is perfectly smooth from your starting point. If it needs to be more resilient to noise, PyDSTool seems like it might help (I have not used it). This SciPy post details how to use it for detecting 1D peaks in a noisy data set.
So assume at this point you've found the centre of the peak. Now for the width: there are several methods you could use, but the easiest is probably the "full width at half maximum" (FWHM). Again, this is simple and therefore fragile. It will break for close double-peaks, or for noisy data.
The FWHM is exactly what its name suggests: you find the width of the peak where it's halfway to the maximum. Here's some code that does that (it just continues on from above):
# FWHM...
half_max = xs[index_of_peak]/2
# This finds where in the data we cross over the halfway point to our peak. Note
# that this is global, so we need an extra step to refine these results to find
# the closest crossovers to our peak.
# Same sign-change-in-first-diff technique as above
hm_left_indices = (np.diff(np.sign(np.diff(np.abs(xs[:index_of_peak] - half_max)))) > 0).nonzero()[0] + 1
# Add "index_of_peak" to result because we cut off the left side of the data!
hm_right_indices = (np.diff(np.sign(np.diff(np.abs(xs[index_of_peak:] - half_max)))) > 0).nonzero()[0] + 1 + index_of_peak
# Find closest half-max index to peak
hm_left_index = hm_left_indices[np.argmin(np.abs(hm_left_indices - index_of_peak))]
hm_right_index = hm_right_indices[np.argmin(np.abs(hm_right_indices - index_of_peak))]
# And the width is...
fwhm = ts[hm_right_index] - ts[hm_left_index]
print "Width: %.3f" % fwhm
# Plot to illustrate FWHM: blue line is data, red circle is peak, red line
# shows FWHM
plt.plot(ts, xs, '-b')
plt.plot(ts[index_of_peak], xs[index_of_peak], 'or')
plt.plot(
[ts[hm_left_index], ts[hm_right_index]],
[xs[hm_left_index], xs[hm_right_index]], '-r')
plt.show()
It doesn't have to be the full width at half maximum — as one commenter points out, you can try to figure out where your operators' normal threshold for peak detection is, and turn that into an algorithm for this step of the process.
A more robust way might be to fit a Gaussian curve (or your own model) to a subset of the data centred around the peak — say, from a local minimum on one side to a local minimum on the other — and use one of the parameters of that curve (e.g. sigma) to calculate the width.
I realise this is a lot of code, but I've deliberately avoided factoring out the index-finding functions to "show my working" a bit more, and of course the plotting functions are there just to demonstrate.
Hopefully this gives you at least a good starting point to come up with something more suitable for your particular data set.
Late to the party, but for anyone coming across this question in the future...
Eye movement data looks very similar to this; I'd base an approach on the one used by Nyström and Holmqvist (2010).
Smooth the data using a Savitzky-Golay filter (scipy.signal.savgol_filter in scipy v0.14+) to get rid of some of the low-level noise while keeping the large peaks intact. The authors recommend an order of 2 and a window size of about twice the width of the smallest peak you want to be able to detect.
You can find where the bands are by arbitrarily removing all values above a certain y value (set them to numpy.nan). Then take the (nan)mean and (nan)standard deviation of the remainder, and remove all values greater than the mean + [parameter]*std (I think they use 6 in the paper). Iterate until you're not removing any data points, though depending on your data, certain values of [parameter] may not stabilise.
Then use numpy.isnan() to find events vs non-events, and numpy.diff() to find the start and end of each event (values of -1 and 1 respectively). To get even more accurate start and end points, you can scan along the data backward from each start and forward from each end to find the nearest local minimum with a value smaller than mean + [another parameter]*std (I think they use 3 in the paper). Then you just need to count the data points between each start and end.
This won't work for that double peak; you'd have to do some extrapolation for that.
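A sketch of the smoothing and iterative-threshold steps described above (the window length and polynomial order are placeholder values; k follows the 6 mentioned above):

import numpy as np
from scipy.signal import savgol_filter

def find_events(xs, window=11, order=2, k=6):
    # Smooth, then iteratively mask values above mean + k*std of the
    # unmasked data; the masked (NaN) runs are the events (bands).
    data = savgol_filter(np.asarray(xs, dtype=float), window, order)
    while True:
        thresh = np.nanmean(data) + k*np.nanstd(data)
        above = data > thresh
        if not above.any():
            break                              # nothing left to remove
        data[above] = np.nan
    events = np.isnan(data).astype(int)        # 1 inside an event, 0 outside
    starts = np.nonzero(np.diff(events) == 1)[0] + 1
    ends = np.nonzero(np.diff(events) == -1)[0] + 1
    return starts, ends                        # band widths are ends - starts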
The best method might be to statistically compare a bunch of methods with human results.
You would take a large variety of data and a large variety of measurement estimates (widths at various thresholds, area above various thresholds, different threshold selection methods, 2nd moments, polynomial curve fits of various degrees, pattern matching, etc.) and compare these estimates to human measurements of the same data set. Pick the estimate method that correlates best with expert human results. Or maybe pick several methods, the best one for each of various heights, various separations from other peaks, and so on.