Area under a parabola revisited

I could not follow up on my previous question: I was not a member of the website, so I was unable to comment on it when I returned.
Here is my question:
To find the area of a region bounded by the graph y=x^2 and the x-axis on the interval [a,b], we can approximate the region by drawing a number of "thin" rectangles and taking the sum of their areas. Let us divide [a,b] into n smaller intervals of the same width h=(b-a)/n. On each interval there is a rectangle with the height y=r^2, where r is the middle of that small interval on the x-axis. The area of that rectangle is hy. Write a Python function that takes a, b, and n as parameters and returns the approximate area of the region under the parabola y=x^2 using the above method. If you could please explain why your program works, that would be helpful.
Thanks to helpful members, I found the following program (please edit the program, as I am unable to / don't know how to):
def parabola(x):
    y = x*x
    return y
def approx_area(fn, a, b, n):
    """
    Approximate the area under fn in the interval [a,b]
    by adding the area of n rectangular slices.
    """
    a = float(a)
    b = float(b)
    area = 0.0
    for slice in range(n):
        left = a + (b-a)*slice/n
        right = a + (b-a)*(slice+1)/n
        mid = (left + right)*0.5
        height = fn(mid)
        width = right - left
        area += height * width
    return area
print "Area is", approx_area(parabola, -1.0, 1.0, 500)
However, I would need to place this under one entire function. Any ideas on how I can do this?

Okay, by changing the function to y = x and trying some known input values, I conclude that it works fine:
0 .. 1 => 0.5
0 .. 2 => 2.0
1 .. 2 => 1.5
0 .. 9 => 40.5
If you want it all in one function, just get rid of parabola(), remove the first parameter from the approx_area() function (and call), then change:
height = fn(mid)
to:
height = mid * mid
as in:
def approx_area(a, b, n):
    """
    Approximate the area under y = x*x in the interval [a,b]
    by adding the area of n rectangular slices.
    """
    a = float(a)
    b = float(b)
    area = 0.0
    for slice in range(n):
        left = a + (b-a)*slice/n
        right = a + (b-a)*(slice+1)/n
        mid = (left + right)*0.5
        height = mid * mid
        width = right - left
        area += height * width
    return area
print "Area is", approx_area(-1, 1, 500)
Note that I wouldn't normally give this much explicit help for homework but, since you've done the bulk of the work yourself, it's only a small nudge to push you across the line.
I would warn you against handing in this code as-is since a simple web search will easily find it here and your grades may suffer for that.
Examine it, understand how it works thoroughly, then try to re-code it yourself without looking at this source. That will assist you far more in your career than just blind copying, trust me.
And just so you understand the theory behind this method, consider the slice of the function y = x:
7   .
6  /|
5 / |
  | |
  | |
  | |
  | |
  | |
0 +-+
  567
The midpoint y co-ordinate (and also the height) of the top is (5 + 7) / 2, or 6, and the width is 2 so the area is 12.
Now this is in fact the actual area but that's only because of the formula we're using. For a non-linear formula, there will be inaccuracies because of the nature of the "line" at the top. Specifically, in your case, a parabola is curved.
But these inaccuracies get smaller and smaller as you use thinner and thinner slices, since any line tends towards a straight line (linear) as you shorten it. For the case above, if you divided that into two slices, the areas would be 5.5 x 1 and 6.5 x 1 for a total of 12. If your line weren't straight, the two-slice answer would be closer to reality than the one-slice answer.
For your parabola (but from x = 0 .. 1 to make my life easier, just double everything for x = -1 .. 1 since it's symmetrical around the y-axis), the worst case is a one-slice solution. In that case, the midpoint is at x = 0.5, y = 0.25 and, when you multiply that y by the width of 1, you get an area of 0.25.
With two slices (width = 0.5), the midpoints are at:
x      y        y * width
----   ------   ---------
0.25   0.0625   0.03125
0.75   0.5625   0.28125
                ---------
                0.31250
So the area estimate there is 0.3125.
With four slices (width = 0.25), the midpoints are at:
x       y          y * width
-----   --------   ----------
0.125   0.015625   0.00390625
0.375   0.140625   0.03515625
0.625   0.390625   0.09765625
0.875   0.765625   0.19140625
                   ----------
                   0.32812500
So the area estimate there is 0.328125.
With eight slices (width = 0.125), the midpoints are at:
x        y            y * width
------   ----------   -----------
0.0625   0.00390625   0.000488281
0.1875   0.03515625   0.004394531
0.3125   0.09765625   0.012207031
0.4375   0.19140625   0.023925781
0.5625   0.31640625   0.039550781
0.6875   0.47265625   0.059082031
0.8125   0.66015625   0.082519531
0.9375   0.87890625   0.109863281
                      -----------
                      0.332031248
So the area estimate there is 0.332031248 (exactly 0.33203125; the products above are truncated to nine decimal places).
As you can see, this is becoming closer and closer to the actual area of 1/3 (I know this since I know calculus, see below).
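The tables above can be reproduced with a few lines of Python. This is only a minimal sketch: `midpoint_area` is a name introduced here for illustration and is not part of the program in the question.

```python
def midpoint_area(f, a, b, n):
    """Midpoint-rule estimate of the area under f on [a, b] using n slices."""
    width = (b - a) / n
    # The height of each slice is f evaluated at the middle of that slice.
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


for n in (1, 2, 4, 8):
    # Prints 0.25, 0.3125, 0.328125 and 0.33203125 for the four slice counts.
    print(n, midpoint_area(lambda x: x * x, 0.0, 1.0, n))
```

Doubling the number of slices cuts the gap to 1/3 by roughly a factor of four each time.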
Hopefully, that will assist you in understanding the code you have.
If you really want to know how this works, you need to look into calculus, specifically integration and differentiation. These methods can take a formula and give you another formula for calculating the slope of a line and the area under the line.
But, unless you're going to be using it a lot and need real (mathematical) accuracy, you can probably just use the approximation methods you're learning about.

There is also a good visualization of this at http://en.wikipedia.org/wiki/Integral#Formal_definitions
We look at the section of the parabola between a and b, and we divide it into a set of vertical rectangular slices such that the top-center of each rectangle is exactly on the parabola.
This leaves one corner of each rectangle "hanging over" the parabola, and the other too low, leaving unfilled space; so the area under the parabola is equal to the area of the rectangle, plus a bit, minus a bit. But how do we compare the bits? Is the area of the rectangle a bit too much, or not quite enough?
If we draw a line tangent to the parabola at the top-center of the rectangle, we can "cut off" the overlapping bit, flip it over, and add it to the other side; note that this does not change the total area of the rectangle (now a trapezoid).
We can now see that there is a little bit of space left on either side under the parabola, so the area of the trapezoid is slightly less than the area under the parabola. We can now think of the trapezoid-tops as forming a bunch of straight-line segments (a "linear piecewise approximation") along the bottom of the parabola; and the area under the segments is almost the same as (but always slightly less than) the actual area we are seeking.
So how do we minimize the "slightly less than" amount, to make our calculated area more accurate? One way is to use curved approximation-pieces instead of straight lines; this leads into splines (Bezier curves, NURBS, etc). Another way is to use a larger number of shorter line-pieces, to "increase the resolution". Calculus takes this idea to the limit (pun intended), using an infinite number of infinitely short pieces.
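The "always slightly less" claim can be checked numerically. The sketch below is an illustration rather than part of the original answers; the helper names are made up here. For y = x*x the second derivative is constant, so the shortfall of the midpoint estimate works out to exactly (b - a)^3 / (12 n^2).

```python
def midpoint_area(a, b, n):
    """Midpoint-rule estimate of the area under y = x*x on [a, b]."""
    width = (b - a) / n
    return sum((a + (i + 0.5) * width) ** 2 for i in range(n)) * width


def exact_area(a, b):
    """Exact area under y = x*x on [a, b], from the antiderivative x**3 / 3."""
    return (b ** 3 - a ** 3) / 3.0


a, b = -1.0, 1.0
for n in (5, 50, 500):
    shortfall = exact_area(a, b) - midpoint_area(a, b, n)
    # For this parabola the shortfall matches (b - a)**3 / (12 * n**2).
    print(n, shortfall, (b - a) ** 3 / (12 * n ** 2))
```

The shortfall is positive for every n and shrinks by a factor of 100 each time n grows tenfold.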

Related

Monte Carlo Simulation to estimating pi using circle

I have a question about the algorithm below. What confuses me is why it uses x = random.random()*2 -1 and y = random.random()*2 -1 rather than simply x = random.random() and y = random.random(). The complete code is as follows:
import random
NUMBER_OF_TRIALS= 1000000
numberOfHits = 0
for i in range(NUMBER_OF_TRIALS):
    x = random.random()*2 -1
    y = random.random()*2 -1
    if x * x + y * y <=1:
        numberOfHits +=1
pi = 4* numberOfHits / NUMBER_OF_TRIALS
print("PI is", pi)

The circle in this simulation is centered at (0, 0) with a radius of 1, so
x = random.random() * 2 - 1
y = random.random() * 2 - 1
will make the range for each -1 to 1.

The interesting thing about this question is that the implementation works just as well, and gives you the same expected answer whether you use random.random() or random.random()*2-1... so the reason why the author chose to use random.random()*2-1 has nothing to do with what the program does.
The author of this code understands the algorithm as follows:
Imagine a circle inscribed in a square. Use the unit circle because it's simplest.
Choose random points within the square, and see how many are also inside the circle
The circle has area pi and the square has area 4, so the proportion of points that fall in the circle will approach pi/4. Calculate the measured ratio and solve for pi.
Now, the square in which the unit circle is inscribed goes from (-1,-1) to (1,1). Since random() only gives you a number in [0,1), it needs to be multiplied by two and shifted to select a random number in [-1,1), which chooses random points within the square.
If the author had used random(), then he would be selecting points within the first quadrant only. All the quadrants look exactly the same, so the ratio of hits to misses would be the same and the program would still work just fine, but then the program would not be implementing the above-described procedure, and would be more difficult to understand.
One of the most important properties of good code is that it clearly communicates the author's intent.
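A quick way to convince yourself that both sampling choices give the same answer is to run them side by side. This is a minimal sketch, not the original author's code; `estimate_pi` and its `full_square` flag are names invented for this illustration, and a fixed seed is used only to make the run repeatable.

```python
import random


def estimate_pi(n, full_square, seed=0):
    """Estimate pi from n random points.

    full_square=True samples [-1, 1) x [-1, 1); full_square=False samples
    only the first quadrant, [0, 1) x [0, 1).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if full_square:
            x, y = 2 * x - 1, 2 * y - 1
        if x * x + y * y <= 1:
            hits += 1
    # In both cases the hit ratio approaches pi/4.
    return 4 * hits / n


print(estimate_pi(200000, full_square=True))
print(estimate_pi(200000, full_square=False))
```

Both calls print a value close to pi; neither is systematically more accurate than the other.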

random() gives you a random float between 0 and 1.
random()*2 -1 gives you a random float between -1 and +1.
The algorithm, as usually explained, is in terms of the proportion of points in the unit square that are in the unit circle being pi/4, which is obvious after a moment's thought, and the second one gives you that directly.
It doesn't take much additional thought to see that using only the upper-right quadrant of the unit square and the unit circle will still give you pi/4 (although it is possible to confuse yourself and get it wrong, as I embarrassingly did in the first version of this answer). But it's not as blindingly obvious. And that might be a good enough reason for a tutorial to not do things that way.
If you were interested in calculating pi as efficiently as possible, it would probably make more sense to just use random(), and add a comment about how you're dividing both the unit square and the unit circle by the same value so the odds are still pi/4. But if you're interested in showing novice programmers how to design and implement randomized algorithms? Probably better to write it the way it's written.

Fall-off function for mountains

I am writing a Minecraft-like game with voxel terrain.
For mountains, I specify a location, a height and a size. There is a function that returns True if the block at the current (x, y, z) coordinate is part of a mountain. If a block is far away from the centre of a mountain, True is only returned if the z coord is below a maximum height for the distance from the mountain, i.e. the further from a mountain a block is, the lower the maximum height. So at the centre of a mountain, the maximum height is high, and True will be returned even if the z is high (I am using a z-up system). However, further away from the mountain, the maximum height will be lower.
However, my current function (below) returns them linearly, and real mountains do not have straight sides:
def isMountain(self, x, y, z, mountainPos, mountainSize, mountainHeight):
    if math.hypot(mountainPos[0] - x, mountainPos[1] - y) < mountainSize:
        if z < (mountainHeight - math.hypot(mountainPos[0] - x, mountainPos[1] - y)):
            return True
        else:
            return False
Line 3 checks whether z is less than the maximum height for the position, returning True if so and False otherwise.
These are the maximum heights for distances:
Distance: Max Height
0 - 10
1 - 9
2 - 8
...
9 - 1
10 - 0
How could I rewrite this function to make it return more mountain-like values: not linear, but rather a cubic or smooth fall-off (like Blender's proportional edit mode), so that it would give values more like this:
0 - 10
1 - 9
2 - 9
3 - 8
4 - 7
5 - 5
6 - 3
7 - 1

You can either break your head to find out some mathematical formula for this, or you could simulate the natural erosion process.
This is usually done using a grid (matrix, cells, ...) and iterating.
Basically you would start with more or less random high terrain, then erode it until mountains form, well actually mountains are what remains.
That said, this is usually more costly than using a simple function, but on modern computers this would work well.
Also see: https://www.gamasutra.com/blogs/MattKlingensmith/20130811/198049/How_we_Generate_Terrain_in_DwarfCorp.php

If you are interested in going another route, you could use a modified version of Perlin noise, controlling its amplitude and frequency, and then apply a smoothing transition to get what you want. You could set points to have a general height range and then let the noise algorithm create variability between the points. I have done something similar for an infinitely generated world with different biomes that have different kinds of mountain heights and shapes.

Maybe you could use an inverse tan function like this
https://www.desmos.com/calculator/sn7tbepuxh
Where h is the max height, s is the steepness and x is the distance from the centre of the peak. The -1 at the end allows negative values to be ignored so that the base of the mountain won't extend forever.
I've used this for a mountain generator for a small game and it seems to work fine, as long as you tweak your steepness and height values so the mountain isn't too spiky.
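For a concrete starting point, here is one possible smooth fall-off. It does not reproduce the formula behind the Desmos link; it swaps in a raised-cosine profile, and the names `max_height` and `is_mountain` are hypothetical. With a size and height of 10 it gives roughly 10, 9.8, 9.0, 7.9, 6.5, 5.0, 3.5, 2.1, ... which is close to the shape asked for in the question.

```python
import math


def max_height(distance, size, height):
    """Maximum mountain height at a horizontal distance from the peak.

    Raised-cosine profile: full height at the centre, zero at
    distance >= size, and a smooth S-shaped slope in between.
    """
    if distance >= size:
        return 0.0
    return height * 0.5 * (1.0 + math.cos(math.pi * distance / size))


def is_mountain(x, y, z, pos, size, height):
    """True if block (x, y, z) lies inside the mountain (z-up system)."""
    distance = math.hypot(pos[0] - x, pos[1] - y)
    return z < max_height(distance, size, height)


for d in range(11):
    print(d, round(max_height(d, 10, 10), 1))
```

Because the profile is a plain function of distance, it can be replaced by any other curve (cubic, arctangent, and so on) without touching `is_mountain`.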

Python creating density map efficiently

I was hoping for a bit of help to make my code run faster.
Basically I have a square grid of lat,long points in a list insideoceanlist. Then there is a directory containing data files of lat, long coords which represent lightning strikes for a particular day. The idea is for each day, we want to know how many lightning strikes there were around each point on the square grid. At the moment it is just two for loops, so for every point on the square grid, you check how far away every lightning strike was for that day. If it was within 40km I add one to that point to make a density map.
The starting grid has the overall shape of a rectangle, made up of squares with width of 0.11 and length 0.11. The entire rectangle is about 50x30. Lastly I have a shapefile which outlines the 'forecast zones' in Australia, and if any point in the grid is outside this zone then we omit it. So all the leftover points (insideoceanlist) are the ones in Australia.
There are around 100000 points on the square grid and even for a slow day there are around 1000 lightning strikes, so it takes a long time to process. Is there a way to do this more efficiently? I really appreciate any advice.
By the way I changed list2 into list3 because I heard that iterating over lists is faster than arrays in python.
for i in range(len(list1)): #list1 is a list of data files containing lat,long coords for lightning strikes for each day
    dict_density = {}
    for k in insideoceanlist: #insideoceanlist is a grid of ~100000 lat,long points
        dict_density[k] = 0
    list2 = np.loadtxt(list1[i],delimiter = ",") #this open one of the files containing lat,long coords and puts it into an array
    list3 = map(list,list2) #converts the array into a list
    # the following part is what I wanted to improve
    for j in insideoceanlist:
        for l in list3:
            if great_circle(l,j).meters < 40000: #great_circle is a function which measures distance between points the two lat,long points
                dict_density[j] += 1
    #
    filename = 'example' +str(i) + '.txt'
    with open(filename, 'w') as f:
        for m in range(len(insideoceanlist)):
            f.write('%s\n' % (dict_density[insideoceanlist[m]])) #writes each point in the same order as the insideoceanlist
        f.close()

To elaborate a bit on @DanGetz's answer, here is some code that uses the strike data as the driver, rather than iterating the entire grid for each strike point. I'm assuming you're centered on Australia's median point, with 0.11 degree grid squares, even though the size of a degree varies by latitude!
Some back-of-the-envelope computation with a quick reference to Wikipedia tells me that your 40km distance is a ±4 grid-square range from north to south, and a ±5 grid-square range from east to west. (It drops to 4 squares in the lower latitudes, but ... meh!)
The trick here, as mentioned, is to convert from strike position (lat/lon) to grid square in a direct, formulaic manner. Figure out the position of one corner of the grid, subtract that position from the strike, then divide by the size of the grid (0.11 degrees), truncate, and you have your row/col indexes. Now visit all the surrounding squares within the search window, which is at most (2 * 4 + 1) * (2 * 5 + 1) = 99 squares to check for distance. Increment the squares within range.
The result is that I'm doing at most 99 visits times 1000 strikes (or however many you have) as opposed to visiting 100,000 grid squares times 1000 strikes. This is a significant performance gain.
Note that you don't describe your incoming data format, so I just randomly generated numbers. You'll want to fix that. ;-)
#!python3
"""
Per WikiPedia (https://en.wikipedia.org/wiki/Centre_points_of_Australia)
Median point
============
The median point was calculated as the midpoint between the extremes of
latitude and longitude of the continent.
24 degrees 15 minutes south latitude, 133 degrees 25 minutes east
longitude (24°15′S 133°25′E); position on SG53-01 Henbury 1:250 000
and 5549 James 1:100 000 scale maps.
"""
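# Assumption: great_circle is geopy's, matching the question's great_circle(...).meters usage.
from geopy.distance import great_circle
MEDIAN_LAT = -(24.00 + 15.00/60.00)
MEDIAN_LON = (133 + 25.00/60.00)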
"""
From the OP:
The starting grid has the overall shape of a rectangle, made up of
squares with width of 0.11 and length 0.11. The entire rectange is about
50x30. Lastly I have a shapefile which outlines the 'forecast zones' in
Australia, and if any point in the grid is outside this zone then we
omit it. So all the leftover points (insideoceanlist) are the ones in
Australia.
"""
DELTA_LAT = 0.11
DELTA_LON = 0.11
GRID_WIDTH = 50.0 # degrees
GRID_HEIGHT = 30.0 # degrees
GRID_ROWS = int(GRID_HEIGHT / DELTA_LAT) + 1
GRID_COLS = int(GRID_WIDTH / DELTA_LON) + 1
LAT_SIGN = 1.0 if MEDIAN_LAT >= 0 else -1.0
LON_SIGN = 1.0 if MEDIAN_LON >= 0 else -1.0
GRID_LOW_LAT = MEDIAN_LAT - (LAT_SIGN * GRID_HEIGHT / 2.0)
GRID_HIGH_LAT = MEDIAN_LAT + (LAT_SIGN * GRID_HEIGHT / 2.0)
GRID_MIN_LAT = min(GRID_LOW_LAT, GRID_HIGH_LAT)
GRID_MAX_LAT = max(GRID_LOW_LAT, GRID_HIGH_LAT)
GRID_LOW_LON = MEDIAN_LON - (LON_SIGN * GRID_WIDTH / 2.0)
GRID_HIGH_LON = MEDIAN_LON + (LON_SIGN * GRID_WIDTH / 2.0)
GRID_MIN_LON = min(GRID_LOW_LON, GRID_HIGH_LON)
GRID_MAX_LON = max(GRID_LOW_LON, GRID_HIGH_LON)
GRID_PROXIMITY_KM = 40.0
"""https://en.wikipedia.org/wiki/Longitude#Length_of_a_degree_of_longitude"""
_Degree_sizes_km = (
    (0, 110.574, 111.320),
    (15, 110.649, 107.551),
    (30, 110.852, 96.486),
    (45, 111.132, 78.847),
    (60, 111.412, 55.800),
    (75, 111.618, 28.902),
    (90, 111.694, 0.000),
)
# For the Australia situation, +/- 15 degrees means that our worst
# case scenario is about 40 degrees south. At that point, a single
# degree of longitude is smallest, with a size about 80 km. That
# in turn means a 40 km distance window will span half a degree or so.
# Since grid squares are 0.11 degree across, we have to check +/- 5
# cols.
GRID_SEARCH_COLS = 5
# Latitude degrees are nice and constant-like at about 110km. That means
# a .11 degree grid square is 12km or so, making our search range +/- 4
# rows.
GRID_SEARCH_ROWS = 4
def make_grid(rows, cols):
    return [[0 for col in range(cols)] for row in range(rows)]
Grid = make_grid(GRID_ROWS, GRID_COLS)
def _col_to_lon(col):
    return GRID_LOW_LON + (LON_SIGN * DELTA_LON * col)
Col_to_lon = [_col_to_lon(c) for c in range(GRID_COLS)]
def _row_to_lat(row):
    return GRID_LOW_LAT + (LAT_SIGN * DELTA_LAT * row)
Row_to_lat = [_row_to_lat(r) for r in range(GRID_ROWS)]
def pos_to_grid(pos):
    lat, lon = pos
    if lat < GRID_MIN_LAT or lat >= GRID_MAX_LAT:
        print("Lat limits:", GRID_MIN_LAT, GRID_MAX_LAT)
        print("Position {} is outside grid.".format(pos))
        return None
    if lon < GRID_MIN_LON or lon >= GRID_MAX_LON:
        print("Lon limits:", GRID_MIN_LON, GRID_MAX_LON)
        print("Position {} is outside grid.".format(pos))
        return None
    # Apply the sign so indexes count up from the LOW edge, mirroring
    # _row_to_lat and _col_to_lon (LAT_SIGN is -1 for Australia).
    row = int(LAT_SIGN * (lat - GRID_LOW_LAT) / DELTA_LAT)
    col = int(LON_SIGN * (lon - GRID_LOW_LON) / DELTA_LON)
    return (row, col)
def visit_nearby_grid_points(pos, dist_km):
    cell = pos_to_grid(pos)
    if cell is None:
        return  # strike is outside the grid
    row, col = cell
    # Scan the whole +/- search window, including the strike's own row and column.
    for dr in range(-GRID_SEARCH_ROWS, GRID_SEARCH_ROWS + 1):
        r = row + dr
        if not 0 <= r < GRID_ROWS:
            continue  # skip cells off the grid instead of wrapping around
        for dc in range(-GRID_SEARCH_COLS, GRID_SEARCH_COLS + 1):
            c = col + dc
            if not 0 <= c < GRID_COLS:
                continue
            gridpos = Row_to_lat[r], Col_to_lon[c]
            # dist_km is in kilometres, so compare against .kilometers, not .meters
            if great_circle(pos, gridpos).kilometers <= dist_km:
                Grid[r][c] += 1
def get_pos_from_line(line):
    """
    FIXME: Don't know the format of your data, just random numbers
    """
    import random
    return (random.uniform(GRID_LOW_LAT, GRID_HIGH_LAT),
            random.uniform(GRID_LOW_LON, GRID_HIGH_LON))
with open("strikes.data", "r") as strikes:
    for line in strikes:
        pos = get_pos_from_line(line)
        visit_nearby_grid_points(pos, GRID_PROXIMITY_KM)

If you know the formula that generates the points on your grid, you can probably find the closest grid point to a given point quickly by reversing that formula.
Below is a motivating example, that isn't quite right for your purposes because the Earth is a sphere, not flat or cylindrical. If you can't easily reverse the grid point formula to find the closest grid point, then maybe you can do the following:
create a second grid (let's call it G2) that is a simple formula like below, with big enough boxes such that you can be confident that the closest grid point to any point in one box will either be in the same box, or in one of the 8 neighboring boxes.
create a dict which stores which original grid (G1) points are in which box of the G2 grid
take the point p you're trying to classify, and find the G2 box it would go into
compare p to all the G1 points in this G2 box, and all the immediate neighbors of that box
choose the G1 point of these that's closest to p
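The five steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested solution for real coordinates: the function names are invented here, the G2 boxes are plain squares of side `box`, and distance is measured on a flat plane (you would swap in a great-circle distance for real latitude/longitude data).

```python
import math
from collections import defaultdict


def build_buckets(g1_points, box):
    """Map each G2 box (a pair of integers) to the G1 points that fall inside it."""
    buckets = defaultdict(list)
    for lat, lon in g1_points:
        buckets[(math.floor(lat / box), math.floor(lon / box))].append((lat, lon))
    return buckets


def closest_g1_point(p, buckets, box):
    """Closest G1 point to p, searching p's G2 box and its 8 neighbours."""
    bi, bj = math.floor(p[0] / box), math.floor(p[1] / box)
    candidates = [q
                  for di in (-1, 0, 1)
                  for dj in (-1, 0, 1)
                  for q in buckets.get((bi + di, bj + dj), [])]
    if not candidates:
        return None
    # Flat-plane distance; an assumption made only to keep the sketch short.
    return min(candidates, key=lambda q: math.hypot(q[0] - p[0], q[1] - p[1]))


g1 = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
buckets = build_buckets(g1, box=2.0)
print(closest_g1_point((0.9, 0.2), buckets, box=2.0))  # (1.0, 0.0)
```

Building the buckets is a one-off cost; after that each lookup only touches the handful of G1 points in nine boxes.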
Motivating example with a perfect flat grid
If you had a perfect square grid on a flat surface, that isn't rotated, with sides of length d, then their points can be defined by a simple mathematical formula. Their latitude values will all be of the form
lat0 + d * i
for some integer value i, where lat0 is the lowest-numbered latitude, and their longitude values will be of the same form:
long0 + d * j
for some integer j. To find what the closest grid point is for a given (lat, long) pair, you can separately find its latitude and longitude. The closest latitude number on your grid will be where
i = round((lat - lat0) / d)
and likewise j = round((long - long0) / d) for the longitude.
So one way you can go forward is to plug that in to the formulas above, and get
grid_point = (lat0 + d * round((lat - lat0) / d),
              long0 + d * round((long - long0) / d))
and just increment the count in your dict at that grid point. This should make your code much, much faster than before, because instead of checking thousands of grid points for distance, you directly found the grid point with a couple calculations.
You can probably make this a little faster by using the i and j numbers as indexes into a multidimensional array, instead of using grid_point in a dict.
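As a rough sketch of the index-based variant, assuming a flat, unrotated grid: the grid origin, spacing, array size and strike positions below are all made-up values for illustration, and `nearest_grid_index` is a hypothetical helper name.

```python
def nearest_grid_index(lat, lon, lat0, lon0, d):
    """Row/column index (i, j) of the grid point closest to (lat, lon)."""
    return round((lat - lat0) / d), round((lon - lon0) / d)


# Hypothetical grid: origin (-40.0, 110.0), spacing 0.11 degrees.
LAT0, LON0, D = -40.0, 110.0, 0.11
ROWS, COLS = 300, 460
counts = [[0] * COLS for _ in range(ROWS)]

# Made-up strike positions, only to exercise the lookup.
for strike in [(-39.78, 110.34), (-39.80, 110.30), (-30.0, 120.0)]:
    i, j = nearest_grid_index(strike[0], strike[1], LAT0, LON0, D)
    if 0 <= i < ROWS and 0 <= j < COLS:  # ignore strikes that fall off the grid
        counts[i][j] += 1

print(counts[2][3], counts[91][91])
```

Note that this only credits the single nearest grid point per strike; counting every grid point within 40 km still needs a scan of the neighbouring cells around (i, j).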

Have you tried using Numpy for the indexing? You can use multi-dimensional arrays, and the indexing should be faster because Numpy arrays are essentially a Python wrapper around C arrays.
If you need further speed increases, take a look at Cython, a Python to optimized C converter. It is especially good for multi-dimensional indexing, and should be able to speed this type of code by about an order of magnitude. It'll add a single additional dependency to your code, but it's a quick install, and not too difficult to implement.
(Benchmarks), (Tutorial using Numpy with Cython)
Also as a quick aside, use
for listI in list1:
    ...
    list2 = np.loadtxt(listI, delimiter=',')
# or if that doesn't work, at least use xrange() rather than range()
Essentially, in Python 2 you should only use range() when you explicitly need the list that range() generates. In your case it won't make much difference, because it is the outermost loop.

Can't accurately calculate pi on Python

I am a new member here and I'm going to dive straight into this, as I've spent my whole Sunday trying to get my head around it.
I'm new to Python, having previously learned coding on C++ to a basic-intermediate level (it was a 10-week university module).
I'm trying a couple of iterative techniques to calculate Pi but both are coming up slightly inaccurate and I'm not sure why.
The first method I was taught at university - I'm sure some of you have seen it done before.
import random
x=0.0
y=0.0
incircle = 0.0
outcircle = 0.0
pi = 0.0
i = 0
while (i<100000):
    x = random.uniform(-1,1)
    y = random.uniform(-1,1)
    if (x*x+y*y<=1):
        incircle=incircle+1
    else:
        outcircle=outcircle+1
    i=i+1
pi = (incircle/outcircle)
print pi
It's essentially a generator for random (x,y) co-ordinates on a plane from -1 to +1 on both axes. Then if x^2+y^2 <= 1, we know the point rests inside a circle of radius 1 within the box formed by the co-ordinate axes.
Depending on the position of the point, a counter increases for incircle or outcircle.
The value for pi is then the ratio of values inside and outside the circle. The co-ordinates are randomly generated so it should be an even spread.
However, even at very high iteration values, my result for Pi is always around the 3.65 mark.
The second method is another iteration which calculates the circumference of a polygon with increasing number of sides until the polygon is almost a circle, then, Pi=Circumference/diameter. (I sort of cheated because the coding has a math.cos(Pi) term so it looks like I'm using Pi to find Pi, but this is only because you can't easily use degrees to represent angles on Python). But even for high iterations the final result seems to end around 3.20, which again is wrong. The code is here:
import math
S = 0.0
C = 0.0
L = 1.0
n = 2.0
k = 3.0
while (n<2000):
    S = 2.0**k
    L = L/(2.0*math.cos((math.pi)/(4.0*n)))
    C = S*L
    n=n+2.0
    k=k+1.0
pi = C/math.sqrt(2.0)
print pi
I remember, when doing my C++ course, being told that the problem is a common one and it isn't due to the maths but because of something within the coding, however I can't remember exactly. It may be to do with the random number generation, or the limitations of using floating point numbers, or... anything really. It could even just be my mathematics...
Can anyone think what the issue is?
TL;DR: Trying to calculate Pi, I can get close to it but never very accurately, no matter how many iterations I do.
(Oh and another point - in the second code there's a line saying S=2.0**k. If I set 'n' to anything higher than 2000, the value for S becomes too big to handle and the code crashes. How can I fix this?)
Thanks!

The algorithm for your first version should look more like this:
from __future__ import division, print_function
import random
import sys
if sys.version_info.major < 3:
    range = xrange  # Python 2: avoid building a 100000-element list
incircle = 0
n = 100000
for _ in range(n):  # throwaway loop variable, so n keeps its value for the division
    x = random.random()
    y = random.random()
    if (x*x + y*y <= 1):
        incircle += 1
pi = (incircle / n) * 4
print(pi)
Prints something like:
3.14699146991
This is closer. Increase n to get even closer to pi.
The algorithm takes into account only one quarter of the unit circle, i.e. with a radius of 1.
The formula for the area of a quarter circle is:
area_c = (pi * r **2) / 4
That for the area of the square containing this circle:
area_s = r **2
where r is the radius of the circle.
Now the ratio is:
area_c / area_s
Substitute the equations above, rearrange, and you get:
pi = 4 * (area_c / area_s)
Going Monte Carlo, just replace each area by the number of random points that land in it: area_c becomes the number of points inside the quarter circle and area_s the total number of points. Typically, the analogy of darts thrown randomly is used here.

For the first one, your calculation should be
pi = incircle/1000000*4 # 3.145376..
This is the number of points that landed inside of the circle over the number of total points (approximately 0.785671 on my run).
With a radius of 1 (random.uniform(-1,1)), the total area of the square is 4, so if you multiply 4 by the ratio of points that landed inside of the circle, you get the correct answer.
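This also explains the 3.65 in the question: dividing the inside count by the outside count estimates (pi/4) / (1 - pi/4), which is about 3.66, rather than pi. The sketch below shows both ratios on the same sample; the fixed seed is only there to make the run repeatable.

```python
import random

rng = random.Random(1)
n = 100000
incircle = 0
for _ in range(n):
    x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
    if x * x + y * y <= 1:
        incircle += 1
outcircle = n - incircle

# inside / outside tends to (pi/4) / (1 - pi/4), about 3.66
print(incircle / outcircle)
# 4 * inside / total tends to pi
print(4 * incircle / n)
```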

Python - area of irregular polygon results in negative value?

Good morning all!
I have to calculate the area of a polygon using python.
The formula to do that is given by (sorry, can't post pictures yet..)
A = ((x0*y1 - y0*x1) + (x1*y2 - y1*x2) + ... + (xn-1*y0 - yn-1*x0)) / 2
This is the code I came up with. However, it results in a (correct) negative value, and I have no idea why.
Would it be valid to simply multiply the area times -0.5, or is there something wrong with my code?
Any help is greatly appreciated!!
polygon = [[0,0],[-1,5],[2,3],[1,5],[3,6],[4,5],[5,3],[8,-2],[4,-4],[2,-5]]
area = 0.0
n = len(polygon)
for i in range(n):
    i1 = (i+1)%n
    area += polygon[i][0]*polygon[i1][1] - polygon[i1][0]*polygon[i][1]
area *= 0.5
print 'area = ', area

The formula works by computing a sum of the cross products of each pair of vectors between the origin and each end of the line segment composing the polygon. In essence, the area is computed as the difference between the area of the green and red triangles in the picture below. (Note that the red triangles are partially underneath the green ones.)
The sign of the cross product depends on the orientation of the vectors, i.e. if we can make the second vector to align with the first by turning it left or right. Therefore, you will get either a negative or positive area depending on whether the points in the polygon are ordered clockwise or counter-clockwise. The solution is correct, but you need to apply the abs() function to the result, since you don't need the sign.

The sign of the final answer is based on the orientation of the polygon. You can check it by taking the reverse list of polygon in the given example.
polygon = [[0,0],[-1,5],[2,3],[1,5],[3,6],[4,5],[5,3],[8,-2],[4,-4],[2,-5]]
polygon.reverse()
...
In this case you'll find the area to be positive, though it is essentially the same polygon.
You can read more about why orientation makes area negative here.
You simply need to take the absolute value of the final result.
print 'area = ', abs(area)
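To see the orientation effect directly, here is a small self-contained sketch in Python 3 syntax; `signed_area` is a name chosen for this illustration. It uses the polygon from the question, whose vertices happen to be listed clockwise.

```python
def signed_area(points):
    """Shoelace formula: positive for counter-clockwise order, negative for clockwise."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total


polygon = [[0, 0], [-1, 5], [2, 3], [1, 5], [3, 6], [4, 5], [5, 3], [8, -2], [4, -4], [2, -5]]
print(signed_area(polygon))                  # -53.5: vertices run clockwise
print(signed_area(list(reversed(polygon))))  # 53.5: same polygon, counter-clockwise
print(abs(signed_area(polygon)))             # 53.5
```

Keeping the sign around can be useful when you need to know the winding order; take abs() only at the point where you want the plain area.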
