Probability when throwing dice - python

n = 5
k = 3
x = 1
for i in range(1, n + 1):
    x = x * i
y = 1
for i in range(1, k + 1):
    y = y * i
z = 1
for i in range(1, n - k + 1):
    z = z * i
c = x / (y * z)
p = 1
for i in range(k):
    p = p * (1 / 6)
q = 1
for i in range(n - k):
    q = q * (5 / 6)
result = c * p * q
The code above calculates the probability of seeing exactly 3 sixes when throwing 5 dice. However, I'm unsure about the loops in this code.
I'm aware that:
n = number of trials
k = number of successes
And p and q are the success and failure probabilities?
But what are the loops doing, and what are the variables x, y, z and c? Traditionally I would just use powers to get the answer for these types of questions, so I'm unsure about this method.
Thank you

It looks like c is the binomial coefficient used in this probability calculation, and the loops before it calculate the required factorials (x is n!, y is k!, and z is (n-k)!).
The loops for p and q do indeed calculate the powers of the success and failure probabilities: p ends up as (1/6)**k and q as (5/6)**(n-k).
This code would be much nicer using pow and math.factorial.
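As a sketch of what that suggestion might look like, the same calculation can be written with math.comb (available from Python 3.8) and the ** operator. The function name binomial_probability is made up for this illustration and does not appear in the question.

```python
from math import comb

def binomial_probability(n, k, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p (hypothetical helper)."""
    # comb(n, k) is the binomial coefficient n! / (k! * (n - k)!)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 3 sixes when throwing 5 dice
result = binomial_probability(5, 3, 1 / 6)
print(result)
```

The three factorial loops collapse into comb(n, k), and the two power loops become p**k and (1 - p)**(n - k).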

Related

How to cut down on run-time for finding dimensions of a sphere Python?

I'm supposed to write code that prints the total number of integer solutions to the inequality x^2 + y^2 + z^2 <= n, where n is a user-inputted integer between 1 and 2000 (inclusive), and adds up the numbers of solutions for all previous values as well (e.g. n=1 gives 7 and n=2 on its own gives 19, so for n=2 it prints 26, and so forth). Here is my code:
import math
import itertools
n = int(input("Please enter an integer between 1 and 2000: "))
def sphereCombos(radius):
    for i in range(1, radius+1):
        count = 0
        CombosList = []
        rad = int(math.sqrt(i))
        range_for_x= range(-rad, rad + 1)
        range_for_y= range(-rad, rad + 1)
        range_for_z= range(-rad, rad + 1)
        total_perms = list(itertools.product(range_for_x, range_for_y, range_for_z))
        for x, y, z in total_perms:
            if x*x+ y*y + z*z <= i:
                count = count + 1
    return count
possible_combos = 0
for i in range(1, n + 1):
    possible_combos = possible_combos + sphereCombos(i)
print(possible_combos)
The code works exactly as it's supposed to, but the problem is when n is set to be 2000, the program takes way too long, and I need to get it to run in 2 minutes or less. I thought using .product() would make it much faster than using three nested for loops, but that didn't end up being super true. Is there any way for me to cut down on run time?
The code is working because i is in global scope. Notice that sphereCombos(i) passes i to radius, but the function is actually using the global i in rad = int(math.sqrt(i)) and in if x * x + y * y + z * z <= i:
import math
import itertools
n = int(input("Please enter an integer between 1 and 2000: "))
def sphereCombos(radius):
    count = 0
    rad = int(math.sqrt(radius))  # Changed from i to radius
    range_for_x = range(-rad, rad + 1)
    range_for_y = range(-rad, rad + 1)
    range_for_z = range(-rad, rad + 1)
    total_perms = list(
        itertools.product(range_for_x, range_for_y, range_for_z))
    for x, y, z in total_perms:
        if x * x + y * y + z * z <= radius:  # Changed from i to radius
            count = count + 1
    return count
possible_combos = 0
for i in range(1, n + 1):
    possible_combos = possible_combos + sphereCombos(i)
print(possible_combos)
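Each call to sphereCombos still walks a whole cube of points, and that work is repeated for every i. One possible way to cut this down (a sketch, not the method used in the answer above) is to visit every lattice point only once, record how many points have each exact value of x*x + y*y + z*z, and then build the running totals from that table. The function name cumulative_sphere_counts and the use of math.isqrt (Python 3.8+) are choices made for this illustration.

```python
import math

def cumulative_sphere_counts(n):
    """Sum over i = 1..n of the number of integer points (x, y, z)
    with x*x + y*y + z*z <= i (hypothetical helper)."""
    rad = math.isqrt(n)
    # exact[s] = number of lattice points with x*x + y*y + z*z == s
    exact = [0] * (n + 1)
    for x in range(-rad, rad + 1):
        for y in range(-rad, rad + 1):
            xy = x * x + y * y
            if xy > n:
                continue  # no z can bring this point inside the sphere
            for z in range(-rad, rad + 1):
                s = xy + z * z
                if s <= n:
                    exact[s] += 1
    total = 0
    within = exact[0]  # the origin satisfies the inequality for every i
    for i in range(1, n + 1):
        within += exact[i]  # points with squared distance <= i
        total += within
    return total

print(cumulative_sphere_counts(2))  # 26, matching the example in the question
```

For n = 2000 the cube has 89 points per side, so the triple loop visits roughly 700,000 points in total instead of rebuilding a product list for every i.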

How do you control the number of times a recursive function calls itself?

I have two functions right now. The first is a simple recursive function that calculates exponents in n steps.
The second function, which is my primary problem, should take n/2 steps. I am confused about how to control the number of recursive calls, which is represented by n.
This is an assignment question and I am not allowed to use loops of any kind: "while" and "for" are not allowed, only if/else, so please go easy on me because I know it looks simple.
def simple_recursive_power(x, n):
    print("n="+str(n))
    if n ==1:
        return x
    else:
        return x* simple_recursive_power(x,n-1)
print("the simple recurse method="+ str(simple_recursive_power(3,3)))
# the above works, the one below is working the wrong way
def advanced_recursive_power(x, n):
    print("n="+str(n))
    if n <= 1:
        return x
    else:
        return x * advanced_recursive_power(x, n-1/2)
print("advanced recursion="+ str(advanced_recursive_power(3,3)))
The better exponential function that takes half of the cycles does not just need an adjustment of N, it needs a better algorithm.
The simple exponent works like this: take N steps, at each step multiply what you have with X.
If you want to halve the number of steps, the crucial detail to notice is that, if you multiply with X*X, you are taking two steps at a time.
def advanced_recursive_power(x, n):
    print("n="+str(n))
    if n == 0:
        return 1
    elif n == 1:
        return x
    else:
        return x * x * advanced_recursive_power(x, n - 2)
Now, this cuts down the number of function invocations, but not the number of multiplications: for example with N = 7, we went from X * X * X * X * X * X * X to X * (X * X) * (X * X) * (X * X). If we could just pre-calculate X * X, we could actually cut down on multiplications as well... This will calculate (X2 = X * X); X * X2 * X2 * X2, with four multiplications, not six:
def super_advanced_recursive_power(x, n):
    # needs n >= 2: simple_recursive_power has no base case for an exponent of 0
    print("n="+str(n))
    if n % 2 == 0:
        start = 1
    else:
        start = x
    return start * simple_recursive_power(x * x, n // 2)
You can drastically cut down the number of steps if you pass on powers:
def arp(x, n):
    """Advanced recursive power equivalent to x ** n"""
    print('#', x, n)
    if n == 0:
        return 1
    elif n == 1:
        return x
    elif n % 2:  # odd exponential
        return x * arp(x * x, (n - 1) // 2)
    else:  # even exponential
        return arp(x * x, n // 2)
This takes O(log n) steps only.
>>> arp(3, 15)
# 3 15
# 9 7
# 81 3
# 6561 1
14348907
This is the equivalent of expressing addition as a series of decrements and increments:
def recursive_add(x, y):
    if y == 0:
        return x
    return recursive_add(x + 1, y - 1)
This uses that x + y == (x + 1) + (y - 1). Similarly, for powers the relation x ** n == (x * x) ** (n / 2) holds true. While this is pretty slow for addition (a linear number of steps), it is fast for powers (a logarithmic number of steps, because n shrinks exponentially).
This exploits that even powers repeat terms. For example, 2**8 can be written as ((2 * 2) * (2 * 2)) * ((2 * 2) * (2 * 2)) - notice how the term 2 * 2 and (2 * 2) * (2 * 2) repeats. We can rewrite 2**8 as ((2 ** 2) ** 2) ** 2. This is exactly what the last term for even exponentials does recursively.
For odd exponentials, we have the problem that we would go from, say, 2 ** 3 to 4 ** 1.5. Thus, we use x ** n == x * (x ** (n - 1)) to go from an odd to an even exponent. Since we have excluded the case of n == 1, we know that n >= 3 and thus it is safe to proceed with x * x, (n-1) // 2 directly.
The net effect is that n is halved on each step, not just once.
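To make the difference in call counts concrete, here is a small sketch that returns the number of calls alongside the result instead of printing. The names power_linear and power_halving and the extra calls parameter are invented for this comparison; the recursion itself follows simple_recursive_power and arp above, and no loops are used.

```python
def power_linear(x, n, calls=1):
    """x ** n with one recursive call per unit of n; returns (value, calls)."""
    if n == 0:
        return 1, calls
    value, calls = power_linear(x, n - 1, calls + 1)
    return x * value, calls

def power_halving(x, n, calls=1):
    """x ** n by squaring the base and halving n; returns (value, calls)."""
    if n == 0:
        return 1, calls
    if n == 1:
        return x, calls
    value, calls = power_halving(x * x, n // 2, calls + 1)
    # an odd exponent leaves one extra factor of x behind
    return (x * value if n % 2 else value), calls

print(power_linear(3, 15))   # (14348907, 16)
print(power_halving(3, 15))  # (14348907, 4)
```

The four calls of power_halving correspond to the four traced lines printed by arp(3, 15).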

Overlap Integrals in Python - Storing Results in Array

I have a set of basis functions defined as:
# Assumed imports, inferred from the math, np and sp prefixes used below
import math
import numpy as np
import scipy.special as sp
def HO_wavefunction(x, n, x0, omega, m=1):
    N = 1.0 / math.sqrt(2**n * math.factorial(n)) * ((m * omega)/(math.pi))**(0.25)  # Normalization constant
    y = (np.sqrt(m * omega)) * (x - x0)
    return N * np.exp(-y * y / 2.0) * sp.hermite(n)(y)
# Define the basis
def enol_basis(x, n):
    return HO_wavefunction(x, n, x0=Enolminx, omega=wenol)
I now want to compute the overlap integrals Sii = integral((Si*Si)dx), Sjj = integral((Sj*Sj)dx) and Sij = integral((Si*Sj)dx) of my basis functions and store them in some type of array. I tried the following:
G = 10
S = np.empty([G,G])
for n in range (G-1):
    for m in range (G-1):
        S[n][m]= np.trapz(enol_basis(x,n)*enol_basis(x,m),x)
print (S[n][m])
This only returns a single value instead of all the results stored in an array. If anyone could help me compute the overlap integrals as I defined them above and store the results in an array I would really appreciate it!
Solution:
G = 50
S = np.zeros([G,G])
for n in range (G):
    for m in range (G):
        S[n,m]= np.trapz(enol_basis(x,n)*enol_basis(x,m),x)
print (S)
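A quick way to sanity-check such an overlap matrix is that harmonic-oscillator eigenfunctions sharing the same centre and frequency should be orthonormal, so S should come out close to the identity. The sketch below checks this using only the standard library, so it runs without NumPy or SciPy. It assumes x0 = 0, omega = 1, m = 1 and a grid from -10 to 10, since Enolminx, wenol and x are not given in the question, and it evaluates the Hermite polynomials with the usual three-term recurrence in place of sp.hermite. All function names are illustrative.

```python
import math

def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) via H_{k+1} = 2*y*H_k - 2*k*H_{k-1}."""
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h

def ho_wavefunction(x, n, x0=0.0, omega=1.0, m=1.0):
    """Same formula as HO_wavefunction, for a single float x (assumed parameters)."""
    norm = (m * omega / math.pi) ** 0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    y = math.sqrt(m * omega) * (x - x0)
    return norm * math.exp(-y * y / 2.0) * hermite(n, y)

def trapezoid(values, dx):
    """Trapezoidal rule for equally spaced samples."""
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))

def overlap_matrix(size, xs):
    """S[i][j] = integral of basis_i(x) * basis_j(x) dx over the grid xs."""
    dx = xs[1] - xs[0]
    basis = [[ho_wavefunction(x, n) for x in xs] for n in range(size)]
    return [[trapezoid([a * b for a, b in zip(basis[i], basis[j])], dx)
             for j in range(size)]
            for i in range(size)]

xs = [-10.0 + 0.01 * k for k in range(2001)]  # assumed integration grid
S = overlap_matrix(4, xs)
for row in S:
    print(["%.6f" % value for value in row])
```

The diagonal entries are the Sii integrals and the off-diagonal entries are the Sij integrals, all stored in one nested list.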

Way to solve constraint satisfaction faster than brute force?

I have a CSV that provides a y value for three different x values for each row. When read into a pandas DataFrame, it looks like this:
      5    10    20
0 -13.6 -10.7 -10.3
1 -14.1 -11.2 -10.8
2 -12.3  -9.4  -9.0
That is, for row 0, at 5 the value is -13.6, at 10 the value is -10.7, and at 20 the value is -10.3. These values are the result of an algorithm in the form:
def calc(x, r, b, c, d):
    if x < 10:
        y = (x * r + b) / x
    elif x >= 10 and x < 20:
        y = ((x * r) + (b - c)) / x
    else:
        y = ((x * r) + (b - d)) / x
    return y
I want to find the value of r, b, c, and d for each row. I know certain things about each of the values. For example, for each row: r is in np.arange(-.05, -.11, -.01), b is in np.arange(0, -20.05, -.05), and c and d are in np.arange(0, 85, 5). I also know that d is <= c.
Currently, I am solving this with brute force. For each row, I iterate through every combination of r, b, c, and d and test if the value at the three x values is equal to the known value from the DataFrame. This works, giving me a few combinations for each row that are basically the same except for rounding differences.
The problem is that this approach takes a long time when I need to run it against 2,000+ rows. My question is: is there a faster way than iterating and testing every combination? My understanding is that this is a constraint satisfaction problem but, after that, I have no idea what to narrow in on; there are so many types of constraint satisfaction problems (it seems) that I'm still lost (I'm not even certain that this is such a problem!). Any help in pointing me in the right direction would be greatly appreciated.
I hope I understood the task correctly.
If you know the resolution/discretization of the parameters, it looks like a discrete-optimization problem (in general: hard), which could be solved by CP-approaches.
But if you allow these values to be continuous (and reformulate the formulas), it is:
(1) a Linear Program: if checking for feasible values (there needs to be a valid solution)
(2) a Linear Program: if optimizing parameters for minimization of sum of absolute differences (=errors)
(3) a Quadratic Program: if optimizing parameters for minimization of sum of squared differences (=errors) / equivalent to minimizing euclidean-norm
All three versions can be solved efficiently!
Here is a non-general (could be easily generalized) implementation of (3) using cvxpy to formulate the problem and ecos to solve the QP. Both tools are open-source.
Code
import numpy as np
import time
from cvxpy import *
from random import uniform
""" GENERATE TEST DATA """
def sample_params():
    while True:
        r = uniform(-0.11, -0.05)
        b = uniform(-20.05, 0)
        c = uniform(0, 85)
        d = uniform(0, 85)
        if d <= c:
            return r, b, c, d
def calc(x, r, b, c, d):
    if x < 10:
        y = (x * r + b) / x
    elif x >= 10 and x < 20:
        y = ((x * r) + (b - c)) / x
    else:
        y = ((x * r) + (b - d)) / x
    return y
N = 2000
sampled_params = [sample_params() for i in range(N)]
data_5 = np.array([calc(5, *sampled_params[i]) for i in range(N)])
data_10 = np.array([calc(10, *sampled_params[i]) for i in range(N)])
data_20 = np.array([calc(20, *sampled_params[i]) for i in range(N)])
data = np.empty((N, 3))
for i in range(N):
    data[i, :] = [data_5[i], data_10[i], data_20[i]]
""" SOLVER """
def solve(row):
    """ vars """
    R = Variable(1)
    B = Variable(1)
    C = Variable(1)
    D = Variable(1)
    E = Variable(3)
    """ constraints """
    constraints = []
    # bounds
    constraints.append(R >= -.11)
    constraints.append(R <= -.05)
    constraints.append(B >= -20.05)
    constraints.append(B <= 0.0)
    constraints.append(C >= 0.0)
    constraints.append(C <= 85.0)
    constraints.append(D >= 0.0)
    constraints.append(D <= 85.0)
    constraints.append(D <= C)
    # formula of model
    constraints.append((1.0 / 5.0) * B + R == row[0] + E[0])  # alternate function form: b/x+r
    constraints.append((1.0 / 10.0) * B - (1.0 / 10.0) * C + R == row[1] + E[1])  # alternate function form: b/x-c/x+r
    constraints.append((1.0 / 20.0) * B - (1.0 / 20.0) * D + R == row[2] + E[2])  # alternate function form: b/x-d/x+r
    """ Objective """
    objective = Minimize(norm(E, 2))
    """ Solve """
    problem = Problem(objective, constraints)
    problem.solve(solver=ECOS, verbose=False)
    return R.value, B.value, C.value, D.value, E.value
start = time.time()
for i in range(N):
    r, b, c, d, e = solve(data[i])
end = time.time()
print('seconds taken: ', end-start)
print('seconds per row: ', (end-start) / N)
Output
('seconds taken: ', 20.620506048202515)
('seconds per row: ', 0.010310253024101258)
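If the parameters really must stay on the discrete grids from the question, there is also a way to shrink the brute-force search itself. Rewriting the model as y = r + (b - offset) / x shows that once r is fixed, the three known values determine b, c and d directly, so only the handful of r candidates has to be enumerated. The sketch below is not part of the answer above and rests on several assumptions: the y values are exact rather than rounded to one decimal as in the table (rounded data would need a larger tol or a re-check against the rounded values), the endpoints -0.11, -20.05 and 85 are excluded as np.arange documents, and the names solve_row and on_grid are invented here.

```python
def calc(x, r, b, c, d):
    """Model from the question, rewritten as y = r + (b - offset) / x."""
    if x < 10:
        return r + b / x
    if x < 20:
        return r + (b - c) / x
    return r + (b - d) / x

def on_grid(value, step, tol):
    """True if value is within tol of a multiple of step."""
    return abs(value - step * round(value / step)) <= tol

def solve_row(y5, y10, y20, tol=1e-6):
    """Return every (r, b, c, d) on the assumed grids that reproduces the row."""
    solutions = []
    for k in range(6):            # r = -0.05, -0.06, ..., -0.10
        r = -0.05 - 0.01 * k
        b = 5.0 * (y5 - r)        # from y5  = r + b / 5
        c = b - 10.0 * (y10 - r)  # from y10 = r + (b - c) / 10
        d = b - 20.0 * (y20 - r)  # from y20 = r + (b - d) / 20
        in_range = (-20.0 - tol <= b <= tol
                    and -tol <= d <= c + tol
                    and c <= 80.0 + tol)
        if (in_range and on_grid(b, 0.05, tol)
                and on_grid(c, 5.0, tol) and on_grid(d, 5.0, tol)):
            solutions.append((round(r, 2), round(b, 2), round(c), round(d)))
    return solutions

# Round trip with made-up parameters: r=-0.08, b=-10.0, c=20, d=10
row = [calc(x, -0.08, -10.0, 20.0, 10.0) for x in (5, 10, 20)]
print(solve_row(*row))
```

This replaces the four nested loops over r, b, c and d with a single loop over six values of r per row.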

Computing integral with the Trapezoidal Rule (the approximate value of the integral, and the number of iterations)

The program needs to compute a definite integral to a predetermined accuracy (eps) with the trapezoidal rule, and my function needs to return:
1. the approximate value of the integral.
2. the number of iterations.
My code:
from math import *
def f1(x):
    return (x ** 2 - 1)**(-0.5)
def f2(x):
    return (cos(x)/(x + 1))
def integral(f,a,b,eps):
    n = 2
    x = a
    h = (b - a) / n
    sum = 0.5 * (f(a) + f(b))
    for i in range(n):
        sum = sum + f(a + i * h)
    sum_2 = h * sum
    k = 0
    flag = 1
    while flag == 1:
        n = n * 2
        sum = 0
        k = k + 1
        x = a
        h = (b - a) / n
        sum = 0.5 * (f(a) + f(b))
        for i in range(n):
            sum = sum + f(a + i * h)
        sum_new = h * sum
        if eps > abs(sum_new - sum_2):
            t1 = sum_new
            t2 = k
            return t1, t2
        else:
            sum_2 = sum_new
x1 = float(input("First-begin: "))
x2 = float(input("First-end: "))
y1 = float(input("Second-begin: "))
y2 = float(input("Second-end: "))
int_1 = integral(f1,x1,y1,1e-6)
int_2 = integral(f2,x2,y2,1e-6)
print(int_1)
print(int_2)
It doesn't work correctly. Help, please!
You implemented the math wrong. The error is in the lines
for i in range(n):
    sum = sum + f(a + i * h)
range(n) always starts at 0, so in your first iteration you just add the f(a) term again.
If you replace it with
for i in range(1, n):
    sum = sum + f(a + i * h)
it works.
Also, you have a ton of redundant code; you basically coded the core of the integration algorithm twice. Try to follow the DRY-principle.
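One way the DRY advice could be applied, as a sketch rather than the only possible refactoring: move the composite trapezoidal sum into its own helper (called trapezoid here, a name chosen for this example) and let integral call it for each refinement. The range(1, n) fix from above is included.

```python
from math import cos

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on [a, b] with n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):  # interior points only; the endpoints are counted above
        total += f(a + i * h)
    return h * total

def integral(f, a, b, eps):
    """Double n until two successive estimates differ by less than eps.
    Returns (estimate, number of refinement iterations)."""
    n = 2
    previous = trapezoid(f, a, b, n)
    iterations = 0
    while True:
        n *= 2
        iterations += 1
        current = trapezoid(f, a, b, n)
        if abs(current - previous) < eps:
            return current, iterations
        previous = current

def f2(x):
    return cos(x) / (x + 1)

print(integral(f2, 0.0, 1.0, 1e-6))
```

The integration core now exists once, so a fix like the range(1, n) change only has to be made in one place.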
The trapezoidal rule of integration simply says that an approximation to the integral $\int_a^b f(x) dx$ is (b-a) (f(a)+f(b))/2. The error is proportional to (b-a)^3, so it is possible to get a better estimate using the composite rule, i.e., subdividing the initial interval into a number of shorter intervals.
Is it possible to use shorter intervals and still reuse the function values previously computed, thus minimizing the total number of function evaluations?
Yes, it is possible if we divide each interval into two equal parts, so that at stage 0 we use 1 interval, at stage 1 we use 2 equal intervals and in general, at stage n, we use 2**n equal intervals.
Let's start with a simple problem and see if it is possible to generalize the procedure…
a, b = 0, 32
L = b-a = 32
By the trapezoidal rule, the initial approximation, say I0, is given by
I0 = L * (f0+f1)/2
   = L * S0
with S0 = (f0+f1)/2; a pictorial representation of the real axis, the coordinates of the interval extremes and the evaluated functions follows
x0                              x1
012345678901234567890123456789012
f0                              f1
Next, we divide the original interval in two,
L = L/2
x0              x2              x1
012345678901234567890123456789012
f0              f2              f1
and the new approximation, stage n=1, is obtained using two times the trapezoidal rule and applying a bit of algebra
I1 = L * (f0+f2)/2 + L * (f2+f1)/2
   = L * [(f0+f1)/2 + f2]
   = L * [S0 + S1]
with S1 = f2
Another subdivision, stage n=2, L = L/2 and
x0      x3      x2      x4      x1
012345678901234567890123456789012
f0      f3      f2      f4      f1
I2 = L * [(f0+f3) + (f3+f2) + (f2+f4) + (f4+f1)] / 2
   = L * [(f0+f1)/2 + f2 + (f3+f4)]
   = L * [S0+S1+S2]
with S2 = f3 + f4.
It is not difficult, given this picture,
x0  x5  x3  x6  x2  x7  x4  x8  x1
012345678901234567890123456789012
f0  f5  f3  f6  f2  f7  f4  f8  f1
to understand that our next approximation can be computed as follows
L = L/2
S3 = f5+f6+f7+f8
I3 = L*[S0+S1+S2+S3]
Now, we have to understand how to compute a generalization of Sn for n = 1, …; for us, the pseudocode is
L_n = (b-a)/2**n
list_x_n = list(a + L_n + 2*L_n*j for j=0, …, 2**(n-1)-1)
Sn = Sum(f(xj) for each xj in list_x_n)
For n = 3, L = (b-a)/8 = 4, we have from the formula above list_x_n = [4, 12, 20, 28], please check with the picture...
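The same check can be done in a couple of lines of Python. This is only an illustration of the pseudocode; the helper name new_points_at_stage is made up, and it uses the fact that stage n adds 2**(n-1) new midpoints.

```python
def new_points_at_stage(a, b, n):
    """Abscissas that are new at stage n (n >= 1) of the interval halving."""
    L_n = (b - a) / 2**n
    # each new point is the midpoint of an interval from the previous stage
    return [a + L_n + 2 * L_n * j for j in range(2**(n - 1))]

print(new_points_at_stage(0, 32, 3))  # [4.0, 12.0, 20.0, 28.0]
```

For stages 1 and 2 it gives [16.0] and [8.0, 24.0], i.e. x2 and then x3, x4 in the pictures.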
Now we are ready to code our algorithm in Python
def trapaezia(f, a, b, tol):
    "returns integ(f, (a,b)), estimated error and number of evaluations"
    from math import fsum  # controls accumulation of rounding errors in sums
    L = b - a
    S = (f(a)+f(b))/2
    I = L*S
    n = 1
    while True:
        L = L/2
        new_points = (a+L+j*L for j in range(0, n+n, 2))
        delta_S = fsum(f(x) for x in new_points)
        new_S = S + delta_S
        new_I = L*new_S
        # error is estimated using Richardson extrapolation (REP)
        err = (new_I - I) * 4/3
        if abs(err) > tol:
            n = n+n
            S, I = new_S, new_I
        else:
            # we return a better estimate using again REP
            return (4*new_I-I)/3, err, n+n+1
If you are curious about Richardson extrapolation, I recommend this document that deals exactly with the application of REP to the trapezoidal rule quadrature algorithm.
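As a tiny illustration of what the extrapolation step buys, the sketch below applies the combination (4*new_I - I)/3 to two plain trapezoidal estimates of the integral of x**3 over [0, 1], whose exact value is 0.25. The helper trapezoid and the test function cube are chosen for this example only.

```python
def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal intervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def cube(x):
    return x ** 3

coarse = trapezoid(cube, 0.0, 1.0, 1)   # one interval
fine = trapezoid(cube, 0.0, 1.0, 2)     # two intervals
extrapolated = (4 * fine - coarse) / 3  # same combination as in trapaezia
print(coarse, fine, extrapolated)       # 0.5 0.3125 0.25
```

Neither trapezoidal estimate is close, but the extrapolated value is exact here, because this combination coincides with Simpson's rule, which integrates cubics exactly.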
If you are curious about math.fsum, the docs don't say too much, but they link to the original implementation, which also includes an extended explanation of all the issues involved.
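A short demonstration of the effect, comparing a plain running total with math.fsum on ten copies of 0.1 (the example values are arbitrary):

```python
from math import fsum

values = [0.1] * 10

naive = 0.0
for v in values:
    naive += v  # each addition rounds, and the rounding errors accumulate

print(naive)         # 0.9999999999999999
print(fsum(values))  # 1.0
```

fsum tracks the intermediate rounding errors and returns the correctly rounded sum, which is why trapaezia uses it when adding up the new function values.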
