I made this short bit of code to calculate the chance of success when rolling dice, and it worked very well... but not with big numbers. See the code; I'll explain better below.
def calc_dados(f_sucessos=1, faces=6, n_dados=1):
    p_max = (f_sucessos/faces)**n_dados  # chance that all dice succeed
    fator = 1
    p_meio = 0
    for i in range(n_dados-1):
        p_meio += ((f_sucessos/faces)**(n_dados-fator) * ((faces-f_sucessos)/faces)**(n_dados-(n_dados-fator))) * n_dados
        fator += 1
    p = p_max + p_meio
    return p*100
So, OK, it seemed to work, so why not see how the chances improve as a function of adding dice? The more dice, the better the chance. So I made this tiny table with pandas:
import pandas as pd

f_sucessos = 1  # how many faces count as a success
faces = 2       # faces of the die
n_dados = 10    # number of dice rolled

suc_list = []
for i in range(0, n_dados): suc_list.append(f_sucessos)
fac_list = []
for i in range(0, n_dados): fac_list.append(faces)
cha_list = []
for i in range(0, n_dados): cha_list.append(calc_dados(f_sucessos, faces, i+1))

df = pd.DataFrame(
    {
        "n_dados": range(1, n_dados+1),
        "faces": fac_list,
        "sucessos": suc_list,
        "chance": cha_list
    }
)
df
The results were very strange... So I wrote out a coin probability table by hand and tested it, treating the coin as a 2-faced die. The correct table is this:
table of correct, brute-force-tested results
But if you use my code to build the same table, the result is this:
table of the results of my code
Can anybody help me understand why, at a certain point, the probabilities fall when they should rise? For example: the chance of at least 1 'head' in 4 coins should be 93.75%, but my code says it is 81.25%...
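For reference, the expected value follows from the complement rule, P(at least one success) = 1 - P(no successes); the quick check below is added for illustration and is not part of my script:

faces, f_sucessos, n_dados = 2, 1, 4
# complement rule: P(at least one head) = 1 - P(no heads)
print((1 - ((faces - f_sucessos) / faces) ** n_dados) * 100)  # 93.75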
To be honest, I don't quite get how 'calc_dados' calculates the probability of a success when rolling dice.
So instead, I implemented what is maybe a more naive approach:
First, we calculate the total number of possible outcomes: outcomes_total = faces ** n_dados
Second, we count the successful outcomes: outcomes_success
At last: p = outcomes_success / outcomes_total
I'm going to add a mathematical proof for my version of the function a bit later :)
from math import comb

def calc_dados(f_sucessos=1, faces=6, n_dados=1):
    assert f_sucessos <= faces
    outcomes_total = faces ** n_dados
    outcomes_success = 0
    f_fail = faces - f_sucessos
    # count the outcomes with exactly i successes, for every i from 1 to n_dados
    for i in range(1, n_dados + 1):
        one_permutation = (f_sucessos ** i) * (f_fail ** (n_dados - i))
        n_permutations = comb(n_dados, i)
        outcomes_success += one_permutation * n_permutations
    p = outcomes_success / outcomes_total
    return p * 100
These are some testing results
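For instance, checking it against the coin case from the question (example calls I've added for illustration):

print(calc_dados(f_sucessos=1, faces=2, n_dados=4))   # 93.75, i.e. 1 - (1/2)**4
print(calc_dados(f_sucessos=1, faces=6, n_dados=3))   # ~42.13, i.e. 1 - (5/6)**3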
Now my code, based on the images I posted, sums all the exact chances to find the chance of at least 1 success.
Below the code I'll comment on the changes.
from decimal import Decimal

def dado(fs=1, ft=6, d=1, ns=1, exato=False):
    '''
    fs = success faces
    ft = total faces
    d = n of dice rolled
    ns = n of expected successes
    exato = True: chance of exactly ns events; False: chance of at least ns events
    '''
    s = Decimal(str(fs/ft))
    f = Decimal(str((ft-fs)/ft))
    d_int = d
    d = Decimal(str(d))
    ns = Decimal(str(ns))
    p_max = Decimal(str(s))**Decimal(str(d))
    fator = 1
    po_soma = 0
    for i in range(d_int-1):
        po = (Decimal(str(s))**(Decimal(str(d))-fator) * Decimal(str(f))**(Decimal(str(d))-(Decimal(str(d))-fator)))*Decimal(str(d))
        po_soma += po
        if exato == True:
            p_max = 0
            break
        fator += 1
    return f'{(p_max + po_soma)*100:.2f}%'
dado(1,2,5,1)
First - not a change: it still doesn't work well.
Second - I'm now using the 'fs' variable for the number of faces that mean success, and the 'ns' variable for how many successes we are looking for; so fs = 1 and ns = 2 on 3d6 means 'the chance of finding at least 2 of 1 specific face when rolling 3 dice'.
Third - I'm using Decimal because I realized that multiplying fractions can generate very small numbers whose precision suffers (but it doesn't solve the initial problem, so Decimal may be kicked out soon).
Fourth - exato (exact) is now a flag that breaks the loop and gives us just the 'exact ns' value instead of the 'at least ns' value. So exato=True in the last example means 'the chance of finding exactly 2 of 1 specific face when rolling 3 dice', a much smaller number.
That's it. My thanks to @Raibek, who is trying to solve this problem with combinations; I'll study that approach too, but if you have any ideas, please let me know.
Hello people, it's finally solved!
First I would like to thank Raibek, who solved it using combinations; I didn't realize it was solved when he posted it, and below I'll tell you how and why.
If you haven't been following the history of this code, you just need to know that it is used to calculate the probability of getting at least ns successes when rolling d dice. The solution codes are at the end of this answer.
I found out how to solve the problem by talking to a friend, Eber, who pointed me to an alternative way to check the data: anydice.com. I quickly realized that my visual check, assembling tables in Excel/Calc, was wrong - but why?
Well, reading the large-numbers table for 7d6, where the error was already very evident, my friend showed me that although the calculation worked at first, my table did not contain all the possible combinations. And the more possibilities there were, the more my calculations failed, with the odds getting smaller as more dice were added to the roll.
These are the combinations I was considering, in this example for the 7d6 case:
In the first code, the calculation for each term was:
successes**factor * failures**factor * d
The mistake is in assuming that the number of possible combinations equals d (which coincidentally holds up to 3 dice, which was as far as my earlier tests went, thanks to 1! = 1 and 2! = 2).
Now notice that, in the 7d6 example, the 'exactly 3' block is missing some possible combinations, in yellow:
The correct count for this term of the equation is:
factorial(d) / (factorial(failures) * factorial(successes))
With this count we can find the chance of rolling exactly n of a given face; then, if we want to know, for example, the chance of getting the number 1 at least once in 3d6, we just add the chances of getting it exactly 1 time, 2 times and 3 times - which the code already did well.
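To make that concrete, here is the 'exactly 3 successes on 7d6' term counted both ways (an illustrative check; math.comb computes the same binomial coefficient):

from math import comb, factorial

d, sucessos = 7, 3
falhas = d - sucessos
print(comb(d, sucessos))                                          # 35
print(factorial(d) // (factorial(falhas) * factorial(sucessos)))  # 35, the same count
# the first version of the code used d = 7 here, missing 28 combinations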
Finally, let's get to the code:
Daniel-Eber solution:
from math import factorial

def dado(fs=1, ft=6, d=1, ns=1, exato=False):
    '''
    fs = faces that count as a success
    ft = total faces
    d = number of dice rolled
    ns = number of expected successes, interpreted according to exato
    exato = True: chance of exactly ns successes; False: chance of at least ns successes
    '''
    s = fs/ft
    f = (ft-fs)/ft
    p_max = s**d  # chance that every die succeeds
    falhas = 1
    po_soma = 0
    if exato == False:
        # add the chance of every number of failures from 1 to d-1
        for i in range(d-1):
            po = ((s**(d-falhas)) * (f**falhas)) * (factorial(d)/(factorial(falhas)*factorial(d-falhas)))
            po_soma += po
            falhas += 1
    else:
        p_max = 0
        falhas = d-ns
        po_soma = ((s**(d-falhas)) * (f**falhas)) * (factorial(d)/(factorial(falhas)*factorial(d-falhas)))
    return f'{(p_max + po_soma)*100:.2f}%'
print(dado(1,6,6,1))
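For reference, the printed value can be cross-checked against the complement rule:

# P(at least one given face in 6d6) = 1 - (5/6)**6
print(f'{(1 - (5/6)**6) * 100:.2f}%')  # 66.51%, matching dado(1,6,6,1)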
Raibek solution:
from scipy.special import comb

def calc_dados(f_sucessos=1, faces=6, n_dados=1):
    assert f_sucessos <= faces
    outcomes_total = faces ** n_dados
    outcomes_success = 0
    f_fail = faces - f_sucessos
    for i in range(1, n_dados + 1):
        one_permutation = (f_sucessos ** i) * (f_fail ** (n_dados - i))
        n_permutations = comb(n_dados, i, exact=True)  # exact=True keeps integer precision for large n_dados
        outcomes_success += one_permutation * n_permutations
    p = outcomes_success / outcomes_total
    return f'{p*100:.2f}%'
I'm having difficulty with the problem below and am not sure what I'm doing wrong. My goal is to figure out how many periods I need to compound interest on a deposit to reach a target amount, using loops, in a function that takes three arguments. I've included what I have below, but I can't seem to get the number of periods.
Example:
period(1000, .05, 2000) - answer 15
where d is the initial deposit, r is the interest rate and t is the target amount.
new_deposit = 0

def periods(d, r, t):
    while d*(1+r) <= t:
        new_deposit = d*(1+r) - d
        print(new_deposit)
    return periods
I'm very new to this, so I'm not sure where I'm going wrong.
You were close, but your loop never updates d (so it would run forever), and your return statement returns the function itself, as you never set a periods variable.
def periods(d, r, t):
    count_periods = 1
    current_amount = d
    while current_amount*(1+r) <= t:
        current_amount = current_amount*(1+r)
        count_periods += 1
        print(current_amount)
    return count_periods

print(periods(100, 0.01, 105))
I renamed the returned variable so as not to overlap with the function name itself.
EDIT: Sorry, your logic was flawed all the way through the code, so I rewrote it:
def periods(d, r, t):
    p = 0
    while d < t:
        d *= (1 + r)
        p += 1
    return p

periods(100, .01, 105)  # 5
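As a cross-check (my addition, assuming r > 0 and t > d), the same count follows in closed form from logarithms:

import math

def periods_closed_form(d, r, t):
    # smallest integer p with d*(1+r)**p >= t
    return math.ceil(math.log(t / d) / math.log(1 + r))

print(periods_closed_form(1000, .05, 2000))  # 15
print(periods_closed_form(100, .01, 105))    # 5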
I'm working on a dynamic programming problem; actually, I'm not quite sure whether it is dynamic programming, since the moving average M is based on the previous M. No need to consider efficiency. The problem requires selling a product over T time periods so as to maximize the total actual sale amount. The total number of products is N, and I plan to sell n_0, n_1, ..., n_{T-1} products over the different periods, with sum(n_i) = N.
In conclusion, the question asks for the optimal schedule n_0, n_1, ..., n_{T-1} such that sum(n_i) = N, which maximizes sum(S_i).
The actual sale amounts S_i are based on the current moving average M and the current n_i.
Assume that α = 0.001 and π = 0.5.
Initialize M = 0. Then for i = 0, 1, ..., T-1:
Compute the new moving average M = ⌈0.5*(M + n_i)⌉ (using the previous M)
At time i we sell S_i = ⌈(1 - α*M^π)*n_i⌉ products
Continue this process until the last time period. For example, assuming we already know n_i for all periods, the trading goes as below:
import math
import numpy as np

M = 0
T = 4
N = 10000
alpha = 1e-3
pi = 0.5
S = np.zeros(T, dtype='i')
n = np.array([5000, 1000, 2000, 2000])
print(n)
total = 0
for i in range(T):
    M = math.ceil(0.5*(M + n[i]))
    S[i] = math.ceil((1 - alpha*M**pi)*n[i])
    total += S[i]
    print('at time %d, M = %d and we sell %d products' % (i, M, S[i]))
print('total sold =', total)
My idea is to use the time period t, the products left n, and the moving average m as the state index, and to store the best achievable sales in a high-dimensional matrix. I think the moving average is bounded by [0, N]. I'm still confused about how to program it. Could someone provide ideas about how to fix the problems in my code? Thank you very much.
Below is some of my crude code, but the output is a little strange.
import math
import numpy as np

def DPtry(N, T, alpha, pi, S):
    # base case: in the last period, sell everything that is left
    for n in range(0, N+1):
        for m in range(0, N+1):  # the moving average can exceed n, so m must run up to N
            S[T-1, n, m] = math.ceil((1 - alpha*m**pi)*n)
    # fill the table backwards in time
    for k in range(1, T):
        t = T - k - 1
        print("t = ", t)
        for n in range(0, N+1):
            for m in range(0, N+1):
                best = -1
                for plan in range(0, n+1):
                    salenow = math.ceil((1 - alpha*m**pi)*plan)
                    M = math.ceil(0.5*(m + plan))
                    salelater = S[t+1, n-plan, M]
                    candidate = salenow + salelater
                    if candidate > best:
                        best = candidate
                S[t, n, m] = best
    print(S[0, N, 0])
N = 100
T = 5
pi = .5
alpha = 1e-3
S = np.zeros((T,N+1,N+1))
DPtry(N,T,alpha,pi,S)
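Since the table only stores the best totals, the schedule itself still has to be recovered. One way (a sketch I've added, reusing the same state definition) is to walk forward in time and repeat the argmax at each step:

import math

def recover_schedule(S, N, T, alpha, pi):
    # state: at time t there are n products left and the moving average is m
    schedule = []
    n, m = N, 0
    for t in range(T - 1):
        best, best_plan = -1, 0
        for plan in range(n + 1):
            salenow = math.ceil((1 - alpha * m**pi) * plan)
            M = math.ceil(0.5 * (m + plan))
            if salenow + S[t + 1, n - plan, M] > best:
                best = salenow + S[t + 1, n - plan, M]
                best_plan = plan
        schedule.append(best_plan)
        m = math.ceil(0.5 * (m + best_plan))
        n -= best_plan
    schedule.append(n)  # whatever is left sells in the final period
    return schedule

print(recover_schedule(S, N, T, alpha, pi))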
I was trying to calculate the expected value of the longest consecutive heads streak in 200 coin flips, using Python. I came up with code which I think does the job, but it's just not efficient because of the amount of calculation and data storage it requires. I was wondering if someone could help me make it faster and more efficient (I took only one course of Python programming last semester, without any previous knowledge of the subject).
My code was:
import numpy as np
from itertools import permutations

counter = 0
sett = 0
rle = []
matrix = np.zeros(200)
for i in range(0, 200):
    matrix[i] = 1
for j in permutations(matrix):
    for k in j:
        if k == 1:
            counter += 1
        else:
            if counter > sett:
                sett = counter  # was 'sett == counter': a comparison, not an assignment
            counter = 0         # same problem: was 'counter == 0'
    rle.append(sett)
After finding rle, I'd iterate over it to count how many streaks of which length there are, and their sum divided by 2^200 would give me the expected value I'm looking for.
Thanks in advance for help, much appreciated!
You don't have to try all the permutations (in fact, you cannot), but you can do a simple Monte Carlo style simulation: repeat the 200 coin flips many times, then average the lengths of the longest streaks you get. That will be a good approximation of the expected value.
import numpy

def oneTrial(noOfCoinFlips):
    # one simulated run: flip a fair coin noOfCoinFlips times and
    # return the length of the longest run of heads (1s)
    s = numpy.random.binomial(1, 0.5, noOfCoinFlips)
    maxCount = 0
    count = 0
    for x in s:
        if x == 1:
            count += 1
        if x == 0:
            count = 0
        maxCount = max(maxCount, count)
    return maxCount

numpy.mean([oneTrial(200) for x in range(10000)])
Output: 6.9843
Also see this thread for exact computation without using Python simulation.
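For completeness, here is a sketch of one standard exact approach (my addition, not necessarily what the linked thread does): for each k, count the flip sequences whose longest heads run is shorter than k, then use E[L] = sum over k >= 1 of P(L >= k).

def count_no_run(n, k):
    # number of length-n head/tail sequences in which every heads run is shorter than k;
    # recurrence: condition on the length j of the leading heads run (0 <= j < k)
    if k == 0:
        return 0
    f = [0] * (n + 1)
    for i in range(n + 1):
        f[i] = 2**i if i < k else sum(f[i - j] for j in range(1, k + 1))
    return f[n]

n = 200
expected = sum(1 - count_no_run(n, k) / 2**n for k in range(1, n + 1))
print(expected)  # ~6.98, in line with the simulation above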
This is an answer to a slightly different question. But, as I had invested an hour and a half of my time into it, I didn't want to scrap it.
Let E(k) denote a k-head streak from the start, i.e., you get exactly k consecutive heads from the first toss onwards.
E(0): T { another 199 tosses that we do not care about }
E(1): H T { another 198 tosses... }
.
.
E(198): { 198 heads } T { 1 more toss we do not care about }
E(199): { 199 heads } T
E(200): { 200 heads }
Note that P(0) = 0.5, which is P(tails on the first toss),
whereas P(1) = 0.25, i.e., P(heads on the first toss and tails on the second):
P(0) = 2**-1
P(1) = 2**-2
.
.
.
P(198) = 2**-199
P(199) = 2**-200
P(200) = 2**-200 #same as P(199)
Which means that if you run the 200-toss experiment 2**200 times, you'd expect to get
E(0) 2**199 times
E(1) 2**198 times
.
.
E(198) 2**1 times
E(199) 2**0 times and
E(200) 2**0 times.
Thus, the expected value reduces to
(0*(2**199) + 1*(2**198) + 2*(2**197) + ... + 198*(2**1) + 199*(2**0) + 200*(2**0))/2**200
This number is virtually equal to 1.
Expected_value = 1 - 2**-200
Here is how I got the difference:
>>> diff = 2**200 - sum([ k*(2**(199-k)) for k in range(200)], 200*(2**0))
>>> diff
1
This can be generalized to n tosses as
f(n) = 1 - 2**(-n)
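A brute-force check of this closed form for a small n (my own verification snippet):

from itertools import product

def leading_heads(seq):
    # length of the heads streak starting at the very first toss
    count = 0
    for x in seq:
        if x != 'H':
            break
        count += 1
    return count

n = 10
total = sum(leading_heads(seq) for seq in product('HT', repeat=n))
print(total / 2**n, 1 - 2**-n)  # both print 0.9990234375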
I am trying to find the standard deviation of a sequence of numbers extracted from the combinations of 30 dice that sum up to 120. I am very new to Python, and this code makes the console freeze because the number of combinations is enormous; I am not sure how to fit it all into a smaller, more efficient function. What I did is:
found all possible combinations of 30 dice;
filtered the combinations that sum up to 120;
multiplied all items within each combination in the result list;
tried extracting the standard deviation.
Here is the code:
import itertools
import numpy

dice = [1, 2, 3, 4, 5, 6]
subset = itertools.product(dice, repeat=30)
result = []
for x in subset:
    if sum(x) == 120:
        result.append(x)
my_result = numpy.product(result, axis=1).tolist()
std = numpy.std(my_result)
print(std)
Note that D(X) = E(X^2) - E(X)^2 (D denoting the variance), so you can solve this problem analytically with the following recurrences, where h[i][N] counts the combinations of i dice summing to N, f[i][N] sums the product of the faces over those combinations, and g[i][N] sums the squared products:
f[i][N] = sum(k * f[i-1][N-k])    (1 <= k <= 6)
g[i][N] = sum(k^2 * g[i-1][N-k])  (1 <= k <= 6)
h[i][N] = sum(h[i-1][N-k])        (1 <= k <= 6)
f[1][k] = k      (1 <= k <= 6)
g[1][k] = k^2    (1 <= k <= 6)
h[1][k] = 1      (1 <= k <= 6)
Sample implementation:
import numpy as np
Nmax = 120
nmax = 30
min_value = 1
max_value = 6
f = np.zeros((nmax+1, Nmax+1), dtype ='object')
g = np.zeros((nmax+1, Nmax+1), dtype ='object') # the intermediate results will be really huge, to keep them accurate we have to utilize python big-int
h = np.zeros((nmax+1, Nmax+1), dtype ='object')
for i in range(min_value, max_value+1):
f[1][i] = i
g[1][i] = i**2
h[1][i] = 1
for i in range(2, nmax+1):
for N in range(1, Nmax+1):
f[i][N] = 0
g[i][N] = 0
h[i][N] = 0
for k in range(min_value, max_value+1):
f[i][N] += k*f[i-1][N-k]
g[i][N] += (k**2)*g[i-1][N-k]
h[i][N] += h[i-1][N-k]
result = np.sqrt(float(g[nmax][Nmax]) / h[nmax][Nmax] - (float(f[nmax][Nmax]) / h[nmax][Nmax]) ** 2)
# result = 32128174994365296.0
You ask for a result over an unfiltered set of 6**30 ≈ 2.2*10**23 combinations, which is impossible to handle as such.
There are two possibilities, which can also be combined:
Put more thought into pre-treating the problem, e.g. work out how to sample only those combinations with sum 120.
Do a Monte Carlo simulation instead, i.e. don't enumerate all combinations, but only draw a random couple of thousand to obtain a representative sample that determines the std sufficiently accurately.
For now, I only apply (2), giving the brute-force code:
import random

N = 30      # number of dice
M = 100000  # number of samples
S = 120     # required sum

result = [[random.randint(1, 6) for _ in range(N)] for _ in range(M)]
result = [s for s in result if sum(s) == S]
Now, that result should be comparable to your result before using numpy.product ... that part I couldn't follow, though...
OK, if you are after the standard deviation of the product of the 30 dice, that is what your code does. Then I need 1,000,000 samples to get roughly reproducible values for the std (1 digit) - that takes my PC about 20 seconds, still considerably less than 1 million years :-D.
Is a number like 3.22e16 what you are looking for?
Edit after comments:
Well, sampling the frequencies of the numbers instead gives only 6 independent variables - actually even just 4, by substituting in the constraints (sum = 120, total number of dice = 30). My current code looks like this:
import itertools
import numpy

def p2(b, s):
    # product of the dice faces for a roll with b twos and s = (threes, fours, fives, sixes)
    return 2**b * 3**s[0] * 4**s[1] * 5**s[2] * 6**s[3]

hits = range(31)
subset = itertools.product(hits, repeat=4)  # only the frequencies of faces 3, 4, 5, 6
product = []
permutations = []
for s in subset:
    b = 90 - (2*s[0] + 3*s[1] + 4*s[2] + 5*s[3])  # frequency of face 2
    a = 30 - (b + sum(s))                          # frequency of face 1
    if 0 <= b <= 30 and 0 <= a <= 30:
        product.append(p2(b, s))
        permutations.append(1)  # TODO: replace 1 with the number of possible permutations
print(numpy.std(product))  # TODO: calculate std manually, considering the permutation weights
This computes in about 1 second, but the confusing part is that the result I get is 1.28737023733e+17. Either my previous approaches or this one has a bug - or both.
Sorry - it's not that easy: the samples do not all have the same probability, and that is the problem here. Each frequency profile stands for a different number of possible dice orderings, giving it a weight that has to be considered before taking the standard deviation. I have drafted that in the code above.
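For what it's worth, here is one way the two TODOs above could be filled in: weight each frequency profile by its multinomial coefficient, i.e. the number of distinct dice orderings it stands for (a sketch along the lines drafted above; it should agree with the exact recurrence result, around 3.2e16):

import itertools
from math import factorial

def multinomial(counts):
    # number of distinct orderings of the 30 dice with the given face counts
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

products, weights = [], []
for s in itertools.product(range(31), repeat=4):   # frequencies of faces 3, 4, 5, 6
    b = 90 - (2*s[0] + 3*s[1] + 4*s[2] + 5*s[3])   # frequency of face 2
    a = 30 - (b + sum(s))                          # frequency of face 1
    if 0 <= b <= 30 and 0 <= a <= 30:
        products.append(2**b * 3**s[0] * 4**s[1] * 5**s[2] * 6**s[3])
        weights.append(multinomial((a, b) + s))

total = sum(weights)
mean = sum(w * p for w, p in zip(weights, products)) / total
var = sum(w * (p - mean)**2 for w, p in zip(weights, products)) / total
print(var ** 0.5)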
I'm doing an exercise that asks for a function that approximates the value of pi using Leibniz' formula. These are the explanations on Wikipedia:
1 - 1/3 + 1/5 - 1/7 + 1/9 - ... = pi/4
pi/4 = Σ_{n=0}^{∞} (-1)**n / (2*n + 1)
Logical thinking comes to me easily, but I wasn't given much of a formal education in maths, so I'm a bit lost as to what the leftmost symbols in the second one represent. I tried to make the code pi = ((-1)**n / (2*n + 1)) * 4, but that returned 1.9999990000005e-06 instead of 3.14159..., so I used an accumulator pattern instead (the chapter of the guide this exercise is from mentions them as well) and it worked fine. However, I can't help thinking that it's somewhat contrived and there's probably a better way to do it, given Python's focus on simplicity and making programs as short as possible. This is the full code:
def myPi(n):
    denominator = 1
    addto = 1
    for i in range(n):
        denominator = denominator + 2
        addto = addto - (1/denominator)
        denominator = denominator + 2
        addto = addto + (1/denominator)
    pi = addto * 4
    return pi

print(myPi(1000000))
Does anyone know a better function?
The Leibniz formula translates directly into Python with no muss or fuss:
>>> steps = 1000000
>>> sum((-1.0)**n / (2.0*n+1.0) for n in reversed(range(steps))) * 4
3.1415916535897934
The capital sigma here is sigma notation. It is notation used to represent a summation in concise form.
So your sum is actually an infinite sum. The first term, for n=0, is:
(-1)**0/(2*0+1)
This is added to
(-1)**1/(2*1+1)
and then to
(-1)**2/(2*2+1)
and so on forever. The summation is what is known mathematically as a convergent series.
In Python you would write it like this:
def estimate_pi(terms):
    result = 0.0
    for n in range(terms):
        result += (-1.0)**n/(2.0*n+1.0)
    return 4*result
If you want to optimise a little, you can avoid the exponentiation:
def estimate_pi(terms):
    result = 0.0
    sign = 1.0
    for n in range(terms):
        result += sign/(2.0*n+1.0)
        sign = -sign
    return 4*result
>>> estimate_pi(100)
3.1315929035585537
>>> estimate_pi(1000)
3.140592653839794
Using pure Python you can do something like:
def term(n):
    return ((-1.)**n / (2.*n + 1.)) * 4.

def pi(nterms):
    return sum(map(term, range(nterms)))
and then calculate pi to the precision you need by choosing the number of terms:
pi(100)
# 3.13159290356
pi(1000)
# 3.14059265384
The following version uses Ramanujan's formula, as outlined in this SO post - it uses a relation between pi and the "monster group", as discussed in this article.
import math

def Pi(x):
    Pi = 0
    Add = 0
    for i in range(x):
        Add = (math.factorial(4*i) * (1103 + 26390*i)) / (((math.factorial(i))**4) * (396**(4*i)))
        Pi = Pi + ((math.sqrt(8) / 9801) * Add)
    Pi = 1/Pi
    print(Pi)

Pi(100)
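As a usage note (my observation, not from the original post): each term of this series contributes roughly eight more correct digits, so a float result stops improving after just a few iterations:

Pi(3)  # already prints ~3.141592653589793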
This was my approach:
def estPi(terms):
    outPut = 0.0
    for i in range(1, (2 * terms), 4):
        outPut = outPut + (1/i) - (1/(i+2))
    return 4 * outPut
I take in the number of terms the user wants, then in the for loop I double it to account for only using odd denominators.
at 100 terms I get 3.1315929035585537
at 1000 terms I get 3.140592653839794
at 10000 terms I get 3.1414926535900345
at 100000 terms I get 3.1415826535897198
at 1000000 terms I get 3.1415916535897743
at 10000000 terms I get 3.1415925535897915
at 100000000 terms I get 3.141592643589326
at 1000000000 terms I get 3.1415926525880504
Actual Pi is 3.1415926535897932
Got to love a convergent series.
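The table also shows the series' known error behaviour: after n terms the estimate is off by roughly 1/n, e.g. about 1e-6 at one million terms above. A quick way to confirm (my addition):

n = 10**6
approx = 4 * sum((-1.0)**k / (2*k + 1) for k in range(n))
print(abs(approx - 3.141592653589793))  # ~1e-06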
def myPi(iters):
    pi = 0
    sign = 1
    denominator = 1
    for i in range(iters):
        pi = pi + (sign/denominator)
        # alternating between negative and positive
        sign = sign * -1
        denominator = denominator + 2
    pi = pi * 4.0
    return pi

pi_approx = myPi(10000)
print(pi_approx)
An old thread, but I wanted to play around with this, and coincidentally I came up with pretty much the same as user3220980:
# Gregory-Leibniz
# pi accurate to 8 dp in around 80 sec
# pi to 5 dp in .06 seconds

import time

start_time = time.time()
pi = 4  # start at 4
times = 100000000
for i in range(3, times, 4):
    pi -= (4/i) - (4/(i + 2))  # take away one negative term, add back the next positive one
print(pi)
print("{} seconds".format(time.time() - start_time))