Parallel loop in python - python

Hi there. I'm trying to run a big for loop of 239500 iterations. In my tests, 200 iterations take about 1 hour, which works out to roughly 2 months of CPU time.
This is the loop:
for i in range(0, MonteCarlo):
    print('Performing Monte Carlo ' + str(i) + '/' + str(MonteCarlo))
    MCR = scramble(YearPos)
    NewPos = reduce(operator.add, YearPos)
    C = np.cov(VAR[NewPos, :], rowvar=0)
    s, eof = eigs(C, k=neof, which='LR')
    sc = (s.real / np.sum(s) * 100)**2
    tcs = np.sum(sc)
    MCH = sc/tcs
    Hits[(MCH >= pcvar)] += 1
    if (Hits >= CL).all():
        print("Number of Hits is greater than 5 !!!")
        break
Here np stands for numpy and scramble stands for random.shuffle. The calculations performed within the for loop do not depend on each other.
Is there any way to run the loop in parallel? I have 12 cores and only 1 is being used. In Matlab I would use parfor; is there anything similar in Python?
Thanks in advance
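Since the iterations are independent, one common standard-library way to get a parfor-like loop is multiprocessing.Pool. The sketch below only illustrates the pattern under stated assumptions: run_trial and its toy statistic are hypothetical stand-ins for the loop body above (shuffle, covariance, eigs), and N_MODES and the seeds are made up. Each trial returns a 0/1 vector that is added into the hit counts, mirroring Hits[(MCH >= pcvar)] += 1.

```python
import random
from multiprocessing import Pool

N_MODES = 3  # hypothetical: number of entries in the Hits vector


def run_trial(seed):
    """One independent Monte Carlo iteration (stand-in for the real loop body)."""
    rng = random.Random(seed)   # per-trial generator, so trials share no state
    positions = list(range(50))
    rng.shuffle(positions)      # plays the role of scramble(YearPos)
    # Toy statistic: 1 if the k-th shuffled position landed in the lower half.
    return [1 if positions[k] < 25 else 0 for k in range(N_MODES)]


def count_hits(n_trials, processes=None):
    """Run the trials on a pool of worker processes and sum the 0/1 vectors."""
    hits = [0] * N_MODES
    with Pool(processes=processes) as pool:
        # Trials are handed out in chunks; result order does not matter for a sum.
        for result in pool.imap_unordered(run_trial, range(n_trials), chunksize=50):
            hits = [h + r for h, r in zip(hits, result)]
    return hits
```

Call it as count_hits(2000, processes=12) from under an if __name__ == "__main__": guard, which multiprocessing needs on platforms that start workers by spawning a new interpreter. The early break on (Hits >= CL).all() has no direct equivalent here; one option is to submit the trials in batches and check the counts between batches.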

Probabilities and rpg dice goes wrong for high values

I wrote this short code to calculate the chance of a success when rolling dice, and it worked very well, but not for big numbers. See the code; I'll explain below.
def calc_dados(f_sucessos = 1, faces = 6, n_dados = 1):
    p_max = ((f_sucessos/faces)**n_dados) # chance that every die succeeds
    fator = 1
    p_meio = 0
    for i in range(n_dados-1):
        p_meio += (((f_sucessos/faces)**(n_dados-fator) * ((faces-f_sucessos)/faces)**(n_dados-(n_dados-fator))) * n_dados)
        fator += 1
    p = p_max + p_meio
    return p*100
So, OK, it works. Why not see how my chances improve as I add dice? The more dice, the better the chance. So I made this tiny table with pandas:
import pandas as pd
f_sucessos = 1 # how many faces are success
faces = 2 # faces of the dice
n_dados = 10 # number of dice rolled
suc_list = []
for i in range(0,n_dados): suc_list.append(f_sucessos)
fac_list = []
for i in range(0,n_dados): fac_list.append(faces)
cha_list = []
for i in range(0,n_dados): cha_list.append(calc_dados(f_sucessos, faces, i+1))
df = pd.DataFrame(
    {
        "n_dados" : range(1,n_dados+1),
        "faces" : fac_list,
        "sucessos" : suc_list,
        "chance" : cha_list
    }
)
df
The results were very strange, so I wrote a coin probability table and tested it by treating the coin as a 2-faced die. The correct table is this:
[image: table of right brute force tested results]
But if you use my code to create this table, the result is this:
[image: table of the results of my code]
Please, can anybody help me understand why at a certain point the probabilities fall when they should get higher? For example: the chance of at least 1 'head' in 4 coins should be 93.75%, but my code says it is 81.25%.
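A quick way to check the 93.75% figure is the complement rule: "at least one success" is the opposite of "every die fails". The function below is only a sanity check for the ns = 1 case, and its name chance_at_least_one is made up for this sketch.

```python
def chance_at_least_one(f_sucessos=1, faces=6, n_dados=1):
    """Percent chance that at least one of n_dados dice shows a success face."""
    p_fail_one = (faces - f_sucessos) / faces   # a single die fails
    return (1 - p_fail_one ** n_dados) * 100    # complement of "every die fails"


for n in range(1, 6):
    print(n, chance_at_least_one(1, 2, n))
```

For 4 coins this gives 93.75, and the value can only grow as dice are added, so any table where it drops points to a counting error.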
To be honest, I don't understand exactly how 'calc_dados' calculates the probability of a success when rolling dice.
So instead I implemented a perhaps more naive approach:
First, we calculate the total number of possible outcomes: outcomes_total = faces ** n_dados
Second, we count the successful outcomes: outcomes_success
Finally: p = outcomes_success / outcomes_total
I'm going to add a mathematical proof behind my version of the function a bit later :)
from math import comb
def calc_dados(f_sucessos=1, faces=6, n_dados=1):
    assert f_sucessos <= faces
    outcomes_total = faces ** n_dados
    outcomes_success = 0
    f_fail = faces - f_sucessos
    for i in range(1, n_dados + 1):
        one_permutation = (f_sucessos ** i) * (f_fail ** (n_dados - i))
        n_permutations = comb(n_dados, i)
        outcomes_success += one_permutation * n_permutations
    p = outcomes_success / outcomes_total
    return p * 100
These are some testing results
Now my code, based on the images I posted, sums all the exact chances to find the chance of at least 1 success.
Below the code I comment on the changes.
from decimal import Decimal
def dado(fs=1,ft=6,d=1,ns=1,exato=False):
    '''
    fs = faces success
    ft = faces totals
    d = n of dice rolled
    ns - n of expected success
    exato = True: chance of exact ns events, False: chance of at least ns events
    '''
    s = Decimal(str(fs/ft))
    f = Decimal(str((ft-fs)/ft))
    d_int = d
    d = Decimal(str(d))
    ns = Decimal(str(ns))
    p_max = Decimal(str(s))**Decimal(str(d))
    fator = 1
    po_soma = 0
    for i in range(d_int-1):
        po = (Decimal(str(s))**(Decimal(str(d))-fator) * Decimal(str(f))**(Decimal(str(d))-(Decimal(str(d))-fator)))*Decimal(str(d))
        po_soma += po
        if exato == True:
            p_max = 0
            break
        fator += 1
    return f'{(p_max + po_soma)*100:.2f}%'
dado(1,2,5,1)
First - not a change: it still doesn't work well.
Second - I'm now using the 'fs' variable for the number of faces that count as a success and the 'ns' variable for how many successes we want to find, so fs = 1 and ns = 2 in 3d6 means 'the chance of getting at least 2 of 1 specific face when rolling 3 dice'.
Third - I'm using Decimal because I realized that multiplying fractions could generate very small numbers and the precision could be affected by this (but it doesn't solve the initial problem, so Decimal may be kicked out soon).
Fourth - exato (exact) is now a variable that breaks the loop and gives us either the 'exact value' or the 'at least ns' value. So 'exato=True' means, in the last example, 'the chance of getting exactly 2 of 1 specific face when rolling 3 dice', a much smaller number.
That's it. My thanks to @Raibek, who is trying to solve this problem using combinations. I'll study that approach too, but if you have an idea, please let me know.
Hello people, it's finally solved!
First I would like to thank Raibek, who solved it using combinations. I didn't realize it was already solved when he posted it, and below I'll explain how and why.
If you are not following the history of this code, you just need to know that it is used to calculate the probability of getting at least ns successes when rolling d dice. Solution codes are at the end of this answer.
I found out how to solve the problem by talking to a friend, Eber, who pointed me to an alternative way to check the data, anydice.com. I quickly realized that my visual check, assembling tables in Excel/Calc, was wrong. But why?
Well, my friend read the table of large numbers for 7d6, where the error was already very evident, and showed me that although the calculation worked at the beginning, my table did not have all the possible combinations. The more possibilities there were, the more my calculations failed, with the odds getting smaller as more dice were added to the roll.
These are the combinations I was considering, in this example for the 7d6 case.
In the first code the calculation was:
successes**(d-factor) * failures**factor * d
The mistake is in assuming that the number of possible combinations is equal to d (which happens to hold for the small cases I had tested, up to 3 dice, because factorial(1) = 1 and factorial(2) = 2).
Now notice that, in the 7d6 example, the 'exactly 3' block is missing some possible combinations, shown in yellow:
The correct calculation for this term of the equation is:
factorial(d) / (factorial(failures) * factorial(successes))
With this term we can find the chance of exactly n success faces being rolled. Then if we want, for example, the chance of getting the number 1 at least once in 3d6, we just need to add the chances of getting it exactly 1 time, 2 times and 3 times, which the code already did well.
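The factorial term can be checked without any spreadsheet by enumerating every possible roll and comparing the count with the formula. This is only a verification sketch: the names brute_force and binomial are made up here, faces 1..fs are arbitrarily treated as the success faces, and math.comb(d, k) equals factorial(d) / (factorial(k) * factorial(d - k)).

```python
from fractions import Fraction
from itertools import product
from math import comb


def brute_force(fs, ft, d, ns):
    """Enumerate all ft**d rolls; faces 1..fs count as successes."""
    good = sum(
        1
        for roll in product(range(1, ft + 1), repeat=d)
        if sum(face <= fs for face in roll) >= ns
    )
    return Fraction(good, ft ** d)


def binomial(fs, ft, d, ns):
    """Sum of P(exactly k successes) for k = ns..d, using the comb(d, k) term."""
    s = Fraction(fs, ft)
    f = 1 - s
    return sum(comb(d, k) * s ** k * f ** (d - k) for k in range(ns, d + 1))


print(brute_force(1, 6, 3, 1), binomial(1, 6, 3, 1))
```

Fractions keep the comparison exact, so the two functions can be compared with == for any small number of dice.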
Finally, let's get to the code:
Daniel-Eber solution:
def dado(fs=1,ft=6,d=1,ns=1,exato=False):
    '''
    fs = number of success faces
    ft = total number of faces
    d = number of dice
    ns - number of expected successes, interpreted according to exato
    exato = True: chance of exactly ns successes, False: chance of at least ns successes
    '''
    from math import factorial
    s = fs/ft
    f = (ft-fs)/ft
    p_max = s**d  # every die is a success
    po_soma = 0
    if exato == False:
        # add P(exactly d-falhas successes) for every success count from ns up to d-1
        for falhas in range(1, d-ns+1):
            po = ( (s**(d-falhas)) * (f**(falhas))) * (factorial(d)/(factorial(falhas)*factorial((d-falhas))))
            po_soma += po
    else:
        p_max = 0
        falhas = d-ns
        po_soma = ( (s**(d-falhas)) * (f**(falhas))) * (factorial(d)/(factorial(falhas)*factorial((d-falhas))))
    return f'{(p_max + po_soma)*100:.2f}%'
print(dado(1,6,6,1))
Raibek solution:
from scipy.special import comb
def calc_dados(f_sucessos=1, faces=6, n_dados=1):
    assert f_sucessos <= faces
    outcomes_total = faces ** n_dados
    outcomes_success = 0
    f_fail = faces - f_sucessos
    for i in range(1, n_dados + 1):
        one_permutation = (f_sucessos ** i) * (f_fail ** (n_dados - i))
        n_permutations = comb(n_dados, i)
        outcomes_success += one_permutation * n_permutations
    p = outcomes_success / outcomes_total
    return f'{(p)*100:.2f}%'

Longest expected head streak in 200 coinflips

I was trying to calculate the expected value of the longest consecutive heads streak in 200 coin flips, using Python. I came up with code which I think does the job right, but it's not efficient because of the amount of calculation and data storage it requires. I was wondering if someone could help me make it faster and more efficient (I took only one Python programming course last semester, without any previous knowledge of the subject).
My code was
import numpy as np
from itertools import permutations
counter = 0
sett = 0
rle = []
matrix = np.zeros(200)
for i in range (0,200):
    matrix[i] = 1
for j in permutations(matrix):
    for k in j:
        if k == 1:
            counter += 1
        else:
            if counter > sett:
                sett = counter  # assignment (the original had ==, a comparison)
            counter = 0
    rle.append(sett)
After finding rle, I'd iterate over it to get how many streaks of which length there are, and their sum divided by 2^200 would give me the expected value I'm looking for.
Thanks in advance for help, much appreciated!
You don't have to try all the permutations (in fact you cannot), but you can do a simple Monte Carlo style simulation. Repeat the 200 coin flips many times. Average the lengths of longest streaks you get and this will be a good approximation of the expected value.
import numpy
def oneTrial (noOfCoinFlips):
    s = numpy.random.binomial(1, 0.5, noOfCoinFlips)
    maxCount = 0
    count = 0
    for x in s:
        if x == 1:
            count += 1
        if x == 0:
            count = 0
        maxCount = max(maxCount, count)
    return maxCount
numpy.mean([oneTrial(200) for x in range(10000)])
Output: 6.9843
Also see this thread for exact computation without using Python simulation.
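For an exact value rather than an estimate, one possible approach (a sketch of my own, not necessarily what the linked thread does) is to count, for each m, the sequences whose longest heads run is at most m, and then use E[L] = sum over m of P(L > m). The function names are made up for this example. The count follows from conditioning on where the first tail falls: a valid sequence of length k > m starts with j - 1 heads and a tail for some j from 1 to m + 1, followed by any valid sequence of length k - j.

```python
from fractions import Fraction


def count_max_run_at_most(n, m):
    """Number of length-n H/T sequences whose longest run of heads is <= m."""
    a = [2 ** k for k in range(min(n, m) + 1)]   # short sequences are all valid
    for k in range(m + 1, n + 1):
        a.append(sum(a[k - m - 1:k]))            # first tail at position 1..m+1
    return a[n]


def expected_longest_run(n):
    """Exact expected length of the longest heads run in n fair flips."""
    total = 2 ** n
    # P(L > m) = 1 - (sequences with longest run <= m) / 2**n, summed over m
    return sum(
        Fraction(total - count_max_run_at_most(n, m), total) for m in range(n)
    )


print(float(expected_longest_run(200)))
```

For 3 flips this gives 11/8, which can be confirmed by listing the 8 sequences by hand.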
This is an answer to a slightly different question. But, as I had invested an hour and a half of my time into it, I didn't want to scrap it.
Let E(k) denote a k head streak, i.e., you get k consecutive heads from the first toss onwards.
E(0): T { another 199 tosses that we do not care about }
E(1): H T { another 198 tosses... }
.
.
E(198): { 198 heads } T { 1 more toss that we do not care about }
E(199): { 199 heads } T
E(200): { 200 heads }
Note that P(0) = 0.5, which is P(tails in first toss)
whereas P(1) = 0.25 , i.e., P(heads in first toss and tails in the second)
P(0) = 2**-1
P(1) = 2**-2
.
.
.
P(198) = 2**-199
P(199) = 2**-200
P(200) = 2**-200 #same as P(199)
Which means that if you repeat the 200-toss experiment 2**200 times, you'd expect to get
E(0) 2**199 times
E(1) 2**198 times
.
.
E(198) 2**1 times
E(199) 2**0 times and
E(200) 2**0 times.
Thus, the expected value reduces to
(0*(2**199) + 1*(2**198) + 2*(2**197) + ... + 198*(2**1) + 199*(2**0) + 200*(2**0))/2**200
This number is virtually equal to 1.
Expected_value = 1 - 2**-200
How I got the difference.
>>> diff = 2**200 - sum([ k*(2**(199-k)) for k in range(200)], 200*(2**0))
>>> diff
1
This can be generalized to n tosses as
f(n) = 1 - 2**(-n)
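The general formula can be confirmed for small n by brute force: list all 2**n sequences, count the heads before the first tail in each, and average. The helper name expected_leading_heads is made up for this check.

```python
from fractions import Fraction
from itertools import product


def expected_leading_heads(n):
    """Average number of heads before the first tail over all 2**n sequences."""
    total = 0
    for seq in product("HT", repeat=n):
        streak = 0
        for toss in seq:
            if toss != "H":
                break
            streak += 1
        total += streak
    return Fraction(total, 2 ** n)


for n in range(1, 11):
    assert expected_leading_heads(n) == 1 - Fraction(1, 2 ** n)
print(expected_leading_heads(10))
```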

Why count() method is faster than a for loop python

Here are 2 functions that do exactly the same thing, but does anyone know why the one using the count() method is much faster than the other? (I mean how does it work? How is it built?)
If possible, I'd like a more understandable answer than what's found here : Algorithm used to implement the Python str.count function
or what's in the source code : https://hg.python.org/cpython/file/tip/Objects/stringlib/fastsearch.h
def scoring1(seq):
    score = 0
    for i in range(len(seq)):
        if seq[i] == '0':
            score += 1
    return score
def scoring2(seq):
    score = 0
    score = seq.count('0')
    return score
seq = 'AATTGGCCGGGGAG0CTTC0CTCC000TTTCCCCGGAAA'
# takes 1min15 when applied to 100 sequences larger than 100 000 characters
score1 = scoring1(seq)
# takes 10 sec when applied to 100 sequences larger than 100 000 characters
score2 = scoring2(seq)
Thanks a lot for your reply
@CodeMonkey has already given the answer, but it is potentially interesting to note that your first function can be improved so that it runs about 20% faster:
import time, random
def scoring1(seq):
    score=0
    for i in range(len(seq)):
        if seq[i]=='0':
            score+=1
    return score
def scoring2(seq):
    score=0
    for x in seq:
        score += (x =='0')
    return score
def scoring3(seq):
    score = 0
    score = seq.count('0')
    return score
def test(n):
    seq = ''.join(random.choice(['0','1']) for i in range(n))
    functions = [scoring1,scoring2,scoring3]
    for i,f in enumerate(functions):
        start = time.perf_counter()  # time.clock() was removed in Python 3.8
        s = f(seq)
        elapsed = time.perf_counter() - start
        print('scoring' + str(i+1) + ': ' + str(s) + ' computed in ' + str(elapsed) + ' seconds')
test(10**7)
Typical output:
scoring1: 5000742 computed in 0.9651326495293333 seconds
scoring2: 5000742 computed in 0.7998054195159483 seconds
scoring3: 5000742 computed in 0.03732172598339578 seconds
Both of the first two approaches are blown away by the built-in count().
Moral of the story: when you are not using an already optimized built-in method, you need to optimize your own code.
Because count is executed in the underlying native implementation. The for-loop is executed in slower interpreted code.
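To see the gap on your own machine, the timeit module gives more stable numbers than a single timed run. The sketch below assumes CPython, where str.count scans the string in compiled C code while the loop executes bytecode and a comparison for every character. The function name count_with_loop, the input size, and the repeat count are arbitrary choices for the example.

```python
import random
import timeit


def count_with_loop(seq):
    """Pure-Python equivalent of seq.count('0')."""
    score = 0
    for ch in seq:
        if ch == "0":
            score += 1
    return score


random.seed(0)
seq = "".join(random.choice("01") for _ in range(100_000))
assert count_with_loop(seq) == seq.count("0")   # both give the same answer

loop_time = timeit.timeit(lambda: count_with_loop(seq), number=20)
builtin_time = timeit.timeit(lambda: seq.count("0"), number=20)
print("loop:    {:.4f} s".format(loop_time))
print("count(): {:.4f} s".format(builtin_time))
```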

Track and display percentage of code already executed

I have a very large code that takes some time to run. In order to make sure the process hasn't stalled somewhere I print to screen the percentage of the code that has already been executed, which depends on a for loop and an integer.
To display the percentage of the for loop already processed I use flags to indicate how much of the loop already passed.
The MWE might make it a bit more clear:
import time
N = 100
flag_15, flag_30, flag_45, flag_60, flag_75, flag_90 = False, False,\
    False, False, False, False
for i in range(N):
    # Large block of code.
    time.sleep(0.1)
    if i + 1 >= 0.15 * N and flag_15 is False:
        print('15%')
        flag_15 = True
    elif i + 1 >= 0.3 * N and flag_30 is False:
        print('30%')
        flag_30 = True
    elif i + 1 >= 0.45 * N and flag_45 is False:
        print('45%')
        flag_45 = True
    elif i + 1 >= 0.6 * N and flag_60 is False:
        print('60%')
        flag_60 = True
    elif i + 1 >= 0.75 * N and flag_75 is False:
        print('75%')
        flag_75 = True
    elif i + 1 >= 0.9 * N and flag_90 is False:
        print('90%')
        flag_90 = True
    elif i + 1 == N:
        print('100%')
This works but is quite verbose and truly ugly. I was wondering if there might be a better/prettier way of doing this.
I like to use modulus to periodically print status messages.
import time
N = 100
for i in range(N):
    #do work here
    if i % 15 == 0:
        print("{}% complete".format(int(100 * i / N)))
print("100% complete")
Result:
0% complete
15% complete
30% complete
45% complete
60% complete
75% complete
90% complete
100% complete
For values of N other than 100, if you want to print every 15%, you'll have to calculate the stride dynamically instead of just using the literal value 15.
import time
import math
N = 300
percentage_step = 15
stride = N * percentage_step // 100  # integer division keeps the stride a whole number
for i in range(N):
    #do work
    if i % stride == 0:
        print("{}% complete".format(int(100 * i / N)))
(Posting a second answer because this solution uses a completely different technique)
You could create a list of milestone values, and print a message when the percentage complete reaches the lowest value.
milestones = [15, 30, 45, 60, 75, 90, 100]
for i in range(N):
    #do work here
    percentage_complete = (100.0 * (i+1) / N)
    while len(milestones) > 0 and percentage_complete >= milestones[0]:
        print("{}% complete".format(milestones[0]))
        #remove that milestone from the list
        milestones = milestones[1:]
Result:
15% complete
30% complete
45% complete
60% complete
75% complete
90% complete
100% complete
Unlike the "stride" method I posted earlier, here you have precise control over which percentages are printed. They don't need to be evenly spaced, they don't need to be divisible by N, they don't even need to be integers! You could do milestones = [math.pi, 4.8, 15.16, 23.42, 99] if you wanted.
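If the same reporting is needed in several loops, the milestone idea can be wrapped in a generator so the loop body stays clean. This is only a sketch: the name with_progress and its report parameter are made up for this example, and report defaults to print so messages can be redirected elsewhere.

```python
import time


def with_progress(iterable, total, milestones=(15, 30, 45, 60, 75, 90, 100), report=print):
    """Yield items unchanged, reporting each milestone percentage once it is reached."""
    pending = sorted(milestones)
    for done, item in enumerate(iterable, start=1):
        yield item
        # Execution resumes here once the caller has finished processing item.
        percent = 100.0 * done / total
        while pending and percent >= pending[0]:
            report("{}% complete".format(pending.pop(0)))


N = 20
for item in with_progress(range(N), total=N):
    time.sleep(0.01)  # stand-in for the real work
```

Because the message is emitted when the generator resumes, each percentage is reported after the work for that item has actually finished.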
You can use a combination of write() and flush() for a nice progress bar:
import sys
import time
for i in range(100):
    row = "="*i + ">"
    sys.stdout.write("%s\r%d%%" %(row, i + 1))
    sys.stdout.flush()
    time.sleep(0.1)
sys.stdout.write("\n")
Progress will be displayed like this:
69%====================================================================>
You don't need any flags. You can just print the completion based on the current value of i.
for i in range(N):
    # lots of code
    print('{0}% completed.'.format((i+1)*100.0/N))
Just add a "\r" in Misha's answer:
import sys
import time
for i in range(100):
    row = "="*i + ">"
    sys.stdout.write("%s\r %d%%\r" %(row, i + 1))
    sys.stdout.flush()
    time.sleep(0.1)
sys.stdout.write("\n")
Output:
65%======================================================>
In colab.research.google.com it works like this:
import sys
import time
for i in range(100):
    row = "="*i + ">"
    sys.stdout.write("\r %d%% %s " %( i + 1,row))
    sys.stdout.flush()
    time.sleep(0.1)
sys.stdout.write("\n")

comparing large vectors in python

I have two large vectors (~133000 values) of different lengths. Each is sorted from small to large values. I want to find values that are similar within a given tolerance. This is my solution, but it is very slow. Is there a way to speed this up?
import numpy as np
for lv in range(np.size(vector1)):
    for lv_2 in range(np.size(vector2)):
        # lv indexes vector1 and lv_2 indexes vector2 (the two were swapped in the original)
        if np.abs(vector1[lv]-vector2[lv_2])<.02:
            print(vector1[lv],vector2[lv_2],lv,lv_2)
            break
Your algorithm is far from optimal. You compare far too many values. Assume you are at a certain position in vector1 and the current value in vector2 is already more than 0.02 bigger. Why would you compare the rest of vector2?
Start with something like
pos1 = 0
pos2 = 0
Now compare the values at those positions in your vectors. If the difference is too big, move the position of the smaller one forward and check again. Continue until you reach the end of one vector.
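The walk described above could look roughly like this in plain Python. It is a sketch under assumptions: the name first_matches is made up, both inputs must already be sorted in ascending order, and, like the break in the question's loop, it reports only the first match for each value of the first vector.

```python
def first_matches(a, b, tol=0.02):
    """For each a[i], report the first b[j] with abs(a[i] - b[j]) < tol.

    Both sequences must be sorted in ascending order.
    """
    matches = []
    j = 0
    for i, x in enumerate(a):
        # b[j] is too small for x, and therefore for every later (larger) value in a,
        # so the position in b never has to move backwards.
        while j < len(b) and x - b[j] >= tol:
            j += 1
        if j < len(b) and abs(x - b[j]) < tol:
            matches.append((x, b[j], i, j))
    return matches


print(first_matches([0.0, 1.0, 2.0, 5.0], [0.01, 1.5, 2.015, 4.0]))
```

Each position only ever moves forward, so the work grows with len(a) + len(b) rather than with their product.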
I haven't tested it, but the following should work. The idea is to exploit the fact that the vectors are sorted.
lv_1, lv_2 = 0,0
while lv_1 < len(vector1) and lv_2 < len(vector2):
    # lv_1 indexes vector1 and lv_2 indexes vector2 throughout
    if np.abs(vector1[lv_1]-vector2[lv_2])<.02:
        print(vector1[lv_1],vector2[lv_2],lv_1,lv_2)
        lv_1 += 1
        lv_2 += 1
    elif vector1[lv_1] < vector2[lv_2]: lv_1 += 1
    else: lv_2 += 1
The following code gives a nice increase in performance that depends upon how dense the numbers are. Using a set of 1000 random numbers, sampled uniformly between 0 and 40, it runs about 30 times faster than your implementation.
pos1_start = 0
for i in range(np.size(vector1)):
    for j in range(pos1_start, np.size(vector2)):
        if np.abs(vector1[i] - vector2[j]) < .02:
            results1 += [(vector1[i], vector2[j], i, j)]
        else:
            if vector2[j] < vector1[i]:
                pos1_start += 1
            else:
                break
The timing:
time new method: 0.112464904785
time old method: 3.59720897675
Which is produced by the following script:
import random
import numpy as np
import time
# initialize the vectors to be compared
vector1 = [random.uniform(0, 40) for i in range(1000)]
vector2 = [random.uniform(0, 40) for i in range(1000)]
vector1.sort()
vector2.sort()
# the arrays that will contain the results for the first method
results1 = []
# the arrays that will contain the results for the second method
results2 = []
pos1_start = 0
t_start = time.time()
for i in range(np.size(vector1)):
    for j in range(pos1_start, np.size(vector2)):
        if np.abs(vector1[i] - vector2[j]) < .02:
            results1 += [(vector1[i], vector2[j], i, j)]
        else:
            if vector2[j] < vector1[i]:
                pos1_start += 1
            else:
                break
t1 = time.time() - t_start
print("time new method:", t1)
# restart the clock so the old method is timed on its own
t_start = time.time()
for lv1 in range(np.size(vector1)):
    for lv2 in range(np.size(vector2)):
        if np.abs(vector1[lv1]-vector2[lv2])<.02:
            results2 += [(vector1[lv1], vector2[lv2], lv1, lv2)]
t2 = time.time() - t_start
print("time old method:", t2)
# sort the results
results1.sort()
results2.sort()
print(np.allclose(results1, results2))
