Working with Python 2.7.
The following code lets me input the winning percentages of two teams (WP_1 and WP_2) and a number of wins (k) and determine, given the two teams' winning percentages, the probability that team one will have more wins at the end of the season (Playoff_Probability):
import math

def PlayoffProb(WP_1, k, WP_2):
    TProb_2 = 0
    p = float(WP_1)/1000
    q = float(WP_2)/1000
    n = 162.0
    G = math.factorial(n)/(math.factorial(k)*math.factorial(n-k))
    Prob = G*(p**k)*((1-p)**(n-k))
    for c in range(0, k):
        G_2 = math.factorial(n)/(math.factorial(c)*math.factorial(n-c))
        Prob_2 = G_2*(q**c)*(1-q)**(n-c)
        TProb_2 += Prob_2
    Playoff_Probability = Prob*TProb_2
    print Playoff_Probability
    print TProb_2
But it would be a lot easier if the function could be written recursively, so that it performs the same operation for every possible value of k and returns the total probability of ending the season with more wins. I believe that total is the sum of Playoff_Probability over every value of k run through the function, which I've tried to accumulate in Total_Playoff_Probability.
I've tried the following code, but I get a TypeError telling me that 'float' object is not callable at the return Total_Playoff_Probability step. I'm also not at all sure that I've set up the recursion appropriately.
def PlayoffProb2(WP_1, k, WP_2):
    TProb_2 = 0
    Total_Playoff_Probability = 0
    p = float(WP_1)/1000
    q = float(WP_2)/1000
    n = 162.0
    G = math.factorial(n)/(math.factorial(k)*math.factorial(n-k))
    Prob = G*(p**k)*((1-p)**(n-k))
    for c in range(0, k):
        G_2 = math.factorial(n)/(math.factorial(c)*math.factorial(n-c))
        Prob_2 = G_2*(q**c)*(1-q)**(n-c)
        TProb_2 += Prob_2
    Playoff_Probability = Prob*TProb_2
    Total_Playoff_Probability += Playoff_Probability
    if k == 162:
        return Total_Playoff_Probability
    else:
        return PlayoffProb2(WP_1, k+1, WP_2)
Any help would be greatly appreciated!
return Total_Playoff_Probability(WP_1, k+1, WP_2)
I think you meant
return PlayoffProb2(WP_1, k+1, WP_2)
You've got that error because you are trying to treat a floating point number as a function. Obviously, that doesn't compute.
EDIT
Actually, it should be:
return Total_Playoff_Probability + PlayoffProb2(WP_1, k+1, WP_2)
As it is, you aren't doing anything with Total_Playoff_Probability after you calculate it. If k != 162, you just return the value for k+1.
You've called your function PlayoffProb2. You must use that name when you recurse.
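If recursion isn't a hard requirement, the same total can also be accumulated in a plain loop over k. The following is only a sketch under my own assumptions: the names binom_pmf and playoff_prob are made up for illustration, the season length defaults to 162, and the winning percentages are given per thousand as in the question. It keeps a running total of the probability that team two finishes with fewer than k wins, so nothing is recomputed from scratch for each k.

```python
from math import factorial


def binom_pmf(n, k, p):
    """Probability of exactly k wins in n games with win probability p."""
    coeff = factorial(n) // (factorial(k) * factorial(n - k))
    return coeff * p**k * (1 - p)**(n - k)


def playoff_prob(wp_1, wp_2, n=162):
    """Probability that team one ends the season with strictly more wins.

    wp_1 and wp_2 are winning percentages per thousand, e.g. 550 for .550.
    """
    p = wp_1 / 1000.0
    q = wp_2 / 1000.0
    total = 0.0
    fewer_than_k = 0.0  # P(team two wins fewer than k games)
    for k in range(n + 1):
        total += binom_pmf(n, k, p) * fewer_than_k
        fewer_than_k += binom_pmf(n, k, q)
    return total


print(playoff_prob(550, 500))
```

Two evenly matched teams come out a little under 0.5 here, because ties are not counted as "more wins".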
Related
I am trying to count the number of unique numbers in a sorted array using binary search. I need to get the edge of the change from one number to the next to count. I was thinking of doing this without using recursion. Is there an iterative approach?
def unique(x):
    start = 0
    end = len(x)-1
    count =0
    # This is the current number we are looking for
    item = x[start]
    while start <= end:
        middle = (start + end)//2
        if item == x[middle]:
            start = middle+1
        elif item < x[middle]:
            end = middle -1
        # when item is greater, change to next number
        count+=1
    # if the number
    return count

unique([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5])
Thank you.
Edit: Even if the runtime benefit over O(n) is negligible, what is my binary search missing? It's confusing when not looking for an actual item. How can I fix this?
Working code exploiting binary search (returns 3 for the given example).
As discussed in the comments, the complexity is about O(k*log(n)), where k is the number of unique items, so this approach works well when k is small compared with n and may become worse than a linear scan when k ~ n.
def countuniquebs(A):
    n = len(A)
    t = A[0]
    l = 1
    count = 0
    while l < n - 1:
        r = n - 1
        while l < r:
            m = (r + l) // 2
            if A[m] > t:
                r = m
            else:
                l = m + 1
        count += 1
        if l < n:
            t = A[l]
    return count

print(countuniquebs([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5]))
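The same repeated-binary-search idea can also be sketched with the standard library's bisect module, which avoids hand-written index bookkeeping. This is a sketch only: the name count_unique_sorted is made up for illustration and the input is assumed to be sorted in ascending order. Each step jumps to the first index past the current run of equal values, so the cost is still roughly O(k*log(n)).

```python
from bisect import bisect_right


def count_unique_sorted(a):
    """Count distinct values in an ascending sorted list."""
    count = 0
    i = 0
    n = len(a)
    while i < n:
        count += 1
        # Jump to the first position after the run of values equal to a[i].
        i = bisect_right(a, a[i], i, n)
    return count


print(count_unique_sorted([1] * 10 + [2] * 11 + [5] * 10))
```

It also handles the empty list and lists where every element is distinct.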
I wouldn't quite call it "using a binary search", but this binary divide-and-conquer algorithm works in O(k*log(n)/log(k)) time, which is better than a repeated binary search, and never worse than a linear scan:
def countUniques(A, start, end):
    len = end-start
    if len < 1:
        return 0
    if A[start] == A[end-1]:
        return 1
    if len < 3:
        return 2
    mid = start + len//2
    return countUniques(A, start, mid+1) + countUniques(A, mid, end) - 1

A = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,4,5,5,5,5,5,5,5,5,5,5]
print(countUniques(A,0,len(A)))
I am working on a python algorithm to find the most frequent element in the list.
def GetFrequency(a, element):
    return sum([1 for x in a if x == element])

def GetMajorityElement(a):
    n = len(a)
    if n == 1:
        return a[0]
    k = n // 2
    elemlsub = GetMajorityElement(a[:k])
    elemrsub = GetMajorityElement(a[k:])
    if elemlsub == elemrsub:
        return elemlsub
    lcount = GetFrequency(a, elemlsub)
    rcount = GetFrequency(a, elemrsub)
    if lcount > k:
        return elemlsub
    elif rcount > k:
        return elemrsub
    else:
        return None
I tried some test cases. Some of them pass, but some of them fail.
For example, [1,2,1,3,4] should return 1, but I get None.
The implementation follows the pseudocode here:
http://users.eecs.northwestern.edu/~dda902/336/hw4-sol.pdf
The pseudocode finds the majority item, which needs to make up at least half of the list. I only want to find the most frequent item.
Can I get some help?
Thanks!
I wrote an iterative version instead of the recursive one you're using in case you wanted something similar.
def GetFrequency(array):
    majority = int(len(array)/2)
    result_dict = {}
    while array:
        array_item = array.pop()
        if result_dict.get(array_item):
            result_dict[array_item] += 1
        else:
            result_dict[array_item] = 1
        if result_dict[array_item] > majority:
            return array_item
    return max(result_dict, key=result_dict.get)
This will iterate through the array and return the value as soon as one hits more than 50% of the total (being a majority). Otherwise it goes through the entire array and returns the value with the greatest frequency.
def majority_element(a):
    return max([(a.count(elem), elem) for elem in set(a)])[1]
EDIT
If there is a tie, the biggest value is returned. E.g: a = [1,1,2,2] returns 2. Might not be what you want but that could be changed.
EDIT 2
The pseudocode you gave divides the array into 1 to k inclusive and k + 1 to n. Your code does 1 to k - 1 and k to the end; I'm not sure whether it changes much, though. If you want to follow the algorithm you gave, you should do:
elemlsub = GetMajorityElement(a[:k+1]) # this slice is indices 0 to k
elemrsub = GetMajorityElement(a[k+1:]) # this one is k + 1 to n.
Also still according to your provided pseudocode, lcount and rcount should be compared to k + 1, not k:
if lcount > k + 1:
    return elemlsub
elif rcount > k + 1:
    return elemrsub
else:
    return None
EDIT 3
Some people in the comments pointed out that the provided pseudocode solves not for the most frequent item but for the item that accounts for more than 50% of the occurrences. So the output for your example is indeed correct. There is a good chance that your code already works as is.
EDIT 4
If you want to return None when there is a tie, I suggest this:
def majority_element(a):
    n = len(a)
    if n == 1:
        return a[0]
    if n == 0:
        return None
    sorted_counts = sorted([(a.count(elem), elem) for elem in set(a)], key=lambda x: x[0])
    if len(sorted_counts) > 1 and sorted_counts[-1][0] == sorted_counts[-2][0]:
        return None
    return sorted_counts[-1][1]
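If "most frequent element" is really what is wanted, the standard library's collections.Counter can do the counting in one pass instead of calling a.count for every distinct value. A minimal sketch, assuming the elements are hashable; the name most_frequent is made up for illustration, and ties are resolved however Counter.most_common orders them rather than by returning None.

```python
from collections import Counter


def most_frequent(items):
    """Return the most common element of items, or None for an empty input."""
    if not items:
        return None
    # most_common(1) returns a one-element list of (value, count) pairs.
    value, _count = Counter(items).most_common(1)[0]
    return value


print(most_frequent([1, 2, 1, 3, 4]))
```

For the example from the question, [1,2,1,3,4], this gives 1.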
All, I would like to measure how close an array is to being sorted by using the merge sort algorithm. I am able to use merge sort to sort the array, but I have trouble keeping count of how many inversions are needed during the process.
For example, for the input [9,4,8,3], I want to get the output [3,4,8,9] and 4 inversions. The definition of an inversion is: if b is in B, c is in C and b > c, then an inversion is needed (the order of B and C matters). First, I get two parts, ([4,9],1) and ([3,8],1), each of which indicates one inversion. Then, when they are merged, there are another two inversions: choosing 3 instead of 4, and choosing 8 instead of 9.
My main question might not relate to the algorithm itself. It is about how to keep one of my variables updated across the recursive calls of a function (I am calling Merge_Sort within Merge_Sort).
def Merge_Sort(a):
    n = len(a)
    if n==1:
        if not 'total_rev' in vars():
            total_rev = 0
        else:
            total_rev += rev_ind
        return a , total_rev
    else:
        m = math.floor(n/2)
        b , rev_ind_b = Merge_Sort(a[:m])
        if not 'total_rev' in vars():
            total_rev = 0
        else:
            total_rev += rev_ind_b
        c , rev_ind_c = Merge_Sort(a[m:])
        if not 'total_rev' in vars():
            total_rev = 0
        else:
            total_rev += rev_ind_c
        a_sort , rev_ind = Merge(b,c)
        if not 'total_rev' in vars():
            total_rev = 0
            total_rev += rev_ind
        else :
            total_rev += rev_ind
        return a_sort , total_rev

def Merge(b,c):
    p = len(b)
    q = len(c)
    d = []
    reverse_ind = 0
    while len(b)!=0 or len(c)!=0 :
        if (len(b)*len(c) != 0) :
            b0 = b[0]
            c0 = c[0]
            if b0 <= c0 :
                d.append(b0)
                b.remove(b[0])
            else :
                reverse_ind += 1
                d.append(c0)
                c.remove(c[0])
        else :
            d.extend(b)
            b=[]
            d.extend(c)
            c=[]
    return d,reverse_ind
The Merge function works well. The only problem is that I cannot keep the variable "total_rev" updated as I wish. I try to define "total_rev" whenever it is not defined. I'm not sure this is a good approach, because it makes my code messy. I also tried using a global variable, but that did not work well. Thank you!
It is simpler than that:
At the deepest recursion level (n==1), just return 0 for the number of swaps. The logic is that you should return the number of swaps for the list as it is at that recursion level, without any consideration of what the larger list may be. So when n==1 your list has one value, which obviously does not need swapping.
In other cases, just add up the counts you get from the recursive calls. That way they will increase when bubbling back up the recursion tree.
Here is the adapted code for Merge_Sort:
def Merge_Sort(a):
    n = len(a)
    if n==1:
        return a, 0 # at deepest recursion always return 0 for the number of swaps
    else:
        m = n//2 # use integer division; you don't need `math.floor`
        b , rev_ind_b = Merge_Sort(a[:m])
        c , rev_ind_c = Merge_Sort(a[m:])
        a_sort , rev_ind = Merge(b,c)
        return a_sort , rev_ind_b + rev_ind_c + rev_ind # add the numbers
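A side note on the counting itself: the Merge in the question adds 1 each time an element is taken from the right half, which matches the 4 given in the example. The textbook inversion count (pairs i < j with a[i] > a[j]) instead adds the number of elements still waiting in the left half, which gives 5 for [9,4,8,3]. The sketch below shows that variant in case it is the one wanted; the name sort_and_count is made up for illustration, and it returns the counts up the recursion exactly as described above.

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of pairs i < j with a[i] > a[j])."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    mid = n // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])

    merged = []
    inv_split = 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # right[j] jumps ahead of every element still left in `left`.
            inv_split += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + inv_split


print(sort_and_count([9, 4, 8, 3]))
```

Using indices instead of removing from the front of the lists also keeps the merge step linear.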
I need to write a function that returns the number of ways of reaching a certain number by adding numbers of a list. For example:
print(p([3,5,8,9,11,12,20], 20))
should return: 5
The code I wrote is:
def pow(lis):
    power = [[]]
    for lst in lis:
        for po in power:
            power = power + [list(po)+[lst]]
    return power

def p(lst, n):
    counter1 = 0
    counter2 = 0
    power_list = pow(lst)
    print(power_list)
    for p in power_list:
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
            counter1 == 0
        else:
            counter1 == 0
    return counter2
pow() is a function that returns all of the subsets of the list and p should return the number of ways to reach the number n. I keep getting an output of zero and I don't understand why. I would love to hear your input for this.
Thanks in advance.
There are two typos in your code: counter1 == 0 is a boolean, it does not reset anything.
This version should work:
def p(lst, n):
    counter2 = 0
    power_list = pow(lst)
    for p in power_list:
        counter1 = 0 #reset the counter for every new subset
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
    return counter2
As tobias_k and Faibbus mentioned, you have a typo: counter1 == 0 instead of counter1 = 0, in two places. The counter1 == 0 produces a boolean object of True or False, but since you don't assign the result of that expression the result gets thrown away. It doesn't raise a SyntaxError, since an expression that isn't assigned is legal Python.
As John Coleman and B. M. mention it's not efficient to create the full powerset and then test each subset to see if it has the correct sum. This approach is ok if the input sequence is small, but it's very slow for even moderately sized sequences, and if you actually create a list containing the subsets rather than using a generator and testing the subsets as they're yielded you'll soon run out of RAM.
B. M.'s first solution is quite efficient since it doesn't produce subsets that are larger than the target sum. (I'm not sure what B. M. is doing with that dict-based solution...).
But we can enhance that approach by sorting the list of sums. That way we can break out of the inner for loop as soon as we detect a sum that's too high. True, we need to sort the sums list on each iteration of the outer for loop, but fortunately Python's TimSort is very efficient, and it's optimized to handle sorting a list that contains sorted sub-sequences, so it's ideal for this application.
def subset_sums(seq, goal):
    sums = [0]
    for x in seq:
        subgoal = goal - x
        temp = []
        for y in sums:
            if y > subgoal:
                break
            temp.append(y + x)
        sums.extend(temp)
        sums.sort()
    return sum(1 for y in sums if y == goal)

# test
lst = [3, 5, 8, 9, 11, 12, 20]
total = 20
print(subset_sums(lst, total))
lst = range(1, 41)
total = 70
print(subset_sums(lst, total))
output
5
28188
With lst = range(1, 41) and total = 70, this code is around 3 times faster than the B.M. lists version.
A one-pass solution with one counter, which minimizes additions.
def one_pass_sum(L,target):
    sums = [0]
    cnt = 0
    for x in L:
        for y in sums[:]:
            z = x+y
            if z <= target :
                sums.append(z)
                if z == target : cnt += 1
    return cnt
This way, if n = len(L), you make fewer than 2^n additions, against n/2 * 2^n when all the subset sums are calculated from scratch.
EDIT :
A more efficient solution that just counts ways. The idea is that if there are k ways to make z-x, there are k more ways to make z once x becomes available.
def enhanced_sum_with_lists(L,target):
    cnt=[1]+[0]*target # 1 way to make 0
    for x in L:
        for z in range(target,x-1,-1): # [target, ..., x+1, x]
            cnt[z] += cnt[z-x]
    return cnt[target]
But order is important: z must be visited in descending order here to get the right counts (thanks to PM 2Ring).
This can be very fast (n*target additions) for big lists.
For example :
>>> enhanced_sum_with_lists(range(1,100),2500)
875274644371694133420180815
is obtained in 61 ms. It would take the age of the universe to compute it by the first method.
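To see why the descending order matters, here is a small side-by-side sketch. Both function names are made up for illustration. The first is the same counting table as above; the second differs only in walking z upwards, which lets the same x be added more than once, so it counts combinations with repetition instead of subsets.

```python
def count_subsets(nums, target):
    """Number of subsets of nums that sum to target (each item used at most once)."""
    ways = [1] + [0] * target  # one way to make 0: the empty subset
    for x in nums:
        # Descending z: ways[z - x] has not yet been updated for this x.
        for z in range(target, x - 1, -1):
            ways[z] += ways[z - x]
    return ways[target]


def count_with_reuse(nums, target):
    """Same table filled in ascending order: each item may be reused."""
    ways = [1] + [0] * target
    for x in nums:
        # Ascending z: ways[z - x] may already include x, so x can repeat.
        for z in range(x, target + 1):
            ways[z] += ways[z - x]
    return ways[target]


print(count_subsets([1, 2], 3))     # only 1 + 2
print(count_with_reuse([1, 2], 3))  # 1 + 2 and 1 + 1 + 1
```

With the list from the question, count_subsets([3,5,8,9,11,12,20], 20) gives the expected 5.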
from itertools import chain, combinations

def powerset_generator(i):
    for subset in chain.from_iterable(combinations(i, r) for r in range(len(i)+1)):
        yield set(subset)

def count_sum(s, cnt):
    return sum(1 for i in powerset_generator(s) if sum(k for k in i) == cnt)

print(count_sum(set([3,5,8,9,11,12,20]), 20))
This is a simple question that has been bothering me for a while now.
I am attempting to rewrite my code to be parallel, and in the process I need to split up a sum to be done on multiple nodes and then add those small sums together. The piece that I am working with is this:
def pia(n, i):
    k = 0
    lsum = 0
    while k < n:
        p = (n-k)
        ld = (8.0*k+i)
        ln = pow(16.0, p, ld)
        lsum += (ln/ld)
        k += 1
    return lsum
where n is the limit and i is an integer. Does anyone have some hints on how to split this up and get the same result in the end?
Edit: For those asking, I'm not using pow() but a custom version to do it efficiently with floating point:
def ssp(b, n, m):
    ssp = 1
    while n>0:
        if n % 2 == 1:
            ssp = (b*ssp) % m
        b = (b**2) % m
        n = n // 2
    return ssp
Since the only variable that's used from one pass to the next is k, and k just increments by one each time, it's easy to split the calculation.
If you also pass k into pia, then you'll have definable starting and ending points, and you can split this up into as many pieces as you want and, at the end, add all the results together. So something like:
# instead of pia(20000, i), use pia(n, i, k) and run
result = pia(20000, i, 10000) + pia(10000, i, 0)
Also, since n is used to both set the limits and in the calculation directly, these two uses need to be split.
def pia(nlimit, ncalc, i, k):
    lsum = 0
    while k < nlimit:
        p = ncalc-k
        ld = 8.0*k+i
        ln = ssp(16., p, ld)  # ssp is the custom modular power from the question
        lsum += ln/ld
        k += 1
    return lsum

if __name__=="__main__":
    i, ncalc = 5, 10
    print(pia(10, ncalc, i, 0))
    print(pia(5, ncalc, i, 0) + pia(10, ncalc, i, 5))
Looks like I found a way. What I did was in the sum I had each node calculate a portion (ex. node one calculates k=1, node 2 k=2, node 3 k=3, node 4 k=4, node 1 k=5...) and then gathered them up and added them.
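For anyone who wants to see that round-robin split written out, here is a rough single-process sketch. The names mod_pow and partial_sum are made up for illustration, mod_pow is just a restatement of the ssp helper above, and the "nodes" are simulated with a plain loop rather than any particular parallel library. Worker w handles k = w, w + workers, w + 2*workers, and so on.

```python
def mod_pow(base, exponent, modulus):
    """Square-and-multiply modular power that also accepts a float modulus."""
    result = 1.0
    while exponent > 0:
        if exponent % 2 == 1:
            result = (base * result) % modulus
        base = (base * base) % modulus
        exponent //= 2
    return result


def partial_sum(n, i, start, step):
    """Sum the terms for k = start, start + step, ... below n."""
    total = 0.0
    for k in range(start, n, step):
        denom = 8.0 * k + i
        total += mod_pow(16.0, n - k, denom) / denom
    return total


n, i, workers = 100, 5, 4
serial = partial_sum(n, i, 0, 1)
# Each "node" computes its own slice; the slices are then added together.
pieces = [partial_sum(n, i, w, workers) for w in range(workers)]
print(serial, sum(pieces))
```

Because floating-point addition is not associative, the combined result can differ from the serial one in the last few bits.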