Python Lottery Number Generation

I am working on a lottery number generation program. I have a fixed list of allowed numbers (1-80) from which users can choose 6 numbers. Each number can only be picked once. I want to generate all possible combinations efficiently. Current implementation takes more than 30 seconds if allowed_numbers is [1,...,60]. Above that, it freezes my system.
from itertools import combinations
import numpy as np
LOT_SIZE = 6
allowed_numbers = np.arange(1, 61)
all_combinations = np.array(list(combinations(allowed_numbers, LOT_SIZE)))
print(len(all_combinations))
I think I would need a numpy array (not sure if 2D). Something like,
[[1,2,3,4,5,6],
 [1,2,3,4,5,7], ...]
because I want to (quickly) perform several operations on these combinations. These operations may include,
Removing combinations that contain only even numbers
Removing combinations whose sum is greater than 150, etc.
Checking that there is only one pair of consecutive numbers (Acceptable: [1,2,4,6,8,10] {Pair: (1,2)} | Not acceptable: [1,2,4,5,7,9] {Pairs: (1,2) and (4,5)})
Any help will be appreciated.
Thanks

Some options:
1) apply filters on the iterable instead of on the data, using filter:
def filt(x):
    return sum(x) < 7

list(filter(filt, itertools.combinations(allowed, n)))
will save ~15% time vs. constructing the list and applying the filter afterwards, i.e.:
[i for i in itertools.combinations(allowed, n) if filt(i)]
2) Use np.fromiter
arr = np.fromiter(itertools.chain.from_iterable(itertools.combinations(allowed, n)), int).reshape(-1, n)
arr = arr[arr.sum(1) < 7]
3) work on the generator object itself. In the example above, you can stop the itertools.combinations when the first number is above 7 (as an example):
def my_generator():
    for i in itertools.combinations(allowed, n):
        if i[0] >= 7:
            return
        elif sum(i) < 7:
            yield i

list(my_generator())  # will build ~3x faster than option 1
Note that np.fromiter becomes less efficient on compound expressions, so the mask is applied afterwards

You can use itertools.combinations(allowed_numbers, 6) to get all combinations of length 6 from your list (this is the fastest way to get this operation done).
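If you do build the full 2-D array, all three filters the question lists can be expressed as boolean masks, which NumPy evaluates quickly. A minimal sketch, shrunk to the numbers 1-20 so it runs in a moment (the mask logic is identical for 1-80):

```python
import numpy as np
from itertools import combinations

LOT_SIZE = 6
allowed_numbers = np.arange(1, 21)  # 1-20 here; the question uses up to 1-80

# Build the 2-D array of all combinations (rows come out sorted ascending)
arr = np.fromiter(
    (x for combo in combinations(allowed_numbers, LOT_SIZE) for x in combo),
    dtype=np.int64,
).reshape(-1, LOT_SIZE)

has_odd = (arr % 2 == 1).any(axis=1)             # drop all-even rows
sum_ok = arr.sum(axis=1) <= 150                  # drop rows summing above 150
pairs = (np.diff(arr, axis=1) == 1).sum(axis=1)  # rows are sorted, so adjacent
one_pair = pairs == 1                            # diffs find consecutive pairs

result = arr[has_odd & sum_ok & one_pair]
print(arr.shape, result.shape)
```

Because each mask is a single vectorized pass over the array, applying all three is far cheaper than re-checking each row in Python.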

Related

Python: Memory-efficient random sampling of list of permutations

I am seeking to sample n random permutations of a list in Python.
This is my code:
obj = [ 5 8 9 ... 45718 45719 45720]
#type(obj) = numpy.ndarray
pairs = random.sample(list(permutations(obj, 2)), k=150)
Although the code does what I want it to, it causes memory issues. I sometimes receive a MemoryError when running on CPU, and when running on GPU, my virtual machine crashes.
How can I make the code work in a more memory-efficient manner?
This avoids using permutations at all:
count = len(obj)

def index2perm(i, obj):
    i1, i2 = divmod(i, len(obj) - 1)
    if i1 <= i2:
        i2 += 1
    return (obj[i1], obj[i2])

pairs = [index2perm(i, obj) for i in random.sample(range(count*(count-1)), k=3)]
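To see why the index trick works: each index below count*(count-1) maps to a distinct ordered pair with two different positions, so sampling indices is equivalent to sampling 2-element permutations without ever building them. A quick self-check sketch, using a small made-up obj:

```python
import random

obj = [5, 8, 9, 13, 21]  # hypothetical small stand-in for the real array
count = len(obj)

def index2perm(i, obj):
    i1, i2 = divmod(i, len(obj) - 1)
    if i1 <= i2:
        i2 += 1  # skip the "paired with itself" slot
    return (obj[i1], obj[i2])

# All count*(count-1) indices map to distinct ordered pairs
all_pairs = {index2perm(i, obj) for i in range(count * (count - 1))}
assert len(all_pairs) == count * (count - 1)

pairs = [index2perm(i, obj) for i in random.sample(range(count * (count - 1)), k=3)]
print(pairs)
```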
Building on Pablo Ruiz's excellent answer, I suggest wrapping his sampling solution into a generator function that yields unique permutations by keeping track of what it has already yielded:
import numpy as np

def unique_permutations(sequence, r, n):
    """Yield n unique permutations of r elements from sequence."""
    seen = set()
    while len(seen) < n:
        # This line of code adapted from Pablo Ruiz's answer:
        candidate_permutation = tuple(np.random.choice(sequence, r, replace=False))
        if candidate_permutation not in seen:
            seen.add(candidate_permutation)
            yield candidate_permutation

obj = list(range(10))

for permutation in unique_permutations(obj, 2, 15):
    ...  # do something with the permutation

# Or, to save the result as a list:
pairs = list(unique_permutations(obj, 2, 15))
My assumption is that you are sampling a small subset of the very large number of possible permutations, in which case collisions will be rare enough that keeping a seen set will not be expensive.
Warnings: this function is an infinite loop if you ask for more permutations than are possible given the inputs. It will also get increasingly slow as n gets close to the number of possible permutations, since collisions will become increasingly frequent.
If I were to put this function in my code base, I would put a shield at the top that calculated the number of possible permutations and raised a ValueError exception if n exceeded that number, and maybe output a warning if n exceeded one tenth that number, or something like that.
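The shield described above might look like the following sketch; math.perm (Python 3.8+) gives the number of possible r-length permutations, and the function name check_request is made up for illustration:

```python
import math

def check_request(sequence, r, n):
    # The "shield": refuse impossible requests, warn when n nears the limit.
    # math.perm(k, r) is the number of r-length permutations of k items.
    possible = math.perm(len(sequence), r)
    if n > possible:
        raise ValueError(f"requested {n} permutations but only {possible} exist")
    if n > possible // 10:
        print(f"warning: {n} is a large share of the {possible} possible permutations")

check_request(list(range(10)), 2, 15)  # 90 possible, so this passes
```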
You can avoid materializing the permutation iterator, which could be massive in memory, by generating random permutations directly: sample the list with replace=False.
import numpy as np
obj = np.array([5,8,123,13541,42])
k = 15
permutations = [tuple(np.random.choice(obj, 2, replace=False)) for _ in range(k)]
print(permutations)
This problem becomes much harder if you, for example, impose no repetition among your random permutations.
Edit, no repetitions code
I think this is the best possible approach for the non repetition case.
We index all possible permutations from 0 to n**2-n-1, treating them as the off-diagonal cells of an n x n permutation matrix (the diagonal, which would pair an element with itself, is skipped). We sample the indexes without repetition and without listing them, then map each sample to its matrix coordinates and read the permutation off the matrix.
import random
import numpy as np
obj = np.array([1,2,3,10,43,19,323,142,334,33,312,31,12])
k = 150
obj_len = len(obj)
indexes = random.sample(range(obj_len**2 - obj_len), k)

def mapm(m):
    # shift the flat index past the diagonal cells
    return m + m // obj_len + 1

permutations = [(obj[mapm(i) // obj_len], obj[mapm(i) % obj_len]) for i in indexes]
This approach is not based on any assumption, does not load the permutations and also the performance is not based on a while loop failing to insert duplicates, as no duplicates are ever generated.
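A small sanity check of the index-to-matrix mapping: mapm shifts each flat index past the diagonal, so every off-diagonal cell of the obj_len x obj_len matrix is produced exactly once, and no pair ever repeats an element. A sketch with a made-up 5-element obj:

```python
import random
import numpy as np

obj = np.array([1, 2, 3, 10, 43])  # small made-up array
obj_len = len(obj)

def mapm(m):
    # shift the flat index past the diagonal of the obj_len x obj_len matrix
    return m + m // obj_len + 1

# every off-diagonal cell is hit exactly once, and never the diagonal
cells = {(mapm(i) // obj_len, mapm(i) % obj_len) for i in range(obj_len**2 - obj_len)}
assert len(cells) == obj_len**2 - obj_len
assert all(row != col for row, col in cells)

k = 5
indexes = random.sample(range(obj_len**2 - obj_len), k)
permutations = [(obj[mapm(i) // obj_len], obj[mapm(i) % obj_len]) for i in indexes]
print(permutations)
```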

How can I obtain all combinations of my list?

I am attempting to create an array with all the possible combinations of two numbers.
My array is [0, 17.1]
I wish to obtain all the possible combinations of these two values in a list 48 elements long, where both values can be repeated.
from itertools import combinations_with_replacement
array = [0, 17.1]
combo_wr = combinations_with_replacement(array, 48)
print(len(list(combo_wr)))
I have attempted to make use of itertools.combinations_with_replacement to create something which looks like the following -> combo_wr = combinations_with_replacement(array, 48).
When I print the length of this I would expect a much larger number, but I am only getting 49 combinations of these numbers. Where am I going wrong, or what other functions would work better to get all the possible combinations? Order does not matter in this instance.
Below is what I have tried so far for reproducibility
>>> from itertools import combinations_with_replacement
>>> array = [0, 17.1]
>>> combo_wr = combinations_with_replacement(array, 48)
>>> print(len(list(combo_wr)))
49
A sequence of 48 numbers, each chosen from 2 options, gives a search space of 2^48, which is about 281 trillion.
The added constraint that the sum of the numbers must be at least 250 means, with [0, 17.1], that at least 15 of the elements must be 17.1. That only rules out on the order of 48-choose-15 ≈ 1 trillion sequences, i.e. not enough to make much of a difference.
If you set the first (or last) 15 elements to 17.1, that would reduce the search space to choosing the remaining elements, so 2^(48-15) = 2^33, which is about 8.6 billion, but I'm not sure that is the constraint you actually want, or whether that is still small enough to be useful.
So code that produces the results you asked for is not likely to help you.
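The arithmetic above can be checked directly with math.comb: the sum constraint forces at least 15 entries of 17.1 (since 14 × 17.1 = 239.4 falls short of 250), and the sequences that get excluded are a tiny slice of the space:

```python
from math import comb

total = 2 ** 48
print(total)  # 281474976710656, roughly 281 trillion

# sum >= 250 with values {0, 17.1} needs at least 15 nonzero entries,
# since 14 * 17.1 = 239.4 < 250 <= 15 * 17.1
excluded = sum(comb(48, k) for k in range(15))  # sequences with fewer than 15
print(excluded * 100 < total)  # the constraint prunes under 1% of the space
```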
But if you still wanted help generating those trillions of combinations
To clarify the different options available to you:
itertools.product gives every possible sequence, like every sequence of heads and tails
itertools.combinations gives the unordered subsets of a given length
itertools.permutations gives all ways of reordering the given sequence, or every ordering of all subsets of a given length
itertools.combinations_with_replacement gives one result per multiset of options; for a 2-element input this is like asking, after n coin flips, for one sequence per possible number of heads
permutations and combinations don't make sense with len(array)==2 and r=48, since they are about subsets, and product will include far more redundancy than you want.
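A small-scale illustration of those four functions, with length 4 standing in for 48 so the lists fit in memory:

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

opts = [0, 17.1]
r = 4  # small stand-in for 48

print(len(list(product(opts, repeat=r))))                 # 16: every ordered sequence, 2**4
print(len(list(combinations_with_replacement(opts, r))))  # 5: one per count of 17.1 (0-4)
print(len(list(permutations(opts, r))))                   # 0: can't pick 4 distinct from 2
print(len(list(combinations(opts, r))))                   # 0: no 4-element subsets of 2 items
```

This is exactly why the question saw only 49 results: combinations_with_replacement of 2 values taken 48 at a time yields one entry per possible count of 17.1, i.e. 49 multisets, not 2^48 sequences.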
order does not matter in the instance.
If this is the case then it is possible you are just expecting more combos then there are.
I wish to get all of them, but is it possible to narrow them down to those that satisfy, say, a summed value >= 250?
ok, so then you can get every unique value for the sum of elements with combinations_with_replacement, then do permutations on each:
array = [0, 17.1]
reps = 48
lower_bound = 250
upper_bound = float("inf")  # you might have an upper bound; if not, remove it from the condition below or leave it as inf

for combo in combinations_with_replacement(array, reps):
    if lower_bound <= sum(combo) <= upper_bound:
        # this count of 17.1-elements meets the criteria for further investigation
        for perm in permutations(combo):
            do_thing(perm)
although this still ends up visiting a ton of duplicate entries, since permutations of a sequence with many repeated elements will swap equal elements and yield the same sequence, so we can do better.
First, the combinations_with_replacement is really only communicating how many of each element we are dealing with, so we can just do for k in range(reps) to get that info, and then we want every sequence that has exactly k repeats of the second element in array, which happens to be equivalent to choosing k indices to set to that value.
So we can use combinations(range(reps), k) to get the sets of indices to set to the second element, and this, I believe, is the smallest set of possible sequences you would have to check to meet the "sum is greater than 250" requirement.
reps = 48

def isSummationValidCombo(summation):
    return summation >= 250

for k in range(reps + 1):
    summation = array[1] * k + array[0] * (reps - k)
    if not isSummationValidCombo(summation):
        continue
    for indices_of_sequence_to_set_to_second_element in combinations(range(reps), k):
        # each combination of k indices to set to the higher value
        seq = [array[0]] * reps
        for idx in indices_of_sequence_to_set_to_second_element:
            seq[idx] = array[1]
        do_thing(seq)
this would leave your number of combinations as 280 trillion compared to the 281 trillion that would be hit by product so you will probably need to figure out other techniques to reduce search space
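The per-k counting view can be verified at a small scale against brute force; here reps=6 and a threshold of 30 stand in for the real 48 and 250:

```python
from itertools import combinations, product

array = [0, 17.1]
reps = 6         # small stand-in for 48
threshold = 30   # small stand-in for 250

count = 0
for k in range(reps + 1):
    if array[1] * k + array[0] * (reps - k) < threshold:
        continue
    # every choice of k indices to hold 17.1 is exactly one valid sequence
    count += sum(1 for _ in combinations(range(reps), k))

# brute-force cross-check over all 2**reps sequences
brute = sum(1 for seq in product(array, repeat=reps) if sum(seq) >= threshold)
assert count == brute
print(count)  # 57: sum of C(6,k) for k = 2..6
```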

Algorithm - Grouping List in unique pairs

I'm having difficulties with an assignment I've received, and I am pretty sure the problem's text is flawed. I've translated it to this:
Consider a list x[1..2n] with elements from {1,2,..,m}, m < n. Propose and implement in Python an algorithm with a complexity of O(n) that groups the elements into pairs (pairs of (x[i],x[j]) with i < j) such that every element is present in exactly one pair. For each set of pairs, calculate the maximum sum of the pairs, then compare it with the rest of the sets. Return the set that has the minimum of those maximums.
For example, x = [1,5,9,3] can be paired in three ways:
(1,5),(9,3) => Sums: 6, 12 => Maximum 12
(1,9),(5,3) => Sums: 10, 8 => Maximum 10
(1,3),(5,9) => Sums: 4, 14 => Maximum 14
----------
Minimum 10
Solution to be returned: (1,9),(5,3)
The things that strike me oddly are as follows:
List contents definition: it says there are 2n elements, taken from {1..m}, m < n. But if m < n, then there aren't enough distinct values to populate the list without duplicating some, which is not allowed. So I would assume m >= 2n. Also, the example has n = 2 (which would force m = 1) yet uses elements greater than 1, so I assume that's what they meant.
O(n) complexity? So is there a way to combine them in a single loop? I can't think of anything.
My Calculations:
For n = 4:
Number of ways to combine: 6
Valid ways: 3
For n = 6
Number of ways to combine: 910
Valid ways: 15
For n = 8
Number of ways to combine: >30 000
Valid ways: ?
So obviously, I cannot use brute force and then figure out if it is valid after then. The formula I used to calculate the total possible ways is
C(C(n,2),n/2)
Question:
Is this problem wrongly written and impossible to solve? If so, what conditions should be added or removed to make it feasible? If you are going to suggest some code in python, remember I cannot use any prebuilt functions of any kind. Thank you
Assuming a sorted list:
def answer(L):
    return list(zip(L[:len(L)//2], L[len(L)//2:][::-1]))
Or if you want to do it more manually:
def answer(L):
    answer = []
    for i in range(len(L)//2):
        answer.append((L[i], L[len(L)-i-1]))
    return answer
Output:
In [3]: answer([1,3,5,9])
Out[3]: [(1, 9), (3, 5)]
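For a small list, the zip pairing can be checked against brute force over all orderings; the claim is that pairing smallest-with-largest minimizes the maximum pair sum:

```python
from itertools import permutations

def answer(L):
    # pair the smallest half with the reversed largest half (L must be sorted)
    return list(zip(L[:len(L)//2], L[len(L)//2:][::-1]))

def brute_min_max_pair_sum(L):
    # exhaustive check over all orderings (exponential; for validation only)
    best = float("inf")
    for perm in permutations(L):
        pair_sums = [perm[i] + perm[i + 1] for i in range(0, len(perm), 2)]
        best = min(best, max(pair_sums))
    return best

L = [1, 3, 5, 9]
print(answer(L))  # [(1, 9), (3, 5)]
assert max(a + b for a, b in answer(L)) == brute_min_max_pair_sum(L)
```

Since answer only does one zip over a pre-sorted list, it meets the O(n) requirement (given sorted input).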

iterating from a tuple of tuples optimizations

Basic problem: take a list of digits, find all permutations, filter, filter again, and sum.
This is my first Python script, so after some research I decided to use itertools.permutations. I then iterate through the iterator and build a new tuple of tuples holding only the tuples I wanted. I then concatenate each tuple's digits, because I want the permutations as numbers, not as broken-up strings.
Then I do one more filter and sum them together.
For 8 digits this takes me about 2.5 seconds, far too slow if I want to scale to 15 digits (my goal).
(I decided to use tuples since a list of the permutations will be too large for memory)
EDIT: I realized that I don't care about the sum of the permutations, but rather just the count. If going the generator path, how could I include a counter instead of taking the sum?
Updated my original code with [very] slight improvement shortcuts, so as not to just copy-paste suggested answers before I truly understand them.
import itertools
digits = [0,1,2,3,4,5,6,7]
digital = itertools.permutations(digits)
mytuple = ()
for i in digital:
    q = ''
    j = list(i)
    if j[0] != 0:
        for k in range(len(j)):
            q = q + str(j[k])
        mytuple = mytuple + (int(q),)  # store as int so the % 7 test works
#print mytuple
z = [i for i in mytuple if i % 7 == 0]
print len(z)
this being my first python script, any non-optimization pointers would also be appreciated.
thanks!
"Generator comprehensions" are your friend. Not least because a "generator" only works on one element at a time, helping you save memory. Also, some time can be saved by pre-computing the relevant powers of 10 and performing integer arithmetic instead of converting to and from strings:
import itertools
digits = [0,1,2,3,4,5,6,7,8,9]
oom = [ 10 ** i for i, digit in enumerate( digits ) ][ ::-1 ] # orders of magnitude
allperm = itertools.permutations( digits )
firstpass = ( sum( a * b for a, b in zip( perm, oom ) ) for perm in allperm if perm[ 0 ] )
print sum( i for i in firstpass if i % 7 == 0 )
This is faster than the original by a large factor, but the factorial nature of permutations means that 15 digits is still a long way away. I get 0.05s for len(digits)==8, 0.5s for len(digits)==9, but 9.3s for len(digits)==10...
Since you're working in base 10, digits sequences of length >10 will contain repeats, leading to repeats in the set of permutations. Your strategy will need to change if the repeats are not supposed to be counted separately (e.g. if the question is phrased as "how many 15-digit multiples of 7 are repermutations of the following digits...").
Using itertools is a good choice. Well investigated.
I tried to improve the nice solution of @jez. I rearranged the range, replaced zip with izip, and cached the lookup in a local variable.
N = 10
gr = xrange(N-1,-1,-1)
ap = itertools.permutations(gr)
o = [10 ** i for i in gr]
zip = itertools.izip
print sum(i for i in (sum(a*b for a, b in zip(p, o)) for p in ap if p[0]) if i % 7 == 0)
For me it's about 17% faster for N=9 and 7% for N=10. The speed improvement may be negligible for larger N's, but I haven't tested.
There are many short-cuts in python you're missing. Try this:
import itertools
digits= [0,1,2,3,4,5,6,7]
digital=(itertools.permutations(digits))
mytuple=set()
for i in digital:
    if i[0] != 0:
        mytuple.add(int(''.join(str(d) for d in i)))
z = [i for i in mytuple if i%7==0]
print sum(z)
Might be hard to get to 15 digits though. 15! is 1.3 trillion...if you could process 10 million permutations per second, it would still take 36 hours.
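On the asker's EDIT (counting instead of summing): with a generator expression, the accumulator just changes from the value itself to 1. A Python 3 sketch of jez's pipeline adapted to count the multiples of 7:

```python
from itertools import permutations

digits = range(8)
oom = [10 ** i for i in range(len(digits) - 1, -1, -1)]  # place values

# sum(1 for ...) counts matching permutations one at a time,
# without ever materializing the full set in memory
count = sum(
    1
    for perm in permutations(digits)
    if perm[0] and sum(a * b for a, b in zip(perm, oom)) % 7 == 0
)
print(count)
```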

Generating a set random list of integers based on a distribution

I hope I can explain this well, if I don't I'll try again.
I want to generate an array of 5 random numbers that all add up to 10, where each number is chosen on the interval [0, 2n/m].
I'm using numpy.
The code I have so far looks like this:
import numpy as np
n=10
m=5
#interval that numbers are generated on
randNumbers = np.random.uniform(0, np.divide(np.multiply(2.0, n), m), m)
#Here I normalize the random numbers
normNumbers = np.divide(randNumbers, np.sum(randNumbers))
#Next I multiply the normalized numbers by n
newList = np.multiply(normNumbers, n)
#Round the numbers to whole numbers
finalList = np.around(newList)
This works for the most part; however, the rounding is off: the result will add up to 9 or 11 as opposed to 10. Is there a way to do what I'm trying to do without worrying about rounding errors, or maybe a way to work around them? If you would like me to be more clear I can, because I have trouble explaining what I'm trying to do when talking :).
This generates all the possible combinations that sum to 10 and selects a random one
from itertools import product
from random import choice
n=10
m=5
finalList = choice([x for x in product(*[range(2*n//m + 1)]*m) if sum(x) == 10])
There may be a more efficient way, but this will select fairly between the outcomes
Let's see how this works when n=10 and m=5
2*n/m+1 = 5, so the expression becomes
finalList = choice([x for x in product(*[range(5)]*5) if sum(x) == 10])
*[range(5)]*5 is using argument unpacking. This is equivalent to
finalList = choice([x for x in product(range(5),range(5),range(5),range(5),range(5)) if sum(x) == 10])
product() gives the cartesian product of the parameters, which in this case has 5**5 elements, but we then filter out the ones that don't add to 10, which leaves a list of 381 values
choice() is used to select a random value from the resultant list
Just generate four of the numbers using the technique above, then subtract the sum of the four from 10 to pick the last number.
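One way to make that last suggestion concrete, while keeping every entry inside [0, 2n/m], is rejection sampling: draw the first m-1 values, derive the last from the target sum, and retry if it falls outside the interval. A sketch (integer-valued, as the rounding in the question implies; the function name is made up):

```python
import random

n, m = 10, 5
upper = 2 * n // m  # each entry must lie in [0, 2n/m] = [0, 4]

def random_sum_partition(total=n, parts=m, upper=upper):
    # draw parts-1 values, derive the last one, and retry until
    # the derived value also falls inside the allowed interval
    while True:
        head = [random.randint(0, upper) for _ in range(parts - 1)]
        last = total - sum(head)
        if 0 <= last <= upper:
            return head + [last]

finalList = random_sum_partition()
print(finalList)  # five integers in [0, 4] that sum to exactly 10
```

Unlike rounding a normalized sample, the derived last element makes the sum exact by construction, so there is no off-by-one drift to correct.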
