How do I get the number of possible combinations, given the number of characters used to generate them and a range of lengths?
To count all permutations I would use:
chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
minLen = 1
maxLen = 3
total = 0
for i in range(minLen, maxLen+1):
    total += len(chars)**i
How would I do this for combinations, when repetition is not allowed?
I'm sure there's a mathematical formula to do this but I couldn't find it anywhere.
Thanks!
EDIT:
I realised that the code might not be readable, so here's an explanation:
It's pretty obvious what the variables are: minimum combination length, maximum, used characters...
The for loop goes from 1 to 3 and each time it adds this to the total: length of characters (len(chars)) to the power of i (the current iteration's length).
This is a basic way of calculating permutations.
So, you actually need a way to compute the binomial coefficient, right?
import math
def binomial_coefficient(n: int, k: int) -> int:
    # n! / (k! * (n - k)!); integer division keeps the result an exact int
    n_fac = math.factorial(n)
    k_fac = math.factorial(k)
    n_minus_k_fac = math.factorial(n - k)
    return n_fac // (k_fac * n_minus_k_fac)
This might not be the most efficient implementation, but it works :)
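To tie this back to the question, here is a minimal sketch that sums the binomial coefficients over the requested range of lengths. It assumes Python 3.8+ for math.comb, and the helper name count_combinations is made up for illustration.

```python
import math

def count_combinations(n_chars, min_len, max_len):
    """Count selections of min_len..max_len distinct characters, ignoring order."""
    return sum(math.comb(n_chars, k) for k in range(min_len, max_len + 1))

chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
# 52 + 1326 + 22100 = 23478
print(count_combinations(len(chars), 1, 3))
```

Unlike the permutation count in the question, each term here is "n choose k" rather than n**k, so order is ignored and no character is reused.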
I believe that you are looking for scipy.special.comb (older SciPy releases exposed it as scipy.misc.comb), which gives you the number of unique combinations, i.e. the binomial coefficient of the form "from N take X":
>>> from scipy.special import comb
>>> comb(10, 1) # Num of unique combinations of *from 10 take 1*
10.0
>>> comb(10, 2) # Num of unique combinations of *from 10 take 2*
45.0
And so on. You can then add them up with a reduce:
>>> from functools import reduce  # only needed on Python 3; a builtin on Python 2
>>> total_len = 10
>>> min_size = 1
>>> max_size = 2
>>> reduce(lambda acc, x: acc + comb(total_len, x), range(min_size, max_size+1), 0)
55.0
The above reduce is the one-line functional equivalent of your for loop:
>>> total = 0
>>> for x in range(min_size, max_size+1):
...     total += comb(total_len, x)
As a side note, if you use comb(total_len, x, exact=True) you will get the result as an integer rather than a float.
I tried a lot of things and still don't know why it is so slow. How do I fix it?
It is a CodeWars 6 kyu task:
Given a set of elements (integers or string characters, characters only in RISC-V), where any element may occur more than once, return the number of subsets that do not contain a repeated element.
import itertools
def est_subsets(a):
    counter = 0
    a = list(set(a))
    p = itertools.chain.from_iterable(itertools.combinations(a, r) for r in range(1, len(a) + 1))
    for b in p:
        counter += 1
    return counter
itertools.combinations has to generate every combination. But you can compute how many values it would generate directly, without generating them at all. Use math.comb (added in Python 3.8) with the number of distinct elements in your input and each subset size r, and you'll get the same result in a tiny fraction of the time.
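A minimal sketch of that idea, assuming Python 3.8+ for math.comb. The name est_subsets_fast is made up here to distinguish it from the function in the question.

```python
import math

def est_subsets_fast(a):
    """Count the non-empty subsets of the distinct elements of a."""
    n = len(set(a))
    # One term per subset size r, exactly what the original loop enumerated.
    return sum(math.comb(n, r) for r in range(1, n + 1))

print(est_subsets_fast([1, 2, 3, 4]))               # 15
print(est_subsets_fast(['a', 'b', 'c', 'd', 'd']))  # 15
```

The sum of all binomial coefficients for n items is 2**n, so the same count can also be written as 2**n - 1 (every subset except the empty one).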
Please take a look at the manual:
https://docs.python.org/3/library/itertools.html#itertools.combinations
The number of items returned is n! / r! / (n-r)! when 0 <= r <= n or zero when r > n.
This means that you can calculate the number of items it would return.
I am trying to solve this math problem in python, and I'm not sure what it is called:
The answer X is always 100
Given a list of 5 integers, their sum would equal X
Each integer has to be between 1 and 25
The integers can appear one or more times in the list
I want to find all the possible unique lists of 5 integers that match.
These would match:
20,20,20,20,20
25,25,25,20,5
10,25,19,21,25
along with many more.
I looked at itertools.permutations, but I don't think that handles duplicate integers in the list. I'm thinking there must be a standard math algorithm for this, but my search queries must be poor.
Only other thing to mention is if it matters that the list size could change from 10 integers to some other length (6, 24, etc).
This is a constraint satisfaction problem. These can often be solved by a method called dynamic programming: you fix one part of the solution and then solve the remaining subproblem. In Python, we can implement this approach with a recursive function:
def csp_solutions(target_sum, n, i_min=1, i_max=25):
    domain = range(i_min, i_max + 1)
    if n == 1:
        if target_sum in domain:
            return [[target_sum]]
        else:
            return []
    solutions = []
    for i in domain:
        # Check if a solution is still possible when i is picked:
        if (n - 1) * i_min <= target_sum - i <= (n - 1) * i_max:
            # Construct solutions recursively, keeping the same bounds:
            solutions.extend([[i] + sol
                              for sol in csp_solutions(target_sum - i, n - 1, i_min, i_max)])
    return solutions
all_solutions = csp_solutions(100, 5)
This yields 23746 solutions, in agreement with the answer by Alex Reynolds.
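If only the number of solutions is needed, the same recursion can count instead of building lists. The following is a sketch under that assumption; count_solutions is a hypothetical helper, and functools.lru_cache is used to memoize repeated subproblems.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_solutions(target_sum, n, i_min=1, i_max=25):
    """Count ordered lists of n integers in [i_min, i_max] that sum to target_sum."""
    if n == 0:
        return 1 if target_sum == 0 else 0
    # Prune: the remaining n values cannot reach target_sum at all.
    if not n * i_min <= target_sum <= n * i_max:
        return 0
    return sum(count_solutions(target_sum - i, n - 1, i_min, i_max)
               for i in range(i_min, i_max + 1))

print(count_solutions(100, 5))  # 23746
```

Because each (target_sum, n) pair is computed once, this stays fast for longer lists where enumerating every solution would not.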
Another approach with Numpy:
#!/usr/bin/env python
import numpy as np
start = 1
end = 25
entries = 5
total = 100
a = np.arange(start, end + 1)
c = np.array(np.meshgrid(a, a, a, a, a)).T.reshape(-1, entries)
assert(len(c) == pow(end, entries))
s = c.sum(axis=1)
#
# filter all combinations for those that meet sum criterion
#
valid_combinations = c[np.where(s == total)]
print(len(valid_combinations)) # 23746
#
# collapse reorderings: keep one sorted tuple per distinct multiset of values
#
unique_permutations = set(tuple(sorted(x)) for x in valid_combinations)
print(len(unique_permutations)) # 376
You want combinations_with_replacement from the itertools library. Here is what the code would look like:
from itertools import combinations_with_replacement
values = list(range(1, 26))
candidates = []
for tuple5 in combinations_with_replacement(values, 5):
    if sum(tuple5) == 100:
        candidates.append(tuple5)
On this problem I get 376 candidates. As mentioned in the comments above, if each arrangement of a 5-tuple should be counted separately, then you'd want to look at all permutations of each candidate, which may not all be distinct. For example, (20,20,20,20,20) is the same regardless of how you arrange the indices. However, (21,20,20,20,19) is not: it has several distinct arrangements.
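To make the link between the two counts concrete, here is a small sketch that counts the distinct arrangements of each candidate with the multinomial formula (5! divided by the factorial of each value's multiplicity). The helper name arrangements is an assumption for illustration.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial

def arrangements(candidate):
    """Number of distinct orderings of a tuple that may contain repeats."""
    result = factorial(len(candidate))
    for multiplicity in Counter(candidate).values():
        result //= factorial(multiplicity)
    return result

candidates = [t for t in combinations_with_replacement(range(1, 26), 5)
              if sum(t) == 100]
total_ordered = sum(arrangements(t) for t in candidates)

print(arrangements((20, 20, 20, 20, 20)))  # 1
print(arrangements((21, 20, 20, 20, 19)))  # 20
print(len(candidates), total_ordered)      # 376 23746
```

Summing the arrangements over all 376 unordered candidates gives back the 23746 ordered lists.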
I think that this could be what you are searching for: given a target number SUM, a left threshold L, a right threshold R and a size K, find all the possible lists of K elements between L and R whose sum is SUM. As far as I was able to find, there isn't a specific name for this problem, though.
This is the question:
Write Python code to find all the integers less than 50,000 that are equal to
the sum of the factorials of their digits. As an example: 7666 ≠
7! + 6! + 6! + 6!, but 145 = 1! + 4! + 5!.
Note: I'm not allowed to use any built-in factorial function.
My solution:
import math
from numpy import *
for i in range(5):
    for j in range(10):
        for k in range(10):
            for l in range(10):
                for m in range(10):
                    x=1*m+10*l+100*k+1000*j+10000*i
                    def fact(m):
                        fact=1
                        for i in range(1,m+1):
                            fact=fact*i
                        return fact
                    y=fact(i)+fact(j)+fact(k)+fact(l)+fact(m)
                    if x==y :
                        print(x)
Hint 1
The reason this does not give you the correct answer is that your code sometimes counts leading zeros as digits.
For example
fact(0)+fact(0)+fact(1)+fact(4)+fact(5) gives 147
because fact(0) is 1.
Hint 2
While your manner of iterating is interesting and somewhat correct, it is the source of your bug.
Try iterating normally from 1 to 50000, then work out the sum of the factorials of the digits a different way.
for i in range(1, 50000):
    # ...
Solution
Since this is StackOverflow, I offer a solution straightaway.
Use a function like this to find the sum of the factorials of the digits of a number:
def fact(m):
    fact=1
    for i in range(1,m+1):
        fact=fact*i
    return fact
def sumOfFactOfDigits(x):
    # This function only works on integers
    assert type(x) == int
    total = 0
    # Repeat until x is 0
    while x:
        # Add the factorial of the last digit to total
        total += fact(x%10)
        # Remove last digit (always ends at 0, ie 123 -> 12 -> 1 -> 0)
        x //= 10
    return total
# Start at 1: for 0 the loop above never runs, so 0 would wrongly match itself
for i in range(1, 50000):
    if i == sumOfFactOfDigits(i):
        print(i)
Note
You should move your definition of fact outside of the loop.
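As a variation on the solution above, the ten digit factorials can be computed once and looked up afterwards. This is only a sketch; the names build_digit_factorials, DIGIT_FACTORIALS and sum_of_digit_factorials are made up, and it still avoids any library factorial function.

```python
def build_digit_factorials():
    """Return [0!, 1!, ..., 9!] using only multiplication."""
    table = [1]  # 0! = 1
    for d in range(1, 10):
        table.append(table[-1] * d)
    return table

DIGIT_FACTORIALS = build_digit_factorials()

def sum_of_digit_factorials(x):
    # str(x) never has leading zeros, so only real digits are counted.
    return sum(DIGIT_FACTORIALS[int(ch)] for ch in str(x))

matches = [i for i in range(1, 50000) if i == sum_of_digit_factorials(i)]
print(matches)  # [1, 2, 145, 40585]
```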
I want to generate a random number from the range [a,b] that is divisible by N (4 in my case).
I have the solution, but is there a better (more elegant) way to do it?
result = random.randint(a, b)
result = math.ceil(result / 4) * 4
The solutions from here:
Python: Generate random number between x and y which is a multiple of 5
don't answer my question, since I'd have to implement something like:
random.randint(a, b) * 4
I'd have to divide the original range by 4, and that's less readable than my original solution.
A generic solution and an example:
import random
def divisible_random(a, b, n):
    # b - a >= n guarantees a multiple of n in [a, b], so the loop terminates.
    # The check is conservative: it also rejects some short ranges that do contain one.
    if b - a < n:
        raise ValueError('{} is too big'.format(n))
    result = random.randint(a, b)
    while result % n != 0:
        result = random.randint(a, b)
    return result
# get a random int in the range 2 - 12 that is divisible by 3
print(divisible_random(2, 12, 3))
The first thing that comes to mind is building the sequence of all valid choices with range over the given interval, then picking one value at random with random.choice.
So, in this case, for given a and b (using b + 1 as the stop so that b itself can be chosen, since range excludes its stop value),
random.choice(range(a + 4 - (a%4), b + 1, 4))
If a is an exact multiple of 4, then
random.choice(range(a, b + 1, 4))
would give you the required random number.
So, in a single generic function (as suggested in the comments):
def get_num(a, b, x):
    if not a % x:
        return random.choice(range(a, b + 1, x))
    else:
        return random.choice(range(a + x - (a%x), b + 1, x))
where x is the number whose multiples are required.
As the others have pointed out, your solution might produce out of range results, e.g. math.ceil(15 / 4) * 4 == 16. Also, be aware that the produced distribution might be very far from uniform. For example, if a == 0 and b == 4, the generated number will be 4 in 80% of the cases.
Aside from that, it seems good to me, but in Python you can also just use the integer division operator (actually floor division, so it's not equivalent to your example):
result = random.randint(a, b)
result = result // 4 * 4
But a more general albeit less efficient method of generating uniform random numbers with specific constraints (while also keeping the uniform distribution) is generating them in a loop until you find a good one:
result = 1
while result % 4 != 0:
    result = random.randint(a, b)
Use random.randrange with a step size of n, using a+n-(a%n) as the start if a is not divisible by n, else a as the start. The stop is b+1 so that b itself can be returned:
import random
def rand_n(a, b, n):
    # If n is bigger than the range, return -1
    if n > b-a:
        return -1
    # If a is divisible by n, use a as the start, with n as the step size
    if a%n == 0:
        return random.randrange(a, b+1, n)
    # If a is not divisible by n, use a+n-(a%n) as the start, with n as the step size
    else:
        return random.randrange(a+n-(a%n), b+1, n)
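For comparison, here is a sketch of the "divide the range by n" idea the question mentions, written so that the result stays uniform and inside [a, b]. The function name random_multiple is hypothetical, and the sketch assumes n is a positive integer.

```python
import random

def random_multiple(a, b, n):
    """Return a uniformly chosen multiple of n in the inclusive range [a, b]."""
    if n <= 0:
        raise ValueError("n must be positive")
    lo = -(-a // n)  # ceil(a / n) using integer arithmetic only
    hi = b // n      # floor(b / n)
    if lo > hi:
        raise ValueError("no multiple of {} in [{}, {}]".format(n, a, b))
    # Every k in [lo, hi] is equally likely, so every multiple n * k is too.
    return n * random.randint(lo, hi)

print(random_multiple(2, 12, 3))  # one of 3, 6, 9, 12
```

Because it draws the multiplier directly, it needs no retry loop and cannot produce a value outside the range, unlike rounding a random number up or down.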
The task is to search every power of two below 2^10000, returning the index of the first power in which a string is contained. For example if the given string to search for is "7" the program will output 15, as 2^15 is the first power to contain 7 in it.
I have approached this with a brute force attempt which times out on ~70% of test cases.
for i in range(1,9999):
    if search in str(2**i):
        print i
        break
How would one approach this with a time limit of 5 seconds?
Try not to compute 2^i from scratch at each step.
pow = 2  # 2**1, matching the first value of i
for i in xrange(1,9999):
    if search in str(pow):
        print i
        break
    pow *= 2
You can compute it as you go along. This should save a lot of computation time.
Using xrange will prevent a list from being built, but that will probably not make much of a difference here.
in is probably implemented as a quadratic string search algorithm. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
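For readers on Python 3, the incremental doubling above might be sketched as a function like the following. The name first_power_containing and the choice to start from 2**0 are assumptions made here, not part of the original answer.

```python
def first_power_containing(search, limit=10000):
    """Return the smallest exponent e < limit such that search occurs in str(2**e)."""
    power = 1  # 2**0
    for exponent in range(limit):
        if search in str(power):
            return exponent
        power *= 2  # reuse the previous power instead of recomputing 2**exponent
    return None

print(first_power_containing("7"))  # 15, since 2**15 == 32768
```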
A faster approach could be computing the numbers directly in decimal:
def double(x):
    # x is a list of base-10**8 "digits", least significant first; doubled in place
    carry = 0
    for i, v in enumerate(x):
        d = v*2 + carry
        if d > 99999999:
            x[i] = d - 100000000
            carry = 1
        else:
            x[i] = d
            carry = 0
    if carry:
        x.append(carry)
Then the search function can become
def p2find(s):
    x = [1]
    for i in xrange(10000):
        # Leading chunk as-is, remaining chunks zero-padded to 8 digits
        if s in str(x[-1])+"".join(("00000000"+str(v))[-8:]
                                   for v in x[::-1][1:]):
            return i
        double(x)
    return None
Note also that the digits of all powers of two up to 2^10000 add up to only about 15 million characters, and searching static data is much faster. If the program does not have to be restarted each time, then:
def p2find(s, digits = []):
    # The mutable default argument is used deliberately as a cache across calls
    if len(digits) == 0:
        # This precomputation happens only ONCE
        p = 1
        for k in xrange(10000):
            digits.append(str(p))
            p *= 2
    for i, v in enumerate(digits):
        if s in v: return i
    return None
With this approach the first check will take some time; subsequent ones will be very fast.
Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.
I don't think you can beat this for computational complexity.
There are only 10000 numbers. You don't need any complex algorithms. Simply calculate them in advance and then search. This should take merely 1 or 2 seconds.
powers_of_2 = [str(1<<i) for i in range(10000)]
def search(s):
    for i in range(len(powers_of_2)):
        if s in powers_of_2[i]:
            return i
Try this
twos = []
twoslen = []
two = 1
for i in xrange(10000):
    twos.append(two)
    twoslen.append(len(str(two)))
    two *= 2
tens = []
ten = 1
for i in xrange(len(str(two))):
    tens.append(ten)
    ten *= 10
s = raw_input()
l = len(s)
n = int(s)
for i in xrange(len(twos)):
    for j in xrange(twoslen[i]):
        k = twos[i] / tens[j]
        if k < n: continue
        if (k - n) % tens[l] == 0:
            print i
            exit()
The idea is to precompute every power of 2 and of 10, and also the number of digits of every power of 2. In this way the problem reduces to finding the minimum i for which there exists a j such that, after removing the last j digits from 2 ** i, you obtain a number which ends with n. Expressed as a formula (with integer division): (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
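The digit-stripping test in that formula can be illustrated on a single number. The sketch below is written for Python 3 (so it uses // for integer division), and contains_digits is a hypothetical helper. It also requires the truncated number to have at least len(s) digits, an extra guard assumed here so that search strings with leading zeros behave like a plain substring test.

```python
def contains_digits(value, s):
    """Arithmetic version of `s in str(value)` for a positive integer value."""
    n = int(s)
    modulus = 10 ** len(s)
    # Keep going while value still has at least len(s) digits.
    while value >= modulus // 10:
        if value % modulus == n:  # the last len(s) digits spell out s
            return True
        value //= 10              # drop the last digit (j grows by one)
    return False

# 2**15 == 32768: dropping the last 2 digits leaves 327, which ends in 27.
print(contains_digits(2 ** 15, "27"))  # True
print(contains_digits(2 ** 15, "9"))   # False
```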
A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as @6502 did in his answer.
But it's very much faster to let Python's decimal module do it - at least under Python 3.3.2 (I don't know how much C acceleration is built in to Python decimal versions before that). Here's code:
class S:
    def __init__(self):
        import decimal
        decimal.getcontext().prec = 4000 # way more than enough for 2**10000
        p2 = decimal.Decimal(1)
        full = []
        for i in range(10000):
            s = "%s<%s>" % (p2, i)
            ##assert s == "%s<%s>" % (str(2**i), i)
            full.append(s)
            p2 *= 2
        self.full = "".join(full)
    def find(self, s):
        import re
        # Raw string so that \d is passed to the regex engine unchanged
        pat = s + r"[^<>]*<(\d+)>"
        m = re.search(pat, self.full)
        if m:
            return int(m.group(1))
        else:
            print(s, "not found!")
and sample usage:
>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!
s.full is a string with a bit over 15 million characters. It looks like this:
>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>
So the string contains each power of 2, followed by its exponent enclosed in angle brackets. The find() method constructs a regular expression that searches for the desired substring, then scans ahead to the next angle brackets to find the exponent.
Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.