How to verify that a shuffling algorithm is uniform? - python

I have a simple Python implementation of Knuth's shuffling algorithm:
def knuth_shuffle(ar):
num = len(ar)
for i in range(num):
index = random.randint(0, i)
ar[i], ar[index] = ar[index], ar[i]
return ar
How is it possible to test (using scipy or whatever other package) that the shuffling is indeed uniform? I have found a couple of related posts (1, 2) but they don't answer my question. It would be great to understand how to perform such checks in general.

EDIT:
As Paul Hankin in the comments, my original test only checked that the probability of each element falling into each position, but not that all permutations are equally likely, which is a stronger requirement. The snippet below counts the frequency of each permutation, which is what we should be looking at:
import math
import random
def knuth_shuffle(ar):
num = len(ar)
for i in range(num):
index = random.randint(0, i)
ar[i], ar[index] = ar[index], ar[i]
return ar
# This function computes a unique index for a given permutation
# Adapted from https://www.jaapsch.net/puzzles/compindx.htm#perm
def permutation_index(permutation):
n = len(permutation)
t = 0
for i in range(n):
t = t * (n - i)
for j in range(i + 1, n):
if permutation[i] > permutation[j]:
t += 1
return t
N = 6 # Test list size
T = 1000 # Trials / number of permutations
random.seed(100)
n_perm = math.factorial(N)
trials = T * n_perm
ar = list(range(N))
freq = [0] * n_perm
for _ in range(trials):
ar_shuffle = ar.copy()
knuth_shuffle(ar_shuffle)
freq[permutation_index(ar_shuffle)] += 1
If the shuffle is uniform, the values of the resulting freq vector should be distributed according to a binomial distribution with T * N! trials and probability of success 1 / (N!). Here is a plot of a distribution estimation for the previous example (done with Seaborn), where frequency values should be around 1000:
Which I think looks good but again, for a quantitative result you would need a deeper statistical analysis, such as Pearson's chi-squared test, as suggested by David Eisenstat.
ORIGINAL ANSWER:
I'm going to put here some basic ideas, but I don't have the strongest background on statistics so someone may want to complement or correct anything that is wrong.
You can make a matrix of frequencies of each value falling into each position for a number of trials:
def knuth_shuffle(ar):
num = len(ar)
for i in range(num):
index = random.randint(0, i)
ar[i], ar[index] = ar[index], ar[i]
return ar
N = 100 # Test list size
T = 10000 # Number of trials
ar = list(range(N))
freq = [[0] * N for _ in range(N)]
for _ in range(T):
ar_shuffle = ar.copy()
kunth_shuffle(ar_shuffle)
for i, j in enumerate(ar_shuffle):
freq[i][j] += 1
Once you can do that, there are several approaches you can take. A simple idea is, if the shuffle is uniform, freq / T should tend to 1 / N as T tends to infinity. So you can just use a "very big" value of T and see that those values are "close enough". Or check that the standard deviation of freq / T - 1 / N is "small enough".
These "close enough" and "small enough" though are not very solid concepts. The grounded analysis requires more statistical tools. I think you would need to test the hipothesis that each frequency value is sampled from a binomial distribution with T trials a 1 / N success probability. As I said do not have the background for a full explanation of that and this is probably not the place for it but if you really need a thorough analysis you can read up on the topic.

You can check this exactly, by injecting all possible sequences of random numbers into knuth_shuffle, and then verifying you get each permutation exactly once.
This code does that:
import collections
import itertools
import random
def knuth_shuffle(ar, R=random.randint):
num = len(ar)
for i in range(num):
index = R(0, i)
ar[i], ar[index] = ar[index], ar[i]
return ar
def fact(i):
r = 1
while i > 1:
r *= i
i -= 1
return r
def all_random_seqs(N):
for r in range(fact(N)):
seq = []
for i in range(N):
seq.append(r % (i+1))
r //= (i+1)
it = iter(seq)
yield lambda x, y: next(it)
for N in range(1, 6):
print N
results = collections.Counter()
for R in all_random_seqs(N):
a = list('ABCDEFG'[:N])
knuth_shuffle(a, R)
results[''.join(a)] += 1
print 'checking...'
if len(results) != fact(N):
print 'N=%d. Not enough results. %s' % (N, results)
if any(c > 1 for c in results.itervalues()):
print 'N=%d. Not all permutations unique. %s' % (N, results)
if any(sorted(c) != list('ABCDEFG'[:N]) for c in results.iterkeys()):
print 'N=%d. Some permutations are illegal. %s' % (N, results)
This code checks exact correctness for input lists of size 1, 2, 3, 4, 5. You can probably go a little further before N! gets too large.
You will also want to perform sanity checks on the version of the code using random.randint (for example generating 500 shuffles of 'ABCD', and making sure you get each permutation at least once).

If you randomly shuffle the same items from a given fixed order, then the count of each item in one fixed position in the shuffled items should tend to the same value.
Below I shuffle the list 0..9 a few times and print the output:
from random import shuffle # Uses Fischer-Yates
tries = 1_000_000
intcount = 10
first_position_counts = {n:0 for n in ints}
ints = range(intcount)
for _ in range(tries):
lst = list(ints) # [0, 1, ...9] In that order
shuffle(lst)
first_position_counts[lst[0]] += 1
print(f'{tries} shuffles of the ints 0..{intcount-1} should have each int \n',
'appear in the first position {tries/intcount} times.')
for item in first_position_counts.items():
print(' %i: %5i' % item)
Run once you might get something like:
0: 99947
1: 100522
2: 99828
3: 100123
4: 99582
5: 99635
6: 99991
7: 100108
8: 100172
9: 100092
And again:
0: 100049
1: 99918
2: 100053
3: 100285
4: 100293
5: 100034
6: 99861
7: 99584
8: 100055
9: 99868
Now if you have thousands of items to shuffle, then they should end up in one of n! permutations, but n! grows large, quckly; and if it is "comparable", certainly grater than the possible range of your random number generator then it breaks down.

Related

finding the smallest prime factors of a number using python [duplicate]

Two part question:
Trying to determine the largest prime factor of 600851475143, I found this program online that seems to work. The problem is, I'm having a hard time figuring out how it works exactly, though I understand the basics of what the program is doing. Also, I'd like if you could shed some light on any method you may know of finding prime factors, perhaps without testing every number, and how your method works.
Here's the code that I found online for prime factorization [NOTE: This code is incorrect. See Stefan's answer below for better code.]:
n = 600851475143
i = 2
while i * i < n:
while n % i == 0:
n = n / i
i = i + 1
print(n)
#takes about ~0.01secs
Why is that code so much faster than this code, which is just to test the speed and has no real purpose other than that?
i = 1
while i < 100:
i += 1
#takes about ~3secs
This question was the first link that popped up when I googled "python prime factorization".
As pointed out by #quangpn88, this algorithm is wrong (!) for perfect squares such as n = 4, 9, 16, ... However, #quangpn88's fix does not work either, since it will yield incorrect results if the largest prime factor occurs 3 or more times, e.g., n = 2*2*2 = 8 or n = 2*3*3*3 = 54.
I believe a correct, brute-force algorithm in Python is:
def largest_prime_factor(n):
i = 2
while i * i <= n:
if n % i:
i += 1
else:
n //= i
return n
Don't use this in performance code, but it's OK for quick tests with moderately large numbers:
In [1]: %timeit largest_prime_factor(600851475143)
1000 loops, best of 3: 388 µs per loop
If the complete prime factorization is sought, this is the brute-force algorithm:
def prime_factors(n):
i = 2
factors = []
while i * i <= n:
if n % i:
i += 1
else:
n //= i
factors.append(i)
if n > 1:
factors.append(n)
return factors
Ok. So you said you understand the basics, but you're not sure EXACTLY how it works. First of all, this is a great answer to the Project Euler question it stems from. I've done a lot of research into this problem and this is by far the simplest response.
For the purpose of explanation, I'll let n = 20. To run the real Project Euler problem, let n = 600851475143.
n = 20
i = 2
while i * i < n:
while n%i == 0:
n = n / i
i = i + 1
print (n)
This explanation uses two while loops. The biggest thing to remember about while loops is that they run until they are no longer true.
The outer loop states that while i * i isn't greater than n (because the largest prime factor will never be larger than the square root of n), add 1 to i after the inner loop runs.
The inner loop states that while i divides evenly into n, replace n with n divided by i. This loop runs continuously until it is no longer true. For n=20 and i=2, n is replaced by 10, then again by 5. Because 2 doesn't evenly divide into 5, the loop stops with n=5 and the outer loop finishes, producing i+1=3.
Finally, because 3 squared is greater than 5, the outer loop is no longer true and prints the result of n.
Thanks for posting this. I looked at the code forever before realizing how exactly it worked. Hopefully, this is what you're looking for in a response. If not, let me know and I can explain further.
It looks like people are doing the Project Euler thing where you code the solution yourself. For everyone else who wants to get work done, there's the primefac module which does very large numbers very quickly:
#!python
import primefac
import sys
n = int( sys.argv[1] )
factors = list( primefac.primefac(n) )
print '\n'.join(map(str, factors))
For prime number generation I always use the Sieve of Eratosthenes:
def primes(n):
if n<=2:
return []
sieve=[True]*(n+1)
for x in range(3,int(n**0.5)+1,2):
for y in range(3,(n//x)+1,2):
sieve[(x*y)]=False
return [2]+[i for i in range(3,n,2) if sieve[i]]
In [42]: %timeit primes(10**5)
10 loops, best of 3: 60.4 ms per loop
In [43]: %timeit primes(10**6)
1 loops, best of 3: 1.01 s per loop
You can use Miller-Rabin primality test to check whether a number is prime or not. You can find its Python implementations here.
Always use timeit module to time your code, the 2nd one takes just 15us:
def func():
n = 600851475143
i = 2
while i * i < n:
while n % i == 0:
n = n / i
i = i + 1
In [19]: %timeit func()
1000 loops, best of 3: 1.35 ms per loop
def func():
i=1
while i<100:i+=1
....:
In [21]: %timeit func()
10000 loops, best of 3: 15.3 us per loop
If you are looking for pre-written code that is well maintained, use the function sympy.ntheory.primefactors from SymPy.
It returns a sorted list of prime factors of n.
>>> from sympy.ntheory import primefactors
>>> primefactors(6008)
[2, 751]
Pass the list to max() to get the biggest prime factor: max(primefactors(6008))
In case you want the prime factors of n and also the multiplicities of each of them, use sympy.ntheory.factorint.
Given a positive integer n, factorint(n) returns a dict containing the
prime factors of n as keys and their respective multiplicities as
values.
>>> from sympy.ntheory import factorint
>>> factorint(6008) # 6008 = (2**3) * (751**1)
{2: 3, 751: 1}
The code is tested against Python 3.6.9 and SymPy 1.1.1.
"""
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
"""
from sympy import primefactors
print(primefactors(600851475143)[-1])
def find_prime_facs(n):
list_of_factors=[]
i=2
while n>1:
if n%i==0:
list_of_factors.append(i)
n=n/i
i=i-1
i+=1
return list_of_factors
Isn't largest prime factor of 27 is 3 ??
The above code might be fastest,but it fails on 27 right ?
27 = 3*3*3
The above code returns 1
As far as I know.....1 is neither prime nor composite
I think, this is the better code
def prime_factors(n):
factors=[]
d=2
while(d*d<=n):
while(n>1):
while n%d==0:
factors.append(d)
n=n/d
d+=1
return factors[-1]
Another way of doing this:
import sys
n = int(sys.argv[1])
result = []
for i in xrange(2,n):
while n % i == 0:
#print i,"|",n
n = n/i
result.append(i)
if n == 1:
break
if n > 1: result.append(n)
print result
sample output :
python test.py 68
[2, 2, 17]
The code is wrong with 100. It should check case i * i = n:
I think it should be:
while i * i <= n:
if i * i = n:
n = i
break
while n%i == 0:
n = n / i
i = i + 1
print (n)
My code:
# METHOD: PRIME FACTORS
def prime_factors(n):
'''PRIME FACTORS: generates a list of prime factors for the number given
RETURNS: number(being factored), list(prime factors), count(how many loops to find factors, for optimization)
'''
num = n #number at the end
count = 0 #optimization (to count iterations)
index = 0 #index (to test)
t = [2, 3, 5, 7] #list (to test)
f = [] #prime factors list
while t[index] ** 2 <= n:
count += 1 #increment (how many loops to find factors)
if len(t) == (index + 1):
t.append(t[-2] + 6) #extend test list (as much as needed) [2, 3, 5, 7, 11, 13...]
if n % t[index]: #if 0 does else (otherwise increments, or try next t[index])
index += 1 #increment index
else:
n = n // t[index] #drop max number we are testing... (this should drastically shorten the loops)
f.append(t[index]) #append factor to list
if n > 1:
f.append(n) #add last factor...
return num, f, f'count optimization: {count}'
Which I compared to the code with the most votes, which was very fast
def prime_factors2(n):
i = 2
factors = []
count = 0 #added to test optimization
while i * i <= n:
count += 1 #added to test optimization
if n % i:
i += 1
else:
n //= i
factors.append(i)
if n > 1:
factors.append(n)
return factors, f'count: {count}' #print with (count added)
TESTING, (note, I added a COUNT in each loop to test the optimization)
# >>> prime_factors2(600851475143)
# ([71, 839, 1471, 6857], 'count: 1472')
# >>> prime_factors(600851475143)
# (600851475143, [71, 839, 1471, 6857], 'count optimization: 494')
I figure this code could be modified easily to get the (largest factor) or whatever else is needed. I'm open to any questions, my goal is to improve this much more as well for larger primes and factors.
In case you want to use numpy here's a way to create an array of all primes not greater than n:
[ i for i in np.arange(2,n+1) if 0 not in np.array([i] * (i-2) ) % np.arange(2,i)]
Check this out, it might help you a bit in your understanding.
#program to find the prime factors of a given number
import sympy as smp
try:
number = int(input('Enter a number : '))
except(ValueError) :
print('Please enter an integer !')
num = number
prime_factors = []
if smp.isprime(number) :
prime_factors.append(number)
else :
for i in range(2, int(number/2) + 1) :
"""while figuring out prime factors of a given number, n
keep in mind that a number can itself be prime or if not,
then all its prime factors will be less than or equal to its int(n/2 + 1)"""
if smp.isprime(i) and number % i == 0 :
while(number % i == 0) :
prime_factors.append(i)
number = number / i
print('prime factors of ' + str(num) + ' - ')
for i in prime_factors :
print(i, end = ' ')
This is my python code:
it has a fast check for primes and checks from highest to lowest the prime factors.
You have to stop if no new numbers came out. (Any ideas on this?)
import math
def is_prime_v3(n):
""" Return 'true' if n is a prime number, 'False' otherwise """
if n == 1:
return False
if n > 2 and n % 2 == 0:
return False
max_divisor = math.floor(math.sqrt(n))
for d in range(3, 1 + max_divisor, 2):
if n % d == 0:
return False
return True
number = <Number>
for i in range(1,math.floor(number/2)):
if is_prime_v3(i):
if number % i == 0:
print("Found: {} with factor {}".format(number / i, i))
The answer for the initial question arrives in a fraction of a second.
Below are two ways to generate prime factors of given number efficiently:
from math import sqrt
def prime_factors(num):
'''
This function collectes all prime factors of given number and prints them.
'''
prime_factors_list = []
while num % 2 == 0:
prime_factors_list.append(2)
num /= 2
for i in range(3, int(sqrt(num))+1, 2):
if num % i == 0:
prime_factors_list.append(i)
num /= i
if num > 2:
prime_factors_list.append(int(num))
print(sorted(prime_factors_list))
val = int(input('Enter number:'))
prime_factors(val)
def prime_factors_generator(num):
'''
This function creates a generator for prime factors of given number and generates the factors until user asks for them.
It handles StopIteration if generator exhausted.
'''
while num % 2 == 0:
yield 2
num /= 2
for i in range(3, int(sqrt(num))+1, 2):
if num % i == 0:
yield i
num /= i
if num > 2:
yield int(num)
val = int(input('Enter number:'))
prime_gen = prime_factors_generator(val)
while True:
try:
print(next(prime_gen))
except StopIteration:
print('Generator exhausted...')
break
else:
flag = input('Do you want next prime factor ? "y" or "n":')
if flag == 'y':
continue
elif flag == 'n':
break
else:
print('Please try again and enter a correct choice i.e. either y or n')
Since nobody has been trying to hack this with old nice reduce method, I'm going to take this occupation. This method isn't flexible for problems like this because it performs loop of repeated actions over array of arguments and there's no way how to interrupt this loop by default. The door open after we have implemented our own interupted reduce for interrupted loops like this:
from functools import reduce
def inner_func(func, cond, x, y):
res = func(x, y)
if not cond(res):
raise StopIteration(x, y)
return res
def ireducewhile(func, cond, iterable):
# generates intermediary results of args while reducing
iterable = iter(iterable)
x = next(iterable)
yield x
for y in iterable:
try:
x = inner_func(func, cond, x, y)
except StopIteration:
break
yield x
After that we are able to use some func that is the same as an input of standard Python reduce method. Let this func be defined in a following way:
def division(c):
num, start = c
for i in range(start, int(num**0.5)+1):
if num % i == 0:
return (num//i, i)
return None
Assuming we want to factor a number 600851475143, an expected output of this function after repeated use of this function should be this:
(600851475143, 2) -> (8462696833 -> 71), (10086647 -> 839), (6857, 1471) -> None
The first item of tuple is a number that division method takes and tries to divide by the smallest divisor starting from second item and finishing with square root of this number. If no divisor exists, None is returned.
Now we need to start with iterator defined like this:
def gener(prime):
# returns and infinite generator (600851475143, 2), 0, 0, 0...
yield (prime, 2)
while True:
yield 0
Finally, the result of looping is:
result = list(ireducewhile(lambda x,y: div(x), lambda x: x is not None, iterable=gen(600851475143)))
#result: [(600851475143, 2), (8462696833, 71), (10086647, 839), (6857, 1471)]
And outputting prime divisors can be captured by:
if len(result) == 1: output = result[0][0]
else: output = list(map(lambda x: x[1], result[1:]))+[result[-1][0]]
#output: [2, 71, 839, 1471]
Note:
In order to make it more efficient, you might like to use pregenerated primes that lies in specific range instead of all the values of this range.
You shouldn't loop till the square root of the number! It may be right some times, but not always!
Largest prime factor of 10 is 5, which is bigger than the sqrt(10) (3.16, aprox).
Largest prime factor of 33 is 11, which is bigger than the sqrt(33) (5.5,74, aprox).
You're confusing this with the propriety which states that, if a number has a prime factor bigger than its sqrt, it has to have at least another one other prime factor smaller than its sqrt. So, with you want to test if a number is prime, you only need to test till its sqrt.
def prime(n):
for i in range(2,n):
if n%i==0:
return False
return True
def primefactors():
m=int(input('enter the number:'))
for i in range(2,m):
if (prime(i)):
if m%i==0:
print(i)
return print('end of it')
primefactors()
Another way that skips even numbers after 2 is handled:
def prime_factors(n):
factors = []
d = 2
step = 1
while d*d <= n:
while n>1:
while n%d == 0:
factors.append(d)
n = n/d
d += step
step = 2
return factors

how can i check if number entered is prime and if not find its prime factors in python? [duplicate]

Two part question:
Trying to determine the largest prime factor of 600851475143, I found this program online that seems to work. The problem is, I'm having a hard time figuring out how it works exactly, though I understand the basics of what the program is doing. Also, I'd like if you could shed some light on any method you may know of finding prime factors, perhaps without testing every number, and how your method works.
Here's the code that I found online for prime factorization [NOTE: This code is incorrect. See Stefan's answer below for better code.]:
n = 600851475143
i = 2
while i * i < n:
while n % i == 0:
n = n / i
i = i + 1
print(n)
#takes about ~0.01secs
Why is that code so much faster than this code, which is just to test the speed and has no real purpose other than that?
i = 1
while i < 100:
i += 1
#takes about ~3secs
This question was the first link that popped up when I googled "python prime factorization".
As pointed out by #quangpn88, this algorithm is wrong (!) for perfect squares such as n = 4, 9, 16, ... However, #quangpn88's fix does not work either, since it will yield incorrect results if the largest prime factor occurs 3 or more times, e.g., n = 2*2*2 = 8 or n = 2*3*3*3 = 54.
I believe a correct, brute-force algorithm in Python is:
def largest_prime_factor(n):
i = 2
while i * i <= n:
if n % i:
i += 1
else:
n //= i
return n
Don't use this in performance code, but it's OK for quick tests with moderately large numbers:
In [1]: %timeit largest_prime_factor(600851475143)
1000 loops, best of 3: 388 µs per loop
If the complete prime factorization is sought, this is the brute-force algorithm:
def prime_factors(n):
i = 2
factors = []
while i * i <= n:
if n % i:
i += 1
else:
n //= i
factors.append(i)
if n > 1:
factors.append(n)
return factors
Ok. So you said you understand the basics, but you're not sure EXACTLY how it works. First of all, this is a great answer to the Project Euler question it stems from. I've done a lot of research into this problem and this is by far the simplest response.
For the purpose of explanation, I'll let n = 20. To run the real Project Euler problem, let n = 600851475143.
n = 20
i = 2
while i * i < n:
while n%i == 0:
n = n / i
i = i + 1
print (n)
This explanation uses two while loops. The biggest thing to remember about while loops is that they run until they are no longer true.
The outer loop states that while i * i isn't greater than n (because the largest prime factor will never be larger than the square root of n), add 1 to i after the inner loop runs.
The inner loop states that while i divides evenly into n, replace n with n divided by i. This loop runs continuously until it is no longer true. For n=20 and i=2, n is replaced by 10, then again by 5. Because 2 doesn't evenly divide into 5, the loop stops with n=5 and the outer loop finishes, producing i+1=3.
Finally, because 3 squared is greater than 5, the outer loop is no longer true and prints the result of n.
Thanks for posting this. I looked at the code forever before realizing how exactly it worked. Hopefully, this is what you're looking for in a response. If not, let me know and I can explain further.
It looks like people are doing the Project Euler thing where you code the solution yourself. For everyone else who wants to get work done, there's the primefac module which does very large numbers very quickly:
#!python
import primefac
import sys
n = int( sys.argv[1] )
factors = list( primefac.primefac(n) )
print '\n'.join(map(str, factors))
For prime number generation I always use the Sieve of Eratosthenes:
def primes(n):
if n<=2:
return []
sieve=[True]*(n+1)
for x in range(3,int(n**0.5)+1,2):
for y in range(3,(n//x)+1,2):
sieve[(x*y)]=False
return [2]+[i for i in range(3,n,2) if sieve[i]]
In [42]: %timeit primes(10**5)
10 loops, best of 3: 60.4 ms per loop
In [43]: %timeit primes(10**6)
1 loops, best of 3: 1.01 s per loop
You can use Miller-Rabin primality test to check whether a number is prime or not. You can find its Python implementations here.
Always use timeit module to time your code, the 2nd one takes just 15us:
def func():
n = 600851475143
i = 2
while i * i < n:
while n % i == 0:
n = n / i
i = i + 1
In [19]: %timeit func()
1000 loops, best of 3: 1.35 ms per loop
def func():
i=1
while i<100:i+=1
....:
In [21]: %timeit func()
10000 loops, best of 3: 15.3 us per loop
If you are looking for pre-written code that is well maintained, use the function sympy.ntheory.primefactors from SymPy.
It returns a sorted list of prime factors of n.
>>> from sympy.ntheory import primefactors
>>> primefactors(6008)
[2, 751]
Pass the list to max() to get the biggest prime factor: max(primefactors(6008))
In case you want the prime factors of n and also the multiplicities of each of them, use sympy.ntheory.factorint.
Given a positive integer n, factorint(n) returns a dict containing the
prime factors of n as keys and their respective multiplicities as
values.
>>> from sympy.ntheory import factorint
>>> factorint(6008) # 6008 = (2**3) * (751**1)
{2: 3, 751: 1}
The code is tested against Python 3.6.9 and SymPy 1.1.1.
"""
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
"""
from sympy import primefactors
print(primefactors(600851475143)[-1])
def find_prime_facs(n):
list_of_factors=[]
i=2
while n>1:
if n%i==0:
list_of_factors.append(i)
n=n/i
i=i-1
i+=1
return list_of_factors
Isn't largest prime factor of 27 is 3 ??
The above code might be fastest,but it fails on 27 right ?
27 = 3*3*3
The above code returns 1
As far as I know.....1 is neither prime nor composite
I think, this is the better code
def prime_factors(n):
factors=[]
d=2
while(d*d<=n):
while(n>1):
while n%d==0:
factors.append(d)
n=n/d
d+=1
return factors[-1]
Another way of doing this:
import sys
n = int(sys.argv[1])
result = []
for i in xrange(2,n):
while n % i == 0:
#print i,"|",n
n = n/i
result.append(i)
if n == 1:
break
if n > 1: result.append(n)
print result
sample output :
python test.py 68
[2, 2, 17]
The code is wrong with 100. It should check case i * i = n:
I think it should be:
while i * i <= n:
if i * i = n:
n = i
break
while n%i == 0:
n = n / i
i = i + 1
print (n)
My code:
# METHOD: PRIME FACTORS
def prime_factors(n):
'''PRIME FACTORS: generates a list of prime factors for the number given
RETURNS: number(being factored), list(prime factors), count(how many loops to find factors, for optimization)
'''
num = n #number at the end
count = 0 #optimization (to count iterations)
index = 0 #index (to test)
t = [2, 3, 5, 7] #list (to test)
f = [] #prime factors list
while t[index] ** 2 <= n:
count += 1 #increment (how many loops to find factors)
if len(t) == (index + 1):
t.append(t[-2] + 6) #extend test list (as much as needed) [2, 3, 5, 7, 11, 13...]
if n % t[index]: #if 0 does else (otherwise increments, or try next t[index])
index += 1 #increment index
else:
n = n // t[index] #drop max number we are testing... (this should drastically shorten the loops)
f.append(t[index]) #append factor to list
if n > 1:
f.append(n) #add last factor...
return num, f, f'count optimization: {count}'
Which I compared to the code with the most votes, which was very fast
def prime_factors2(n):
i = 2
factors = []
count = 0 #added to test optimization
while i * i <= n:
count += 1 #added to test optimization
if n % i:
i += 1
else:
n //= i
factors.append(i)
if n > 1:
factors.append(n)
return factors, f'count: {count}' #print with (count added)
TESTING, (note, I added a COUNT in each loop to test the optimization)
# >>> prime_factors2(600851475143)
# ([71, 839, 1471, 6857], 'count: 1472')
# >>> prime_factors(600851475143)
# (600851475143, [71, 839, 1471, 6857], 'count optimization: 494')
I figure this code could be modified easily to get the (largest factor) or whatever else is needed. I'm open to any questions, my goal is to improve this much more as well for larger primes and factors.
In case you want to use numpy here's a way to create an array of all primes not greater than n:
[ i for i in np.arange(2,n+1) if 0 not in np.array([i] * (i-2) ) % np.arange(2,i)]
Check this out, it might help you a bit in your understanding.
#program to find the prime factors of a given number
import sympy as smp
try:
number = int(input('Enter a number : '))
except(ValueError) :
print('Please enter an integer !')
num = number
prime_factors = []
if smp.isprime(number) :
prime_factors.append(number)
else :
for i in range(2, int(number/2) + 1) :
"""while figuring out prime factors of a given number, n
keep in mind that a number can itself be prime or if not,
then all its prime factors will be less than or equal to its int(n/2 + 1)"""
if smp.isprime(i) and number % i == 0 :
while(number % i == 0) :
prime_factors.append(i)
number = number / i
print('prime factors of ' + str(num) + ' - ')
for i in prime_factors :
print(i, end = ' ')
This is my python code:
it has a fast check for primes and checks from highest to lowest the prime factors.
You have to stop if no new numbers came out. (Any ideas on this?)
import math
def is_prime_v3(n):
""" Return 'true' if n is a prime number, 'False' otherwise """
if n == 1:
return False
if n > 2 and n % 2 == 0:
return False
max_divisor = math.floor(math.sqrt(n))
for d in range(3, 1 + max_divisor, 2):
if n % d == 0:
return False
return True
number = <Number>
for i in range(1,math.floor(number/2)):
if is_prime_v3(i):
if number % i == 0:
print("Found: {} with factor {}".format(number / i, i))
The answer for the initial question arrives in a fraction of a second.
Below are two ways to generate prime factors of given number efficiently:
from math import sqrt
def prime_factors(num):
'''
This function collectes all prime factors of given number and prints them.
'''
prime_factors_list = []
while num % 2 == 0:
prime_factors_list.append(2)
num /= 2
for i in range(3, int(sqrt(num))+1, 2):
if num % i == 0:
prime_factors_list.append(i)
num /= i
if num > 2:
prime_factors_list.append(int(num))
print(sorted(prime_factors_list))
val = int(input('Enter number:'))
prime_factors(val)
def prime_factors_generator(num):
'''
This function creates a generator for prime factors of given number and generates the factors until user asks for them.
It handles StopIteration if generator exhausted.
'''
while num % 2 == 0:
yield 2
num /= 2
for i in range(3, int(sqrt(num))+1, 2):
if num % i == 0:
yield i
num /= i
if num > 2:
yield int(num)
val = int(input('Enter number:'))
prime_gen = prime_factors_generator(val)
while True:
try:
print(next(prime_gen))
except StopIteration:
print('Generator exhausted...')
break
else:
flag = input('Do you want next prime factor ? "y" or "n":')
if flag == 'y':
continue
elif flag == 'n':
break
else:
print('Please try again and enter a correct choice i.e. either y or n')
Since nobody has been trying to hack this with old nice reduce method, I'm going to take this occupation. This method isn't flexible for problems like this because it performs loop of repeated actions over array of arguments and there's no way how to interrupt this loop by default. The door open after we have implemented our own interupted reduce for interrupted loops like this:
from functools import reduce
def inner_func(func, cond, x, y):
res = func(x, y)
if not cond(res):
raise StopIteration(x, y)
return res
def ireducewhile(func, cond, iterable):
# generates intermediary results of args while reducing
iterable = iter(iterable)
x = next(iterable)
yield x
for y in iterable:
try:
x = inner_func(func, cond, x, y)
except StopIteration:
break
yield x
After that we are able to use some func that is the same as an input of standard Python reduce method. Let this func be defined in a following way:
def division(c):
num, start = c
for i in range(start, int(num**0.5)+1):
if num % i == 0:
return (num//i, i)
return None
Assuming we want to factor a number 600851475143, an expected output of this function after repeated use of this function should be this:
(600851475143, 2) -> (8462696833 -> 71), (10086647 -> 839), (6857, 1471) -> None
The first item of tuple is a number that division method takes and tries to divide by the smallest divisor starting from second item and finishing with square root of this number. If no divisor exists, None is returned.
Now we need to start with iterator defined like this:
def gener(prime):
# returns and infinite generator (600851475143, 2), 0, 0, 0...
yield (prime, 2)
while True:
yield 0
Finally, the result of looping is:
result = list(ireducewhile(lambda x,y: div(x), lambda x: x is not None, iterable=gen(600851475143)))
#result: [(600851475143, 2), (8462696833, 71), (10086647, 839), (6857, 1471)]
And outputting prime divisors can be captured by:
if len(result) == 1: output = result[0][0]
else: output = list(map(lambda x: x[1], result[1:]))+[result[-1][0]]
#output: [2, 71, 839, 1471]
Note:
In order to make it more efficient, you might like to use pregenerated primes that lies in specific range instead of all the values of this range.
You shouldn't loop till the square root of the number! It may be right some times, but not always!
Largest prime factor of 10 is 5, which is bigger than the sqrt(10) (3.16, aprox).
Largest prime factor of 33 is 11, which is bigger than the sqrt(33) (5.5,74, aprox).
You're confusing this with the propriety which states that, if a number has a prime factor bigger than its sqrt, it has to have at least another one other prime factor smaller than its sqrt. So, with you want to test if a number is prime, you only need to test till its sqrt.
def prime(n):
for i in range(2,n):
if n%i==0:
return False
return True
def primefactors():
m=int(input('enter the number:'))
for i in range(2,m):
if (prime(i)):
if m%i==0:
print(i)
return print('end of it')
primefactors()
Another way that skips even numbers after 2 is handled:
def prime_factors(n):
factors = []
d = 2
step = 1
while d*d <= n:
while n>1:
while n%d == 0:
factors.append(d)
n = n/d
d += step
step = 2
return factors

Python - Pull random numbers from a list. Populate a new list with a specified length and sum

I am trying to create a function where:
The output list is generated from random numbers from the input list
The output list is a specified length and adds to a specified sum
ex. I specify that I want a list that is 4 in length and adds up to 10. random numbers are pulled from the input list until the criteria is satisfied.
I feel like I am approaching this problem all wrong trying to use recursion. Any help will be greatly appreciated!!!
EDIT: for more context on this problem.... Its going to be a random enemy generator.
The end goal input list will be coming from a column in a CSV called XP. (I plan to use pandas module). But this CSV will have a list of enemy names in the one column, XP in another, Health in another, etc. So the end goal is to be able to specify the total number of enemies and what the sum XP should be between those enemies and have the list generate with the appropriate information. For ex. 5 enemies with a total of 200 XP between them. The result is maybe -> Apprentice Wizard(50 xp), Apprentice Wizard(50 xp), Grung(50), Xvart(25 xp), Xvart(25 xp). The output list will actually need to include all of the row information for the selected items. And it is totally fine to have duplicated in the output as seen in this example. That will actually make more sense in the narrative of the game that this is for.
The csv --> https://docs.google.com/spreadsheets/d/1PjnN00bikJfY7mO3xt4nV5Ua1yOIsh8DycGqed6hWD8/edit?usp=sharing
import random
from random import *
lis = [1,2,3,4,5,6,7,8,9,10]
output = []
def query (total, numReturns, myList, counter):
random_index = randrange(len(myList)-1)
i = myList[random_index]
h = myList[i]
# if the problem hasn't been solved yet...
if len(output) != numReturns and sum(output) != total:
print(output)
# if the length of the list is 0 (if we just started), then go ahead and add h to the output
if len(output) == 0 and sum(output) + h != total:
output.append(h)
query (total, numReturns, myList, counter)
#if the length of the output is greater than 0
if len(output) > 0:
# if the length plus 1 is less than or equal to the number numReturns
if len(output) +1 <= numReturns:
print(output)
#if the sum of list plus h is greater than the total..then h is too big. We need to try another number
if sum(output) + h > total:
# start counter
for i in myList:# try all numbers in myList...
print(output)
print ("counter is ", counter, " and i is", i)
counter += 1
print(counter)
if sum(output) + i == total:
output.append(i)
counter = 0
break
if sum(output) + i != total:
pass
if counter == len(myList):
del(output[-1]) #delete last item in list
print(output)
counter = 0 # reset the counter
else:
pass
#if the sum of list plus h is less than the total
if sum(output) + h < total:
output.append(h) # add h to the list
print(output)
query (total, numReturns, myList, counter)
if len(output) == numReturns and sum(output) == total:
print(output, 'It worked')
else:
print ("it did not work")
query(10, 4, lis, 0)
I guess that it would be better to get first all n-size combinations of given array which adds to specified number, and then randomly select one of them. Random selecting and checking if sum is equal to specified value, in pessimistic scenario, can last indefinitely.
from itertools import combinations as comb
from random import randint
x = [1,1,2,4,3,1,5,2,6]
def query(arr, total, size):
combs = [c for c in list(comb(arr, size)) if sum(c)==total]
return combs[randint(0, len(combs))]
#example 4-item array with items from x, which adds to 10
print(query(x, 10, 4))
If the numbers in your input list are consecutive numbers, then this is equivalent to the problem of choosing a uniform random output list of N integers in the range [min, max], where the output list is ordered randomly and min and max are the smallest and largest number in the input list. The Python code below shows how this can be solved. It has the following advantages:
It does not use rejection sampling.
It chooses uniformly at random from among all combinations that meet the requirements.
It's based on an algorithm by John McClane, which he posted as an answer to another question. I describe the algorithm in another answer.
import random # Or secrets
def _getSolTable(n, mn, mx, sum):
t = [[0 for i in range(sum + 1)] for j in range(n + 1)]
t[0][0] = 1
for i in range(1, n + 1):
for j in range(0, sum + 1):
jm = max(j - (mx - mn), 0)
v = 0
for k in range(jm, j + 1):
v += t[i - 1][k]
t[i][j] = v
return t
def intsInRangeWithSum(numSamples, numPerSample, mn, mx, sum):
""" Generates one or more combinations of
'numPerSample' numbers each, where each
combination's numbers sum to 'sum' and are listed
in any order, and each
number is in the interval '[mn, mx]'.
The combinations are chosen uniformly at random.
'mn', 'mx', and
'sum' may not be negative. Returns an empty
list if 'numSamples' is zero.
The algorithm is thanks to a _Stack Overflow_
answer (`questions/61393463`) by John McClane.
Raises an error if there is no solution for the given
parameters. """
adjsum = sum - numPerSample * mn
# Min, max, sum negative
if mn < 0 or mx < 0 or sum < 0:
raise ValueError
# No solution
if numPerSample * mx < sum:
raise ValueError
if numPerSample * mn > sum:
raise ValueError
if numSamples == 0:
return []
# One solution
if numPerSample * mx == sum:
return [[mx for i in range(numPerSample)] for i in range(numSamples)]
if numPerSample * mn == sum:
return [[mn for i in range(numPerSample)] for i in range(numSamples)]
samples = [None for i in range(numSamples)]
table = _getSolTable(numPerSample, mn, mx, adjsum)
for sample in range(numSamples):
s = adjsum
ret = [0 for i in range(numPerSample)]
for ib in range(numPerSample):
i = numPerSample - 1 - ib
# Or secrets.randbelow(table[i + 1][s])
v = random.randint(0, table[i + 1][s] - 1)
r = mn
v -= table[i][s]
while v >= 0:
s -= 1
r += 1
v -= table[i][s]
ret[i] = r
samples[sample] = ret
return samples
Example:
weights=intsInRangeWithSum(
# One sample
1,
# Count of numbers per sample
4,
# Range of the random numbers
1, 5,
# Sum of the numbers
10)
# Divide by 100 to get weights that sum to 1
weights=[x/20.0 for x in weights[0]]

Code is taking too much time

I wrote code to arrange numbers after taking user input. The ordering requires that the sum of adjacent numbers is prime. Up until 10 as an input code is working fine. If I go beyond that the system hangs. Please let me know the steps to optimize it
ex input 8
Answer should be: (1, 2, 3, 4, 7, 6, 5, 8)
Code as follows....
import itertools
x = raw_input("please enter a number")
range_x = range(int(x)+1)
del range_x[0]
result = list(itertools.permutations(range_x))
def prime(x):
for i in xrange(1,x,2):
if i == 1:
i = i+1
if x%i==0 and i < x :
return False
else:
return True
def is_prime(a):
for i in xrange(len(a)):
print a
if i < len(a)-1:
if prime(a[i]+a[i+1]):
pass
else:
return False
else:
return True
for i in xrange(len(result)):
if i < len(result)-1:
if is_prime(result[i]):
print 'result is:'
print result[i]
break
else:
print 'result is'
print result[i-1]
For posterity ;-), here's one more based on finding a Hamiltonian path. It's Python3 code. As written, it stops upon finding the first path, but can easily be changed to generate all paths. On my box, it finds a solution for all n in 1 through 900 inclusive in about one minute total. For n somewhat larger than 900, it exceeds the maximum recursion depth.
The prime generator (psieve()) is vast overkill for this particular problem, but I had it handy and didn't feel like writing another ;-)
The path finder (ham()) is a recursive backtracking search, using what's often (but not always) a very effective ordering heuristic: of all the vertices adjacent to the last vertex in the path so far, look first at those with the fewest remaining exits. For example, this is "the usual" heuristic applied to solving Knights Tour problems. In that context, it often finds a tour with no backtracking needed at all. Your problem appears to be a little tougher than that.
def psieve():
import itertools
yield from (2, 3, 5, 7)
D = {}
ps = psieve()
next(ps)
p = next(ps)
assert p == 3
psq = p*p
for i in itertools.count(9, 2):
if i in D: # composite
step = D.pop(i)
elif i < psq: # prime
yield i
continue
else: # composite, = p*p
assert i == psq
step = 2*p
p = next(ps)
psq = p*p
i += step
while i in D:
i += step
D[i] = step
def build_graph(n):
primes = set()
for p in psieve():
if p > 2*n:
break
else:
primes.add(p)
np1 = n+1
adj = [set() for i in range(np1)]
for i in range(1, np1):
for j in range(i+1, np1):
if i+j in primes:
adj[i].add(j)
adj[j].add(i)
return set(range(1, np1)), adj
def ham(nodes, adj):
class EarlyExit(Exception):
pass
def inner(index):
if index == n:
raise EarlyExit
avail = adj[result[index-1]] if index else nodes
for i in sorted(avail, key=lambda j: len(adj[j])):
# Remove vertex i from the graph. If this isolates
# more than 1 vertex, no path is possible.
result[index] = i
nodes.remove(i)
nisolated = 0
for j in adj[i]:
adj[j].remove(i)
if not adj[j]:
nisolated += 1
if nisolated > 1:
break
if nisolated < 2:
inner(index + 1)
nodes.add(i)
for j in adj[i]:
adj[j].add(i)
n = len(nodes)
result = [None] * n
try:
inner(0)
except EarlyExit:
return result
def solve(n):
nodes, adj = build_graph(n)
return ham(nodes, adj)
This answer is based on #Tim Peters' suggestion about Hamiltonian paths.
There are many possible solutions. To avoid excessive memory consumption for intermediate solutions, a random path can be generated. It also allows to utilize multiple CPUs easily (each cpu generates its own paths in parallel).
import multiprocessing as mp
import sys
def main():
number = int(sys.argv[1])
# directed graph, vertices: 1..number (including ends)
# there is an edge between i and j if (i+j) is prime
vertices = range(1, number+1)
G = {} # vertex -> adjacent vertices
is_prime = sieve_of_eratosthenes(2*number+1)
for i in vertices:
G[i] = []
for j in vertices:
if is_prime[i + j]:
G[i].append(j) # there is an edge from i to j in the graph
# utilize multiple cpus
q = mp.Queue()
for _ in range(mp.cpu_count()):
p = mp.Process(target=hamiltonian_random, args=[G, q])
p.daemon = True # do not survive the main process
p.start()
print(q.get())
if __name__=="__main__":
main()
where Sieve of Eratosthenes is:
def sieve_of_eratosthenes(limit):
is_prime = [True]*limit
is_prime[0] = is_prime[1] = False # zero and one are not primes
for n in range(int(limit**.5 + .5)):
if is_prime[n]:
for composite in range(n*n, limit, n):
is_prime[composite] = False
return is_prime
and:
import random
def hamiltonian_random(graph, result_queue):
"""Build random paths until Hamiltonian path is found."""
vertices = list(graph.keys())
while True:
# build random path
path = [random.choice(vertices)] # start with a random vertice
while True: # until path can be extended with a random adjacent vertex
neighbours = graph[path[-1]]
random.shuffle(neighbours)
for adjacent_vertex in neighbours:
if adjacent_vertex not in path:
path.append(adjacent_vertex)
break
else: # can't extend path
break
# check whether it is hamiltonian
if len(path) == len(vertices):
assert set(path) == set(vertices)
result_queue.put(path) # found hamiltonian path
return
Example
$ python order-adjacent-prime-sum.py 20
Output
[19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
The output is a random sequence that satisfies the conditions:
it is a permutation of the range from 1 to 20 (including)
the sum of adjacent numbers is prime
Time performance
It takes around 10 seconds on average to get result for n = 900 and extrapolating the time as exponential function, it should take around 20 seconds for n = 1000:
The image is generated using this code:
import numpy as np
figname = 'hamiltonian_random_noset-noseq-900-900'
Ns, Ts = np.loadtxt(figname+'.xy', unpack=True)
# use polyfit to fit the data
# y = c*a**n
# log y = log (c * a ** n)
# log Ts = log c + Ns * log a
coeffs = np.polyfit(Ns, np.log2(Ts), deg=1)
poly = np.poly1d(coeffs, variable='Ns')
# use curve_fit to fit the data
from scipy.optimize import curve_fit
def func(x, a, c):
return c*a**x
popt, pcov = curve_fit(func, Ns, Ts)
aa, cc = popt
a, c = 2**coeffs
# plot it
import matplotlib.pyplot as plt
plt.figure()
plt.plot(Ns, np.log2(Ts), 'ko', label='time measurements')
plt.plot(Ns, np.polyval(poly, Ns), 'r-',
label=r'$time = %.2g\times %.4g^N$' % (c, a))
plt.plot(Ns, np.log2(func(Ns, *popt)), 'b-',
label=r'$time = %.2g\times %.4g^N$' % (cc, aa))
plt.xlabel('N')
plt.ylabel('log2(time in seconds)')
plt.legend(loc='upper left')
plt.show()
Fitted values:
>>> c*a**np.array([900, 1000])
array([ 11.37200806, 21.56029156])
>>> func([900, 1000], *popt)
array([ 14.1521409 , 22.62916398])
Dynamic programming, to the rescue:
def is_prime(n):
return all(n % i != 0 for i in range(2, n))
def order(numbers, current=[]):
if not numbers:
return current
for i, n in enumerate(numbers):
if current and not is_prime(n + current[-1]):
continue
result = order(numbers[:i] + numbers[i + 1:], current + [n])
if result:
return result
return False
result = order(range(500))
for i in range(len(result) - 1):
assert is_prime(result[i] + result[i + 1])
You can force it to work for even larger lists by increasing the maximum recursion depth.
Here's my take on a solution. As Tim Peters pointed out, this is a Hamiltonian path problem.
So the first step is to generate the graph in some form.
Well the zeroth step in this case to generate prime numbers. I'm going to use a sieve, but whatever prime test is fine. We need primes upto 2 * n since that is the largest any two numbers can sum to.
m = 8
n = m + 1 # Just so I don't have to worry about zero indexes and random +/- 1's
primelen = 2 * m
prime = [True] * primelen
prime[0] = prime[1] = False
for i in range(4, primelen, 2):
prime[i] = False
for i in range(3, primelen, 2):
if not prime[i]:
continue
for j in range(i * i, primelen, i):
prime[j] = False
Ok, now we can test for primality with prime[i]. Now its easy to make the graph edges. If I have a number i, what numbers can come next. I'll also make use of the fact that i and j have opposite parity.
pairs = [set(j for j in range(i%2+1, n, 2) if prime[i+j])
for i in range(n)]
So here pairs[i] is set object whose elements are integers j such that i+j is prime.
Now we need to walk the graph. This is really where the time consuming part is and all further optimizations will be done here.
chains = [
([], set(range(1, n))
]
chains is going to keep track of the valid paths as we walk them. The first element in the tuple will be your result. The second element is all the unused numbers, or unvisited nodes. The idea is to take one chain out of the queue, take a step down the path and put it back.
while chains:
chain, unused = chains.pop()
if not chain:
# we haven't even started, all unused are valid
valid_next = unused
else:
# We need numbers that are both unused and paired with the last node
# Using sets makes this easy
valid_next = unused & pairs[chains[-1]]
for num in valid_next:
# Take a step to the new node and add the new path back to chains
# Reminder, its important not to mutate anything here, always make new objs
newchain = chain + [num]
newunused = unused - set([num])
chains.append( (newchain, newunused) )
# are we done?
if not newunused:
print newchain
chains = False
Notice that if there is no valid next step, the path is removed without a replacement.
This is really memory inefficient, but runs in a reasonable time. The biggest performance bottleneck is walking the graph, so the next optimization would be popping and inserting paths in intelligent places to prioritize the most likely paths. It might be helpful to use a collections.deque or different container for your chains in that case.
EDIT
Here is an example of how you can implement your path priority. We will assign each path a score and keep the chains list sorted by this score. For a simple example I will suggest that paths containing "harder to use" nodes are worth more. That is for each step on a path the score will increase by n - len(valid_next) The modified code will look something like this.
import bisect
chains = ...
chains_score = [0]
while chains:
chain, unused = chains.pop()
score = chains_score.pop()
...
for num in valid_next:
newchain = chain + [num]
newunused = unused - set([num])
newscore = score + n - len(valid_next)
index = bisect.bisect(chains_score, newscore)
chains.insert(index, (newchain, newunused))
chains_score.insert(index, newscore)
Remember that insertion is O(n) so the overhead of adding this can be rather large. Its worth doing some analysis on your score algorithm to keep the queue length len(chains) managable.

Calculating the first triangle number to have over 500 divisors in python

I'm trying to solve the 12th problem on Project Euler. I can calculate the number that has over 500 divisors in almost 4 minutes. How can i make it faster? Here's the attempt;
import time
def main():
memo={0:0,1:1}
i=2
n=200
while(1):
if len(getD(getT(i)))>n:
break
i+=1
print(getT(i))
#returns the nth triangle number
def getT(n):
if not n in memo:
memo[n]=n+getT(n-1)
return memo[n]
#returns the list of the divisors
def getD(n):
divisors=[n]
for i in xrange(1,int((n/2)+1)):
if (n/float(i))%1==0:
divisors.append(i)
return divisors
startTime=time.time()
main()
print(time.time()-startTime)
You don't need an array to store the triangle numbers. You can use a single int because you are checking only one value. Also, it might help to use the triangle number formula:n*(n+1)/2 where you find the nth triangle number.
getD also only needs to return a single number, as you are just looking for 500 divisors, not the values of the divisors.
However, your real problem lies in the n/2 in the for loop. By checking factor pairs, you can use sqrt(n). So only check values up to sqrt(n). If you check up to n/2, you get a very large number of wasted tests (in the millions).
So you want to do the following (n is the integer to find number of divisors of, d is possible divisor):
make sure n/d has no remainder.
determine whether to add 1 or 2 to your number of divisors.
Using a decorator (courtesy of activestate recipes) to save previously calculated values, and using a list comprehension to generate the devisors:
def memodict(f):
""" Memoization decorator for a function taking a single argument """
class memodict(dict):
def __missing__(self, key):
ret = self[key] = f(key)
return ret
return memodict().__getitem__
#memodict
def trinumdiv(n):
'''Return the number of divisors of the n-th triangle number'''
numbers = range(1,n+1)
total = sum(numbers)
return len([j for j in range(1,total+1) if total % j == 0])
def main():
nums = range(100000)
for n in nums:
if trinumdiv(n) > 200:
print n
break
Results:
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:def main():
: nums = range(10000)
: for n in nums:
: if trinumdiv(n) > 100:
: print 'Found:', n
: break
:
:startTime=time.time()
:main()
:print(time.time()-startTime)
:--
Found: 384
1.34229898453
and
In [2]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:def main():
: nums = range(10000)
: for n in nums:
: if trinumdiv(n) > 200:
: print 'Found:', n
: break
:
:startTime=time.time()
:main()
:print(time.time()-startTime)
:--
Found: 2015
220.681169033
A few comments.
As Quincunx writes, you only need to check the integer range from 1..sqrt(n) which would translate into something like this for i in xrange(1, sqrt(n) + 1): .... This optimization alone vastly speeds up things.
You can use the triangle number formula (which I didn't know until just now, thank you Quincunx), or you can use another approach for finding the triangle numbers than recursion and dictionary lookups. You only need the next number in the sequence, so there is no point in saving it. Function calls involves significant overhead in Python, so recursion is usually not recommended for number crunching. Also, why the cast to float, I didn't quite get that ?
I see that you are already using xrange instead of range to build the int stream. I assume you know that xrange is faster because it is implemented as a generator function. You can do that too. This makes things a lot smoother as well.
I've tried to do just that, use generators, and the code below finds the 500th triangle number in ~16sec on my machine (YMMV). But I've also used a neat trick to find the divisors, which is the quadratic sieve.
Here is my code:
def triangle_num_generator():
""" return the next triangle number on each call
Nth triangle number is defined as SUM([1...N]) """
n = 1
s = 0
while 1:
s += n
n += 1
yield s
def triangle_num_naive(n):
""" return the nth triangle number using the triangle generator """
tgen = triangle_num_generator()
ret = 0
for i in range(n):
ret = tgen.next()
return ret
def divisor_gen(n):
""" finds divisors by using a quadrativ sieve """
divisors = []
# search from 1..sqrt(n)
for i in xrange(1, int(n**0.5) + 1):
if n % i is 0:
yield i
if i is not n / i:
divisors.insert(0, n / i)
for div in divisors:
yield div
def divisors(n):
return [d for d in divisor_gen(n)]
num_divs = 0
i = 1
while num_divs < 500:
i += 1
tnum = triangle_num_naive(i)
divs = divisors(tnum)
num_divs = len(divs)
print tnum # 76576500
Running it produces the following output on my humble machine:
morten#laptop:~/documents/project_euler$ time python pr012.py
76576500
real 0m16.584s
user 0m16.521s
sys 0m0.016s
Using the triangle formula instead of the naive approach:
real 0m3.437s
user 0m3.424s
sys 0m0.000s
I made a code for the same task. It is fairly fast. I used a very fast factor-finding algorithm to find the factors of the number. I also used (n^2 + n)/2 to find the triangle numbers. Here is the code:
from functools import reduce
import time
start = time.time()
n = 1
list_divs = []
while len(list_divs) < 500:
tri_n = (n*n+n)/2 # Generates the triangle number T(n)
list_divs = list(set(reduce(list.__add__,([i, int(tri_n//i)] for i in range(1, int(pow(tri_n, 0.5) + 1)) if tri_n % i == 0)))) # this is the factor generator for any number n
n+=1
print(tri_n, time.time() - start)
It completes the job in 15 seconds on an OK computer.
Here is my answer which solves in about 3 seconds. I think it could be made faster by keeping track of the divisors or generating a prime list to use as divisors... but 3 seconds was quick enough for me.
import time
def numdivisors(triangle):
factors = 0
for i in range(1, int((triangle ** 0.5)) + 1):
if triangle % i == 0:
factors += 1
return factors * 2
def maxtriangledivisors(max):
i = 1
triangle = 0
while i > 0:
triangle += i
if numdivisors(triangle) >= max:
print 'it was found number', triangle,'triangle', i, 'with total of ', numdivisors(triangle), 'factors'
return triangle
i += 1
startTime=time.time()
maxtriangledivisors(500)
print(time.time()-startTime)
Here is another solution to the problem.In this i use Sieve of Eratosthenes to find the primes then doing prime factorisation.
Applying the below formula to calculate number of factors of a number:
total number of factors=(n+1)*(m+1).....
where number=2^n*3^n.......
My best time is 1.9 seconds.
from time import time
t=time()
a=[0]*100
c=0
for i in range(2,100):
if a[i]==0:
for j in range(i*i,100,i):
continue
a[c]=i
c=c+1
print(a)
n=1
ctr=0
while(ctr<=1000):
ctr=1
triang=n*(n+1)/2
x=triang
i=0
n=n+1
while(a[i]<=x):
b=1
while(x%a[i]==0):
b=b+1
x=x//a[i];
i=i+1
ctr=ctr*b
print(triang)
print("took time",time()-t)

Categories