I am working on Project Euler Problem 50, which states:
The prime 41 can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
To determine the terms for a prime P (if it can be written as a sum of consecutive primes at all), I slide a window over all the primes (in increasing order) up to, but not including, P, and calculate the sum of each window; if a sum equals the prime under consideration, I record the length of the window...
This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memoization would help; when calculating the sums of the sliding windows, a lot of double work is done... (right?)
So I found the standard memoization implementation on the net and just pasted it into my code. Is this correct? (I have no idea how it is supposed to work here...)
primes = tuple(n for n in range(1, 10**6) if is_prime(n)==True)

count_best = 0

##http://docs.python.org/release/2.3.5/lib/itertools-example.html:
## Slightly modified (first for loop)
from itertools import islice

def window(seq):
    for n in range(2, len(seq) + 1):
        it = iter(seq)
        result = tuple(islice(it, n))
        if len(result) == n:
            yield result
        for elem in it:
            result = result[1:] + (elem,)
            yield result
def memoize(function):
    cache = {}
    def decorated_function(*args):
        if args in cache:
            return cache[args]
        else:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function
@memoize
def find_lin_comb(prime):
    global count_best
    for windows in window(primes[0 : primes.index(prime)]):
        if sum(windows) == prime and len(windows) > count_best:
            count_best = len(windows)
            print('Prime: ', prime, 'Terms: ', count_best)

##Find them:
for x in primes[::-1]: find_lin_comb(x)
(btw, the tuple of prime numbers is generated "decently" fast)
All input is appreciated. I am just a hobby programmer, so please don't get too advanced on me.
Thank you!
Edit: here is a working code paste that doesn't have ruined indentation:
http://pastebin.com/R1NpMqgb
This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memoization would help; when calculating the sums of the sliding windows, a lot of double work is done... (right?)
Yes, right. And of course it's slow for the primes up to 10**6.
Say you have n primes up to N, numbered in increasing order, p_1 = 2, p_2 = 3, .... When considering whether prime no. k is the sum of consecutive primes, you consider all windows [p_i, ..., p_j], for pairs (i,j) with i < j < k. There are (k-1)*(k-2)/2 of them. Going through all k up to n, you examine about n³/6 windows in total (counting multiplicity: each window w(i,j) is examined n-j times in total). Even ignoring the cost of creating each window and summing it, you can see how badly this scales:
For N = 1000, there are n = 168 primes and about 790000 windows to examine (counting multiplicity).
For N = 10**6, there are n = 78498 primes and about 8.3*10**13 windows to examine.
Now factor in the work for creating and summing the windows. Estimating it low at j-i+1 for summing the j-i+1 primes in w(i,j), the work for p_k is about k³/6, and the total work becomes roughly n⁴/24. Something like 33 million steps for N = 1000, peanuts, but nearly 1.6*10**18 for N = 1000000.
A year contains about 3.1*10**7 seconds; with a ~3GHz CPU, that's roughly 10**17 clock cycles. So we're talking about an operation needing something like 100 CPU-years (maybe a factor of 10 off or so).
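A quick sanity check of those estimates (illustrative arithmetic only, using the prime counts quoted above):

n_small, n_big = 168, 78498      # number of primes below 10**3 and 10**6
print(n_small**3 // 6)           # ~7.9*10**5 windows for N = 1000
print(n_big**3 // 6)             # ~8*10**13 windows for N = 10**6
print(n_big**4 // 24)            # ~1.6*10**18 summing steps for N = 10**6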
You aren't willing to wait that long, I suppose ;)
Now, with memoisation, you still look at each window multiple times, but you do the computation of each window only once. That means you need about n³/6 work for computing the windows, and you still look at windows about n³/6 times in total.
Problem 1: You still need to look at windows about 8.3*10**13 times; that's several hours even if a lookup costs only one cycle.
Problem 2: There are about 8.3*10**13 windows to memoise. You don't have that much memory, unless you can use a really large HD.
You can circumvent the memory problem by throwing away data you no longer need and computing data for a window only when it is needed, but once you know which data you can throw away, and when, you should be able to see a much better approach.
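For instance, here is a minimal sketch of one such idea, cumulative sums; it is not the whole solution, but it shows how every window sum becomes available in constant time without storing any windows (prefix is a name I made up):

prefix = [0]
for p in primes:                  # primes as in the question's code
    prefix.append(prefix[-1] + p)
# sum(primes[i:j]) is now simply prefix[j] - prefix[i]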
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
What does this tell you about the window generating that sum? Where can it start, where can it stop? How can you use that information to create an efficient algorithm to solve the problem?
The memoize decorator adds a wrapper to a function to cache the return value for each value of the argument (each combination of values in the case of multiple arguments). It is useful when the function is called multiple times with the same arguments. You can only use it with a pure function, i.e. one satisfying the following:
The function has no side effects. Changing a global variable and doing output are examples of side effects.
The return value depends only on the values of the arguments, not on some global variables that may change values between calls.
Your find_lin_comb function does not satisfy the above criteria. For one thing, it is called with a different argument every time; for another, it does not return a value.
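To see the decorator doing useful work, here is a toy example with a pure function (naive Fibonacci, chosen purely for illustration):

@memoize
def fib(n):
    # pure: no side effects, and the result depends only on n
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))   # fast, because each fib(k) is computed only once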
Given this plain is_prime1 function, which checks all the divisors from 2 up to sqrt(p), with some bit-playing in order to skip even numbers, which are of course not primes:
import time

def is_prime1(p):
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for k in range(2, int(p ** 0.5) + 1):
        if p % k == 0:
            return False
    return True
Versus this "optimized" version. The idea is to save all the primes we have found up to a certain number p, then iterate over those primes (using the basic arithmetic fact that every composite number has a prime factor), so we don't iterate through all the numbers up to sqrt(p) but only over the primes we found, which should be a tiny set compared to the numbers up to sqrt(p). We also iterate over only part of the stored list, because the largest stored primes most certainly won't "fit" in the number p.
import time

global mem
global lenMem
mem = [2]
lenMem = 1

def is_prime2(p):
    global mem
    global lenMem
    # if p is even then the LSD is off
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for div in mem[0: int(p ** 0.5) + 1]:
        if p % div == 0:
            return False
    mem.append(p)
    lenMem += 1
    return True
The only idea I have in mind is that "global variables are expensive and time consuming", but I don't know if there is another way, and if there is, whether it will really help.
On average, when running this same program:
start = time.perf_counter()
for p in range(2, 100000):
    print(f'{p} is a prime? {is_prime2(p)}') # change to is_prime1 or is_prime2
end = time.perf_counter()
I get that for is_prime1 the average time for checking the numbers up to 100K is ~0.99 seconds, and likewise for is_prime2 (maybe a difference of +0.01s on average; perhaps, as I said, the usage of global variables ruins some performance?).
The difference is a combination of three things:
You're just not doing that much less work. Your test case includes testing a ton of small numbers, where the distinction between testing "all numbers from 2 to the square root" and testing "all primes from 2 to the square root" just isn't that big. Your "average case" is roughly the midpoint of the range, 50,000, with a square root of 223.6, which means testing 48 primes, versus 222 numbers, if the number is prime. But most numbers aren't prime, and most numbers have at least one small factor (proof left as an exercise), so you short-circuit and never actually test most of the candidates in either set (if there's a factor below 8, which applies to ~77% of all numbers, you've saved maybe two tests by limiting yourself to primes).
You're slicing mem every time, which is performed eagerly and completely, even if you don't use all the values (and as noted, for the non-primes you almost never do). This isn't a huge cost, but then, you weren't getting huge savings from skipping non-primes either, so it likely eats what little savings you got from the other optimization.
(You found this one, good show.) Your slice of primes took a number of primes equal to the square root of the number to test, not all primes less than the square root of the number to test. So you actually performed the same number of tests, just with different numbers (many of them primes larger than the square root, which definitely don't need to be tested).
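A possible fix for that last point, sketched as a standalone helper (the name is made up): stop at the first stored prime whose square exceeds p, instead of slicing a fixed count of primes:

def has_no_small_prime_factor(p, mem):
    # intended replacement for looping over mem[0: int(p ** 0.5) + 1]
    for div in mem:
        if div * div > p:   # no factor of p can exceed sqrt(p)
            return True
        if p % div == 0:
            return False
    return True             # mem exhausted without finding a factor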
A side-note:
Your up-front tests aren't actually saving you much work: you redo both tests in the loop, so they're wasted effort when the number is prime (you test them both twice). And your test for divisibility by five is pointless; % 10 is no faster than % 5 (computers don't operate in base 10 anyway), and if not p % 5: is a slightly faster, more direct, and more complete way to test for divisibility (your test doesn't recognize multiples of 10, only multiples of 5 that aren't multiples of 10).
The tests are also wrong, because they don't exclude the base cases (they say 2 and 5 are not prime, because they're divisible by 2 and 5 respectively).
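Putting the side-note together, a sketch of what corrected up-front tests could look like (is_prime_fixed is a made-up name; the trial division is kept deliberately simple):

def is_prime_fixed(p):
    if p < 2:
        return False
    if p in (2, 5):                   # handle the base cases first
        return True
    if p % 2 == 0 or p % 5 == 0:      # % 5 also catches multiples of 10
        return False
    for k in range(3, int(p ** 0.5) + 1, 2):   # skip even divisors
        if p % k == 0:
            return False
    return True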
First of all, you should remove the print call; it is very time consuming.
You should time just your function, not the printing, so you could do it like this:
start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime1(p)}')
    is_prime1(p)
end = time.perf_counter()
print("prime1", end - start)

start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime2(p)}')
    is_prime2(p)
end = time.perf_counter()
print("prime2", end - start)
is_prime1 is still faster for me.
If you want to hold primes in global memory to accelerate multiple calls, you need to ensure that the primes list is properly populated even when the function is called with numbers in random order. The way is_prime2() stores and uses the primes assumes that, for example, it is called with 7 before being called with 343. If not, 343 will be treated as a prime because 7 is not yet in the primes list.
So the function must compute and store all the primes up to √343 ≈ 18.5 (and in particular 7, its smallest factor) before it can correctly answer the is_prime(343) call.
In order to quickly build a primes list, the Sieve of Eratosthenes is one of the fastest methods. But since you don't know in advance how many primes you need, you can't allocate the sieve's bit flags in advance. What you can do is use a rolling window of the sieve to move forward in chunks (of, let's say, 1000000 bits at a time). When a number beyond your largest sieved value is requested, you just generate more primes chunk by chunk until you have enough to respond.
Also, since you're going to build a list of primes, you might as well make it a set and check whether the requested number is in it to answer the function call. This requires generating more primes than needed for trial divisions but, in the spirit of accelerating subsequent calls, that should not be an issue.
Here's an example of an isPrime() function that uses that approach:
primes = {3}
sieveMax = 3
sieveChunk = 1000000   # must be an even number

def isPrime(n):
    if not n & 1: return n == 2          # 2 is the only even prime
    global primes, sieveMax, sieveChunk
    while n > sieveMax:                  # extend the sieve chunk by chunk
        base, sieveMax = sieveMax, sieveMax + sieveChunk
        sieve = [True] * sieveChunk      # sieve[i] represents base + i
        for p in primes:                 # cross out multiples of known primes
            i = (p - base % p) % p
            sieve[i::p] = [False] * len(sieve[i::p])
        for i in range(0, sieveChunk, 2):  # base is odd, so even i = odd numbers
            if not sieve[i]: continue
            p = i + base
            primes.add(p)
            sieve[i::p] = [False] * len(sieve[i::p])
    return n in primes
On the first call with an unknown prime it will perform slower than the division approach, but as the prime list builds up, it will provide much better response times.
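For example (illustrative calls):

print(isPrime(97))   # True; the first call pays for sieving a whole chunk
print(isPrime(91))   # False (7 * 13); answered from the cached set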
This is the 10th problem in Project Euler, in which we are supposed to find the sum of all the prime numbers below 2 million. I am using the Sieve of Eratosthenes algorithm to find the prime numbers. Now I am facing a performance issue with it.
The performance goes down by a substantial amount if the print(i,"",sum_of_prime) is kept inside the loop. Is there any way to watch it working and keep the performance? Done with the conventional method, it takes some 13 minutes to get the result.
#Euler 10
#Problem:
#The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
#Find the sum of all the primes below two million.
#Author: Kshithij Iyer
#Date of creation: 15/1/2017
import time

#Recording start time of the program
start=time.time()

def sum_of_prime_numbers(limit):
    "A function to get the sum of prime numbers till the limit"
    #Variable to store the sum of prime numbers
    sum_of_prime=0
    #Sieve of Eratosthenes algorithm
    sieve=[True]*limit
    for i in range(2,limit):
        if sieve[i]:
            sum_of_prime=sum_of_prime+i
            for j in range(i*i,limit,i):
                sieve[j]=False
            print(i,"",sum_of_prime)
    print("The sum of all the prime numbers below",limit,"is",sum_of_prime)
    return

#sum_of_prime_numbers(10)
sum_of_prime_numbers(2000001)
print("Execution time of program is",time.time()-start)

#Note:
#I did give the conventional method a try but it didn't work well and was taking
#some 13 minutes to get the results.

#Algorithm for reference
#Input: an integer n > 1
#Let A be an array of Boolean values, indexed by integers 2 to n,
#initially all set to true.
#for i = 2, 3, 4, ..., not exceeding √n:
#    if A[i] is true:
#        for j = i², i²+i, i²+2i, i²+3i, ..., not exceeding n:
#            A[j] := false
#Output: all i such that A[i] is true.
So there are various improvements that could be made here:
Firstly, due to the nature of Eratosthenes' sieve, you can replace for i in range(2,limit): with for i in range(2,int(limit**0.5)+1): and the sieve array will still be calculated correctly, but much faster; however, you would then have to sum the numbers afterwards, rather than adding each prime as the outer loop reaches it.
Also, you are not going to be able to read every individual prime as it scrolls past, nor would you want to; instead, have the program report only milestones, such as every time it reaches a certain number, so you can see that it is working.
Your program does not appear to take into account the fact that arrays start at 0, which could cause off-by-one problems; however, these should be fairly fixable.
Finally, it occurs to me that your program appears to count 1 as prime; this should be another easy fix though.
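Here is a sketch combining those suggestions (names follow the original code; not benchmarked against the original timings):

def sum_of_prime_numbers(limit):
    sieve = [True] * limit
    sieve[0] = sieve[1] = False                # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):  # marking stops at sqrt(limit)
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    total = sum(i for i in range(2, limit) if sieve[i])  # summed afterwards
    print("The sum of all the primes below", limit, "is", total)
    return total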
What I'm trying to figure out is this: when I run this code for smaller numbers it returns the list just fine, but for larger numbers (I would call this small in the context of what I'm working on) like 29996299, it will run for a long time. I've waited 45 minutes with no results and had to end up killing the program. What I was wondering was whether there is a more efficient way to handle numbers whose scale is larger than 4 or 5 digits. I've tested a few permutations of the range function to see if there is a better way to handle the limits of the list I want to produce, but nothing seems to have any effect on the amount of time the computation takes. I'm new to Python and am not that experienced as a programmer. Thank you for your time.
Edit: I ran the program again before submitting this post and it took an hour and a half or so.
The function of the program is to take the user-selected number, use it to generate a lower bound, find all primes between that bound and the input and append them to a list, then generate a second upper bound, find all primes up to it, and append them as well, to create a list that extends backwards and forwards from the initial number.
The program works like I expect it to, but not as quickly as I need it to, since the numbers I'm going to be dealing with are going to get large quickly, almost doubling at each phase.
initial_num = input("Please enter a number. ")

lower_1 = int(initial_num) - 1000
upper_1 = int(initial_num)
list_1 = []

for num in range(lower_1, upper_1):
    if num > 1:
        for i in range(2, num):
            if (num % i) == 0:
                break
        else:
            list_1.append(num)

lower_2 = list_1[-1]
upper_2 = list_1[-1] + 2000
list_2 = []

for num in range(lower_2, upper_2 + 1):
    if num > 1:
        for i in range(2, num):
            if (num % i) == 0:
                break
        else:
            list_2.append(num)

list_3 = list_1 + list_2[1:]
print list_3
You can use a more efficient algorithm to generate the entire list of prime numbers up to N: the Sieve of Eratosthenes. Please have a look at the linked article; it even includes example pseudocode. The basic idea of the algorithm is:
maintain L, a list of potentially prime numbers (initially all numbers from 2 to N)
pick the next prime number (p) as the first element of L (initially 2)
remove all numbers that are a multiple of p, up to N, since they cannot be prime
repeat from step 2
At the end you are left with a list of prime numbers.
An implementation in Python, from here:
def eratosthenes2(n):
    multiples = set()
    for i in range(2, n+1):
        if i not in multiples:
            yield i
            multiples.update(range(i*i, n+1, i))

print(list(eratosthenes2(100)))
To reduce memory consumption you could consider using a bitset, storing one bit for each number. That should reduce memory usage by a factor of 32 to 64. A bitset implementation is available for Python here.
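As a middle ground before a full bitset, a bytearray already shrinks the cost to one byte per number; a rough sketch:

def primes_up_to(n):
    flags = bytearray([1]) * (n + 1)   # one byte per number; 1 = maybe prime
    flags[0] = flags[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            # zero out all multiples of i with one slice assignment
            flags[i*i::i] = bytearray(len(range(i*i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]

print(primes_up_to(100))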
I've got what I think is a valid solution to Problem 2 of Project Euler (summing the even-valued Fibonacci numbers up to 4,000,000). This works for lower numbers, but crashes when I run it with 4,000,000. I understand that this is computationally difficult, but shouldn't it just take a long time to compute rather than crash? Or is there an issue in my code?
import functools

def fib(limit):
    sequence = []
    for i in range(limit):
        if(i < 3):
            sequence.append(i)
        else:
            sequence.append(sequence[i-1] + sequence[i-2])
    return sequence

def add_even(x, y):
    if(y % 2 == 0):
        return x + y
    return x + 0

print(functools.reduce(add_even, fib(4000000)))
The problem asks for the Fibonacci numbers that are smaller than 4,000,000. Your code tries to find the first 4,000,000 Fibonacci values instead. Since Fibonacci numbers grow exponentially, this quickly reaches numbers too large to fit in memory.
You need to change your function to stop when the last calculated value exceeds 4,000,000.
Another possible improvement is to add the numbers as you calculate them instead of storing them in a list, but this won't be necessary if you stop at the appropriate time.
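A minimal sketch of that fix, generating terms only while they stay below the limit and summing the even ones on the fly:

def even_fib_sum(limit):
    total, a, b = 0, 1, 2
    while b <= limit:        # stop once the terms pass the limit
        if b % 2 == 0:
            total += b
        a, b = b, a + b
    return total

print(even_fib_sum(4000000))   # finishes instantly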
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
Let n be the size of the list L passed to this function. Which of the following most accurately describes how the runtime of this function grows as n grows?
(a) It grows linearly, like n does.
(b) It grows quadratically, like n^2 does.
(c) It grows less than linearly.
(d) It grows more than quadratically.
I don't understand how you figure out the relationship between the runtime of the function and the growth of n. Can someone please explain this to me?
ok, since this is homework:
this is the code:
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
It is obviously dependent on len(L).
So let's see what each line costs:
sum = 0
i = 1
# [...]
return sum
Those are obviously constant time, independent of L.
In the loop we have:
sum = sum + L[i]   # time to look up L[i] (`timelookup(L)`) plus time to add to the sum (obviously constant time)
i = i * 2          # obviously constant time
And how many times is the loop executed?
It's obviously dependent on the size of L.
Let's call that loops(L).
So we get an overall complexity of
loops(L) * (timelookup(L) + const)
Being the nice guy I am, I'll tell you that list lookup is constant time in Python, so it boils down to
O(loops(L)) (constant factors ignored, as big-O convention implies)
And how often do you loop, based on the len() of L?
(a) as often as there are items in the list?
(b) quadratically as often as there are items in the list?
(c) less often than there are items in the list?
(d) more often than (b)?
I am not a computer science major and I don't claim to have a strong grasp of this kind of theory, but I thought it might be relevant for someone from my perspective to try and contribute an answer.
Your function will always take some time to execute, and if it is operating on a list argument of varying length, then the time it takes will depend on how many elements are in that list.
Let's assume it takes 1 unit of time to process a list of length == 1. What the question is asking is the relationship between the size of the list getting bigger and the increase in time for this function to execute.
This link breaks down some basics of Big O notation: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
If it were O(1) complexity (which is not actually one of your A-D options), that would mean the runtime never grows regardless of the size of L. Obviously in your example there is a while loop whose counter i grows in relation to the length of L. I would focus on the fact that i is being multiplied, to see the relationship between how long it takes to get through that while loop and the length of L. Basically, try to compare how many iterations the while loop needs at various values of len(L), and that will determine your complexity; one unit of time can be one iteration through the while loop.
Hopefully I have made some form of contribution here, with my own lack of expertise on the subject.
Update
To clarify, based on the comment from ch3ka: if you were doing more than what you currently have inside your while loop, then you would also have to consider the added complexity for each iteration. But because your list lookup L[i] is constant complexity, as is the math that follows it, we can ignore those in terms of the complexity.
Here's a quick-and-dirty way to find out:
import matplotlib.pyplot as plt

def f2(L):
    sum = 0
    i = 1
    times = 0
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
        times += 1 # track how many times the loop gets called
    return times

def main():
    i = range(1200)
    f_i = [f2([1]*n) for n in i]
    plt.plot(i, f_i)
    plt.show() # display the plot

if __name__=="__main__":
    main()
... which results in a plot whose horizontal axis is the size of L and whose vertical axis is how many times the function loops; the big-O behavior should be pretty obvious from it.
Consider what happens with an input of length n = 10. Now consider what happens if the input size is doubled to 20. Will the runtime double as well? Then it's linear. If the runtime grows by a factor of 4, it's quadratic. Etc.
When you look at the function, you have to determine how the size of the list will affect the number of loops that will occur.
In your specific situation, let's increment n and see how many times the while loop will run.
n = 0, loop runs 0 times
n = 1, loop runs 0 times
n = 2, loop runs 1 time
n = 3, loop runs 2 times
n = 4, loop runs 2 times
n = 5, loop runs 3 times
n = 8, loop runs 3 times
n = 9, loop runs 4 times
See the pattern? Now answer your question, does it:
(a) It grows linearly, like n does.
(b) It grows quadratically, like n^2 does.
(c) It grows less than linearly.
(d) It grows more than quadratically.
Check out Hugh's answer for an empirical result :)
It's O(log(len(L))), as list lookup is a constant-time operation, independent of the size of the list.
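A quick empirical check of that claim (illustrative): the loop count matches ceil(log2(n)):

import math

for n in [10, 100, 1000, 10000]:
    i, loops = 1, 0
    while i < n:     # same loop structure as f2
        i *= 2
        loops += 1
    print(n, loops, math.ceil(math.log2(n)))   # loop count == ceil(log2(n))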