I'm trying to make a Python function as fast as I can. Suppose I have a list of primes and I'm reading primes[i] n times for the same i.
I have the intuition that beyond a certain value of n, it becomes faster to keep the value of primes[i] in a variable.
I ran some experiments comparing the two following implementations, and I can't figure out which one is the fastest. It looks like the access time to primes[i] depends on a lot of factors.
1st implementation
while n != 1:
    p = primes[i]
    if n % p == 0:
        n = n // p
        factorization.append(p)
    else:
        i += 1
2nd implementation
while n != 1:
    if n % primes[i] == 0:
        n = n // primes[i]
        factorization.append(primes[i])
    else:
        i += 1
Is there a rule of thumb for how many accesses it takes before it becomes worthwhile to keep the value of a list element in a variable?
Accessing primes[i] is done in constant time, O(1). That means the time needed to read primes[i] does not increase as the primes list becomes bigger, and it does not increase as i becomes bigger.
In layman's terms: it's damn fast!
Then again, accessing a local variable p is still faster than evaluating primes[i], because the latter has to look up and call the __getitem__ implementation of the primes object. Therefore caching the value in a local variable instead of indexing the list twice is marginally faster.
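You can see where the extra work goes with the standard dis module; here's a minimal sketch (the function names are made up for illustration):

import dis

def cached(primes, i):
    p = primes[i]                 # one subscript operation
    return p + p                  # then cheap local-variable loads

def uncached(primes, i):
    return primes[i] + primes[i]  # two subscript operations

dis.dis(cached)    # bytecode shows a single subscript
dis.dis(uncached)  # versus two subscripts plus repeated index loads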
On the other hand, caring about marginal speed improvements is meaningless compared to reducing algorithm complexity. For the problem of finding prime numbers, you should focus on finding a smart algorithm rather than on improving built-in-list access times.
Try using a benchmark
import time

start = time.time()
while n != 1:
    p = primes[i]
    if n % p == 0:
        n = n // p
        factorization.append(p)
    else:
        i += 1
end = time.time()
print(end - start)
Do the same for the second implementation and compare. Also, try running it on Google Colab or another external machine for more consistent results.
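Alternatively, the timeit module repeats a snippet many times and is less noisy than a single time.time() measurement; a minimal sketch (the list size and index are arbitrary):

import timeit

setup = "primes = list(range(2, 10000)); i = 5000"
cached = "p = primes[i]; x = p + p + p"              # index once, reuse
uncached = "x = primes[i] + primes[i] + primes[i]"   # index three times

print(timeit.timeit(cached, setup=setup, number=1000000))
print(timeit.timeit(uncached, setup=setup, number=1000000))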
Related
Given this plain is_prime1 function, which checks all the divisors from 2 to sqrt(p), with some bit-twiddling to skip even numbers, which are of course not prime.
import time

def is_prime1(p):
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for k in range(2, int(p ** 0.5) + 1):
        if p % k == 0:
            return False
    return True
Versus this "optimized" version. The idea is to save all the primes we have found up to a certain number p, then iterate over those primes (using the basic arithmetic fact that every number is a product of primes), so we don't iterate through all the numbers up to sqrt(p) but only over the primes we found, which should be a small fraction of them. We also iterate over only part of the stored primes, because the largest ones almost certainly won't "fit" as factors of p.
import time

global mem
global lenMem
mem = [2]
lenMem = 1

def is_prime2(p):
    global mem
    global lenMem
    # if p is even then the LSD is off
    if p & 1 == 0:
        return False
    # if the LSD is 5 then it is divisible by 5 (i.e. not a prime)
    elif p % 10 == 5:
        return False
    for div in mem[0: int(p ** 0.5) + 1]:
        if p % div == 0:
            return False
    mem.append(p)
    lenMem += 1
    return True
The only idea I have in mind is that "global variables are expensive and time consuming", but I don't know if there is another way, and if there is, whether it will really help.
On average, when running this same program:
start = time.perf_counter()
for p in range(2, 100000):
    print(f'{p} is a prime? {is_prime2(p)}')  # change to is_prime1 or is_prime2
end = time.perf_counter()
I get that for is_prime1 the average time for checking the numbers 1-100K is ~0.99 seconds, and about the same for is_prime2 (maybe a difference of +0.01s on average; perhaps, as I said, the use of global variables ruins some performance?).
The difference is a combination of three things:
You're just not doing that much less work. Your test case includes testing a ton of small numbers, where the distinction between testing "all numbers from 2 to the square root" and "all primes from 2 to the square root" just isn't much of a difference. Your "average case" is roughly the midpoint of the range, 50,000, whose square root is ~223.6; that means testing 48 primes, versus testing 222 numbers if the number is prime. But most numbers aren't prime, and most numbers have at least one small factor (proof left as an exercise), so you short-circuit and don't actually test most of either set. If there's a factor below 8, which applies to ~77% of all numbers, limiting yourself to primes saves you maybe two tests.
You're slicing mem every time; the slice is built eagerly, and completely, even if you don't use all the values (and as noted, for the non-primes you almost never do). This isn't a huge cost, but then, you weren't getting huge savings from skipping non-primes, so it likely eats what little savings you got from the other optimization.
(You found this one, good show.) Your slice of primes took a number of primes equal to the square root of the number under test, not all primes less than that square root. So you actually performed the same number of tests, just with different divisors (many of them primes larger than the square root that definitely don't need to be tested).
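For the third point, here's a sketch of what the loop presumably intended: testing only the stored primes up to the square root, with an early break (is_prime2_fixed is a hypothetical name, and mem is assumed to be the sorted list of primes found so far):

def is_prime2_fixed(p, mem):
    root = int(p ** 0.5)
    for div in mem:
        if div > root:     # no factor of p can exceed sqrt(p)
            break
        if p % div == 0:
            return False
    return True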
A side-note:
Your up-front tests aren't actually saving you much work; you redo both tests in the loop, so they're wasted effort when the number is prime (you test them both twice). And your test for divisibility by five is pointless: % 10 is no faster than % 5 (computers don't operate in base 10 anyway), and if not p % 5: is a slightly faster, more direct, and more complete way to test for divisibility (your test doesn't recognize multiples of 10, just multiples of 5 that aren't multiples of 10).
The tests are also wrong, because they don't exclude the base case (they say 2 and 5 are not prime, because they're divisible by 2 and 5 respectively).
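Putting the side-note's fixes together, one possible corrected version might look like this (is_prime1_fixed is a made-up name, not the original code):

def is_prime1_fixed(p):
    if p < 2:
        return False
    if p in (2, 5):               # handle the base cases explicitly
        return True
    if p % 2 == 0 or p % 5 == 0:  # direct, complete divisibility tests
        return False
    for k in range(3, int(p ** 0.5) + 1, 2):  # odd divisors only
        if p % k == 0:
            return False
    return True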
First of all, you should remove the print call; it is very time-consuming.
You should time just your function, not print, so you could do it like this:
start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime1(p)}')
    is_prime1(p)
end = time.perf_counter()
print("prime1", end - start)

start = time.perf_counter()
for p in range(2, 100000):
    ## print(f'{p} is a prime? {is_prime2(p)}')
    is_prime2(p)
end = time.perf_counter()
print("prime2", end - start)
is_prime1 is still faster for me.
If you want to hold primes in global memory to accelerate multiple calls, you need to ensure that the primes list is properly populated even when the function is called with numbers in random order. The way is_prime2() stores and uses the primes assumes that, for example, it is called with 7 before being called with 343. If not, 343 will be treated as a prime because 7 is not yet in the primes list.
So the function must compute and store all primes up to √343 (≈18.5) before it can respond to the is_prime(343) call.
In order to quickly build a primes list, the Sieve of Eratosthenes is one of the fastest methods. But since you don't know in advance how many primes you need, you can't allocate the sieve's bit flags in advance. What you can do is use a rolling window, moving the sieve forward by chunks (of, let's say, 1,000,000 flags at a time). When a number beyond your maximum prime is requested, you just generate more primes chunk by chunk until you have enough to respond.
Also, since you're going to build a list of primes, you might as well make it a set and check if the requested number is in it to respond to the function call. This will require generating more primes than needed for divisions but, in the spirit of accelerating subsequent calls, that should not be an issue.
Here's an example of an isPrime() function that uses that approach:
primes = {3}
sieveMax = 3
sieveChunk = 1000000  # must be an even number

def isPrime(n):
    if not n & 1: return n == 2
    global primes, sieveMax, sieveChunk
    while n > sieveMax:
        base, sieveMax = sieveMax, sieveMax + sieveChunk
        sieve = [True] * sieveChunk
        for p in primes:
            i = (p - base % p) % p
            sieve[i::p] = [False] * len(sieve[i::p])
        for i in range(0, sieveChunk, 2):
            if not sieve[i]: continue
            p = i + base
            primes.add(p)
            sieve[i::p] = [False] * len(sieve[i::p])
    return n in primes
On the first call to an unknown prime, it will perform slower than the divisions approach but as the prime list builds up, it will provide much better response time.
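For example (the first call below pays the cost of sieving the first chunk; the later ones are plain set lookups):

print(isPrime(343))     # False: 343 = 7**3
print(isPrime(104729))  # True: the 10000th prime
print(isPrime(2))       # True: handled by the even-number shortcut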
I am implementing the coin change problem in Python in CS50's pset6. When I first tackled the problem, this was the algorithm I used:
import time

while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except:
        print('Error: Please enter a positive numeric value')

start_time1 = time.time()
change1 = int(totalChange * 100)  # convert money into cents
n = 0
while change1 >= 25:
    change1 -= 25
    n += 1
while change1 >= 10:
    change1 -= 10
    n += 1
while change1 >= 5:
    change1 -= 5
    n += 1
while change1 >= 1:
    change1 -= 1
    n += 1
print(f'Method1: {n}')
print("--- %s seconds ---" % (time.time() - start_time1))
Having watched the lecture on dynamic programming, I wanted to implement it into this problem. This was my attempt:
while True:
    try:
        totalChange = input('How much change do I owe you? ')
        totalChange = float(totalChange)  # check if it's a valid numeric value
        if totalChange < 0:
            print('Error: Please enter a positive numeric value')
            continue
        break
    except:
        print('Error: Please enter a positive numeric value')

start_time2 = time.time()
change2 = int(totalChange * 100)
rowsCoins = [1, 5, 10, 25]
colsCoins = list(range(change2 + 1))
n = len(rowsCoins)
m = len(colsCoins)
matrix = [[i for i in range(m)] for j in range(n)]
for i in range(1, n):
    for j in range(1, m):
        if rowsCoins[i] == j:
            matrix[i][j] = 1
        elif rowsCoins[i] > j:
            matrix[i][j] = matrix[i-1][j]
        else:
            matrix[i][j] = min(matrix[i-1][j], 1 + matrix[i][j-rowsCoins[i]])
print(f'Method2: {matrix[-1][-1]}')
print("--- %s seconds ---" % (time.time() - start_time2))
When I run the program, it gives the correct answers, but it takes a much longer time.
How could I adjust the second code so that it correctly implements dynamic programming? Is my problem that I am starting the loops from the top left corner of the matrix instead of the bottom right?
What are the time complexities of the algorithms in each piece of code I wrote (as well as for a correct implementation of dynamic programming)? I suspect that the first code is O(n^4) and the second is O(n*m), and that a correct implementation of dynamic programming would be O(n). Am I right to think this?
Any help for a better understanding of these algorithms is much appreciated.
I think both algorithms are basically O(n).
n in this case is the size of the number entered.
In the first algorithm, it's not O(n^4), as that would imply you have 4 nested loops each looping n times. Instead, you have 4 loops that run sequentially. If they didn't modify change1 at all, that would potentially be O(4n), which is the same as O(n).
In the second algorithm, your choice of variable names confuses things a little. n is a constant, and m is based on the size of the input, so is what would typically be called n. So, if we rename n to c and m to n, we get O(c*n) which, again, is the same as O(n).
The key point here is that for any particular n, an O(n) algorithm isn't necessarily faster than, say, an O(n^2) algorithm. Big O notation just describes how the amount of work done varies with the size of the input. What it does say is that as n gets bigger, the time taken by an O(n) algorithm will increase more slowly than the time taken by an O(n^2) algorithm, so for some large enough n, the algorithm with the lower complexity will be quicker.
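A toy cost model makes the crossover concrete (the constants here are invented purely for illustration):

# an O(n) algorithm with a large constant vs an O(n**2) one with a small constant
for n in (10, 100, 1000):
    linear = 100 * n   # 100n "steps"
    quadratic = n * n  # n**2 "steps"
    print(n, linear, quadratic)
# below n = 100, the O(n**2) algorithm actually does fewer steps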
How could I adjust the second code so that it correctly implements dynamic programming? Is my problem that I am starting the loops from the top left corner of the matrix instead of the bottom right?
IMHO, this problem is not suitable for dynamic programming, so it is hard to implement a correct DP solution. Check the greedy solution at https://github.com/endiliey/cs50/blob/master/pset6/greedy.py, which should be the best approach.
What are the time complexities of the algorithms in each piece of code I wrote (as well as for a correct implementation of dynamic programming)?
Basically both of your programs should be O(n), but that does not mean they take the same time; as you have said, the DP solution is much slower. That is because they have different constant factors: for example, 4n and 0.25n are both O(n), but they differ by a factor of 16 in running time.
The greedy solution should have a time complexity of O(1).
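For reference, a compact greedy sketch in the spirit of the linked solution (assuming US denominations and the amount already converted to an integer number of cents):

def greedy_change(cents):
    count = 0
    for coin in (25, 10, 5, 1):  # largest denomination first
        count += cents // coin
        cents %= coin
    return count

print(greedy_change(141))  # 5 quarters + 1 dime + 1 nickel + 1 penny = 8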
I currently have the following set as my randprime(p, q) function. Is there any way to condense this, via something like a genexp or listcomp? Here's my function:
n = randint(p, q)
while not isPrime(n):
    n = randint(p, q)
It's better to just generate the list of primes, and then choose from that list.
As is, with your code there is a slim chance of an infinite loop: if there are no primes in the interval, or if randint keeps picking non-primes, the while loop will never end.
So this is probably shorter and less troublesome:
import random

primes = [i for i in range(p, q) if isPrime(i)]
n = random.choice(primes)
The other advantage of this is there is no chance of an infinite loop if there are no primes in the interval. As stated, this can be slow depending on the range, so it would be quicker if you cached the primes ahead of time:
# initialising primes
minPrime = 0
maxPrime = 1000
cached_primes = [i for i in range(minPrime, maxPrime) if isPrime(i)]

# elsewhere in the code
import random

n = random.choice([i for i in cached_primes if p < i < q])
Again, further optimisations are possible, but they are very much dependent on your actual code... and you know what they say about premature optimisation.
Here is a script written in Python to generate n random prime integers between two given integers:
import numpy as np

def getRandomPrimeInteger(bounds):
    for i in range(len(bounds) - 1):
        if bounds[i + 1] > bounds[i]:
            x = bounds[i] + np.random.randint(bounds[i+1] - bounds[i])
            if isPrime(x):
                return x
        else:
            if isPrime(bounds[i]):
                return bounds[i]
            if isPrime(bounds[i + 1]):
                return bounds[i + 1]

    newBounds = [0 for i in range(2 * len(bounds) - 1)]
    newBounds[0] = bounds[0]
    for i in range(1, len(bounds)):
        newBounds[2*i-1] = int((bounds[i-1] + bounds[i]) / 2)
        newBounds[2*i] = bounds[i]

    return getRandomPrimeInteger(newBounds)

def isPrime(x):
    count = 0
    for i in range(int(x / 2)):
        if x % (i + 1) == 0:
            count = count + 1
    return count == 1

# ex: get 50 random prime integers between 100 and 10000:
bounds = [100, 10000]
for i in range(50):
    x = getRandomPrimeInteger(bounds)
    print(x)
So it would be great if you could use an iterator to give the integers from p to q in random order (without replacement). I haven't been able to find a way to do that. The following will give random integers in that range and will skip anything that has already been tested.
import random

fail = False
tested = set()
n = random.randint(p, q)
while not isPrime(n):
    tested.add(n)
    if len(tested) == q - p + 1:  # every number in [p, q] has been tried
        fail = True
        break
    while n in tested:
        n = random.randint(p, q)
if fail:
    print('I failed')
else:
    print(n, 'is prime')
The big advantage of this is that if, say, the range you're testing is just (14, 15), your code would run forever. This code is guaranteed to produce an answer if such a prime exists, and to tell you there isn't one if it does not. You can obviously make this more compact, but I'm trying to show the logic.
next(i for i in map(lambda x: random.randint(p, q) | 1, itertools.count()) if isPrime(i))
This starts with itertools.count(), which gives an infinite sequence of numbers.
Each number is mapped to a new random number in the range by map(). In Python 3, map returns an iterator rather than a list, which matters here: we don't want to generate a list of infinitely many random numbers! (In Python 2, use itertools.imap for the same lazy behaviour.)
Then, the first matching number is found, and returned.
This works efficiently even if p and q are very far apart - e.g. 1 and 10**30 - where generating a full list is out of the question!
By the way, this is not more efficient than your code above, and is a lot more difficult to understand at a glance - please have some consideration for the next programmer to have to read your code, and just do it as you did above. That programmer might be you in six months, when you've forgotten what this code was supposed to do!
P.S. - in practice, you might want to replace count() with range() (xrange() in Python 2), e.g. range(int((q - p) ** 1.5) + 20), to do no more than that number of attempts (balanced between limited tests for small ranges and large ranges, and with no more than a 1/2% chance of failing if it could succeed); otherwise, as was suggested in another post, you might loop forever.
P.P.S. - improvement: I replaced random.randint(p, q) with random.randint(p, q) | 1 - this makes the code twice as efficient, but eliminates the possibility that the result will be 2.
I am creating a fast method of generating a list of primes in the range(0, limit+1). In the function I end up removing all integers in the list named removable from the list named primes. I am looking for a fast and pythonic way of removing the integers, knowing that both lists are always sorted.
I might be wrong, but I believe list.remove(n) iterates over the list comparing each element with n, meaning that the following code runs in O(n^2) time.
# removable and primes are both sorted lists of integers
for composite in removable:
    primes.remove(composite)
Based on my assumption (which could be wrong, so please confirm whether or not it is correct) and the fact that both lists are always sorted, I would think the following code runs faster, since it only loops over each list once, for O(n) time. However, it is not at all pythonic or clean.
i = 0
j = 0
while i < len(primes) and j < len(removable):
    if primes[i] == removable[j]:
        primes = primes[:i] + primes[i+1:]
        j += 1
    else:
        i += 1
Is there perhaps a built in function or simpler way of doing this? And what is the fastest way?
Side notes: I have not actually timed the functions or code above. Also, it doesn't matter if the list removable is changed/destroyed in the process.
For anyone interested, the full function is below:
import math

# returns a list of primes in range(0, limit+1)
def fastPrimeList(limit):
    if limit < 2:
        return list()
    sqrtLimit = int(math.ceil(math.sqrt(limit)))
    primes = [2] + list(range(3, limit+1, 2))
    index = 1
    while primes[index] <= sqrtLimit:
        removable = list()
        index2 = index
        while primes[index] * primes[index2] <= limit:
            composite = primes[index] * primes[index2]
            removable.append(composite)
            index2 += 1
        for composite in removable:
            primes.remove(composite)
        index += 1
    return primes
This is quite fast and clean: it does O(n) set membership checks, and in amortized time it runs in O(n) (the first line is O(n) amortized, the second line is O(n * 1) amortized, because a set membership check is O(1) amortized):
removable_set = set(removable)
primes = [p for p in primes if p not in removable_set]
Here is the modification of your 2nd solution. It does O(n) basic operations (worst case):
tmp = []
i = j = 0
while i < len(primes) and j < len(removable):
    if primes[i] < removable[j]:
        tmp.append(primes[i])
        i += 1
    elif primes[i] == removable[j]:
        i += 1
    else:
        j += 1
primes[:i] = tmp
del tmp
Please note that constants also matter. The Python interpreter is quite slow (i.e. with a large constant) to execute Python code. The 2nd solution has lots of Python code, and it can indeed be slower for small practical values of n than the solution with sets, because the set operations are implemented in C, thus they are fast (i.e. with a small constant).
If you have multiple working solutions, run them on typical input sizes, and measure the time. You may get surprised about their relative speed, often it is not what you would predict.
The most important thing here is to remove the quadratic behavior. You have this for two reasons.
First, calling remove searches the entire list for values to remove. Doing this takes linear time, and you're doing it once for each element in removable, so your total time is O(NM) (where N is the length of primes and M is the length of removable).
Second, removing elements from the middle of a list forces you to shift the whole rest of the list up one slot. So, each one takes linear time, and again you're doing it M times, so again it's O(NM).
How can you avoid these?
For the first, you either need to take advantage of the sorting, or just use something that allows you to do constant-time lookups instead of linear-time, like a set.
For the second, you either need to create a list of indices to delete and then do a second pass to move each element up the appropriate number of indices all at once, or just build a new list instead of trying to mutate the original in-place.
So, there are a variety of options here. Which one is best? It almost certainly doesn't matter; changing your O(NM) time to just O(N+M) will probably be more than enough of an optimization that you're happy with the results. But if you need to squeeze out more performance, then you'll have to implement all of them and test them on realistic data.
The only one of these that I think isn't obvious is how to "use the sorting". The idea is to use the same kind of staggered-zip iteration that you'd use in a merge sort, like this:
def sorted_subtract(seq1, seq2):
    # yield the elements of sorted seq1 that are not in sorted seq2,
    # walking both sequences in a single merge-style pass
    i1, i2 = 0, 0
    while i1 < len(seq1):
        if i2 == len(seq2):
            yield from seq1[i1:]   # seq2 exhausted: keep the rest
            return
        if seq1[i1] < seq2[i2]:
            yield seq1[i1]         # not in seq2: keep it
            i1 += 1
        elif seq1[i1] == seq2[i2]:
            i1 += 1                # in seq2: drop it
        else:
            i2 += 1                # advance seq2 until it catches up
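Usage would then be something like:

primes = list(sorted_subtract(primes, removable))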
I'm relatively new to the Python world, and the coding world in general, so I'm not really sure how to go about optimizing my Python script. The script I have is as follows:
import math

z = 1
x = 0
while z != 0:
    x = x + 1
    if x == 500:
        z = 0
    calculated = open('Prime_Numbers.txt', 'r')
    readlines = calculated.readlines()
    calculated.close()
    a = len(readlines)
    b = readlines[(a-1)]
    b = int(b) + 1
    for num in range(b, (b+1000)):
        prime = True
        calculated = open('Prime_Numbers.txt', 'r')
        for i in calculated:
            i = int(i)
            q = math.ceil(num/2)
            if (q % i == 0):
                prime = False
        if prime:
            calculated.close()
            writeto = open('Prime_Numbers.txt', 'a')
            num = str(num)
            writeto.write("\n" + num)
            writeto.close()
            print(num)
As some of you can probably guess I'm calculating prime numbers. The external file that it calls on contains all the prime numbers between 2 and 20.
The reason that I've got the while loop in there is that I wanted to be able to control how long it ran for.
If you have any suggestions for cutting out any clutter in there, please respond and let me know. Thanks.
Reading and writing to files is very, very slow compared to operations with integers. Your algorithm can be sped up 100-fold by just ripping out all the file I/O:
import itertools

primes = {2}                  # A set containing only 2
for n in itertools.count(3):  # Start counting from 3, by 1
    for prime in primes:      # For every prime less than n
        if n % prime == 0:    # If it divides n
            break             # Then n is composite
    else:
        primes.add(n)         # Otherwise, it is prime
        print(n)
A much faster prime-generating algorithm would be a sieve. Here's the Sieve of Eratosthenes, in Python 3:
end = int(input('Generate primes up to: '))
numbers = {n: True for n in range(2, end)}  # Assume every number is prime, and then
for n, is_prime in numbers.items():         # (Python 3 only)
    if not is_prime:
        continue                            # For every prime number
    for i in range(n ** 2, end, n):         # Cross off its multiples
        numbers[i] = False
    print(n)
It is very inefficient to keep storing and loading all primes from a file. In general, file access is very slow. Instead, save the primes to a list or deque: initialize calculated = deque() and then simply add new primes with calculated.append(num). At the same time, output your primes with print(num) and pipe the result to a file.
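A rough sketch of that restructuring (seeding the deque with the primes the file used to hold, 2 through 19):

from collections import deque

calculated = deque([2, 3, 5, 7, 11, 13, 17, 19])
for num in range(21, 21 + 1000):
    prime = True
    for i in calculated:
        if num % i == 0:
            prime = False
            break
    if prime:
        calculated.append(num)
        print(num)  # pipe stdout to a file to keep a record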
When you find out that num is not a prime, you do not have to keep checking all the other divisors, so break from the inner loop:
if q % i == 0:
    prime = False
    break
You do not need to go through all previous primes to check a new candidate. Since each composite factors into two integers, at least one of the factors must be less than or equal to sqrt(num), so limit your search to those divisors.
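Combined with the break above, the inner test could become something like:

if i * i > num:  # no divisor beyond sqrt(num) needs testing
    break
if num % i == 0:
    prime = False
    break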
Also, the first part of your code irritates me.
z = 1
x = 0
while z != 0:
    x = x + 1
    if x == 500:
        z = 0
This part seems to do the same as:
for x in range(500):
Also, you use x to limit the program to 500 iterations. Why not simply use a counter instead, increasing it whenever a prime is found and breaking when the limit is reached? That would be more readable, in my opinion.
In general you do not need to introduce a limit. You can simply abort the program at any point in time by hitting Ctrl+C.
However, as others already pointed out, your chosen algorithm will perform very poor for medium or large primes. There are more efficient algorithms to find prime numbers: https://en.wikipedia.org/wiki/Generating_primes, especially https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes.
You're writing a blank line to your file, which is making int() traceback. Also, I'm guessing you need to rstrip() off your newlines.
I'd suggest using two different files - one for initial values, and one for all values - initial and recently computed.
If you can keep your values in memory a while, that'd be a lot faster than going through a file repeatedly. But of course, this will limit the size of the primes you can compute, so for larger values you might return to the iterate-through-the-file method if you want.
For computing primes of modest size, a sieve is actually quite good, and worth a google.
When you get into larger primes, trial division by the first n primes is good, followed by m rounds of Miller-Rabin. If Miller-Rabin probabilistically indicates the number is probably a prime, then you do complete trial division or AKS or similar. Miller-Rabin can say "this is probably a prime" or "this is definitely composite". AKS gives a definitive answer, but it's slower.
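As a sketch of that pipeline's middle step, here is a textbook Miller-Rabin test (the round count and the small-prime list are arbitrary choices):

import random

def miller_rabin(n, rounds=20):
    # returns False if n is definitely composite, True if n is probably prime
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):  # quick trial division first
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:               # write n - 1 as d * 2**r with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False            # found a witness: n is composite
    return True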
FWIW, I've got a bunch of prime-related code collected together at http://stromberg.dnsalias.org/~dstromberg/primes/