Maybe it is a stupid question, but I was wondering if you could provide the shortest source to find prime numbers with Python.
I was also wondering how to find prime numbers by using map() or filter() functions.
Thank you (:
EDIT: When I say fastest/shortest, I mean the solution with the fewest characters/words. Don't treat it as a competition, though: I was wondering whether a one-line solution is possible without stripping out the indentation normally used with for loops.
EDIT 2: The problem was not meant for huge numbers. I think we can stay under a million (range(2, 1000000)).
EDIT 3: Shortest, but still elegant. As I said in the first EDIT, you don't need to reduce variable names to single letters. I just need a one-line, elegant solution.
Thank you!
The Sieve of Eratosthenes in two lines.
primes = set(range(2,1000000))
for n in [2]+range(3,1000000/2,2): primes -= set(range(2*n,1000000,n))
Edit: I've realized that the above is not a true Sieve of Eratosthenes because it filters on the set of odd numbers rather than the set of primes, making it unnecessarily slow. I've fixed that in the following version, and also included a number of common optimizations as pointed out in the comments.
primes = set([2] + range(3, 1000000, 2))
for n in range(3, int(1000000**0.5)+1, 2): primes -= set(range(n*n,1000000,2*n) if n in primes else [])
The first version is still shorter and does generate the proper result, even if it takes longer.
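Both versions above are written for Python 2, where range returns a list, so [2] + range(...) works. A minimal sketch of the second version for Python 3 follows; the function name sieve_primes and the limit parameter are my own additions for illustration, not part of the original answer.

```python
def sieve_primes(limit=1000000):
    """Return the set of primes below `limit` (hypothetical helper name)."""
    # list() is needed in Python 3: a range object cannot be added to a list.
    primes = set([2] + list(range(3, limit, 2)))
    for n in range(3, int(limit ** 0.5) + 1, 2):
        if n in primes:
            # Cross out the odd multiples of n, starting at n*n.
            primes -= set(range(n * n, limit, 2 * n))
    return primes

print(sorted(sieve_primes(30)))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Like the original, this assumes the limit is larger than 2, since 2 is always placed in the set.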
Since one can just cut and paste the first million primes from the net:
map(int,open('primes.txt'))
This is somewhat similar to the question I asked yesterday, where wim provided a fairly short answer:
is this primes generator pythonic
Similar to the above, but not as cheeky as Robert King's answer:
from itertools import ifilter, imap
def primes(max=3000):
    r = set(); [r.add(n) for n in ifilter(lambda c: all(imap(c.__mod__, r)), xrange(2, max+1))]; return sorted(r)
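ifilter, imap and xrange only exist in Python 2. Since the question asked about filter() specifically, here is a rough Python 3 sketch using the built-in filter with plain trial division. The name primes_below is hypothetical, and this is a different (slower) technique than the set-based trick above, so it is only reasonable for small limits.

```python
def primes_below(limit):
    """Trial division expressed with filter(); suitable for small limits only."""
    return list(filter(
        # n is prime if no d in 2..sqrt(n) divides it (n % d is 0 when d divides n).
        lambda n: all(n % d for d in range(2, int(n ** 0.5) + 1)),
        range(2, limit),
    ))

print(primes_below(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```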
This uses more characters, but it's readable:
def primes_to(n):
    cands = set(xrange(2, n))
    for i in xrange(2, int(n ** 0.5) + 1):
        for ix in xrange(i ** 2, n, i):
            cands.discard(ix)
    return sorted(cands)  # sets are unordered, so sort before returning
EDIT
A new way, similar to the above, but with fewer wasted attempts at discard:
def primes_to(n):
    cands = set(xrange(3, n, 2))
    for i in xrange(3, int(n ** 0.5) + 1, 2):
        for ix in xrange(i ** 2, n, i * 2):
            cands.discard(ix)
    return [2] + sorted(cands)  # sets are unordered, so sort before returning
Related
The runtime of the code below is really long. Is there a more efficient way of calculating the sum of all prime numbers under 2 million?
primeNumberList = []
previousNumberList = []
for i in range(2,2000000):
    for x in range(2,i):
        previousNumberList.append(x)
    if all(i % n > 0 for n in previousNumberList):
        primeNumberList.append(i)
    previousNumberList = []
print(sum(primeNumberList))
You can optimize it in a bunch of interesting ways.
First, look at algorithmic optimizations.
Use algorithms that find prime numbers faster. (See here).
Use something like memoization to prevent unnecessary computation.
If memory is not an issue, figure out how to exchange memory for runtime.
Next, look at systems level optimizations.
Divide it over multiple processes (multiple threads won't help much because of Python's Global Interpreter Lock). You can do this using gRPC on one host, or PySpark etc. if you are using multiple hosts.
Finally, look at stuff like loop unrolling etc.
Good luck!
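As one possible reading of the algorithmic points above (reuse what has already been computed, and trade a little memory for runtime), here is a minimal sketch that keeps the primes found so far and only divides by those, stopping at the square root of the candidate. The function name sum_primes_below is made up for illustration; a sieve would still be faster.

```python
def sum_primes_below(limit):
    """Sum the primes below `limit`, dividing only by primes found so far."""
    primes = []
    for candidate in range(2, limit):
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break  # no prime divisor up to sqrt(candidate), so it is prime
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
    return sum(primes)

print(sum_primes_below(10))  # 2 + 3 + 5 + 7 = 17
```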
Start with a faster algorithm for calculating prime numbers. A really good survey is here: Fastest way to list all primes below N
This one (taken from one of the answers of that post) calculates in under a second on my year-old iMac:
def primes(n):
    """ Returns a list of primes < n """
    sieve = [True] * n
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)//(2*i)+1)
    return [2] + [i for i in range(3,n,2) if sieve[i]]
print(sum(primes(2000000)))
As long as you have the memory space for it, the Sieve of Eratosthenes is hard to beat when it comes to finding prime numbers:
def sumPrimes(N):
    prime = [True]*(N+1)
    for n in range(3,int(N**(1/2)+1),2):
        if prime[n] : prime[n*n:N+1:n*2] = [False]*len(range(n*n,N+1,n*2))
    return sum(n for n in range(3,N+1,2) if prime[n]) + 2*(N > 1)
While I have seen posts about finding prime factors and divisors, I haven't found an answer to my question about factorisations in Python. I have a list of prime factors, i.e. for 24 it is [2,2,2,3]. I want from this list all possible factorisations, i.e. for 24 the output should be [[2,12], [3,8], [4,6], [2,2,6], [2,3,4], [2,2,2,3]].
I tried itertools approaches, but these created lots of duplicate answers and missed others (like finding [2,3,4] but ignoring [4,6]).
I am specifically interested in an approach using the generated prime factor list. I have found a workaround with a recursive function.
def factors(n, n_list):
    for i in range(2, 1 + int(n ** .5)):
        if n % i == 0:
            n_list.append([i, n // i])
            if n // i not in primes: #primes is a list containing prime numbers
                for items in factors(n // i, []):
                    n_list.append(sorted([i] + items))
    fac_list = [[n]] #[n] has to be added manually
    for facs in n_list: #removes double entries
        if facs not in fac_list:
            fac_list.append(facs)
    return fac_list
But this is time consuming for large n, since it has to look through all numbers, not just prime numbers. A combinatorics approach for a prime factor list should be much faster.
Edit: After looking through several resources, the best explanation of a good strategy is the highest-rated answer here on SO. It is concise and easy to implement in any language one prefers. Case closed.
Your task is to determine the multiplicative partition of a number. Google should point you where you need to go. Stack Overflow already has an answer.
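For reference, a minimal recursive sketch of multiplicative partitions. It avoids duplicates by only producing factors in non-decreasing order. Note the assumptions: the function name and the smallest parameter are my own, and the sketch uses trial division on n rather than starting from the prime factor list the question asked about.

```python
def multiplicative_partitions(n, smallest=2):
    """Yield every way to write n as a product of factors >= smallest,
    with the factors in non-decreasing order (so no duplicates arise)."""
    yield [n]  # n on its own is the trivial factorisation
    factor = smallest
    while factor * factor <= n:
        if n % factor == 0:
            # Fix `factor` as the first element, then split the cofactor
            # using only factors that are at least as large.
            for rest in multiplicative_partitions(n // factor, factor):
                yield [factor] + rest
        factor += 1

print(list(multiplicative_partitions(24)))
# [[24], [2, 12], [2, 2, 6], [2, 2, 2, 3], [2, 3, 4], [3, 8], [4, 6]]
```

Drop the first item if the trivial factorisation [n] is not wanted.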
The standard way of creating a list of random numbers is:
import random

def generateList(n):
    randlist = []
    for i in range(n):
        randlist.append(random.randint(1,9))
    return randlist
However, I recently stumbled upon random.sample, which is much more compact:
def generateList(n):
    return random.sample(range(1, 10), n)
But this results in an error if n is greater than 9, the number of values in range(1, 10). So, my question is: does there exist a built-in function that does exactly what I intend without running into this error? If not, is there at least a more efficient alternative to the first excerpt (considering that n can be quite large)?
No, there is not a function specifically dedicated to this task. But you can use a list comprehension if you want to condense the code:
from random import randint

def generateList(n):
    return [randint(1, 9) for _ in range(n)]
Sampling, as random.sample does it, selects without replacement, so it can return at most N elements from a sample space of size N. That is why you are getting an error.
Having said that, there are many efficient ways to solve your problem.
We can simply wrap your code in a list comprehension, like this
from random import randint

def generateList(n):
    return [randint(1, 9) for i in range(n)]
Use randrange instead of randint, like this
from random import randrange

def generateList(n):
    return [randrange(1, 10) for i in range(n)]
If the number of possible elements is small, you can even use choice like this
from random import choice

def generateList(n, sample_space=range(1, 10)):
    return [choice(sample_space) for i in range(n)]
Note: Since n is going to be large, if you are using Python 2.7, use xrange instead of range. You can read more about the differences in this answer.
I think you're going to just have to sample it n times, each with size 1. The reason you are running into the error is that sample doesn't want to repeat numbers: when you ask for 12 unique numbers from a 10-element list, it chokes. Note that sample returns a list, so take its single element. Try:
import random

def generateList(n, theList):
    return [random.sample(theList, 1)[0] for _ in range(n)]
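On Python 3.6 and later, the standard library also has random.choices, which samples with replacement and so does not care whether n exceeds the size of the population. A short sketch, assuming that Python version; the name generate_list and the low/high parameters are mine:

```python
import random

def generate_list(n, low=1, high=9):
    """Return n random integers in [low, high]; repeats are allowed."""
    # random.choices samples with replacement, so k may exceed the population size.
    return random.choices(range(low, high + 1), k=n)

print(generate_list(12))
```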
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ? # http://projecteuler.net/problem=3
I have a deal going with myself that if I can't solve a Project Euler problem, I will understand the best solution I can find. I did write an algorithm which worked for smaller numbers but was too inefficient for bigger ones. So I googled Zach Denton's answer and started studying it.
Here is his code:
#!/usr/bin/env python
import math
def factorize(n):
    res = []
    # iterate over all even numbers first.
    while n % 2 == 0:
        res.append(2)
        n //= 2
    # try odd numbers up to sqrt(n)
    limit = math.sqrt(n+1)
    i = 3
    while i <= limit:
        if n % i == 0:
            res.append(i)
            n //= i
            limit = math.sqrt(n+i)
        else:
            i += 2
    if n != 1:
        res.append(n)
    return res
print max(factorize(600851475143))
Here are the bits I can't figure out for myself:
In the second while loop, why does he use a sqrt(n + 1) instead of just sqrt(n)?
Why wouldn't you also use sqrt(n + 1) when iterating over the even numbers in the first while loop?
How does the algorithm manage to find only prime factors? In the algorithm I first wrote I had a separate test for checking whether a factor was prime, but he doesn't bother.
I suspect the +1 has to do with the imprecision of float (I am not sure whether it's actually required, or is simply a defensive move on the author's part).
The first while loop factors all twos out of n. I don't see how sqrt(n + 1) would fit in there.
If you work from small factor to large factors, you automatically eliminate all composite candidates. Think about it: once you've factored out 5, you've automatically factored out 10, 15, 20 etc. No need to check whether they're prime or not: by that point n will not be divisible by them.
I suspect that checking for primality is what's killing your original algorithm's performance.
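To make the third point concrete, here is a small Python 3 sketch (not Zach Denton's code; prime_factors is a name I made up) that never tests a candidate for primality. It also sidesteps the sqrt(n + 1) question by comparing divisor * divisor with n in integer arithmetic, so no float is involved.

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity) in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:  # integer test, no float sqrt needed
        while n % divisor == 0:
            # Every smaller factor is already divided out, so any divisor
            # that still divides n here must be prime.
            factors.append(divisor)
            n //= divisor
        divisor += 1 if divisor == 2 else 2  # after 2, try odd numbers only
    if n > 1:
        factors.append(n)  # what remains is a single prime factor
    return factors

print(prime_factors(13195))              # [5, 7, 13, 29]
print(max(prime_factors(600851475143)))  # 6857
```

Composite candidates such as 9 or 15 are still tried, but they can never divide n by the time they are reached.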
So I was reading the Wikipedia article on the Sieve of Eratosthenes and it included a Python implementation:
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Algorithm_complexity_and_implementation
def eratosthenes_sieve(n):
    # Create a candidate list within which non-primes will be
    # marked as None; only candidates below sqrt(n) need be checked.
    candidates = range(n+1)
    fin = int(n**0.5)
    # Loop over the candidates, marking out each multiple.
    for i in xrange(2, fin+1):
        if not candidates[i]:
            continue
        candidates[2*i::i] = [None] * (n//i - 1)
    # Filter out non-primes and return the list.
    return [i for i in candidates[2:] if i]
It looks like a very simple and elegant implementation. I've seen other implementations, even in Python, and I understand how the Sieve works. But the particular way this implementation works is getting me a little confused. It seems whoever wrote that page was pretty clever.
I get that it's iterating through the list, finding primes, and then marking multiples of primes as non-prime.
But what does this line do exactly:
candidates[2*i::i] = [None] * (n//i - 1)
I've figured out that it's slicing candidates from 2*i to the end, stepping by i, so that means all multiples of i: start at 2*i, then go to 3*i, then 4*i, until you reach the end of the list.
But what does [None] * (n//i - 1) mean? Why not just set it to False?
Thanks. Kind of a specific question with a single answer, but I think this is the place to ask it. I would sure appreciate a clear explanation.
candidates[2*i::i] = [None] * (n//i - 1)
is just a terse way of writing
for j in range(2 * i, n + 1, i):
    candidates[j] = None
which works by assigning a list of Nones to a slice of candidates.
L * N creates and concatenates N (shallow) copies of L, so [None] * (n//i - 1) gives a list of n//i - 1 Nones, one for each multiple of i from 2*i up to n. Slice assignment (L[start:end:step] = new_L) overwrites the items of the list the slice touches with the items of new_L; for a stepped slice the two lengths must match exactly.
You are right, one could set the items to False as well. I think this would be preferable; the author of the code evidently thought None would be a better indicator of "crossed out". But None works too, since bool(None) is False and "... if i" is essentially "if bool(i)".
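A tiny Python 3 experiment makes the length bookkeeping visible. The values n = 20 and i = 3 are arbitrary, picked only for illustration; list(range(...)) stands in for Python 2's range.

```python
n, i = 20, 3
candidates = list(range(n + 1))

# Indices touched by the stepped slice: 6, 9, 12, 15, 18.
touched = list(range(2 * i, n + 1, i))

# The replacement list needs exactly as many items as the slice selects.
assert len(candidates[2 * i::i]) == len(touched) == n // i - 1

candidates[2 * i::i] = [None] * (n // i - 1)
print(candidates)
# [0, 1, 2, 3, 4, 5, None, 7, 8, None, 10, 11, None, 13, 14, None, 16, 17, None, 19, 20]
```

If the right-hand list had a different length, the stepped slice assignment would raise a ValueError instead of resizing the list.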