I currently have ↓ set as my randprime(p,q) function. Is there any way to condense this, via something like a genexp or listcomp? Here's my function:
n = randint(p, q)
while not isPrime(n):
    n = randint(p, q)
It's better to just generate the list of primes, and then choose from that list.
As written, your code can loop forever: if there are no primes in the interval, the while loop never ends (and if randint keeps picking non-primes, it can run for a long time).
So this is probably shorter and less troublesome:
import random
primes = [i for i in range(p,q) if isPrime(i)]
n = random.choice(primes)
The other advantage is that there is no chance of an infinite loop when the interval contains no primes (random.choice raises an IndexError on the empty list instead). As stated, this can be slow depending on the range, so it would be quicker to cache the primes ahead of time:
# initialising primes
minPrime = 0
maxPrime = 1000
cached_primes = [i for i in range(minPrime,maxPrime) if isPrime(i)]
#elsewhere in the code
import random
n = random.choice([i for i in cached_primes if p<i<q])
Again, further optimisations are possible, but they depend very much on your actual code... and you know what they say about premature optimisation.
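One possible refinement of the cached approach, sketched under the assumption that the cache is kept as a sorted list: bisect locates the primes inside the interval without building a new list on every call. The names rand_prime_cached and _is_prime are illustrative (they do not appear in the code above), and the bounds are treated as inclusive, like randint.

```python
import bisect
import random

def _is_prime(n):
    """Plain trial division; a stand-in for the isPrime() assumed above."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Build the sorted cache once (the upper limit of 1000 mirrors maxPrime above).
CACHED_PRIMES = [i for i in range(1000) if _is_prime(i)]

def rand_prime_cached(p, q):
    """Return a random cached prime n with p <= n <= q, or None if there is none."""
    lo = bisect.bisect_left(CACHED_PRIMES, p)   # first index with prime >= p
    hi = bisect.bisect_right(CACHED_PRIMES, q)  # one past the last prime <= q
    if lo >= hi:
        return None
    return CACHED_PRIMES[random.randrange(lo, hi)]
```

Returning None for an empty interval is a design choice for this sketch; raising an exception would work just as well.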
Here is a script written in Python to generate n random prime integers between two given integers:
import numpy as np
def getRandomPrimeInteger(bounds):
    for i in range(bounds.__len__()-1):
        if bounds[i + 1] > bounds[i]:
            x = bounds[i] + np.random.randint(bounds[i+1]-bounds[i])
            if isPrime(x):
                return x
        else:
            if isPrime(bounds[i]):
                return bounds[i]
            if isPrime(bounds[i + 1]):
                return bounds[i + 1]
    newBounds = [0 for i in range(2*bounds.__len__() - 1)]
    newBounds[0] = bounds[0]
    for i in range(1, bounds.__len__()):
        newBounds[2*i-1] = int((bounds[i-1] + bounds[i])/2)
        newBounds[2*i] = bounds[i]
    return getRandomPrimeInteger(newBounds)
def isPrime(x):
    count = 0
    for i in range(int(x/2)):
        if x % (i+1) == 0:
            count = count+1
    return count == 1
#ex: get 50 random prime integers between 100 and 10000:
bounds = [100, 10000]
for i in range(50):
    x = getRandomPrimeInteger(bounds)
    print(x)
So it would be great if you could use an iterator to give the integers from p to q in random order (without replacement). I haven't been able to find a way to do that. The following will give random integers in that range and will skip anything that it's tested already.
import random
fail = False
tested = set()
n = random.randint(p, q)
while not isPrime(n):
    tested.add(n)
    if len(tested) == q - p + 1:  # every integer in [p, q] has been tried
        fail = True
        break
    while n in tested:
        n = random.randint(p, q)
if fail:
    print('I failed')
else:
    print('%d is prime' % n)
The big advantage of this is that if say the range you're testing is just (14,15), your code would run forever. This code is guaranteed to produce an answer if such a prime exists, and tell you there isn't one if such a prime does not exist. You can obviously make this more compact, but I'm trying to show the logic.
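The "integers from p to q in random order, without replacement" idea can be sketched by shuffling the whole range once and walking through it. This is only an illustration: it holds all q - p + 1 integers in memory, so it suits small ranges, and the names rand_prime_no_repeat and is_prime are assumptions made for the example.

```python
import random

def is_prime(n):
    """Plain trial division; a stand-in for the isPrime() assumed above."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def rand_prime_no_repeat(p, q):
    """Try every integer in [p, q] exactly once, in random order.

    Returns a prime if the range contains one, otherwise None.
    """
    candidates = list(range(p, q + 1))
    random.shuffle(candidates)  # random order, no repeats
    for n in candidates:
        if is_prime(n):
            return n
    return None
```

Because each candidate is visited once, the function always terminates, which is the same guarantee the set-based loop above is aiming for.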
next(i for i in itertools.imap(lambda x: random.randint(p,q)|1,itertools.count()) if isPrime(i))
This starts with itertools.count() - this gives an infinite sequence.
Each number is mapped to a new random number in the range by itertools.imap(). imap is like map, but returns an iterator rather than a list - we don't want to generate a list of infinitely many random numbers!
Then, the first matching number is found, and returned.
Works efficiently, even if p and q are very far apart - e.g. 1 and 10**30, which generating a full list won't do!
By the way, this is not more efficient than your code above, and is a lot more difficult to understand at a glance - please have some consideration for the next programmer to have to read your code, and just do it as you did above. That programmer might be you in six months, when you've forgotten what this code was supposed to do!
P.S. - in practice, you might want to replace count() with xrange (NOT range!), e.g. xrange(int((q-p)**1.5)+20), to make no more than that number of attempts (a balance between limited tests for small ranges and large ranges, with no more than a 1/2% chance of failing when it could succeed). Otherwise, as was suggested in another post, you might loop forever.
PPS - improvement: replaced random.randint(p,q) with random.randint(p,q)|1 - this makes the code twice as efficient, but eliminates the possibility that the result will be 2 (and, when q is even, it can produce q+1, which lies outside the range).
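For readers on Python 3, where itertools.imap and xrange no longer exist, the same idea might look roughly like the sketch below. The max_tries cap, the n <= q guard and the function names are assumptions added for the example; they are not part of the one-liner above.

```python
import random

def is_prime(n):
    """Plain trial division; a stand-in for the isPrime() assumed above."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def rand_odd_prime(p, q, max_tries=1000):
    """Return a random odd prime in [p, q], or None after max_tries attempts."""
    # Lazily generate odd candidates; "| 1" sets the lowest bit.
    attempts = (random.randint(p, q) | 1 for _ in range(max_tries))
    # The n <= q guard drops q + 1, which "| 1" can produce when q is even.
    return next((n for n in attempts if n <= q and is_prime(n)), None)
```

As with the original, the result can never be 2, and a None result only means that no prime was found within the attempt budget.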
I'm new to both Python and StackOverflow so I apologise if this question has been repeated too much or if it's not a good question. I'm doing a beginner's Python course and one of the tasks I have to do is to make a function that finds the next prime number after a given input. This is what I have so far:
def nextPrime(n):
    num = n + 1
    for i in range(1, 500):
        for j in range(2, num):
            if num%j == 0:
                num = num + 1
    return num
When I run it on the site's IDE, it's fine and everything works well but then when I submit the task, it says the runtime was too long and that I should optimise my code. But I'm not really sure how to do this, so would it be possible to get some feedback or any suggestions on how to make it run faster?
When your function finds the answer, it will continue checking the same number hundreds of times. This is why it is taking so long. Also, when you increase num, you should break out of the nested loop so that the new number is checked against the small factors first (which is more likely to eliminate it and would accelerate progress).
To make this simpler and more efficient, you should break down your problem in areas of concern. Checking if a number is prime or not should be implemented in its own separate function. This will make the code of your nextPrime() function much simpler:
def nextPrime(n):
    n += 1
    while not isPrime(n): n += 1
    return n
Now you only need to implement an efficient isPrime() function:
def isPrime(x):
    p,inc = 2,1
    while p*p <= x:
        if x % p == 0: return False
        p,inc = p+inc,2
    return x > 1
Looping from 1 to 500, especially because another loop runs through it, is not only inefficient, but also confines the range of the possible "next prime number" that you're trying to find. Therefore, you should make use of while loop and break which can be used to break out of the loop whenever you have found the prime number (of course, if it's stated that the number is less than 501 in the prompt, your approach totally makes sense).
Furthermore, you only need to check the integers less than or equal to the square root of the designated integer (which in Python can be written as num**0.5) to determine whether that integer is prime. Divisors always come in pairs whose product is the integer, so the smaller divisor of each pair is at most the square root.
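Putting the while loop and the square-root bound together could look like the following sketch. It uses math.isqrt (Python 3.8+) instead of num**0.5 to avoid floating-point rounding; the snake_case names are illustrative.

```python
import math

def is_prime(num):
    """Trial division by every integer up to the integer square root."""
    if num < 2:
        return False
    for d in range(2, math.isqrt(num) + 1):
        if num % d == 0:
            return False
    return True

def next_prime(n):
    """Return the smallest prime strictly greater than n."""
    num = n + 1
    while not is_prime(num):  # stops as soon as a prime is found
        num += 1
    return num
```

Unlike a fixed range(1, 500), the while loop puts no upper limit on how far the search may go.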
I have an application that is kind of like a URL shortener and need to generate unique URL whenever a user requests.
For this I need a function to map an index/number to a unique string of length n with two requirements:
Two different numbers can not generate same string.
In other words, as long as i != j and i, j < K: f(i) != f(j). K is the number of possible strings = 26^n (26 is the number of letters in the English alphabet).
Two strings generated by number i and i+1 don't look similar most of the times. For example they are not abcdef1 and abcdef2. (So that users can not predict the pattern and the next IDs)
This is my current code in Python:
import itertools
chars = "abcdefghijklmnopqrstuvwxyz"
for item in itertools.product(chars, repeat=n):
    print("".join(item))
# For n = 7 generates:
# aaaaaaa
# aaaaaab
# aaaaaac
# ...
The problem with this code is there is no index that I can use to generate unique strings on demand by tracking that index. For example generate 1 million unique strings today and 2 million tomorrow without looping through or collision with the first 1 million.
The other problem with this code is that the strings that are created after each other look very similar and I need them to look random.
One option is to populate a table/dictionary with millions of strings, shuffle them and keep track of index to that table but it takes a lot of memory.
An option is also to check the database of existing IDs after generating a random string to make sure it doesn't exist but the problem is as I get closer to the K (26^n) the chance of collision increases and it wouldn't be efficient to make a lot of check_if_exist queries against the database.
Also if n was long enough I could use UUID with small chance of collision but in my case n is 7.
I'm going to outline a solution for you that is going to resist casual inspection even by a knowledgeable person, though it probably IS NOT cryptographically secure.
First, your strings and numbers are in a one-to-one map. Here is some simple code for that.
alphabet = 'abcdefghijklmnopqrstuvwxyz'
len_of_codes = 7
char_to_pos = {}
for i in range(len(alphabet)):
    char_to_pos[alphabet[i]] = i
def number_to_string(n):
    chars = []
    for _ in range(len_of_codes):
        chars.append(alphabet[n % len(alphabet)])
        n = n // len(alphabet)
    return "".join(reversed(chars))
def string_to_number(s):
    n = 0
    for c in s:
        n = len(alphabet) * n + char_to_pos[c]
    return n
So now your problem is how to take an ascending stream of numbers and get an apparently random stream of numbers out of it instead. (Because you know how to turn those into strings.) Well, there are lots of tricks for primes, so let's find a decent sized prime that fits in the range that you want.
def is_prime (n):
    for i in range(2, n):
        if 0 == n%i:
            return False
        elif n < i*i:
            return True
    if n == 2:
        return True
    else:
        return False
def last_prime_before (n):
    for m in range(n-1, 1, -1):
        if is_prime(m):
            return m
print(last_prime_before(len(alphabet)**len_of_codes))
With this we find that we can use the prime 8031810103. That's how many numbers we'll be able to handle.
Now there is an easy way to scramble them: use the fact that multiplication modulo a prime scrambles the numbers in the range 1..(p-1).
def scramble1 (p, k, n):
    return (n*k) % p
Picking a random number to scramble by, int(random.random() * 26**7) happened to give me 3661807866. With that we get a sequence we can calculate with:
for i in range(1, 5):
    print(number_to_string(scramble1(8031810103, 3661807866, i)))
Which gives us
lwfdjoc
xskgtce
jopkctb
vkunmhd
This looks random to casual inspection. But it will be reversible for anyone knowledgeable who puts in modest effort. They just have to guess the prime and algorithm that we used, look at 2 consecutive values to get the hidden parameter, then look at a couple more to verify it.
Before addressing that, let's figure out how to take a string and get the number back. Thanks to Fermat's little theorem we know for p prime and 1 <= k < p that (k * k^(p-2)) % p == 1.
def n_pow_m_mod_k (n, m, k):
    answer = 1
    while 0 < m:
        if 1 == m % 2:
            answer = (answer * n) % k
        m = m // 2
        n = (n * n) % k
    return answer
print(n_pow_m_mod_k(3661807866, 8031810103-2, 8031810103))
This gives us 3319920713. Armed with that we can calculate scramble1(8031810103, 3319920713, string_to_number("vkunmhd")) to find out that vkunmhd came from 4.
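As a side note, Python's built-in three-argument pow performs the same modular exponentiation, so the inverse key can be computed without a hand-written helper. The sketch below assumes p is prime and k is not a multiple of p; the names scramble and inverse_key are illustrative.

```python
def scramble(p, k, n):
    """Multiply by k modulo the prime p."""
    return (n * k) % p

def inverse_key(p, k):
    """Key that undoes scramble(p, k, .), via Fermat's little theorem.

    Assumes p is prime and k is not a multiple of p.
    """
    return pow(k, p - 2, p)  # built-in modular exponentiation

# Round trip on a small prime so the result is easy to check by hand.
p, k = 101, 37
k_inv = inverse_key(p, k)
restored = [scramble(p, k_inv, scramble(p, k, n)) for n in range(1, p)]
```

With the values used in the text, the call would be inverse_key(8031810103, 3661807866).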
Now let's make it harder. Let's generate several keys to be scrambling with:
import random
p = 26**7
for i in range(5):
    p = last_prime_before(p)
    print((p, int(random.random() * p)))
When I ran this I happened to get:
(8031810103, 3661807866)
(8031810097, 3163265427)
(8031810091, 7069619503)
(8031809963, 6528177934)
(8031809917, 991731572)
Now let's scramble through several layers, working from smallest prime to largest (this requires reversing the sequence):
def encode (n):
    for p, k in [
        (8031809917, 991731572)
        , (8031809963, 6528177934)
        , (8031810091, 7069619503)
        , (8031810097, 3163265427)
        , (8031810103, 3661807866)
    ]:
        n = scramble1(p, k, n)
    return number_to_string(n)
This will give a sequence:
ehidzxf
shsifyl
gicmmcm
ofaroeg
And to reverse it just use the same trick that reversed the first scramble (reversing the primes so I am unscrambling in the order that I started with):
def decode (s):
    n = string_to_number(s)
    for p, k in [
        (8031810103, 3319920713)
        , (8031810097, 4707272543)
        , (8031810091, 5077139687)
        , (8031809963, 192273749)
        , (8031809917, 5986071506)
    ]:
        n = scramble1(p, k, n)
    return n
TO BE CLEAR I do NOT promise that this is cryptographically secure. I'm not a cryptographer, and I'm aware enough of my limitations that I know not to trust it.
But I do promise that you'll have a sequence of over 8 billion strings that you are able to encode/decode with no obvious patterns.
Now take this code, scramble the alphabet, regenerate the magic numbers that I used, and choose a different number of layers to go through. I promise you that I personally have absolutely no idea how someone would even approach the problem of figuring out the algorithm. (But then again I'm not a cryptographer. Maybe they have some techniques to try. I sure don't.)
How about :
from random import Random
n = 7
def f(i):
    myrandom = Random()
    myrandom.seed(i)
    alphabet = "123456789"
    return "".join([myrandom.choice(alphabet) for _ in range(n)])
# same entry, same output
assert f(0) == "7715987"
assert f(0) == "7715987"
assert f(0) == "7715987"
# different entry, different output
assert f(1) == "3252888"
(change the alphabet to match your need)
This "emulates" a UUID, since you said you could accept a small chance of collision. If you want to avoid collisions entirely, what you really need is a perfect hash function (https://en.wikipedia.org/wiki/Perfect_hash_function).
You can try something based on the SHA-1 hash:
#!/usr/bin/python3
import hashlib
def generate_link(i):
    n = 7
    a = "abcdefghijklmnopqrstuvwxyz0123456789"  # 36 characters, matching x%36
    return "".join(a[x%36] for x in hashlib.sha1(str(i).encode('ascii')).digest()[-n:])
This is a really simple example of what I outlined in this comment. It just offsets the number based on i. If you want "different" strings, don't use this, because if num is 0, then you will get abcdefg (with n = 7).
alphabet = "abcdefghijklmnopqrstuvwxyz"
# num is the num to convert, i is the "offset"
def num_to_char(num, i):
    return alphabet[(num + i) % 26]
# generate the link
def generate_link(num, n):
    return "".join([num_to_char(num, i) for i in range(n)])
generate_link(0, 7) # "abcdefg"
generate_link(0, 7) # still "abcdefg"
generate_link(0, 7) # again, "abcdefg"!
generate_link(1, 7) # now it's "bcdefgh"!
You would just need to change the num + i to some complicated and obscure math equation.
I am trying to write a function which calculates multiple iteration hashes of a specific value (and output each iteration in the meantime).
However, I can't get my head around how to apply, for instance, the md5 hash function to its own output multiple times. For instance:
a = hashlib.md5('fun').hexdigest()
b = hashlib.md5(a).hexdigest()
c = hashlib.md5(b).hexdigest()
d = hashlib.md5(c).hexdigest()
.......
I think the recursion is the solution, but I just can't seem to implement it properly. This is the general factorial recursion example, but how do I adapt it to hashes:
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)
This is a classic application of generators. Python's default recursion limit is 1000 frames (see sys.getrecursionlimit()), and function calls are comparatively expensive. For anything which might be executed anywhere near that many times, iteration will often be faster. Using a generator allows you to break after any desired number of executions and allows flat usage of the desired logic in your code. The following example prints the output of 10 such executions.
import hashlib
from itertools import islice
def hashes(n):
    while True:
        n = hashlib.md5(n.encode()).hexdigest()  # md5 needs bytes on Python 3
        yield n
for h in islice(hashes('fun'), 10):
    print(h)
In general, you are looking for a loop like
while True:
    x = f(x)
where you repeatedly replace the input with the result of the most recent application.
For your specific example,
def iterated_hash(x):
    while True:
        x = hashlib.md5(x.encode()).hexdigest()  # md5 needs bytes on Python 3
    return x
However, since you don't really want to do this an infinite number of times, you need to supply a count:
def iterated_hash(x, n):
    while True:
        if n == 0:
            return x
        x = hashlib.md5(x.encode()).hexdigest()
        n -= 1  # count down, otherwise the loop never ends for n > 0
or with a for loop,
def iterated_hash(x, n):
    for _ in range(n):
        x = hashlib.md5(x.encode()).hexdigest()
    return x
(Practically speaking, you want to use the for loop, but it's nice to see how the for loop is just a finite special case of the more general infinite loop.)
Just iterate as many times as needed:
def make_hash(text, iterations):
    a = hashlib.md5(text.encode()).hexdigest()
    for _ in range(iterations):
        a = hashlib.md5(a.encode()).hexdigest()
    return a
a = make_hash('fun', 5) # the initial hash plus 5 further iterations
I'm relatively new to the python world, and the coding world in general, so I'm not really sure how to go about optimizing my python script. The script that I have is as follows:
import math
z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0
    calculated = open('Prime_Numbers.txt', 'r')
    readlines = calculated.readlines()
    calculated.close()
    a = len(readlines)
    b = readlines[(a-1)]
    b = int(b) + 1
    for num in range(b, (b+1000)):
        prime = True
        calculated = open('Prime_Numbers.txt', 'r')
        for i in calculated:
            i = int(i)
            q = math.ceil(num/2)
            if (q%i==0):
                prime = False
        if prime:
            calculated.close()
            writeto = open('Prime_Numbers.txt', 'a')
            num = str(num)
            writeto.write("\n" + num)
            writeto.close()
            print(num)
As some of you can probably guess I'm calculating prime numbers. The external file that it calls on contains all the prime numbers between 2 and 20.
The reason that I've got the while loop in there is that I wanted to be able to control how long it ran for.
If you have any suggestions for cutting out any clutter in there could you please respond and let me know, thanks.
Reading and writing to files is very, very slow compared to operations with integers. Your algorithm can be sped up 100-fold by just ripping out all the file I/O:
import itertools
primes = {2} # A set containing only 2
for n in itertools.count(3): # Start counting from 3, by 1
    for prime in primes: # For every prime less than n
        if n % prime == 0: # If it divides n
            break # Then n is composite
    else:
        primes.add(n) # Otherwise, it is prime
        print(n)
A much faster prime-generating algorithm would be a sieve. Here's the Sieve of Eratosthenes, in Python 3:
end = int(input('Generate primes up to: '))
numbers = {n: True for n in range(2, end)} # Assume every number is prime, and then
for n, is_prime in numbers.items(): # (Python 3 only)
    if not is_prime:
        continue # For every prime number
    for i in range(n ** 2, end, n): # Cross off its multiples
        numbers[i] = False
    print(n)
It is very inefficient to keep storing and loading all primes from a file. In general file access is very slow. Instead save the primes to a list or deque. For this initialize calculated = deque() and then simply add new primes with calculated.append(num). At the same time output your primes with print(num) and pipe the result to a file.
When you found out that num is not a prime, you do not have to keep checking all the other divisors. So break from the inner loop:
if q%i == 0:
    prime = False
    break
You do not need to go through all previous primes to check for a new prime. Since each non-prime factorizes into two integers, at least one of the factors has to be less than or equal to sqrt(num). So limit your search to these divisors.
Also the first part of your code irritates me.
z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0
This part seems to do the same as:
for x in range(500):
Also, x limits you to 500 passes of the loop. Why not use a counter instead, which you increase whenever a prime is found and check at the same time, breaking once the limit is reached? This would be more readable in my opinion.
In general you do not need to introduce a limit. You can simply abort the program at any point in time by hitting Ctrl+C.
However, as others already pointed out, your chosen algorithm will perform very poorly for medium or large primes. There are more efficient algorithms to find prime numbers: https://en.wikipedia.org/wiki/Generating_primes, especially https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes.
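The suggestions in this answer (keep the primes in memory, break at the first divisor, stop at the square root, count primes instead of loop passes) can be combined in a short sketch. The function name first_primes and its limit parameter are illustrative, and math.isqrt requires Python 3.8+.

```python
import math

def first_primes(limit):
    """Return the first `limit` primes, testing only prime divisors up to sqrt."""
    primes = []
    candidate = 2
    while len(primes) < limit:      # the counter is simply len(primes)
        root = math.isqrt(candidate)
        is_prime = True
        for p in primes:
            if p > root:            # no divisor can exceed the square root
                break
            if candidate % p == 0:  # found a divisor: stop checking
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes

if __name__ == '__main__':
    for prime in first_primes(20):
        print(prime)  # redirect stdout to a file to store the results
```

Run as a script, the output can be piped to a file, which replaces all of the file handling inside the loop.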
You're writing a blank line to your file, which makes int() raise an exception. Also, I'm guessing you need to rstrip() the newlines off each line.
I'd suggest using two different files - one for initial values, and one for all values - initial and recently computed.
If you can keep your values in memory a while, that'd be a lot faster than going through a file repeatedly. But of course, this will limit the size of the primes you can compute, so for larger values you might return to the iterate-through-the-file method if you want.
For computing primes of modest size, a sieve is actually quite good, and worth a google.
When you get into larger primes, trial division by the first n primes is good, followed by m rounds of Miller-Rabin. If Miller-Rabin probabilistically indicates the number is probably a prime, then you do complete trial division or AKS or similar. Miller Rabin can say "This is probably a prime" or "this is definitely composite". AKS gives a definitive answer, but it's slower.
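A compact sketch of the "trial division by small primes, then Miller-Rabin" step is shown below. Instead of random rounds it uses the first twelve primes as fixed bases, a choice that is known to give correct answers for all n below 2**64; for larger n the result should be read as "probably prime". The function name is illustrative.

```python
def is_probable_prime(n):
    """Miller-Rabin with fixed bases; exact for n < 2**64."""
    small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    # Trial division by the small primes first.
    for p in small_primes:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small_primes:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True
```

A False result is always definitive; only a True result for very large n would call for a slower, exact follow-up test.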
FWIW, I've got a bunch of prime-related code collected together at http://stromberg.dnsalias.org/~dstromberg/primes/
Question: What's the best way to iterate over an integer and find other integers inside it, then throw that integer away if it contains them?
Long Version:
I have been working on my Python skills by trying to make efficient solutions to the problems at Project Euler. After going through around 20 problems, I find that while I can solve the problems my solutions are often inelegant and clunky (i.e., ugly and slow). The way the problems are structured, I think I need to learn some better solutions because more complex stuff is going to compound these inefficiencies.
Anyway, today I'm working on problem 35, which requests all circular primes below 1,000,000. I have produced a list of all primes below 1,000,000 and I built a little framework to spit out permutations of these primes below, for each of which I was planning to test for prime-ness:
def number_switcher(number):
    number = [num for num in str(number)]
    numlist = [''.join(num) for num in list(itertools.permutations(number))]
    return [int(num) for num in numlist]
Running this on all the primes and then testing each possible permutation for prime-ness, as you can imagine, is no way to solve the problem.
Then it occurred to me that I could throw away all numbers that have an even number in them (assuming they're longer than one digit) or any numbers with fives in them before I even start running the permutations.
Here's where I really got lost. Even though it seems simple, I can't figure out after two days of trying, how to throw away any multi-digit number with an even number or a 5 inside of it.
Here's what I tried (assuming a list of all primes below 1,000,000 called here "primes"):
[num for num in primes if any(x for x in '024685' in str(num))] # failed: 'bool' object is not iterable
Then I experimented with the following:
for prime in primes:
    if '0' in str(prime):
        primes.remove(prime)
>>>>len(primes)
4264
This cut my primes list about in half. Okay, so maybe I'm on the right track and I just need an ugly "if '0' in str(prime) or if '2' in str(prime)," etc.
But here's the weird thing: when I examine my 'primes' list, it still has primes with '0's in it. Running the same thing again on the new list of primes, I get the following results:
for prime in primes:
    if '0' in str(prime):
        primes.remove(prime)
>>>>len(primes)
4026
...and again the result drops to:
>>>>len(primes)
3892
....
3861 # again
....
3843 #and again
Maybe I'm missing something obvious here, but it seemed like that first if-test should find any prime with '0' in it and remove all of them?
Lastly, I also tried the following, which seems terrible because it jumps pointlessly back and forth across the str-integer train tracks, but it seemed like it just had to work:
for num in primes:
    for d in str(num):
        if (int(d) % 2 == 0 or int(d) == 5):
            primes.remove(num) # doesn't work: ValueError: list.remove(x): x not in list
        else:
            pass
I feel like I shouldn't be tearing my hair out over this question, but it's making me a little crazy and probably because I've gotten to a point where I'm just trying to hack out a solution, my attempts are getting less lucid.
Here's my question:
What's the best way to iterate over an integer and find other integers inside it, then throw that stupid integer away if it contains them?
Thanks for your help/reading.
Footnote:
This is the first question I have asked here but I have benefitted from this site's guidance for a few months now. Apologies if this question/solution is extant, but I looked for it and could not find a way to cobble together a solution. Most search results come up as "how to tell if an integer is even".
@root is correct that your assumption about how to optimise this is false, but I thought it'd be useful to point out why what you're doing isn't working on a Python level.
Your bool problem:
[num for num in primes if any(x for x in '024685' in str(num))] # failed: 'bool' object is not iterable
'024685' in str(num) is being evaluated first, so it equates to for x in True - which isn't possible. The above would be written as:
[num for num in primes if not any(ch in '024685' for ch in str(num))]
Which takes each character from str(num) and checks to see if it's one of '024685'.
Your list problem:
There's a rule of thumb here - don't modify something you're iterating over. If you were to try this with a dict you'd get an exception thrown, but with a list it usually goes silently wrong (elements get skipped) and only occasionally breaks outright.
When removing more than one value from a list, it's better to build a new list keeping only the required values, eg:
no_zero = [num for num in primes if '0' not in str(num)]
And if you wanted to modify the original primes:
primes[:] = no_zero
Your last example also fails because of .remove(), and putting the above together can be written as:
[num for num in primes if not any(num % i == 0 for i in (2, 5))]
Misc:
You may wish to consider storing the primes in a set, as you're not concerned about the order of the primes and a set is faster for membership testing.
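To see why removing items while iterating leaves some behind, here is a small, self-contained illustration on a five-element list (the sample values are chosen for the example). After each removal the remaining items shift left, so the loop's internal index skips the element that moved into the freed slot.

```python
primes = [101, 103, 107, 109, 113]

# Buggy: mutate the list while iterating over it.
buggy = list(primes)
for prime in buggy:
    if '0' in str(prime):
        buggy.remove(prime)  # the next element slides into this position and is skipped

# Safe: build a new list that keeps only the wanted values.
clean = [prime for prime in primes if '0' not in str(prime)]

print(buggy)  # [103, 109, 113] - two numbers containing '0' survived
print(clean)  # [113]
```

This matches the behaviour in the question, where each repeated pass removed only some of the remaining numbers containing '0'.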
Thanks, Jon Clements. I was able to solve the problem today with the following script (note that I had to make sure the '024685' stripper did not strip out '2' and '5' because those were part of the answer, which threw me off for awhile...):
import math
import itertools
def prime_test(number):
    if all(number % x != 0 for x in range(2, int(math.sqrt(number)) + 1)):
        return True
    else:
        return False
def find_primes(limit):
    primes = [2]
    for x in range(3, limit + 1, 2):
        if prime_test(x):
            primes.append(x)
        else:
            pass
    return primes
def circulate(number):
    circ_list = [number]
    x = 1
    while x < len(str(number)):
        # Rotate the digits left by one; convert back to int so prime_test can use %
        number = int(str(number)[1:] + str(number)[0])
        circ_list.append(number)
        x += 1
    return circ_list
def main():
    primes = find_primes(1000000)
    no_evens = [x for x in primes if not any(ch in '024685' for ch in str(x) if len(str(x)) > 1)]
    circular_primes = []
    for prime in no_evens:
        if all(prime_test(x) for x in circulate(prime)):
            circular_primes.append(prime)
        else:
            pass
    return circular_primes, len(circular_primes)
if __name__ == '__main__':
    print(main())
Another thing I didn't realize is that I was merely supposed to rotate the number, not provide all possible permutations of it. These gotchas probably throw people off when they're trying to solve the problem.
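The difference between rotations and permutations can be made concrete with a short sketch. The helper name rotations is illustrative, and it assumes the number contains no zero digits (which holds after the '024685' filter above), since a leading zero would be lost when converting back to int.

```python
import itertools

def rotations(number):
    """Return every digit rotation of number, starting with number itself."""
    digits = str(number)
    return [int(digits[i:] + digits[:i]) for i in range(len(digits))]

print(rotations(197))  # [197, 971, 719]

# For comparison: a 3-digit number has 3 rotations but 6 permutations.
perms = {int(''.join(p)) for p in itertools.permutations('197')}
print(len(perms))  # 6
```

A d-digit number has only d rotations, compared with up to d! permutations, which is why the rotation-based check is so much cheaper.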