Is gmpy2 suitable for implementing RSA in python? - python

More specifically, is the gmpy2.next_prime function good enough to find the large primes needed? Or should I be using one of the other many gmpy2.*_prp functions?
For example, is the following code good enough for finding suitable primes for encryption?
import os
import gmpy2
def random(bytez):
seed = reduce(lambda a, b: (a << 8)|ord(b), os.urandom(bytez), 0)
return gmpy2.mpz_urandomb(gmpy2.random_state(seed), bytez*8)
def find_prime(bytez=128):
p = random(bytez)|1
while not gmpy2.is_bpsw_prp(p):
p = random(bytez)|1
return p
def good_pair(p, q):
n = p*q
k = gmpy2.ceil(gmpy2.log2(n))
if abs(p - q) > 2**(k/2 - 100):
return n
return 0
def make_rsa_keypair():
p, q = find_prime(), find_prime()
n = good_pair(p, q)
while not n:
p, q = find_prime(), find_prime()
n = good_pair(p, q)
tot = n - (p + q - 1)
e = (1 << 16) + 1
d = gmpy2.invert(e, tot)
return {
'public':{
'n':n,
'e':e,
},
'private':{
'n':n,
'd':d,
}
}
UPDATE: updated the code with the suggestion.

Disclaimer: I maintain gmpy2.
I would recommend using gmpy2.is_bpsw_prp instead of gmpy2.next_prime. The BPSW test will be faster and there are no known counter-examples. The is_prime and next_prime checks used to use, and may still use, a fixed set of bases and it is possible to composites that pass a series of known tests. IIRC, someone found a composite that passed the first 17 checks. By default, 25 checks are done but it is a weakness.
I am planning to include an APR-CL provable primality test in the next release of gmpy2.
There are specific guidelines for selecting RSA primes that should be followed to prevent accidentally choosing primes that create an n that can be easily factored.

Related

Calculating a^b mod p for a large prime p

I'm trying to write a python code that calculates a^b mod p, where p = 10^9+7 for a list of pairs (a,b). The challenge is that the code has to finish the calculation and output the result in < 1 second. I've implemented successive squaring to calculate a^b mod p quickly. Please see my code below:
from sys import stdin, stdout
rl = stdin.readline
wo = stdout.write
m = 10**9+7
def fn(a,n):
t = 1
while n > 0:
if n%2 != 0: #exponent is odd
t = t*a %m
a = a*a %m
n = int(n/2)
return t%m
t = int(rl()) # number of pairs
I = stdin.read().split() # reading all pairs
I = list(map(int,I)) # making pairs a list of integers
# calculating a^b mod p. I used map because I read its faster than a for loop
s = list(map(fn,I[0:2*t:2],I[1:2*t:2]))
stdout.write('\n'.join(map(str,s))) # printing output
for 200000 pairs (a,b) with a,b<10^9, my code takes > 1 second. I'm new to python and was hoping someone could help me identify the time bottle neck in my code. Is it reading input and printing output or the calculation itself? Thanks for the help!
I don't see something wrong with your code from an efficiency standpoint, it's just unnecessarily complicated.
Here's what I'd call the straight-forward solution:
n = int(input())
for _ in range(n):
a, b = map(int, input().split())
print(pow(a, b, 10**9 + 7))
That did get accepted with PyPy3 but not with CPython3. And with PyPy3 it still took 0.93 seconds.
I'd say their time limit is inappropriate for Python. But try yours with PyPy3 if you haven't yet.
In case someone's wondering whether the map wastes time, the following got accepted in 0.92 seconds:
n = int(input())
for _ in range(n):
a, b = input().split()
print(pow(int(a), int(b), 10**9 + 7))

How to find the prime factors of a number with python

I'm writing a program that will calulate the private key for a weak RSA public key. I am wondering how I would go about determining the values for p and q from the value n. Here is the Python code so far:
from Crypto.PublicKey import RSA #PyCryptoDome
import .math as cm # My own module
with open(public_keyfile, 'rb') as key: # Public Keyfile Is in PEM format
public_key = RSA.import_key(key)
n = public_key.n # N value of the public_key
e = public_key.e # E value of the public_key
p, q = get_factors_of(n) # This I don't know how to do, though there is a question that might help [see bottom]
t = cm.lcm(p-1, q-1) # Get the lowest common multiple of q and q
d = cm.mod_inverse(e, t) # Get d, the modular inverse of e % t
private_key = RSA.construct((n, e, d, p, q) # Construct the RSA private_key
The .math module referenced above:
from math import gcd
def mod_inverse(a, b):
a = a % b
for x in range(1, b):
if (a * x) % b == 1:
return x
return 1
def lcm(x, y):
return x * y // gcd(x, y)
What I need to do appears to be referenced
here but this code is in Java.
If anyone knows how to get p and q from n with python, help would be appreciated.
Many thanks, Legorooj.
Mandatory warning: if you are after performance, you will need to investigate the details of the algorithms yourself. Even "weak" public keys will take forever to crack with a simplistic algorithm (e.g. Erathostene's sieve).
That being said, sympy.ntheory.factorint() might be what you need:
from sympy.ntheory import factorint
print(factorint(54)) # {2: 1, 3: 3} i.e. 54 == 2**1 * 3**3
After lots of googling, and pdf reading, I found an algorithm that works. Here is a python implementation:
import math
def get_factors_of(num):
poss_p = math.floor(math.sqrt(num))
if poss_p % 2 == 0: # Only checks odd numbers, it reduces time by orders of magnitude
poss_p += 1
while poss_p < num:
if num % poss_p == 0:
return poss_p
poss_p += 2
This algorithm effectively finds the P/Q factors of a small RSA key. (I have tested it against a 64-bit PEM public key)

Python: Faster Universal Hashing function with built in libs

I am trying to implement the universal hashing function with only base libs:
I am having issues because I am unable to run this in an effective time. I know % is slow so I have tried the following:
((a * x + b) % P) % n
divmod(divmod(a * x + b, P)[1], n)[1]
subeq = pow(a * x + b, 1, P)
hash = pow(subeq, 1, self.n)
All of these function are too slow for what I am trying to do. Is there a faster way to do mod division only using the base libs that I am unaware of?
Edit To elaborate, I will be running this function about 200000 times (or more) and I need for all 200000 runs to complete in under 4 seconds. None of these methods are even in that ball park (taking minutes)
You're not going to do better than ((a * x + b) % P) % m in pure Python code; the overhead of the Python interpreter is going to bottleneck you more than anything else; yes, if you ensure the m is a power of two, you can precompute mm1 = m - 1 and change the computation to ((a * x + b) % P) & mm1, replacing a more expensive remaindering operation with a cheaper bitmasking operation, but unless P is huge (hundreds of bits minimum), the interpreter overhead will likely outweigh the differences between remainder and bitmasking.
If you really need the performance, and the types you're working with will fit in C level primitive type, you may benefit from writing a Python C extension that converts all the values to size_t, Py_hash_t, uint64_t, or whatever suits your problem and performs the math as a set of bulk conversions to C types, C level math, then a single conversion back to Python int, saving a bunch of byte code and intermediate values (that are promptly tossed).
If the values are too large to fit in C primitives, GMP types are an option (look at mpz_import and mpz_export for efficient conversions from PyLong to mpz_t and back), but the odds of seeing big savings go down; GMP does math faster in general, and can mutate numbers in place rather than creating and destroying lots of temporaries, but even with mpz_import and mpz_export, the cost of converting between Python and GMP types would likely eat most of the savings.
from math import ceil, log2
from primesieve import nth_prime #will get nth prime number [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
from random import randint
class UniversalHashing:
""" N = #bins
p = prime number st: p >= N
nth_prime(1, 1 << max(32, ceil(log2(N))))
nth_prime(1,1<<max(32),ceil(log2(2)))))
nth_prime(1,2**32)
nth_prime(1,4294967296)
=4294967311
assert:- Returns Error if condition not satisfied
<< operatior:- multiply with 2the power like 2<<2 =2*2'2=8 or 7*2'3=56 and ceil will give the exact value or next vlue ceil(1)=1 , ceil(1.1)=2
randint:- Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1). """
def __init__(self, N, p = None):
self.N = N
if p is None:
p = nth_prime(1, 1 << max(32, ceil(log2(N))))
assert p >= N, 'Prime number p should be at least N!'
self.p = p
def draw(self):
a = randint(1, self.p - 1)
b = randint(0, self.p - 1)
return lambda x: ((a * x + b) % self.p) % self.N
if __name__ == '__main__':
N = 50 #bins
n = 100000 #elements
H = UniversalHashing(N)
h = H.draw()
T = [0] * N
for _ in range(n):
x = randint(0, n * 10)
T[h(x)] += 1
for i in range(len(T)):
print(T[i] / n) # This should be approximately equal

Testing RSA in Python

I created a small code for testing RSA, but when I try to decrypt a message
with keys that are 6-7 digit long, it takes a while and gives me the wrong
result.
from math import sqrt
def isPrime(n):
x = int(sqrt(n)) + 1
if n < 2:
return False`
for i in range(2, x):
if (n / i).is_integer():
return (i, False
return True
def factor(num):
hold = list()
inum = int(sqrt(num) + 1)
hold.append((1, num))
if num % 2 == 0: hold.append((2, int(num / 2)))
for i in range(3, inum, 2):
x = num / i
if x.is_integer():
hold.append((i, int(x)))
return hold
def egcd(a, b):
#Extended Euclidean Algorithm
x,y, u,v = 0,1, 1,0
while a != 0:
q, r = b//a, b%a
m, n = x-u*q, y-v*q
b,a, x,y, u,v = a,r, u,v, m,n
gcd = b
return y
def fastMod(n, e):
if e == 0:
return 1
if e % 2 == 1:
return n * fastMod(n, e - 1)
p = fastMod(n, e / 2)
return p * p
def decrypt(p, q, em):
#Uses CRT for decrypting
mp = em % p; mq = em % q;
dp = d % (p-1); dq = d % (q-1);
xp = fastMod(mp, dp) % p; xq = fastMod(mq, dq) % q
log = egcd(p, q)
cp = (p-log) if log > 0 else (p+log)
cq = cp
m = (((q*cp)*xp) + ((p*cq)*xq)) % n
return m
def encrypt(pm):
return fastMod(pm, e) % n
Is there any way to improve speed or fix any errors?
I try to decrypt a few messages I made with a key 9-10 digits long, but it takes
too long.
A lot of things need improvement, but most notably:
For RSA encryption/decryption: fastMod( ) should take the modulus as an input parameter, and reduce by the modulus each iteration. I found this code which illustrates the right way to do it.
For parameter generation: In practice, one could never use a function like your isPrime( ) to determine primality because it runs in exponential time. Instead, you should be doing Miller-Rabin / Strong pseudo prime tests, which can use fastMod( ) as a sub-routine.
By the way, you are implementing textbook RSA here, which is hugely insecure. You would need to use padding such as OAEP to have security, but you need to be very careful on how you implement that to prevent various forms of attacks (such as side channel attacks).
As for why you are getting the wrong result, it is hard to tell without seeing all of your code. Maybe you want to include a main function that generates params and tries to use them for encryption and decryption.
EDIT: I did notice this which looks suspicious: log = egcd(p, q). Not sure what you are doing here. I suggest you first compute d as the inverse of e mod (p-1)*(q-1) and verify that you are getting that correct (ie multiply d*e mod (p-1)*(q-1) and make sure the result is 1). If so, then do a fastMod( ) with d to see if it decrypts (it should). Once you get that working, then move on to making CRT work.

Simple RSA code

Hi I am trying to create a working RSA program, but on a very small level, I am having problems encrypting and decrypting with this code, can someone help me figure out what is wrong? I have tried doing this many different ways, but this way seems to be the right math, so I believe it might just be my lack of coding skills? Thanks
import random, math
def RandomPrime():
prime = False
while prime == False:
n = 2
while n % 2 == 0:
n = random.randint(10000, 100000)
s = math.trunc(n**0.5)
s = int(s)
x = 3
# While n doesn't exactly divide to equal 0, and x is less then the sqrt of n
while ( n % x != 0 ) and (x <= s):
x = x + 2
# if n is greater than s, it means it has run out of numbers to test, so is prime
if x > s:
prime = True
return n
def Modulus(p, q):
M = p * q
return M
def Totient(p, q):
T = ((p-1) * (q-1))
return T
def Pubkey(T):
prime = False
while prime == False:
n = 2
while n % 2 == 0:
n = random.randint(3, T)
s = math.trunc(n**0.5)
s = int(s)
x = 3
# While
while ( n % x != 0 ) and (x <= s):
x = x + 2
if x > s:
prime = True
return n
def privkey( T, n):
y = math.fmod(1, T)
d = float((y / n))
return d
# z is my encyption in this scenario
z = 8
# I generate p and q, using my random prime generator, i used low primes in
# this example just to see if it would work but it is still not showing reults
p = RandomPrime()
q = RandomPrime()
print(p, q)
#This creates the modulus
M = Modulus(p, q)
print(M)
# Eulier's totient
T = Totient(p, q)
print(T)
#Pub key creation
n = Pubkey(T)
print(n)
#Priv key creation
d = privkey(n, T)
print(d)
enc = (pow(z, n)) % M
print('enc: ', enc)
dec = (pow(enc, d)) % M
print('dec: ', dec)
Your privkey function appears wrong - I'm guessing you saw the definition of RSA's private key value as something like:
the value "e" such that e * d = 1 mod Phi(N)
However in this case, 1 mod Phi(N) does not mean The remainder when 1 is divided by Phi(N) (which appears to be the way you have translated it into code, based on your use of math.fmod(1, T), but in fact should be read more like:
the value "e" such that (e * d) mod Phi(N) = 1
This value is generally calculated using the Extended Euclidean Algorithm. An example Python implementation is here.
It's also worth noting that you seem to be defining privkey(T, n) but calling it as privkey(n, T).
Check my blog which in detail contains the implementation of the following using python:
MD5 Secure hash Algorithm RFC 1321, RSA public Key cryptography RFC 3447, OpenPGP RFC 4880
def keyGen():
''' Generate Keypair '''
i_p=randint(0,20)
i_q=randint(0,20)
# Instead of Asking the user for the prime Number which in case is not feasible,
# generate two numbers which is much highly secure as it chooses higher primes
while i_p==i_q:
continue
primes=PrimeGen(100)
p=primes[i_p]
q=primes[i_q]
#computing n=p*q as a part of the RSA Algorithm
n=p*q
#Computing lamda(n), the Carmichael's totient Function.
# In this case, the totient function is the LCM(lamda(p),lamda(q))=lamda(p-1,q-1)
# On the Contrary We can also apply the Euler's totient's Function phi(n)
# which sometimes may result larger than expected
lamda_n=int(lcm(p-1,q-1))
e=randint(1,lamda_n)
#checking the Following : whether e and lamda(n) are co-prime
while math.gcd(e,lamda_n)!=1:
e=randint(1,lamda_n)
#Determine the modular Multiplicative Inverse
d=modinv(e,lamda_n)
#return the Key Pairs
# Public Key pair : (e,n), private key pair:(d,n)
return ((e,n),(d,n))
Blog Link :Python Cryptography
Github Link : Python Cryptography

Categories