Testing RSA in Python - python

I wrote a small program for testing RSA, but when I try to decrypt a message
with keys that are 6-7 digits long, it takes a while and gives me the wrong
result.
from math import sqrt
def isPrime(n):
    x = int(sqrt(n)) + 1
    if n < 2:
        return False
    for i in range(2, x):
        if (n / i).is_integer():
            return (i, False)
    return True
def factor(num):
    hold = list()
    inum = int(sqrt(num) + 1)
    hold.append((1, num))
    if num % 2 == 0: hold.append((2, int(num / 2)))
    for i in range(3, inum, 2):
        x = num / i
        if x.is_integer():
            hold.append((i, int(x)))
    return hold
def egcd(a, b):
    #Extended Euclidean Algorithm
    x,y, u,v = 0,1, 1,0
    while a != 0:
        q, r = b//a, b%a
        m, n = x-u*q, y-v*q
        b,a, x,y, u,v = a,r, u,v, m,n
    gcd = b
    return y
def fastMod(n, e):
    if e == 0:
        return 1
    if e % 2 == 1:
        return n * fastMod(n, e - 1)
    p = fastMod(n, e / 2)
    return p * p
def decrypt(p, q, em):
    #Uses CRT for decrypting
    mp = em % p; mq = em % q;
    dp = d % (p-1); dq = d % (q-1);
    xp = fastMod(mp, dp) % p; xq = fastMod(mq, dq) % q
    log = egcd(p, q)
    cp = (p-log) if log > 0 else (p+log)
    cq = cp
    m = (((q*cp)*xp) + ((p*cq)*xq)) % n
    return m
def encrypt(pm):
    return fastMod(pm, e) % n
Is there any way to improve the speed or fix any errors?
I tried to decrypt a few messages I made with a key 9-10 digits long, but it takes
too long.

A lot of things need improvement, but most notably:
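For RSA encryption/decryption: fastMod() should take the modulus as an input parameter and reduce by the modulus at each step; otherwise the intermediate values grow to enormous sizes before the final reduction, which is a large part of why it is slow. I found this code which illustrates the right way to do it.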
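As a rough illustration of that point, here is a minimal sketch of square-and-multiply exponentiation that reduces modulo the modulus after every multiplication. The name mod_pow is made up for this example and is not from the question's code; Python's built-in three-argument pow(base, exponent, modulus) already does the same job and is what you would normally use.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus for a non-negative integer exponent.

    Hypothetical helper for illustration; the built-in pow(base, exponent, modulus)
    is equivalent for these inputs.
    """
    result = 1 % modulus          # handles modulus == 1 correctly
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result


if __name__ == "__main__":
    # Toy RSA parameters (n = 61 * 53); far too small for real use.
    print(mod_pow(65, 17, 3233) == pow(65, 17, 3233))
```

Because every intermediate value stays below modulus squared, the cost grows with the number of bits in the exponent rather than with the size of the unreduced power.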
For parameter generation: In practice, one could never use a function like your isPrime() to determine primality, because trial division takes time exponential in the bit length of n. Instead, you should use Miller-Rabin (strong pseudoprime) tests, which can use fastMod() as a subroutine.
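A minimal sketch of a Miller-Rabin test is shown here, assuming random bases and using the built-in pow in place of fastMod(). The function name, the list of small primes and the default of 20 rounds are choices made for this example, not anything taken from the question.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test (illustrative sketch)."""
    if n < 2:
        return False
    # Quick exit for small primes and their multiples.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                     # this base does not witness compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is composite
    return True                          # probably prime


if __name__ == "__main__":
    print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
    print(is_probable_prime(1022117))    # 1009 * 1013
```

Each round costs one modular exponentiation, so the test stays fast for numbers with hundreds of digits, where trial division is hopeless.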
By the way, you are implementing textbook RSA here, which is hugely insecure. You would need to use padding such as OAEP to have security, but you need to be very careful on how you implement that to prevent various forms of attacks (such as side channel attacks).
As for why you are getting the wrong result, it is hard to tell without seeing all of your code. Maybe you want to include a main function that generates params and tries to use them for encryption and decryption.
EDIT: I did notice this, which looks suspicious: log = egcd(p, q). I'm not sure what you are doing here. I suggest you first compute d as the inverse of e mod (p-1)*(q-1) and verify that you are getting that correct (i.e., multiply d*e mod (p-1)*(q-1) and make sure the result is 1). If so, then do a fastMod() with d to see if it decrypts (it should). Once you get that working, move on to making CRT work.
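The order of steps suggested above could look roughly like this. It is a sketch with toy parameters; the function names are invented for the example, pow(e, -1, phi) needs Python 3.8 or later, and the CRT step uses Garner's recombination, which is one common way to do it rather than a reconstruction of the asker's decrypt().

```python
from math import gcd

def make_private_exponent(p, q, e):
    """Compute d = e^-1 mod (p-1)*(q-1) and check it (hypothetical helper)."""
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)*(q-1)")
    d = pow(e, -1, phi)              # modular inverse, Python 3.8+
    assert (d * e) % phi == 1        # the sanity check described above
    return d

def decrypt_plain(c, d, n):
    """Step 1: plain decryption with a single modular exponentiation."""
    return pow(c, d, n)

def decrypt_crt(c, d, p, q):
    """Step 2: the same result via the Chinese Remainder Theorem (Garner's formula)."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)            # inverse of q modulo p
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q


if __name__ == "__main__":
    p, q, e = 61, 53, 17             # toy values only
    n = p * q
    d = make_private_exponent(p, q, e)
    c = pow(65, e, n)
    print(decrypt_plain(c, d, n), decrypt_crt(c, d, p, q))
```

Getting decrypt_plain to round-trip first makes it much easier to tell whether a later wrong answer comes from the key or from the CRT recombination.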

Related

Let n be a square number. Using Python, how can we efficiently calculate natural numbers y up to a limit l such that n+y^2 is again a square number?

Using Python, I would like to implement a function that takes a natural number n as input and outputs a list of natural numbers [y1, y2, y3, ...] such that n + y1*y1 and n + y2*y2 and n + y3*y3 and so forth is again a square.
What I tried so far is to obtain one y-value using the following function:
def find_square(n:int) -> tuple[int, int]:
    if n%2 == 1:
        y = (n-1)//2
        x = n+y*y
        return (y,x)
    return None
It works fine, e.g. find_square(13689) gives me a correct solution y=6844. It would be great to have an algorithm that yields all possible y-values, such as y=44 or y=156.
The simplest (slow) approach is, of course, to iterate over all possible Y for the given N and check whether N + Y^2 is a square.
But there is a much faster approach using integer factorization:
To solve the equation N + Y^2 = X^2, that is, to find all integer pairs (X, Y) for a given fixed integer N, we can rewrite it as N = X^2 - Y^2 = (X + Y) * (X - Y), which follows from the well-known difference-of-squares formula.
Now let's name the two factors A and B, i.e. N = (X + Y) * (X - Y) = A * B, which means that X = (A + B) / 2 and Y = (A - B) / 2.
Notice that A and B must have the same parity, either both odd or both even; otherwise the divisions by 2 in the formulas above would not be exact.
We will factorize N into all possible pairs of two factors (A, B) of the same parity. For fast factorization, the code below uses Pollard Rho, which is simple to implement yet quite fast. Two extra algorithms are needed as helpers for Pollard Rho: the Fermat primality test (which quickly checks whether a number is probably prime) and trial division factorization (which strips out the small factors that could cause Pollard Rho to fail).
Pollard Rho for a composite number has time complexity O(N^(1/4)), which is very fast even for 64-bit numbers. Any faster factorization algorithm can be substituted if a bigger search space is needed. The running time of my fast algorithm is dominated by the factorization; the rest is very fast, just a few loop iterations with simple formulas.
If your N is itself a square (so its root is easy to find), then Pollard Rho can factor N much faster still, within O(N^(1/8)) time. Even for 128-bit numbers that means very little time, about 2^16 operations, and I hope you're solving your task for numbers smaller than 128 bits.
If you want to process a range of possible N values, then the fastest way to factorize them is a technique similar to the Sieve of Eratosthenes: using a set of prime numbers, it computes all factors for all N within some range. For a range of Ns, this is much faster than factorizing each N with Pollard Rho.
After factoring N into pairs (A, B), we compute (X, Y) from (A, B) using the formulas above and output the resulting Y values as the solutions of the fast algorithm.
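To make the (A, B) idea concrete before the full implementation, here is a much smaller sketch that finds the factor pairs by plain trial division instead of Pollard Rho. The function name square_offsets is made up for this example, and the approach is only practical for modest N, since it loops up to sqrt(N).

```python
from math import isqrt

def square_offsets(n):
    """Return all y >= 0, sorted, such that n + y*y is a perfect square (n >= 1).

    Illustrative sketch: enumerates factor pairs n = a * b with b <= a by
    trial division and keeps the pairs whose factors have the same parity.
    """
    ys = []
    for b in range(1, isqrt(n) + 1):
        if n % b:
            continue
        a = n // b
        if (a - b) % 2 == 0:         # same parity, so Y = (A - B) / 2 is an integer
            ys.append((a - b) // 2)
    return sorted(ys)


if __name__ == "__main__":
    print(square_offsets(13689))
    # → [0, 44, 156, 240, 520, 756, 2280, 6844]
```

For N = 13689 this includes the values 44, 156 and 6844 mentioned in the question; y = 0 appears because N is itself a square.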
The following example code is implemented in pure Python. Of course, one can use Numba to speed it up; Numba usually gives a 30-200x speedup and can bring Python close to the speed of optimized C++. But I thought the main thing here was to implement a fast algorithm; Numba optimizations can easily be added afterwards.
I added time measurement to the code. Although it is pure Python, my fast algorithm still achieves a speedup of about 8500x over the regular brute-force approach for a limit of about 1 000 000.
You can change the limit variable to adjust the size of the search space, or the num_tests variable to adjust the number of tests.
The code implements both solutions: the fast solution find_fast() described above, plus a tiny brute-force solution find_slow(), which is very slow because it scans all possible candidates. The slow solution is only used to check correctness in the tests and to measure the speedup.
The code uses only a few standard Python library modules; no external modules are needed.
def find_slow(N):
    import math
    def is_square(x):
        root = int(math.sqrt(float(x)) + 0.5)
        return root * root == x, root
    l = []
    for y in range(N):
        if is_square(N + y ** 2)[0]:
            l.append(y)
    return l
def find_fast(N):
    import itertools, functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = factor(N)
    mfs = {}
    for e in fs:
        mfs[e] = mfs.get(e, 0) + 1
    fs = sorted(mfs.items())
    del mfs
    Ys = set()
    for take_a in itertools.product(*[
        (range(v + 1) if k != 2 else range(1, v)) for k, v in fs]):
        A = Prod([p ** t for (p, _), t in zip(fs, take_a)])
        B = N // A
        assert A * B == N, (N, A, B, take_a)
        if A < B:
            continue
        X = (A + B) // 2
        Y = (A - B) // 2
        assert N + Y ** 2 == X ** 2, (N, A, B, X, Y)
        Ys.add(Y)
    return sorted(Ys)
def trial_div_factor(n, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    all_checked = False
    for d in range(3, (limit or n) + 1, 2):
        if d * d > n:
            all_checked = True
            break
        while True:
            q, r = divmod(n, d)
            if r != 0:
                break
            fs.append(d)
            n = q
    if n > 1 and all_checked:
        fs.append(n)
        n = 1
    return fs, n
def fermat_prp(n, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True
def pollard_rho_factor(n):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random
    fs, n = trial_div_factor(n, 1 << 7)
    if n <= 1:
        return fs
    if fermat_prp(n):
        return sorted(fs + [n])
    for itry in range(8):
        failed = False
        x = random.randint(2, n - 2)
        for cycle in range(1, 1 << 60):
            y = x
            for i in range(1 << cycle):
                x = (x * x + 1) % n
                d = math.gcd(x - y, n)
                if d == 1:
                    continue
                if d == n:
                    failed = True
                    break
                return sorted(fs + pollard_rho_factor(d) + pollard_rho_factor(n // d))
            if failed:
                break
    assert False, f'Pollard Rho failed! n = {n}'
def factor(N):
    import functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = pollard_rho_factor(N)
    assert N == Prod(fs), (N, fs)
    return sorted(fs)
def test():
    import random, time
    limit = 1 << 20
    num_tests = 20
    t0, t1 = 0, 0
    for i in range(num_tests):
        if (round(i / num_tests * 1000)) % 100 == 0 or i + 1 >= num_tests:
            print(f'test {i}, ', end = '', flush = True)
        # Start at 1: trial_div_factor(0) would never terminate
        N = random.randrange(1, limit)
        tb = time.time()
        r0 = find_slow(N)
        t0 += time.time() - tb
        tb = time.time()
        r1 = find_fast(N)
        t1 += time.time() - tb
        assert r0 == r1, (N, r0, r1, t0, t1)
    print(f'\nTime slow {t0:.05f} sec, fast {t1:.05f} sec, speedup {round(t0 / max(1e-6, t1))} times')
if __name__ == '__main__':
    test()
Output:
test 0, test 2, test 4, test 6, test 8, test 10, test 12, test 14, test 16, test 18, test 19,
Time slow 26.28198 sec, fast 0.00301 sec, speedup 8732 times
For the easiest solution, you can try this:
import math
n=13689 #or we can ask user to input a square number.
for i in range(1,9999):
    if math.sqrt(n+i**2).is_integer():
        print(i)

How to find the prime factors of a number with python

I'm writing a program that will calculate the private key for a weak RSA public key. I am wondering how I would go about determining the values of p and q from the value n. Here is the Python code so far:
from Crypto.PublicKey import RSA  # PyCryptodome
from . import math as cm  # My own module
with open(public_keyfile, 'rb') as key:  # Public keyfile is in PEM format
    public_key = RSA.import_key(key.read())  # import_key expects the key data, not the file object
n = public_key.n  # N value of the public_key
e = public_key.e  # E value of the public_key
p, q = get_factors_of(n)  # This I don't know how to do, though there is a question that might help [see bottom]
t = cm.lcm(p-1, q-1)  # Get the lowest common multiple of p-1 and q-1
d = cm.mod_inverse(e, t)  # Get d, the modular inverse of e % t
private_key = RSA.construct((n, e, d, p, q))  # Construct the RSA private_key
The .math module referenced above:
from math import gcd
def mod_inverse(a, b):
    a = a % b
    for x in range(1, b):
        if (a * x) % b == 1:
            return x
    return 1
def lcm(x, y):
    return x * y // gcd(x, y)
What I need to do appears to be referenced here, but this code is in Java.
If anyone knows how to get p and q from n with Python, help would be appreciated.
Many thanks, Legorooj.
Mandatory warning: if you are after performance, you will need to investigate the details of the algorithms yourself. Even "weak" public keys will take forever to crack with a simplistic algorithm (e.g. the sieve of Eratosthenes).
That being said, sympy.ntheory.factorint() might be what you need:
from sympy.ntheory import factorint
print(factorint(54)) # {2: 1, 3: 3} i.e. 54 == 2**1 * 3**3
After lots of googling and PDF reading, I found an algorithm that works. Here is a Python implementation:
import math
def get_factors_of(num):
    # Exact integer square root (Python 3.8+); float sqrt loses precision on large n
    poss_p = math.isqrt(num)
    if poss_p % 2 == 0:  # Only check odd candidates, which halves the work
        poss_p += 1
    while poss_p < num:
        if num % poss_p == 0:
            # Return both factors so that `p, q = get_factors_of(n)` works
            return num // poss_p, poss_p
        poss_p += 2
This algorithm finds the P/Q factors of a small RSA key by searching upward from the square root of n. (I have tested it against a 64-bit PEM public key.)
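For a quick end-to-end check of the whole recovery, here is a self-contained sketch with toy numbers and no PyCryptodome dependency. The helper names are invented for this example, pow(e, -1, lam) requires Python 3.8+, and the upward search is only practical when the modulus is tiny or its two primes are close together.

```python
from math import gcd, isqrt

def factor_semiprime(n):
    """Search odd candidates upward from sqrt(n); return (p, q) with p <= q.

    Illustrative sketch for an odd n that is a product of two primes.
    """
    cand = isqrt(n)
    if cand % 2 == 0:
        cand += 1
    while cand < n:
        if n % cand == 0:
            return n // cand, cand
        cand += 2
    raise ValueError("no nontrivial odd factor found")

def private_exponent(e, p, q):
    """d = e^-1 mod lcm(p-1, q-1), matching the question's use of lcm."""
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    return pow(e, -1, lam)


if __name__ == "__main__":
    n, e = 3233, 17                      # toy public key, n = 53 * 61
    p, q = factor_semiprime(n)
    d = private_exponent(e, p, q)
    message = 65
    print(p, q, d, pow(pow(message, e, n), d, n) == message)
```

The recovered (n, e, d, p, q) tuple is what the question then passes to RSA.construct.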

Optimizing Python Code for Competitions

I'm new to Python, and I'm trying to get familiar with it by solving problems on CodeChef. I'm attempting to solve the Easy problem Number Game. The issue is that the execution time is too long for my code.
I have translated the Python solution I wrote into C++, and the submission was accepted, so I know I have a correct answer, and it's just off by a constant multiple.
Is it possible to solve this problem in Python 3 in the allotted time? Can you help me speed up my code to accomplish this?
import time
def getStartValues(A, M):
    startVals = [0]*M
    b = [0]*len(A)
    for i in range(len(A)-1):
        b[i+1] = (10*b[i] + A[i]) % M
    f = 0
    power = 1
    for i in range(len(A)-1,0,-1):
        startVals[(b[i]*power + f) % M] += 1
        f = (A[i]*power + f) % M
        power = (power*10 % M)
    startVals[f] += 1
    return startVals, power
def checkValues(i, startVals, M, powNm1, checked, chklst):
    if checked[i] == 1:
        return startVals[i]
    q = [i]
    chk = [0]*M
    chk[i] = 1
    while len(q) > 0:
        val = q.pop(0)
        for j in chklst:
            val2 = (powNm1*val + j) % M
            if checked[val2] > 0:
                checked[i] = 1
                return startVals[i]
            elif chk[val2] == 0:
                q.append(val2)
                chk[val2] = 1
    return 0
def compute(A, M):
    startVals, power = getStartValues(A, M)
    checked = [0]*M
    checked[0] = 1
    chklst = [j for j in range(M) if startVals[j] > 0]
    total = 0
    for i in chklst:
        c = checkValues(i, startVals, M, power, checked, chklst)
        total += c
    return total
start = time.time()
file = open('numbgame.in', 'r')
#T = int(input())
T = int(file.readline())
for i in range(T):
    #A, M = input().split()
    A, M = file.readline().split()
    A = list(map(int,A))
    M = int(M)
    print(compute(A, M))
tDiff = time.time() - start
print('Total time: %s' % tDiff)
Note that I have modified the code to read from a file and to display execution time, as a convenience, and some small alterations are needed before it can be submitted.
getStartValues takes the (big) list of digits of the input A and the (small) integer M, and returns, for each value modulo M, the number of ways it can be generated from A by removing a single digit.
checkValues takes an index i, the list startVals, the integer M, the integer powNm1 (which is the value 10^(n-1) mod M, where n is the number of digits in A), a list checked that keeps track of whether a value has already been determined to be solvable, and the list chklst (which contains the indices i such that startVals[i] > 0).
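As a cross-check on what getStartValues is described as computing, a brute-force reference might look like the sketch below. The function name is hypothetical, and it runs in quadratic time, so it is only useful for validating a faster version on small inputs, not for submission.

```python
def start_values_bruteforce(digits, m):
    """For each residue r, count the single-digit removals that leave a
    number congruent to r modulo m (illustrative reference implementation)."""
    counts = [0] * m
    for skip in range(len(digits)):
        value = 0
        for pos, digit in enumerate(digits):
            if pos != skip:                       # drop exactly one digit
                value = (value * 10 + digit) % m
        counts[value] += 1
    return counts


if __name__ == "__main__":
    # Removing one digit of 123 gives 23, 13 and 12, i.e. 2, 6 and 5 modulo 7.
    print(start_values_bruteforce([1, 2, 3], 7))
    # → [0, 0, 1, 0, 0, 1, 1]
```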
The majority of the time is spent in the function getStartValues, since A could be up to 10^6 digits long. On my desktop, the getStartValues function call takes about 1.2s, while the rest of the compute function takes about 0.04s (for worst case inputs).

Is gmpy2 suitable for implementing RSA in python?

More specifically, is the gmpy2.next_prime function good enough to find the large primes needed? Or should I be using one of the other many gmpy2.*_prp functions?
For example, is the following code good enough for finding suitable primes for encryption?
import os
import gmpy2
def random(bytez):
    # int.from_bytes replaces the Python 2 reduce/ord idiom so this runs on Python 3
    seed = int.from_bytes(os.urandom(bytez), 'big')
    return gmpy2.mpz_urandomb(gmpy2.random_state(seed), bytez*8)
def find_prime(bytez=128):
    p = random(bytez)|1
    while not gmpy2.is_bpsw_prp(p):
        p = random(bytez)|1
    return p
def good_pair(p, q):
    n = p*q
    k = gmpy2.ceil(gmpy2.log2(n))
    if abs(p - q) > 2**(k/2 - 100):
        return n
    return 0
def make_rsa_keypair():
    p, q = find_prime(), find_prime()
    n = good_pair(p, q)
    while not n:
        p, q = find_prime(), find_prime()
        n = good_pair(p, q)
    tot = n - (p + q - 1)
    e = (1 << 16) + 1
    d = gmpy2.invert(e, tot)
    return {
        'public':{
            'n':n,
            'e':e,
        },
        'private':{
            'n':n,
            'd':d,
        }
    }
UPDATE: updated the code with the suggestion.
Disclaimer: I maintain gmpy2.
I would recommend using gmpy2.is_bpsw_prp instead of gmpy2.next_prime. The BPSW test will be faster and there are no known counter-examples. The is_prime and next_prime checks used to use, and may still use, a fixed set of bases, and it is possible to construct composites that pass a series of known tests. IIRC, someone found a composite that passed the first 17 checks. By default, 25 checks are done, but it is a weakness.
I am planning to include an APR-CL provable primality test in the next release of gmpy2.
There are specific guidelines for selecting RSA primes that should be followed to prevent accidentally choosing primes that create an n that can be easily factored.

Simple RSA code

Hi, I am trying to create a working RSA program on a very small scale, but I am having problems encrypting and decrypting with this code. Can someone help me figure out what is wrong? I have tried doing this many different ways, and this way seems to be the right math, so I believe it might just be my lack of coding skills. Thanks.
import random, math
def RandomPrime():
    prime = False
    while prime == False:
        n = 2
        while n % 2 == 0:
            n = random.randint(10000, 100000)
        s = math.trunc(n**0.5)
        s = int(s)
        x = 3
        # While x doesn't divide n exactly, and x is less than or equal to the sqrt of n
        while ( n % x != 0 ) and (x <= s):
            x = x + 2
        # if x is greater than s, it means it has run out of numbers to test, so n is prime
        if x > s:
            prime = True
    return n
def Modulus(p, q):
    M = p * q
    return M
def Totient(p, q):
    T = ((p-1) * (q-1))
    return T
def Pubkey(T):
    prime = False
    while prime == False:
        n = 2
        while n % 2 == 0:
            n = random.randint(3, T)
        s = math.trunc(n**0.5)
        s = int(s)
        x = 3
        # While
        while ( n % x != 0 ) and (x <= s):
            x = x + 2
        if x > s:
            prime = True
    return n
def privkey( T, n):
    y = math.fmod(1, T)
    d = float((y / n))
    return d
# z is my message to encrypt in this scenario
z = 8
# I generate p and q, using my random prime generator, I used low primes in
# this example just to see if it would work but it is still not showing results
p = RandomPrime()
q = RandomPrime()
print(p, q)
#This creates the modulus
M = Modulus(p, q)
print(M)
# Euler's totient
T = Totient(p, q)
print(T)
#Pub key creation
n = Pubkey(T)
print(n)
#Priv key creation
d = privkey(n, T)
print(d)
enc = (pow(z, n)) % M
print('enc: ', enc)
dec = (pow(enc, d)) % M
print('dec: ', dec)
Your privkey function appears wrong. I'm guessing you saw the definition of RSA's private key value as something like:
the value "d" such that e * d = 1 mod Phi(N)
However, in this case, 1 mod Phi(N) does not mean "the remainder when 1 is divided by Phi(N)" (which appears to be the way you have translated it into code, based on your use of math.fmod(1, T)); it should be read more like:
the value "d" such that (e * d) mod Phi(N) = 1
This value is generally calculated using the Extended Euclidean Algorithm. An example Python implementation is here.
It's also worth noting that you seem to be defining privkey(T, n) but calling it as privkey(n, T).
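One way the Extended Euclidean Algorithm could be used for this is sketched below. The function names are illustrative, the parameter order (e first, then Phi(N)) is a choice made for this example, and the numbers are toy values.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def mod_inverse(e, phi):
    """Return d with (e * d) % phi == 1, or raise if no inverse exists."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi are not coprime, so no inverse exists")
    return x % phi                      # normalise into the range [0, phi)


if __name__ == "__main__":
    phi = (61 - 1) * (53 - 1)           # toy primes p = 61, q = 53
    d = mod_inverse(17, phi)
    print(d, (17 * d) % phi)
```

All values stay integers throughout, which matters here: the float arithmetic in the question's privkey cannot produce a usable exponent.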
Check my blog, which covers in detail the implementation of the following using Python:
MD5 Secure Hash Algorithm (RFC 1321), RSA public-key cryptography (RFC 3447), OpenPGP (RFC 4880)
def keyGen():
    ''' Generate Keypair '''
    i_p=randint(0,20)
    i_q=randint(0,20)
    # Instead of asking the user for the prime numbers, which is not feasible,
    # pick two random indices into a generated list of primes
    while i_p==i_q:
        # Re-draw; a bare `continue` here would loop forever when the indices match
        i_q=randint(0,20)
    primes=PrimeGen(100)
    p=primes[i_p]
    q=primes[i_q]
    #computing n=p*q as a part of the RSA Algorithm
    n=p*q
    #Computing lamda(n), the Carmichael's totient function.
    # In this case, lamda(n) = LCM(lamda(p), lamda(q)) = LCM(p-1, q-1)
    # Alternatively, we could use Euler's totient function phi(n),
    # which may give a larger value than necessary
    lamda_n=int(lcm(p-1,q-1))
    e=randint(1,lamda_n)
    #checking the following: whether e and lamda(n) are co-prime
    while math.gcd(e,lamda_n)!=1:
        e=randint(1,lamda_n)
    #Determine the modular multiplicative inverse
    d=modinv(e,lamda_n)
    #return the key pairs
    # Public key pair: (e,n), private key pair: (d,n)
    return ((e,n),(d,n))
Blog Link :Python Cryptography
Github Link : Python Cryptography
