Making RSA keys from a password in python


I want to be able to generate and re-generate the same RSA keys from a password (and salt) alone in python.
Currently I am doing it using pycrypto; however, it does not seem to generate exactly the same keys from the password alone. The reason seems to be that when pycrypto generates an RSA key, it uses a random number source internally.
Currently my code looks as follows:
import DarkCloudCryptoLib as dcCryptoLib #some custom library for crypto
from Crypto.PublicKey import RSA
password = "password"
new_key1 = RSA.generate(1024) #rsaObj
exportedKey1 = new_key1.exportKey('DER', password, pkcs=1)
key1 = RSA.importKey(exportedKey1)
new_key2 = RSA.generate(1024) #rsaObj
exportedKey2 = new_key2.exportKey('DER', password, pkcs=1)
key2 = RSA.importKey(exportedKey2)
print dcCryptoLib.equalRSAKeys(key1, key2) #wish to return True but it doesn't
I don't really care if I have to not use pycrypto, as long as I can generate these RSA keys from passwords and salts alone.
Thanks for the help in advance.
Just for reference, this is what the dcCryptoLib.equalRSAKeys(key1, key2) function looks like:
def equalRSAKeys(rsaKey1, rsaKey2):
    public_key = rsaKey1.publickey().exportKey("DER")
    private_key = rsaKey1.exportKey("DER")
    pub_new_key = rsaKey2.publickey().exportKey("DER")
    pri_new_key = rsaKey2.exportKey("DER")
    boolprivate = (private_key == pri_new_key)
    boolpublic = (public_key == pub_new_key)
    return (boolprivate and boolpublic)
NOTE: Also, I am only using RSA for authentication, so any solution that provides a way of generating and verifying secure asymmetric signatures from passwords is acceptable for my application. That said, I feel that generating RSA keys from passwords is a question that should also be answered, as it seems useful if used correctly.

If you're trying to implement an authenticated encryption scheme using a shared password, you don't really need an RSA key: all you need is an AES key for encryption and an HMAC key for authentication.
If you do need to generate an asymmetric signature that can be verified without knowing the password, you're going to have to somehow generate RSA (or DSA, etc.) keys in a deterministic manner based on the password. Based on the documentation, this should be possible by defining a custom randfunc, something like this:
from Crypto.Protocol.KDF import PBKDF2
from Crypto.PublicKey import RSA
password = "swordfish" # for testing
salt = "yourAppName" # replace with random salt if you can store one
master_key = PBKDF2(password, salt, count=10000) # bigger count = better
def my_rand(n):
    # kluge: use PBKDF2 with count=1 and incrementing salt as deterministic PRNG
    my_rand.counter += 1
    return PBKDF2(master_key, "my_rand:%d" % my_rand.counter, dkLen=n, count=1)
my_rand.counter = 0
RSA_key = RSA.generate(2048, randfunc=my_rand)
I've tested this, and it does generate deterministic RSA keys (as long as you remember to reset the counter, at least). However, note that this is not 100% future-proof: the generated keys might change, if the pycrypto RSA key generation algorithm is changed in some way.
In either case, you'll almost certainly want to preprocess your password using a slow key-stretching KDF such as PBKDF2, with an iteration count as high as you can reasonably tolerate. This makes breaking your system by brute-force password guessing considerably less easy. (Of course, you still need to use strong passwords; no amount of key-stretching is going to help if your password is abc123.)
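To illustrate the counter-reset caveat without depending on pycrypto, here is a minimal sketch of the same idea using only the standard library's hashlib.pbkdf2_hmac. The name make_randfunc is made up for this example, and the choice of SHA-256 is an assumption. Each call to the factory starts with a fresh counter, so two generators built from the same password and salt hand out identical byte sequences:

```python
import hashlib

def make_randfunc(password: bytes, salt: bytes, iterations: int = 10_000):
    """Return a deterministic randfunc(n) seeded from password and salt.

    Hypothetical helper for illustration; not part of any crypto library.
    """
    # Slow step, done once: stretch the password into a master key.
    master_key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    counter = 0

    def randfunc(n: int) -> bytes:
        nonlocal counter
        counter += 1
        # Fast step: a single PBKDF2 iteration with a counter-based salt.
        block_salt = b"my_rand:%d" % counter
        return hashlib.pbkdf2_hmac("sha256", master_key, block_salt, 1, dklen=n)

    return randfunc

first = make_randfunc(b"swordfish", b"yourAppName")
second = make_randfunc(b"swordfish", b"yourAppName")
same_stream = [first(16), first(32)] == [second(16), second(32)]
print(same_stream)  # True
```

Because the counter lives inside the closure, there is no global state to forget to reset: calling make_randfunc again gives a fresh, identical stream.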

Pass "randfunc" to the RSA.generate, and randfunc should return the output bytes, in order, of a well-known key derivation function that has been configured with enough output bits for RSA to "always complete" without needing more bits.
Argon2, scrypt, PBKDF2 are examples of KDFs designed for this purpose.
It may be possible to use Keccak directly as a KDF by specifying a high number of output bits.
If your generation function follows a well known standard closely, it should work across multiple implementations.
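As a rough sketch of the "output bytes, in order" idea, the class below hands out successive slices of a single SHAKE-256 (Keccak-based) output stream using only hashlib. ShakeStream is a hypothetical name, and seeding it from PBKDF2 is an assumption made for this example. Whether a given RSA implementation consumes the bytes in the same way across versions is not something this sketch can guarantee.

```python
import hashlib

class ShakeStream:
    """Hands out successive bytes of one SHAKE-256 output stream.

    Hypothetical helper for illustration only.
    """

    def __init__(self, seed: bytes):
        self._seed = seed
        self._offset = 0

    def read(self, n: int) -> bytes:
        # hashlib cannot squeeze incrementally, so recompute the prefix
        # up to the new offset and return only the new slice.
        end = self._offset + n
        chunk = hashlib.shake_256(self._seed).digest(end)[self._offset:end]
        self._offset = end
        return chunk

# The seed should come from a slow KDF so that password guessing stays expensive.
seed = hashlib.pbkdf2_hmac("sha256", b"swordfish", b"yourAppName", 10_000)
stream = ShakeStream(seed)
pieces = stream.read(8) + stream.read(8)
print(pieces == hashlib.shake_256(seed).digest(16))  # True
```

The read method has the same shape as a randfunc (takes a byte count, returns that many bytes), so stream.read could in principle be passed wherever such a callable is accepted.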

Related

Is this AES GCM file encryption good practice?

I'm using this to encrypt a file, and then to decrypt a file, using AES-GCM:
(do pip install pycryptodome first if not installed yet)
import Crypto.Random, Crypto.Protocol.KDF, Crypto.Cipher.AES
def cipherAES_GCM(pwd, nonce):
    key = Crypto.Protocol.KDF.PBKDF2(pwd, nonce, count=100_000)
    return Crypto.Cipher.AES.new(key, Crypto.Cipher.AES.MODE_GCM, nonce=nonce)
# encrypt
plaintext = b'HelloHelloHelloHelloHelloHelloHello' # in reality, read from a file
key = b'mykey'
nonce = Crypto.Random.new().read(16)
c, tag = cipherAES_GCM(key, nonce).encrypt_and_digest(plaintext)
ciphertext = nonce + tag + c # write ciphertext to disk as the "encrypted file"
# decrypt
nonce, tag, c = ciphertext[:16], ciphertext[16:32], ciphertext[32:] # read from the "encrypted file" on disk
plain = cipherAES_GCM(key, nonce).decrypt_and_verify(c, tag).decode()
print(plain) # HelloHelloHelloHelloHelloHelloHello
Is this considered good encryption practice, and what are the potential weaknesses of this file encryption implementation?
Remark: I have 10,000 files to encrypt. If I call the KDF (with a high count value) every single time I encrypt a file, this will be highly inefficient!
A better solution would be: call the KDF only once (with a nonce1), and then for each file do:
nonce2 = Crypto.Random.new().read(16)
cipher, tag = AES.new(key, AES.MODE_GCM, nonce=nonce2).encrypt_and_digest(plain)
But then does this mean I have to write nonce1 | nonce2 | ciphertext | tag to disk for each file? This adds an additional 16-byte nonce1 to each file...

A suggestion for improving your code would be to use a 12-byte nonce for GCM. Currently a 16-byte nonce is used, and this should be changed.
Crucial for the security of GCM is that no key/nonce pair is used more than once. Since your code generates a random nonce for each encryption, this issue is prevented.
Your code also uses the nonce as the salt for the key derivation, which is in principle not a security problem, as it does not lead to multiple use of the same key/nonce pair.
However, a possible disadvantage is that the salt length is then determined by the nonce length. If this is not desired (e.g. if a larger salt should be used), an alternative approach would be to generate a random salt for each encryption and derive both the key and the nonce from it via the KDF. In this scenario, the concatenated data salt | ciphertext | tag would be passed to the recipient. Another alternative would be to completely separate nonce and key generation and to generate both a random nonce and a random salt for each encryption. In this case, the concatenated data salt | nonce | ciphertext | tag would have to be passed to the recipient. Note that, like the nonce and the tag, the salt is not secret, so it can be sent along with the ciphertext.
The code applies an iteration count of 100,000. In general, the iteration count should be as high as can be tolerated in your environment while maintaining acceptable performance. If 100,000 meets this criterion for your environment, then this is OK.
The concatenation order you use is nonce | tag | ciphertext. This is not a problem as long as both sides know this. Often by convention, the nonce | ciphertext | tag order is used (e.g. Java implicitly appends the tag to the ciphertext), which could also be used in the code if you want to stick to this convention.
It is also important that an up-to-date, maintained library is used, which is the case with PyCryptodome (unlike its predecessor, the legacy PyCrypto, which should not be used at all).
Edit:
The PBKDF2 implementation of PyCryptodome generates a 16-byte key by default, which corresponds to AES-128, and uses HMAC/SHA1 as the digest by default. The posted code uses these default parameters, none of which are insecure, but they can of course be changed if necessary.
Note: Although SHA1 itself is insecure, this does not apply in the context of PBKDF2 or HMAC. However, to support the extinction of SHA1 from the ecosystem, SHA256 could be used.
Edit: (regarding the update of the question):
The use case presented in the edited question is the encryption of 10,000 files. The posted code is executed for each file, so a corresponding number of keys are generated via the KDF, which leads to a corresponding loss of performance. You describe this as highly inefficient. However, it should not be forgotten that the current code focuses on security rather than performance. In my answer I pointed out that, for example, the iteration count is a parameter that allows tuning between performance and security within certain limits.
A PBKDF (password-based key derivation function) makes it possible to derive a key from a weak password. To keep the encryption secure, the derivation time is intentionally increased so that (ideally) an attacker cannot crack the weak password faster than a strong key. If the derivation time is shortened (e.g. by decreasing the iteration count or by using the same key more than once), this generally reduces security. In short, a performance gain (from a faster PBKDF) generally reduces security. This leaves a certain leeway for more performant (but weaker) solutions.
The more performant solution you suggest is the following: As before, a random nonce is generated for each file. But instead of encrypting each file with its own key, all files are encrypted with the same key. For this purpose, a random salt is generated once, with which this key is derived via the KDF. This does indeed mean a significant performance gain. However, this is automatically accompanied by a reduction in security: Should an attacker succeed in obtaining the key, the attacker can decrypt all files (and not just one as in the original scenario). However, this disadvantage is not a mandatory exclusion criterion if it is acceptable within the scope of your security requirements (which seems to be the case here).
The more performant solution requires that the information salt | nonce | ciphertext | tag must be sent to the recipient. The salt is important and must not be missing, because the recipient needs the salt to derive the key via the PBKDF. Once the recipient has determined the key, the ciphertext can be authenticated with the tag and decrypted using the nonce. If it has been agreed with the recipient that the same key will be used for each file, it is sufficient for the recipient to derive the key once via the PBKDF. Otherwise the key must be derived for each file.
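To make the salt | nonce | ciphertext | tag layout concrete, here is a minimal sketch of packing and parsing such a record with plain byte slicing. The function names pack_record and unpack_record and the field lengths (16-byte salt, 12-byte nonce, 16-byte tag) are assumptions chosen for this example. No encryption is performed here; the ciphertext and tag are random placeholders standing in for real AES-GCM output.

```python
import os

SALT_LEN, NONCE_LEN, TAG_LEN = 16, 12, 16  # assumed field sizes

def pack_record(salt: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Concatenate the fields as salt | nonce | ciphertext | tag."""
    if (len(salt), len(nonce), len(tag)) != (SALT_LEN, NONCE_LEN, TAG_LEN):
        raise ValueError("unexpected salt, nonce or tag length")
    return salt + nonce + ciphertext + tag

def unpack_record(blob: bytes):
    """Split a record back into (salt, nonce, ciphertext, tag)."""
    header = SALT_LEN + NONCE_LEN
    if len(blob) < header + TAG_LEN:
        raise ValueError("record too short")
    salt = blob[:SALT_LEN]
    nonce = blob[SALT_LEN:header]
    ciphertext = blob[header:len(blob) - TAG_LEN]
    tag = blob[len(blob) - TAG_LEN:]
    return salt, nonce, ciphertext, tag

# Placeholder values only; real code would take these from the KDF and cipher.
fields = (os.urandom(16), os.urandom(12), b"placeholder ciphertext", os.urandom(16))
round_trip_ok = unpack_record(pack_record(*fields)) == fields
print(round_trip_ok)  # True
```

Since the fixed-length fields sit at known offsets, the recipient can read the salt first, derive the key once, and reuse it for every file that carries the same salt.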
If the salt with its 16 bytes is unwanted (since it is identical for all files in this approach), alternative architectures could be considered. For example, a hybrid scheme might be used: A random symmetric key is generated and exchanged using a public key infrastructure. Also here, all files can be encrypted with the same key or each file can be encrypted with its own key.
But for more specific suggestions for a design proposal, the use case should be described in more detail, e.g. regarding the files: How large are the files? Is processing in streams/chunks necessary? Or regarding the recipients: How many recipients are there? What is aligned with the recipients? etc.

This seems fine, but I have a recommendation: do not use the same nonce for both encryption and key derivation (a nonce is a number used only once). If you don't want to generate a second nonce (IV), you could pass the MD5 hash of the nonce to the encryption function instead. Second, I think you can switch to the cryptography module if you are interested in better security. Below is example code using Fernet from the cryptography module, which encrypts with a 128-bit key, which is secure, and takes care of the rest, such as IVs (nonces), decryption, and verification (done using HMAC). So all your code above can be summarized in these few lines, which leads to less complexity and so arguably more secure code.
from cryptography.fernet import Fernet
plaintext = b"hello world"
key = Fernet.generate_key()
ctx = Fernet(key)
ciphertext = ctx.encrypt(plaintext)
print(ciphertext)
decryption = ctx.decrypt(ciphertext)
print(decryption)
EDIT: Note that the nonce you use also weakens the key. Since the nonce is sent with the ciphertext, the salt used for PBKDF2 is known to the attacker and is therefore pointless; the attacker only has to guess your password (assuming the default count). In this case the password is a very simple one: brute forcing takes no more than 26^5 tries (five lowercase letters).

Python hashlib.sha256() digest length

I have some python code,
hash_object = hashlib.sha256(b'Hello World')
hex_dig = hash_object.hexdigest()
cipher = AES.new(hex_dig, AES.MODE_CBC, iv)
plain = cipher.decrypt( cipher )
but I get an error: ValueError: AES key must be either 16, 24, or 32 bytes long
However, I want a 32-byte key, not a 16-byte key.
I don't know why hash_val=hashfct.digest() is not 32 bytes.
I also tried "hash_val=hashfct.digest()[0:32]", but that does not work either.
How can I get a 32-byte key?
Thanks.

You should really consider a proper key derivation algorithm instead of rolling your own. PBKDF2 is one of the more common algorithms that should protect you from some of the usual mistakes. For example, in your case, it is very easy to brute force the password because you only have one round of hashing.
Here is some modified sample code from hashlib:
>>> import hashlib
>>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
>>> dk[:32]
b'\x03\x94\xa2\xed\xe32\xc9\xa1>\xb8.\x9b$c\x16\x04\xc3\x1d\xf9x\xb4\xe2\xf0\xfb\xd2\xc5I\x94O\x9dy\xa5'
You should also make sure b'salt' is random and different every time you generate a new key. For a cryptographically secure random function in Python, see How can I create a random number that is cryptographically secure in python?
This is for Python 3, but should be simple enough to adjust for Python 2.
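As a hedged sketch of the "random salt" advice, the helper below generates a fresh salt with the standard library's secrets module and derives a 32-byte key from it. The name derive_key and the 16-byte salt length are assumptions made for this example. The salt is returned alongside the key because it has to be stored (it is not secret) in order to derive the same key again later.

```python
import hashlib
import secrets

def derive_key(password: bytes, salt: bytes = None, iterations: int = 100_000):
    """Return (salt, key); generates a new random salt when none is given.

    Hypothetical helper for illustration.
    """
    if salt is None:
        salt = secrets.token_bytes(16)  # cryptographically secure random bytes
    key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    return salt, key

# First use: create a salt and key, and store the salt next to the ciphertext.
salt, key = derive_key(b"password")
# Later: re-derive the same key from the password and the stored salt.
_, key_again = derive_key(b"password", salt)
print(len(key), key == key_again)  # 32 True
```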

You need to use the digest() method instead of hexdigest():
hash_object = hashlib.sha256(b'Hello World')
key = hash_object.digest()
cipher = AES.new(key, AES.MODE_CBC, iv)
plain = cipher.decrypt(ciphertext)
This works because digest() returns the 32 raw bytes of the hash, whereas hexdigest() returns a 64-character hexadecimal string, which is not a valid AES key length. I had the same problem.
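To see the length difference directly, a quick check along these lines may help. It uses only the standard library, and the variable names are just for this example:

```python
import hashlib

hash_object = hashlib.sha256(b"Hello World")
raw = hash_object.digest()      # bytes object with the raw hash
text = hash_object.hexdigest()  # str with two hex characters per byte

print(len(raw), len(text))         # 32 64
print(bytes.fromhex(text) == raw)  # True
```

So the hex string encodes the same 32 bytes, just written with two characters per byte, which is why it is rejected as a key.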

Unique Salt per User using Flask-Security

After reading here a bit about salting passwords, it seems that it's best to use a unique salt for each user. I'm working on implementing Flask-Security atm, and from the documentation it appears you can only set a global salt: ie SECURITY_PASSWORD_SALT = 'thesalt'
Question: How would one go about making a unique salt for each password?
Thanks!
edit: from the docs on Flask-Security, I found this, which seems to again suggest that this module only uses a single salt for all passwords out of the box.
flask_security.utils.get_hmac(password)
Returns a Base64 encoded HMAC+SHA512 of the password signed with the salt
specified by SECURITY_PASSWORD_SALT.

Yes, Flask-Security does use per-user salts by design if using bcrypt (and other schemes such as des_crypt, pbkdf2_sha256, pbkdf2_sha512, sha256_crypt, sha512_crypt).
The config for 'SECURITY_PASSWORD_SALT' is only used for the HMAC signing step. If you are using bcrypt as the hashing algorithm, Flask-Security uses passlib for hashing, and passlib generates a random salt during hashing. This confusion is noted in issue 268: https://github.com/mattupstate/flask-security/issues/268
It can be verified in the code, walking from encrypt to passlib:
flask_security/utils.py (lines 143-151, 39, and 269)
def encrypt_password(password):
    ...
    return _pwd_context.encrypt(signed)
_pwd_context = LocalProxy(lambda: _security.pwd_context)
flask_security/core.py (269, 244-251, and 18)
pwd_context=_get_pwd_context(app)
def _get_pwd_context(app):
    ...
    return CryptContext(schemes=schemes, default=pw_hash, deprecated=deprecated)
from passlib.context import CryptContext
and finally from: https://pythonhosted.org/passlib/password_hash_api.html#passlib.ifc.PasswordHash.encrypt
note that each call to encrypt() generates a new salt,

Turns out that if you use bcrypt, it takes care of the salting and stores it with the hash. So I'll go that route!
Thanks to this topic, which led me to this discovery:
Do I need to store the salt with bcrypt?
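For anyone wondering what "stores the salt with the hash" means in practice, here is a minimal standard-library sketch of the idea. It is not bcrypt's or Flask-Security's actual format: the iterations$salt$digest record layout, the function names, and the use of PBKDF2-HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000) -> str:
    """Hash with a fresh per-user salt and embed the salt in the record."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"

def verify_password(password: str, record: str) -> bool:
    """Re-derive the digest using the salt and iteration count in the record."""
    iterations, salt_hex, digest_hex = record.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison to avoid leaking where the digests differ.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))

record_a = hash_password("hunter2")
record_b = hash_password("hunter2")
print(record_a != record_b)                  # True: each call uses a new salt
print(verify_password("hunter2", record_a))  # True
print(verify_password("wrong", record_a))    # False
```

Because the salt travels inside the stored record, no separate salt column or global salt setting is needed for verification.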

AES Encryption pycrypto and salting

I don't have a problem with my code, but I don't understand the different arguments you can use with pycrypto and AES encryption. Where I define my encryptor below, what are mode and IV? The tutorial I found this in didn't really help me understand them. I have it working properly, but I want to understand what the arguments are.
So Question #1: What are the arguments associated with defining an encryptor with pycrypto?
Question #2: Is this an appropriate salting method for the encryption? I'm using a very long randomized ASCII string, converting it to a 256-bit SHA hash, using that to AES-encrypt the information, and then base64-encoding the result and inserting it into the database.
def pad(string):
    return string + ((16-len(string) % 16) * '{' )
password = hashlib.sha256("").digest()
IV = 16 * '\x00'
mode = AES.MODE_CBC
encryptor = AES.new(password, mode, IV=IV)
encrypted_customer_name = encryptor.encrypt(pad(customer_name))
encoded_ecryption_name = base64.b64encode(encrypted_customer_name)
customer_name = base64.b64decode(customer_name)
decryptor = AES.new(password, mode, IV=IV)
customer_name = decryptor.decrypt(customer_name)
lenofdec = customer_name.count('{')
customer_name = customer_name[:len(customer_name)-lenofdec]
My code isn't in that order but I didn't include all of the code just the relevant parts.

Ok I'm going to do my best here to answer these questions!
Q1:
Ok it looks like the signature is
new(key, *args, **kwargs)
The first argument key is pretty self explanatory, but after that you notice that it can take a number of keyword arguments.
It seems that it can take:
mode: The cipher mode (these are as follows; look on Wikipedia for definitions)
MODE_ECB = 1 Electronic Code Book (ECB). See blockalgo.MODE_ECB.
MODE_CBC = 2 Cipher-Block Chaining (CBC). See blockalgo.MODE_CBC.
MODE_CFB = 3 Cipher FeedBack (CFB). See blockalgo.MODE_CFB.
MODE_PGP = 4 This mode should not be used.
MODE_OFB = 5 Output FeedBack (OFB). See blockalgo.MODE_OFB.
MODE_CTR = 6 CounTer Mode (CTR). See blockalgo.MODE_CTR.
MODE_OPENPGP = 7 OpenPGP Mode. See blockalgo.MODE_OPENPGP.
IV: the initialization vector, which plays a role similar to a salt (you seem to already understand this)
From here on out the options seem to be based on the specific mode you are using
counter: A function that returns the next block of data (not normally used). From the docs:
(Only MODE_CTR). A stateful function that returns the next counter block, which is a byte string of block_size bytes
segment_size is the size of the segment in CFB mode
The Pycrypto docs
Q2: Is this a good method for salting your encryption?
First, what is the salt for? I find that this is a very common question: we already have a password, so why would we need a salt as well?
The answer makes a lot of sense when you talk about passwords. Let's say my password is banana. When we write this to a password file, we would send it through a hash algorithm and get 5a814... (sha256).
Now the next time someone uses the password banana, they get the same hash. Anyone with permission to read the file can then see that the passwords are the same. This is where the salt comes in. If I add a random salt before running the password through the hash algorithm, then the hash will come out different every time, even if the passwords are the same. This makes your system WAY more secure.
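Here is a small sketch of that effect using hashlib from the standard library. The function names are made up for this example, and a single SHA-256 call is used only to show what the salt does; it is not a recommendation for real password storage.

```python
import hashlib
import os

def unsalted_hash(password: bytes) -> str:
    return hashlib.sha256(password).hexdigest()

def salted_hash(password: bytes, salt: bytes) -> str:
    # The salt is stored next to the hash; it does not need to be secret.
    return hashlib.sha256(salt + password).hexdigest()

alice_salt = os.urandom(16)
bob_salt = os.urandom(16)

# Same password, no salt: identical hashes, so the reuse is visible.
print(unsalted_hash(b"banana") == unsalted_hash(b"banana"))  # True
# Same password, different salts: the hashes no longer match.
print(salted_hash(b"banana", alice_salt) == salted_hash(b"banana", bob_salt))  # False
```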
Alright now for your code:
First, congrats, you are calling the function correctly! But your code sets IV = 16 * '\x00', which is not a very good salt at all. I would recommend using os.urandom(16), which draws on system entropy, to generate a fresh value each time you encrypt. It is common practice to write the salt at the beginning of the encrypted content.
This is tricky to say without knowing what you are attempting to do with code, but let me explain with an example:
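import os
import hashlib
from Crypto.Cipher import AES
MODE = AES.MODE_CBC
def encrypt(msg, password):
    # msg and password are bytes; msg must already be padded to a multiple of 16 bytes
    salt = os.urandom(16)  # fresh random value, used as the IV
    key = hashlib.sha256(password).digest()  # 32-byte AES key
    crypter = AES.new(key, MODE, IV=salt)
    # Prepend the salt; a ':' separator would break when the random bytes contain ':'
    return salt + crypter.encrypt(msg)
def decrypt(enc, password):
    # The first 16 bytes are the salt, the rest is the ciphertext
    salt, content = enc[:16], enc[16:]
    key = hashlib.sha256(password).digest()
    crypter = AES.new(key, MODE, IV=salt)
    return crypter.decrypt(content)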
I hope this was helpful! Happy Coding!

Hashing in SHA512 using a salt? - Python

I have been looking through the hashlib documentation but haven't found anything talking about using salt when hashing data.
Help would be great.

Samir's answer is correct but somewhat cryptic. Basically, the salt is just a randomly derived bit of data that you prefix or postfix your data with to dramatically increase the complexity of a dictionary attack on your hashed value. So given a salt s and data d you'd just do the following to generate a salted hash of the data:
import hashlib
hashlib.sha512( s + d ).hexdigest()
See this wikipedia article for more details

Just add the salt to your sensitive data:
>>> import hashlib
>>> m = hashlib.sha512()
>>> m.update('salt')
>>> m.update('sensitive data')
>>> m.hexdigest()
'70197a4d3a5cd29b62d4239007b1c5c3c0009d42d190308fd855fc459b107f40a03bd427cb6d87de18911f21ae9fdfc24dadb0163741559719669c7668d7d587'
>>> n = hashlib.sha512()
>>> n.update('%ssensitive data' % 'salt')
>>> n.hexdigest()
'70197a4d3a5cd29b62d4239007b1c5c3c0009d42d190308fd855fc459b107f40a03bd427cb6d87de18911f21ae9fdfc24dadb0163741559719669c7668d7d587'
>>> hashlib.sha512('salt' + 'sensitive data').hexdigest()
'70197a4d3a5cd29b62d4239007b1c5c3c0009d42d190308fd855fc459b107f40a03bd427cb6d87de18911f21ae9fdfc24dadb0163741559719669c7668d7d587'

Salting isn't a magical process that the library needs to help you with—it's just additional data provided to stop rainbow tables from working.
>>> import hashlib
>>> m = hashlib.sha512()
>>> m.update(b"Nobody inspects")
>>> m.update(b" the spammish repetition")
>>> m.digest()
b'\xd0\xf4\xc1LH\xadH7\x90^\xa7R\x0c\xc4\xafp\x0fd3\xce\t\x85\xe6\xbb\x87\xb6\xb4a|\xb9D\xab\xf8\x14\xbdS\x96M\xdb\xf5[A\xe5\x81+:\xfe\x90\x89\x0c\nM\xb7\\\xb0Cg\xe19\xfdb\xea\xb2\xe1'
>>> m.update(b"My super-secret salt.")
>>> m.digest()
b'\xcd\xd7K\xd9!~\xa8\x1d6\x9b\xa6\xde\x06\t\x02\xa1+}\xaeNA\x94a`\xaa\xf4\xe9\xb5\xff\x1f\x9cE\x84m\xbb\x98U\xb4z\x92\x9e\xe8\xc9\xc2\xc8\x8f\x068e\xb0\r\xed\xb7\xde\x80\xa6,\n\x111w{\xa2\x9b'

If you're looking for a replacement for crypt(), newer versions of glibc have SHA-512-based "$6$" with a variable iteration count (see Ulrich Drepper's page, which has a description and links to a complete C implementation of sha512_crypt_r()).
Writing your own crypto is highly inadvisable; the above sha512(salt+password) doesn't help against a brute-force attack.
For generating salt, use something like os.urandom(16) for random bytes or ''.join(map(lambda x:'./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'[ord(x)%64], os.urandom(16))) for random base64-alike chars (for use with crypt()-alikes).
(I say base64-alike because it's not the same as the Base64 in PEM/MIME.)

Use passlib; writing your own password crypto is an almost sure way to fail.

SHA512 isn't a great way to store hashed passwords these days. You should be using bcrypt or something similar. What's important is that salting is built in and that the algorithm has a significant work factor.
If you salt your SHA512 passwords by simply appending (or prepending) the salt to the plaintext, anyone who gets their hands on a set of your hashed passwords and applies a modern cracking tool (http://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/) will be able to see the concatenated password+salt values and will probably, through trivial pattern matching, be able to separate the password portion from the salt portion for most if not all of the accounts in question.
I haven't thought this through all the way, and I am by no means a security expert, but it seems to me that if you were to encrypt (using, for example, AES256) the password using the salt as the key, and then hash that with SHA512, you'd be safe from the vulnerability I described above.
However, at that point you've put in more effort than it would have taken to switch to bcrypt and you still wouldn't have the protection of a work factor, so I would only recommend an approach like that if the environment you're working in does not offer that option.

Yes, yes: if my password is "pass" and my salt is "word",
my pass+salt is "password", the same as just using "password" xD
Or we use a very secure crypt that saves the salt in the hashed output lol.
We just strip the salt and generate hashes with random passwords; when we get the same hash, we've got the password lol lol
