I'm working on a project using Pyramid 1.3 (Python 2.7) and storing data in MySQL. I have a table of email addresses, and I would like to encrypt them for storage. I am trying to encrypt them in the application, and then will decrypt them for viewing. I'm not going for complete security but am mainly aiming to obfuscate the data enough were the database itself compromised.
I'm using PyCrypto with AES, and have been trying to follow some posts on here and some web tutorials I found. The closest I found so far is this post, and it seems to work, at least encrypting it. I follow that and get something like "7hBAQrWhJRnL9YdBGJfRErGFwGi3aC6noGzYTrGwAoQ=" stored in the database. But the decrypt function keeps erroring with this:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa1 in position 1: ordinal not in range(128)
I came across some unicode presentation about Python which sort of helped me make more sense of it but I still keep getting the same error.
Is there a straightforward tutorial on how to encode, store in a database, pull out of database, and decode a source data string?
Do I need a specific collation on the database column? Does the field need to be a certain type? So far I've been using a default collation and setting it to VARCHAR, assuming that I was storing a string. It sounds like I've got some encoding problem somewhere with incompatible types or something but my head is spinning on where I need to change something.
Any better pointers, or anything else I can provide? I can show my code, but it's basically a copy of the link above... I was just trying to get a proof of concept working before modifying it too much.
edit:
some sample source...
In MySQL, the table is
id (int)
client_id (int)
emailaddress varchar(100) utf8mb4_general_ci (I've been playing around with the collations, I have no idea what it should be!)
Python:
from base64 import b64encode, b64decode, urlsafe_b64decode, urlsafe_b64encode
from Crypto.Cipher import AES

BLOCK_SIZE = 32
INTERRUPT = u'\u0001'
PAD = u'\u0000'

def AddPadding(data, interrupt, pad, block_size):
    new_data = ''.join([data, interrupt])
    new_data_len = len(new_data)
    remaining_len = block_size - new_data_len
    to_pad_len = remaining_len % block_size
    pad_string = pad * to_pad_len
    return ''.join([new_data, pad_string])

def StripPadding(data, interrupt, pad):
    return data.rstrip(pad).rstrip(interrupt)  # also tried: data.rsplit(interrupt, 1)[0]

SECRET_KEY = u'a1b2c3d4e5f6g7h8a1b2c3d4e5f6g7h8'
IV = u'12345678abcdefgh'
cipher_for_encryption = AES.new(SECRET_KEY, AES.MODE_CBC, IV)
cipher_for_decryption = AES.new(SECRET_KEY, AES.MODE_CBC, IV)

def EncryptWithAES(encrypt_cipher, plaintext_data):
    plaintext_padded = AddPadding(plaintext_data, INTERRUPT, PAD, BLOCK_SIZE)
    encrypted = encrypt_cipher.encrypt(plaintext_padded)
    return urlsafe_b64encode(encrypted)

def DecryptWithAES(decrypt_cipher, encrypted_data):
    decoded_encrypted_data = urlsafe_b64decode(encrypted_data)
    decrypted_data = decrypt_cipher.decrypt(decoded_encrypted_data)
    return StripPadding(decrypted_data, INTERRUPT, PAD)
#encrypts it
posted_singleaddress = EncryptWithAES(cipher_for_encryption, posted_singleaddress)
#"me#mail.com" inserts "Ktpr49Uzn99HZXbmqEzGKlWo9wk-XBMXGZl_iyna-8c=" into the database
clientemails is the list of emails from the table above. I get the error when uncommenting out:
#if clientemails:
#    decrypted = DecryptWithAES(cipher_for_decryption, clientemails[0].emailaddress)
I was just trying to decode the first item just to try and get it to work but that's the part that seems to be giving it fits now....
The general rule with PyCrypto is that cryptographic keys, IVs, plaintexts, paddings, and ciphertexts should always be defined as binary strings, not text. The fact you use Unicode for them is by itself a source of problems.
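As a rough illustration of that rule, here is a minimal sketch of the question's interrupt-and-pad scheme written entirely with byte strings. It assumes Python 3; the names add_padding and strip_padding, the 16-byte block size, and the sample address are hypothetical and not taken from the question. Text is encoded to UTF-8 before padding and decoded only after the padding is stripped.

```python
BLOCK_SIZE = 16      # AES block size in bytes (assumption: pad to one AES block)
INTERRUPT = b"\x01"  # marks the end of the real data
PAD = b"\x00"        # filler up to the block boundary


def add_padding(data: bytes) -> bytes:
    """Append the interrupt byte, then zero bytes up to a multiple of BLOCK_SIZE."""
    padded = data + INTERRUPT
    return padded + PAD * (-len(padded) % BLOCK_SIZE)


def strip_padding(data: bytes) -> bytes:
    """Remove the trailing pad bytes and exactly one interrupt byte."""
    stripped = data.rstrip(PAD)
    if not stripped.endswith(INTERRUPT):
        raise ValueError("padding marker not found")
    return stripped[:-1]


plaintext = "user@example.com".encode("utf-8")      # text -> bytes before encrypting
padded = add_padding(plaintext)                     # this is what the cipher would receive
recovered = strip_padding(padded).decode("utf-8")   # bytes -> text after decrypting
```

Only the first and last step touch text; everything that reaches the cipher is bytes.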
Another problem is that you pass the key and IV to AES.new as what looks like hexadecimal text, so the former is 256 bits long and the latter 128 bits. That still seems to work, but I guess your intention was to use AES-128, which has a 128-bit key. You therefore need to convert the hex text to binary, for instance via unhexlify: the two-character string b'34' maps to the single byte b'\x34'. The IV then needs to be twice as long in hex form so that it still decodes to 16 bytes.
In your code it's therefore better to have:
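from binascii import unhexlify
INTERRUPT = b'\x01'
PAD = b'\x00'
# unhexlify() only accepts the digits 0-9 and a-f. The key and IV in the question
# contain 'g' and 'h', so valid hex placeholders are used here instead.
SECRET_KEY = unhexlify('a1b2c3d4e5f60718a1b2c3d4e5f60718')
IV = unhexlify('12345678abcdef01' * 2)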
If you need to encrypt text, you would first encode it (e.g. to UTF-8) and then pass it to your function EncryptWithAES(). See also this example taken from the PyCrypto API:
from Crypto.Cipher import AES
from Crypto import Random
key = b'Sixteen byte key'
iv = Random.new().read(AES.block_size)
cipher = AES.new(key, AES.MODE_CFB, iv)
msg = iv + cipher.encrypt(b'Attack at dawn')
The result of the encryption step (that is, the ciphertext) is again a binary string. In order to store it directly in the MySQL DB you must use either a BINARY or a VARBINARY type column.
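To make the column-type point concrete, here is a small sketch that stores raw ciphertext bytes in a binary column and reads them back unchanged. It uses the standard-library sqlite3 module with a BLOB column purely as a stand-in for a MySQL VARBINARY column, and os.urandom as a stand-in for real AES output; both are assumptions made so the example runs without a MySQL server.

```python
import os
import sqlite3

ciphertext = os.urandom(32)  # stand-in for real AES output (arbitrary bytes)

conn = sqlite3.connect(":memory:")
# BLOB plays the role that BINARY/VARBINARY would play in MySQL.
conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, emailaddress BLOB)")
conn.execute("INSERT INTO emails (emailaddress) VALUES (?)", (ciphertext,))
(stored,) = conn.execute("SELECT emailaddress FROM emails").fetchone()
conn.close()

# A binary column applies no character set or collation, so the bytes survive as-is.
round_trip_ok = bytes(stored) == ciphertext
```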
Related
I have a 10-character code that I want to sign in my Python program, then put both the code and the signature in a URL, which then gets processed by a PHP Slim API. There the signature should be verified.
I generated my RSA keys in python like this:
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.serialization import load_pem_private_key
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.primitives import serialization

def gen_key():
    private_key = rsa.generate_private_key(
        public_exponent=65537, key_size=2048, backend=default_backend()
    )
    return private_key

def save_key(pk):
    pem_priv = pk.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption()
    )
    with open(os.path.join('.', 'private_key.pem'), 'wb') as pem_out:
        pem_out.write(pem_priv)
    pem_pub = pk.public_key().public_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PublicFormat.SubjectPublicKeyInfo
    )
    with open(os.path.join('.', 'public_key.pem'), 'wb') as pem_out:
        pem_out.write(pem_pub)

def main():
    priv_key = gen_key()
    save_key(priv_key)
I sign the code like this in Python:
private_key = load_key()
pub_key = private_key.public_key()
code = '09DD57CE10'
signature = private_key.sign(
    str.encode(code),
    padding.PSS(
        mgf=padding.MGF1(hashes.SHA256()),
        salt_length=padding.PSS.MAX_LENGTH
    ),
    hashes.SHA256()
)
The url is built like this
my_url = 'https://www.exmaple.com/codes?code={}&signature={}'.format(
    code,
    signature.hex()
)
Because the signature is a bytes object, I'm converting it to a string using the .hex() function.
Now, in PHP, I am trying to verify the code and signature:
use phpseclib3\Crypt\PublicKeyLoader;
$key = PublicKeyLoader::load(file_get_contents(__DIR__ . "/public_key.pem"));
echo $key->verify($code, pack('h*', $signature)) ? 'yes' : 'no';
I also tried using PHP openssl_verify
$pub_key = file_get_contents(__DIR__ . "/public_key.pem");
$res = openssl_verify($code, pack('n*', $signature), $pub_key, OPENSSL_ALGO_SHA256);
However, it always tells me the signature is wrong, even though I know it is the correct signature. The RSA keys are the same matching pair in both Python and PHP.
I think the issue is with the signature and how I had to convert it to a string in Python and then back to a byte string in PHP.
The Python code uses PSS.MAX_LENGTH as the salt length. This value denotes the maximum salt length and is recommended in the Cryptography documentation (see here):
salt_length (int) – The length of the salt. It is recommended that this be set to PSS.MAX_LENGTH
In RFC8017, which specifies PKCS#1 and thus also PSS, the default value of the salt length is defined as the output length of the hash (see A.2.3. RSASSA-PSS):
For a given hashAlgorithm, the default value of saltLength is the octet length of the hash value.
Most libraries, e.g. phpseclib, use the RFC8017 default, i.e. the output length of the hash, as their default salt length (see here). Therefore the maximum salt length must be set explicitly on the PHP side. The maximum salt length is given by (see here):
signature length (bytes) - digest output length (bytes) - 2 = 256 - 32 - 2 = 222
for a 2048 bits key and SHA256.
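The arithmetic above can be sketched as a small helper. The function name max_pss_salt_length is hypothetical; the formula follows the EMSA-PSS length constraint in RFC 8017 (encoded message length minus digest length minus 2), and for key sizes that are a multiple of 8 bits the encoded message length equals the signature length used in the line above.

```python
import hashlib


def max_pss_salt_length(key_bits: int, hash_name: str) -> int:
    """Largest PSS salt (in bytes) that fits for the given key size and hash."""
    em_len = (key_bits - 1 + 7) // 8            # ceil((modulus bits - 1) / 8)
    h_len = hashlib.new(hash_name).digest_size  # digest output length in bytes
    return em_len - h_len - 2


salt_len = max_pss_salt_length(2048, "sha256")
print(salt_len)  # 222
```

The same helper gives the value to pass to withSaltLength() if the key size or hash is ever changed.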
Thus, the verification in the PHP code must be changed as follows:
$verified = $key->
withPadding(RSA::SIGNATURE_PSS)->
//withHash('sha256')-> // default
//withMGFHash('sha256')-> // default
withSaltLength(256-32-2)-> // set maximum salt length
verify($code, pack('H*', $signature)); // alternatively hex2bin()
Note that the code posted in the question specifies h (hex string, low nibble first) in the format string of pack(). I've chosen the more common H (hex string, high nibble first) in my code snippet, which is also compatible with Python's hex(). Ultimately, the format string to choose depends on the encoding applied in the Python code.
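To see why the nibble order matters, here is a short Python sketch. The three-byte value is a made-up stand-in for a real signature, and the low-nibble-first decoding is emulated by hand to mirror what PHP's h format does.

```python
signature = bytes([0x1A, 0x2B, 0xF0])  # stand-in for a real signature

hex_text = signature.hex()             # Python writes the high nibble first

# High-nibble-first decoding (like PHP pack('H*', ...)) restores the bytes.
high_first = bytes.fromhex(hex_text)

# Low-nibble-first decoding (like PHP pack('h*', ...)) swaps the two hex
# digits of every byte, so the result no longer matches the signature.
swapped = "".join(hex_text[i + 1] + hex_text[i] for i in range(0, len(hex_text), 2))
low_first = bytes.fromhex(swapped)

print(hex_text, high_first == signature, low_first == signature)
```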
Using this change, on my machine, the signature generated with the Python code can be successfully verified with the PHP code.
Alternatively, of course, the salt length of the Python code can be adapted to the output length of the digest (32 bytes in this case).
By the way, a verification with openssl_verify() is not possible, because PSS is not supported.
Okay, so basically I am having issues decrypting with Python.
I've managed to encrypt/decrypt data with Node.js - using "aes-128-ctr", the same goes for PyCrypto, but when I try to encrypt with Node.js and decrypt with Python I get invalid deciphered text.
Node.js code:
var key = "1234567890123456";
var cipher = crypto.createCipher("aes-128-ctr",key)
var ctext = cipher.update('asasasa','utf8','hex') + cipher.final('hex')
console.log(ctext) // outputs: "f2cf6ecd8f"
Python code:
from Crypto.Cipher import AES
from Crypto.Util import Counter
counter = Counter.new(128)
cipher = AES.new("1234567890123456", AES.MODE_CTR, counter=counter)
cipher.decrypt("f2cf6ecd8f")  # outputs: weird encoding characters
By the way, I don't care about the level of security of this encryption, I care about performance more.
crypto.createCipher takes a password and uses EVP_BytesToKey to derive a key and IV from it, but PyCrypto expects the key and IV directly. You need to use exactly the same procedure on both sides.
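For reference, the derivation can be sketched in Python with only hashlib. This assumes the parameters that the Node.js documentation describes for crypto.createCipher: MD5 as the digest, a single iteration, and no salt. The function name evp_bytes_to_key is hypothetical, and the sketch only shows where the key and IV come from; it is not a recommendation to keep using this scheme.

```python
import hashlib


def evp_bytes_to_key(password: bytes, key_len: int, iv_len: int):
    """OpenSSL-style EVP_BytesToKey with MD5, one iteration, no salt."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest followed by the password.
        block = hashlib.md5(block + password).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# aes-128-ctr needs a 16-byte key and a 16-byte IV.
key, iv = evp_bytes_to_key(b"1234567890123456", 16, 16)
print(key.hex(), iv.hex())
```

The resulting key and IV are what the Python side would have to feed to its cipher to match a ciphertext produced by createCipher with that password.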
crypto.createCipher must never be used with CTR-mode, because the key and IV generation are not randomized. Since the CTR-mode is a streaming mode, it will always produce the same key stream which might enable an attacker who only observes multiple ciphertexts that are encrypted with the same password to deduce the plaintext. This is possible because of the resulting many-time pad issue.
If you must use CTR-mode, then you have to use crypto.createCipheriv. If you use the same key, you have to use a different IV every time. This is why this is actually called a nonce for CTR-mode. For AES-CTR, a nonce of 96 bit is a good compromise between security and size of possibly encryptable plaintexts.
var key = "1234567890123456"
var iv = Buffer.concat([crypto.randomBytes(12), Buffer.alloc(4, 0)])
var cipher = crypto.createCipheriv("aes-128-ctr", key, iv)
var ctext = iv.toString('hex') + cipher.update('asasasa','utf8','hex') + cipher.final('hex')
console.log(ctext)
Example output:
5b88aeb265712b6c8bfa8dbd0000000063012d1e52eb42
The IV is not secret and you have to use the exact same IV during decryption. Usually, it is sent along with the ciphertext by being prefixed to it. It is then sliced off before decryption:
import codecs
from Crypto.Cipher import AES
from Crypto.Util import Counter
ct = codecs.decode('5b88aeb265712b6c8bfa8dbd0000000063012d1e52eb42', 'hex') # I'm using Python 3
counter = Counter.new(32, prefix=ct[:12], initial_value=0)
cipher = AES.new(b"1234567890123456", AES.MODE_CTR, counter=counter)
cipher.decrypt(ct[16:])
Output:
b'asasasa'
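The layout of that message can be inspected with plain Python, without any crypto library. The sketch below only splits the example output into its parts and rebuilds the counter blocks; the helper name counter_block is hypothetical, and the big-endian 32-bit counter is an assumption that matches the Counter.new(32, prefix=...) call above.

```python
message = bytes.fromhex("5b88aeb265712b6c8bfa8dbd0000000063012d1e52eb42")

nonce = message[:12]                                      # random 96-bit part
initial_counter = int.from_bytes(message[12:16], "big")   # 32-bit counter start
ciphertext = message[16:]                                 # same length as the plaintext


def counter_block(nonce: bytes, block_index: int) -> bytes:
    """16-byte input block for AES-CTR: nonce followed by a big-endian counter."""
    return nonce + block_index.to_bytes(4, "big")


print(nonce.hex(), initial_counter, len(ciphertext))
```

With a 32-bit counter, one nonce covers up to 2**32 blocks of 16 bytes before the counter would wrap.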
Keep in mind that a key needs to be randomly chosen. You can generate a random key and keep it in an encoded form in the source code (i.e. as Hex). If you do that, you must not give the source code or the bytecode to anyone that you wouldn't trust the key with.
EDIT: The issue can be simplified to this:
The following Node.js code gives an "Invalid IV length" error. Why? What should the IV be?
const crypto = require('crypto')
const decipher = crypto.createDecipheriv('aes-128-gcm', crypto.randomBytes(16), crypto.randomBytes(16))
I'm using AES in GCM mode to encrypt some data, but I'm using two different languages and libraries for encryption and decryption and they seem to have different vocabularies about what I need.
I'm encrypting with a Python library (Crypto). The encrypt_and_digest method takes a 128 bit key and a message and returns a 128 bit nonce, 128 bit tag, and a ciphertext.
(Encryption code taken from this example)
I'm decrypting with the default Node.js crypto library. That library expects a session key, a tag, and an IV. When I pass the nonce from the Python library as the IV, it gives me an “invalid iv size” error. Examples of the Node library seem to use a 12-character string as an IV.
My decryption code looks like this (taken from here):
var decipher = crypto.createDecipheriv(algorithm, password, nonce)
decipher.setAuthTag(encrypted.tag);
var dec = decipher.update(encrypted.content, 'hex', 'utf8')
What is the difference between IV and nonce for this scheme? How should I resolve this? Thanks!
It turns out the nonce for GCM should be 12 bytes long. I'm not sure why the Python library defaults to auto-generating a 16-byte nonce, but you can generate your own and specify it manually in the AES constructor, so that's what I did. The whole system works perfectly now.
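A minimal sketch of the Python side of that fix, using only the standard library: generate a 12-byte nonce yourself and ship it together with the tag and ciphertext. The hex layout nonce + tag + ciphertext and the helper names are assumptions for illustration, not something either library prescribes, and os.urandom stands in for the real tag and ciphertext.

```python
import os

NONCE_LEN = 12  # bytes; the size that Node accepted in the answer above
TAG_LEN = 16    # bytes; the 128-bit tag mentioned in the question


def pack_message(nonce: bytes, tag: bytes, ciphertext: bytes) -> str:
    """Hex-encode nonce + tag + ciphertext so it can be sent as text."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected nonce or tag length")
    return (nonce + tag + ciphertext).hex()


def unpack_message(hex_text: str):
    """Split the hex text back into (nonce, tag, ciphertext)."""
    raw = bytes.fromhex(hex_text)
    return raw[:NONCE_LEN], raw[NONCE_LEN:NONCE_LEN + TAG_LEN], raw[NONCE_LEN + TAG_LEN:]


nonce = os.urandom(NONCE_LEN)   # would be passed to the cipher constructor as the nonce
tag = os.urandom(TAG_LEN)       # stand-in for the real authentication tag
body = os.urandom(20)           # stand-in for the real ciphertext
wire = pack_message(nonce, tag, body)
```

On the Node side, the first 12 bytes would then be the IV argument and the next 16 bytes the value for setAuthTag.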
I'm working with PyCrypto in Django and I need to encrypt a string using the user's secret key they made themselves. I successfully wrote an encryption method as follows:
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

def encrypt(value, key):
    """
    Return an encryption of value under key, as well as IV.
    Pads value with extra bytes to make it multiple of 16.
    """
    extra = 16 - (len(value) % 16)
    data = value + chr(extra) * extra
    iv = get_random_bytes(16)
    encryption_suite = AES.new(key, AES.MODE_CBC, iv)
    cipher_text = encryption_suite.encrypt(data)
    return cipher_text, iv
Why am I not using Django's encryptions? Because there is a client application that is NOT written in Django (and won't ever be) that accepts the encrypted value the user stored previously and decrypts it once the user enters their secret key.
Problem is that I can't seem to save the encrypted value to the database for the User model. For example:
user = User.objects.get(id=user_id)
cipher, iv = encrypt(user_value, user_key)
user.secret_value = cipher
user.iv = iv
user.save()
This results in this error:
Warning: Incorrect string value: '\xE7\xAA\x13\x036\xC8...' for column 'iv' at row 1
(same error for secret_value)
I know this must be something to do with improper encoding. What's the right way to go about fixing this? Should I convert each byte into a string character?
Thanks.
I guess you're trying to save binary data into CharFields. Either change the field types of user.iv and user.secret_value to BinaryField, or encode these values as text before saving, for example with base64.
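If you go the base64 route, a sketch like the following may help; it assumes Python 3, and the helper names to_text, from_text, and b64_length are hypothetical. It also shows how long the text gets, which matters when choosing max_length for the CharField.

```python
import base64
import os


def to_text(raw: bytes) -> str:
    """Encode arbitrary bytes as ASCII text that a CharField can hold."""
    return base64.b64encode(raw).decode("ascii")


def from_text(text: str) -> bytes:
    """Reverse of to_text(): recover the original bytes."""
    return base64.b64decode(text)


def b64_length(n_bytes: int) -> int:
    """Number of base64 characters (with padding) needed for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


iv = os.urandom(16)     # stand-in for the IV returned by encrypt()
iv_text = to_text(iv)   # this is what would be assigned to user.iv
print(len(iv_text), b64_length(16), b64_length(32))  # 24 24 44
```

The non-Django client then has to apply the same base64 decoding before it decrypts.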
I don't have a problem with my code, but I don't understand the different arguments you can use with PyCrypto and AES encryption. Where I define my encryptor below, what are mode and IV? The tutorial I found this in didn't really help me understand it. I have it working properly, but I want to understand what the arguments are.
Question #1: What are the arguments associated with defining an encryptor with PyCrypto?
Question #2: Is this an appropriate salting method for the encryption? I'm using a very long randomized ASCII string, converting it to a 256-bit SHA hash, using that to do AES encryption on the information, and then base64-encoding the result and inserting it into the database.
def pad(string):
    return string + ((16-len(string) % 16) * '{' )
password = hashlib.sha256("").digest()
IV = 16 * '\x00'
mode = AES.MODE_CBC
encryptor = AES.new(password, mode, IV=IV)
encrypted_customer_name = encryptor.encrypt(pad(customer_name))
encoded_ecryption_name = base64.b64encode(encrypted_customer_name)
customer_name = base64.b64decode(customer_name)
decryptor = AES.new(password, mode, IV=IV)
customer_name = decryptor.decrypt(customer_name)
lenofdec = customer_name.count('{')
customer_name = customer_name[:len(customer_name)-lenofdec]
My code isn't in that order but I didn't include all of the code just the relevant parts.
Ok I'm going to do my best here to answer these questions!
Q1:
Ok it looks like the signature is
new(key, *args, **kwargs)
The first argument key is pretty self explanatory, but after that you notice that it can take a number of keyword arguments.
It seems that it can take:
mode: The cypher mode (these are as follows. Look on wikipedia for definitions)
MODE_ECB = 1 Electronic Code Book (ECB). See blockalgo.MODE_ECB.
MODE_CBC = 2 Cipher-Block Chaining (CBC). See blockalgo.MODE_CBC.
MODE_CFB = 3 Cipher FeedBack (CFB). See blockalgo.MODE_CFB.
MODE_PGP = 4 This mode should not be used.
MODE_OFB = 5 Output FeedBack (OFB). See blockalgo.MODE_OFB.
MODE_CTR = 6 CounTer Mode (CTR). See blockalgo.MODE_CTR.
MODE_OPENPGP = 7 OpenPGP Mode. See blockalgo.MODE_OPENPGP.
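IV: the initialization vector, which plays the role of a salt for the encryption and is called the salt in the rest of this answer (you seem to already understand this)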
From here on out the options seem to be based on the specific mode you are using
counter: A function that returns the next block of data (not normally used). From the docs:
(Only MODE_CTR). A stateful function that returns the next counter block, which is a byte string of block_size bytes
segment_size is the size of the segment in CFB mode
The Pycrypto docs
Q2: Is this a good method for salting your encryption?
First, what is the salt for? This is a very common question: we already have a password, so why would we need anything else?
The answer makes a lot of sense when you talk about passwords. Let's say my password is banana. When we write this to a password file, we would send it through a hash algorithm and get 5a814... (sha256).
Now the next time someone uses the password banana, they get the same hash. Anyone with permission to read the file can then see that the passwords are the same. This is where the salt comes in. If I append a random salt before running the password through the hash algorithm, the hash will come out different every time, even if the passwords are the same. This makes your system WAY more secure.
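Here is a small sketch of that effect using only hashlib. The unsalted part uses plain SHA-256 as in the text; for the salted part I have swapped in PBKDF2 (hashlib.pbkdf2_hmac) instead of simply appending the salt, since it is the standard-library way to combine a password with a salt. The iteration count and the helper name hash_password are assumptions.

```python
import hashlib
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Salted password hash (PBKDF2-HMAC-SHA256, iteration count is an assumption)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Without a salt, equal passwords always give equal hashes.
unsalted_a = hashlib.sha256(b"banana").hexdigest()
unsalted_b = hashlib.sha256(b"banana").hexdigest()

# With a random salt per user, equal passwords give different hashes.
salt_a, salt_b = os.urandom(16), os.urandom(16)
salted_a = hash_password("banana", salt_a)
salted_b = hash_password("banana", salt_b)

print(unsalted_a == unsalted_b, salted_a == salted_b)
```

The salt is stored next to the hash so that the same computation can be repeated when the user logs in.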
Alright now for your code:
First, congrats, you are calling the function correctly! But... your code sets IV = 16 * '\x00', which is not a very good salt at all, because it is the same for every message. I would recommend using os.urandom(16), which draws on the system's entropy source, to generate a fresh value for each message. It is common practice to write the salt at the beginning of the encrypted content.
This is tricky to say without knowing what you are attempting to do with the code, but let me explain with an example:
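import os
from hashlib import sha256
from Crypto.Cipher import AES

# Get User password
MODE = AES.MODE_CBC

def encrypt(msg, password):
    # msg and password are byte strings; msg must already be padded to a multiple of 16 bytes
    salt = os.urandom(16)
    key = sha256(password).digest()  # 32-byte digest used as the AES key
    crypter = AES.new(key, mode=MODE, IV=salt)
    # Prefix the salt; the binary output may itself contain ':' so no separator is used
    return salt + crypter.encrypt(msg)

def decrypt(enc, password):
    salt, content = enc[:16], enc[16:]  # the first 16 bytes are the salt
    key = sha256(password).digest()
    crypter = AES.new(key, mode=MODE, IV=salt)
    return crypter.decrypt(content)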
I hope this was helpful! Happy Coding!