I'm trying to use PHP 7.4 to replicate a piece of Python code that uses PyCryptodome for AES-128-CFB encryption.
For this I'm using PHP's built-in openssl_encrypt function.
I tried several configuration parameters and CFB modes, but none of the results matches the Python output.
I found out that PyCryptodome's CFB implementation seems to use an 8-bit segment size, which should correspond to the aes-128-cfb8 mode in PHP's OpenSSL implementation.
The IV is intentionally fixed to 0, so please ignore the fact that this is insecure.
Here is the code I want to replicate, followed by the PHP code that tries to reproduce its result with different approaches.
Something tells me it has to do with PHP's 'byte handling', because Python distinguishes between a byte string (returned by .encode('utf-8')) and a string.
At the end you can see the outputs of both programs:
Python code:
import hashlib
from Crypto.Cipher import AES
key = 'testKey'
IV = '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
ENC_KEY = hashlib.md5(key.encode('utf-8')).hexdigest()
print('key: "' + key + '"')
print('hashedKey: ' + ENC_KEY)
obj = AES.new(ENC_KEY.encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
test_data = 'test'
print('encrypting "' + test_data + '"')
encData = obj.encrypt(test_data.encode("utf8"))
print('encData: ' + encData.hex())
PHP code:
function encTest($testStr, $ENC_KEY)
{
    $iv = hex2bin('00000000000000000000000000000000');
    echo "aes-128-cfb8-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";
}
$key = "testKey";
$ENC_KEY = hash('md5', utf8_encode($key));
echo "ENC_KEY: ".$ENC_KEY."\n";
$test = "test";
echo "encrypting \"".$test."\"\n";
encTest($test, $ENC_KEY);
Python output (encData should be replicated):
key: "testKey"
hashedKey: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"
encData: 117c1974
PHP output:
ENC_KEY: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"
aes-128-cfb8-1: b0016a55
aes-128-cfb1-1: bac44c56
aes-128-cfb-1: b0f1c27a
aes-128-cfb8-2: b0016a55
aes-128-cfb1-2: bac44c56
aes-128-cfb-2: b0f1c27a
aes-128-cfb8-3: b0016a55
aes-128-cfb1-3: bac44c56
aes-128-cfb-3: b0f1c27a
aes-128-cfb8-4: b0016a55
aes-128-cfb1-4: bac44c56
aes-128-cfb-4: b0f1c27a
In the PHP code (more precisely, in openssl_encrypt), the AES variant is specified explicitly, here with aes-128-..., so PHP uses AES-128. A key that is too long is truncated, and a key that is too short is padded with 0 values. Since the hash function in the PHP code returns its result as a hex string, the 16-byte MD5 hash is represented by 32 characters (32 bytes), so in the current case PHP uses only the first 16 bytes of the key (AES-128).
The hexdigest method in the Python code also returns the result as a hex string. However, in the Python code (more precisely, in PyCryptodome), the AES variant is determined by the key size, so the Python code uses the full 32-byte key and thus AES-256.
The different keys and AES variants are the main reason for the different results. To fix this, the same key and the same AES variant must be used in both programs:
Option 1 is to use AES-128 in the Python code as well. This can be achieved by the following change:
obj = AES.new(ENC_KEY[:16].encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
Then the output b0016a55 is in accordance with the result of the PHP code for aes-128-cfb8.
Option 2 is to also use AES-256 in the PHP code. This can be done by replacing aes-128... with aes-256... Then the output is
aes-256-cfb8-1: 117c1974
aes-256-cfb1-1: 54096db1
aes-256-cfb-1: 11bfdaa9
and, as expected, the output 117c1974 for aes-256-cfb8 matches the original value of the Python code.
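The key-length point can be checked without any crypto library. The following is a minimal sketch using only hashlib; the `aes_variant` lookup table is an illustrative helper (not part of PyCryptodome) that mirrors how the variant follows from the key size.

```python
import hashlib

key = "testKey"

# hexdigest() returns 32 hex characters; encoded as UTF-8 that is a 32-byte key.
hex_key = hashlib.md5(key.encode("utf-8")).hexdigest().encode("utf-8")

# Illustrative lookup: the AES variant follows from the key length in bytes.
aes_variant = {16: "AES-128", 24: "AES-192", 32: "AES-256"}

print(len(hex_key), aes_variant[len(hex_key)])    # 32 AES-256

# Truncating as in Option 1 yields the 16 bytes that PHP's aes-128-* modes use.
short_key = hex_key[:16]
print(len(short_key), aes_variant[len(short_key)])  # 16 AES-128
```

Passing `hex_key` to AES.new selects AES-256, while `short_key` selects AES-128, which is the whole difference between the two options.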
The CFB mode turns a block cipher into a stream cipher. In each encryption step n bits are encrypted, which is denoted CFBn.
The term CFBn (or cfbn) is also used in PHP, i.e. CFB1 means encryption of one bit, CFB8 of 8 bit (= one byte) and CFB of a whole block (16 bytes). In Python, the number of bits per step is specified with segment_size.
I.e. the counterpart of ...-cfb8 in PHP is segment_size = 8 in Python and the counterpart of ...-cfb in PHP is segment_size = 128 in Python.
In the following it is assumed that an identical key and an identical AES variant are used in both codes.
Since segment_size = 8 is the default, the result from the Python code is the same as for ...-cfb8 in the PHP code. If segment_size = 128 is chosen in the Python code, the result is the same as for ...-cfb in the PHP code. However, in PyCryptodome the segment_size must be an integer multiple of 8, otherwise the error message "'segment_size' must be positive and multiple of 8 bits" is displayed. For this reason the CFB1 mode is not supported by PyCryptodome.
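To make the role of the segment size concrete, here is a minimal sketch of the CFB feedback loop with a configurable segment size. It deliberately assumes a stand-in block function built from MD5 (`toy_block_encrypt`, a hypothetical helper and NOT real AES), so the bytes will not match PHP or PyCryptodome; only the mechanics are the point.

```python
import hashlib

BLOCK_SIZE = 16  # bytes, as for AES


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES block encryption (assumption: NOT real AES)."""
    return hashlib.md5(key + block).digest()


def cfb(key: bytes, iv: bytes, data: bytes, segment_size: int,
        decrypt: bool = False) -> bytes:
    """CFB with segment_size bits per step (a multiple of 8, at most 128)."""
    if segment_size <= 0 or segment_size % 8 or segment_size > BLOCK_SIZE * 8:
        raise ValueError("segment_size must be a positive multiple of 8, at most 128")
    step = segment_size // 8
    register = iv  # shift register, initialised with the IV
    out = bytearray()
    for i in range(0, len(data), step):
        chunk = data[i:i + step]
        keystream = toy_block_encrypt(key, register)
        result = bytes(c ^ k for c, k in zip(chunk, keystream))
        out += result
        # The ciphertext segment is always what gets fed back.
        cipher_chunk = chunk if decrypt else result
        register = (register + cipher_chunk)[-BLOCK_SIZE:]
    return bytes(out)


key = b"0123456789abcdef"   # arbitrary demo key
iv = bytes(BLOCK_SIZE)      # all-zero IV, as in the question
cfb8 = cfb(key, iv, b"test", segment_size=8)
cfb128 = cfb(key, iv, b"test", segment_size=128)
print(cfb8.hex(), cfb128.hex())
```

In both variants the first ciphertext byte is the first plaintext byte XORed with the first byte of the encrypted IV, so it agrees across segment sizes; only the following bytes diverge. The same pattern is visible in the outputs above: b0016a55 and b0f1c27a share the first byte, as do 117c1974 and 11bfdaa9.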
Also note:
The result of the digest can also be returned binary in both codes and not as hex string. To do this, the third parameter of the PHP method hash must be set to TRUE (default: FALSE). In Python, simply use the digest method instead of hexdigest.
In the PHP code, for a stream cipher mode like CFB, padding is automatically disabled, so the OPENSSL_ZERO_PADDING flag (which can be used to explicitly disable padding) makes no difference.
utf8_encode allows you to convert from ISO-8859-1 encoding to UTF-8, but since the $ENC_KEY consists of alphanumeric characters (hex encoding) this has no effect. In general, however, arbitrary binary data (such as the result of a digest) must not be UTF8 encoded, as this would corrupt the data. There are other encodings for this purpose, such as Base64. If the results of the digest are returned in binary form (see 1st point), no UTF8 encoding may be performed.
There is a bug in the legacy PyCrypto library in the context of CFB mode that requires the plaintext to have a length that is an integer multiple of the segment size. Otherwise the following error occurs: Input strings must be a multiple of the segment size 16 in length.
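The points about binary digests and UTF-8 encoding can be illustrated with the standard library alone. This is a minimal sketch; `utf8_encode_like_php` is a hypothetical helper that imitates what PHP's utf8_encode does (ISO-8859-1 to UTF-8), and the sample bytes are simply the first four bytes of the hash shown in the question.

```python
import hashlib

raw = hashlib.md5("testKey".encode("utf-8")).digest()       # 16 raw bytes
hexed = hashlib.md5("testKey".encode("utf-8")).hexdigest()  # 32 hex characters
print(len(raw), len(hexed))  # 16 32


def utf8_encode_like_php(data: bytes) -> bytes:
    """Interpret each byte as ISO-8859-1 and re-encode it as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# A hex string is pure ASCII, so the conversion leaves it unchanged ...
assert utf8_encode_like_php(hexed.encode("ascii")) == hexed.encode("ascii")

# ... but every byte >= 0x80 turns into two bytes, which corrupts binary data.
sample = bytes([0x24, 0xAF, 0xDA, 0x34])
print(utf8_encode_like_php(sample).hex())  # 24c2afc39a34
```

The four sample bytes grow to six after the conversion, so a binary digest treated this way would no longer be the same key.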
Can I use the openssl command to implement functionality equivalent to Python's RSA and Base64 algorithms?
For example, the RSA public key and the password to be encrypted are known. The Python algorithm is
ciphertext = base64.b64encode(PKCS1_v1_5.new(pubkey).encrypt(password.encode('utf-8')))
Suppose the password is 123456 and the public key is pubkey.pem. Is the following openssl command equivalent to the Python algorithm?
echo 123456 | openssl rsautl -encrypt -pubin -inkey pubkey.pem -out ciphertext.txt | openssl enc -e -base64 -in ciphertext.txt -out r.txt
Is the content of r.txt equivalent to the ciphertext of the Python algorithm?
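Yes, it should be, provided the plaintext is really the same. Note that echo appends a newline, so the command encrypts "123456\n" rather than "123456"; use printf or echo -n to match the Python code exactly. Also, the two openssl calls should be chained with && rather than a pipe, since the second one reads ciphertext.txt and not standard input.
The randomization that the PKCS#1 v1.5 padding performs means that the ciphertext is different on every run, so the two outputs cannot be compared byte by byte.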
To test: decrypt the ciphertext using the reverse operations and the private key in Python. For more certainty, decrypt using openssl as well.
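The reason the ciphertext differs on every run can be seen from the padding alone. The following is a minimal sketch of the PKCS#1 v1.5 encryption padding in plain Python, for illustration only; `pkcs1_v15_pad` is a hypothetical helper and the 2048-bit modulus size is an assumption. Real code should leave this to the library.

```python
import secrets


def pkcs1_v15_pad(message: bytes, k: int) -> bytes:
    """Build 0x00 0x02 PS 0x00 M for a k-byte RSA modulus (illustration only)."""
    if len(message) > k - 11:
        raise ValueError("message too long for this modulus size")
    ps_len = k - len(message) - 3
    # Padding string: random bytes, none of which may be zero.
    ps = bytes(secrets.randbelow(255) + 1 for _ in range(ps_len))
    return b"\x00\x02" + ps + b"\x00" + message


k = 256  # modulus size in bytes for a 2048-bit key (assumption)
first = pkcs1_v15_pad(b"123456", k)
second = pkcs1_v15_pad(b"123456", k)

print(first != second)                  # the padded blocks differ
print(first.endswith(b"\x00123456"))    # yet both carry the same message
```

Since RSA is applied to the padded block, two encryptions of the same password produce different ciphertexts, and only decryption shows that they are equivalent.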
I can decrypt some data encrypted via the openssl command line tool, but some 'extra' data is returned along with the original data.
I've created an encrypted file like this:
$ echo this is it >file.txt
$ openssl rsautl -encrypt -pubin -inkey public.pem -in file.txt -out encrypted.txt
And I can access the original data with:
from Crypto.PublicKey import RSA
key = open('/tmp/.private.key').read()
rsakey = RSA.importKey(key, 'MyPassphrase')
data = open('/tmp/encrypted.txt', 'rb').read()
original = rsakey.decrypt(data)
But some extra data is returned and the output is something like this:
\x02h\x83\xcfx\x84,\xb1\xa6 [...] \xcf5\x9f\xbbG\xf1\x14\xd0\x8d\x1f\xfe\x9c4\xbb\x1aB\xfa\xc3b\xc2\xe0K\x85\xb5\x10y\xe1\x8e\x00this is this\n
Can I avoid receiving this raw data before the decrypted data?
Note: the keys were created with the openssl tool.
Thanks.
You are getting PKCS#1 v1.5 padded plaintext back, so you need to remove the PKCS#1 v1.5 padding first. Currently you are performing textbook (or "raw") RSA decryption, which is little more than modular exponentiation. Try a PKCS#1 v1.5 capable class instead.
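To show what removing the padding means, here is a minimal sketch that strips it by hand. `pkcs1_v15_unpad` is a hypothetical helper and the sample block is made up to resemble the output in the question. In practice a library class such as the PKCS1_v1_5 cipher used in the encryption snippet further up is the safer choice, because hand-written unpadding is easy to get wrong.

```python
def pkcs1_v15_unpad(block: bytes) -> bytes:
    """Strip PKCS#1 v1.5 encryption padding from a raw RSA decryption result."""
    # Raw RSA results often lose the leading 0x00 byte, so accept both forms.
    if block.startswith(b"\x00"):
        block = block[1:]
    if not block.startswith(b"\x02"):
        raise ValueError("not a PKCS#1 v1.5 encryption block")
    # The first 0x00 after the block type separates padding from message.
    separator = block.find(b"\x00", 1)
    if separator < 9:  # at least 8 padding bytes are required
        raise ValueError("padding too short or separator missing")
    return block[separator + 1:]


# Made-up sample: 0x02, ten non-zero padding bytes, 0x00, then the message.
raw = b"\x02" + b"\x83\xcf\x84\xb1\xa6\x9f\xbb\x47\xf1\x14" + b"\x00" + b"this is it\n"
print(pkcs1_v15_unpad(raw))  # b'this is it\n'
```

The leading \x02 and the \x00 right before the text in the question's output are exactly these two markers.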
When I run the following commands in a terminal, the decrypted text differs from the original message, which suggests that these two keys map to different ciphers:
openssl enc -des-ecb -in text.in -out cipher.txt -k '96508092'
openssl enc -d -des-ecb -in cipher.txt -out text.in -k '82514145'
But when I implement it in code using <openssl/des.h>, Crypto.Cipher, or pyDes, I get the same decrypted text. I found out why: these 8-byte keys map to the same 7-byte key 0x3832343134313401. See my previous question, Why can I encrypt data with one DES key and successfully decrypt with another?
My question is: how do the OpenSSL terminal commands differ from the mentioned libraries, such that these 8-byte keys map to different ciphers?
You need to use an uppercase -K if you want to supply key bytes. Otherwise, OpenSSL assumes it's a password and derives a (different) key from it.
You also need to use the hex versions of the keys:
openssl enc -des-ecb -in text.in -out cipher.txt -K '3832353134313435'
openssl enc -d -des-ecb -in cipher.txt -out text.out -K '3933353035303434'
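The hex form expected by -K is just the hex encoding of the ASCII key bytes. The following minimal sketch shows the conversion and also why two such keys can behave identically in DES, which ignores the least significant (parity) bit of every key byte. Both helper names are hypothetical.

```python
def to_hex_key(ascii_key: str) -> str:
    """Hex form of an ASCII key, as passed to `openssl enc -K`."""
    return ascii_key.encode("ascii").hex()


def strip_des_parity(key: bytes) -> bytes:
    """Clear the least significant (parity) bit of each DES key byte."""
    return bytes(b & 0xFE for b in key)


print(to_hex_key("82514145"))  # 3832353134313435

# The two hex keys from the commands above, decoded back to ASCII.
k1 = bytes.fromhex("3832353134313435")
k2 = bytes.fromhex("3933353035303434")
print(k1, k2)  # b'82514145' b'93505044'

# Apart from the parity bits, both keys are identical.
print(strip_des_parity(k1) == strip_des_parity(k2))  # True
```

With -K both commands therefore use the same effective DES key, whereas with lowercase -k each string is treated as a password and a different key is derived from it.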
I need to encrypt a file, send it to another person, who then can only decrypt it using shell.
I usually encrypt the file with the openssl command: openssl enc -aes-256-cbc -salt -in [filename] -out [file out name] -pass file:[direct path to key file], then send the file.
The other person then would decrypt the file again with the openssl command: openssl enc -d -aes-256-cbc -in [encrypted file] -out [file out name] -pass file:[direct path to key file]
I would use os.system to do this, but I feel like there has to be another way to encrypt the file with python which then could be decrypted on the shell side.
Do you need to use openssl?
I use command line GnuPG, and there is a very nice Python library for it: python-gnupg. It is a wrapper around the command line gpg, so both work the same way.
Instead of a key file (which I think contains a password) you can use asymmetric cryptography. Create a private/public key pair for each party, then encrypt the message using the recipient's public key and sign it using the sender's private key. The recipient checks the sender's signature using the sender's public key and decrypts the message using her own private key. Private keys can be protected by a password, but if you are sure your environments are safe you can use empty passwords.
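If openssl enc has to stay on the receiving side, the Python side would need to reproduce its password-based key derivation. The following is a minimal sketch of that derivation (OpenSSL's EVP_BytesToKey with one iteration), assuming the command is used without -pbkdf2. The digest is an assumption as well: SHA-256 is the default from OpenSSL 1.1.0 on, MD5 before that. The password value is hypothetical, and the AES-256-CBC step itself is not shown because it needs a third-party library.

```python
import hashlib
import os
from typing import Tuple


def evp_bytes_to_key(password: bytes, salt: bytes, key_len: int, iv_len: int,
                     digest: str = "sha256") -> Tuple[bytes, bytes]:
    """Sketch of EVP_BytesToKey with an iteration count of 1."""
    material = b""
    previous = b""
    while len(material) < key_len + iv_len:
        # Each round hashes the previous round's output, the password and the salt.
        previous = hashlib.new(digest, previous + password + salt).digest()
        material += previous
    return material[:key_len], material[key_len:key_len + iv_len]


password = b"secret"   # hypothetical: the first line of the key file
salt = os.urandom(8)   # openssl enc uses an 8-byte random salt
key, iv = evp_bytes_to_key(password, salt, key_len=32, iv_len=16)

# A salted openssl enc file starts with this 16-byte header.
header = b"Salted__" + salt
print(len(key), len(iv), len(header))  # 32 16 16
```

The ciphertext written after the header would then have to be AES-256-CBC with the derived key and IV, so that the recipient's openssl enc -d call can read the salt and derive the same values from the shared password.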