Converting Passlib hash salt and digest to regular / PHC Base64 - python

I need to port a passlib (pbkdf2-sha512 to be specific) digest to another system (Auth0) and need to convert it to PHC string format using a B64 encoded salt (regular base64 without padding characters).
Passlib encodes to a shortened base64 format using its own 'adapted base64 encoding' method. I basically need to convert that to the B64 format to include in my import file for Auth0.
The pbkdf2-sha512 documentation explains the output format as:
$pbkdf2-digest$rounds$salt$checksum
My complete pbkdf2-sha512 hash of my password 'Password!' looks like this:
$pbkdf2-sha512$25000$8d7bW2stZaw1BoBQyhkjZA$Dszct0GGjjfikK3cJhx.4M.YdOoytY9T5qaib9y8C/gvC1rE4iCWT970bN/MJD81RVToY.855KWRsGoPudA0HA
A simple script to output the passlib hash:
from passlib.hash import pbkdf2_sha512
print(pbkdf2_sha512.hash("Password!"))
From the Passlib documentation, I assume the following:
Key size: 16 bytes (128 bits) - this is the default (not specified anywhere in the output)
Digest type: pbkdf2-sha512
Rounds: 25000
Salt: 8d7bW2stZaw1BoBQyhkjZA
Digest / Hash data: Dszct0GGjjfikK3cJhx.4M.YdOoytY9T5qaib9y8C/gvC1rE4iCWT970bN/MJD81RVToY.855KWRsGoPudA0HA
What I'm struggling to do is convert the salt and digest to the B64 format required by Auth0. Any help is appreciated!

OK, I think I've figured it out...
My understanding is that passlib's format simply replaces + with . and strips the padding and white space, so for my needs (vanilla B64 without whitespace or padding), I merely need to replace . with + and I have the correct format.
In my testing I also noticed that the key length is 512 bits (64 bytes), not 128 bits (16 bytes) as I originally thought.
I confirmed it using this tool: https://8gwifi.org/pbkdf.jsp
(Note that padding is added to the salt to make it 24 characters long)
For anyone wanting a bit of background on the Passlib base 64 flavor, see this post on Google groups.
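A minimal sketch of the conversion described above, using only the standard library. The function name passlib_to_phc is made up for illustration, and the `i=<rounds>,l=<key length>` parameter layout is an assumption about what the importing system expects, so check it against the Auth0 import documentation before relying on it.

```python
import base64


def passlib_to_phc(passlib_hash: str) -> str:
    """Rewrite a passlib pbkdf2 hash with plain (unpadded) Base64 fields."""
    # A passlib hash splits into: '', scheme, rounds, salt, checksum
    _, scheme, rounds, salt, checksum = passlib_hash.split("$")

    # passlib's adapted Base64 uses '.' where standard Base64 uses '+'
    salt_b64 = salt.replace(".", "+")
    hash_b64 = checksum.replace(".", "+")

    # Decode once (re-adding '=' padding) to learn the derived key length
    padded = hash_b64 + "=" * (-len(hash_b64) % 4)
    key_len = len(base64.b64decode(padded))

    # Assumed layout: $<scheme>$i=<rounds>,l=<key length>$<salt>$<hash>
    return f"${scheme}$i={rounds},l={key_len}${salt_b64}${hash_b64}"


example = (
    "$pbkdf2-sha512$25000$8d7bW2stZaw1BoBQyhkjZA$"
    "Dszct0GGjjfikK3cJhx.4M.YdOoytY9T5qaib9y8C/gvC1rE4iCWT970bN/MJD81RVToY.855KWRsGoPudA0HA"
)
print(passlib_to_phc(example))
```

The key length is derived from the checksum itself rather than hard-coded, which is how the 64-byte figure mentioned above shows up.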

Related

Encryption with NodeJS doesn't match encryption using Python (cryptography.fernet)

Cryptography noob here. I'm trying to write a script in NodeJS that encrypts a string and produces output that matches the output of my Python script that uses the cryptography.fernet library. My overall goal is to use the original key to encrypt messages in Node that will later be decrypted using Python.
Sample of my Python code:
from cryptography.fernet import Fernet
key = Fernet.generate_key() # For example: 6saGtiTFEXej729GUWSeAyQdIpRFdGhfY2XFUDpvsu8=
f = Fernet(key)
message = 'Hello World'
encoded = message.encode()
encrypted = f.encrypt(encoded)
Which produces the output: gAAAAABhJs_E-dDVp_UrLK6PWLpukDAM0OT5M6bfcqvVoCvg7r63NSi4OWOamLpABuYQG-5wsts_9h7cLbCsWmctArXcGqelXz_BXl_o2C7KM9o7_eq7VTc=
My Node script uses the built-in Crypto module and must also use the same 32-byte key that is being used in my Python program. I know that fernet uses AES-128-CBC as its algorithm, so that's what I'm using for my Node script.
My NodeJS code:
const crypto = require("crypto");
const key = '6saGtiTFEXej729GUWSeAyQdIpRFdGhfY2XFUDpvsu8=';
const algorithm = 'aes-128-cbc';
const message = 'Hello World';
const iv = crypto.randomBytes(16);
const cipher = crypto.createCipheriv(algorithm, key, iv);
const encrypted = cipher.update(message, 'utf8', 'hex') + cipher.final('hex');
Which is giving me: Error: Invalid key length
My first problem is that I'm unsure how to convert the key so that it's the proper length. I also know from looking at fernet's source code that the key is split into two parts: the first 16 bytes are the signing_key and the last 16 bytes are the encryption_key - I haven't found much information on whether/how I need to deal with those two pieces of the original key in my Node implementation.
Since I'm new to this I'm a little confused on how to accomplish what I'm after. Any tips or advice is very much appreciated.
The specs for the Fernet format can be found on https://github.com/fernet/spec/blob/master/Spec.md
There they specify both generating and verifying steps; here are the generating steps, which should give enough information for your implementation:
Record the current time for the timestamp field.
Choose a unique IV.
Construct the ciphertext:
Pad the message to a multiple of 16 bytes (128 bits) per RFC 5652, section 6.3. This is the same padding technique used in PKCS #7 v1.5 and all versions of SSL/TLS (cf. RFC 5246, section 6.2.3.2 for TLS 1.2).
Encrypt the padded message using AES 128 in CBC mode with the chosen IV and user-supplied encryption-key.
Compute the HMAC field as described above using the user-supplied signing-key.
Concatenate all fields together in the format above.
base64url encode the entire token.
From this we can see that the signing key (first half of the full key) is used in the HMAC, while the second half is used in AES-128-CBC, so just dividing the key into two separate elements (with proper conversion from the base64url string to bytes) should be enough for using the Node.js crypto module (https://nodejs.org/en/knowledge/cryptography/how-to-use-crypto-module/) to construct your own implementation.
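To make the key handling concrete, here is a rough standard-library sketch (in Python, for comparison with the script in the question) of the two steps that need no AES: splitting the key and computing the HMAC over the token fields. The helper names are hypothetical, and the ciphertext below is a placeholder rather than real AES output.

```python
import base64
import hashlib
import hmac
import os
import struct
import time


def split_fernet_key(key_b64: str):
    """Decode a base64url Fernet key into (signing_key, encryption_key)."""
    raw = base64.urlsafe_b64decode(key_b64)
    if len(raw) != 32:
        raise ValueError("a Fernet key must decode to exactly 32 bytes")
    # First half signs (HMAC-SHA256), second half encrypts (AES-128-CBC)
    return raw[:16], raw[16:]


def assemble_token(signing_key: bytes, iv: bytes, ciphertext: bytes, timestamp: int) -> bytes:
    """Concatenate the token fields, append the HMAC, and base64url encode."""
    # Version byte 0x80, then a 64-bit big-endian timestamp, IV, ciphertext
    body = b"\x80" + struct.pack(">Q", timestamp) + iv + ciphertext
    mac = hmac.new(signing_key, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(body + mac)


signing_key, encryption_key = split_fernet_key("6saGtiTFEXej729GUWSeAyQdIpRFdGhfY2XFUDpvsu8=")
iv = os.urandom(16)
placeholder_ciphertext = bytes(16)  # stand-in for the AES-128-CBC output
token = assemble_token(signing_key, iv, placeholder_ciphertext, int(time.time()))
print(len(signing_key), len(encryption_key), token[:6])
```

This also suggests where the "Invalid key length" error comes from: the 44-character key string has to be decoded to 32 bytes first, and only the last 16 of those form the AES-128 key.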

Convert string with HEX MD5 to base64 encoding

I need to convert a HEX-type md5 string to the base64 version in Python.
For example, if I had MD5: 4297f44b13955235245b2497399d7a93
I need the code to produce Qpf0SxOVUjUkWySXOZ16kw==
This is identical to another SO asking for a C# implementation, but I need the Python code. This is similar to this SO asking to convert a single binary number to base64 in Python.
Depending on the version of Python you are running, the following will work:
Python 2
import base64
base64.b64encode("4297f44b13955235245b2497399d7a93".decode("hex"))
Python 3
import base64
base64.b64encode(bytes.fromhex("4297f44b13955235245b2497399d7a93"))

Verify Python Passlib generated PBKDF2 SHA512 Hash in .NET

I am migrating a platform which used Passlib 1.6.2 to generate password hashes. The code to hash the password is (hash is called with the default value for rounds):
from passlib.hash import pbkdf2_sha512 as pb
def hash(cleartext, rounds=10001):
    return pb.encrypt(cleartext, rounds=rounds)
The output format looks like (for the password "Patient3" (no quotes)):
$pbkdf2-sha512$10001$0dr7v7eWUmptrfW.9z6HkA$w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg
And "Testing123"
$pbkdf2-sha512$10001$2ZuTslYKAYDQGiPkfA.B8A$ChsEXEjanEToQcPJiuVaKk0Ls3n0YK7gnxsu59rxWOawl/iKgo0XSWyaAfhFV0.Yu3QqfehB4dc7yGGsIW.ARQ
I can see that represents:
Algorithm SHA512
Iterations 10001
Salt 0dr7v7eWUmptrfW.9z6HkA (possibly)
The Passlib algorithm is defined on their site and reads:
All of the pbkdf2 hashes defined by passlib follow the same format, $pbkdf2-digest$rounds$salt$checksum.
$pbkdf2-digest$ is used as the Modular Crypt Format identifier ($pbkdf2-sha256$ in the example).
digest - this specifies the particular cryptographic hash used in conjunction with HMAC to form PBKDF2’s pseudorandom function for that particular hash (sha256 in the example).
rounds - the number of iterations that should be performed. this is encoded as a positive decimal number with no zero-padding (6400 in the example).
salt - this is the adapted base64 encoding of the raw salt bytes passed into the PBKDF2 function.
checksum - this is the adapted base64 encoding of the raw derived key bytes returned from the PBKDF2 function. Each scheme uses the digest size of its specific hash algorithm (digest) as the size of the raw derived key. This is enlarged by approximately 4/3 by the base64 encoding, resulting in a checksum size of 27, 43, and 86 for each of the respective algorithms listed above.
I found passlib.net which looks a bit like an abandoned beta and it uses '$6$' for the algorithm. I could not get it to verify the password. I tried changing the algorithm to $6$ but I suspect that in effect changes the salt as well.
I also tried using PWDTK with various values for salt and hash, but it may have been I was splitting the shadow password incorrectly, or supplying $ in some places where I should not have been.
Is there any way to verify a password against this hash value in .NET? Or another solution which does not involve either a Python proxy or getting users to resupply a password?
The hash is verified by passing the password into the PBKDF2 HMAC-SHA-512 function and then comparing the result to the saved hash portion, decoded from its Base64 form.
Split the saved string into its fields, then convert the salt and hash to binary
Convert the password to binary using UTF-8 encoding
PBKDF2-HMAC-SHA-512(toBinary(password), salt, 10001) == hash
Password: "Patient3"
$pbkdf2-sha512$10001$0dr7v7eWUmptrfW.9z6HkA$w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg
Breaks down to (with the strings converted to standard Base64: change '.' to '+' and add trailing '=' padding):
pbkdf2-sha512
10001
0dr7v7eWUmptrfW+9z6HkA==
w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7+hg==
Decoded to hex:
D1DAFBBFB796526A6DADF5BEF73E8790
C3D8FD00C5662803F5ECEA2C0AA0F10EFDA18ECBF394BA45F11ADAF08EE9FDBE7BE3AAE0859F16AE01230E9BD71B984BCF551E34BCE015AF350EB6F1D9BEFE86
Which makes sense: 16-byte (128-bit) salt and 64-byte (512-bit) SHA-512 hash.
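The decoding step can be reproduced with a few lines of Python. This is only an illustrative sketch; ab64_decode is a helper name chosen here, not something from the thread.

```python
import base64


def ab64_decode(text: str) -> bytes:
    """Decode passlib's adapted Base64 ('.' for '+', no '=' padding)."""
    standard = text.replace(".", "+")
    standard += "=" * (-len(standard) % 4)  # restore the stripped padding
    return base64.b64decode(standard)


salt = ab64_decode("0dr7v7eWUmptrfW.9z6HkA")
digest = ab64_decode(
    "w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg"
)
print(len(salt), salt.hex().upper())
print(len(digest), digest.hex().upper())
```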
Converting "Patient3" using UTF-8 to a binary array
Converting the salt from a modified BASE64 encoding to a 16 byte binary array
Using an iteration count of 10001
Feeding this to PBKDF2 using HMAC with SHA-512
I get
C3D8FD00C5662803F5ECEA2C0AA0F10EFDA18ECBF394BA45F11ADAF08EE9FDBE7BE3AAE0859F16AE01230E9BD71B984BCF551E34BCE015AF350EB6F1D9BEFE86
Which when Base64 encoded, replacing '+' characters with '.' and stripping the trailing '=' characters returns:
w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg
I quickly knocked together a .NET implementation using zaph's logic and using the code from JimmiTh on SO answer. I have put the code on GitHub (this is not supposed to be production ready). It appears to work with more than a handful of examples from our user base.
As zaph said the logic was:
Split the hash to find the iteration count, salt and hashed password. (I have assumed the algorithm, but you'd verify it). You'll have an array of 5 values containing [0] - Nothing, [1] - Algorithm, [2] - Iterations, [3] - Salt and [4] - Hash
Turn the salt into standard Base64 encoding by replacing any '.' characters with '+' characters and appending "==".
Pass the password, salt and iteration count to the PBKDF2-HMAC-SHA512 generator.
Convert back to the original base64 format by replacing any '+' characters with '.' characters and stripping the trailing "==".
Compare the original hash (element 4 in the split string) to this converted value; if they're equal, you've got a match.
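For reference, the same steps can be sketched in Python with only the standard library (hashlib.pbkdf2_hmac does the key derivation). The function names are invented for this example, and this is a rough outline of the logic rather than the .NET code mentioned above.

```python
import base64
import hashlib
import hmac


def ab64_decode(text: str) -> bytes:
    """Adapted Base64 -> bytes: swap '.' for '+' and restore '=' padding."""
    standard = text.replace(".", "+")
    return base64.b64decode(standard + "=" * (-len(standard) % 4))


def ab64_encode(raw: bytes) -> str:
    """Bytes -> adapted Base64: strip '=' padding and swap '+' for '.'."""
    return base64.b64encode(raw).decode("ascii").rstrip("=").replace("+", ".")


def verify_pbkdf2_sha512(password: str, stored: str) -> bool:
    # Splitting on '$' gives ['', algorithm, iterations, salt, hash]
    parts = stored.split("$")
    if len(parts) != 5 or parts[1] != "pbkdf2-sha512":
        raise ValueError("not a pbkdf2-sha512 hash")
    rounds = int(parts[2])
    salt = ab64_decode(parts[3])
    # The derived key length defaults to the SHA-512 digest size (64 bytes)
    derived = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, rounds)
    # Constant-time comparison of the re-encoded key with the stored checksum
    return hmac.compare_digest(ab64_encode(derived), parts[4])


stored = (
    "$pbkdf2-sha512$10001$0dr7v7eWUmptrfW.9z6HkA$"
    "w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg"
)
print(verify_pbkdf2_sha512("Patient3", stored))
print(verify_pbkdf2_sha512("not-the-password", stored))
```

Going by the worked example in this thread, the first call should report a match and the second should not.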

RFC 4648 (Base64) Implementation in Python

Is there any implementation of RFC4648 ("The Base16, Base32, and Base64 Data Encodings") in Python?
Note, I am specifically looking for RFC 4648, not its predecessor. Other scripting languages might work as long as it does not take too long. Python is preferred.
Python has a base64 module which implements RFC 3548 (an older revision of RFC 4648).
Ruby implements RFC 4648 in the Base64#strict_decode64 and Base64#strict_encode64 methods. Source.
There's a string-method version of base64 as well (Python 2 only):
a_str = 'This is a string'
a_str.encode('base64', 'strict')
'VGhpcyBpcyBhIHN0cmluZw==\n'
It does the same as base64.b64encode, apart from the trailing newline, but I thought I'd throw it in there as an option.
A Patch:
What you're looking for is "Extended Hex" for the regular base64 module, which I remember I installed for a DNSSEC project a while back: http://bugs.python.org/issue16995
The Go standard library has a pretty good base64 implementation (as defined in RFC 4648).
Relevant source file:
base64.go
While working with https://golang.org/src/encoding/base32/base32.go
I wasn't able to find a Python implementation of what RFC 4648 refers to as "Base 32 Encoding with Extended Hex Alphabet", so I wrote one for Python:
https://gist.github.com/graham/d7845f00fce0690b65ab049d52c1ddcb
As far as I can tell RFC 3548 doesn't differ if you use the standard character map, but I wasn't able to find a standard way to use the extended hex alphabet.
Feel free to copy and paste; I'll write some tests and create a git repo if people think this is a reasonable solution.
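As a smaller alternative to a full implementation, one possible sketch is to reuse base64.b32encode and translate between the two 32-character alphabets. This is not the code from the gist above; the helper names are made up here. Newer Python versions (3.10 and later) also ship base64.b32hexencode and base64.b32hexdecode, which should make this unnecessary.

```python
import base64

# Standard Base32 alphabet and the "Extended Hex" alphabet from RFC 4648
_STD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_HEX = b"0123456789ABCDEFGHIJKLMNOPQRSTUV"
_TO_HEX = bytes.maketrans(_STD, _HEX)
_FROM_HEX = bytes.maketrans(_HEX, _STD)


def b32hex_encode(data: bytes) -> bytes:
    """Base32-encode, then map each symbol to the extended hex alphabet."""
    return base64.b32encode(data).translate(_TO_HEX)


def b32hex_decode(encoded: bytes) -> bytes:
    """Map extended hex symbols back to standard Base32, then decode."""
    return base64.b32decode(encoded.translate(_FROM_HEX))


print(b32hex_encode(b"foobar"))
print(b32hex_decode(b"CPNMUOJ1E8======"))
```

The '=' padding character is not part of either alphabet, so the translation leaves it untouched.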
I believe this hack will do the trick:
base64_url = base64_string.rstrip("=").replace("/", "_").replace("+", "-")
Both Base64 and Base64url are ways to encode binary data in string form. You can read about the theory of base64 here http://en.m.wikipedia.org/wiki/Base64. The problem with Base64 is that it contains the characters +, /, and =, which have a reserved meaning in some filesystem names and URLs. So base64url solves this by replacing + with - and / with _. The trailing padding character = can be eliminated when not needed, but in a URL it would instead most likely be % URL encoded. Then the encoded data can be included in a URL without problems.
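Python's base64 module already provides the URL-safe alphabet, so a small sketch of the unpadded variant could look like this (the two wrapper names are just for illustration):

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Encode with the '-' and '_' alphabet and drop the '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the padding to a multiple of 4 characters, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


print(base64.b64encode(b"\xfb\xff"))  # standard alphabet, padded
print(b64url_encode(b"\xfb\xff"))     # URL-safe alphabet, unpadded
```

The bytes 0xfb 0xff are chosen because they encode to both '+' and '/' in the standard alphabet, which makes the substitution visible.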

How to add a padding to the data to make it acceptable for AES256 encryption algorithm in pycrypto library

Can someone tell me how to add a padding to the data to make it acceptable for AES256 encryption algorithm in pycrypto library (Python).
Thanks a lot in advance.. :)
Looking at the documentation, it seems that it's up to you, the library user, to pad the data yourself. The documentation states that the block size for AES is always 16 bytes, so you need to pad the data to a multiple of 16 bytes.
How the padding is done depends on the type of the data. For strings the best approach is probably to encode the string to a specific encoding and then take the length of that encoding. That way you're not relying on all characters being represented by an 8-bit codepoint:
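plaintext = data.encode('utf-8')
l = len(plaintext)
# PADDING_BYTE is a single byte such as b'\x00'; a full block is added when l is already a multiple of 16
ciphertext = cipher.encrypt(plaintext + ((16 - l % 16) * PADDING_BYTE))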
A similar approach will work when your data is an array of bytes.
0 should work fine as the PADDING_BYTE, but you need to take care to remove the padding when you're decrypting the data. It might be worthwhile including the length of the data in the ciphertext, e.g. prepend the length of the data to the plaintext before encryption, but then you need to jump through some hoops to make sure the padding is generated correctly.
Edit: oh yes, just like the RFC GregS links to mentions, the standard way of handling the length problem is to use the length of the padding as the padding byte. I.e. if you need 6 bytes of padding, the padding byte is 0x06. Note that if you don't need any padding, you need to add a whole block of padding bytes (16 bytes of 0x10) so that you can recover the message correctly.
Use a standard padding scheme, such as the scheme outlined in PKCS-5, section 6.1.1 step #4 (replace the 8 in that example with 16 if you are using AES).
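A minimal pure-Python sketch of that scheme (each padding byte holds the padding length, and a full block is added when the data is already aligned). The function names are illustrative and not part of pycrypto.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Check and strip the padding added by pkcs7_pad."""
    if not padded or len(padded) % block_size:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]


padded = pkcs7_pad("Hello World".encode("utf-8"))
print(len(padded), padded[-1])
print(pkcs7_unpad(padded))
```

The padded bytes would then be passed to cipher.encrypt, and pkcs7_unpad applied to the output of cipher.decrypt.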
