Verify Python Passlib generated PBKDF2 SHA512 Hash in .NET - python

I am migrating a platform which used Passlib 1.6.2 to generate password hashes. The code to encrypt the password is (hash is called with default value for rounds):
from passlib.hash import pbkdf2_sha512 as pb
def hash(cleartext, rounds=10001):
return pb.encrypt(cleartext, rounds=rounds)
The output format looks like (for the password "Patient3" (no quotes)):
$pbkdf2-sha512$10001$0dr7v7eWUmptrfW.9z6HkA$w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg
And "Testing123"
$pbkdf2-sha512$10001$2ZuTslYKAYDQGiPkfA.B8A$ChsEXEjanEToQcPJiuVaKk0Ls3n0YK7gnxsu59rxWOawl/iKgo0XSWyaAfhFV0.Yu3QqfehB4dc7yGGsIW.ARQ
I can see that represents:
Algorithm SHA512
Iterations 10001
Salt 0dr7v7eWUmptrfW.9z6HkA (possibly)
The Passlib algorithm is defined on their site and reads:
All of the pbkdf2 hashes defined by passlib follow the same format, $pbkdf2-digest$rounds$salt$checksum.
$pbkdf2-digest$ is used as the Modular Crypt Format identifier ($pbkdf2-sha256$ in the example).
digest - this specifies the particular cryptographic hash used in conjunction with HMAC to form PBKDF2’s pseudorandom function for that particular hash (sha256 in the example).
rounds - the number of iterations that should be performed. this is encoded as a positive decimal number with no zero-padding (6400 in the example).
salt - this is the adapted base64 encoding of the raw salt bytes passed into the PBKDF2 function.
checksum - this is the adapted base64 encoding of the raw derived key bytes returned from the PBKDF2 function. Each scheme uses the digest size of its specific hash algorithm (digest) as the size of the raw derived key. This is enlarged by approximately 4/3 by the base64 encoding, resulting in a checksum size of 27, 43, and 86 for each of the respective algorithms listed above.
I found passlib.net which looks a bit like an abandoned beta and it uses '$6$' for the algorithm. I could not get it to verify the password. I tried changing the algorithm to $6$ but I suspect that in effect changes the salt as well.
I also tried using PWDTK with various values for salt and hash, but it may have been I was splitting the shadow password incorrectly, or supplying $ in some places where I should not have been.
Is there any way to verify a password against this hash value in .NET? Or another solution which does not involve either a Python proxy or getting users to resupply a password?

The hash is verified by passing the password into the PBKDF HMAC-SHA-256 hash method and then comparing the resulting hash to the saved hash portion, converted back from the Base64 version.
Saved hash to binary, then separate the hash
Convert the password to binary using UTF-8 encoding
PBKDF2,HMAC,SHA-256(toBinary(password, salt, 10001) == hash
Password: "Patient3"
$pbkdf2 - sha512$10001$0dr7v7eWUmptrfW.9z6HkA$w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg
Breaks down to (with the strings converted to standard Base64 (change '.' to '+' and add trailing '=' padding:
pbkdf2 - sha512
10001
0dr7v7eWUmptrfW+9z6HkA==
w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7+hg==
Decoded to hex:
D1DAFBBFB796526A6DADF5BEF73E8790
C3D8FD00C5662803F5ECEA2C0AA0F10EFDA18ECBF394BA45F11ADAF08EE9FDBE7BE3AAE0859F16AE01230E9BD71B984BCF551E34BCE015AF350EB6F1D9BEFE86
Which makes sense: 16-byte (128-bit) salt and 64-byte (512-bit) SHA-512 hash.
Converting "Patient3" using UTF-8 to a binary array
Converting the salt from a modified BASE64 encoding to a 16 byte binary array
Using an iteration count od 10001
Feeding this to PBKDF2 using HMAC with SHA-512
I get
C3D8FD00C5662803F5ECEA2C0AA0F10EFDA18ECBF394BA45F11ADAF08EE9FDBE7BE3AAE0859F16AE01230E9BD71B984BCF551E34BCE015AF350EB6F1D9BEFE86
Which when Base64 encoded, replacing '+' characters with '.' and stripping the trailing '=' characters returns:
w9j9AMVmKAP17OosCqDxDv2hjsvzlLpF8Rra8I7p/b5746rghZ8WrgEjDpvXG5hLz1UeNLzgFa81Drbx2b7.hg

I quickly knocked together a .NET implementation using zaph's logic and using the code from JimmiTh on SO answer. I have put the code on GitHub (this is not supposed to be production ready). It appears to work with more than a handful of examples from our user base.
As zaph said the logic was:
Split the hash to find the iteration count, salt and hashed password. (I have assumed the algorithm, but you'd verify it). You'll have an array of 5 values containing [0] - Nothing, [1] - Algorithm, [2] - Iterations, [3] - Salt and [4] - Hash
Turn the salt into standard Base64 encoding by replacing any '.' characters with '+' characters and appending "==".
Pass the password, salt and iteration count to the PBKDF2-HMAC-SHA512 generator.
Convert back to the original base64 format by replacing any '+' characters with '.' characters and stripping the trailing "==".
Compare to the original hash (element 4 in the split string) to this converted value and if they're equal you've got a match.

Related

How is the random ciphertext in RSA encryption algorithm implemented?

When I used rsa library to encrypt content in Python,
I found that even if the same public key and the same plaintext were used,
the output content was different each time, and the output ciphertext could be decrypted perfectly.
So I want to know how the RSA encryption algorithm implements this algorithm with different encryption results each time.
The following is the source code and the ciphertext output for many times.
import rsa
data = b'hello, world'
pk = rsa.PublicKey(21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429, 65537)
sk = rsa.PrivateKey(21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429, 65537, 7180742814003184493745817226790609535628314246962295259545720906634095162818242875479619891118201610188935763454388765380592975819694916096822751254380575157372246976924478622789961650274744826184819271605876418277150620865958482714928972468695190683750109638846897363602141498155351308783613387153774908482554823734710213533339079775940427840254792667407339506634483414544868884993644469123554250547973774825288728499603644573043340903253662627022861078040710813466717381393318974263956822836617559198769733538785368579523554468493535497334351910973554355558084517450711717078208243534059900951053098416621979162953, 2892399658197458942905975614589062229163400545478597547382814345027395128547900843767403239802516658965367060847402270250006453487328128143951683257674546551047677883067394312961875875837583648708792776670850392284514504120294996660277476938434444686489314576152155327763997732075822518345380214599954128122325100250621109610911, 7595171996887213720796562116779069406951367089854155042546817829399701614804640519699383335239152053864712615020908685785110173445687693446414448808069297671341400340127530462352491976340390927112062123224788804186559233620266300549932283394695195359373967318632526999572685782623554155939)
print(rsa.encrypt(data, pk)
# 1
b'\x17T\xc0\x03\xa4\xa6\xc06\x83\xdcM\xe5\xf9\xd8t\xc9>\xad}\xc9\x15[\xcc!\x19\x97/\xbf\xc7\xe4\xcbhu\x8d\xfb&\x18\x84\xc8e\xec\xe1\n\xfd$\x92\xda\x12S\x0f\r\xba\x81y\x88E\x9ceu\xd9\xd2Z\xf8\xc3\xd3&\xf2\xf7j\t\t\xf2\xc6w\xf6\x9a7\xbd\x01\x96\xad\xf5\x9e\xf4\xa8,\xd2\x19b\x0f\x05\x0c\xd8G\xe66\x91\x85.\xbdX\x0b\xd9H\xb14\xc6\x88\xb5\xd7\x1f\xed\xf7\xb4\x10\xb7\xad\x9f\xab\x01\r(\r*\xd90\x84\xba\xfb\xd9\x94HK\xdf\xaf\xa0\xf2\x98\x96\xb6*b\xb5\xc0\xa6\xe5A[\x9fwf\x18\x08v\x85\t\xb7\xf7\x97\xc74\xe5{;9qw\xb1u>\t`\xfd\x10\xfbu\xfb\xf5\x11\xe9\xc1\xa0I\x96\x03\xa5\x84\x0b\xcd\x060\xa1\xb1\xbcs|\xfe\xf3N\xad\xddA\xe2l\xf83N\xae\x9c\xbe\x1568\xe9\xf5\xfdn\xe9\xbc\x98\xb5\xb9Bn\xf1]!\x86\xd39\xd2<&\xd6}\x9a\xe2\xa4|\xf0\x9a\xaf\xac\x08^\x93\x174\n~L<+=\x8d\x95'
# 2
b'5\xbc\xb2\xaa\x16\'\xa2\x93\x16D\'S\xfc\x9fm\xc9\xbbF\xa6:dN\x91f\xc1\xaa\x05\xeb\xe4\x16|\xd3\x07#\xd5\xda\xe9\x9b\xd0V\xd4\xb0#Y\xf2G\x0c\xae\xb7A\x9a\xaa\xb8^\xf8\xea\xddj%\xd0\xe8w\xb2\xf1\x9c\xf8D\xcc\x9b\xfe\xea\x16hT\x81\'u`\x10"\xaf\xe3\xd3#\xa0\xc2\x18\x8f^lE\xb0H\xe8\xd5\xf2\x8e\xd8\x8fq;\xd7B]\xc8j\x94\'0\xb0\x80\x0f\xd3\xd1\x90I\x1eL\x91y\x8dA\x01\xda>x`\x0b}6:\xb6o\xcf\xd1=\x15p\xdb\x16\xd3bF\xd5\xc9\\\x86\x1b\xeb\xc4H\x11\x04\xa9o\xe1\xffSF\xe3\xc1\x99\x05\xc44\x03\x86\x81\xbb#>\xfb\xc2\x0bscbW\x0f\xb8\x92\x81\xbb\x19c\xd1n\t\xa4sI\x91+\x97\x9e\x0b\xf1\x8b\xd2;\xa9NV\xc1\xb0#\xd1\xa24P\xce\x93US\xf5\x97=m\xb3\xb6\xd3\x9b\'\xade\x1e\xbc\x80\x13C\x99\x93\x89&\xbd\xde\x83f\\H6\xad2\nFM\xf07q\xe9`\xb1H\x98#X'
# 3
b"'E\xdb\xfd\xe4\xf9\x0c\xe1\xa4l\xaaq\x0e#\xde2\xe9\xe4\x12\xb3\xc2d\xd1W\xde*\x8d<\xcb\x1a\xea\xb4\xb86\x9bV0\r\xef\xfb\xafg\xe8\x1eHzg\x03I\x99ta\xad\x84[r.E\xbb\xc2\xae\xf1\xc2\xafd\xcb\xa6`\xf0)U\x85\xb1\n0\xb2\x05\x17s\xa3\xe3f\xb7\xda\x08\xd1\xae#\xd8\xa7\x90Tce\xc2\xac\xf3Q\x81\xbe1\x92\x8d\xcb\xbf\xfa\x88\xf3'\xe8\xa1\x9e\x9e\xae~\xb90Uq\x98\xe6\x17b\x9d]1\xf6\xabirw\xbc\x89\xae\xd8\xdf\x8a\xf5\xf1\xd4*~\x94\xe38\x1f$\x0e\x94t\xb64\x83q\xf8\x8f\xd6pR\xd4%\xf8\x1cv\xc5\xfe\x8d]\xcfy\xff\xb9\xc7\x10\xaao%\xa8\x13\xce6#Y\xfa\x06\xb8\xab(H^\xd8\x1a\xb63\xb0\xb0c\xe0\x11#\xa9\t\xdd\xa8\\\xeag\xc6H\xa5L\x0b\x10\xdb\xa9\xc44\xdcZ\xf1`\xa2\xc1^;\x1d\xdf\xbf\x92\x894\x847\xe9\x16\x15\xad\xd1c\xf9.\xc21\x02\x85\xb1\x0b\x96=\xf3D\xdf\xf7\xbep\x9c"
# 4
b'$\x82\xc8\x95\xcb\xdaq\xc0\x16\x0e\xef\xb6\xc8\x89\xabKQafM\x10^\x11\xea2\xfc\x8b\x0b~H\xfd\xe5\xe0\x80\x81<\xae\xb7\xfeT)K\xb3\x96\xc0y\x83e\x93\xae\xdb\x93\x82\xea\xb7\xb7\xdbQJX\xb2\xfdM\xf2(A6+e\xb7\x89\x8a\xba6\xb7\xa3\xde*\xea\xe0\x1cR\xa9i\x8a\x9aEK\xa2T\xebM\xa9\x1d\x96\x87\xaf\xb2I\xcej!"\xe2\xc8\xc08\x94\x8a\x18\x1d\t\x11`\xdf*\xbc\xb9\xf6J\xbci\xb3\xcc\xde\xb0\xa5\x98b}o\x94\xbe\xe0\x7f\xe2J\x8a\xa2)R{U\xdfu\xf6UO\xc2C\xf3\'\x87c\x1e\xc6\xe0\xbe\x879\xa5N\xb3J\xc8Cz\x9b\xa7\xec\x90[\xa8\x8a\xac\xeep\\ar\xbd\x94O\xce]\x1fw\x1bm|K\xce\x15\xf6\xcc\xc5\xc84\x9a\x00Z\x0b\xfd\xe9\xfb^6\x9b\xfd\xeb\x8c\xf1h\xda\x17\xc4\xb0\x08\\-\n7\x9e\x1f\x1d\xa7\xb4\xb9\xf0wq\x9a\x15G\xc5\x90\xf5\x00\x89\tI\x16\x90\xbcI\x80z\x90\xdb\nO\xdc\xe5\x8fh\xca'
Any asymmetric encryption method has to be randomized, so that if you encrypt the same plaintext twice, you don't get the same ciphertext. Otherwise it would be very insecure. Anyone who has the public key can encrypt something. Suppose an adversary has a ciphertext, they want to find out the plaintext, and they have partial information about the plaintext (e.g. they know it's a message in a certain format, but they don't know the exact content). They can try encrypting possible values of the plaintext until the result is the ciphertext they want to break. But since the encryption is randomized, they need to use the same data input and the same random value, otherwise they won't get the same ciphertext. And the adversary can't know what random value went into the ciphertext they want to break.
For RSA, in practice, there are two methods for doing encryption. Both are defined by the document known as PKCS#1. Both take the plaintext to encrypt and apply a transformation to it that involves either appending random data (PKCS#1 v1.5) or masking with random data (PSS). Then the result undergoes the well-known exponentiation part of RSA.
You can use the exponentiation to inspect a ciphertext.
n = 21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429
e = 65537
d = 7180742814003184493745817226790609535628314246962295259545720906634095162818242875479619891118201610188935763454388765380592975819694916096822751254380575157372246976924478622789961650274744826184819271605876418277150620865958482714928972468695190683750109638846897363602141498155351308783613387153774908482554823734710213533339079775940427840254792667407339506634483414544868884993644469123554250547973774825288728499603644573043340903253662627022861078040710813466717381393318974263956822836617559198769733538785368579523554468493535497334351910973554355558084517450711717078208243534059900951053098416621979162953
c1 = b'\x17T\xc0\x03\xa4\xa6\xc06\x83\xdcM\xe5\xf9\xd8t\xc9>\xad}\xc9\x15[\xcc!\x19\x97/\xbf\xc7\xe4\xcbhu\x8d\xfb&\x18\x84\xc8e\xec\xe1\n\xfd$\x92\xda\x12S\x0f\r\xba\x81y\x88E\x9ceu\xd9\xd2Z\xf8\xc3\xd3&\xf2\xf7j\t\t\xf2\xc6w\xf6\x9a7\xbd\x01\x96\xad\xf5\x9e\xf4\xa8,\xd2\x19b\x0f\x05\x0c\xd8G\xe66\x91\x85.\xbdX\x0b\xd9H\xb14\xc6\x88\xb5\xd7\x1f\xed\xf7\xb4\x10\xb7\xad\x9f\xab\x01\r(\r*\xd90\x84\xba\xfb\xd9\x94HK\xdf\xaf\xa0\xf2\x98\x96\xb6*b\xb5\xc0\xa6\xe5A[\x9fwf\x18\x08v\x85\t\xb7\xf7\x97\xc74\xe5{;9qw\xb1u>\t`\xfd\x10\xfbu\xfb\xf5\x11\xe9\xc1\xa0I\x96\x03\xa5\x84\x0b\xcd\x060\xa1\xb1\xbcs|\xfe\xf3N\xad\xddA\xe2l\xf83N\xae\x9c\xbe\x1568\xe9\xf5\xfdn\xe9\xbc\x98\xb5\xb9Bn\xf1]!\x86\xd39\xd2<&\xd6}\x9a\xe2\xa4|\xf0\x9a\xaf\xac\x08^\x93\x174\n~L<+=\x8d\x95'
print(binascii.unhexlify('0' + hex(pow(int(binascii.hexlify(c1), 16), d, n))[2:]))
That last value is the padded plaintext. You can see the data in there, with padding before it. This is the PKCS#1 v1.5 padding method (which is insecure unless used very carefully, and should not be used except for backward compatibility with systems that require it).

Converting Passlib hash salt and digest to regular / PHC Base64

I need to port a passlib (pbkdf2-sha512 to be specific) digest to another system (Auth0) and need to convert it to PHC string format using a B64 encoded salt (regular base64 without padding characters).
Passlib encodes to a shortened base64 format using it's own 'adapted base64 encoding' method. I basically need to convert that to the B64 format to include in my import file for Auth0.
The pbkdf2-sha512 documentation explains the output format as:
$pbkdf2-digest$rounds$salt$checksum
My complete pbkdf2-sha512 hash of my password 'Password!' looks like this:
$pbkdf2-sha512$25000$8d7bW2stZaw1BoBQyhkjZA$Dszct0GGjjfikK3cJhx.4M.YdOoytY9T5qaib9y8C/gvC1rE4iCWT970bN/MJD81RVToY.855KWRsGoPudA0HA
A simple script to output the passlib hash:
from passlib.hash import pbkdf2_sha512
print(pbkdf2_sha512.hash("Password!"))
From the Passlib documentation, I assume the following:
Key size: 16k (128 bits) - this is the default (not specified anywhere in the output)
Digest type: pbkdf2-sha512
Rounds: 25000
Salt: 8d7bW2stZaw1BoBQyhkjZA
Digest / Hash data: Dszct0GGjjfikK3cJhx.4M.YdOoytY9T5qaib9y8C/gvC1rE4iCWT970bN/MJD81RVToY.855KWRsGoPudA0HA
What I'm struggling to do is convert the salt and digest to the B64 format required by Auth0. Any help is appreciated!
OK, I think I've figured it out...
My understanding is that passlib's format simply replaces + with . and strips the padding and white space, so for my needs (vanilla B64 without whitespace or padding), I merely need to replace . with + and I have the correct format.
In my testing I also noticed that key length is 512 bytes (64k) not 128/16 as I originally thought.
I confirmed it using this tool: https://8gwifi.org/pbkdf.jsp
(Note that padding is added to the salt to make it 24 characters long)
For anyone wanting a bit of background on the Passlib base 64 flavor, see this post on Google groups.

What hash (Python 3 hashlib) yields a portable hash of file contents?

I would like to compute the hash of the contents (sequence of bits) of a file (whose length could be any number of bits, and so not necessarily a multiple of the trendy eight) and send that file to a friend along with the hash-value. My friend should be able to compute the same hash from the file contents. I want to use Python 3 to compute the hash, but my friend can't use Python 3 (because I'll wait till next year to send the file and by then Python 3 will be out of style, and he'll want to be using Python++ or whatever). All I can guarantee is that my friend will know how to compute the hash, in a mathematical sense---he might have to write his own code to run on his implementation of the MIX machine (which he will know how to do).
What hash do I use, and, more importantly, what do I take the hash of? For example, do I hash the str returned from a read on the file opened for reading as text? Do I hash some bytes-like object returned from a binary read? What if the file has weird end-of-line markers? Do I pad the tail end first so that the thing I am hashing is an appropriate size?
import hashlib
FILENAME = "filename"
# Now, what?
I say "sequence of bits" because not all computers are based on the 8-bit byte, and saying "sequence of bytes" is therefore too ambiguous. For example, GreenArrays, Inc. has designed a supercomputer on a chip, where each computer has 18-bit (eighteen-bit) words (when these words are used for encoding native instructions, they are composed of three 5-bit "bytes" and one 3-bit byte each). I also understand that before the 1970's, a variety of byte-sizes were used. Although the 8-bit byte may be the most common choice, and may be optimal in some sense, the choice of 8 bits per byte is arbitrary.
See Also
Is python's hash() portable?
First of all, the hash() function in Python is not the same as cryptographic hash functions in general. Here're the differences:
hash()
A hash is an fixed sized integer that identifies a particular value. Each value needs to have its own hash, so for the same value you will get the same hash even if it's not the same object.
Note that the hash of a value only needs to be the same for one run of Python. In Python 3.3 they will in fact change for every new run of Python
What does hash do in python?
Cryptographic hash functions
A cryptographic hash function (CHF) is a mathematical algorithm that maps data of an arbitrary size (often called the "message") to a bit array of a fixed size
It is deterministic, meaning that the same message always results in the same hash.
https://en.wikipedia.org/wiki/Cryptographic_hash_function
Now let's come back to your question:
I would like to compute the hash of the contents (sequence of bits) of a file (whose length could be any number of bits, and so not necessarily a multiple of the trendy eight) and send that file to a friend along with the hash-value. My friend should be able to compute the same hash from the file contents.
What you're looking for is one of the cryptographic hash functions. Typically, to calculate the file hash, MD5, SHA-1, SHA-256 are used. You want to open the file as binary and hash the binary bits, and finally digest it & encode it in hexadecimal form.
import hashlib
def calculateSHA256Hash(filePath):
h = hashlib.sha256()
with open(filePath, "rb") as f:
data = f.read(2048)
while data != b"":
h.update(data)
data = f.read(2048)
return h.hexdigest()
print(calculateSHA256Hash(filePath = 'stackoverflow_hash.py'))
The above code takes itself as an input, hence it produced an SHA-256 hash for itself, being 610e15155439c75f6b63cd084c6a235b42bb6a54950dcb8f2edab45d0280335e. This remains consistent as long as the code is not changed.
Another example would be to hash a txt file, test.txt with content Helloworld.
This is done by simply changing the last line of the code to "test.txt"
print(calculateSHA256Hash(filePath = 'text.txt'))
This gives a SHA-256 hash of 5ab92ff2e9e8e609398a36733c057e4903ac6643c646fbd9ab12d0f6234c8daf.
I arrived at sha256hexdigestFromFile, an alternative to #Lincoln Yan 's calculateSHA256Hash, after reviewing the standard for SHA-256.
This is also a response to my comment about 2048.
def sha256hexdigestFromFile(filePath, blocks = 1):
'''Return as a str the SHA-256 message digest of contents of
file at filePath.
Reference: Introduction of NIST (2015) Secure Hash
Standard (SHS), FIPS PUB 180-4. DOI:10.6028/NIST.FIPS.180-4
'''
assert isinstance(blocks, int) and 0 < blocks, \
'The blocks argument must be an int greater than zero.'
with open(filePath, 'rb') as MessageStream:
from hashlib import sha256
from functools import reduce
def hashUpdated(Hash, MESSAGE_BLOCK):
Hash.update(MESSAGE_BLOCK)
return Hash
def messageBlocks():
'Return a generator over the blocks of the MessageStream.'
WORD_SIZE, BLOCK_SIZE = 4, 512 # PER THE SHA-256 STANDARD
BYTE_COUNT = WORD_SIZE * BLOCK_SIZE * blocks
yield MessageStream.read(BYTE_COUNT)
return reduce(hashUpdated, messageBlocks(), sha256()).hexdigest()

Python Password Hashing with hashlib

I'm trying to hash a password for a login system I am creating. I am using the hashlib import and using the blake2b hash algorithm. I can't seem to figure out how to hash a variable such as passwordEntry. All the hashlib examples are just of blake2b hashing characters. For example: blake2b(b'IWantToHashThis') I am quite confused on why the "b" letter has to be included in the hash. If I try to hash a variable the "b" letter can't be concluded with the variable I want to hash. Example of me trying to hash a variable: blake2b(passwordEntry) Another example of me trying to hash the variable: blake2b(b passwordEntry) On the second example I just gave hashlib thinks that it is trying to hash the variable "b passwordEntry." Like I said before the "b" letter has to be included in the hashing algorithm for it to preform correctly. Sorry for the long question if it is hard to follow I understand.
The letter b only works before quotes, [", ', """, ''''].
And it is there to notate that this string is bytes.
If you want to convert your string to bytes you can do that by
b"string" or "string".encode(). However, in your case you can only use the encode() method of str since b only works for Literal Strings.
So in your case it will be blake2b(passwordEntry.encode())

MD5 Encrypting Woes

I've written an encrypting program in Python, one of my options is an md5 encryption. When i run a known string through my md5 encryption I receive a different hash value then if I run the EXACT same string through an md5 encryption website or cryptofox for firefox.
eg. my programs hash output - fe9c25d61e56054ea87703e30c672d91 - plaintext: g4m3
eg. online hash / cryptofox - 26e4477a0fa9cb24675379331dba9c84 - plaintext: g4m3
EXACT same word, 2 different hash values.
now heres my code snipet:
word="g4m3"
string=md5.new(word).hexdigest()
print string
You included a newline in your MD5 input string:
>>> import md5
>>> word="g4m3"
>>> md5.new(word).hexdigest() # no newline
'26e4477a0fa9cb24675379331dba9c84'
>>> md5.new(word + '\n').hexdigest() # with a newline
'fe9c25d61e56054ea87703e30c672d91'
When reading data from a file, make sure you remove the newline character at the end of the lines. You can use .rstrip('\n') to just remove line separation characters from the end of the line, or use .strip() to remove all whitespace from start or end of the line:
>>> word = 'g4m3\n'
>>> md5.new(word).hexdigest()
'fe9c25d61e56054ea87703e30c672d91'
>>> word = word.strip()
>>> md5.new(word).hexdigest()
'26e4477a0fa9cb24675379331dba9c84'
As to your question: Hashing is very sensitive. Even a single character of difference can result in a radically different output string. It may be the case that the online implementation is appending a whitespace char, or more likely, a newline. This extra character will change the output of the algorithm. (It's also possible the opposite is happening: you are appending a newline and the online one is not)
As to MD5 "encryption":
MD5 is NOT encryption. It is hashing. Encryption takes in data and a key, and spits out random data. This data is recoverable. Hashing, on the other hand, takes in data and spits out a finite amount of data that REPRESENTS the original data. The original data, however, unless stored elsewhere, is lost.
More information for reference:
Another interesting difference is the data the various types of algorithms spit out. Encryption can take in any amount of data (within the scope of the OS/software of course) and will output a bunch of data appx. equal in size to the input data. Hashing, however, will not. Since it is a mere representation of the data, it has a limited output. This can pose problems. For instance, if you had an infinite amount of data, eventually, two entirely different pieces of data would have the same hash. For this reason, when using hashing to compare two different values, it is usually a good idea to compare two separate hashes as well. The statistical probability that two separate pieces of data having TWO EQUAL HASHES is astronomically low.
Of course, then you get into hashing algorithms that utilize encryption methods at their core, but I won't go into that here.

Categories