What does the following line of code do? (Python)

var=hashlib.md5(str(random.random())).hexdigest()[:16]
I was reading some Python code when I came across the line above.
Can anybody explain what it does?

The line creates a random 16 character hex string.
random.random() produces a random float value in the range [0.0, 1.0).
>>> import random
>>> random.random()
0.845295579640289
str() produces a string version of that random value.
>>> str(0.845295579640289)
'0.84529557964'
hashlib.md5() creates an MD5 hash object, initialised with the value (on Python 3 it must be bytes):
>>> import hashlib
>>> hashlib.md5(b'0.84529557964')
<md5 HASH object @ 0x10074c530>
The hexdigest() method then produces the hash digest, expressed in hexadecimal. The MD5 algorithm produces 16 bytes of output; expressed in hexadecimal, that means 32 characters are produced:
>>> hashlib.md5(b'0.84529557964').hexdigest()
'5180b52225eac65bee1d6419e28ef397'
The [:16] slice picks out the first 16 characters. This step halves the digest to just the first 16 of its 32 characters:
>>> '5180b52225eac65bee1d6419e28ef397'[:16]
'5180b52225eac65b'
All in all, a rather verbose, inefficient and insecure way of producing a random 16 character hex value. I'd use os.urandom() instead, converting to hex:
>>> import os
>>> os.urandom(8).hex()
'a8cb7b56d476b556'
This produces 8 random bytes which, when expressed as hex, also produce 16 hex characters, entirely random.
My crypto-fu isn't that great, but I have the impression that the latter form is cryptographically stronger than taking half of an MD5 hash of a string of a floating point pseudo-random value.
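For what it's worth, on Python 3.6+ the standard-library secrets module wraps the same OS randomness and makes this a one-liner; a minimal sketch:

```python
import secrets

# token_hex(n) returns n random bytes rendered as 2*n hex characters,
# drawn from the OS CSPRNG, so 8 bytes -> 16 hex characters
var = secrets.token_hex(8)
print(var)  # e.g. 'a8cb7b56d476b556' (random each run)
```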

MD5 is a hashing algorithm (not encryption; it cannot be reversed) which generates a 128-bit checksum, expressed as a 32-digit hex number in text format.
So hashlib.md5(str(random.random())).hexdigest() will give you this number as a string, and
[:16] will extract the first 16 digits of that hash and store them in var.
Read these references for more details:
Python md5
MD5 Hash


Pack into c types and obtain the binary value back

I'm using the following code to pack an integer into an unsigned short:
import struct

raw_data = 40
# Pack into little endian
data_packed = struct.pack('<H', raw_data)
Now I'm trying to unpack the result as follows. I use utf-16-le since the data is encoded as little-endian.
import binascii

def get_bin_str(data):
    bin_asc = binascii.hexlify(data)
    result = bin(int(bin_asc.decode("utf-16-le"), 16))
    trimmed_res = result[2:]
    return trimmed_res

print(get_bin_str(data_packed))
Unfortunately, it throws the following error,
result = bin(int(bin_asc.decode("utf-16-le"), 16))
ValueError: invalid literal for int() with base 16: '㠲〰'
How do I properly decode the bytes in little-endian to binary data properly?
Use unpack to reverse what you packed. The data isn't UTF-encoded so there is no reason to use UTF encodings.
>>> import struct
>>> data_packed = struct.pack('<H', 40)
>>> data_packed.hex()  # the two little-endian bytes are 0x28 (40) and 0x00 (0)
'2800'
>>> data = struct.unpack('<H',data_packed)
>>> data
(40,)
unpack returns a tuple, so index it to get the single value
>>> data = struct.unpack('<H',data_packed)[0]
>>> data
40
To print in binary, use string formatting. Either of these works well; bin() doesn't let you specify the number of binary digits to display, and the 0b prefix needs to be removed if not desired.
>>> format(data,'016b')
'0000000000101000'
>>> f'{data:016b}'
'0000000000101000'
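Putting the round trip together, here is a small sketch (the helper name short_to_bin is my own, not from the question):

```python
import struct

def short_to_bin(value):
    # pack as a little-endian unsigned short, unpack it back,
    # and format the result as a fixed 16-bit binary string
    packed = struct.pack('<H', value)
    (n,) = struct.unpack('<H', packed)
    return format(n, '016b')

print(short_to_bin(40))  # -> 0000000000101000
```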
You have not said what you are trying to do, so let's assume your goal is to educate yourself. (If you are trying to pack data that will be passed to another program, the only reliable test is to check if the program reads your output correctly.)
Python does not have an "unsigned short" type, so the output of struct.pack() is a byte array. To see what's in it, just print it:
>>> data_packed = struct.pack('<H', 40)
>>> print(data_packed)
b'(\x00'
What's that? Well, the character (, which is decimal 40 in the ascii table, followed by a null byte. If you had used a number that does not map to a printable ascii character, you'd see something less surprising:
>>> struct.pack("<H", 11)
b'\x0b\x00'
Where 0x0b is 11 in hex, of course. Wait, I specified "little-endian", so why is my number on the left? The answer is, it's not. Python prints the byte string left to right because that's how English is written, but that's irrelevant. If it helps, think of strings as growing upwards: from low memory locations to high memory. The least significant byte comes first, which makes this little-endian.
Anyway, you can also look at the bytes directly:
>>> print(data_packed[0])
40
Yup, it's still there. But what about the bits, you say? For this, use bin() on each of the bytes separately:
>>> bin(data_packed[0])
'0b101000'
>>> bin(data_packed[1])
'0b0'
The two high bits you see are worth 32 and 8. Your number was less than 256, so it fits entirely in the low byte of the short you constructed.
What's wrong with your unpacking code?
Just for fun let's see what your sequence of transformations in get_bin_str was doing.
>>> binascii.hexlify(data_packed)
b'2800'
Um, all right. Not sure why you converted to hex digits, but now you have 4 bytes, not two. (28 is the number 40 written in hex, the 00 is for the null byte.) In the next step, you call decode and tell it that these 4 bytes are actually UTF-16; there's just enough for two unicode characters, let's take a look:
>>> b'2800'.decode("utf-16-le")
'㠲〰'
In the next step Python finally notices that something is wrong, but by then it does not make much difference because you are pretty far away from the number 40 you started with.
To correctly read your data as a UTF-16 character, call decode directly on the byte string you packed.
>>> data_packed.decode("utf-16-le")
'('
>>> ord('(')
40
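As an aside, Python 3's int.from_bytes turns the packed bytes straight back into a number, with no struct.unpack or decode step needed:

```python
import struct

data_packed = struct.pack('<H', 40)
# interpret the two bytes as one little-endian integer
n = int.from_bytes(data_packed, 'little')
print(n)       # 40
print(bin(n))  # 0b101000
```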

How to use Hashlib to MD5 hash a number?

It seems everyone is doing this:
import hashlib
# initializing string
str = "GeeksforGeeks"
# encoding GeeksforGeeks using encode()
# then sending to md5()
result = hashlib.md5(str.encode())
However, I want to hash plain numbers. Something like
result = hashlib.md5(0)
#or
var = 5
result = hashlib.md5(var)
isn't working, and I've tried lots of other variations. What's the right syntax?
Hashes operate on a sequence of bytes.
An integer in Python is just simply a logical value; it has no definite size or byte-wise representation. If you want to hash numbers, you need to decide what form to put the number in before hashing it.
The simplest option would be to hash the string representation of the number. Do this by calling str, encoding the result to bytes, and hashing that. E.g.
import hashlib

var = 5
hash_input = str(var).encode()
result = hashlib.md5(hash_input)
Another option would be to choose a fixed size, and hash the binary representation of the number:
import struct

var = 5
hash_input = struct.pack('<I', var)  # Little-endian 32-bit unsigned
result = hashlib.md5(hash_input)
The correct way to do this totally depends on what exactly you're trying to accomplish, which you haven't told us.
Hashing plain numbers is ambiguous, to say the least.
The number must be converted to bytes before being fed to the digest algorithm. The first problem you'll run into is how many bytes the number occupies: it could be 4 bytes, 8 bytes, or anything else. Then comes endianness, the order of the bytes in memory. All of these result in different digests for seemingly the same number. (For simplicity, I've assumed the number is an int.)
>>> hashlib.md5(b'4').hexdigest()
'a87ff679a2f3e71d9181a67b7542122c'
>>> i = 4
>>> hashlib.md5(i.to_bytes(2, 'big')).hexdigest()
'c244b9cdf7853b5693a295e384c07367'
>>> hashlib.md5(i.to_bytes(4, 'big')).hexdigest()
'ea4959eb64a1f09be580d950964f3843'
>>> hashlib.md5(i.to_bytes(8, 'big')).hexdigest()
'59cff542fae7e0c4267e45740a12c9a0'
So, your solution would be to convert the int to str and encode it so you can get a digest from it.
# either this
>>> hashlib.md5(b'4').hexdigest()
# or
>>> hashlib.md5(str(4).encode()).hexdigest()
Hope that helps.
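To make the choice explicit, here is a sketch of a helper (md5_of_int is a name I made up) that fixes the width and byte order up front, so equal numbers always produce equal digests:

```python
import hashlib

def md5_of_int(n, width=8):
    # a fixed width and byte order make the digest reproducible
    return hashlib.md5(n.to_bytes(width, 'big')).hexdigest()

print(md5_of_int(4))     # the 8-byte big-endian digest shown above
print(md5_of_int(4, 4))  # a different digest: different byte representation
```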

Hash algorithm (SHA-512) doesn't return a fixed length of bits

I am wondering why the sha-512 (or sha-256) function from the hashlib library (Python 3.6.x) appears not to return a fixed number of bits.
First, I hashed strings like 'aaa' or 'mov' with the hashlib.sha512 function,
then took the hex digest of the returned object, then converted the hex value into binary.
It happens 2-3 times in 100000 runs.
Can anyone explain this?
>>> import hashlib as h
>>> i1 = h.sha512(b'mov')
>>> num = int(i1.hexdigest(), 16)
>>> num
11775820457324453297447618001055999940741095690927818803951219801773598183145805229667200223221871868971369247216868356532234761527576077269523848115505381
>>> binary = bin(num)
>>> binary
'0b1110000011010111000000101110101100011111100100001000101010011111011000010001101101101100010000110000111101010000001011000010011101110110010110111111011000100011011101000001111001111001111000001101101011100111010110001010110101111011001010100001001111010111000111110110110011110001110010111100110100001111011111000101101110001101001011001001111101111110010011010011111110101011010100101101111011001000011001100000000110101110110110001111110101011101001000011111111101110011001000111110000101111111110010001110010101'
>>> len(binary[2:])  # remove the '0b' prefix
514
The binary representation of a number doesn't include leading zeros. Just as str(1) and str(11) are different lengths because they don't pad with leading zeros to a fixed length.
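If you need a fixed 512-bit string, format the integer with zero padding instead of using bin(); for example:

```python
import hashlib

num = int(hashlib.sha512(b'mov').hexdigest(), 16)
# '0512b' left-pads with zeros to exactly 512 binary digits
bits = format(num, '0512b')
print(len(bits))  # 512
```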

Why does this code generate a random password?

Here is a snippet of password-generating code.
I have 2 questions about it. Could you please help me understand?
urandom(6): the help for urandom says it returns n random bytes suitable for cryptographic use. That is, it returns 6 bytes; are those 6 ASCII characters?
ord(c) gets the decimal value for each of those bytes; why convert to a decimal value here?
Help for urandom:
def urandom(n): # real signature unknown; restored from __doc__
    """
    urandom(n) -> str

    Return n random bytes suitable for cryptographic use.
    """
    return ""
Python script:
from os import urandom
letters = "ABCDEFGHJKLMNPRSTUVWXYZ"
password = "".join(letters[ord(c) % len(letters)] for c in urandom(6))
urandom returns random bytes, each a value between 0 and 255. The sample code takes each byte's value and uses the modulo operator (%) to convert it into a value between 0 and 22, so that it can select one of the 23 letters (I, O, and Q are excluded so they are not confused with numbers).
Note that it is not a perfectly balanced algorithm as it would favour the first 3 letters (A, B, and C) more, because 256 is not divisible by 23 and 256 % 23 is 3.
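On Python 3.6+, the standard-library secrets module sidesteps that bias entirely; a minimal sketch:

```python
import secrets

letters = "ABCDEFGHJKLMNPRSTUVWXYZ"
# secrets.choice picks uniformly from the sequence, so no letter is favoured
password = "".join(secrets.choice(letters) for _ in range(6))
print(password)  # e.g. 'WXKRAM' (random each run)
```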
ord() function takes in a string containing a single character, and returns its Unicode index.
ex.
ord("A") => 65
ord("£") => 163
It is not used to get the decimal base of a byte as you mentioned, but rather the character's Unicode code point (its place in the Unicode table).
P.S.: even though ord() can return any Unicode code point, the values here stay in the range 0-255, because each character c comes from a byte produced by urandom().

Python - obtain least 4 significant bytes from SHA1 hash

I am trying to derive a function that will return the 4 least significant bytes of a SHA1 hash.
For example, given the following SHA1 hash:
f552bfdca00b431eb0362ee5f638f0572c97475f (hex digest)
\xf5R\xbf\xdc\xa0\x0bC\x1e\xb06.\xe5\xf68\xf0W,\x97G_ (digest)
I need to derive a function that extracts the 4 least significant bytes from the hash and produces:
E456406 (hex)
239428614 (dec)
So far I have tried solutions as described in Get the x Least Significant Bits from a String in Python and Reading least significant bits in Python.
However, the resulting value is not the same.
If you want the least significant 4 bytes (8 hex characters) from the sha1 value f552bfdca00b431eb0362ee5f638f0572c97475f, use a binary mask:
sha1_value = 0xf552bfdca00b431eb0362ee5f638f0572c97475f
mask = 0xffffffff
least_four = sha1_value & mask # 0x2c97475f
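Equivalently, you can slice the last 4 bytes of the raw digest and convert them with int.from_bytes (shown here on a made-up input, b'example'):

```python
import hashlib

digest = hashlib.sha1(b'example').digest()  # 20 raw bytes
# the last 4 bytes are the least significant when the digest
# is read as one big-endian number
least_four = int.from_bytes(digest[-4:], 'big')
print(hex(least_four))
```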
