Converting from hex to binary without losing leading 0's in Python

I have a hex value in a string like
h = '00112233aabbccddee'
I know I can convert this to binary with:
h = bin(int(h, 16))[2:]
However, this loses the leading 0's. Is there any way to do this conversion without losing the 0's? Or is the best approach just to count the number of leading 0's before the conversion and add them back in afterwards?

I don't think there is a way to keep those leading zeros by default.
Each hex digit translates to 4 binary digits, so the length of the new string should be exactly 4 times the size of the original.
h_size = len(h) * 4
Then, you can use .zfill to fill in zeros to the size you want:
h = ( bin(int(h, 16))[2:] ).zfill(h_size)
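For the question's string this gives, for example:
>>> h = '00112233aabbccddee'
>>> bin(int(h, 16))[2:].zfill(len(h) * 4)
'000000000001000100100010001100111010101010111011110011001101110111101110'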

This is actually quite easy in Python, since it doesn't have any limit on the size of integers. Simply prepend a '1' to the hex string, and strip the corresponding '1' from the output.
>>> h = '00112233aabbccddee'
>>> bin(int(h, 16))[2:] # old way
'1000100100010001100111010101010111011110011001101110111101110'
>>> bin(int('1'+h, 16))[3:] # new way
'000000000001000100100010001100111010101010111011110011001101110111101110'

Basically the same, but padding each hex digit to 4 binary digits:
''.join(bin(int(c, 16))[2:].zfill(4) for c in h)
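Each hex digit becomes exactly four bits, for example:
>>> bin(int('a', 16))[2:].zfill(4)
'1010'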

A newbie to Python such as I would proceed like so:
datastring = 'HexInFormOfString'
Pad the value so that any leading zeros survive when Python converts the string to an integer:
datastrPadded = 'ffff' + datastring
Convert the padded value to binary:
databin = bin(int(datastrPadded,16))
Remove the two characters ('0b') that Python adds to denote binary, plus the 16 padded bits:
databinCrop = databin[18:]
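A quick check that this produces the same result as the zfill approach above:
>>> bin(int('ffff' + '00112233aabbccddee', 16))[18:] == bin(int('00112233aabbccddee', 16))[2:].zfill(72)
True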

This converts the hex string into a string of raw bytes (a Python 2 byte string). Since you want the length to depend on the original, this may be what you want.
data = ""
while len(h) > 0:
    data = data + chr(int(h[0:2], 16))
    h = h[2:]
print data
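In Python 3, where chr builds text characters rather than raw bytes, a roughly equivalent one-liner uses bytes.fromhex:
>>> bytes.fromhex('00112233aabbccddee')
b'\x00\x11"3\xaa\xbb\xcc\xdd\xee'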

I needed an integer as input and pure hex/bin strings out, without the '0b' and '0x' prefixes, so my general solution is this:
def pure_bin(data, no_of_bits=NO_OF_BITS):
    data = data + 2**(no_of_bits)
    return bin(data)[3:]

def pure_hex(data, no_of_bits=NO_OF_BITS):
    if (no_of_bits % 4) != 0:
        no_of_bits = 4*int(no_of_bits / 4) + 4
    data = data + 2**(no_of_bits)
    return hex(data)[3:]
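For example, assuming NO_OF_BITS is defined somewhere as the default width (the answer does not show where):
>>> NO_OF_BITS = 16   # assumed default width, not shown in the original answer
>>> pure_bin(5, 8)
'00000101'
>>> pure_hex(255, 8)
'ff'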

hexa = '91278c4bfb3cbb95ffddc668d995bfe0'
binary = bin(int(hexa, 16))[2:]
print binary
hexa_dec = hex(int(binary, 2))[2:]
print hexa_dec

Related

Loss of number width when processing in hex format

I've run into a problem while processing hex numbers.
When the values read from a file are converted with str(hex(...)), the leading zeros after 0x disappear.
Input:
0x0000000000000000000000000000000000000000000000000000000000000001f01f80f12f7cf16638f7a8074d46fe2f421a73432b1441a01ed3dd883c68acad
0x00000000000000000000000000000000000000000000000000000000000000029a799033fc54073346f870c15c9836f6b2e9eccdb85f09d29a8ddc90dc3a8ef1
0x00000000000000000000000000000000000000000000000000000000000000033e561483073e429ec25c09c99de2a81d5a34a539ad2dbb688af6b6f5f24936a4
Output:
0x1f01f80f12f7cf16638f7a8074d46fe2f421a73432b1441a01ed3dd883c68acad
0x29a799033fc54073346f870c15c9836f6b2e9eccdb85f09d29a8ddc90dc3a8ef1
0x33e561483073e429ec25c09c99de2a81d5a34a539ad2dbb688af6b6f5f24936a4
Code:
with open("data.txt", "r") as file:
    for line in file:
        L = int(line, 0)
        R = str(hex(L))
        print(R)
What needs to be fixed in the code? I need all the numbers to come out the same width, with no zeros lost.
Use string formatting:
The # means put 0x on the front for hex numbers.
0130 means the field is 130 characters wide, and the leading zero means pad with zeros.
x means hexadecimal (lowercase a-f).
line = '0x0000000000000000000000000000000000000000000000000000000000000001f01f80f12f7cf16638f7a8074d46fe2f421a73432b1441a01ed3dd883c68acad'
print(line) # as read from file
integer = int(line, 0)
formatted = f'{integer:#0130x}'
print(formatted)
print(formatted == line) # check that original and re-formatted are the same
Output:
0x0000000000000000000000000000000000000000000000000000000000000001f01f80f12f7cf16638f7a8074d46fe2f421a73432b1441a01ed3dd883c68acad
0x0000000000000000000000000000000000000000000000000000000000000001f01f80f12f7cf16638f7a8074d46fe2f421a73432b1441a01ed3dd883c68acad
True
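If the width shouldn't be hard-coded to 130, it can be derived from each line read from the file (a sketch assuming every line keeps the 0x prefix):
with open("data.txt", "r") as file:
    for line in file:
        width = len(line.strip())              # total width, including the '0x' prefix
        print(f'{int(line, 0):#0{width}x}')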

Python - write a long string of bits to a binary file

I have a string composed of ~75,000 bits (a very long string).
I would like to create a binary file representing this sequence of bits.
I wrote the following code:
byte_array = bytearray(global_bits_str.encode())
with open('file1.bin', 'wb') as f:
    f.write(byte_array)
But when I check file1.bin I can see that it's composed of 75,000 bytes instead of 75,000 bits. I guess each bit has been encoded as an ASCII character (1 byte per bit) in the file.
Any suggestions?
You can use the int builtin to convert your binary string into a sequence of integers, then pass that to bytearray.
Example for one byte:
>>> int('10101010', 2)
170
>>> bytearray([170])
bytearray(b'\xaa')
Splitting the string:
chunks = [bit_string[n:n+8] for n in range(0, len(bit_string), 8)]
You'll have to special-case the last chunk, since it may not be a full byte; that can be done with ljust, which pads it on the right with zeros.
Putting it together:
def to_bytes(bits, size=8, pad='0'):
    chunks = [bits[n:n+size] for n in range(0, len(bits), size)]
    if pad:
        chunks[-1] = chunks[-1].ljust(size, pad)
    return bytearray([int(c, 2) for c in chunks])
# Usage:
byte_array = to_bytes(global_bits_str)
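An alternative that avoids chunking is int.to_bytes. Note it pads on the left (most significant side) rather than padding the final byte on the right, so it only matches the function above when the bit string's length is a multiple of 8:
n = int(global_bits_str, 2)
byte_array = n.to_bytes((len(global_bits_str) + 7) // 8, 'big')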

Comparing slices of binary data in python

Say I have a file in hexadecimal and I need to search for repeating sets of bytes in it. What would be the best way to do this in python?
Right now what I'm doing is treating everything as a string with the re module, which is extremely slow and not the right way to do it. I just can't figure out how to slice up and compare binary data.
for i in range(int(len(data))):
    string = data[i:i+16]
    pattern = re.compile(string)
    m = pattern.findall(data)
    count += 1
    if len(m) > 1:
        k = [str(i), str(len(m))]
        t = ":".join(k)
        output_file.write(' {}'.format(t))
    else:
        continue
Just to make sure there's no confusion, data here is just a big string of hex data from open('pathtofile/file', 'r')
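One regex-free sketch (not from the original thread): hash each 16-character window once with collections.Counter, then look the counts up instead of re-scanning the whole string for every position:
from collections import Counter

# data: the big string of hex characters from the question; output_file: an open file object
window = 16
counts = Counter(data[i:i + window] for i in range(len(data) - window + 1))
for i in range(len(data) - window + 1):
    chunk = data[i:i + window]
    if counts[chunk] > 1:
        output_file.write(' {}:{}'.format(i, counts[chunk]))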

Want to convert text file of complex numbers to a list of numbers in Python

I have a text file of complex numbers called output.txt in the form:
[-3.74483279909056 + 2.54872970226369*I]
[-3.64042002652517 + 0.733996349939531*I]
[-3.50037473491252 + 2.83784532111642*I]
[-3.80592861109028 + 3.50296053533826*I]
[-4.90750592116062 + 1.24920836601026*I]
[-3.82560512449716 + 1.34414866823615*I]
etc...
I want to create a list from these (read in as a string in Python) of complex numbers.
Here is my code:
data = [line.strip() for line in open("output.txt", 'r')]
for i in data:
    m = map(complex,i)
However, I'm getting the error:
ValueError: complex() arg is a malformed string
Any help is appreciated.
From the help information, for the complex builtin function:
>>> help(complex)
class complex(object)
| complex(real[, imag]) -> complex number
|
| Create a complex number from a real part and an optional imaginary part.
| This is equivalent to (real + imag*1j) where imag defaults to 0.
So you need to format the string properly, and pass the real and imaginary parts as separate arguments.
Example:
num = "[-3.74483279909056 + 2.54872970226369*I]".translate(None, '[]*I').split(None, 1)
real, im = num
print real, im
-3.74483279909056 + 2.54872970226369
im = im.replace(" ", "") # remove whitespace
c = complex(float(real), float(im))
print c
(-3.74483279909+2.54872970226j)
Try this:
numbers = []
with open("output.txt", 'r') as data:
    for line in data.read().splitlines():
        parts = line.split('+')
        real, imag = parts[0].strip(' ['), parts[1].strip(' *I]')
        numbers.append(complex(float(real), float(imag)))
The problem with your original approach is that your input file contains lines of text that complex() does not know how to process. We first need to break each line down to a pair of numbers - real and imag. To do that, we need to do a little string manipulation (split and strip). Finally, we convert the real and imag strings to floats as we pass them into the complex() function.
Here is a concise way to create the list of complex values (based on dal102's answer):
data = [complex(*map(float,line.translate(None, ' []*I').split('+'))) for line in open("output.txt")]
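str.translate(None, ...) only works on Python 2 byte strings; on Python 3 the same one-liner might look like this, using str.maketrans to delete the unwanted characters:
data = [complex(*map(float, line.translate(str.maketrans('', '', ' []*I')).split('+')))
        for line in open("output.txt")]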

python: find and replace numbers < 1 in text file

I'm pretty new to Python programming and would appreciate some help to a problem I have...
Basically I have multiple text files which contain velocity values as such:
0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00
etc for many lines...
What I need to do is convert all the values in the text file that are less than 1 (e.g. 0.137865E+00 above) to an arbitrary value of 0.100000E+01. While it seems pretty simple to replace specific values with the 'replace()' method and a while loop, how do you do this if you want to replace a range?
thanks
I think when you are beginning programming, it's useful to see some examples; and I assume you've tried this problem on your own first!
Here is a break-down of how you could approach this:
contents='0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00'
The split method works on strings. It returns a list of strings. By default, it splits on whitespace:
string_numbers=contents.split()
print(string_numbers)
# ['0.259515E+03', '0.235095E+03', '0.208262E+03', '0.230223E+03', '0.267333E+03', '0.217889E+03', '0.156233E+03', '0.144876E+03', '0.136187E+03', '0.137865E+00']
The map command applies its first argument (the function float) to each of the elements of its second argument (the list string_numbers). The float function converts each string into a floating-point object.
float_numbers=map(float,string_numbers)
print(float_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 0.13786499999999999]
You can use a list comprehension to process the list, converting numbers less than 1 into the number 1. The conditional expression (1 if num<1 else num) equals 1 when num is less than 1, otherwise, it equals num.
processed_numbers=[(1 if num<1 else num) for num in float_numbers]
print(processed_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 1]
This is the same thing, all in one line:
processed_numbers=[(1 if num<1 else num) for num in map(float,contents.split())]
To generate a string out of the elements of processed_numbers, you could use the str.join method:
comma_separated_string=', '.join(map(str,processed_numbers))
# '259.515, 235.095, 208.262, 230.223, 267.333, 217.889, 156.233, 144.876, 136.187, 1'
A typical technique would be:
read file line by line
split each line into a list of strings
convert each string to the float
compare converted value with 1
replace when needed
write back to the new file
As I don't see you having any code yet, I hope that this would be a good start
def float_filter(input):
    for number in input.split():
        if float(number) < 1.0:
            yield "0.100000E+01"
        else:
            yield number

input = "0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00"
print " ".join(float_filter(input))
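To cover the remaining steps from the list above (reading one file and writing another), a rough sketch built around the same generator, with hypothetical file names:
# 'velocities.txt' and 'velocities_fixed.txt' are placeholder names
with open('velocities.txt') as infile, open('velocities_fixed.txt', 'w') as outfile:
    for line in infile:
        outfile.write(" ".join(float_filter(line)) + "\n")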
import numpy as np
a = np.genfromtxt('file.txt')   # read file
a[a < 1] = 1.0                  # replace values less than 1 with 1.0 (i.e. 0.100000E+01)
np.savetxt('converted.txt', a)  # save to file
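If the output file should keep the original scientific notation, savetxt accepts a format string:
np.savetxt('converted.txt', a, fmt='%.6E')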
You could use regular expressions for parsing the string. I'm assuming here that the mantissa is never larger than 1 (ie, begins with 0). This means that for the number to be less than 1, the exponent must be either 0 or negative. The following regular expression matches '0', '.', unlimited number of decimal digits (at least 1), 'E' and either '+00' or '-' and two decimal digits.
0\.\d+E(-\d\d|\+00)
Assuming that you have the file read into variable 'text', you can use the regexp with the following python code:
result = re.sub(r"0\.\d+E(-\d\d|\+00)", "0.100000E+01", text)
Edit: Just realized that the description doesn't limit the valid range of input numbers to positive numbers. Negative numbers can be matched with the following regexp:
-0\.\d+E[-+]\d\d
This can be alternated with the first one using the (pattern1|pattern2) syntax which results in the following Python code:
result = re.sub(r"(0\.\d+E(-\d\d|\+00)|-0\.\d+E[-+]\d\d)", "0.100000E+01", text)
Also, if there's a chance that the exponent goes past 99, the regexp can be further modified by adding a '+' after the '\d\d' patterns; this allows exponents with two or more digits.
I've got the script working as I want now...thanks people.
When writing the list to a new file I used the replace method to get rid of the brackets and commas - is there a simpler way?
ftext = open("C:\\Users\\hhp06\\Desktop\\out.grd", "r")
otext = open("C:\\Users\\hhp06\\Desktop\\out2.grd", "w+")
for line in ftext:
    stringnum = line.split()
    floatnum = map(float, stringnum)
    procnum = [(1.0 if num<1 else num) for num in floatnum]
    stringproc = str(procnum)
    s = (stringproc).replace(",", " ").replace("[", " ").replace("]", "")
    otext.writelines(s + "\n")
otext.close()
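In answer to the follow-up question about the replace calls: instead of turning the whole list into a string and stripping brackets and commas, the numbers can be joined directly (a sketch using the same file paths as above):
with open("C:\\Users\\hhp06\\Desktop\\out.grd") as ftext, open("C:\\Users\\hhp06\\Desktop\\out2.grd", "w") as otext:
    for line in ftext:
        procnum = [(1.0 if num < 1 else num) for num in map(float, line.split())]
        otext.write(" ".join(map(str, procnum)) + "\n")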
