How to print binary file output as Base 2 (in bits)? - python

I have a bin file which contains binary data stored in bytes.
When trying to read them in python, the output is something like this \xb5D\xbe"jSUk\xe75\x18}#\'%\x89oRqR\xfb\xe9\xe9\
How can I print the file contents in base 2 (binary)?
For example: 10000000 01000000 11000000, etc.

Here is an example reading 8 bytes at a time and formatting them in the way that you describe.
Note that you probably already have system utilities that will do a similar task, for example the od program on Unix-like systems.
with open("your_binary_file", "rb") as f:
    while True:
        data = f.read(8)
        if not data:
            break
        print(" ".join(f"{byte:08b}" for byte in data))
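To check the formatting without a file on disk, the same f-string can be applied to an in-memory bytes object. This is only a small sketch; `to_bit_string` is a hypothetical helper name, not part of the answer above, and the three sample bytes are chosen to reproduce the example from the question.

```python
def to_bit_string(data):
    """Format each byte of data as eight binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


# 128, 64 and 192 are the byte values behind the question's example output.
print(to_bit_string(bytes([128, 64, 192])))  # 10000000 01000000 11000000
```

The `08b` format spec pads each byte to eight digits, so leading zeros are kept.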

Related

Reading a python binary file with a C# BinaryReader

I need to export some data (integers, floats, etc.) to a binary file with Python. Afterwards, I have to read the file with C#, but that doesn't work for me.
I tried several ways of writing a binary file with Python, and they work as long as I read the file with Python as well:
a = 3
b = 5
with open('test.tcd', 'wb') as file:
    file.write(bytes(a))
    file.write(bytes(b))
or writing it like this:
import pickle as p
with open('test.tcd', 'wb') as file:
    p.dump([a, b], file)
Currently I am reading the file in C# like this:
static void LoadFile(String path)
{
    BinaryReader br = new BinaryReader(new FileStream(path, FileMode.Open));
    int a = br.ReadInt32();
    int b = br.ReadInt32();
    System.Diagnostics.Debug.WriteLine(a);
    System.Diagnostics.Debug.WriteLine(b);
    br.Close();
}
Unfortunately the output isn't 3 and 5; instead, my output is just zero. How do I read or write the binary file properly?
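In Python, you have to write each integer as 4 bytes, because ReadInt32 reads exactly 4 bytes per value. See the documentation for struct.pack.
import struct
a = 3
b = 5
with open('test.tcd', 'wb') as file:
    # "<i" packs a 4-byte little-endian signed integer
    file.write(struct.pack("<i", a))
    file.write(struct.pack("<i", b))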
Your C# code should work now.
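To see why the first attempt in the question reads back as zero, it helps to print the bytes involved. The sketch below only uses the values from the question (3 and 5); it shows that `bytes(n)` creates `n` zero bytes rather than an encoding of the number `n`, and what the 4-byte little-endian layout looks like instead.

```python
import struct

# bytes(3) is three zero bytes, not the number 3, so bytes(3) + bytes(5)
# is eight zero bytes: two ReadInt32 calls would both return 0.
print(bytes(3))                        # b'\x00\x00\x00'
print(bytes(3) + bytes(5) == bytes(8)) # True

# Two 4-byte little-endian signed integers.
payload = struct.pack("<ii", 3, 5)
print(payload)                         # b'\x03\x00\x00\x00\x05\x00\x00\x00'

# Round trip in Python as a sanity check.
print(struct.unpack("<ii", payload))   # (3, 5)
```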
It's possible that Python is not writing the data in the format that C# expects. You could read the raw bytes in C# and use BitConverter to inspect them.
Another option is to specify the byte order and width explicitly in Python. BinaryReader.ReadInt32 reads 4 bytes in little-endian order, so each integer should be written as 4 little-endian bytes.
an_int = 5
a_bytes_big = an_int.to_bytes(4, 'big')
print(a_bytes_big)
Output
b'\x00\x00\x00\x05'
a_bytes_little = an_int.to_bytes(4, 'little')
print(a_bytes_little)
Output
b'\x05\x00\x00\x00'

Why am I only writing 28,672 bytes to this file?

I have been working on a project where I need to program a binary file, of a certain kind, onto an AT28C256 chip. The specifics are not important beyond the fact that the file needs to be exactly 32,768 bytes in size.
I have some "minimal problem" code here:
o = open("images.bin", "wb")
c = 0
for i in range(256):
    for j in range(128):
        c += 1
        o.write(chr(0).encode('utf-8'))
print(c)
This, to me, would appear to write 32,768 bytes to a file (the split into i,j is necessary because I need to write an image to the device) as 128*256 = 32768. And the output of c is 32768!
But the file it creates is 28,672 bytes long! The fact that this is 0x7000 in hex has not escaped me, but I'm not sure why this is happening. Any ideas?
You should call o.close() to flush the write buffer and close the file properly.
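An alternative to calling close() by hand is a with block, which closes (and therefore flushes) the file when the block ends, even if an exception is raised. A minimal sketch, assuming the same 256 x 128 layout as the question; `write_blank_image` is a hypothetical helper name and a temporary directory is used so the example does not touch a real images.bin.

```python
import os
import tempfile


def write_blank_image(path, rows=256, cols=128):
    """Write rows * cols zero bytes and return the resulting file size."""
    with open(path, "wb") as out:
        for _row in range(rows):
            for _col in range(cols):
                out.write(b"\x00")
    # The file is closed here, so everything buffered has reached the file.
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    size = write_blank_image(os.path.join(tmp, "images.bin"))
    print(size)  # 32768
```

Checking the size only after the file has been closed matters: while the file is still open, the last part of the data may still be sitting in the write buffer.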

Find a string in a binary file

I am trying to extract data from a binary file where the data chunks are "tagged" with ASCII text. I need to find the word "tracers" in the binary file so I can read the next 4 bytes (int).
I am trying to simply loop over the lines, decoding them and checking for the text, which works. But I am having trouble seeking to the correct place in the file directly after the text (the seek_to_key function):
from io import BytesIO
import struct
binary = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\xd6\x00\x8c<TE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00tracers\x00\xf2N\x03\x00P\xd9U=6\x1e\x92=\xbe\xa8\x0b<\xb1\x9f\x9f=\xaf%\x82=3\x81|=\xbeM\xb4=\x94\xa7\xa6<\xb9\xbd\xcb=\xba\x18\xc7=\x18?\xca<j\xe37=\xbc\x1cm=\x8a\xa6\xb5=q\xc1\x8f;\xe7\xee\xa0=\xe7\xec\xf7<\xc3\xb8\x8c=\xedw\xae=C$\x84<\x94\x18\x9c=&Tj=\xb3#\xb3=\r\xdd3=\x0eL==4\x00~<\xc6q\x1e=pHw=\xc1\x9a\x92="\x08\x9a=\xe6a\xeb<\xa4#.=\xc4\x0f-=\xa9O\xcb=i\'\x15=\x94\x03\x80=\x8f\xcd\xaf=\xd6\x00\x8c<TE\x9f<m\x9ad<[;Q=\x157X=\x17\xf1u=\xb8(\xa4=\x13\xd3\xfa<\x811_=\xd1iX=Q\x17^;\xd1n\xbe=\xfcb\xcc=\xe8\x9b\x99=W\xa9\x16=\xc5\x83\xa4=\xc0%\x98<\xbb|\x99<>#\x8b:\x1cY\x82;\xb8T\xa4<Cv\x87="n\x1c<J\x152=\x1f\xb2\x9d=&\x18\xb6=\x8a\xf9{=\x0fT\xba=HrX=\xa0\\S=#\xee\xbd=\x1e,\xc5=y\rU<gK\x84=\xe3*\r=\x04\xc4M=\x98a\xb3<\x95 T=\xf2Z\x94=lL\x15=\x07\x1b^=\xf3W\x83<\xf6\xff\xa1<\xb8\xfb\xcb<p\xb4\xd8<\xc9#\xfd<s\xa6\x1f;\xbf7W<\x8a\x9c\x82<\x1c\xb7l=\xa7\xd0\xb7=\xe4\x8d\x97=\xe2\x7f\x82=\x82\xa1\xcc<\xdfs\xca=C\x10p=\xb4\xfa\xb0=\xf35\x87=\x9d\x8bR<d\xb9\x0c<\xb26\xcd=\r\xd5\x1d<\xf4p\xb1=f)\xaf=\xe2M\\=F|\xf9<\x9baW=\x85|\xa3=\x0f\xdd\xa1=\xb6f\xa9=\xcbW\xcf<\xfa\x1a\xbe=\xeb\xda\xb2=\x88\xfb\x8e=\x9f+$=\xbbS\xac;\xa2o\xb5=\x08\xca\xe5<\xc9IC=\xa8\x05\xa6=\xbc 
\xbd=\x8e\x8d}=U\xcd\xba=\xcbG\x89=}\xadg=Z\xad\x9f=_=\xb6:y\x1c==\xa5\x0b3<<\xe5\x1e=*\xa0\xb6=\n\xcd\xb8\xd9<u\xb5W=rZ\x88=\xe0w}=\xa5\xf0\xa0=\xf4\x91\x82=\xe4r\xc5<\x0e\x91A=Z\x9d-<[N:=\xf1\t\x1e=\xc5_\xc2=\xf8\xea\x98=t\xd7\xbf<~N\xce==#\x93=\x98A\xa7=c\x81x=\xe3\xc6\x94=\xe2&\xcc=\x05\xa9^=\xf7\x05\xa8=[m\x81=\x1b\x0b\x84=\xf5\x98\xb9=+\x90\xd8<\xa2\xcc\xa5=5^\x92=\x0e\x9d\x1d=\x96\xc7\x8b;\xc5E\x9e;r\x1e\xc7=\xea6\xbf=\x19mN;\xd9$D=\x85\xa9\x8b=!\xe9\x90=\xe4/~<\xc1\x9c\xaf=\xde\xe4\x18=e\xb0H=hLO;\x9f\xf8\x8b=p.\xcf=L\x1f\x01<\xea\x19\xaf=Z\xd5\xc2<\xb4\xd8\xcf=s\x84\x0c=\x987\xa5;\x19Z\x93=\x0c\x8fO=y/\x97=\xeaOG=\xb0Fl=\x03\x7f\xbe=\x96\n'
binary_data = BytesIO()
binary_data.write(binary)
binary_data.seek(0)
def seek_to_key(f, line_str, key):
    key_start = line_str.find(key)
    offset = len(line_str[key_start+len(key)].encode('utf-8'))
    f.seek(-offset, 1)
for line in binary_data:
    line_str = line.decode('utf-8', errors='replace')
    print(line_str)
    if 'tracers' in line_str:
        seek_to_key(binary_data, line_str, 'tracers')
        nfloats = struct.unpack('<i', binary_data.read(4))
        print(nfloats)
        break
Any recommendations on a better way to do this would be awesome!
It's not completely clear to me what you are trying to achieve. Please explain that in more detail if you want a better answer. What I understand from your current question and code is that you are trying to read the 32-bit number directly after the ASCII text 'tracers'. I'm guessing this is only the first step of your code, since the name 'nfloats' suggests that you will be reading a number of floats in the next step ;-) But I'll try to answer this question only.
There are a number of problems with your code:
First of all, a simple typo: instead of line_str[key_start+len(key)] you probably meant line_str[key_start+len(key):]. (You missed the colon.)
You are mixing binary and text data. Why do you decode the binary data as UTF-8? It clearly isn't. You can't just "decode" binary data as UTF-8, slice a piece of it, and then re-encode that piece as UTF-8. In this case, the part after your marker is 518 bytes, but when encoded as UTF-8 it becomes 920 bytes. This messes up your offset calculation. Tip: you can search for binary data in binary data in Python :-) For example: b'Hello, world!'.find(b'world') returns 7. So you don't have to encode/decode the data at all.
You are reading line by line. Why is that? Lines are a concept of text files and don't have a real meaning in binary files. It could work, but that depends on the file format (which I don't know). In any case, your current code can only find one marker per line. Is that intentional, or could there be more markers in each line? Anyway, if the file is small enough to fit in memory, it would be much easier to process the data in one chunk.
A minor note: you could write binary_data = BytesIO(binary) and avoid the additional write(). Also the seek(0) is not necessary.
Example code
I think the following code gives the correct result. I hope it will be a useful start to finish your application. Note that this code conforms to the Style Guide for Python Code and that all pylint issues were resolved (except for a too long line and missing docstrings).
import io
import struct
DATA = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\xd6\x00\x8c<TE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00tracers\x00\xf2N\x03\x00P\xd9U=6\x1e\x92=\xbe\xa8\x0b<\xb1\x9f\x9f=\xaf%\x82=3\x81|=\xbeM\xb4=\x94\xa7\xa6<\xb9\xbd\xcb=\xba\x18\xc7=\x18?\xca<j\xe37=\xbc\x1cm=\x8a\xa6\xb5=q\xc1\x8f;\xe7\xee\xa0=\xe7\xec\xf7<\xc3\xb8\x8c=\xedw\xae=C$\x84<\x94\x18\x9c=&Tj=\xb3#\xb3=\r\xdd3=\x0eL==4\x00~<\xc6q\x1e=pHw=\xc1\x9a\x92="\x08\x9a=\xe6a\xeb<\xa4#.=\xc4\x0f-=\xa9O\xcb=i\'\x15=\x94\x03\x80=\x8f\xcd\xaf=\xd6\x00\x8c<TE\x9f<m\x9ad<[;Q=\x157X=\x17\xf1u=\xb8(\xa4=\x13\xd3\xfa<\x811_=\xd1iX=Q\x17^;\xd1n\xbe=\xfcb\xcc=\xe8\x9b\x99=W\xa9\x16=\xc5\x83\xa4=\xc0%\x98<\xbb|\x99<>#\x8b:\x1cY\x82;\xb8T\xa4<Cv\x87="n\x1c<J\x152=\x1f\xb2\x9d=&\x18\xb6=\x8a\xf9{=\x0fT\xba=HrX=\xa0\\S=#\xee\xbd=\x1e,\xc5=y\rU<gK\x84=\xe3*\r=\x04\xc4M=\x98a\xb3<\x95 T=\xf2Z\x94=lL\x15=\x07\x1b^=\xf3W\x83<\xf6\xff\xa1<\xb8\xfb\xcb<p\xb4\xd8<\xc9#\xfd<s\xa6\x1f;\xbf7W<\x8a\x9c\x82<\x1c\xb7l=\xa7\xd0\xb7=\xe4\x8d\x97=\xe2\x7f\x82=\x82\xa1\xcc<\xdfs\xca=C\x10p=\xb4\xfa\xb0=\xf35\x87=\x9d\x8bR<d\xb9\x0c<\xb26\xcd=\r\xd5\x1d<\xf4p\xb1=f)\xaf=\xe2M\\=F|\xf9<\x9baW=\x85|\xa3=\x0f\xdd\xa1=\xb6f\xa9=\xcbW\xcf<\xfa\x1a\xbe=\xeb\xda\xb2=\x88\xfb\x8e=\x9f+$=\xbbS\xac;\xa2o\xb5=\x08\xca\xe5<\xc9IC=\xa8\x05\xa6=\xbc 
\xbd=\x8e\x8d}=U\xcd\xba=\xcbG\x89=}\xadg=Z\xad\x9f=_=\xb6:y\x1c==\xa5\x0b3<<\xe5\x1e=*\xa0\xb6=\n\xcd\xb8\xd9<u\xb5W=rZ\x88=\xe0w}=\xa5\xf0\xa0=\xf4\x91\x82=\xe4r\xc5<\x0e\x91A=Z\x9d-<[N:=\xf1\t\x1e=\xc5_\xc2=\xf8\xea\x98=t\xd7\xbf<~N\xce==#\x93=\x98A\xa7=c\x81x=\xe3\xc6\x94=\xe2&\xcc=\x05\xa9^=\xf7\x05\xa8=[m\x81=\x1b\x0b\x84=\xf5\x98\xb9=+\x90\xd8<\xa2\xcc\xa5=5^\x92=\x0e\x9d\x1d=\x96\xc7\x8b;\xc5E\x9e;r\x1e\xc7=\xea6\xbf=\x19mN;\xd9$D=\x85\xa9\x8b=!\xe9\x90=\xe4/~<\xc1\x9c\xaf=\xde\xe4\x18=e\xb0H=hLO;\x9f\xf8\x8b=p.\xcf=L\x1f\x01<\xea\x19\xaf=Z\xd5\xc2<\xb4\xd8\xcf=s\x84\x0c=\x987\xa5;\x19Z\x93=\x0c\x8fO=y/\x97=\xeaOG=\xb0Fl=\x03\x7f\xbe=\x96\n' # noqa
def find_tracers(data):
    start = 0
    while True:
        pos = data.find(b'tracers', start)
        if pos == -1:
            break
        num_floats = struct.unpack('<i', data[pos+7: pos+11])
        print(num_floats)
        start = pos + 11
def main():
    with io.BytesIO(DATA) as file:
        data = file.read()
    find_tracers(data)
if __name__ == '__main__':
    main()
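If the file is too large to read comfortably in one chunk, a memory-mapped search is one possible variation on the same idea. This is a minimal sketch, not the answer's method: `find_tagged_ints` and `demo.bin` are hypothetical names, and the sketch assumes, like the code above, that a 4-byte little-endian integer starts immediately after the tag (whether that holds depends on the file format). The demonstration uses a small synthetic file rather than the data from the question.

```python
import mmap
import os
import struct
import tempfile


def find_tagged_ints(path, tag=b'tracers'):
    """Return the little-endian int32 found directly after each tag."""
    values = []
    with open(path, 'rb') as file:
        # Map the file read-only; the OS pages data in on demand.
        with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as data:
            start = 0
            while True:
                pos = data.find(tag, start)
                if pos == -1:
                    break
                end = pos + len(tag) + 4
                chunk = data[pos + len(tag):end]
                if len(chunk) < 4:
                    break  # tag too close to the end of the file
                values.append(struct.unpack('<i', chunk)[0])
                start = end
    return values


# Synthetic demo file: two tags, each followed by a packed integer.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.bin')
    with open(demo_path, 'wb') as demo:
        demo.write(b'\x00' * 16 + b'tracers' + struct.pack('<i', 42))
        demo.write(b'\xff' * 8 + b'tracers' + struct.pack('<i', 7))
    found = find_tagged_ints(demo_path)
    print(found)  # [42, 7]
```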

How to store the data in little endian format in a binary file in Python

I have a binary file. I want to read hexadecimal data from the terminal in my python code. I am executing the program as follows:
python hello.py "2DB6C" "CDEF"
"2DB6C" :- (Address in hex) Indicates GoTo address <2DB6C> in the temp.bin file, where I want to start writing the data.
"CDEF" :- Data to be written in binary file. Remember the data is given in the hex format.
I want to write the data in little-endian format, but it is not working for me.
import binascii
file = open("temp.bin", "r+b")
file.seek(4)
datatomodify = "CDEF"
data = binascii.unhexlify(datatomodify)
print("data :", data, "offset addr :", hex(file.tell()))
file.write(data)
print("after writing addr :", hex(file.tell()))
file.close()
It is writing in the file as "CDEF". But I want to write the data in the little endian format.
Kindly help me in fixing it.
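One possible approach, sketched under the assumption that "little endian" here means the bytes of the whole hex value should be stored in reverse order (so "CDEF" is stored as EF CD): convert the hex text to bytes, reverse them, and write them at the parsed offset. `hex_to_little_endian` and `write_at` are hypothetical helper names, and an in-memory buffer stands in for temp.bin.

```python
import io


def hex_to_little_endian(hex_text):
    """Convert hex text such as "CDEF" to bytes in reversed byte order."""
    # bytes.fromhex needs an even number of hex digits.
    return bytes.fromhex(hex_text)[::-1]


def write_at(stream, offset, payload):
    """Seek to offset in a binary stream and write payload there."""
    stream.seek(offset)
    stream.write(payload)


# The address argument is hex text too, so parse it with base 16.
address = int("2DB6C", 16)
print(hex(address))                        # 0x2db6c

# Small in-memory stand-in for the file, using offset 4 as in the question.
buf = io.BytesIO(bytes(8))
write_at(buf, 4, hex_to_little_endian("CDEF"))
print(buf.getvalue())                      # b'\x00\x00\x00\x00\xef\xcd\x00\x00'
```

With a real file, the same `write_at` call would work on a file object opened with mode "r+b".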

python error on struct.unpack

I am new to python and I am trying to use unpack like this:
data = f.read(4)
AAA=len(data)
BBB=struct.calcsize(cformat)
print AAA
print BBB
value = struct.unpack(cformat, data)
return value[0]
This runs fine as long as AAA == BBB, but sometimes f.read only reads 3 bytes and then I get an error. The actual value in the file that I am trying to read is 26. It reads all of the values from 1 to 221 except for 26, where it fails because f.read(size) only reads three bytes.
Assuming the question is "How should I read a 26 without an error?"
First check the arguments to the open() that produces f. Under Windows, unless you open a file in binary mode (f = open(filename, "rb")), Python assumes that the file is a text file. Windows treats byte value 26 (Ctrl+Z) in a text file as an end-of-file marker, a quirk that it inherited from CP/M.
You have opened a binary file in text mode, and you are using an operating system where the distinction matters. Try adding b to the mode parameter when you open the file:
f = open("my_input_file.bin", "rb")
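Independently of the mode fix, it can help to check the length of what read() returned before unpacking, so that a short read produces a clear error instead of a struct exception. A minimal sketch; `read_value` is a hypothetical helper name, and an in-memory stream containing the value 26 stands in for the file.

```python
import io
import struct


def read_value(f, cformat):
    """Read one value of the given struct format from a binary stream."""
    size = struct.calcsize(cformat)
    data = f.read(size)
    if len(data) != size:
        raise EOFError(f"expected {size} bytes, got {len(data)}")
    return struct.unpack(cformat, data)[0]


# The packed value 26 contains the byte 0x1a (Ctrl+Z); in binary mode
# it is just another byte.
stream = io.BytesIO(struct.pack("<i", 26))
print(read_value(stream, "<i"))  # 26
```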
