I have a long string of two-character hexadecimal values separated by spaces, read in from a file, that I would like to store in a two-dimensional list for later processing. The string is in the form:
file_content = "00 18 00 19 F0 0F 1A 80 FF C7 E8 11 7F 52 7D 00 F0 0D F0 0C 0B FF"
Each substring that needs to be indexed begins with "00" and ends with "FF". "FF" never occurs in the middle of a substring, but "00" can, which makes this tricky. I would like to store each of these events at its own index in the list. For example:
event_list = [[00 18 00 19 F0 0F 1A 80 FF], [C7 E8 11 7F 52 7D 00 F0 0D F0 0C 0B FF], .....]
If I understand this correctly, you're splitting it up based on the 'FF's present in the string, so you could probably get away with something like:
event_list = [('%sFF' % x).strip().split(' ') for x in file_content.split('FF')[:-1]]
This splits your original string on each 'FF', then loops through the resulting parts and appends 'FF' back onto the end of each one (the final, empty part after the last 'FF' is dropped by the [:-1]). Each part is then stripped and split on the space character, producing an inner list that becomes one element of the outer list, which gives you the 2D list you need in one line :)
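If the one-liner is hard to extend, a token-based variant of the same idea may help. This is only a sketch: the name split_events is made up for illustration, and it assumes that any trailing tokens without a closing "FF" should be discarded, which is what the one-liner does as well.

```python
def split_events(text, terminator="FF"):
    """Group space-separated hex tokens into events ending with `terminator`."""
    events = []
    current = []
    for token in text.split():
        current.append(token)
        if token == terminator:
            # The terminator closes the current event.
            events.append(current)
            current = []
    # Tokens left in `current` have no terminator and are discarded here.
    return events


file_content = "00 18 00 19 F0 0F 1A 80 FF C7 E8 11 7F 52 7D 00 F0 0D F0 0C 0B FF"
event_list = split_events(file_content)
```

Because it works on whole tokens rather than on the raw string, it is easy to change the rule later, for example to also require that an event starts with "00".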
So I'm trying to create a python script to generate a level for a game made in MMF2+Lua, and I've run into something I can't figure out how to fix.
Generating a 16x16 empty level with borders with the game gives this (deflated):
78 5E 63 20 0A FC 27 00 40 86 8C AA C1 1D 02 23 3D 7C 08 27 32 00 9F 62 FE 10
which should be a flattened 18x18 array with the edge having 0x00, and the rest having 0xFF.
My python script generates this with the exact same input to zlib.deflate:
78 9C 63 60 20 06 FC 27 00 46 D5 8C AA C1 A7 86 30 00 00 9F 62 FE 10
They're different, but inflating them gives the same exact data. However, when I put the data into the game, it crashes when trying to load the level.
What's really different between the two values, and am I able to fix it?
Those are two different encodings of the same data, both valid. They differ in the sequence of copies. Here are readable forms of both, first from the game:
! infgen 2.6 output
!
zlib
!
last
fixed
literal 0
match 37 1
literal 255
match 31 1
match 4 69
match 258 36
match 26 258
match 256 288
match 34 613
end
!
adler
then from zlib:
! infgen 2.6 output
!
zlib
!
last
fixed
literal 0 0
match 36 1
literal 255
match 31 1
match 258 36
match 258 36
match 28 36
match 34 1
end
!
adler
literal gives a byte or bytes inserted in the stream. match is a copy of previous bytes in the stream (possibly overlapped with bytes being copied), where the first parameter is the number of bytes to copy, and the second parameter is the distance back in bytes to copy from.
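To see that both listings really describe the same data, the literal/match operations can be replayed by hand. The sketch below is not zlib's decoder, just a minimal interpreter for the two operation types as described above; the function name lz77_decode and the tuple layout are my own choices, and the operation lists are transcribed from the two infgen listings.

```python
def lz77_decode(ops):
    """Replay ("literal", b1, b2, ...) and ("match", length, distance) ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "literal":
            out.extend(op[1:])
        else:
            _, length, distance = op
            start = len(out) - distance
            # Copy byte by byte so a match may overlap the bytes it is
            # producing (for example distance 1 repeats the last byte).
            for i in range(length):
                out.append(out[start + i])
    return bytes(out)


# Transcribed from the game's stream.
game_ops = [
    ("literal", 0), ("match", 37, 1), ("literal", 255), ("match", 31, 1),
    ("match", 4, 69), ("match", 258, 36), ("match", 26, 258),
    ("match", 256, 288), ("match", 34, 613),
]

# Transcribed from the zlib-generated stream.
zlib_ops = [
    ("literal", 0, 0), ("match", 36, 1), ("literal", 255), ("match", 31, 1),
    ("match", 258, 36), ("match", 258, 36), ("match", 28, 36), ("match", 34, 1),
]

game_data = lz77_decode(game_ops)
zlib_data = lz77_decode(zlib_ops)
```

Both lists expand to the same 648 bytes, so the difference is only in how the compressor chose its copies, not in the data.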
I have a binary, for example https://github.com/andrew-d/static-binaries/blob/master/binaries/linux/x86_64/nmap
1) How do I find the address of this series of bytes: 48 8B 45 A8 48 8D 1C 02 48 8B 45 C8? The result should be 0x6B0C67.
2) How do I read the 12 bytes at address 0x6B0C67? The result should be 48 8B 45 A8 48 8D 1C 02 48 8B 45 C8.
3) How do I find which address references a specific string? For example, the string i + 1 == features[i].index, which is located at 0x6FC272? The result should be 0x4022F6.
How can I find all of this without opening IDA, using only Python/C code?
Thanks
For 1) Is your file small enough to be loaded into memory? Then it's as simple as
offset = open(file, 'rb').read().find(
    bytes.fromhex("48 8B 45 A8 48 8D 1C 02 48 8B 45 C8")
)
# offset will be -1 if not found
If not, you will need to read it in chunks.
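A rough sketch of the chunked version follows. The function name find_bytes and the default chunk size are arbitrary choices of mine. The only subtle part is keeping the last len(pattern) - 1 bytes of each buffer, so that a match straddling two chunks is not missed.

```python
def find_bytes(path, pattern, chunk_size=1 << 20):
    """Return the file offset of the first `pattern` in `path`, or -1."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    offset = 0   # file offset of the first byte currently held in `buf`
    buf = b""
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                return -1
            buf += chunk
            pos = buf.find(pattern)
            if pos != -1:
                return offset + pos
            # Keep only the tail that could still be the start of a match.
            keep = buf[-overlap:] if overlap else b""
            offset += len(buf) - len(keep)
            buf = keep
```

Called as find_bytes(file, bytes.fromhex("48 8B 45 A8 48 8D 1C 02 48 8B 45 C8")), it returns the same kind of value as the in-memory version: an offset into the file, or -1.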
For 2), do
with open(file, 'rb') as stream:
    stream.seek(0x6b0c67)
    data = stream.read(12)
I'm afraid I don't understand the question in 3)...
I have data files which contain series of 32-bit binary "numbers."
I say "numbers" because the 32 ones and zeros encode what type of data the sensors were picking up, when they picked it up, which sensor it was, and so on, so the decimal value of the numbers is of no concern to me. In particular, some (most) of the words will begin with up to 5 zeros.
I simply need a way in Python to read these files and get a list containing each 32-bit number. I will then need to modify the list a little (delete some events) and write it out to a new file.
Can anyone help me with this? I've tried the following so far, but the numbers that should correspond to the time data we encode seem to be impossible.
with open(lm_file, mode='rb') as file:
    bytes_read = file.read(struct.calcsize("I"))
    while bytes_read:
        idList = struct.unpack("I", bytes_read)[0]
        idList=bin(idList)
        print(idList)
        bytes_read = file.read(struct.calcsize("=l"))
Output of hexdump:
00000000 80 0a 83 4d ba a5 80 0c c0 00 7b 42 cb 90 0f 41 |...M......{B...A|
00000010 98 c9 9c 53 4c 15 35 52 d8 54 f7 0a 5d 87 16 4d |...SL.5R.T..]..M|
00000020 89 6a 3f 04 f2 eb c4 4a e2 37 e6 08 23 5e ca 06 |.j?....J.7..#^..|
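One thing that may be worth ruling out is byte order and lost leading zeros. "I" without a prefix uses the machine's native byte order, and bin() drops leading zeros. The sketch below is only a guess at a fix: it assumes the words are stored big-endian (pass "<" if they are little-endian), and the helper names unpack_words, pack_words and as_bits are made up for illustration.

```python
import struct


def unpack_words(data, byteorder=">"):
    """Return the 32-bit unsigned words in `data` as a list of ints."""
    usable = len(data) - len(data) % 4  # ignore a trailing partial word
    return [w for (w,) in struct.iter_unpack(byteorder + "I", data[:usable])]


def pack_words(words, byteorder=">"):
    """Inverse of unpack_words: serialise ints back to 4 bytes each."""
    return struct.pack("%s%dI" % (byteorder, len(words)), *words)


def as_bits(word):
    """Format a word as exactly 32 binary digits, keeping leading zeros."""
    return format(word, "032b")


# First eight bytes of the hexdump above, used as sample input.
sample = bytes.fromhex("80 0a 83 4d ba a5 80 0c")
words = unpack_words(sample)
bits = [as_bits(w) for w in words]
```

For a real file, the bytes would come from open(lm_file, 'rb').read(); after deleting the unwanted entries from the list, pack_words(words) gives the bytes to write to a new file opened with 'wb'.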
My question may be simple, but I'm not good with byte/hex operations. I need to compute a checksum for serial port data with these values:
55 55 3A 0B 47 09 3E 08 FF 0F 93
The last value, 93, is the checksum, but I don't know how to compute it.
55 + 55 + 3A + 0B + 47 + 09 + 3E + 08 + FF + 0F = 93
Convert the raw bytestring into a sequence of numbers, then add all but the last number, mask the sum to one byte, and compare the result with the last number in the sequence.
>>> data = bytearray(b'\x55\x55\x3a\x0b\x47\x09\x3e\x08\xff\x0f\x93')
>>> sum(data[:-1]) & 0xff == data[-1]
True
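The same check can be wrapped in a small function that starts from the hex text as it appears in the question. This is a sketch for Python 3; the name checksum_ok is my own. The full sum of the first ten bytes is 0x293, and masking with 0xFF keeps only the low byte, 0x93.

```python
def checksum_ok(frame):
    """True if the last byte equals the low 8 bits of the sum of the rest."""
    payload, expected = frame[:-1], frame[-1]
    return (sum(payload) & 0xFF) == expected


frame = bytes.fromhex("55 55 3A 0B 47 09 3E 08 FF 0F 93")
total = sum(frame[:-1])   # full sum before masking
valid = checksum_ok(frame)
```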
The following Fortran code:
INTEGER*2 :: i, Array_A(32)
Array_A(:) = (/ (i, i=0, 31) /)
OPEN (unit=11, file = 'binary2.dat', form='unformatted', access='stream')
Do i=1,32
WRITE(11) Array_A(i)
End Do
CLOSE (11)
produces stream binary output containing the numbers 0 to 31 as 16-bit integers. Each value takes up 2 bytes, so the values are written at bytes 1, 3, 5, 7 and so on. The access='stream' setting suppresses the record header that Fortran normally writes for each record (I need that to keep the files as small as possible).
Looking at it with a Hex-Editor, I get:
00 00 01 00 02 00 03 00 04 00 05 00 06 00 07 00
08 00 09 00 0A 00 0B 00 0C 00 0D 00 0E 00 0F 00
10 00 11 00 12 00 13 00 14 00 15 00 16 00 17 00
18 00 19 00 1A 00 1B 00 1C 00 1D 00 1E 00 1F 00
which is completely fine (despite the fact that the second byte is never used, because decimals are too low in my example).
Now I need to import these binary files into Python 2.7, but I can't. I tried many different routines, but I always fail in doing so.
1. attempt: "np.fromfile"
with open("binary2.dat", 'r') as f:
    content = np.fromfile(f, dtype=np.int16)
returns
[ 0 1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21 22 23
24 25 0 0 26104 1242 0 0]
2. attempt: "struct"
import struct
with open("binary2.dat", 'r') as f:
    content = f.readlines()
struct.unpack('h' * 32, content)
delivers
struct.error: unpack requires a string argument of length 64
because
print content
['\x00\x00\x01\x00\x02\x00\x03\x00\x04\x00\x05\x00\x06\x00\x07\x00\x08\x00\t\x00\n', '\x00\x0b\x00\x0c\x00\r\x00\x0e\x00\x0f\x00\x10\x00\x11\x00\x12\x00\x13\x00\x14\x00\x15\x00\x16\x00\x17\x00\x18\x00\x19\x00']
(note the delimiter, the t and the n which shouldn't be there according to what Fortran's "streaming" access does)
3. attempt: "FortranFile"
f = FortranFile("D:/Fortran/Sandbox/binary2.dat", 'r')
print(f.read_ints(dtype=np.int16))
With the error:
TypeError: only length-1 arrays can be converted to Python scalars
(remember how it detected a delimiter in the middle of the file, but it would also crash for shorter files without line break (e.g. decimals from 0 to 8))
Some additional thoughts:
Python seems to have troubles with reading parts of the binary file. For np.fromfile it reads Hex 19 (dec: 25), but crashes for Hex 1A (dec: 26). It seems to be confused with the letters, although 0A, 0B ... work just fine.
For attempt 2 the content-result is weird. Decimals 0 to 8 work fine, but then there is this strange \t\x00\n thing. What is it with hex 09 then?
I've been spending hours trying to find the logic, but I'm stuck and really need some help. Any ideas?
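The problem is the mode the file is opened in. The default is text mode, and in text mode some byte values are treated specially: on Windows, 0x1A (Ctrl-Z) is likely being taken as end of file, which would explain why reading stops after 25, and readlines() splits the data at 0x0A. Open the file in binary mode instead:
with open("binary2.dat", 'rb') as f:
    content = np.fromfile(f, dtype=np.int16)
and all the numbers will be read successfully. See the Binary Files chapter of Dive Into Python for more details.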
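The struct attempt from the question also works once the file is opened with 'rb' and read in one piece with read() instead of readlines(). The sketch below recreates the 64-byte file from Python rather than Fortran so that it is self-contained; it assumes little-endian 16-bit values, which is what the hex dump shows (low byte first), and writes to a temporary directory of its own.

```python
import os
import struct
import tempfile

# Recreate the 64-byte file: 0..31 as little-endian 16-bit integers.
values = list(range(32))
path = os.path.join(tempfile.mkdtemp(), "binary2.dat")
with open(path, "wb") as f:
    f.write(struct.pack("<32h", *values))

# 'rb' returns the bytes untouched, including 0x0A and 0x1A.
with open(path, "rb") as f:
    raw = f.read()

content = list(struct.unpack("<32h", raw))
```

With the explicit "<" prefix the result does not depend on the byte order of the machine doing the reading.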