I am new to Python and am trying to create a packet parser. I read a packet from a raw socket and stripped off the Ethernet header with the following commands:
>>raw=socket.socket(socket.PF_PACKET,socket.SOCK_RAW,socket.htons(0x800)) # Raw packet socket created
>>raw=raw.recvfrom(2048) #Received data from socket
>>raw
('\x01\x00^\x00\x00\x01T\xe6\xfc\xd0\x93\x10\x08\x00F\xc0\x00 \x00\x00@\x00\x01\x02Bm\xc0\xa8\x01\x01\xe0\x00\x00\x01\x94\x04\x00\x00\x11d\xee\x9b\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', ('eth0', 2048, 2, 1, 'T\xe6\xfc\xd0\x93\x10'))
>> ether=raw[0][0:14] #Ethernet header
>>ether_unpack=struct.unpack("!6s6s2s",ether) #Unpack ethernet header into string form
>>ether_unpack #print unpacked data
('\x01\x00^\x00\x00\x01', 'T\xe6\xfc\xd0\x93\x10', '\x08\x00')
>>ether_hex=binascii.hexlify(ether_unpack[0]) #converted data into hexadecimal format
.
.
.
tcpHeader=raw[0][34:54] #stripping TCP header
tcp_hdr=struct.unpack("!HH16s", tcpHeader) # TCP header unpack
First Question : what is the format of '\x01\x00^\x00\x00\x01'; What is the format of numerics in my first output
Second question: ether=raw[0][0:14]; [0:14] that takes 14 bytes from first tuple? Requires confirmation
Third Question: tcp_hdr=struct.unpack("!HH16s", tcpHeader) What does the first argument do? I took this command from somewhere and cannot figure out why there is a 'double H' in the first argument.
Thanks in advance!
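1) The first element of raw is a byte string. Python displays printable bytes as ASCII characters and all other bytes as \xNN hex escapes, so '\x01\x00^\x00\x00\x01' is the six bytes 01 00 5e 00 00 01. In Python 2 you can convert it to a list of ints using: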
>> payload = [int(x.encode('hex'), 16) for x in raw[0]]
>> [1, 0, 94, 0, 0, 1, 84, 230, 252, 208, 147, 16, 8, 0, 70, 192, 0, 32, 0, 0, 64, 0, 1, 2, 66, 109, 192, 168, 1, 1, 224, 0, 0, 1, 148, 4, 0, 0, 17, 100, 238, 155, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
2) Yes, raw[0][0:14] takes the first 14 bytes of the first element of the raw tuple.
3) The first argument for the unpack function is the format in which the given string should be unpacked (https://docs.python.org/2/library/struct.html). The first character of the format string can be used to indicate the byte order, size and alignment of the packed data; the ! character in this case selects network byte order, which is big-endian. The 'double H' means that two unsigned short integers should be unpacked (2 bytes each), followed by a 16-byte string defined by 16s. Accordingly, tcpHeader is a 20-byte string and tcp_hdr stores an (int, int, string) tuple.
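To make points 2 and 3 concrete, here is a minimal Python 3 sketch. The two byte strings are the [0:14] and [34:54] slices of the frame shown in the question; the variable names are only illustrative, and whether the second slice really holds a TCP header depends on the packet.

```python
import binascii
import struct

# Bytes 0..13 of the frame from the question (the Ethernet header).
ether = b"\x01\x00^\x00\x00\x01T\xe6\xfc\xd0\x93\x10\x08\x00"
# Bytes 34..53 of the same frame (20 bytes).
segment = b"\x94\x04\x00\x00\x11d\xee\x9b" + b"\x00" * 12

# "!" = network (big-endian) order, 6s = 6 raw bytes, H = unsigned 16-bit int.
dst, src, eth_type = struct.unpack("!6s6sH", ether)
dst_hex = binascii.hexlify(dst).decode("ascii")

# Same format as in the question: two 16-bit ints, then 16 raw bytes.
first, second, rest = struct.unpack("!HH16s", segment)

print(dst_hex, hex(eth_type), first, second, len(rest))
```

Unpacking the EtherType with H instead of 2s gives an integer directly, which is usually easier to compare against 0x0800.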
I am having trouble creating a bytes object in Python. I want to convert the int list into bytes as shown in the expected result below. I need to send the expected output to a serial device connected to a COM port, and with the current output the device does not accept '\\' as a separator. Please suggest the correct way to handle the '\' in a list of bytes.
cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]
#Convert CMD list to Hex List
for i in range(len(cmdlist)):
    cmdlist[i] = hex(cmdlist[i])
f_cmdList = ''
#Convert hex CMD list to string List
for i in range(len(cmdlist)):
    f_cmdList += '\\' + (cmdlist[i])
Final_cmdlist = (bytes(f_cmdList,'utf-8'))
print(Final_cmdlist)
Current output : b'\\0x2\\0xc\\0x1\\0x1\\0x1\\0x0\\0x0\\0x1\\0x3\\0x7\\0x2a\\0x65\\0x55\\0x12'
Expected output : b'\0x2\0xc\0x1\0x1\0x1\0x0\0x0\0x1\0x3\0x7\0x2a\0x65\0x55\0x12'
Thank You !
You can convert a list to bytes simply by using the bytes constructor. Your method is trying to create a string that contains the string representation of a byte array, which won't work when sent to the serial device:
>>> cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]
>>> bytes(cmdlist)
b'\x02\x0c\x01\x01\x01\x00\x00\x01\x03\x07*eU\x12'
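To see the difference between the two approaches, it may help to compare lengths. This is only a sketch that reuses cmdlist from the question; the pyserial call is left as a comment because the port name "COM3" is a placeholder, not something from the question.

```python
cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]

# One byte per list element: this is what the device should receive.
payload = bytes(cmdlist)

# The question's approach builds text such as "\0x2\0xc...",
# which takes several characters for every value.
as_text = "".join("\\" + hex(v) for v in cmdlist).encode("utf-8")

print(len(payload), len(as_text))

# With pyserial, the raw bytes would be written as-is, for example:
# serial.Serial("COM3").write(payload)
```

The escaped text is several times longer than the 14-byte payload, so the device sees a completely different message.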
You get what you say you expect if you replace your
f_cmdList += '\\' + (cmdlist[i])
with this:
f_cmdList += '\0' + cmdlist[i][1:]
Still not convinced that you really want that, though.
I have a 10 MB binary file that I want to read bit by bit. As far as I know, NumPy cannot read data bit by bit, only byte by byte. So I first read the file with np.fromfile and then unpack each byte into 8 bits with np.unpackbits. Here is how I did it:
fbyte = np.fromfile(binar_file, dtype='uint8')
fbit = np.unpackbits(fbyte)
What I get in fbit is a long bit array in which each group of 8 bits is ordered MSB to LSB, e.g. 10010011 ..., whereas I expected LSB to MSB, like 11001001. Flipping every 8 bits with a for loop would solve the problem, but it would take time that I would like to avoid, since I want to read thousands of files. So my question is: is there a way to unpack the bytes into bits directly in LSB-to-MSB order? For comparison, this is easy in Matlab, since the fread function lets me specify the bit configuration, e.g. 'ubit1' for reading bit by bit, and the result comes out as I expect (LSB to MSB). Any help or hints would be appreciated. Thanks.
You could simply reshape to 2D keeping 8 columns and then flip those, like so -
np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
Sample run -
In [1176]: fbyte
Out[1176]: array([253, 35, 198, 182, 62], dtype=uint8)
In [1177]: np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
Out[1177]:
array([[1, 0, 1, 1, 1, 1, 1, 1],
[1, 1, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 0, 0, 0, 1, 1],
[0, 1, 1, 0, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 0]], dtype=uint8)
Timings on one million elements array -
In [1173]: fbyte = np.random.randint(0,255,(1000000)).astype(np.uint8)
In [1174]: %timeit np.unpackbits(fbyte).reshape(-1,8)[:,::-1]
1000 loops, best of 3: 541 µs per loop
Seems crazy fast to me!
In NumPy 1.17 and newer, unpackbits accepts a bitorder parameter that will accomplish this -- just pass bitorder="little" to the np.unpackbits call.
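As a quick check, the sketch below compares the bitorder route with the reshape-and-flip route on the sample bytes used in this thread. It assumes NumPy 1.17 or newer is installed.

```python
import numpy as np

fbyte = np.array([253, 35, 198, 182, 62], dtype=np.uint8)

# Requires NumPy >= 1.17: each byte is emitted least-significant bit first.
little = np.unpackbits(fbyte, bitorder="little")

# The reshape-and-flip approach, flattened for comparison.
flipped = np.unpackbits(fbyte).reshape(-1, 8)[:, ::-1].ravel()

print(np.array_equal(little, flipped))
```

Both routes should give the same flat array; the bitorder version just avoids the intermediate 2D view.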
I'm writing a program in Linux which reads and distinguishes input from two USB devices (two barcode readers), each of which emulates a keyboard.
I can already read input from USB, but I get it before the OS translates the keycode into a character.
For example, when I read 'a' i got 24, 'b' 25, etc....
For example, when I read 'a' i got 4, 'b' 5, etc....
Is there any way to convert that code in a char without manual mapping?
Some output examples:
KEYPRESS = a output = array('B', [0, 0, 4, 0, 0, 0, 0, 0])
KEYPRESS = SHIFT + a output = array('B', [2, 0, 4, 0, 0, 0, 0, 0])
KEYPRESS = 1 output = array('B', [0, 0, 30, 0, 0, 0, 0, 0])
KEYPRESS = ENTER output = array('B', [0, 0, 81, 0, 0, 0, 0, 0])
thx!
Use the chr function. Python uses a different character mapping (ASCII) from whatever you're receiving though, so you will have to add 73 to your key values to fix the offset.
>>> chr(24 + 73)
'a'
>>> chr(25 + 73)
'b'
I can already read input from USB, but I get it before the OS
translates the keycode into a character.
The problem seems to me in your interface or the driver program.
In ASCII, 'a' has ordinal value 97, whose binary representation is 0b1100001, whereas what you are receiving is 24, whose binary representation is 0b11000. Similarly, for 'b' you should have received 0b1100010, but instead you received 25, which is 0b11001. Check your hardware to determine whether the 1st and the 3rd bits are being dropped from the input.
What you are receiving is a USB scan code. I do not think there is a third-party Python library to do the conversion for you. I would suggest referring to a USB scan code table and using it to create a dictionary that maps each USB scan code to the corresponding ASCII character.
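A minimal sketch of such a dictionary follows. It assumes the 8-byte layout seen in the question's output (modifier byte first, first key code in the third byte) and the usual USB HID usage IDs, where letters start at 4 and digits at 30. Only letters and digits are mapped here; other keys would need to be filled in from a usage table. The names USAGE_TO_CHAR and decode_report are made up for this example.

```python
# Partial table of USB HID keyboard usage IDs.
# Letters occupy 4..29, digits 1..9 occupy 30..38, and 39 is '0'.
USAGE_TO_CHAR = {4 + i: chr(ord("a") + i) for i in range(26)}
USAGE_TO_CHAR.update({30 + i: str(i + 1) for i in range(9)})
USAGE_TO_CHAR[39] = "0"

SHIFT_MASK = 0x02 | 0x20  # left and right Shift bits in the modifier byte


def decode_report(report):
    """Return the character for an 8-byte report, or None if unmapped."""
    modifiers, key = report[0], report[2]  # report[1] is reserved
    char = USAGE_TO_CHAR.get(key)
    if char is not None and char.isalpha() and modifiers & SHIFT_MASK:
        char = char.upper()
    return char


print(decode_report([0, 0, 4, 0, 0, 0, 0, 0]))
print(decode_report([2, 0, 4, 0, 0, 0, 0, 0]))
```

Returning None for unmapped codes keeps the caller in control of how to treat keys such as Enter or the arrow keys.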
I append to a TCP packet an unsigned long long with value 4 and an additional unsigned long long with value 8616616 (I don't remember the second value exactly).
I do it in C on 32-bit Ubuntu, so an unsigned long long is 8 bytes.
I sniff the packet with scapy and print padding.load.
In the output I see symbols whose meaning I don't understand: g and |.
In addition, the load should be 16 bytes, but I don't see 16 bytes.
If I append only one unsigned long long, I get 8 bytes and I don't see these symbols.
>>> pkt = sniff(count=2,filter="tcp")
>>> raw = pkt[1].sprintf('%Padding.load%')
>>> raw
"'\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00g\\xc4|\\x00\\x00\\x00\\x00\\x00'"
>>> print raw
'\x04\x00\x00\x00\x00\x00\x00\x00g\xc4|\x00\x00\x00\x00\x00'
When you print the value of raw, Python shows every byte whose value is a printable ASCII code (32 to 126) as the corresponding character. Where you see g, the value of that byte is 103; likewise, | is the character with ASCII code 124. Bytes outside that range are shown as hex escapes, which is why you have \xc4 in your output: the value of that byte is 196.
The actual value of each of the bytes in raw is:
[4, 0, 0, 0, 0, 0, 0, 0, 103, 196, 124, 0, 0, 0, 0, 0]
Which is 16 bytes long.
You can test this by converting the value of each byte back into a character:
>>> values = [4, 0, 0, 0, 0, 0, 0, 0, 103, 196, 124, 0, 0, 0, 0, 0]
>>> as_characters = ''.join(chr(c) for c in values)
>>> as_characters
'\x04\x00\x00\x00\x00\x00\x00\x00g\xc4|\x00\x00\x00\x00\x00'
>>> len(as_characters)
16
I think what you have for raw has had each of the bytes escaped. In my example, when I output as_characters I only see a single backslash, whereas you have two. You may need to use something like pkt[1].sprintf('%r,Padding.load%') to get the non-escaped version.
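Once you have the 16 raw bytes, the two integers can be recovered with struct. This is a sketch in Python 3 that assumes the sender is little-endian (typical for x86 Ubuntu); the byte string is the load from the question written as a bytes literal.

```python
import struct

# The 16 payload bytes from the question.
load = b"\x04\x00\x00\x00\x00\x00\x00\x00g\xc4|\x00\x00\x00\x00\x00"

# "<QQ" = two little-endian unsigned 64-bit integers.
first, second = struct.unpack("<QQ", load)

print(len(load), first, second)
```

The g and | characters are simply the low bytes 0x67 and 0x7c of the second integer.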
So I have kind of an ignorant (maybe?) question. I'm working with writing to a serial device for the first time. I have a frame [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, X, Y] that I need to send. X and Y are checksum values. My understanding in using the pyserial module is that I need to convert this frame into a string representation. OK, that's fine, but I'm confused about what format things are supposed to be in. I tried doing
a = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, X, Y]
send = "".join(chr(t) for t in a)
But my confusion comes from the fact that X and Y, when using chr, transform into weird strings (I assume their ASCII representation). For example, if X is 36, chr(X) is '$' instead of '\x24'. Is there a way I can get a string representing the '\xnn' value instead of the ASCII character? What's confusing me is that 12 and 7 convert to '\x0c' and '\x07' correctly. Am I missing something?
Update:
So it might be that I'm not quite understanding how serial writes are being done or what my device is expecting of me. This is a portion of my C code that is working:
fd=open("/dev/ttyS2",O_RDWR|O_NDELAY);
char buff_out[20];
//Next line is pseudocode
for i in buff_out print("%x ",buff_out[i]); // prints b 0 0 0 0 0 0 0 9 b3 36
write(fd,buff_out,11);
sleep()
read(fd,buff_in,size);
for i in buff_in print("%x ",buff_in[i]); // prints the correct frame that I'm expecting
Python:
frame = [11, 0, 0, 0, 0, 0, 0, 0, 9] + [crc1, crc1]
senddata = "".join(chr(x) for x in frame)
IEC = serial.Serial(port='/dev/ttyS2', baudrate=1200, timeout=0)
IEC.send(senddata)
IEC.read(18) # number of bytes to read doesn't matter, it's always 0
Am I going about this the right way? Obviously you can't tell exactly since it's device specific and I can't really give too many specifics out. But is that the correct format that serial.send() expects data in?
It's perfectly normal for ASCII bytes to be represented by single characters if they can be printed, and by the \x?? notation otherwise. In both cases they represent a single byte, and you can write strings in either fashion:
>>> '\x68\x65\x6c\x6c\x6f'
'hello'
However if you're using Python 2.6 or later then you might find it easier and more natural to use the built-in bytearray rather than messing around with ord or struct.
>>> vals = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 36, 100]
>>> b = bytearray(vals)
>>> b
bytearray(b'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00$d')
You can convert to a str (or bytes in Python 3) just by casting, and can index the bytearray to get the integers back.
>>> str(b)
'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00$d'
>>> b[0]
12
>>> b[-1]
100
As to your serial Python code, it looks fine to me - I'm not sure why you think there is a problem...
The character with the ASCII-code 36 is '$'. Look it up in any ASCII table. Python only displays the hex escapes if the character is not printable (control characters etc).
At the lowest level, it's the same bit pattern anyway - no matter whether Python prints it as a hex escape or as the char with that ASCII value.
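A short Python 3 check of that claim, using the example frame from the question with 36 and 100 standing in for the two checksum bytes (those two values are only placeholders):

```python
# '$' and '\x24' are two spellings of the same single byte, value 36.
same = b"$" == b"\x24" == bytes([36])

frame = bytes([12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 36, 100])
shown = repr(frame)    # printable bytes are displayed as characters
values = list(frame)   # the underlying integers are unchanged

print(same, shown, values)
```

Whatever the display looks like, the list of integers that comes back is the one that went in.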
But you might want to use the struct module, it takes care of such conversions for you.
I would guess you want struct.
>>> import struct
>>> struct.pack('>B', 12)
'\x0c'
>>> vals = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0xa, 0xb]
>>> ''.join(struct.pack('>B', x) for x in vals)
'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00\n\x0b'
What you do is perfectly fine: your send is what you want: a sequence of bytes with the values you want (a).
If you want to see what are the hexadecimal codes of the characters in send, you can do:
import binascii
print binascii.hexlify(send)
or
print ''.join(r'\x%02x' % ord(char) for char in send)
(if you want \x prefixes).
What you see when directly printing repr(send) is a representation of send, which uses ASCII: 65 represents 'A', but character 12 is '\x0c'. This is merely a convention used by Python, which is convenient when the string contains words, for instance: it is better to display 'Hello' than \x48\x65\x6c\x6c\x6f!
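The two snippets above are written for Python 2. A rough Python 3 equivalent, assuming send is a bytes object rather than a str, could look like this (the frame values are the example ones used earlier in this thread):

```python
import binascii

send = bytes([12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 36, 100])

# hexlify returns bytes in Python 3, so decode it for display.
plain_hex = binascii.hexlify(send).decode("ascii")

# Iterating over bytes yields integers in Python 3, so ord() is not needed.
escaped = "".join("\\x%02x" % b for b in send)

print(plain_hex)
print(escaped)
```

Either form is only for inspection; the bytes object itself is what should be written to the serial port.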