Python decoding UDP - python

code:
import socket, binascii, struct
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_UDP)
while True:
    print s.recv(2048)
output:
Ek�9##�F5��W�jq��� stackexchangecom� electronics �
h h
stackexchangecomDa�scifi ET##�<���� stackoverflowcom���meta
,��� stackoverflowcom�A���meta
,��� stackexchangecomG��security Ee##�+���� stackexchangecom���scifi
As you can see, some of the data has been decoded/interpreted but the rest hasn't, and I'm not sure why.
Can anyone help?

You're printing raw UDP packets, which contain arbitrary binary data. Some of those bytes are in the printable range, but those that aren't in that range get converted into �.
You can get a better look at that data by printing its representation, which shows the printable bytes as normal and shows the unprintable ones as hexadecimal escape codes. To do that, change your print statement to:
print repr(s.recv(2048))
I suspect you'd like to actually decode those packets. That's quite possible, but it's a bit technical, and you should probably study the topic a bit first. :) This article by Silver Moon, Code a network packet sniffer in python for Linux, looks quite helpful.
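As a starting point for decoding, here is a minimal sketch in Python 3 (the question's code is Python 2). It assumes the packets come from an IPv4 raw socket on Linux, where each received packet starts with the IP header (the leading E in the output above is the byte 0x45, i.e. IP version 4 with a 5-word header), followed by the 8-byte UDP header and the payload. The function name parse_udp_packet and the hand-built test packet are illustrative only.

```python
import struct

def parse_udp_packet(packet):
    """Split a raw IPv4 packet into its UDP header fields and payload."""
    # The low nibble of the first byte is the IP header length in 32-bit words.
    ihl = (packet[0] & 0x0F) * 4
    # The UDP header is four unsigned 16-bit fields in network byte order.
    src_port, dst_port, length, checksum = struct.unpack(
        '!HHHH', packet[ihl:ihl + 8])
    # The UDP length field counts the 8-byte header plus the payload.
    payload = packet[ihl + 8:ihl + length]
    return src_port, dst_port, length, checksum, payload

# Hand-built example packet: a dummy 20-byte IP header (only the first
# byte matters for this sketch), then a UDP header and a 5-byte payload.
ip_header = bytes([0x45]) + bytes(19)
udp = struct.pack('!HHHH', 53, 40000, 8 + 5, 0) + b'hello'
print(parse_udp_packet(ip_header + udp))
```

The payload of the packets in the question looks like DNS traffic, which has its own binary format and would need a further decoding step.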

Related

How to decode Byte Array in Python

Hey, I contacted the company multiple times and after some weird conversations I got some code that let me read and decode the data. Thank you everyone for your help!
I connected a PCB to my Raspberry Pi that should output temperature, humidity, pressure and air quality. I receive the data via serial. I wrote a Python script that reads from the serial port and prints the data.
#!/usr/bin/env python
import time
import serial
ser = serial.Serial(
    port='/dev/ttyAMA0',
    baudrate=9600,
    parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE,
    bytesize=serial.EIGHTBITS,
    timeout=1
)
while 1:
    x = ser.readline()
    print(x)
And the data looks like this (multiple sample data):
b'ZZ?\x0f\t,\x16a\x01\x86\x8d\x10Y\x00\x02\xa5\x9b\x00p\xdd'
b'ZZ?\x0f\t.\x16]\x01\x86\x8f\x10Z\x00\x02\xa3\x7f\x00p\xc0'
b'ZZ?\x0f\t0\x16[\x01\x86\x91\x10Y\x00\x02\xa2\xcc\x00p\r'
b'ZZ?\x0f\t2\x16S\x01\x86\x91\x10V\x00\x02\xa4\xe7\x00p!'
b'ZZ?\x0f\t3\x16O\x01\x86\x8f\x10X\x00\x02\xa3\x7f\x00p\xb5'
So these should be multiple byte arrays. Sadly there is no documentation, so I can't find anything on how to decode this. If I try to decode the data:
x=ser.readline().decode()
I get the following error:
Traceback (most recent call last):
  File "ser.py", line 16, in <module>
    x=ser.readline().decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 9: invalid start byte
So maybe the data is not UTF-8? Ignoring the errors does not help. Does someone know how to properly decode the data? That would help me a lot!
Thanks!
This looks like raw binary data (not human readable in any of the common encodings I tried). You'll need to look up the structure of bytes and likely use the struct library to convert to regular python objects.
If there is no documentation you'll have to reverse engineer it. Each bytearray is 20 bytes long, and the first four bytes are all the same, so my gut assumption is that the first four bytes (32 bits) are a header, followed by your four values as 32 bit floats or ints. If that is the case you could decode each array with something like:
>>>struct.unpack('iiiii', b'ZZ?\x0f\t3\x16O\x01\x86\x8f\x10X\x00\x02\xa3\x7f\x00p\xb5')
(255810138, 1326854921, 277841409, -1560149928, -1250951041)
The examples you provided suggest that the simple case of all 4-byte numbers probably doesn't apply (none of those numbers make sense for weather readings). The data may instead be a mixture of numbers of various lengths, to account for the various sensors having differing levels of precision.
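One way to explore the mixed-width idea is to try other layouts and see whether the numbers look plausible. The sketch below (Python 3) is purely a guess, not based on any documentation: it treats the first four bytes as a header and reads the remaining 16 bytes as eight big-endian unsigned 16-bit values.

```python
import struct

# First sample from the question (20 bytes).
sample = b'ZZ?\x0f\t,\x16a\x01\x86\x8d\x10Y\x00\x02\xa5\x9b\x00p\xdd'

# Guessed layout (an assumption): 4-byte header, then eight
# big-endian unsigned shorts. '>' also disables alignment padding.
header, *fields = struct.unpack('>4s8H', sample)
print(header, fields)
```

With this layout the first two fields come out as 2348 and 5729. Those could be read as hundredths of a degree and hundredths of a percent, but that interpretation is speculation. Comparing how each field changes across successive samples is a reasonable way to test such guesses.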

Spoofing bytes of a UDP checksum over network

I'm trying to play with a security tool using scapy to spoof ASCII characters in a UDP checksum. I can do it, but only when I hardcode the bytes in Hex notation. But I can't convert the ASCII string word into binary notation. This works to send the bytes of "He" (first two chars of "Hello world"):
sr1(IP(dst=server)/UDP(dport=53, chksum=0x4865)/DNS(rd=1,qd=DNSQR(qname=query)),verbose=0)
But whenever I try to use a variable of test2 instead of 0x4865, the DNS packet is not transmitted over the network. This should create binary for this ASCII:
test2 = bin(int(binascii.hexlify('He'),16))
sr1(IP(dst=server)/UDP(dport=53, chksum=test2)/DNS(rd=1,qd=DNSQR(qname=query)),verbose=0)
When I print the test2 variable, it shows the correct binary representation.
How do I convert a string such as He so that it is in the checksum notation accepted by scapy, i.e. 0x4865?
I was able to get this working by removing the bin(). This works:
test2 = int(binascii.hexlify('He'),16)
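The reason removing bin() helps is that bin() returns a text string such as '0b100...', while 0x4865 in source code is simply an integer literal. The sketch below (Python 3, so the input is a bytes object rather than a Python 2 str) shows two standard-library ways to get the same integer from the two characters; it does not use scapy, and the variable names are illustrative.

```python
import struct

word = b'He'

# Interpret the two bytes as one big-endian unsigned 16-bit integer.
as_int = int.from_bytes(word, 'big')
as_int_struct = struct.unpack('!H', word)[0]

# bin() produces text, not a number, so it is only useful for display.
as_binary_text = bin(as_int)

print(hex(as_int), as_int_struct, as_binary_text)
```

Either integer can then be passed wherever the hard-coded 0x4865 was used.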

Discovering data type of incoming socket data in python

There are a couple of devices sending socket data over TCP/IP to a socket server. Some of the devices send data as a binary-encoded hexadecimal string, others as an ASCII string.
E.g.:
If a device sends data as an ASCII string, the script begins processing immediately without any conversion.
If a device sends a binary-encoded hex string, the script first has to convert it into a hex string with:
data = binascii.hexlify(data)
There are two scripts running for the different data types because of that single line. But I think this could be done in one script if the script were aware of the incoming data type. Is there a way to discover the type of the incoming socket data in Python?
If you can, you should make the sending devices signal what kind of data they are sending, e.g. by using different TCP ports or by prepending each message with an "h" for hex or an "a" for ASCII. You could even use an established protocol like XML-RPC.
Otherwise you can only be sure in some cases, because all hex-encoded strings are valid ASCII and some ASCII strings are also valid hex, like "CAFE".
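The prefix idea could look roughly like the sketch below (Python 3). The one-byte tags follow the "h"/"a" suggestion above; the function name decode_tagged and the choice to return text in both cases are assumptions for illustration.

```python
import binascii

def decode_tagged(message):
    """Return the payload as text, based on a one-byte type tag."""
    tag, payload = message[:1], message[1:]
    if tag == b'a':
        # Payload is already an ASCII string; no conversion needed.
        return payload.decode('ascii')
    if tag == b'h':
        # Payload is raw binary; convert it to a hex string.
        return binascii.hexlify(payload).decode('ascii')
    raise ValueError('unknown type tag: %r' % tag)

print(decode_tagged(b'aCAFE'))      # ASCII device
print(decode_tagged(b'h\xca\xfe'))  # binary device
```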
You can check whether a string can possibly be decoded as hex with
import string
def is_possibly_hex(s):
    return all(c in string.hexdigits for c in s)
or
import binascii
def is_possibly_hex(s):
    try:
        binascii.unhexlify(s)
    except binascii.Error:
        return False
    return True
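If the devices cannot be changed, a single script could apply such a check as a heuristic. The sketch below (Python 3, operating on bytes as received from a socket) assumes the ASCII devices send only hex digits; the name normalize is hypothetical. As noted above, the guess can be wrong, for example when raw binary data happens to consist only of bytes that are also hex-digit characters.

```python
import binascii
import string

# The byte values of the characters 0-9, a-f and A-F.
HEX_BYTES = string.hexdigits.encode('ascii')

def normalize(data):
    """Return a hex string whichever kind of device sent the data."""
    if data and all(b in HEX_BYTES for b in data):
        # Looks like hex text already; pass it through unchanged.
        return data.decode('ascii')
    # Otherwise treat it as raw binary and convert it.
    return binascii.hexlify(data).decode('ascii')

print(normalize(b'CAFE'))
print(normalize(b'\xca\xfe'))
```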

Python: Creating 16-bit source and destination ports for a packet header

I'm creating a networking protocol in application space on top of UDP in python for homework. I need to represent the source port and destination port as 16-bit numbers. All attempts have failed.
The way I'm testing this is by creating a udp socket and looking at the return value of sendto(). Here's your typical socket code:
import socket
mySock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
addr = ("127.0.0.1", 1234)
Ports range from 0 to 65535. Let's say I choose a port of 65000. I want sendto() to return 2 (2 bytes = 16 bits sent). Here's what I've tried:
I call the following and get:
>>>mySock.sendto(65000, addr)
TypeError: must be string or buffer, not int
Ok, let's try using bytes()
>>>mySock.sendto(bytes(65000), addr)
5
Hm, that's not what I want. That is making each number into a character that is a single byte.
What if I bitwise or it with 0x0000?
>>>mySock.sendto(bytes(65000 | 0x0000), addr)
5
Well, darn it! The closest thing I've come to is messing around with hex() and bytearray(). See below.
>>>hex(65000)
'0xfde8'
>>>mySock.sendto('\xfde8', addr)
3
Shouldn't that say 2 bytes? I'm not sure how this works. Also, when the number is less than 16384 I want to preserve the preceding 0's. So, for example, if the port number is 255 (0b0000000011111111) I want it to remain as a 2 byte data structure (0x00FF) rather than truncating down to 0xFF or 0b11111111.
If you want to send binary data, use the struct module. It will encode the value for you and make sure that you are using the proper endianness. For example:
>>> import struct
>>> struct.pack('!H', 65000)
'\xfd\xe8'
That's 65000 as an unsigned short, in network byte order (big-endian).
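A slightly fuller sketch in Python 3 (where the packed result is a bytes object) shows both ports packed into one 4-byte header, the leading zero byte being preserved for small port numbers, and why the '\xfde8' attempt in the question reported 3 bytes. The variable names are illustrative.

```python
import struct

src_port, dst_port = 65000, 255

# '!' selects network (big-endian) byte order; each 'H' is an
# unsigned 16-bit integer, so every port always takes two bytes.
header = struct.pack('!HH', src_port, dst_port)
print(len(header), header)

# The receiver reverses the operation with the same format string.
print(struct.unpack('!HH', header))

# '\xfde8' is the escape \xfd followed by the characters 'e' and '8',
# which is three characters rather than two bytes.
print(len('\xfde8'))
```

Passing header to sendto() would therefore report 4 bytes sent, two per port.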

Reading LabVIEW TCP data (Flattened String / Data Cluster) in Python

I have a LabVIEW application that is flattening a cluster (array) of Doubles to a string, before transmitting over TCP/IP to my python application. It does this because TCP/IP will only transmit strings.
The problem is that python reads the string as a load of nonsense ASCII characters, and I can't seem to unscramble them back to the original array of doubles.
How do I interpret the string data that LabVIEW sends after flattening the data? My only hint of useful information after hours of googling was a PyPI entry called pyLFDS, but it has since been taken down.
The LabVIEW flattened data format is described in some detail here. That document doesn't explicitly describe how double-precision floats (DBL type) are represented, but a little more searching found this which clarifies that they are stored in IEEE 754 format.
However it would probably be simpler and more future proof to send your data in a standard text format such as XML or JSON, both of which are supported by built-in functions in LabVIEW and standard library modules in Python.
A further reason not to use LabVIEW flattened data for exchange with other programs, if you have the choice, is that the flattened string doesn't include the type descriptor you need to convert it back into the original data type - you need to know what type the data was in order to decode it.
I wanted to document the problem and solution so others can hopefully avoid the hours I have wasted looking for a solution on google.
When LabVIEW flattens data, in this case a cluster of doubles, it sends them simply as a concatenated string with each double represented by 8 bytes. This is interpreted by Python as 8 ASCII characters per double, which appears as nonsense in your console.
To get back to the transmitted doubles, you need to take each 8-byte section in turn and convert the ASCII characters to their ASCII codes, in Python's case using ord().
This will give you 8 bytes as decimal codes (e.g. 4.8 = [64 19 51 51 51 51 51 51]).
It turns out that LabVIEW does most things, including TCP/IP transmissions, big-endian. Unless you are working big-endian, you will probably need to change the order around. For example, the bytes above become [51 51 51 51 51 51 19 64]. I put each of my doubles into a list, so I was able to use list(reversed()) to change the endianness.
You can then convert this back to a double. Example python code:
import struct
b = bytearray([51,51,51,51,51,51,19,64])  # the number 4.8, bytes already reversed to little-endian
value = struct.unpack('<d', b)[0]  # unpack() returns a tuple, so take its first item
print(value)  # 4.8
This is probably obvious to more experienced programmers, but it had me flummoxed for days. I apologise for using Stack Overflow as the platform to share this by answering my own question, but hopefully this post helps the next person who is struggling.
EDIT: Note if you are using an earlier version than Python 2.7.5 then you might find struct.unpack() will fail. Using the example code above substituting the following code worked for me:
b = bytes(bytearray([51,51,51,51,51,51,19,64]))
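The manual byte reversal described above can be skipped by telling struct the byte order directly. This minimal Python 3 sketch uses the '>' prefix (big-endian) on the same example bytes; the second part, which unpacks several doubles at once, assumes the data is nothing but consecutive 8-byte doubles.

```python
import struct

# The 8 bytes as received from LabVIEW (big-endian) for the number 4.8.
raw = bytes([64, 19, 51, 51, 51, 51, 51, 51])

# '>d' means one big-endian double, so no reversal is needed.
value = struct.unpack('>d', raw)[0]
print(value)

# Several doubles in a row: repeat the 'd' code once per 8-byte chunk.
two = raw + struct.pack('>d', 1.5)
values = struct.unpack('>%dd' % (len(two) // 8), two)
print(values)
```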
This code works for me. The UDP server accepts a flattened array of doubles x and returns x+1 to port 6503. Modify the LabVIEW UDP client to your needs.
import struct
import socket
import numpy as np

def get_ip():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # doesn't even have to be reachable
        s.connect(('10.255.255.255', 1))
        IP = s.getsockname()[0]
    except:
        IP = '127.0.0.1'
    finally:
        s.close()
    return IP

#bind_ip = get_ip()
print("\n\n[*] Current ip is %s" % (get_ip()))
bind_ip = ''
bind_port = 6502
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind((bind_ip,bind_port))
print("[*] Ready to receive UDP on %s:%d" % (bind_ip,bind_port))
while True:
    data, address = server.recvfrom(1024)
    print('[*] Received %s bytes from %s:' % (len(data), address))
    arrLen = struct.unpack('>i', data[:4])[0]
    print('[*] Received array of %d doubles:' % (arrLen,))
    x = []
    elt = struct.iter_unpack('>d', data[4:])
    while True:
        try:
            x.append(next(elt)[0])
            print(x[-1])
        except StopIteration:
            break
    x = np.array(x)
    y = x+1 # np.sin(x)
    msg = data[:4]
    for item in y:
        msg += struct.pack('>d', item)
    print(msg)
    A = (address[0], 6503)
    server.sendto(msg, A)
    break
server.close()
print('[*] Server closed')
print('[*] Done')
LabVIEW UDP client: [image not available]
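The packing and unpacking done in the server above can be isolated into two small helpers that need neither NumPy nor a socket. This sketch (Python 3) assumes the same layout the server code reads: a 4-byte big-endian signed element count followed by that many big-endian doubles. The function names are hypothetical.

```python
import struct

def unflatten_doubles(data):
    """Decode a length-prefixed, big-endian array of doubles."""
    (count,) = struct.unpack('>i', data[:4])
    # Read exactly `count` doubles after the 4-byte length prefix.
    return list(struct.unpack('>%dd' % count, data[4:4 + 8 * count]))

def flatten_doubles(values):
    """Encode a list of floats in the same length-prefixed layout."""
    return struct.pack('>i%dd' % len(values), len(values), *values)

packet = flatten_doubles([4.8, 1.5, -2.0])
print(len(packet))
print(unflatten_doubles(packet))
```

Because the format string starts with '>', struct inserts no padding, so the encoded size is always 4 + 8 * count bytes.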
I understand that this does not solve your problem as you mentioned you didn't have the ability to modify the LabVIEW code. But, I was hoping to add some clarity on common ways string data is transmitted over TCP in LabVIEW.
The Endianness of the data string sent through the Write TCP can be controlled. I recommend using the Flatten To String Function as it gives you the ability to select which byte order you want to use when you flatten your data; big-endian (default if unwired), native (use the byte-order of the host machine), or little-endian.
Another common technique I've seen is using the Type Cast function. Doing this will convert the numeric to a big-endian string. This can of course be confusing when you read it on the other end of the network, as almost everything else is little-endian, so you'll need to do some byte-swapping.
In general, if you're not sure what the code looks like, assume that the data will be big-endian if it's coming from LabVIEW code.
The answer from nekomatic is a good one. Using a standard text format when available is always a good option.
