Python how to concatenate string to bytes with \x

I'm trying to add a string message to bytes that represent a signed digest of the message for tcp transmission. The client has the signed digest as bytes and the message as a string.
digest = b'd-\xfc*\x7f\xfc\xabr6S>\xa5\xe5\xff\xdd\x80o\xf2\x93\xa0\xdeR\x7f\x1e%W\x81Z\xf2\x06\x12(\x1c\xad"\x1a\x8aNFE\x8ba\x82\xfc\xe19\xe2\x80\x87\xa7\xcf\xbe\x88\xd3\x11}4\r\xc3\x94E\x11#\xbc\x8cF\xd4D+\xdb#e\xb5\x0cVC\x12\x04|J\x9ey\xb8\x88[#\x00ib\xae\x12\xb0\xca\x14X#\rl\xdf\x97\xf4rra\xf1\xa4\xc1\x07\xe2r\xf4\x8f]\xcb\x02\x95\x90z\xc8\x9d\xa6\xa7\x0b^\xc3t\xb8\x01\x89N\xa3\t,\x1c\x06ki\xdb\xdb\x9a6\xbd\xb3W\xfdQ\xeai1\xe2z\xe1Td\xd0\xa0\xc7N\xe4W\x8f\xa0\x8fz`6\x12<\xe9\xdd\xe6:\xfci\xae\x0e\xc3\xfeQ\xaa\xefFw{\x84Ly]\xfc\xe0\xf3G\xa8\xfeA\x9d\xb5+s\xc1\xdf\x98\xb7\xdb.tp+\xd1\xbe\xe1\x15\xba\xa3\xfb\xee\xcf\xf4\x1d\xec\x853Y\xc2\xec\x1cf\x1a5%\xb2!o\x88\x83\x14\x1d"2\xaa\xdc\x03\x97\xd2\xc7\xba\xe8\xe9\x81\xb9\xd0%\xdf\x98b\xf0'
message = "hello"
If I use message.encode() message is converted to bytes but it is b'hello' without the \x.
If I concatenate with digest + message the client now has this: md: b'd-\xfc*\x7f\xfc\xabr6S>\xa5\xe5\xff\xdd\x80o\xf2\x93\xa0\xdeR\x7f\x1e%W\x81Z\xf2\x06\x12(\x1c\xad"\x1a\x8aNFE\x8ba\x82\xfc\xe19\xe2\x80\x87\xa7\xcf\xbe\x88\xd3\x11}4\r\xc3\x94E\x11#\xbc\x8cF\xd4D+\xdb#e\xb5\x0cVC\x12\x04|J\x9ey\xb8\x88[#\x00ib\xae\x12\xb0\xca\x14X#\rl\xdf\x97\xf4rra\xf1\xa4\xc1\x07\xe2r\xf4\x8f]\xcb\x02\x95\x90z\xc8\x9d\xa6\xa7\x0b^\xc3t\xb8\x01\x89N\xa3\t,\x1c\x06ki\xdb\xdb\x9a6\xbd\xb3W\xfdQ\xeai1\xe2z\xe1Td\xd0\xa0\xc7N\xe4W\x8f\xa0\x8fz`6\x12<\xe9\xdd\xe6:\xfci\xae\x0e\xc3\xfeQ\xaa\xefFw{\x84Ly]\xfc\xe0\xf3G\xa8\xfeA\x9d\xb5+s\xc1\xdf\x98\xb7\xdb.tp+\xd1\xbe\xe1\x15\xba\xa3\xfb\xee\xcf\xf4\x1d\xec\x853Y\xc2\xec\x1cf\x1a5%\xb2!o\x88\x83\x14\x1d"2\xaa\xdc\x03\x97\xd2\xc7\xba\xe8\xe9\x81\xb9\xd0%\xdf\x98b\xf0hello'
But after being transmitted and unencrypted the encoded string message is no longer present. The server receives only: md: b'd-\xfc*\x7f\xfc\xabr6S>\xa5\xe5\xff\xdd\x80o\xf2\x93\xa0\xdeR\x7f\x1e%W\x81Z\xf2\x06\x12(\x1c\xad"\x1a\x8aNFE\x8ba\x82\xfc\xe19\xe2\x80\x87\xa7\xcf\xbe\x88\xd3\x11}4\r\xc3\x94E\x11#\xbc\x8cF\xd4D+\xdb#e\xb5\x0cVC\x12\x04|J\x9ey\xb8\x88[#\x00ib\xae\x12\xb0\xca\x14X#\rl\xdf\x97\xf4rra\xf1\xa4\xc1\x07\xe2r\xf4\x8f]\xcb\x02\x95\x90z\xc8\x9d\xa6\xa7\x0b^\xc3t\xb8\x01\x89N\xa3\t,\x1c\x06ki\xdb\xdb\x9a6\xbd\xb3W\xfdQ\xeai1\xe2z\xe1Td\xd0\xa0\xc7N\xe4W\x8f\xa0\x8fz`6\x12<\xe9\xdd\xe6:\xfci\xae\x0e\xc3\xfeQ\xaa\xefFw{\x84Ly]\xfc\xe0\xf3G\xa8\xfeA\x9d\xb5+s\xc1\xdf\x98\xb7\xdb.tp+\xd1\xbe\xe1\x15\xba\xa3\xfb\xee\xcf\xf4\x1d\xec\x853Y\xc2\xec\x1cf\x1a5%\xb2!o\x88\x83\x14\x1d"2\xaa\xdc\x03\x97\xd2\xc7\xba\xe8\xe9\x81\xb9\xd0%\xdf\x98b\xf0'
What is the issue here, and how can I fix it?

Python doesn't show the \x escape for bytes that are printable ASCII characters such as letters and digits. The \x escape is shown only for bytes that have no printable ASCII representation. When you call message.encode(), the string "hello" becomes the bytes b'hello'. Each of those bytes is still stored as a number (h is \x68, for example); Python just displays it as the matching ASCII character.
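As a quick check of that point, this small sketch (the variable names are only illustrative) shows that the escaped and the plain spellings describe the same bytes:

```python
message = "hello"
encoded = message.encode()  # UTF-8 by default

# The repr shows printable ASCII bytes as characters, but the values are identical.
same = encoded == b"\x68\x65\x6c\x6c\x6f"
print(same)           # True
print(encoded.hex())  # 68656c6c6f
print(list(encoded))  # [104, 101, 108, 108, 111]
```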
As for why the message is missing on the server, I'm guessing it has something to do with the way you are encrypting it. The digest variable is 256 bytes long. Depending on which encryption method you use and whether you padded the data correctly, the excess data that didn't fit into the block (in this case the bytes b'hello') may have been lost.
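One way to make the split explicit on the receiving side is to slice the payload at the digest length. This is only a sketch: it assumes the digest is always exactly 256 bytes and that the whole payload survives encryption and transmission. The names DIGEST_LEN, pack and unpack are hypothetical, and the digest used here is a stand-in, not real signature data.

```python
DIGEST_LEN = 256  # assumption: fixed-size signed digest, as measured above


def pack(digest: bytes, message: str) -> bytes:
    """Concatenate the digest and the UTF-8 encoded message."""
    if len(digest) != DIGEST_LEN:
        raise ValueError("unexpected digest length: %d" % len(digest))
    return digest + message.encode("utf-8")


def unpack(payload: bytes):
    """Split a payload back into (digest, message)."""
    if len(payload) < DIGEST_LEN:
        raise ValueError("payload is shorter than the digest")
    return payload[:DIGEST_LEN], payload[DIGEST_LEN:].decode("utf-8")


fake_digest = bytes(range(256))  # placeholder bytes for illustration only
payload = pack(fake_digest, "hello")
digest, message = unpack(payload)
print(len(payload), message)  # 261 hello
```

If the server receives exactly 256 bytes, the message was dropped before the split, which points at the encryption or transport step rather than at the concatenation.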

Related

Spoofing bytes of a UDP checksum over network

I'm trying to play with a security tool using scapy to spoof ASCII characters in a UDP checksum. I can do it, but only when I hardcode the bytes in Hex notation. But I can't convert the ASCII string word into binary notation. This works to send the bytes of "He" (first two chars of "Hello world"):
sr1(IP(dst=server)/UDP(dport=53, chksum=0x4865)/DNS(rd=1,qd=DNSQR(qname=query)),verbose=0)
But whenever I try to use a variable of test2 instead of 0x4865, the DNS packet is not transmitted over the network. This should create binary for this ASCII:
test2 = bin(int(binascii.hexlify('He'),16))
sr1(IP(dst=server)/UDP(dport=53, chksum=test2)/DNS(rd=1,qd=DNSQR(qname=query)),verbose=0)
When I print the test2 variable, it shows the correct binary representation.
How do I convert a string such as He so that it appears in the checksum notation accepted by scapy, i.e. 0x4865?
I was able to get this working by removing the bin() call, which returns a string rather than an integer. This works:
test2 = int(binascii.hexlify('He'),16)
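The difference between the two attempts can be seen without scapy. This sketch assumes Python 3 (hence the bytes literal b"He") and only illustrates the types involved; int.from_bytes is shown as an alternative from the standard library, not as something the original post used.

```python
import binascii

word = b"He"

as_int = int(binascii.hexlify(word), 16)  # an integer, same value as 0x4865
as_bin = bin(as_int)                      # a str starting with '0b', not a number

print(as_int == 0x4865)       # True
print(type(as_bin).__name__)  # str

# Equivalent conversion without going through a hex string:
alt = int.from_bytes(word, "big")
print(alt == as_int)          # True
```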

Discovering data type of incoming socket data in python

Several devices are sending socket data over TCP/IP to a socket server. Some of the devices send data as a binary-encoded hexadecimal string, others as an ASCII string.
For example:
If a device sends data as an ASCII string, the script begins processing immediately, without any conversion.
If a device sends a binary-encoded hex string, the script first has to convert it into a hex string with:
data = binascii.hexlify(data)
Two scripts are currently running, one per data type, and they differ only in that single line. I think this could be done in one script if the script were aware of the incoming data type. Is there a way to discover the type of the incoming socket data in Python?
If you can, you should make the sending devices signal what kind of data they are sending, e.g. by using different TCP ports or by prepending each message with an "h" for hex or an "a" for ASCII. You could even use an established protocol like XML-RPC.
Otherwise you can only be sure in some cases, as all hex-encoded strings are valid ASCII and some ASCII strings are valid hex, like "CAFE".
You can make sure you can decode a string as hex with
import string
def is_possibly_hex(s):
    return all(c in string.hexdigits for c in s)
or
import binascii
def is_possibly_hex(s):
    try:
        binascii.unhexlify(s)
    except binascii.Error:
        return False
    return True
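The tagging idea suggested above (prefixing each message with "h" or "a") could look roughly like this. It is a sketch under assumptions: the sending devices can be changed to add the tag, Python 3 bytes are used throughout, and the names TAG_BINARY, TAG_ASCII, tag_message and normalise are hypothetical.

```python
import binascii

TAG_BINARY = b"h"  # hypothetical tag: payload is raw binary and needs hexlify
TAG_ASCII = b"a"   # hypothetical tag: payload is already an ASCII string


def tag_message(payload: bytes, is_binary: bool) -> bytes:
    """Sender side: prepend a one-byte type tag."""
    return (TAG_BINARY if is_binary else TAG_ASCII) + payload


def normalise(framed: bytes) -> bytes:
    """Receiver side: strip the tag and convert only when needed."""
    tag, body = framed[:1], framed[1:]
    if tag == TAG_BINARY:
        return binascii.hexlify(body)
    if tag == TAG_ASCII:
        return body
    raise ValueError("unknown message tag: %r" % tag)


print(normalise(tag_message(b"\xca\xfe", True)))  # b'cafe'
print(normalise(tag_message(b"CAFE", False)))     # b'CAFE'
```

With an explicit tag the receiver never has to guess, which avoids the ambiguous case of ASCII payloads such as "CAFE" that also happen to be valid hex.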

Sending hexadecimal or ASCII value stored in a variable using pyserial

I was trying to send a byte containing hex value over serial port using pyserial. The hex value has to be in a variable (so that I can do some manipulations before sending). Sample code will explain my intent:
import serial
com=serial.Serial('COM1')
a_var=64
a_var=a_var+1
com.write(a_var) #This of course throws error
I want to receive 'A' or 0x41 on the other side. I could send hex using
com.write(b'\x41')
but not using a variable. Converting it to string or character or encoding the string did not help. I am using python 3.5.
Thanks
First, the name choice of your variable was not optimal: input is a built-in function, and you might shadow it.
There are many ways to put bytes into a variable:
to_send = b'A'
to_send = b'\x41'
to_send = bytes([65])
You see how to use an ASCII character, the escape sequence for hex numbers and the list of integers.
Now send via
com.write(to_send)
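To get from an integer variable to a single byte, as the question asks, the integer can be wrapped in a list first. The sketch below leaves out the serial port so it runs anywhere; the commented-out com.write line assumes the com object from the question.

```python
a_var = 64
a_var = a_var + 1          # numeric manipulation before sending

to_send = bytes([a_var])   # one byte with value 65
print(to_send)             # b'A'
print(to_send == b"\x41")  # True

# Pitfall: bytes(n) with a bare integer creates n zero bytes instead.
zeros = bytes(3)
print(zeros)               # b'\x00\x00\x00'

# com.write(to_send)       # with the serial port opened as in the question
```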
bytearray can be used to send bytes (as hex or ascii). They are mutable, hence numerical manipulations are possible. Any number of bytes can be sent using it.
import serial
com=serial.Serial('COM2')
elements=[65,67,69,71] #Create an array of your choice
a_var=bytearray(elements) #Create byte array of it
com.write(a_var[0:3]) #Write desired elements at serial port
a_var[0]=a_var[0]+1 #Do your mathematical manipulation
com.write(a_var[0:1]) #Write again as desired
com.close()
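The same manipulation can be tried without a serial port attached. This sketch only demonstrates bytearray behaviour; the variable names are illustrative.

```python
buf = bytearray([65, 67, 69, 71])

buf[0] += 1                # in-place arithmetic on a single element
first = bytes(buf[0:1])    # a slice is still a bytes-like object
print(first)               # b'B'

# Each element must stay in range(0, 256); anything else is rejected.
try:
    buf[0] = 256
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
print(overflow_rejected)   # True
```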

Issues with Bytes from a Microcontroller in Python

I am using Python to read micro controller values in a windows based program. The encodings / byte decodings and values have begun to confuse me. Here is my situation:
In the software, I am allowed to call a receive function once per byte received by the Python interpreter, once per line (not quite sure what that is) or once per message which I assume is the entire transmission from the micro controller.
I am struggling with the best way to decode these values. The microcontroller is putting out specific values that correlate to a protocol. For example, calling a function that is supposed to return the hex values:
F0, 79, (the phrase standard_firmata.pde) [then] F7
returns:
b'\xf0y\x02\x03S\x00t\x00a\x00n\x00d\x00a\x00r\x00d\x00F\x00i\x00r\x00m\x00a\x00t\x00a\x00.\x00i\x00n\x00o\x00\xf7'
When set to "once per message". This is what I want: I can see that the correct values are being sent, but there are too many \x00 values included (they appear after every byte, it seems). Additionally, the second byte is y when it is supposed to be 79. It seems like it printed its value in ASCII when all the others were in hex.
How can I ignore all these null characters and get everything in the right format? (I am fine with normal hex values.)
When Python represents a bytes value, it'll use the ASCII representation for anything that has a printable character. Thus the hex 0x79 byte is indeed represented by a y:
>>> b'\x79'
b'y'
Using ASCII characters makes the representation more readable, but doesn't affect the contents. You can use \x.. hex and ASCII notations interchangeably when creating bytes values.
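If an all-hex view is preferred for inspection, the bytes can be formatted explicitly. A small sketch using only the first four bytes of the message from the question (bytes.hex() needs Python 3.5 or later):

```python
data = b"\xf0y\x02\x03"

compact = data.hex()
spaced = " ".join("%02X" % b for b in data)

print(compact)  # f0790203
print(spaced)   # F0 79 02 03
```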
The data appears to encode a UTF-16 message, little endian:
>>> data = b'\xf0y\x02\x03S\x00t\x00a\x00n\x00d\x00a\x00r\x00d\x00F\x00i\x00r\x00m\x00a\x00t\x00a\x00.\x00i\x00n\x00o\x00\xf7'
>>> data[4:-1].decode('utf-16-le')
'StandardFirmata.ino'
UTF-16 uses 2 bytes per character for codepoints in the Basic Multilingual Plane, and for ASCII (and Latin-1) codepoints that means that every second byte is a null.
You can use simple comparisons to test for message types:
if data[:2] == b'\xf0\x79':
    assert data[-1] == 0xf7, "Message did not end with F7 closing byte"
    version = tuple(data[2:4])
    message = data[4:-1].decode('utf-16-le')
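Wrapped into a function, the checks above might look like this. It is a sketch that assumes Python 3 and the message layout described in this answer (two header bytes, two version bytes, UTF-16-LE text, one closing byte); the name parse_message is hypothetical.

```python
def parse_message(data: bytes):
    """Return (version, text) for a message shaped like the one above."""
    if data[:2] != b"\xf0\x79":
        raise ValueError("unexpected message header")
    if data[-1:] != b"\xf7":
        raise ValueError("message did not end with F7 closing byte")
    version = tuple(data[2:4])             # two small integers
    text = data[4:-1].decode("utf-16-le")  # drops the interleaved nulls
    return version, text


data = (b'\xf0y\x02\x03S\x00t\x00a\x00n\x00d\x00a\x00r\x00d\x00F\x00i\x00r\x00'
        b'm\x00a\x00t\x00a\x00.\x00i\x00n\x00o\x00\xf7')
result = parse_message(data)
print(result)  # ((2, 3), 'StandardFirmata.ino')
```

Raising ValueError instead of using assert keeps the check active even when Python runs with optimisations enabled.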

python b64decode incorrect padding

I'm sending a file over small UDP packets. (python 3)
On the server I divide the file into small pieces and do
packets.append(b64encode(smallPart))
on the other side I do exactly the opposite
packets.append(b64decode(peice))
However, I keep getting an Incorrect Padding exception (on all but one packet).
Is there a standard size for b64decode that I'm missing?
Base64 works by encoding every 3 bytes into 4 bytes. When decoding, it takes those 4 bytes and converts them back to 3 bytes. If there were fewer than 3 bytes remaining in the input, the output is still padded with '=' to make 4 bytes. If the input to b64decode is not a multiple of 4 bytes, you will get the exception.
The easiest solution for you will be to make sure each encoded packet you pass to b64decode is always a multiple of 4 bytes long.
Your description of what you are doing sounds OK. Choice of the input piece size affects only the efficiency. Padding bytes are minimised if the length of each input piece (except of course the last) is a multiple of 3.
You need to show us both your server code and your client code. Alternatively: on the server, log the input and the pieces transmitted. On the client, log the pieces received. Compare.
Curiosity: Why don't you just b64encode the whole string, split the encoded result however you like, transmit the pieces, at the client reassemble the pieces using b''.join(pieces) and b64decode that?
Further curiosity: I thought the contents of a UDP packet could be any old binary bunch of bytes; why are you doing base64 encoding at all?
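The encode-once, split, and rejoin approach suggested above could be sketched as follows. It assumes the pieces arrive complete and in order, which UDP itself does not guarantee; the function names and the piece size are hypothetical.

```python
import base64


def encode_and_split(data: bytes, piece_size: int):
    """Base64-encode the whole input, then cut the encoded text anywhere."""
    encoded = base64.b64encode(data)
    return [encoded[i:i + piece_size] for i in range(0, len(encoded), piece_size)]


def join_and_decode(pieces):
    """Reassemble the encoded pieces and decode once."""
    return base64.b64decode(b"".join(pieces))


original = bytes(range(256)) * 4
pieces = encode_and_split(original, 100)
round_trip_ok = join_and_decode(pieces) == original
print(round_trip_ok)  # True
```

Because decoding happens only after the pieces are joined, the individual piece length no longer has to be a multiple of 4.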
The length of any properly encoded base64 string should be divisible by 4.
Base64 encodes 3 bytes as 4, so if you start out with a length of string that's not a multiple of 3, the algorithm adds one or two = characters on the end of the encoded form, one for each byte shy of some multiple of 3 (see http://en.wikipedia.org/wiki/Base64#Padding).
The way the alignment comes out, the number of = characters also equals the number of characters shy of a multiple of 4 in the encoded form.
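The length rule can be checked directly, and missing trailing '=' characters can be restored before decoding. This is a sketch: restore_padding is a hypothetical helper, and it only helps when the padding was stripped from the end of a complete encoded string, not when a piece was cut in the middle of a 4-character group.

```python
import base64


def encoded_length(n: int) -> int:
    """Length of the Base64 encoding of n input bytes (with padding)."""
    return 4 * ((n + 2) // 3)


def restore_padding(s: bytes) -> bytes:
    """Append '=' until the length is a multiple of 4."""
    return s + b"=" * (-len(s) % 4)


lengths_match = all(
    len(base64.b64encode(b"x" * n)) == encoded_length(n) for n in range(20)
)
print(lengths_match)  # True

decoded = base64.b64decode(restore_padding(b"aGVsbG8"))
print(decoded)        # b'hello'
```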
I had been trying to decode an URL-safe base64 encoded string. Simply replacing "." with "=" did the trick for me.
s = s.replace('.', '=')
# then base64decode
