Python 3 Concat Single Byte with String Bytes - python

I need to concat a single byte with the bytes I get from a parameter string.
byte_command = 0x01
socket.send(byte_command + bytes(message, 'UTF-8'))
but I get this error:
socket.send(byte_command + bytes(message, 'UTF-8'))
TypeError: str() takes at most 1 argument (2 given)
I assume this happens because I am using the string concat operator - how do I resolve that?

The error message shows that you are running Python 2; the two-argument call bytes(message, 'UTF-8') is valid in Python 3. Assuming that message is a string:
Python 3 ([Python 3.Docs]: class bytes([source[, encoding[, errors]]])):
byte_command = b"\x01"
sock.send(byte_command + bytes(message, 'UTF-8'))
Python 2 (where bytes and str are the same):
byte_command = "\x01"
sock.send(byte_command + message)
I also renamed the socket to sock so it doesn't clash with the socket module itself.
As everyone suggested, it's recommended / common to do the transformation using message.encode("utf8") (in Python 3 the argument is not even necessary, as utf8 is the default encoding).
More on the differences (although the question is in a different area): [SO]: Passing utf-16 string to a Windows function (@CristiFati's answer).
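As a minimal Python 3 sketch of the approach above: a command held as an integer can be turned into a single byte with bytes([value]) before concatenating. The helper name build_packet is hypothetical and not part of the original code.

```python
# Hypothetical helper: prefix a one-byte command to a UTF-8 encoded message.
def build_packet(command, message):
    # bytes([n]) builds a bytes object of length 1 holding the value n (0-255).
    # Note that bytes(n) is different: it creates n zero bytes.
    return bytes([command]) + message.encode("utf-8")

packet = build_packet(0x01, "hello")
print(packet)  # b'\x01hello'
```

The result can be passed to sock.send() as a single bytes object.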

From that error message, it looks like you are using python2, not python3. In python2, bytes is just an alias for str, and str only takes one argument.
To make something that works in python2 and python3, use str.encode rather than bytes:
byte_command = b'\x01'
socket.send(byte_command + message.encode('UTF-8'))
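When writing the command byte as a literal, note that the escape \x01 and the text 0x01 are not the same thing. A quick check, run under Python 3 (the variable names are only for illustration):

```python
hex_text = b'0x01'      # four ASCII characters: '0', 'x', '0', '1'
single_byte = b'\x01'   # one byte with the value 1

print(len(hex_text))     # 4
print(len(single_byte))  # 1
print(list(hex_text))    # [48, 120, 48, 49]
print(list(single_byte)) # [1]
```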

Related

Converting PySerial Code from Python 3 to 2.7

I'm trying to port a program written in Python 3.5 to 2.7, but it seems the addition of bytes objects in Python 3 changes how PySerial is implemented. Unfortunately, I cannot find any documentation for PySerial 2.x, so I would greatly appreciate help converting this code from PySerial 3 to 2:
import serial
ser = serial.Serial('COM6', 9600)
ser.write(bytes(chr(0x30), 'UTF-8'))
dataIn = ser.read(size=4)
Since the bytes object is just an alias for the str type in Python 2.x, I get the following error:
TypeError: str() takes at most 1 argument (2 given)
Does PySerial's write() method use a bytearray object as the parameter or does it use a string with another parameter for the encoding?
What datatype does ser.read(size=4) return?
Or better yet if someone has a link to the documentation...
Well, I had to literally use a time machine to find PySerial 2's documentation, which no longer exists, but now I can answer my own question:
ser.write() accepts a str or, indeed, a bytearray in Python 2.6+. A bytearray is probably better so I can specify the encoding if I want, since a str's encoding seems to depend on system settings. So my code is now:
ser.write(bytearray([0x30]))
Also, ser.read returns a str (or a bytes in Py2.6+).
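To illustrate the point without a serial port, here is a small sketch showing that several spellings produce the same single byte 0x30. It does not call PySerial at all; the variable names are hypothetical.

```python
# Three ways to build the single byte 0x30 (ASCII '0') for a write() call.
from_list = bytearray([0x30])
from_literal = b"\x30"
from_text = chr(0x30).encode("ascii")

# All three hold the same one-byte payload.
print(from_list == from_literal == from_text)  # True
```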

Python: How to send a hexadecimal string through socket in Python3 without encoding it?

I had socket communication working in python2, and now I have to make it work in python3. I have tried str.encode() with many encodings, but the other side of the network can't recognize what I send. The only thing I know is that the python3 str type is encoded as Unicode utf-8 by default, and I'm pretty sure the critical question here is what the format of the python2 str type is. I have to send exactly the same thing as what was stored in the python2 str. The tricky thing is that a python3 socket only sends encoded bytes or other buffer-interface objects, rather than a str holding raw data as in Python2. An example follows:
In python2:
fulldata = 'AA060100B155'
datasplit = [fulldata[i: i+2] for i in range(0, len(fulldata), 2)]
senddata = ''
for item in datasplit:
    itemdec = chr(int(item, 16))
    senddata += itemdec
print(senddata)
# '\xaa\x06\x01\x00\xb1U', which is the data I need
In python3, it seems I can only send the encoded bytes using "senddata.encode()", but that is not the format I want. You can try:
print(senddata.encode('latin-1'))
#b'\xaa\x06\x01\x01\xb2U'
to see the difference between the two senddata values. Interestingly, the result is also wrong when encoding with utf-8.
The data stored in the Python3 str is what I need, so my question is: how do I send the data of that string without encoding it? Or how can I get the behavior of the Python2 str type in Python3?
Can anyone help me with this?
I performed socket communication in python2, it worked well and I have to make it work in python3 again. I have tried str.encode() stuff with many formats, but the other side of the network can't recognize what I send.
You have to make sure that whatever you send is decodable by the other side. The first step is to find out which encoding that network/file/socket is using. If you send your data encoded as UTF-8 and the client expects ASCII, this will work as long as every character is in the ASCII range, because UTF-8 encodes those characters identically. But if, say, cp500 is the encoding scheme of your client and you send the string encoded as UTF-8, this won't work. It's better to pass the name of your desired encoding explicitly to functions, because the default encoding of your platform may not be UTF-8. You can always check the default encoding by calling sys.getdefaultencoding().
The only thing I know is that the python3 str type is encoded as Unicode utf-8 in default, and I'm pretty sure the critical question in here is that what is the format of python2 str type. I have to send exactly the same thing as what was stored in python2 str. But the tricky thing is the socket of python3 only sends the encoded unicode bytes or other buffer interface, rather than the str type with the raw data in Python2
Yes, Python 3.X uses UTF-8 as the default for str.encode(), but defaults elsewhere (for example when opening files) can depend on the platform, so it's better to pass the name of the desired encoding explicitly. Note, though, that str in Python 3.X covers what unicode and str did for text in 2.X, whereas str in 2.X holds only 8-bit (1-byte, 0-255) characters.
On one hand, your problem seems to be with 3.X and its type distinction between str and bytes. APIs that expect bytes won't accept str in 3.X. This is unlike 2.X, where you can mix unicode and str freely. The distinction in 3.X makes sense: str represents decoded strings and is used for textual data, whereas bytes represents encoded strings as raw bytes with absolute byte values.
On the other hand, you have a problem choosing the right encoding in 3.X for the text you need to pass to the client. First, check which encoding your client uses. Second, encode the string with that same encoding so your client can decode it properly: str.encode('same-encoding-as-client').
Because passing your data as str in 2.X works, your client most likely uses an 8-bit encoding for characters; something like Latin-1 might be the encoding in use.
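To make the Latin-1 suggestion concrete, here is a small Python 3 sketch using the hex string from the question. Latin-1 maps each code point 0-255 to exactly one byte of the same value, while UTF-8 uses two bytes for anything above 0x7F. The variable names as_latin1 and as_utf8 are only for illustration.

```python
fulldata = "AA060100B155"

# Rebuild the Python 2 style string: one character per byte value.
senddata = "".join(
    chr(int(fulldata[i:i + 2], 16)) for i in range(0, len(fulldata), 2)
)

as_latin1 = senddata.encode("latin-1")  # one byte per character
as_utf8 = senddata.encode("utf-8")      # characters above 0x7F become two bytes

print(as_latin1)  # b'\xaa\x06\x01\x00\xb1U'
print(as_utf8)    # b'\xc2\xaa\x06\x01\x00\xc2\xb1U'
```

The extra \xc2 bytes in the UTF-8 result are why a receiver expecting raw byte values would reject it.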
You can convert the whole string to an integer, then use the integer method to_bytes to convert it into a bytes object:
fulldata = 'AA060100B155'
senddata = int(fulldata, 16).to_bytes(len(fulldata)//2, byteorder='big')
print(senddata)
# b'\xaa\x06\x01\x00\xb1U'
The first parameter of to_bytes is the number of bytes; the second is the byte order, which is required before Python 3.11 (later versions default to 'big').
See int.to_bytes in the official documentation for reference.
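One detail worth checking with this approach is leading zero bytes: the integer itself drops them, so the explicit length is what restores them. A short Python 3 sketch, using a made-up hex string for illustration:

```python
fulldata = "00FF10"                 # hypothetical input with a leading zero byte
n_bytes = len(fulldata) // 2        # two hex digits per byte

senddata = int(fulldata, 16).to_bytes(n_bytes, byteorder="big")
print(senddata)                     # b'\x00\xff\x10'

# Round trip back to hex text to confirm nothing was lost.
print(senddata.hex().upper())       # 00FF10
```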
There are various ways to do this. Here's one that works in both Python 2 and Python 3.
from binascii import unhexlify
fulldata = 'AA060100B155'
senddata = unhexlify(fulldata)
print(repr(senddata))
Python 2 output
'\xaa\x06\x01\x00\xb1U'
Python 3 output
b'\xaa\x06\x01\x00\xb1U'
The following is Python 2/3 compatible. The unhexlify function converts hexadecimal notation to bytes. Use a byte string and you don't have to deal with Unicode strings. Python 2 is byte strings by default, but recognizes the b'' syntax that Python 3 requires to use a byte string.
from binascii import unhexlify
fulldata = b'AA060100B155'
print(repr(unhexlify(fulldata)))
Python 2 output:
'\xaa\x06\x01\x00\xb1U'
Python 3 output:
b'\xaa\x06\x01\x00\xb1U'
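If the code only needs to run on Python 3, the built-in bytes.fromhex class method is another option that avoids the import. This is an alternative to the answers above rather than something they propose:

```python
fulldata = "AA060100B155"

# bytes.fromhex takes a str of hex digits (Python 3 only).
senddata = bytes.fromhex(fulldata)
print(senddata)  # b'\xaa\x06\x01\x00\xb1U'
```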

Working with strings seems more cumbersome than it needs to be in Python 3.x

I have a function that takes in a string, sends it via a socket, and prints it to the console. Sending strings to this function yields some warnings that turn into other warnings when attempting to fix them.
Function:
def log(socket, sock_message):
    sock_message = sock_message.encode()
    socket.send(sock_message)
    print(sock_message.decode())
I'm attempting to call my function this way:
log(conn, "BATT " + str(random.randint(1, 100)))
And also, for simplicity:
log(conn, "SIG: 100%")
With both of the log calls, I get Type 'str' doesn't have expected attribute 'decode'. So instead, I saw you could pass a string as an array of bytes with bytes("my string", 'utf-8') but then I get the warning Type 'str' doesn't have expected attribute 'encode'.
I'm 100% sure I'm just missing some key bit of information on how to pass strings around in python, so what's the generally accepted way to accomplish this?
EDIT:
As explained below, an str can't have both decode and encode and I'm confusing my IDE by doing both on the same variable. I fixed it by maintaining a separate variable for the bytes version, and this fixes the issue.
def log(sock, msg):
    sock_message = msg.encode()
    sock.send(sock_message)
    print(msg)
In Python 2 you could be very sloppy (and sometimes get away with it) when handling characters (strings) and handling bytes. Python 3 fixes this by making them two separate types: str and bytes.
You encode to convert from str to bytes. Many characters (in particular ones not in English / US-ASCII) require two or more bytes to represent them (in many encodings).
You decode to convert from bytes to str.
Thus you can't decode a str. You need to encode it to print it or to send it anywhere that requires bytes (files, sockets, etc.). You also need to use the correct encoding so that the receiver of the bytes can correctly decode it and receive the correct characters. For some US-ASCII is sufficient. Many prefer using UTF-8, in part because all the characters that can be handled by US-ASCII are the same in UTF-8 but UTF-8 can handle (other) Unicode characters.
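A short Python 3 round trip may help here. It assumes UTF-8 on both sides and uses a sample word with one non-ASCII character:

```python
text = "caf\u00e9"            # four characters; the last is e with an acute accent
data = text.encode("utf-8")   # str -> bytes

print(len(text), len(data))   # 4 5
print(data)                   # b'caf\xc3\xa9'

# bytes -> str, using the same encoding the sender used.
print(data.decode("utf-8") == text)  # True
```

The accented character takes two bytes in UTF-8, which is why the byte count is larger than the character count.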
The socket.send description indicates that it takes bytes. Try encoding your string to bytes as part of your log function.
def log(socket, sock_message):
    sock_bytes = bytes(sock_message, 'UTF-8')
    socket.send(sock_bytes)
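To try the function without a network connection, a stand-in object can record what would be sent. The FakeSocket class below is hypothetical and only mimics the bytes-only behaviour of send; this version of log also prints the original str, as the asker wanted.

```python
class FakeSocket:
    """Hypothetical stand-in for a connected socket; records what is sent."""

    def __init__(self):
        self.sent = []

    def send(self, data):
        # Like a real Python 3 socket, refuse str.
        if not isinstance(data, bytes):
            raise TypeError("a bytes-like object is required")
        self.sent.append(data)
        return len(data)


def log(sock, message):
    payload = message.encode("utf-8")  # str -> bytes for the wire
    sock.send(payload)
    print(message)                     # print the original str, no decode needed


conn = FakeSocket()
log(conn, "SIG: 100%")
```

Keeping the str and the bytes in separate names also avoids the IDE warnings described in the question.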

Strings operation issue while switching from python 2.x to python 3

I am facing some problem with strings switching from python 2.x to python 3
Issue 1:
from ctypes import*
charBuffer=create_string_buffer(1000)
var = charBuffer.value # var contains like this "abc:def:ghi:1234"
a,b,c,d= var.split(':')
It works fine in python 2.x but not in 3.x it is throwing some errors like this
a,b,c,d= var.split(':')
TypeError: 'str' does not support the buffer interface
I found some related links after doing some research on Stack Overflow.
If I print, desired output would be
a= abc
b =def
c=ghi
d=1234
Issue2:
from ctypes import*
cdll = "Windll"
var = 0x1fffffffffffffffffffffff # I want to send this long variable to character pointer which is in cdll
charBuf =create_string_buffer(var.to_bytes(32,'little'))
cdll.createBuff (charBuf )
cdll function
#include <stdio.h>
int createBuff(char *charBuff) {
    printf("%s\n", charBuff);
    return 0;
}
I want to send this long variable to character pointer which is in cdll, since its a character pointer its throwing errors.
Need your valuable inputs on how could I achieve this. Thanks in advance
In Python 3.x, '.value' on the return value of create_string_buffer() returns a byte string.
In your example you are trying to split the byte string using a Unicode string (which is the normal string in Python 3.x). This is what is causing your issue.
You would need to either split with a byte string. Example -
a,b,c,d = var.split(b':')
Or you can decode the byte string to a Unicode string using the '.decode()' method on it.
Example -
var = var.decode('<encoding>')
Split using b":" and you will be fine in both versions of python.
In py2 str is a bytestring, in py3 str is a unicode object. The object returned by the ctypes string buffer is a bytestring (str on py2 and bytes on py3). By writing the string literal as b"... you force it to be a bytestring in both version of python.
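Both options can be tried with a buffer that is pre-filled from Python instead of by a C library. This is a sketch under that assumption; the ASCII decoding is also an assumption about the buffer's contents.

```python
from ctypes import create_string_buffer

# Pre-fill a 1000-byte buffer; a C function would normally write into it.
char_buffer = create_string_buffer(b"abc:def:ghi:1234", 1000)

raw = char_buffer.value              # bytes up to the first NUL terminator
a, b, c, d = raw.split(b":")         # option 1: split bytes with a bytes separator
print(a, b, c, d)                    # b'abc' b'def' b'ghi' b'1234'

text_parts = raw.decode("ascii").split(":")  # option 2: decode first, then split
print(text_parts)                    # ['abc', 'def', 'ghi', '1234']
```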

Python string argument without an encoding

I am trying to run this piece of code, and it keeps giving an error saying "String argument without an encoding"
ota_packet = ota_packet.encode('utf-8') + bytearray(content[current_pos:(final_pos)]) + '\0'.encode('utf-8')
Any help?
You are passing in a string object to a bytearray():
bytearray(content[current_pos:(final_pos)])
You'll need to supply an encoding argument (second argument) so that it can be encoded to bytes.
For example, you could encode it to UTF-8:
bytearray(content[current_pos:(final_pos)], 'utf8')
From the bytearray() documentation:
The optional source parameter can be used to initialize the array in a few different ways:
If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().
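The behaviour described in the documentation can be reproduced in a few lines of Python 3. The content value and the b"hdr" prefix below are made up for illustration and stand in for the asker's data:

```python
content = "firmware-chunk"   # hypothetical str payload

try:
    bytearray(content)       # a str with no encoding argument
except TypeError as exc:
    error_text = str(exc)
    print(error_text)        # the "without an encoding" error from the question

# Supplying the encoding makes the conversion explicit.
encoded = bytearray(content, "utf-8")
packet = b"hdr" + encoded + b"\0"
print(packet)                # b'hdrfirmware-chunk\x00'
```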
import base64
byteObject = b'\x18,\xa3\xf0A\x93*<bAd\x15K.A\xba'
print(byteObject)
print('-----------asbytearray----------')
print('-------As a string------------------')
o = base64.b64encode(bytes(str(byteObject), 'utf-8'))
print(o.decode("utf-8"))
print('--------Nonce as a string------------------')
