Converting PySerial Code from Python 3 to 2.7 - python

I'm trying to port a program written in Python 3.5 to 2.7, but it seems the addition of bytes objects in Python 3 changes how PySerial is implemented. Unfortunately, I cannot find any documentation for PySerial 2.x, so I would greatly appreciate help converting this code from PySerial 3 to 2:
import serial
ser = serial.Serial('COM6', 9600)
ser.write(bytes(chr(0x30), 'UTF-8'))
dataIn = ser.read(size=4)
Since the bytes object is just an alias for the str type in Python 2.x, I get the following error:
TypeError: str() takes at most 1 argument (2 given)
Does PySerial's write() method use a bytearray object as the parameter or does it use a string with another parameter for the encoding?
What datatype does ser.read(size=4) return?
Or better yet if someone has a link to the documentation...

Well, I had to literally use a time machine to find PySerial 2's documentation, which no longer exists, but now I can answer my own question:
ser.write() accepts a str or, indeed, a bytearray in Python 2.6+. A bytearray is probably better so I can specify the encoding if I want, since a str's encoding seems to depend on system settings. So my code is now:
ser.write(bytearray([0x30]))
Also, ser.read returns a str (or a bytes in Py2.6+).
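A minimal sketch of the version-independent form, assuming a hypothetical helper `send_byte` and using an in-memory `io.BytesIO` as a stand-in for the serial port (any object with a `write()` method); with real hardware you would pass the `serial.Serial` object instead:

```python
import io


def send_byte(port, value):
    """Write a single byte with the given integer value (0-255) to port.

    bytearray([value]) builds the same one-byte payload on Python 2.6+
    and Python 3, with no text encoding involved.
    """
    payload = bytearray([value])
    port.write(payload)
    return payload


# Stand-in for serial.Serial('COM6', 9600); an assumption for illustration only.
fake_port = io.BytesIO()
send_byte(fake_port, 0x30)
print(repr(fake_port.getvalue()))
```

Because the payload is built from an integer, the question of which text encoding applies never comes up.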

Related

Python: How to send a hexdecimal string through socket in Python3 without encoding it?

I had socket communication working in Python 2, and now I have to make it work in Python 3. I have tried str.encode() with many encodings, but the other side of the network can't recognize what I send. The only thing I know is that the Python 3 str type is encoded as UTF-8 by default, and I'm fairly sure the critical question is what the format of the Python 2 str type is. I have to send exactly the same thing as what was stored in the Python 2 str. The tricky part is that a Python 3 socket only sends encoded bytes or other objects supporting the buffer interface, rather than a str holding raw data as in Python 2. An example follows:
In python2:
fulldata = 'AA060100B155'
# Split the hex string into two-character pairs, one pair per byte
datasplit = [fulldata[i:i+2] for i in range(0, len(fulldata), 2)]
senddata = ''
for item in datasplit:
    itemdec = chr(int(item, 16))
    senddata += itemdec
print(senddata)
# '\xaa\x06\x01\x00\xb1U', which is the data I need
In Python 3, it seems the socket can only send encoded bytes, produced with "senddata.encode()", but that is not the format I want. You can try:
print(senddata.encode('latin-1'))
#b'\xaa\x06\x01\x01\xb2U'
to see the difference between the two senddata values. Interestingly, the result is also wrong when encoding with UTF-8.
The data stored in the Python 3 str is what I need; my question is how to send that string's data without encoding it, or how to get the equivalent of the Python 2 str type in Python 3.
Can anyone help me with this?
I had socket communication working in Python 2, and now I have to make it work in Python 3. I have tried str.encode() with many encodings, but the other side of the network can't recognize what I send.
You have to make sure that whatever you send is decodable by the other side. The first step is to find out which encoding the network peer, file, or socket expects. If you send UTF-8 and the client expects ASCII, this works only as long as the text is pure ASCII, because the two encodings agree on those characters. But if the client expects, say, cp500 and you send UTF-8, it won't work. It's better to pass the name of the desired encoding explicitly, because the default encoding of your platform is not necessarily UTF-8. You can check the default encoding by calling sys.getdefaultencoding().
The only thing I know is that the Python 3 str type is encoded as UTF-8 by default, and I'm fairly sure the critical question is what the format of the Python 2 str type is. I have to send exactly the same thing as what was stored in the Python 2 str. The tricky part is that a Python 3 socket only sends encoded bytes or other objects supporting the buffer interface, rather than a str holding raw data as in Python 2.
Yes, str.encode() in Python 3.X uses UTF-8 by default, but defaults elsewhere (for example when opening files) depend on the platform, so it's better to pass the name of the desired encoding explicitly. Note that str in Python 3.X corresponds to unicode in 2.X, whereas str in 2.X is a byte string whose items are 8-bit (0-255) values, like bytes in 3.X.
On one hand, your problem is with 3.X's type distinction between str and bytes. APIs that expect bytes won't accept str in 3.X. This is unlike 2.X, where you can mix unicode and str freely as long as the str holds only ASCII. The distinction makes sense: str represents decoded text, whereas bytes represents encoded data as raw bytes with absolute byte values.
On the other hand, you need to choose the right encoding for the text you pass to the client in 3.X. First check which encoding your client uses. Second, encode the string with that encoding so the client can decode it properly: str.encode('same-encoding-as-client').
Because passing your data as str in 2.X works, your client most likely uses an 8-bit encoding for characters, something like Latin-1.
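A small Python 3 sketch of why an 8-bit encoding such as Latin-1 preserves the byte values while UTF-8 does not. The variable names are illustrative; the hex payload is the one from the question:

```python
fulldata = 'AA060100B155'

# Build a text string whose code points are the wanted byte values (0-255).
text = ''.join(chr(int(fulldata[i:i + 2], 16))
               for i in range(0, len(fulldata), 2))

# Latin-1 maps code points 0-255 to the same byte values, one byte each.
latin1_bytes = text.encode('latin-1')

# UTF-8 uses two bytes for every code point above 0x7F, so the payload grows.
utf8_bytes = text.encode('utf-8')

print(latin1_bytes)
print(utf8_bytes)
```

The Latin-1 result has the same length as the text, whereas the UTF-8 result gains one extra byte for each of the two characters above 0x7F.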
You can convert the whole string to an integer, then use the integer method to_bytes to convert it into a bytes object:
fulldata = 'AA060100B155'
senddata = int(fulldata, 16).to_bytes(len(fulldata)//2, byteorder='big')
print(senddata)
# b'\xaa\x06\x01\x00\xb1U'
The first parameter of to_bytes is the number of bytes, the second (required) is the byteorder.
See int.to_bytes in the official documentation for reference.
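For comparison, Python 3 also has a built-in `bytes.fromhex` class method that does the conversion directly. A short sketch checking that both routes agree:

```python
fulldata = 'AA060100B155'

# Route 1: via an integer, as in the answer above.
via_int = int(fulldata, 16).to_bytes(len(fulldata) // 2, byteorder='big')

# Route 2: bytes.fromhex parses pairs of hex digits directly (Python 3 only).
via_fromhex = bytes.fromhex(fulldata)

print(via_int == via_fromhex)
print(via_fromhex)
```

Passing the explicit length to `to_bytes` matters when the hex string starts with `00`, since the integer alone does not remember leading zero bytes.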
There are various ways to do this. Here's one that works in both Python 2 and Python 3.
from binascii import unhexlify
fulldata = 'AA060100B155'
senddata = unhexlify(fulldata)
print(repr(senddata))
Python 2 output
'\xaa\x06\x01\x00\xb1U'
Python 3 output
b'\xaa\x06\x01\x00\xb1U'
The following is Python 2/3 compatible. The unhexlify function converts hexadecimal notation to bytes. Use a byte string and you don't have to deal with Unicode strings. Python 2 is byte strings by default, but recognizes the b'' syntax that Python 3 requires to use a byte string.
from binascii import unhexlify
fulldata = b'AA060100B155'
print(repr(unhexlify(fulldata)))
Python 2 output:
'\xaa\x06\x01\x00\xb1U'
Python 3 output:
b'\xaa\x06\x01\x00\xb1U'
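To tie this back to the socket in the question, here is a rough sketch that sends the unhexlified payload and reads it back. `socket.socketpair()` is used only as an in-process stand-in for the real network peer, which the question does not show:

```python
import socket
from binascii import unhexlify

payload = unhexlify(b'AA060100B155')

# Two connected sockets in one process; a stand-in for the real connection.
sender, receiver = socket.socketpair()
try:
    # sendall() takes a bytes object as-is; no .encode() step is involved.
    sender.sendall(payload)
    received = receiver.recv(1024)
finally:
    sender.close()
    receiver.close()

print(repr(received))
```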

how to convert Python 2 unicode() function into correct Python 3.x syntax

I enabled the compatibility check in my Python IDE and now I realize that the inherited Python 2.7 code has a lot of calls to unicode() which are not allowed in Python 3.x.
I looked at the docs of Python2 and found no hint how to upgrade:
I don't want to switch to Python3 now, but maybe in the future.
The code contains about 500 calls to unicode()
How to proceed?
Update
The comment of user vaultah to read the pyporting guide has received several upvotes.
My current solution is this (thanks to Peter Brittain):
from builtins import str
... I could not find this hint in the pyporting docs.....
As has already been pointed out in the comments, there is already advice on porting from 2 to 3.
Having recently had to port some of my own code from 2 to 3 and maintain compatibility for each for now, I wholeheartedly recommend using python-future, which provides a great tool to help update your code (futurize) as well as clear guidance for how to write cross-compatible code.
In your specific case, I would simply convert all calls to unicode to use str and then import str from builtins. Any IDE worth its salt these days will do that global search and replace in one operation.
Of course, that's the sort of thing futurize should catch too, if you just want to use automatic conversion (and to look for other potential issues in your code).
You can test whether there is such a function as unicode() in the version of Python that you're running. If not, you can create a unicode() alias for the str() function, which does in Python 3 what unicode() did in Python 2, as all strings are unicode in Python 3.
# Python 3 compatibility hack
try:
    unicode('')
except NameError:
    unicode = str
Note that a more complete port is probably a better idea; see the porting guide for details.
Short answer: Replace all unicode calls with str calls.
Long answer: In Python 3, the separate unicode type is gone because every str is already Unicode. The following should work if you are only using Python 3:
unicode = str
# the rest of your code goes here
If you need to support both Python 2 and Python 3, use this instead:
import sys
if sys.version_info.major == 3:
    unicode = str
# the rest of your code goes here
The other way: run this in the command line
$ 2to3 package -w
First, as a strategy, I would take a small part of your program and try to port it. The number of unicode calls you describe suggests to me that your application cares about string representations more than most, and each use case is often different.
The important consideration is that all strings are unicode in Python 3. If you are using the str type to store "bytes" (for example, if they are read from a file), then you should be aware that those will not be bytes in Python3 but will be unicode characters to begin with.
Let's look at a few cases.
First, if you do not have any non-ASCII characters at all and really are not using the Unicode character set, it is easy. Chances are you can simply change the unicode() function to str(). That will assure that any object passed as an argument is properly converted. However, it is wishful thinking to assume it's that easy.
Most likely, you'll need to look at the argument to unicode() to see what it is, and determine how to treat it.
For example, if you are reading UTF-8 characters from a file in Python 2 and converting them to Unicode, your code would look like this (the encoding has to be passed, because unicode() assumes ASCII otherwise):
data = open('somefile', 'r').read()
udata = unicode(data, 'UTF-8')
However, in Python3, read() returns Unicode data to begin with, and the unicode decoding must be specified when opening the file:
udata = open('somefile', 'r', encoding='UTF-8').read()
As you can see, transforming unicode() simply when porting may depend heavily on how and why the application is doing Unicode conversions, where the data has come from, and where it is going to.
Python3 brings greater clarity to string representations, which is welcome, but can make porting daunting. For example, Python3 has a proper bytes type, and you convert byte-data to unicode like this:
udata = bytedata.decode('UTF-8')
or convert Unicode data to byte form using the opposite transform:
bytedata = udata.encode('UTF-8')
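The two transforms above can be wrapped in small helpers so that call sites which used to call unicode() state their intent. This is only a sketch for Python 3; the names `to_text` and `to_bytes` are made up for illustration:

```python
def to_text(value, encoding='UTF-8'):
    """Return value as text, decoding it first if it is a bytes object."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='UTF-8'):
    """Return value as bytes, encoding it first if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


bytedata = to_bytes(u'caf\xe9')
udata = to_text(bytedata)
print(repr(bytedata))
print(repr(udata))
```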
I hope this at least helps determine a strategy.
You can use the six library, which provides text_type (an alias for unicode in Python 2 and str in Python 3):
from six import text_type

binascii.unhexlify working differently in Python 3.2 and Python3.4?

I used to work on Linux Mint, and the latest version of Python 3 embedded in it is Python 3.4. My program takes a hex string as input, decodes it and creates a bytearray so I can decode several pieces of information using struct.unpack. For example:
import binascii
hex_str = "00000E0C180E180FEABF070030313564336332363338303431653039004A62004A62006A62406A622E636F6D00"
s = binascii.unhexlify(hex_str)
print(s) # Would print b'\x00\x00\x0e\x0c\x18\x0e\x18\x0f\xea\xbf\x07\x00015d3c2638041e09\x00Jb\x00Jb\x00jb@jb.com\x00'
data = bytearray(s)
date_data = data[:9]
form_date = get_date(date_data) # Get the date using a bunch of struct.unpack
print(form_date) # Would print '2014-12-24 14:24:15'
Last week my computer crashed, so I had to build a new machine. I decided to give a try to Debian Wheezy. However, I discovered that the only version of Python is Python 2.7. I installed Python 3 using apt-get, but I noticed that the version installed is only Python 3.2. When I run the exact same code as above, I get a TypeError on the binascii.unhexlify line:
hex_str = "00000E0C180E180FEABF070030313564336332363338303431653039004A62004A62006A62406A622E636F6D00"
s = binascii.unhexlify(hex_str)
# TypeError: 'NavigableString' does not support the buffer interface
I don't understand this error, what does it mean?
I checked on Google but couldn't find anything: have there been any changes on binascii.unhexlify between the two versions? Do I have to change something in 3.2?
I really don't see how to solve this... Maybe there is a better way to achieve that?
Thanks.
PS: I could go back to Linux Mint, or install Python 3.4 on Debian, but I think my production server is a fresh install of Debian, so with Python 3.2... so I'd better target that version (and I am glad I discovered it now!).
Yes, there was a change in behavior between versions. From the binascii module documentation:
Note: a2b_* functions accept Unicode strings containing only ASCII characters. Other functions only accept bytes-like objects (such as bytes, bytearray and other objects that support the buffer protocol).
Changed in version 3.3: ASCII-only unicode strings are now accepted by the a2b_* functions.
So if you want to target Python <3.3, you need to pass in either bytes or bytearray objects instead of strings.
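A minimal sketch of that fix, using a shortened hex string for illustration: encoding the text to ASCII first yields a bytes object, which unhexlify accepts on Python 3 versions before and after 3.3.

```python
import binascii

hex_str = "00000E0C180E180FEABF07"

# str -> bytes first, so the a2b_* function receives a bytes-like object.
s = binascii.unhexlify(hex_str.encode('ascii'))
data = bytearray(s)
print(data)
```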

How can I read a byte array from a socket in Python

I am using Bluetooth to send a 16-byte byte array to a Python server. Basically, what I would like to achieve is to read the byte array as it is. How can I do that in Python?
What I am doing right now is reading a string since that is the only way I know how I can read data from a socket. This is my code from the socket in python
data = client_sock.recv(1024)
Where data is the string. Any ideas?
You're already doing exactly what you asked.
data is the bytes received from the socket, as-is.
In Python 3.x, it's a bytes object, which is just an immutable version of bytearray. In Python 2.x, it's a str object, since str and bytes are the same type. But either way, that type is just a string of bytes.
If you want to access those bytes as numbers rather than characters: In Python 3.x, just indexing or iterating the bytes will do that, but in Python 2.x, you have to call ord on each character. That's easy.
Or, in both versions, you can just call data = bytearray(data), which makes a mutable bytearray copy of the data, which gives you numbers rather than characters when you index or iterate it.
So, for example, let's say we want to write the decimal value of each byte on a separate line to a text file (a silly thing to do, but it demonstrates the ideas) in Python 2.7:
data = client_sock.recv(1024)
with open('textfile.txt', 'a') as f:
    for ch in data:
        f.write('{}\n'.format(ord(ch)))
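The bytearray route mentioned above behaves the same on Python 2 and Python 3. A short sketch, with a literal standing in for the result of `client_sock.recv(1024)`:

```python
data = b'\x01\x02\xff'  # stand-in for client_sock.recv(1024)

# Iterating a bytearray yields integers on both Python 2 and Python 3.
values = list(bytearray(data))
print(values)
```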
What you want is the struct module, specifically struct.unpack().
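A sketch of what that could look like for a 16-byte message. The layout (four little-endian unsigned 32-bit integers) is purely an assumption, since the question does not say what the 16 bytes contain:

```python
import struct

# Stand-in for the 16 bytes returned by client_sock.recv(); values 0..15.
data = bytes(bytearray(range(16)))

# '<4I' = little-endian, four unsigned 32-bit integers (assumed layout).
fields = struct.unpack('<4I', data)
print(fields)
```

The format string must describe exactly as many bytes as the buffer holds, otherwise struct.unpack raises struct.error.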

TypeError connecting to tweetstream -- Python 3.2

Apologies, sure this is simple:
New to Python and to Python interfaces with Twitter's streaming API, trying to use tweetstream on Python 3.2 to do so.
import tweetstream
stream = tweetstream.FilterStream(username = "myusername", password = "mypassword", track = bytes("oprah", encoding = "utf-8"))
for tweet in stream:
    print(tweet)
Throws:
TypeError: sequence item 0: expected str instance, int found
I had encoded the 'track' argument because earlier attempts to pass a string threw,
TypeError: POST data should be bytes or an iterable of bytes. It cannot be str.
Thanks.
Your problem is you are using Python3 with a Python2 package.
See here for information on Python3 bytes:
http://docs.python.org/release/3.0.1/library/functions.html#bytes
Also this:
http://docs.python.org/whatsnew/2.6.html#pep-3112-byte-literals
In python3, this will send an array of bytes. However, _get_post_data function is expecting a string. The way strings are handled in python3 and python2 is totally different and the source of much frustration for those wishing to port to Python3.
Unless this package makes this compatible with Python3, you will need to use Python2. You could do it yourself of course, but considering any other package you will use will also be limited to using python2, I would recommend going that way.
Basically, in Python 2.6/2.7, if you do
b = bytes('a')
print type(b)
you will get <type 'str'>.
In Python 3, bytes() needs an encoding when given a str, so if you do
b = bytes('a', 'utf-8')
print(type(b))
you will get <class 'bytes'>.
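tweetstream's source is not shown here, so the following is only a guess at the mechanism: in Python 3, code that joins the items of a bytes object as if they were characters fails with the same message as in the question, because iterating bytes yields integers.

```python
track = bytes("oprah", encoding="utf-8")

# Iterating a bytes object in Python 3 yields integers, not 1-character strings.
print(list(track))

try:
    # str.join expects str items, so the integer items trigger a TypeError.
    ",".join(track)
except TypeError as exc:
    message = str(exc)
    print(message)
```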
