Convert bytes array to int python wrong result - python

I know this should be easy, but I just can't get the syntax right for python.
My int is not converted correctly. This is the output of my 2 print statements. My output should be 9718 instead of 959918392.
bytearray(b'9718')
959918392
This is my conversion. I don't understand what I am doing wrong.
print(size)
print(int.from_bytes(size, byteorder='big'))

What you tried assumes the number is directly encoded as binary bytes. You actually want to parse it from ASCII text, which you can do like this:
int(b'9718'.decode('ascii'))
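To see why the original call goes wrong: the four bytes in b'9718' are the ASCII digit codes 0x39 0x37 0x31 0x38, and int.from_bytes reads them as one big-endian binary integer, 0x39373138 == 959918392. A minimal sketch contrasting the two interpretations:

```python
size = bytearray(b'9718')

# Treating the ASCII digit bytes as a raw big-endian integer gives
# 0x39373138, which is 959918392 -- the "wrong" number from the question:
print(int.from_bytes(size, byteorder='big'))  # 959918392

# Decoding the ASCII text first, then parsing it, gives the intended value:
print(int(size.decode('ascii')))  # 9718
```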

Related

How does ord() convert bytes to int

I'm using PySerial and trying to receive a byte array {1,2,3,4,5,6,7,8,9,10,11} sent from a MCU. Here's the array I get from PySerial
b'\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b'
By looking at the first few "elements" (and the last one as well), I first thought they are just hex numbers until I saw 't' and 'n'.
So I tried to see the output of ord(b'\t') and it indeed gives me the int number 9. I'm a bit confused since ord() is supposed to return the Unicode code point.
Why is 9 represented as b'\t' and 10 as b'\n'? What is this representation and can I find like a conversion table anywhere?
Thank you!
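What you are seeing is just Python's repr of a bytes object: a byte is shown as its printable escape when one exists (9 is tab, \t; 10 is newline, \n) and as a \xNN escape otherwise, and the conversion table is simply ASCII. A quick check:

```python
data = b'\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b'

# list() on a bytes object yields the raw integer values 1..11:
print(list(data))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# b'\t' is the single byte 9 and b'\n' is the byte 10:
print(ord(b'\t'), ord(b'\n'))  # 9 10
```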

python pack integer as hex byte

I thought this would be simple, but spent quite some time trying to figure it out.
I want to convert an integer into a byte string, and display it in hex format. But I seem to get the ASCII representation? Specifically, for an int value of 122.
from struct import *
pack("B",122) #this returns b'z', what I need is b'\x7a'
pack("B",255) #this returns b'\xff', which is fine.
I know in python 2.x you can use something like chr() but not in python 3, which is what I have. Ideally the solution would work in both.
You can use codecs:
import codecs
codecs.encode(pack("B",122),"hex")
or, in Python 2 only, str encoding:
a = pack("B",122)
a.encode("hex")  # AttributeError in Python 3: bytes has no encode()
I think you are getting the results you desire, and that whatever you are using to look at your results is causing the confusion. Try running this code:
from struct import *
x = pack("B",122)
assert 123 == x[0] + 1
You will discover that it works as expected and the assertion does not fail: in Python 3, indexing a bytes object already gives you the integer 122.
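If the goal is simply a hex rendering of the packed byte, Python 3's bytes.hex() method (or plain integer formatting) gets there directly; a short sketch, assuming Python 3:

```python
from struct import pack

x = pack("B", 122)
print(x)                     # b'z': the repr shows a printable byte as its character
print(x.hex())               # '7a': hex text for the whole byte string
print('{:02x}'.format(122))  # '7a': or format the integer directly
```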

Changing bytes in the file?

I have a request to, "Encode the file by adding 5 to every byte in the file". I tried opening the file as read binary, but all that does is add a b to the beginning of the string- I don't think that is what the expectation of the statement is. I tried looking into pickle, but I don't think that is right either.
What else could this mean? Any ideas as to what possible solutions there are?
It doesn't actually add a b to the beginning of the string -- b is just a marker that Python puts on the string when representing it to you so that you know it's a bytes type, not str. Bytes are really just numbers (0-255), so you can walk through the byte object, get each value, figure out what number it corresponds to and add 5, etc.
hint - this task probably gets easier if you choose to use a bytearray to store the bytes.
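Following that hint, a minimal sketch of the per-byte transform (the wrap-around at 255 via % 256 is an assumption about what "adding 5" should do at the top of the range, and 'data.bin' is a placeholder path):

```python
def add_five(data: bytes) -> bytes:
    # Each byte is an int 0-255; wrap with % 256 so 251 + 5 -> 0, etc.
    out = bytearray(data)
    for i in range(len(out)):
        out[i] = (out[i] + 5) % 256
    return bytes(out)

# Applied to a whole file:
# with open('data.bin', 'rb') as f:
#     encoded = add_five(f.read())
# with open('data.bin', 'wb') as f:
#     f.write(encoded)
```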

How to encode a long string of hex values in unicode easily in python

I have hex code point values for a long string. For a short one, following is fine.
msg = unichr(0x062A) + unichr(0x0627) + unichr(0x0628)
print msg
However, since unichr's alternate API unicode() does exist, I thought there must be a way to pass an entire code point string to it. So far I wasn't able to do it.
Now I have to type in a string of 150 hex values (code points) like the 3 above to generate a complete string. I was hoping for something like
msg = unicode('0x062A, 0x0627....')
I have to use 'msg' later. Printing it was a mere example. Any ideas?
Perhaps something like this:
", ".join(unichr(u) for u in (0x062A, 0x0627, 0x0628))
Result:
u'\u062a, \u0627, \u0628'
Edit: This uses str.join.
Hard to tell what you're asking for exactly. Are you looking for u'\u062A\u0627\u0628'? The \u escape lets you enter individual characters by code point in a unicode string.
Following your example:
>>> c = u'\u062A\u0627\u0628'
>>> print c
تاب
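The examples above are Python 2 (unichr, u'' literals). If Python 3 is an option, chr() accepts any code point and ''.join() assembles the long string; a sketch, assuming the code points are available as integers:

```python
# The three code points from the question; a real case would list all 150.
code_points = (0x062A, 0x0627, 0x0628)
msg = ''.join(chr(cp) for cp in code_points)
print(msg)  # تاب
```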

Python: Convert Unicode-Hex-String to Unicode

I have a hex-string made from a unicode string with that function:
def toHex(s):
res = ""
for c in s:
res += "%02X" % ord(c) #at least 2 hex digits, can be more
return res
hex_str = toHex(u"...")
This returns a string like this one:
"80547CFB4EBA5DF15B585728"
That's a sequence of 6 chinese symbols.
But
u"Knödel"
converts to
"4B6EF664656C"
What I need now is a function to convert this back to the original unicode. The chinese symbols seem to have a 2-byte representation while the second example has 1-byte representations for all characters. So I can't just use unichr() for each 1- or 2-byte block.
I've already tried
binascii.unhexlify(hex_str)
but this seems to convert byte-by-byte and returns a string, not unicode. I've also tried
binascii.unhexlify(hex_str).decode(...)
with different formats. Never got the original unicode string.
Thank you a lot in advance!
This seems to work just fine:
binascii.unhexlify(binascii.hexlify(u"Knödel".encode('utf-8'))).decode('utf-8')
Comes back to the original object. You can do the same for the Chinese text if it's encoded properly; however, your toHex() already destroys the text you started from, because "%02X" % ord(c) loses the character boundaries. You'll need to encode the string first and only then treat it like a string of bytes.
Can't be done. Using %02X loses too much information. You should be using something like UTF-8 first and converting that, instead of inventing a broken encoding.
>>> u"Knödel".encode('utf-8').encode('hex')
'4b6ec3b664656c'
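In Python 3, where bytes objects have no .encode('hex'), the same round trip can be written with binascii; a sketch that reproduces the answer's hex string and recovers the original text:

```python
import binascii

s = 'Knödel'
# Encode to UTF-8 bytes first, then hex those bytes:
hex_str = binascii.hexlify(s.encode('utf-8')).decode('ascii')
print(hex_str)  # 4b6ec3b664656c
# Reverse both steps to get the original text back:
restored = binascii.unhexlify(hex_str).decode('utf-8')
assert restored == s
```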
When I was working with Unicode in a VB app a while ago, the first 1 or 2 digits would be removed if they were a "0", meaning "&H00A2" would automatically be converted to "&HA2". I just created a small function to check the length of the string and, if it was less than 4 chars, add the missing 0's. I'm not sure if this is what's happening to you, but I thought I would give this bit of information as something to be aware of.
