I have an external app that prepends the length of each packet to the data, something like the following code:
x = "ABCDE"
x_len = len(x)
y = "GHIJK"
y_len = len(y)
test_string = chr(x_len) + x + chr(y_len) + y
#TODO:perform base64 encoding
On the client side I need to be able to extract x_len and y_len and read x and y accordingly.
#TODO:perform base64 decoding
x_len = int(test_string[0])
x = test_string[:x_len]
I get the following error:
ValueError: invalid literal for int() with base 10: '\x05'
I assume the argument of int() is in hex, so I probably need to do some decoding before passing it to int(). Can someone give me a pointer as to what function to use to decode it, or whether there is an easier way to accomplish this?
You probably want ord(), not int(), since ord() is the opposite operation from chr().
Note that your code will only work for lengths up to 255 since that is the maximum chr() and ord() support.
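Here is a minimal sketch of both sides, including the base64 step from your TODOs (note that the slicing offsets have to account for the length byte itself, which your snippet skipped):

import base64

# Encoding side: prefix each field with a one-byte length.
x = "ABCDE"
y = "GHIJK"
test_string = chr(len(x)) + x + chr(len(y)) + y
encoded = base64.b64encode(test_string.encode('ascii'))

# Decoding side: read a length byte with ord(), then slice that many characters.
decoded = base64.b64decode(encoded).decode('ascii')
x_len = ord(decoded[0])
x = decoded[1:1 + x_len]
y_len = ord(decoded[1 + x_len])
y = decoded[2 + x_len:2 + x_len + y_len]
print(x, y)  # ABCDE GHIJK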
t="ABCDE"
print reduce(lambda x,y:x+y,[ord(i) for i in t])
#output 335
Usage of ord(): it converts a character to its ASCII value. Some problems number the letters A = 1 through Z = 26; in such cases use ord('A') - 64, which gives 1, since ord('A') is 65.
I assign a value to a variable x in the following way:
import wave
w = wave.open('/usr/share/sounds/ekiga/voicemail.wav', 'r')
x = w.readframes(1)
When I type x I get:
'\x1e\x00'
So x got a value. But what is that? Is it hexadecimal? type(x) and type(x[0]) tell me that x and x[0] are strings. Can anybody tell me how I should interpret these strings? Can I transform them into integers?
The interactive interpreter echoes unprintable characters like that. The string contains two bytes, 0x1E and 0x00. You can convert it to an integer with struct.unpack("<h", x) (little endian, 2 bytes, signed).
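For example, in Python 3, where readframes() returns a bytes object:

>>> import struct
>>> struct.unpack("<h", b'\x1e\x00')  # returns a 1-tuple
(30,)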
Yes, it is displayed in hexadecimal, but what it means depends on the other parameters of the wav file, e.g. the sample width and the number of channels. Your data could be read in two ways: 2 channels and 1-byte sample width (stereo sound), or 1 channel and 2-byte sample width (mono sound). Use w.getparams(): the first number will be the number of channels and the second will be the sample width.
It's a two-byte string:
>>> x = '\x1e\x00'
>>> map(ord, x)
[30, 0]
>>> [ord(i) for i in x]
[30, 0]
These strings represent bytes. You can turn them into integers with the struct module, which allows interpreting strings of bytes.
Trying to convert a list of binary strings into signed 16-bit little-endian integers.
input_data = [['1100110111111011','1101111011111111','0010101000000011'],['1100111111111011','1101100111111111','0010110100000011']]
Desired Output =[[-1074, -34, 810],[-1703, -39, 813]]
This is what I've got so far. It's been adapted from: Hex string to signed int in Python 3.2?,
Conversion from HEX to SIGNED DEC in python
results = []
for i in input_data:
    hex_convert = [hex(int(x, 2)) for x in i]
    convert = [int(y[4:6] + y[2:4], 16) for y in hex_convert]
    results.append(convert)
print(results)
output: [[64461, 65502, 810], [64463, 65497, 813]]
This works fine, but the values above are unsigned integers. I need signed integers capable of handling negative values. I then tried a different approach:
results_2 = []
for i in input_data:
    hex_convert = [hex(int(x, 2)) for x in i]
    to_bytes = [bytes(j, 'utf-8') for j in hex_convert]
    split_bits = [int(k, 16) for k in to_bytes]
    convert_2 = [int.from_bytes(b, byteorder='little', signed=True) for b in to_bytes]
    results_2.append(convert_2)
print(results_2)
Output: [[108191910426672, 112589973780528, 56282882144304], [108191943981104, 112589235583024, 56282932475952]]
This result is even wilder than the first. I know my approach is wrong, and it doesn't help that I've never been able to get my head around binary conversion, but I feel I'm on the right path with:
(b, byteorder = 'little', signed = True)
but I can't work out where I'm wrong. Any help explaining this concept would be greatly appreciated.
This result is even wilder than the first. I know my approach is wrong... but can't work out where I'm wrong.
The problem is in the conversion to bytes. Let's look at it a step at a time:
int(x, 2)
Fine; we treat the string as a base-2 representation of the integer value, and get that integer. Only problem is it's a) unsigned and b) big-endian.
hex(int(x,2))
What this does is create a string representation of the integer, in base 16, with a 0x prefix. Notably, there are two text characters per byte that we want. This is already heading us down the wrong path.
You might have thought of using hexadecimal because you've seen \xAB style escapes inside string representations. This is a completely different thing. The string '\xAB' contains one character. The string '0xAB' contains four.
From there, everything else is still nonsense. Converting to bytes with a text encoding just means that the text character 0 for example is replaced with the byte value 48 (since in UTF-8 it's encoded with a single byte with that value). For this data we get the same results with UTF-8 that we would by assuming plain ASCII (since UTF-8 is "ASCII transparent" and there are no non-ASCII characters in the text).
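To see this concretely (a quick illustration):

>>> '0x2a'.encode('utf-8')        # still text, just stored as bytes
b'0x2a'
>>> list('0x2a'.encode('utf-8'))  # the byte values of the four characters
[48, 120, 50, 97]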
So how do we do it?
We want to convert the integer from the first step into the bytes used to represent it. Just as there is a .from_bytes class method allowing us to create an integer from underlying bytes, there is an instance method allowing us to get the bytes that would represent the integer.
So, we use .to_bytes, specifying the length, signedness and endianness that was assumed when we created the int from the binary string - that gives us bytes that correspond to that string. Then, we re-create the integer from those bytes, except now specifying the proper signedness and endianness. The reason that .to_bytes makes us specify a length is because the integer doesn't have a particular length - there are a minimum number of bytes required to represent it, but you could use as many more as you like. (This is especially important if you want to handle signed values, since it will do sign-extension automatically.)
Thus:
results_2 = []
for i in input_data:
    values = [int(x, 2) for x in i]
    as_bytes = [x.to_bytes(2, byteorder='big', signed=False) for x in values]
    reinterpreted = [int.from_bytes(x, byteorder='little', signed=True) for x in as_bytes]
    results_2.append(reinterpreted)
But let's improve the organization of the code a bit. I will first make a function to handle a single integer value, and then we can use comprehensions to process the list. In fact, we can use nested comprehensions for the nested list.
def as_signed_little(binary_str):
    # This time, taking advantage of positional args and default values.
    as_bytes = int(binary_str, 2).to_bytes(2, 'big')
    return int.from_bytes(as_bytes, 'little', signed=True)

# And now we can do:
results_2 = [[as_signed_little(x) for x in i] for i in input_data]
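For instance, checking a single value from input_data:

>>> as_signed_little('1100110111111011')
-1075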
I am writing a Python method that takes a number and builds a byte string, which then gets sent to an Arduino. However, whenever I try, the escape character is always included in the final byte string.
Here is the snippet of the code I am using:
num = 5
my_str = '\\x4' + str(num)
my_str.encode('utf-8')
Result:
b'\\x45'
I tried another method:
num2 = 5
byte1 = b'\\x4'
byte2 = bytes(str(num2), 'ISO-8859-1')
new_byte = byte1 + byte2
new_byte
Result:
b'\\x45'
Trying yet in a different way:
num = 5
u = chr(92) + 'x4' + str(num)
u.encode('ISO-8859-1')
Result:
b'\\x45'
I would like the byte string to be b'\x45', without the escape character, but I'm not sure what I have missed. I would appreciate any pointers on how I can achieve this.
Your problem is that you have already escaped the backslash: '\\x4' is a literal backslash followed by the characters x and 4, not the start of a \x45 escape. Escape sequences exist only in source-code literals, so you cannot build one by concatenating strings at runtime. There is a simpler way:
def make_into_bytes(n):
    return bytes([64 + n])

print(make_into_bytes(5))
This outputs
b'E'
Note that this isn't a bug, as this is simply the value of 0x45:
>>> b'\x45'
b'E'
The way this function works is basically just doing it by hand. Prepending '4' to a one-character hex string is the same as adding 4 * 16 = 64 to its value, so I construct a bytes object from 64 + n. Note that I assume n is an integer, as in your code. If n were instead a hex digit like 'a', it would correspond to the integer 10.
If you want it to work on hex digits, rather than on integer digits, you would need to change it to this:
def make_into_bytes(n):
    return bytes([64 + int(n, 16)])

print(make_into_bytes('5'))
print(make_into_bytes('a'))
with output
b'E'
b'J'
This quite simply converts the digit from base 16 first.
You can use the built-in function chr() to convert an integer to the corresponding character:
>>> chr(0x40 + 5)
'E'
Alternatively, if you just want to get the n-th letter of the alphabet, it might be more readable to use string.ascii_uppercase:
>>> import string
>>> string.ascii_uppercase[5 - 1]
'E'
Note that the results in this answer are strings in Python 3, not bytes objects. Simply calling .encode() on them will convert them to bytes.
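For instance, confirming the equivalence:

>>> chr(0x40 + 5).encode()
b'E'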
Let's say I have integer x (-128 <= x <= 127). How to convert x to signed char?
I tried the following, but it didn't work:
>>> x = -1
>>> chr(x)
ValueError: chr() arg not in range(256)
Within Python, there aren't really any borders between signed, unsigned, char, int, long, etc. However, if your aim is to serialize data to use with a C module, function, or whatever, you can use the struct module to pack and unpack data. b, for example, packs data as a signed char type.
>>> import struct
>>> struct.pack('bbb', 1, -2, 4)
b'\x01\xfe\x04'
If you're directly talking to a C function, use ctypes types as Ashwin suggests; when you pass them Python translates them as appropriate and piles them on the stack.
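A minimal ctypes sketch (c_byte is ctypes' signed char type):

>>> import ctypes
>>> ctypes.c_byte(-1)
c_byte(-1)
>>> ctypes.c_byte(-1).value
-1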
It does not work because x is not in range(256). In Python 2, chr() must be given an int between 0 and 255.
In Python 3, chr() does not convert an int into a signed char (which Python doesn't have anyway). chr() returns a Unicode string having the ordinal value of the inputted int.
>>> chr(77)
'M'
>>> chr(690)
'ʲ'
The argument can be written in any supported base, not just decimal:
>>> chr(0xBAFF)
'뫿'
Note: In Python 2, the int must be between 0 and 255, and will output a standard (non-Unicode) string. unichr() in Py2 behaves as above.
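If the goal in Python 3 is the actual signed-char byte rather than a character, one alternative sketch is int.to_bytes with signed=True:

>>> (-1).to_bytes(1, 'big', signed=True)
b'\xff'
>>> (127).to_bytes(1, 'big', signed=True)
b'\x7f'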
I'm trying to create a hex representation of some data that needs to be transmitted (specifically, in ASN.1 notation). At some points, I need to convert data to its hex representation. Since the data is transmitted as a byte sequence, the hex representation has to be padded with a 0 if the length is odd.
Example:
>>> hex2(3)
'03'
>>> hex2(45)
'2d'
>>> hex2(678)
'02a6'
The goal is to find a simple, elegant implementation for hex2.
Currently I'm using hex, stripping out the first two characters, then padding the string with a 0 if its length is odd. However, I'd like to find a better solution for future reference. I've looked into str.format without finding anything that pads to a multiple of two.
def hex2(n):
    x = '%x' % (n,)
    return ('0' * (len(x) % 2)) + x
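A quick check against the examples from the question:

>>> hex2(3)
'03'
>>> hex2(678)
'02a6'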
To be totally honest, I am not sure what the issue is. A straightforward implementation of what you describe goes like this:
def hex2(v):
    s = hex(v)[2:]
    return s if len(s) % 2 == 0 else '0' + s
I would not necessarily call this "elegant" but I would certainly call it "simple."
Python's binascii module's b2a_hex is guaranteed to return an even-length string. The trick then is to convert the integer into a byte string; Python 3.2 and higher have that built in on int:
from binascii import b2a_hex

def hex2(integer):
    return b2a_hex(integer.to_bytes((integer.bit_length() + 7) // 8, 'big'))
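Note that b2a_hex returns a bytes object, so decode it if you need a str:

>>> hex2(678)
b'02a6'
>>> hex2(678).decode('ascii')
'02a6'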
Might want to look at the struct module, which is designed for byte-oriented I/O.

>>> import struct
>>> struct.pack('>i', 678)
b'\x00\x00\x02\xa6'
>>> # Use h instead of i for shorts
>>> struct.pack('>h', 1043)
b'\x04\x13'