I am having trouble creating a bytes object in Python. I want to convert a list of ints into the bytes shown under "Expected output" below, so that I can send them to a serial device connected to a COM port. With the current output, the serial device does not accept the '\\' as a separator. Please suggest the correct way to handle the '\' in a list of bytes.
cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]
#Convert CMD list to Hex List
for i in range(len(cmdlist)):
    cmdlist[i] = hex(cmdlist[i])
f_cmdList = ''
#Convert hex CMD list to string List
for i in range(len(cmdlist)):
    f_cmdList += '\\' + (cmdlist[i])
Final_cmdlist = (bytes(f_cmdList,'utf-8'))
print(Final_cmdlist)
Current output : b'\\0x2\\0xc\\0x1\\0x1\\0x1\\0x0\\0x0\\0x1\\0x3\\0x7\\0x2a\\0x65\\0x55\\0x12'
Expected output : b'\0x2\0xc\0x1\0x1\0x1\0x0\0x0\0x1\0x3\0x7\0x2a\0x65\0x55\0x12'
Thank You !
You can convert a list to bytes simply by using the bytes constructor. Your method is trying to create a string that contains the string representation of a byte array, which won't work when sent to the serial device:
>>> cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]
>>> bytes(cmdlist)
b'\x02\x0c\x01\x01\x01\x00\x00\x01\x03\x07*eU\x12'
You get what you say you expect if you replace your
f_cmdList += '\\' + (cmdlist[i])
with this:
f_cmdList += '\0' + cmdlist[i][1:]
Still not convinced that you really want that, though.
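To see why the string-building approach is unlikely to be what the device wants, it can help to compare lengths. This is a small Python 3 sketch that uses only the values from the question; the variable names payload and looks_similar are made up for illustration:

```python
cmdlist = [2, 12, 1, 1, 1, 0, 0, 1, 3, 7, 42, 101, 85, 18]

# One byte per integer: this is what a serial device normally expects.
payload = bytes(cmdlist)

# In a literal, '\0' is an octal escape for a NUL byte, so b'\0x2' is
# three bytes (NUL, 'x', '2'), not the single byte 0x02.
looks_similar = b'\0x2'

print(len(payload))             # 14
print(len(looks_similar))       # 3
print(payload[1:2] == b'\x0c')  # True
```

With pyserial, payload could then be passed to the port's write() method as is.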
I'm writing a program in Linux which reads and distinguishes inputs from two USB devices (two barcode readers) that simulate a keyboard.
I can already read input from USB, but it happens before the OS translates the keycode into a character.
For example, when I read 'a' i got 24, 'b' 25, etc....
For example, when I read 'a' i got 4, 'b' 5, etc....
Is there any way to convert that code into a char without manual mapping?
Some output examples:
KEYPRESS = a output = array('B', [0, 0, 4, 0, 0, 0, 0, 0])
KEYPRESS = SHIFT + a output = array('B', [2, 0, 4, 0, 0, 0, 0, 0])
KEYPRESS = 1 output = array('B', [0, 0, 30, 0, 0, 0, 0, 0])
KEYPRESS = ENTER output = array('B', [0, 0, 81, 0, 0, 0, 0, 0])
thx!
Use the chr function. Python uses a different character mapping (ASCII) from whatever you're receiving though, so you will have to add 73 to your key values to fix the offset.
>>> chr(24 + 73)
'a'
>>> chr(25 + 73)
'b'
I've already can read inputs from USB, but it happens before OS
translate keycode in a charactere.
The problem seems to me in your interface or the driver program.
In ASCII 'a' is supposed to have ordinal value 97, whose binary representation is 0b1100001, whereas what you are receiving is 24, whose binary representation is 0b11000. Similarly, for 'b' you were supposed to receive 0b1100010, but instead you received 25, which is 0b11001. Check your hardware to determine if the 1st and the 3rd bit is dropped from the input.
What you are receiving is a USB scan code. I do not think there is a third-party Python library to do the conversion for you. I would suggest you refer to any USB scan code table and, from it, create a dictionary mapping each USB scan code to the corresponding ASCII character.
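The dictionary approach could look roughly like the sketch below. It assumes the arrays in the question are standard 8-byte boot-protocol keyboard reports (modifier byte, reserved byte, then up to six key codes) and uses the usage IDs from the USB HID Usage Tables for letters, digits and Enter only. The names USAGE_TO_CHAR, SHIFT_MASK and decode_report are invented for this example, and the shift handling is deliberately simplified:

```python
import string

# USB HID keyboard usage IDs: 4..29 are a..z, 30..38 are 1..9,
# 39 is 0 and 40 is Enter. Only this subset is mapped here.
USAGE_TO_CHAR = {4 + i: c for i, c in enumerate(string.ascii_lowercase)}
USAGE_TO_CHAR.update({30 + i: c for i, c in enumerate("123456789")})
USAGE_TO_CHAR[39] = "0"
USAGE_TO_CHAR[40] = "\n"

# Left shift (0x02) and right shift (0x20) bits in the modifier byte.
SHIFT_MASK = 0x02 | 0x20


def decode_report(report):
    """Decode one 8-byte keyboard report into a string.

    report[0] is the modifier byte, report[1] is reserved and
    report[2:8] hold up to six pressed-key usage IDs (0 = no key).
    Simplification: shift only upper-cases letters.
    """
    shifted = bool(report[0] & SHIFT_MASK)
    chars = []
    for usage in report[2:8]:
        ch = USAGE_TO_CHAR.get(usage)
        if ch is None:
            continue  # unmapped or empty slot
        chars.append(ch.upper() if shifted and ch.isalpha() else ch)
    return "".join(chars)


print(decode_report([0, 0, 4, 0, 0, 0, 0, 0]))   # a
print(decode_report([2, 0, 4, 0, 0, 0, 0, 0]))   # A
print(decode_report([0, 0, 30, 0, 0, 0, 0, 0]))  # 1
```

Shifted digits and punctuation would need a second table, since they depend on the keyboard layout the reader is configured for.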
I append to a TCP packet an unsigned long long with value 4 and an additional unsigned long long with value 8616616 (I don't remember the second value exactly).
I do it in C on 32-bit Ubuntu, so unsigned long long is 8 bytes.
I sniff the packet with scapy and print padding.load.
In the output I see symbols whose meaning I don't understand: g and |.
In addition, the load should be 16 bytes, but I don't see 16 bytes.
If I append only one unsigned long long, I get 8 bytes and I don't see these symbols.
>>> pkt = sniff(count=2,filter="tcp")
>>> raw = pkt[1].sprintf('%Padding.load%')
>>> raw
"'\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00g\\xc4|\\x00\\x00\\x00\\x00\\x00'"
>>> print raw
'\x04\x00\x00\x00\x00\x00\x00\x00g\xc4|\x00\x00\x00\x00\x00'
When you print the value of raw, Python shows every byte in the printable ASCII range (32 to 126) as its character and every other byte as a \x escape. The g is a byte with value 103, and likewise | is a byte with value 124. Bytes outside the printable range get the escaped form, which is why you have \xc4 in your output; the value of that byte is 196.
The actual value of each of the bytes in raw is:
[4, 0, 0, 0, 0, 0, 0, 0, 103, 196, 124, 0, 0, 0, 0, 0]
Which is 16 bytes long.
You can test this by converting the value of each byte back into a character:
>>> values = [4, 0, 0, 0, 0, 0, 0, 0, 103, 196, 124, 0, 0, 0, 0, 0]
>>> as_characters = ''.join(chr(c) for c in values)
>>> as_characters
'\x04\x00\x00\x00\x00\x00\x00\x00g\xc4|\x00\x00\x00\x00\x00'
>>> len(as_characters)
16
I think what you have for raw has had each of the bytes escaped. In my example, when I output as_characters I only see a single backslash, you have two. You may need to use something like pkt[1].sprintf('%Padding.loadr%') to get the non escaped version.
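Once the unescaped 16 bytes are in hand, the two integers can be recovered with the struct module. This is only a sketch: it assumes the sender wrote both values in little-endian byte order (which is what an x86 machine would do without explicit conversion), and the names payload, first and second are illustrative:

```python
import struct

payload = bytes([4, 0, 0, 0, 0, 0, 0, 0, 103, 196, 124, 0, 0, 0, 0, 0])

# '<QQ' = two little-endian unsigned 64-bit integers (assumed byte order).
first, second = struct.unpack('<QQ', payload)

print(len(payload), first, second)  # 16 4 8176743
```

The bytes g, \xc4 and | are therefore just the three low-order bytes of the second number.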
I need to convert an ASCII string into a list of bits and vice versa:
str = "Hi" -> [0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1]
[0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1] -> "Hi"
There are many ways to do this with library functions. But I am partial to the third-party bitarray module.
>>> import bitarray
>>> ba = bitarray.bitarray()
Conversion from strings requires a bit of ceremony. Once upon a time, you could just use fromstring, but that method is now deprecated, since it has to implicitly encode the string into bytes. To avoid the inevitable encoding errors, it's better to pass a bytes object to frombytes. When starting from a string, that means you have to specify an encoding explicitly -- which is good practice anyway.
>>> ba.frombytes('Hi'.encode('utf-8'))
>>> ba
bitarray('0100100001101001')
Conversion to a list is easy. (Also, bitarray objects have a lot of list-like functions already.)
>>> l = ba.tolist()
>>> l
[False, True, False, False, True, False, False, False,
False, True, True, False, True, False, False, True]
bitarrays can be created from any iterable:
>>> bitarray.bitarray(l)
bitarray('0100100001101001')
Conversion back to bytes or strings is relatively easy too:
>>> bitarray.bitarray(l).tobytes().decode('utf-8')
'Hi'
And for the sake of sheer entertainment:
>>> def s_to_bitlist(s):
...     ords = (ord(c) for c in s)
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     return [(o >> shift) & 1 for o in ords for shift in shifts]
...
>>> def bitlist_to_chars(bl):
...     bi = iter(bl)
...     bytes = zip(*(bi,) * 8)
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     for byte in bytes:
...         yield chr(sum(bit << s for bit, s in zip(byte, shifts)))
...
>>> def bitlist_to_s(bl):
...     return ''.join(bitlist_to_chars(bl))
...
>>> s_to_bitlist('Hi')
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> bitlist_to_s(s_to_bitlist('Hi'))
'Hi'
There are probably faster ways to do this, but using no extra modules:
def tobits(s):
    result = []
    for c in s:
        bits = bin(ord(c))[2:]
        # left-pad with zeros to a full 8 bits
        bits = '00000000'[len(bits):] + bits
        result.extend([int(b) for b in bits])
    return result
def frombits(bits):
    chars = []
    # integer division so that range() also works on Python 3
    for b in range(len(bits) // 8):
        byte = bits[b*8:(b+1)*8]
        chars.append(chr(int(''.join([str(bit) for bit in byte]), 2)))
    return ''.join(chars)
Not sure why, but here are two ugly one-liners using only builtins (the list() call is needed on Python 3, where map returns an iterator that cannot be sliced):
s = "Hi"
l = list(map(int, ''.join([bin(ord(i)).lstrip('0b').rjust(8,'0') for i in s])))
s = "".join(chr(int("".join(map(str,l[i:i+8])),2)) for i in range(0,len(l),8))
yields:
>>> l
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> s
'Hi'
In real world code, use the struct or the bitarray module.
You could use the built-in bytearray:
>>> for i in bytearray('Hi', 'ascii'):
... print(i)
...
72
105
>>> bytearray([72, 105]).decode('ascii')
'Hi'
And bin() to convert to binary.
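To finish that idea, the sketch below goes all the way from text to a bit list and back. It uses format(byte, '08b') rather than bin(), because format zero-pads to eight digits in one step; the function names ascii_to_bits and bits_to_ascii are made up for this example, and ASCII-only input is assumed:

```python
def ascii_to_bits(text):
    # format(byte, '08b') gives each byte as exactly eight binary digits.
    return [int(bit)
            for byte in bytearray(text, 'ascii')
            for bit in format(byte, '08b')]


def bits_to_ascii(bits):
    # Rebuild each byte from a group of eight bits, most significant first.
    values = [int(''.join(map(str, bits[i:i + 8])), 2)
              for i in range(0, len(bits), 8)]
    return bytearray(values).decode('ascii')


print(ascii_to_bits('Hi'))
# [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
print(bits_to_ascii(ascii_to_bits('Hi')))  # Hi
```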
def text_to_bits(text):
    """
    >>> text_to_bits("Hi")
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
    """
    bits = bin(int.from_bytes(text.encode(), 'big'))[2:]
    return list(map(int, bits.zfill(8 * ((len(bits) + 7) // 8))))
def text_from_bits(bits):
    """
    >>> text_from_bits([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
    'Hi'
    """
    n = int(''.join(map(str, bits)), 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
See also, Convert Binary to ASCII and vice versa (Python).
def to_bin(string):
    res = ''
    for char in string:
        tmp = bin(ord(char))[2:]
        # zero-pad the binary digits to 8 characters
        tmp = '%08d' %int(tmp)
        res += tmp
    return res
def to_str(string):
    res = ''
    # integer division so that range() also works on Python 3
    for idx in range(len(string)//8):
        tmp = chr(int(string[idx*8:(idx+1)*8], 2))
        res += tmp
    return res
These functions are really simple.
They don't use any third-party modules.
A few speed comparisons. Each of these were run using
python -m timeit "code"
or
cat <<-EOF | python -m timeit
code
EOF
if multiline.
Bits to Byte
A: 100000000 loops, best of 3: 0.00838 usec per loop
res = 0
for idx,x in enumerate([0,0,1,0,1,0,0,1]):
    res |= (x << idx)
B: 100000000 loops, best of 3: 0.00838 usec per loop
int(''.join(map(str, [0,0,1,0,1,0,0,1])), 2)
Byte to Bits
A: 100000000 loops, best of 3: 0.00836 usec per loop
[(41 >> x) & 1 for x in range(7, -1, -1)]
B: 100000 loops, best of 3: 2.07 usec per loop
map(int, bin(41)[2:])
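Per-loop figures of around 0.008 usec are close to what an almost empty statement costs, so these numbers may be worth re-measuring. A hedged sketch of how that could be done from inside Python with the timeit module follows. The function names are invented here, and the shift variant reads the list most-significant-bit first so that both variants return the same value (the loop in variant A above puts the first list element in the lowest bit instead):

```python
import timeit

BITS = [0, 0, 1, 0, 1, 0, 0, 1]


def bits_to_byte_shift(bits):
    # Most significant bit first, matching the int(..., 2) variant.
    res = 0
    for bit in bits:
        res = (res << 1) | bit
    return res


def bits_to_byte_str(bits):
    return int(''.join(map(str, bits)), 2)


# Sanity check before timing: both variants must agree.
assert bits_to_byte_shift(BITS) == bits_to_byte_str(BITS) == 41

for fn in (bits_to_byte_shift, bits_to_byte_str):
    seconds = timeit.timeit(lambda: fn(BITS), number=100000)
    print(fn.__name__, seconds)
```

Absolute timings depend on the machine and the Python version, so only the ratio between the two variants is meaningful.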
import math
class BitList:
    def __init__(self, value):
        # A string is first turned into one big integer, one byte per character.
        if isinstance(value, str):
            value = sum([bytearray(value, "utf-8")[-i - 1] << (8*i) for i in range(len(bytearray(value, "utf-8")))])
        try:
            # A sequence of bits (most significant first) is packed into an integer.
            self.value = sum([value[-i - 1] << i for i in range(len(value))])
        except Exception:
            # Anything that is not a sequence is assumed to be an integer already.
            self.value = value
    def __getitem__(self, index):
        if isinstance(index, slice):
            if index.step is not None and index.step != 1:
                return list(self)[index]
            else:
                start = index.start if index.start else 0
                stop = index.stop if index.stop is not None else len(self)
                return BitList(math.floor((self.value % (2 ** (len(self) - start))) >> (len(self) - stop)))
        else:
            return bool(self[index:index + 1].value)
    def __len__(self):
        return math.ceil(math.log2(self.value + 1))
    def __str__(self):
        # __str__ must return a str, not the int itself
        return str(self.value)
    def __repr__(self):
        return "BitList(" + str(self.value) + ")"
    def __iter__(self):
        yield from [self[i] for i in range(len(self))]
You can initialize BitList with a number, a string, or a list (of numbers or booleans), and then get its value, get positional items, get slices, and convert it to a list. Note: setting items is not supported yet; I will edit this post when I add that.
I wrote this myself, then went looking for how to convert a string (or a file) into a list of bits, and figured that part out from another answer.
This works, but it does not comply with PEP 8 (long lines, complex expressions):
tobits = lambda x: "".join(map(lambda y:'00000000'[len(bin(ord(y))[2:]):]+bin(ord(y))[2:],x))
frombits = lambda x: ''.join([chr(int(str(y), 2)) for y in [x[y:y+8] for y in range(0,len(x),8)]])
These are used like normal functions.
Because I like generators, I'll post my version here:
def bits(s):
    for c in s:
        yield from (int(bit) for bit in bin(ord(c))[2:].zfill(8))
def from_bits(b):
    for i in range(0, len(b), 8):
        yield chr(int(''.join(str(bit) for bit in b[i:i + 8]), 2))
print(list(bits('Hi')))
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
print(''.join(from_bits([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1])))
Hi
If you have bits in a list, you can join them into a string of digits and parse that as a base-2 number. The result is an ordinary int, so bitwise operations can be applied to it.
For example:
int(''.join(map(str, [1,0,0,1])), 2) | int(''.join(map(str, [1,0,1,1])), 2)
So I have kind of a ignorant (maybe?) question. I'm working with writing to a serial device for the first time. I have a frame [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, X, Y] that I need to send. X and Y are checksum values. My understanding in using the pyserial module is that I need to convert this frame into a string representation. Ok that's fine, but I'm confused on what format things are supposed to be in. I tried doing
a = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, X, Y]
send = "".join(chr(t) for t in a)
But my confusion comes from the fact that X and Y, when using chr, transform into weird strings (presumably their ASCII representation). For example, if X is 36, chr(X) is '$' instead of '\x24'. Is there a way I can get a string representing the '\xnn' value instead of the ASCII character? What's confusing me is that 12 and 7 convert to '\x0c' and '\x07' correctly. Am I missing something?
Update:
So it might be that I'm not quite understanding how serial writes are being done or what my device is expecting of me. This is a portion of my C code that is working:
fd=open("/dev/ttyS2",O_RDWR|O_NDELAY);
char buff_out[20]
//Next line is psuedo
for i in buff_out print("%x ",buff_out[i]); // prints b 0 0 0 0 0 0 0 9 b3 36
write(fd,buff_out,11);
sleep()
read(fd,buff_in,size);
for i in buff_in print("%x ",buff_in[i]); // prints the correct frame that I'm expecting
Python:
frame = [11, 0, 0, 0, 0, 0, 0, 0, 9] + [crc1, crc1]
senddata = "".join(chr(x) for x in frame)
IEC = serial.Serial(port='/dev/ttyS2', baudrate=1200, timeout=0)
IEC.send(senddata)
IEC.read(18) # number of bytes to read doesn't matter, it's always 0
Am I going about this the right way? Obviously you can't tell exactly since it's device specific and I can't really give too many specifics out. But is that the correct format that serial.send() expects data in?
It's perfectly normal for ASCII bytes to be represented by single characters if they can be printed, and by the \x?? notation otherwise. In both cases they represent a single byte, and you can write strings in either fashion:
>>> '\x68\x65\x6c\x6c\x6f'
'hello'
However if you're using Python 2.6 or later then you might find it easier and more natural to use the built-in bytearray rather than messing around with ord or struct.
>>> vals = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 36, 100]
>>> b = bytearray(vals)
>>> b
bytearray(b'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00$d')
You can convert to a str (or bytes in Python 3) just by casting, and can index the bytearray to get the integers back.
>>> str(b)
'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00$d'
>>> b[0]
12
>>> b[-1]
100
As to your serial Python code, it looks fine to me - I'm not sure why you think there is a problem...
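For comparison, here is a hedged sketch of the same exchange using pyserial's write() and read() methods. To my knowledge pyserial's Serial class has no send() method, and timeout=0 puts read() in non-blocking mode, so it returns immediately with whatever has already arrived; at 1200 baud that is often nothing. The helper names build_frame and exchange are made up, the checksum bytes 0xb3 and 0x36 are taken from the C output in the question, and the one-second timeout is an arbitrary assumption:

```python
def build_frame(body, crc_bytes):
    """Return the frame as bytes: one byte per integer in 0..255."""
    return bytes(list(body) + list(crc_bytes))


def exchange(port, frame, reply_size, timeout=1.0):
    """Write frame, then wait up to `timeout` seconds for a reply."""
    import serial  # pyserial; imported here so build_frame works without it
    with serial.Serial(port=port, baudrate=1200, timeout=timeout) as dev:
        dev.write(frame)
        return dev.read(reply_size)


frame = build_frame([11, 0, 0, 0, 0, 0, 0, 0, 9], [0xb3, 0x36])
print(frame.hex())  # 0b0000000000000009b336

# With a device attached (not run here):
# reply = exchange('/dev/ttyS2', frame, 18)
```

The hex dump matches the eleven values the working C program prints before its write() call.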
The character with the ASCII-code 36 is '$'. Look it up in any ASCII table. Python only displays the hex escapes if the character is not printable (control characters etc).
At the lowest level, it's the same bit pattern anyway - no matter whether Python prints it as a hex escape or as the char with that ASCII value.
But you might want to use the struct module, it takes care of such conversions for you.
I would guess you want struct.
>>> import struct
>>> struct.pack('>B', 12)
'\x0c'
>>> vals = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0xa, 0xb]
>>> ''.join(struct.pack('>B', x) for x in vals)
'\x0c\x00\x00\x00\x00\x00\x00\x00\x07\x00\n\x0b'
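On Python 3, struct.pack returns bytes, so joining with a str no longer works. A minimal sketch of the equivalent call, packing the whole list at once (the name packed is illustrative):

```python
import struct

vals = [12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0xa, 0xb]

# '12B' means twelve unsigned bytes; the count is derived from the list.
packed = struct.pack('%dB' % len(vals), *vals)

print(packed == bytes(vals))  # True
```

For plain lists of byte values, bytes(vals) gives the same result; struct becomes more useful when the frame mixes field widths.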
What you do is perfectly fine: your send is what you want: a sequence of bytes with the values you want (a).
If you want to see what are the hexadecimal codes of the characters in send, you can do:
import binascii
print binascii.hexlify(send)
or
print ''.join(r'\x%02x' % ord(char) for char in send)
(if you want \x prefixes).
What you see when directly printing repr(send) is a representation of send, which uses ASCII: 65 represents 'A', but character 12 is '\x0c'. This is merely a convention used by Python, which is convenient when the string contains words, for instance: it is better to display 'Hello' than \x48\x65\x6c\x6c\x6f!
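The two snippets above are written for Python 2. On Python 3 the data would be a bytes object, and a rough equivalent, using the example values 36 and 100 for the two checksum bytes, might look like this:

```python
send = bytes([12, 0, 0, 0, 0, 0, 0, 0, 7, 0, 36, 100])

# Plain hex digits, two per byte.
print(send.hex())  # 0c0000000000000007002464

# The same bytes with \x prefixes; iterating over bytes yields ints.
escaped = ''.join('\\x%02x' % b for b in send)
print(escaped)
```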