Add 1 bit to a string - python

I'm looking for a way to "append '1' bit to message" in Python, in order to recreate the MD5 algorithm mentioned here.
This is what I've done, but the problem is that msg is actually a string:
msg.append(0x01)
while len(msg)%56!=0:
    msg.append(0x00)
What should I do?

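Use chr to convert the byte values to one-character strings. Then you can append them to msg.
msg += chr(0x80)
while len(msg) % 64 != 56:
    msg += chr(0x00)
In Python 2.7, each character in a str is 8 bits wide, so with this method you are not truly adding "1 bit"; you are adding a whole byte whose most significant bit is set. That is why the value is 0x80, which is 1 left-shifted seven times (0x01 << 7). Otherwise you would be adding 0b00000001 to the string instead of the desired 0b10000000. chr only accepts values in range(256), so shifting by eight would raise a ValueError. MD5 pads the message until its length is 56 bytes modulo 64, hence the % 64 != 56 test.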
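For comparison, the complete padding step could be sketched as follows in Python 3, where bytes takes the place of the Python 2 str. The helper name md5_pad is hypothetical, and the layout (a 0x80 byte, zero bytes up to 56 modulo 64, then the original length in bits as a 64-bit little-endian integer) is assumed from RFC 1321.

```python
import struct

def md5_pad(msg: bytes) -> bytes:
    """Return msg with MD5-style padding appended (hypothetical helper)."""
    # The original length in bits, taken modulo 2**64
    bit_len = (len(msg) * 8) & 0xFFFFFFFFFFFFFFFF
    # 0x80 is the single '1' bit followed by seven '0' bits
    padded = msg + b"\x80"
    # Zero-fill until the length is 56 bytes modulo 64
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Append the bit length as a 64-bit little-endian integer
    return padded + struct.pack("<Q", bit_len)

padded = md5_pad(b"abc")
print(len(padded))  # 64
print(padded[3])    # 128
```

The padded message always ends up a multiple of 64 bytes long, which is the block size the rest of the algorithm works on.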

Related

convert str 'A123456' to bytes b'\x0A\x12\x34\x56' in python

I use python 2.7 and windows.
I want to convert the string 'A123456' to bytes: b'\x0A\x12\x34\x56' and then concatenate it with other bytes (b'\xBB') to get b'\xbb\x0A\x12\x34\x56'.
That is, I want to obtain b'\xbb\x0A\x12\x34\x56' from the string 'A123456' and b'\xBB'.
This isn't too hard to do with binascii.unhexlify; the only problem you've got is that you need to zero pad your string when it doesn't have an even number of nibbles (unhexlify won't accept a length 7 string).
So first off, it's probably best to make a quick utility function that does that, because doing it efficiently isn't super obvious, and you want a self-documenting name:
def zeropad_even(s):
    # Adding one, then stripping low bit leaves even values unchanged, and rounds
    # up odd values to next even value
    return s.zfill(len(s) + 1 & ~1)
Now all you have to do is use that to fix up your string before unhexlifying it:
>>> from binascii import unhexlify
>>> unhexlify(zeropad_even('A123456'))
'\n\x124V'
>>> _ == b'\x0A\x12\x34\x56'
True
I included that last test just to show that you got the expected result; the repr of str tries to use printable ASCII or short escapes where available, so only the \x12 actually ends up in the repr; \x0A becomes \n, \x34 is 4 and \x56 is V, but those are all equivalent ways to spell the same bytes.
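On Python 3 the same idea can be sketched with bytes.fromhex instead of unhexlify. The function name hex_to_bytes is hypothetical; the padding expression is the one from zeropad_even above.

```python
def hex_to_bytes(s: str) -> bytes:
    """Left-pad s with '0' to an even number of hex digits, then decode it."""
    return bytes.fromhex(s.zfill(len(s) + 1 & ~1))

# Concatenate with the other byte, as asked in the question
payload = b"\xbb" + hex_to_bytes("A123456")
print(payload)  # b'\xbb\n\x124V'
```

The printed repr looks different from b'\xbb\x0A\x12\x34\x56', but it spells the same five bytes.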

Forming a byte string in Python

I am creating a method in Python whereby it will take a number which will form a byte string that will then get sent to the Arduino. However whenever I try, the escape character is always included in the final byte string.
Here is the snippet of the code I am using:
num = 5
my_str = '\\x4' + str(num)
my_str.encode('utf-8')
Result:
b'\\x45'
I tried another method:
num2 = 5
byte1 = b'\\x4'
byte2 = bytes(str(num2), 'ISO-8859-1')
new_byte = byte1 + byte2
new_byte
Result:
b'\\x45'
Trying yet in a different way:
num = 5
u = chr(92) + 'x4' + str(num)
u.encode('ISO-8859-1')
Result:
b'\\x45'
I would like to get the byte string to be b'\x45' without the escape character but not really sure what I have missed. I will appreciate any pointers on how I can achieve this.
Your problem is that you have already escaped the backslash. It is not recommended to construct a literal using an unknown variable, especially if there's a simpler way, which there is:
def make_into_bytes(n):
    return bytes([64 + n])
print(make_into_bytes(5))
This outputs
b'E'
Note that this isn't a bug, as this is simply the value of 0x45:
>>> b'\x45'
b'E'
The way this function works is basically just doing it by hand. Prepending '4' to a hex string (of length 1) is the same as adding 4 * 16 to it, which is 64. I then construct a bytes object out of this. Note that I assume n is an integer, as in your code. If n should be a digit like 'a', this would be the integer 10.
If you want it to work on hex digits, rather than on integer digits, you would need to change it to this:
def make_into_bytes(n):
    return bytes([64 + int(n, 16)])
print(make_into_bytes('5'))
print(make_into_bytes('a'))
with output
b'E'
b'J'
This quite simply converts the digit from base 16 first.
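Another way to look at it, sketched here under the assumption of Python 3: build the two-digit hex text first and let bytes.fromhex do the conversion, so no backslash escape is ever involved. The function name make_into_bytes_from_hex is hypothetical.

```python
def make_into_bytes_from_hex(digit: str) -> bytes:
    """Return the single byte whose hex spelling is '4' followed by digit."""
    if len(digit) != 1:
        raise ValueError("expected exactly one hex digit")
    # bytes.fromhex parses pairs of hex digits, e.g. '45' -> b'\x45'
    return bytes.fromhex("4" + digit)

print(make_into_bytes_from_hex("5"))  # b'E'
print(make_into_bytes_from_hex("a"))  # b'J'
```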
You can use the built-in function chr() to convert an integer to the corresponding character:
>>> chr(0x40 + 5)
'E'
Alternatively, if you just want to get the n-th letter of the alphabet, it might be more readable to use string.ascii_uppercase:
>>> import string
>>> string.ascii_uppercase[5 - 1]
'E'
Note that the results in this answer are strings in Python 3, not bytes objects. Simply calling .encode() on them will convert them to bytes.

How do I convert 32b (four characters) from an int value to an ASCII string in python

Hi, I have a 32b value that I need to split into its four bytes, convert each byte to ASCII, and combine them into a four-letter string. I also need the reverse process. I have been able to do this in one direction in the following ugly way:
## the variable "binword" is a 32 bit value read directly from an MCU, where each byte is an
## ASCII character
char0 = (binword & 0xFF000000) >> 24
char1 = (binword & 0xFF0000) >> 16
char2 = (binword & 0xFF00) >> 8
char3 = (binword & 0xFF)
fourLetterWord = str(unichr(char0))+str(unichr(char1))+str(unichr(char2))+str(unichr(char3))
Now, I find this method really inelegant and time consuming, so the question is how do I do this better? And, I guess the more important question, how do I convert the other way?
You should use the struct module's pack and unpack calls for these conversions:
number = 32424234
import struct
result = struct.pack("I", number)
and back:
number = struct.unpack("I", result)[0]
Please refer to the official docs on the struct module for the format-string syntax, including the markers that fix endianness and number size (for example, ">I" forces a big-endian 4-byte unsigned integer, while a bare "I" uses the machine's native byte order):
https://docs.python.org/2/library/struct.html
On a side note, this is in no way "ASCII"; it is a bytestring.
ASCII refers to a particular text encoding with codes in the 0-127 numeric range (the printable characters occupy 32-126).
The point is that you should not think of bytestrings as text if you need a stream of bytes, and much less think of "ASCII" as an alias for text strings, as it can represent less than 1% of the textual characters existing in the world.
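A round trip in both directions might look like the sketch below. It assumes Python 3 and a big-endian layout (">I"), which matches the shifts in the question where the most significant byte becomes the first letter; the value 0x41424344 is only an example.

```python
import struct

binword = 0x41424344  # example value; its bytes are 'A', 'B', 'C', 'D'

# Integer -> four bytes, most significant byte first
raw = struct.pack(">I", binword)
four_letter_word = raw.decode("ascii")
print(four_letter_word)  # ABCD

# Four letters -> integer
restored = struct.unpack(">I", four_letter_word.encode("ascii"))[0]
print(hex(restored))  # 0x41424344
```

Decoding with "ascii" raises UnicodeDecodeError if any byte is above 127, which is a useful check when the MCU is expected to send only ASCII characters.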

Python 2.7 Reading Hex and Dec as Str and then adding them

Today I am reading in a file, and extracting information. I've figured out pretty much everything, but for some reason I am having a very, very annoying problem! I read in an entire line and use the .split() command to break the 'sentence' into 'words' right? And then I alias the 'words' as such:
startAddress = line[ 0 ]
length = line[ 2 ].strip( "(" ).strip( ")" )
...
endAddress = startAddress + length
Note: I strip the length because in the data file it is encased with () which, later, cause problems when I load it into a .csv file because () are used as negatives.
Anyways, if I were to have 0x00230008 be the start address and (4) be the length, my program makes 0x002300084 be the end address instead of 0x0023000C, but if I do hex(length) or hex(startAddress) or even hex(str(length)) or hex(str(startAddress)) it throws an error saying hex numbers cannot be converted into hex. Likewise I cannot convert them into integers, either.
Really, all I need to do is add the starting address (which is in Hex, but reads in as a string) and the length (which is in int and reads in as int.) I have tried converting them around, but that didn't work. I also tried the line
endAddress = startAddress + length - 1
which tells me " unsupported operand type(s) for -: 'str' and 'int' " so, I've toyed with it as much as I can, but I'm just not figuring this out. I was thinking of removing the 0x in front of the hex value via strip, but then it reads in as an integer and is incorrect.
The last thing I tried was using line[ 0 ] and line[ 2 ] (with strips) directly to find endAddress, but it gives all the same errors. I tried to force the type by stating that startAddress = 0xFFFFFFFF before I assign it equal to line[ 0 ], but that didn't work. So how the heck do I convert a string to a hexadecimal number if it complains that it is hexadecimal when it is not? Or maybe my method of adding them is wrong? Can I use some other adding method?
The biggest confusion for me is that if I try to convert startAddress to a string, and then back into a hexadecimal number, it still complains.
int takes an optional parameter specifying the base of the string you want to convert from. So you could simply call something like:
proper_int = int(number, 16)
to get a proper integer.
For example:
int("10", 16) == 16
int("F0", 16) == 240
int("0x10", 16) == 16
If you want to add zero padding I would recommend zfill:
"10".zfill(4) == "0010"
You have to parse the string as a base-16 int
>>> int("0x00230008", 16)
2293768
Add the ints
>>> int("0x00230008", 16) + 4
2293772
And convert it back to a hex string:
>>> hex(int("0x00230008", 16) + 4)
'0x23000c'
You'll have to use some string formatting instead of hex to pad it with zeroes, if you need it:
>>> '0x%08x' % (int("0x00230008", 16) + 4)
'0x0023000c'
int() defaults to base-10, so specify the base when calling int on a base-16 string:
>>> int('0x00230008', 16)
2293768
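Use the int or eval function (int is the safer choice for text read from a file, since eval executes whatever expression it is given):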
>>> int('0x002300084', 16)
36700292
>>> eval('0x002300084')
36700292
>>> hex(36700292)
'0x2300084'
The hex, oct and bin functions all take integers and return strings,
while int takes a string (or unicode) and an optional base argument (defaulting to 10) and returns an integer.
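Putting the answers above together, the whole computation could be sketched like this. The function name end_address and the sample line "0x00230008 : (4)" are assumptions; the question only shows that the address is word 0 and the parenthesised length is word 2.

```python
def end_address(line: str) -> str:
    """Compute start + length for a line such as '0x00230008 : (4)'."""
    words = line.split()
    start = int(words[0], 16)           # hex string -> int
    length = int(words[2].strip("()"))  # '(4)' -> 4
    # Format back to hex, zero-padded to 8 digits
    return "0x%08X" % (start + length)

print(end_address("0x00230008 : (4)"))  # 0x0023000C
```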

Integer to Unique String

There's probably someone else who asked a similar question, but I didn't take much time to search for this, so just point me to it if someone's already answered this.
I'm trying to take an integer (or long) and turn it into a string, in a very specific way.
The goal is essentially to split the integer into 8-bit segments, then take each of those segments and get the corresponding ASCII character for that chunk, then glue the chunks together.
This is easy to implement, but I'm not sure I'm going about it in the most efficient way.
>>> def stringify(integer):
        output = ""
        while integer != 0:
            part = integer & 255  # take the low byte on every pass
            output += chr(part)
            integer = integer >> 8
        return output
>>> stringify(10)
'\n'
>>> stringify(10 << 8 | 10)
'\n\n'
>>> stringify(32)
' '
Is there a more efficient way to do this?
Is this built into Python?
EDIT:
Also, as this will be run sequentially in a tight loop, is there some way to streamline it for such use?
>>> for n in xrange(1000): ## EXAMPLE!
        print stringify(n)
...
struct can easily do this for integers up to 64 bits in size. Any larger will require you to carve the number up first.
>>> import struct
>>> struct.pack('>Q', 12345678901234567890)
'\xabT\xa9\x8c\xeb\x1f\n\xd2'
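For integers of any size, a sketch assuming Python 3 can lean on int.to_bytes, which avoids carving the number up by hand. It returns bytes rather than a str, and "little" is chosen here so the low byte comes first, as in the loop from the question.

```python
def stringify(integer: int) -> bytes:
    """Return the bytes of a non-negative integer, least significant byte first."""
    # Number of bytes needed to hold the value; 0 needs none
    length = (integer.bit_length() + 7) // 8
    return integer.to_bytes(length, "little")

print(stringify(10))            # b'\n'
print(stringify(10 << 8 | 10))  # b'\n\n'
print(stringify(32))            # b' '
```

Passing "big" instead gives the byte order that struct.pack('>Q', ...) produces.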
