Python how to shorten a UUID and decode? - python

I'm trying to create a shortened ID for one of my models using the following method:
import string
_char_map = string.ascii_letters + string.digits
def index_to_char(sequence):
    return "".join([_char_map[x] for x in sequence])
def make_short_id(self):
    _id = self.id
    digits = []
    while _id > 0:
        rem = _id % 62
        digits.append(rem)
        _id //= 62
    digits.reverse()
    return index_to_char(digits)
@staticmethod
def decode_id(string):
    i = 0
    for c in string:
        # Base must match the 62-character map used for encoding
        i = i * 62 + _char_map.index(c)
    return i
Where self.id is a uuid i.e. 1c7a2bc6-ca2d-47ab-9808-1820241cf4d4, but I get the following error:
rem = _id % 62
TypeError: not all arguments converted during string formatting
This method only seems to work when the id is an int.
How can I modify the method to shorten a UUID and decode it?
UPDATE:
Thank you for the help. I was trying to find a way to create encode and decode methods that take a string, make it shorter, and then decode it back again. As pointed out, the methods above can never work with a string (UUID).

The % operator is the string formatting or interpolation operator and does not return the remainder in Python when used with strings. It will try to return a formatted string instead.
I'm not sure what your input is, but try converting it using int so you can get the remainder of it.
Edit: I see your input now, not sure why I missed it. Here's one method of converting a UUID to a number:
import uuid
input = "1c7a2bc6-ca2d-47ab-9808-1820241cf4d4"
id = uuid.UUID(input)
id.int
# returns 37852731992078740357317306657835644116L
Not sure what you mean by "shorten", but it looks like you are trying to "base 62 encode" the UUID. If you use the function from this question you will end up with the following:
uuid62 = base62_encode(id.int)
# uuid62 contains 'RJChvUCPWDvJ7BdQKOw7i'
To get the original UUID back:
# Create a UUID again
id = uuid.UUID(int=base62_decode(uuid62))
id.hex
# returns '1c7a2bc6ca2d47ab98081820241cf4d4'
str(id)
# returns '1c7a2bc6-ca2d-47ab-9808-1820241cf4d4'
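Since the linked base62 functions are not reproduced here, the following is a minimal sketch of what they might look like. The names base62_encode and base62_decode match the calls above, but the alphabet ordering (digits first, then letters) is an assumption, so the short string it produces will not necessarily match the one quoted above.

```python
import string
import uuid

# Assumed alphabet ordering; any fixed 62-character ordering works
# as long as encode and decode share it.
ALPHABET = string.digits + string.ascii_letters


def base62_encode(num):
    """Encode a non-negative integer as a base-62 string."""
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, rem = divmod(num, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(text):
    """Decode a base-62 string back into an integer."""
    num = 0
    for ch in text:
        num = num * 62 + ALPHABET.index(ch)
    return num


original = uuid.UUID("1c7a2bc6-ca2d-47ab-9808-1820241cf4d4")
short = base62_encode(original.int)
restored = uuid.UUID(int=base62_decode(short))
```

A 128-bit UUID needs at most 22 base-62 characters, so this saves roughly a third of the 32 hex digits.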

_id is a string, so % is treated as string formatting rather than modulo:
>>> 11 % 2
1
>>> "11" % 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

I would suggest using base64.urlsafe_b64encode() from the standard library, rather than rolling your own base62_encode() function.
You first need to convert your hex string to a binary string:
binary_id = id.replace("-", "").decode("hex")
This binary string can then be encoded using the aforementioned function:
shortened_id = base64.urlsafe_b64encode(binary_id)
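The .decode("hex") call above only exists in Python 2. A rough Python 3 equivalent, assuming you are happy to strip the trailing "=" padding and add it back when decoding, might look like this (shorten_uuid and expand_uuid are hypothetical helper names):

```python
import base64
import uuid


def shorten_uuid(value):
    """Return a 22-character URL-safe string for a UUID string."""
    raw = uuid.UUID(value).bytes  # the 16 raw bytes of the UUID
    encoded = base64.urlsafe_b64encode(raw)
    return encoded.rstrip(b"=").decode("ascii")


def expand_uuid(short):
    """Reverse shorten_uuid and return the canonical UUID string."""
    # Restore the padding that was stripped during shortening.
    padded = short + "=" * (-len(short) % 4)
    return str(uuid.UUID(bytes=base64.urlsafe_b64decode(padded)))


short_id = shorten_uuid("1c7a2bc6-ca2d-47ab-9808-1820241cf4d4")
round_trip = expand_uuid(short_id)
```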

Related

Why doesn't this python regex work?

I wrote a simple class that takes an input zip or postal code and either returns that value or zero-pads it out to five digits if it happens to be all-numeric and less than 5 digits long.
Why doesn't my code work?
import re
class ZipOrPostalCode:
    def __init__(self, data):
        self.rawData = data
    def __repr__(self):
        if re.match(r"^\d{1,4}$", self.rawData):
            return self.rawData.format("%05d")
        else:
            return self.rawData
if __name__ == "__main__":
    z=ZipOrPostalCode("2345")
    print(z)
The output I expect is 02345. It outputs 2345.
Running it in the debugger, it is clear that the regular expression didn't match.
Your regex works; it's the format call that doesn't. You're calling format on the data and passing the format specification as the argument, which is the wrong way round, and the specification uses old-style % syntax.
In str.format, the string object bears the format (using {} style syntax) and the strings/integers/whatever objects to format are passed as parameters.
Replace it with, for instance:
if re.match(r"^\d{1,4}$", self.rawData):
    return "{:05}".format(int(self.rawData))
Without format, you can also use zfill to left-pad with zeroes (faster, since you don't have to convert to an integer):
    return self.rawData.zfill(5)
You probably don't even need to test the number of digits: just zfill no matter what, or only when the zip code is all digits:
def __repr__(self):
    return self.rawData.zfill(5) if self.rawData.isdigit() else self.rawData
You've got your format code backwards.
return "{:05d}".format(int(self.rawData))
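Putting the suggestions together, here is a small runnable sketch of the class. The attribute name raw_data is my own renaming, and the isdigit check is just one way to decide when to pad.

```python
class ZipOrPostalCode:
    def __init__(self, data):
        self.raw_data = data

    def __repr__(self):
        # Pad all-numeric codes to five digits; leave anything else alone.
        if self.raw_data.isdigit():
            return self.raw_data.zfill(5)
        return self.raw_data


padded = repr(ZipOrPostalCode("2345"))
untouched = repr(ZipOrPostalCode("K1A 0B1"))
via_format = "{:05d}".format(int("2345"))
```

Both zfill and the "{:05d}" format produce the same padded result here; zfill just skips the round trip through int.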

Keyerror in multiple key string interpolation in Python

I am having a problem like
In [5]: x = "this string takes two like {one} and {two}"
In [6]: y = x.format(one="one")
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-6-b3c89fbea4d3> in <module>()
----> 1 y = x.format(one="one")
KeyError: 'two'
I have a compound string with many keys that gets kept in a config file. For 8 different queries, they all use the same string, except 1 key is a different setting. I need to be able to substitute a key in that file to save the strings for later like:
"this string takes two like one and {two}"
How do I substitute one key at a time using format?
I think string.Template does what you want:
from string import Template
s = "this string takes two like $one and $two"
s = Template(s).safe_substitute(one=1)
print(s)
# this string takes two like 1 and $two
s = Template(s).safe_substitute(two=2)
print(s)
# this string takes two like 1 and 2
If placeholders in your string don't have any format specifications, in Python 3 you can use str.format_map and provide a mapping, returning the field name for missing fields:
class Default(dict):
    def __missing__(self, key):
        return '{' + key + '}'
In [6]: x = "this string takes two like {one} and {two}"
In [7]: x.format_map(Default(one=1))
Out[7]: 'this string takes two like 1 and {two}'
If you do have format specifications, you'll have to subclass string.Formatter and override some methods, or switch to a different formatting method, like string.Template.
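For the string.Formatter route, one possible sketch is below. PartialFormatter and _Missing are hypothetical names, and this only covers simple cases: a missing field is written back with its conversion and format spec, but nested fields inside a format spec are not handled.

```python
import string


class _Missing:
    """Placeholder for a field that had no value supplied."""

    def __init__(self, name):
        self.name = name
        self.conversion = None


class PartialFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (KeyError, IndexError, AttributeError):
            # Keep the whole field name so it can be re-emitted later.
            return _Missing(field_name), field_name

    def convert_field(self, value, conversion):
        if isinstance(value, _Missing):
            value.conversion = conversion  # remember !r / !s / !a
            return value
        return super().convert_field(value, conversion)

    def format_field(self, value, format_spec):
        if isinstance(value, _Missing):
            text = value.name
            if value.conversion:
                text += "!" + value.conversion
            if format_spec:
                text += ":" + format_spec
            return "{" + text + "}"
        return super().format_field(value, format_spec)


fmt = PartialFormatter()
first_pass = fmt.format("{one} and {two:>5}", one=1)
second_pass = first_pass.format(two=2)
```

After the first pass the {two:>5} field is still intact, so a plain str.format call can fill it in later.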
You can escape the interpolation of {two} by doubling the curly brackets:
x = "this string takes two like {one} and {{two}}"
y = x.format(one=1)
z = y.format(two=2)
print(z) # this string takes two like 1 and 2
A different way to go is template strings:
from string import Template
t = Template('this string takes two like $one and $two')
y = t.safe_substitute(one=1)
print(y) # this string takes two like 1 and $two
z = Template(y).safe_substitute(two=2)
print(z) # this string takes two like 1 and 2
(The other answer suggesting template strings was posted before mine.)
You can replace {two} by {two} to enable further replacement later:
y = x.format(one="one", two="{two}")
This extends easily to multiple replacement passes, but it requires that you give all keys in each iteration.
All great answers, I will start using this Template package soon. Very disappointed in the default behavior here, not understanding why a string template requires passing all the keys each time, if there are 3 keys I can't see a logical reason you can't pass 1 or 2 (but I also don't know how compilers work)
Solved by using %s for the items I'm immediately substituting in the config file, and {key} for the keys I replace later upon execution of the flask server
In [1]: issue = "Python3 string {item} are somewhat defective: %s"
In [2]: preformatted_issue = issue % 'true'
In [3]: preformatted_issue
Out[3]: 'Python3 string {item} are somewhat defective: true'
In [4]: result = preformatted_issue.format(item='templates')
In [5]: result
Out[5]: 'Python3 string templates are somewhat defective: true'

Python 3.3 binary to hex function

def bintohex(path):
    hexvalue = []
    pkmfile = open(path,'rb')
    while True:
        buffhex = pkmfile.read(16)
        bufflen = len(buffhex)
        if bufflen == 0: break
        for i in range(bufflen):
            hexvalue.append("%02X" % (ord(buffhex[i])))
I am making a function that will return a list of hex values of a specific file. However, this function doesn't work properly in Python 3.3. How should I modify this code?
File "D:\pkmfile_web\pkmtohex.py", line 12, in bintohex
    hexvalue.append("%02X" % (ord(buffhex[i])))
TypeError: ord() expected string of length 1, but int found
There's a module for that :-)
>>> import binascii
>>> binascii.hexlify(b'abc')
b'616263'
In Python 3, indexing a bytes object returns the integer value; there is no need to call ord:
hexvalue.append("%02X" % buffhex[i])
Additionally, there is no need to be manually looping over the indices. Just loop over the bytes object. I've also modified it to use format rather than %:
while True:
    buffhex = pkmfile.read(16)
    if not buffhex:
        break
    for byte in buffhex:
        hexvalue.append(format(byte, '02X'))
You may want to even make bintohex a generator. To do that, you could start yielding values:
yield format(byte, '02X')
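As a complete sketch of the generator version (the chunk_size parameter and the with block are my additions, not part of the original function):

```python
def bintohex(path, chunk_size=16):
    """Yield each byte of the file as a two-digit uppercase hex string."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Iterating over bytes in Python 3 gives integers directly.
            for byte in chunk:
                yield format(byte, "02X")
```

If you still need a list, wrap the call: list(bintohex(path)).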

Python Regular Expression TypeError

I am writing my first python program and I am running into a problem with regex. I am using regular expression to search for a specific value in a registry key.
import _winreg
import re
key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE,"Software\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\{26A24AE4-039D-4CA4-87B4-2F83216020FF}")
results=[]
v = re.compile(r"(?i)Java")
try:
    i = 0
    while 1:
        name, value, type = _winreg.EnumValue(key, i)
        if v.search(value):
            results.append((name,value,type))
        i += 1
except WindowsError:
    print
for x in results:
    print "%-50s%-80s%-20s" % x
I am getting the following error:
exceptions.TypeError: expected string or buffer
I can use the "name" variable and my regex works fine. For example if I make the following changes regex doesn't complain:
v = re.compile(r"(?i)DisplayName")
if v.search(name):
Thanks for any help.
The documentation for EnumValue explains that the 3-tuple returned is a string, an object that can be any of the Value Types, then an integer. As the error explained, you must pass in a string or a buffer, so that's why v.search(value) fails.
You should be able to get away with v.search(str(value)) to convert value to a string.
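To see the effect without touching the registry, here is a small sketch in Python 3 syntax. The entries list is made-up data standing in for what EnumValue might return; only the str() conversion is the point.

```python
import re

pattern = re.compile(r"(?i)java")

# Hypothetical (name, value, type) tuples; note the integer value.
entries = [
    ("DisplayName", "Java 7 Update 5", 1),
    ("EstimatedSize", 93242, 4),
    ("Publisher", "Oracle", 1),
]

# str() makes every value searchable, whatever its registry type.
matches = [entry for entry in entries if pattern.search(str(entry[1]))]
```

Without the str() call, the integer in the second tuple would raise the same TypeError as in the question.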

"Function object is unsubscriptable" in basic integer to string mapping function

I'm trying to write a function to return the word string of any number less than 1000.
Every time I run my code at the interactive prompt it appears to work without issue, but when I try to import wordify and run it with a test number higher than 20 it fails with "TypeError: 'function' object is unsubscriptable".
Based on the error message, it seems the issue is when it tries to index numString (for example trying to extract the number 4 out of the test case of n = 24) and the compiler thinks numString is a function instead of a string. Since the first line of the function is me defining numString as a string of the variable n, I'm not really sure why that is.
Any help in getting around this error, or even just help in explaining why I'm seeing it, would be awesome.
def wordify(n):
    # Convert n to a string to parse out ones, tens and hundreds later.
    numString = str(n)
    # N less than 20 is hard-coded.
    if n < 21:
        return numToWordMap(n)
    # N between 21 and 99 parses ones and tens then concatenates.
    elif n < 100:
        onesNum = numString[-1]
        ones = numToWordMap(int(onesNum))
        tensNum = numString[-2]
        tens = numToWordMap(int(tensNum)*10)
        return tens+ones
    else:
        # TODO
        pass
def numToWordMap(num):
    mapping = {
        0:"",
        1:"one",
        2:"two",
        3:"three",
        4:"four",
        5:"five",
        6:"six",
        7:"seven",
        8:"eight",
        9:"nine",
        10:"ten",
        11:"eleven",
        12:"twelve",
        13:"thirteen",
        14:"fourteen",
        15:"fifteen",
        16:"sixteen",
        17:"seventeen",
        18:"eighteen",
        19:"nineteen",
        20:"twenty",
        30:"thirty",
        40:"fourty",
        50:"fifty",
        60:"sixty",
        70:"seventy",
        80:"eighty",
        90:"ninety",
        100:"onehundred",
        200:"twohundred",
        300:"threehundred",
        400:"fourhundred",
        500:"fivehundred",
        600:"sixhundred",
        700:"sevenhundred",
        800:"eighthundred",
        900:"ninehundred",
    }
    return mapping[num]
if __name__ == '__main__':
    pass
The error means that a function was used where there should have been a list, like this:
def foo(): pass
foo[3]
You must have changed some code.
By the way, wordify(40) returned "fourty". I spell it "forty"
And you have no entry for zero
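For what it's worth, here is one way the under-100 case could be written with the spelling fixed and an explicit word for zero. This is only a sketch: ONES and TENS are names I made up, it uses divmod instead of string indexing, and it keeps the question's style of joining words without a space.

```python
ONES = [
    "", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {
    2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
    6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety",
}


def wordify(n):
    """Return the word form of an integer in the range 0..99."""
    if not 0 <= n < 100:
        raise ValueError("only 0..99 is handled in this sketch")
    if n == 0:
        return "zero"
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ONES[ones]
```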
In case someone looks here and has the same problem I had: you can also end up with a function object if the wrong name is returned and that name happens to refer to a function. For example, suppose return_val is a function defined elsewhere and you have a function like this:
def foo():
    my_return_val = 0
    return return_val
my_val = foo()
Then my_val will be a reference to a function object, which is another cause of "TypeError: 'function' object is unsubscriptable" if my_val is treated like a list or array when it really is a function object.
The solution? Simple! Fix the name returned in foo so that it is my_return_val.
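A small self-contained reproduction, assuming a function called return_val exists in the same module (that part is made up for the demo). Note that Python 3 words the message as "not subscriptable" rather than "unsubscriptable".

```python
def return_val():
    """Unrelated function whose name collides with the typo below."""
    return 42


def foo():
    my_return_val = [0]
    return return_val  # typo: hands back the function, not the list


my_val = foo()

try:
    my_val[0]
except TypeError as exc:
    message = str(exc)
```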
