Python allows easy creation of an integer from a string of a given base via
int(str, base).
I want to perform the inverse: creation of a string from an integer,
i.e. I want some function int2base(num, base), such that:
int(int2base(x, b), b) == x
for any number x and base b that int() will accept.
The function name and argument order are unimportant.
This is an easy function to write: in fact it's easier than describing it in this question. However, I feel like I must be missing something.
I know about the functions bin, oct, hex, but I cannot use them for a few reasons:
Those functions are not available in older versions of Python, with which I need compatibility (2.2)
I want a general solution that can be called the same way for different bases
I want to allow bases other than 2, 8, 16
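The round-trip requirement stated above can be turned into a small check. The sketch below is only an illustration: `int2base_sketch` and `roundtrips` are hypothetical names, and the reference implementation assumes the lowercase digit alphabet that int() accepts for bases 2 to 36.

```python
def int2base_sketch(num, base, digits="0123456789abcdefghijklmnopqrstuvwxyz"):
    """Hypothetical reference implementation, used only to exercise the check."""
    if not 2 <= base <= len(digits):
        raise ValueError("base out of range")
    if num == 0:
        return digits[0]
    sign, num = ("-", -num) if num < 0 else ("", num)
    out = []
    while num:
        num, r = divmod(num, base)
        out.append(digits[r])
    return sign + "".join(reversed(out))


def roundtrips(func, values, bases=range(2, 37)):
    """Return True if int(func(x, b), b) == x for every x and b given."""
    return all(int(func(x, b), b) == x for b in bases for x in values)


print(roundtrips(int2base_sketch, [0, 1, -1, 35, 36, 255, -1000, 2**70 + 3]))
```

Any candidate function with the same (number, base) argument order can be passed to `roundtrips` in place of the sketch.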
Related
Python elegant inverse function of int(string, base)
Integer to base-x system using recursion in python
Base 62 conversion in Python
How to convert an integer to the shortest url-safe string in Python?
Surprisingly, people were giving only solutions that convert to small bases (smaller than the length of the English alphabet). There was no attempt to give a solution which converts to any arbitrary base from 2 to infinity.
So here is a super simple solution:
def numberToBase(n, b):
    if n == 0:
        return [0]
    digits = []
    while n:
        digits.append(int(n % b))
        n //= b
    return digits[::-1]
So if you need to convert some very large number to base 577,
numberToBase(67854 ** 15 - 102, 577) will give you a correct solution:
[4, 473, 131, 96, 431, 285, 524, 486, 28, 23, 16, 82, 292, 538, 149, 25, 41, 483, 100, 517, 131, 28, 0, 435, 197, 264, 455],
which you can later convert to any base you want.
At some point you will notice that sometimes there is no built-in library function to do what you want, so you need to write your own. If you disagree, post your own solution with a built-in function that can convert a base-10 number to base 577.
This is due to a lack of understanding of what a number in some base means.
I encourage you to think for a little bit why base in your method works only for n <= 36. Once you are done, it will be obvious why my function returns a list and has the signature it has.
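A list of digits can be folded back into an integer, or mapped onto characters when the base is small enough. The following is a minimal sketch of that idea; `digits_to_number` and `digits_to_string` are hypothetical helper names, not part of the answer above.

```python
def number_to_base(n, b):
    """Digits of non-negative n in base b, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n:
        digits.append(n % b)
        n //= b
    return digits[::-1]


def digits_to_number(digits, b):
    """Inverse of number_to_base (Horner's scheme)."""
    n = 0
    for d in digits:
        n = n * b + d
    return n


def digits_to_string(digits, alphabet):
    """Map each digit to a character; needs len(alphabet) >= base."""
    return "".join(alphabet[d] for d in digits)


big = 67854 ** 15 - 102
print(digits_to_number(number_to_base(big, 577), 577) == big)
print(digits_to_string(number_to_base(255, 16), "0123456789abcdef"))
```

Returning a list keeps the conversion independent of any alphabet, which is why it works for bases of any size.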
If you need compatibility with ancient versions of Python, you can either use gmpy (which does include a fast, completely general int-to-string conversion function, and can be built for such ancient versions – you may need to try older releases since the recent ones have not been tested for venerable Python and GMP releases, only somewhat recent ones), or, for less speed but more convenience, use Python code – e.g., for Python 2, most simply:
import string
digs = string.digits + string.ascii_letters
def int2base(x, base):
    if x < 0:
        sign = -1
    elif x == 0:
        return digs[0]
    else:
        sign = 1
    x *= sign
    digits = []
    while x:
        digits.append(digs[int(x % base)])
        x = int(x / base)
    if sign < 0:
        digits.append('-')
    digits.reverse()
    return ''.join(digits)
For Python 3, int(x / base) leads to incorrect results, and must be changed to x // base:
import string
digs = string.digits + string.ascii_letters
def int2base(x, base):
    if x < 0:
        sign = -1
    elif x == 0:
        return digs[0]
    else:
        sign = 1
    x *= sign
    digits = []
    while x:
        digits.append(digs[x % base])
        x = x // base
    if sign < 0:
        digits.append('-')
    digits.reverse()
    return ''.join(digits)
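The reason int(x / base) goes wrong in Python 3 is that / is true division and returns a float, which carries only 53 bits of precision. A small demonstration, with the value 10**30 chosen arbitrarily for illustration:

```python
x = 10**30
base = 10

via_float = int(x / base)   # true division goes through a 53-bit float
via_floor = x // base       # floor division stays in exact integer arithmetic

print(via_floor == 10**29)      # the exact quotient
print(via_float == via_floor)   # the float route has lost the low digits
```

For small values the two expressions agree, which is why the problem only shows up with large integers.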
"{0:b}".format(100) # bin: 1100100
"{0:x}".format(100) # hex: 64
"{0:o}".format(100) # oct: 144
def baseN(num,b,numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
    return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])
ref:
http://code.activestate.com/recipes/65212/
Please be aware that this may lead to
RuntimeError: maximum recursion depth exceeded in cmp
for very big integers.
>>> numpy.base_repr(10, base=3)
'101'
Note that numpy.base_repr() has a limit of 36 as its base. Otherwise it raises a ValueError.
Recursive
I would simplify the most voted answer to:
BS="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def to_base(n, b):
    return "0" if not n else to_base(n//b, b).lstrip("0") + BS[n%b]
The same caveat applies: very large integers and negative numbers raise RuntimeError: maximum recursion depth exceeded in cmp. (You could use sys.setrecursionlimit(new_limit).)
Iterative
To avoid recursion problems:
BS="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def to_base(s, b):
    res = ""
    while s:
        res+=BS[s%b]
        s//= b
    return res[::-1] or "0"
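The difference between the two variants can be observed directly. The sketch below builds a number with more binary digits than the current recursion limit allows frames; the names `to_base_recursive` and `to_base_iterative` are introduced here only to keep both versions side by side. On Python 3 the error is a RecursionError, which is a subclass of RuntimeError.

```python
import sys

BS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base_recursive(n, b):
    return "0" if not n else to_base_recursive(n // b, b).lstrip("0") + BS[n % b]


def to_base_iterative(n, b):
    res = ""
    while n:
        res += BS[n % b]
        n //= b
    return res[::-1] or "0"


# One recursive call per binary digit, so this needs more frames than allowed.
big = 2 ** (sys.getrecursionlimit() + 100)

try:
    to_base_recursive(big, 2)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

print(recursion_failed)
print(len(to_base_iterative(big, 2)))
```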
Great answers!
I guess the answer to my question was "no": I was not missing some obvious solution.
Here is the function I will use that condenses the good ideas expressed in the answers.
allow caller-supplied mapping of characters (allows base64 encode)
checks for negative and zero
maps complex numbers into tuples of strings
def int2base(x,b,alphabet='0123456789abcdefghijklmnopqrstuvwxyz'):
    'convert an integer to its string representation in a given base'
    if b<2 or b>len(alphabet):
        if b==64: # assume base64 rather than raise error
            alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
        else:
            raise AssertionError("int2base base out of range")
    if isinstance(x,complex): # return a tuple
        return ( int2base(x.real,b,alphabet) , int2base(x.imag,b,alphabet) )
    if x<=0:
        if x==0:
            return alphabet[0]
        else:
            return '-' + int2base(-x,b,alphabet)
    # else x is non-negative real
    rets=''
    while x>0:
        x,idx = divmod(x,b)
        rets = alphabet[idx] + rets
    return rets
You could use baseconv.py from my project: https://github.com/semente/python-baseconv
Sample usage:
>>> from baseconv import BaseConverter
>>> base20 = BaseConverter('0123456789abcdefghij')
>>> base20.encode(1234)
'31e'
>>> base20.decode('31e')
'1234'
>>> base20.encode(-1234)
'-31e'
>>> base20.decode('-31e')
'-1234'
>>> base11 = BaseConverter('0123456789-', sign='$')
>>> base11.encode('$1234')
'$-22'
>>> base11.decode('$-22')
'$1234'
There are some built-in converters, for example baseconv.base2, baseconv.base16 and baseconv.base64.
def base(decimal, base):
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    other_base = ""
    while decimal != 0:
        other_base = digits[decimal % base] + other_base
        decimal = decimal // base  # floor division keeps the value an integer
    if other_base == "":
        other_base = "0"
    return other_base
print(base(31, 16))
output:
1F
def base_conversion(num, base):
    digits = []
    while num > 0:
        num, remainder = divmod(num, base)
        digits.append(remainder)
    return digits[::-1]
http://code.activestate.com/recipes/65212/
def base10toN(num,n):
    """Change num to a base-n number.
    Up to base-36 is supported without special notation."""
    num_rep={10:'a',
             11:'b',
             12:'c',
             13:'d',
             14:'e',
             15:'f',
             16:'g',
             17:'h',
             18:'i',
             19:'j',
             20:'k',
             21:'l',
             22:'m',
             23:'n',
             24:'o',
             25:'p',
             26:'q',
             27:'r',
             28:'s',
             29:'t',
             30:'u',
             31:'v',
             32:'w',
             33:'x',
             34:'y',
             35:'z'}
    new_num_string=''
    current=num
    while current!=0:
        remainder=current%n
        if 36>remainder>9:
            remainder_string=num_rep[remainder]
        elif remainder>=36:
            remainder_string='('+str(remainder)+')'
        else:
            remainder_string=str(remainder)
        new_num_string=remainder_string+new_num_string
        current=current//n  # floor division, so this also works on Python 3
    return new_num_string
Here's another one from the same link
def baseconvert(n, base):
    """convert positive decimal integer n to equivalent in another base (2-36)"""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    try:
        n = int(n)
        base = int(base)
    except:
        return ""
    if n < 0 or base < 2 or base > 36:
        return ""
    s = ""
    while 1:
        r = n % base
        s = digits[r] + s
        n = n // base  # floor division, so this also works on Python 3
        if n == 0:
            break
    return s
I made a pip package for this.
I recommend you use my bases.py https://github.com/kamijoutouma/bases.py which was inspired by bases.js
from bases import Bases
bases = Bases()
bases.toBase16(200) # => 'c8'
bases.toBase(200, 16) # => 'c8'
bases.toBase62(99999) # => 'q0T'
bases.toBase(99999, 62) # => 'q0T'
bases.toAlphabet(300, 'aAbBcC') # => 'Abba'
bases.fromBase16('c8') # => 200
bases.fromBase('c8', 16) # => 200
bases.fromBase62('q0T') # => 99999
bases.fromBase('q0T', 62) # => 99999
bases.fromAlphabet('Abba', 'aAbBcC') # => 300
refer to https://github.com/kamijoutouma/bases.py#known-basesalphabets
for what bases are usable
EDIT:
pip link https://pypi.python.org/pypi/bases.py/0.2.2
def int2base(a, base, numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
    baseit = lambda a=a, b=base: (not a) and numerals[0] or baseit(a-a%b,b*base)+numerals[a%b%(base-1) or (a%b) and (base-1)]
    return baseit()
explanation
In any base, every number is equal to a0 + a1*base + a2*base**2 + a3*base**3 + ... The "mission" is to find all the a's.
For every N = 0, 1, 2, ... the code isolates aN*base**N by taking the modulus by b, where b = base**(N+1), which slices off all the a's with an index bigger than N. The a's with an index smaller than N are already gone, because a is decreased by the current aN*base**N every time the function is called.
base % (base-1) == 1, therefore base**p % (base-1) == 1, and therefore q*base**p % (base-1) == q, with only one exception: when q == base-1 the result is 0.
To fix that case, when the result is 0 the function checks whether the value was 0 from the beginning.
advantages
In this sample there is only one multiplication (instead of a division) and some modulus operations, which take relatively little time.
While the currently top answer is definitely an awesome solution, there remains more customization users might like.
Basencode adds some of these features, including conversion of floating point numbers and customizable digits (in the linked answer, only numbers can be used).
Here's a possible use case:
>>> from basencode import *
>>> n1 = Number(12345)
>>> n1.repr_in_base(64) # convert to base 64
'30V'
>>> Number('30V', 64) # construct Integer from base 64
Integer(12345)
>>> n1.repr_in_base(8)
'30071'
>>> n1.repr_in_octal() # shortcuts
'30071'
>>> n1.repr_in_bin() # equivalent to `n1.repr_in_base(2)`
'11000000111001'
>>> n1.repr_in_base(2, digits=list('-+')) # override default digits: use `-` and `+` in place of `0` and `1`
'++------+++--+'
>>> n1.repr_in_base(33) # yet another base - all bases from 2 to 64 are supported from the start
'bb3'
How would you add any bases you want? Let me replicate the example of the currently most upvoted answer: the digits parameter allows you to override the default digits for bases 2 to 64, and to provide digits for any base higher than that. The mode parameter determines how the representation is returned (as a list or as a string).
>>> n2 = Number(67854 ** 15 - 102)
>>> n2.repr_in_base(577, digits=[str(i) for i in range(577)], mode="l")
['4', '473', '131', '96', '431', '285', '524', '486', '28', '23', '16', '82', '292', '538', '149', '25', '41', '483', '100', '517', '131', '28', '0', '435', '197', '264', '455']
>>> n2.repr_in_base(577, mode="l") # the program remembers the digits for base 577 now
['4', '473', '131', '96', '431', '285', '524', '486', '28', '23', '16', '82', '292', '538', '149', '25', '41', '483', '100', '517', '131', '28', '0', '435', '197', '264', '455']
Operations can be done: the Number class returns an instance of basencode.Integer if the provided number is an Integer, else it returns a basencode.Float
>>> n3 = Number(54321) # the Number class returns an instance of `basencode.Integer` if the provided number is an Integer, otherwise it returns a `basencode.Float`.
>>> n1 + n3
Integer(66666)
>>> n3 - n1
Integer(41976)
>>> n1 * n3
Integer(670592745)
>>> n3 // n1
Integer(4)
>>> n3 / n1 # a basencode.Float class allows conversion of floating point numbers
Float(4.400243013365735)
>>> (n3 / n1).repr_in_base(32)
'4.cpr56v6rnc4oitoblha2r11sus0dheqd4pgechfcjklo74b2bgom7j8ih86mipdvss0068sehi9f3791mdo4uotfujq66cf0jkgo'
>>> n4 = Number(0.5) # returns a basencode.Float
>>> n4.repr_in_bin() # binary version of 0.5
'0.1'
Disclaimer: this project is under active maintenance, and I'm a contributor.
>>> import string
>>> def int2base(integer, base):
...     if not integer: return '0'
...     sign = 1 if integer > 0 else -1
...     alphanum = string.digits + string.ascii_lowercase
...     nums = alphanum[:base]
...     res = ''
...     integer *= sign
...     while integer:
...         integer, mod = divmod(integer, base)
...         res += nums[mod]
...     return ('' if sign == 1 else '-') + res[::-1]
...
>>> int2base(-15645, 23)
'-16d5'
>>> int2base(213, 21)
'a3'
A recursive solution for those interested. Of course, this will not work with negative binary values. You would need to implement Two's Complement.
def generateBase36Alphabet():
    return ''.join([str(i) for i in range(10)]+[chr(i+65) for i in range(26)])
def generateAlphabet(base):
    return generateBase36Alphabet()[:base]
def intToStr(n, base, alphabet):
    def toStr(n, base, alphabet):
        return alphabet[n] if n < base else toStr(n//base,base,alphabet) + alphabet[n%base]
    return ('-' if n < 0 else '') + toStr(abs(n), base, alphabet)
print('{} -> {}'.format(-31, intToStr(-31, 16, generateAlphabet(16)))) # -31 -> -1F
def base_changer(number,base):
    buff=97+abs(base-10)
    dic={};buff2='';buff3=10
    for i in range(97,buff+1):
        dic[buff3]=chr(i)
        buff3+=1
    while(number>=base):
        mod=int(number%base)
        number=int(number//base)
        if (mod) in dic.keys():
            buff2+=dic[mod]
            continue
        buff2+=str(mod)
    if (number) in dic.keys():
        buff2+=dic[number]
    else:
        buff2+=str(number)
    return buff2[::-1]
Here is an example of how to convert a number of any base to another base.
from collections import namedtuple
Test = namedtuple("Test", ["n", "from_base", "to_base", "expected"])
def convert(n: int, from_base: int, to_base: int) -> int:
    digits = []
    while n:
        (n, r) = divmod(n, to_base)
        digits.append(r)
    return sum(from_base ** i * v for i, v in enumerate(digits))
if __name__ == "__main__":
    tests = [
        Test(32, 16, 10, 50),
        Test(32, 20, 10, 62),
        Test(1010, 2, 10, 10),
        Test(8, 10, 8, 10),
        Test(150, 100, 1000, 150),
        Test(1500, 100, 10, 1050000),
    ]
    for test in tests:
        result = convert(*test[:-1])
        assert result == test.expected, f"{test=}, {result=}"
    print("PASSED!!!")
Say we want to convert 14 to base 2. We repeatedly apply the division algorithm until the quotient is 0:
14 = 2 x 7 + 0
7 = 2 x 3 + 1
3 = 2 x 1 + 1
1 = 2 x 0 + 1
The binary representation is just the remainders read from bottom to top. This can be proved by expanding
14 = 2 x 7 = 2 x (2 x 3 + 1) = 2 x (2 x (2 x 1 + 1) + 1) = 2 x (2 x (2 x (2 x 0 + 1) + 1) + 1) = 2^3 + 2^2 + 2
The code is the implementation of the above algorithm.
def toBaseX(n, X):
    strbin = ""
    while n != 0:
        strbin += str(n % X)
        n = n // X
    return strbin[::-1]
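To see the worked example for 14 reproduced step by step, the division rows can be recorded and printed. This is only an illustrative sketch; `division_steps` is a hypothetical helper name.

```python
def division_steps(n, base):
    """Return the (dividend, quotient, remainder) rows of repeated division."""
    steps = []
    while n != 0:
        q, r = divmod(n, base)
        steps.append((n, q, r))
        n = q
    return steps


steps = division_steps(14, 2)
for dividend, q, r in steps:
    print(f"{dividend} = 2 x {q} + {r}")

# Reading the remainders from the last row up to the first gives the digits.
digits = "".join(str(r) for _, _, r in reversed(steps))
print(digits)  # 1110
```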
This is my approach: first convert the number, then cast it to a string.
def to_base(n, base):
    if base == 10:
        return str(n)  # return a string here too, like the other branch
    result = 0
    counter = 0
    while n:
        r = n % base
        n //= base
        result += r * 10**counter
        counter+=1
    return str(result)
I have written this function which I use to encode in different bases. I also provided the way to shift the result by a value 'offset'. This is useful if you'd like to encode to bases above 64, but keeping displayable chars (like a base 95).
I also tried to avoid reversing the output 'list' and tried to minimize computing operations. The array of pow(base) is computed on demand and kept for additional calls to the function.
The output is a binary string
pows = {}
######################################################
def encode_base(value,
                base = 10,
                offset = 0) :
    """
    Encode value into a binary string, according to the desired base.
    Input :
    value : Any positive integer value
    offset : Shift the encoding (eg : Starting at chr(32))
    base : The base in which we'd like to encode the value
    Return : Binary string
    Example : with : offset = 32, base = 64
    100 -> !D
    200 -> #(
    """
    # Determine the number of loops
    try :
        pb = pows[base]
    except KeyError :
        pb = pows[base] = {n : base ** n for n in range(0, 8) if n < 2 ** 48 -1}
    for n in pb :
        if value < pb[n] :
            n -= 1
            break
    out = []
    while n + 1 :
        b = pb[n]
        out.append(chr(offset + value // b))
        n -= 1
        value %= b
    return ''.join(out).encode()
This function converts any integer from any base to any base
def baseconvert(number, srcbase, destbase):
    if srcbase != 10:
        sum = 0
        for _ in range(len(str(number))):
            sum += int(str(number)[_]) * pow(srcbase, len(str(number)) - _ - 1)
        b10 = sum
        return baseconvert(b10, 10, destbase)
    end = ''
    q = number
    while(True):
        r = q % destbase
        q = q // destbase
        end = str(r) + end
        if(q<destbase):
            end = str(q) + end
            return int(end)
The Python code below converts a Python integer to a string in an arbitrary base (from 2 up to infinity) and works in both directions, so all the created strings can be converted back to Python integers by providing a string for N instead of an integer.
The code intentionally works only on positive numbers (in my eyes there is some hassle about negative values and their bit representations that I don't want to dig into). Just pick from this code what you need, want or like, or just have fun learning about the available options. Much is there only for the purpose of documenting the various available approaches (e.g. the Oneliner does not seem to be fast, even though it promises to be).
I like the format for infinitely large bases proposed by Salvador Dali. It is a nice proposal that also reads well for simple binary bit representations. Notice that, for a string formatted with infiniteBase=True, the width=x padding parameter applies to each digit and not to the whole number. It seems that the code handling the infiniteBase digit format even runs a bit faster than the other options; another reason for using it?
I don't like the idea of using Unicode for extending the number of symbols available for digits, so don't look in the code below for it, because it's not there. Use the proposed infiniteBase format instead or store integers as bytes for compression purposes.
def inumToStr( N, base=2, width=1, infiniteBase=False,\
    useNumpy=False, useRecursion=False, useOneliner=False, \
    useGmpy=False, verbose=True):
    ''' Positive numbers only, but works in BOTH directions.
    For strings in infiniteBase notation set for bases <= 62
    infiniteBase=True . Examples of use:
    inumToStr( 17, 2, 1, 1) # [1,0,0,0,1]
    inumToStr( 17, 3, 5) # 00122
    inumToStr(245, 16, 4) # 00F5
    inumToStr(245, 36, 4,0,1) # 006T
    inumToStr(245245245245,36,10,0,1) # 0034NWOQBH
    inumToStr(245245245245,62) # 4JhA3Th
    245245245245 == int(gmpy2.mpz('4JhA3Th',62))
    inumToStr(245245245245,99,2) # [25,78, 5,23,70,44]
    ----------------------------------------------------
    inumToStr( '[1,0,0,0,1]',2, infiniteBase=True ) # 17
    inumToStr( '[25,78, 5,23,70,44]', 99) # 245245245245
    inumToStr( '0034NWOQBH', 36 ) # 245245245245
    inumToStr( '4JhA3Th' , 62 ) # 245245245245
    ----------------------------------------------------
    --- Timings for N = 2**4096, base=36:
    standard: 0.0023
    infinite: 0.0017
    numpy : 0.1277
    recursio; 0.0022
    oneliner: 0.0146
    For N = 2**8192:
    standard: 0.0075
    infinite: 0.0053
    numpy : 0.1369
    max. recursion depth exceeded: recursio/oneliner
    '''
    show = print
    if type(N) is str and ( infiniteBase is True or base > 62 ):
        lstN = eval(N)
        if verbose: show(' converting a non-standard infiniteBase bits string to Python integer')
        return sum( [ item*base**pow for pow, item in enumerate(lstN[::-1]) ] )
    if type(N) is str and base <= 36:
        if verbose: show('base <= 36. Returning Python int(N, base)')
        return int(N, base)
    if type(N) is str and base <= 62:
        if useGmpy:
            if verbose: show(' base <= 62, useGmpy=True, returning int(gmpy2.mpz(N,base))')
            return int(gmpy2.mpz(N,base))
        else:
            if verbose: show(' base <= 62, useGmpy=False, self-calculating return value)')
            lstStrOfDigits="0123456789"+ \
                "abcdefghijklmnopqrstuvwxyz".upper() + \
                "abcdefghijklmnopqrstuvwxyz"
            dictCharToPow = {}
            for index, char in enumerate(lstStrOfDigits):
                dictCharToPow.update({char : index})
            return sum( dictCharToPow[item]*base**pow for pow, item in enumerate(N[::-1]) )
        #:if
    #:if
    if useOneliner and base <= 36:
        if verbose: show(' base <= 36, useOneliner=True, running the Oneliner code')
        d="0123456789abcdefghijklmnopqrstuvwxyz"
        baseit = lambda a=N, b=base: (not a) and d[0] or \
            baseit(a-a%b,b*base)+d[a%b%(base-1) or (a%b) and (base-1)]
        return baseit().rjust(width, d[0])[1:]
    if useRecursion and base <= 36:
        if verbose: show(' base <= 36, useRecursion=True, running recursion algorythm')
        BS="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        def to_base(n, b):
            return "0" if not n else to_base(n//b, b).lstrip("0") + BS[n%b]
        return to_base(N, base).rjust(width,BS[0])
    if base > 62 or infiniteBase:
        if verbose: show(' base > 62 or infiniteBase=True, returning a non-standard digits string')
        # Allows arbitrary large base with 'width=...'
        # applied to each digit (useful also for bits )
        N, digit = divmod(N, base)
        strN = str(digit).rjust(width, ' ')+']'
        while N:
            N, digit = divmod(N, base)
            strN = str(digit).rjust(width, ' ') + ',' + strN
        return '[' + strN
    #:if
    if base == 2:
        if verbose: show(" base = 2, returning Python str(f'{N:0{width}b}')")
        return str(f'{N:0{width}b}')
    if base == 8:
        if verbose: show(" base = 8, returning Python str(f'{N:0{width}o}')")
        return str(f'{N:0{width}o}')
    if base == 16:
        if verbose: show(" base = 16, returning Python str(f'{N:0{width}X}')")
        return str(f'{N:0{width}X}')
    if base <= 36:
        if useNumpy:
            if verbose: show(" base <= 36, useNumpy=True, returning np.base_repr(N, base)")
            import numpy as np
            strN = np.base_repr(N, base)
            return strN.rjust(width, '0')
        else:
            if verbose: show(' base <= 36, useNumpy=False, self-calculating return value)')
            lstStrOfDigits="0123456789"+"abcdefghijklmnopqrstuvwxyz".upper()
            strN = lstStrOfDigits[N % base] # rightmost digit
            while N >= base:
                N //= base # consume already converted digit
                strN = lstStrOfDigits[N % base] + strN # add digits to the left
            #:while
            return strN.rjust(width, lstStrOfDigits[0])
        #:if
    #:if
    if base <= 62:
        if useGmpy:
            if verbose: show(" base <= 62, useGmpy=True, returning gmpy2.digits(N, base)")
            import gmpy2
            strN = gmpy2.digits(N, base)
            return strN.rjust(width, '0')
            # back to Python int from gmpy2.mpz with
            # int(gmpy2.mpz('4JhA3Th',62))
        else:
            if verbose: show(' base <= 62, useGmpy=False, self-calculating return value)')
            lstStrOfDigits= "0123456789" + \
                "abcdefghijklmnopqrstuvwxyz".upper() + \
                "abcdefghijklmnopqrstuvwxyz"
            strN = lstStrOfDigits[N % base] # rightmost digit
            while N >= base:
                N //= base # consume already converted digit
                strN = lstStrOfDigits[N % base] + strN # add digits to the left
            #:while
            return strN.rjust(width, lstStrOfDigits[0])
        #:if
    #:if
#:def
I'm presenting an "unoptimized" solution for bases between 2 and 9:
def to_base(N, base=2):
    N_in_base = ''
    while True:
        N_in_base = str(N % base) + N_in_base
        N //= base
        if N == 0:
            break
    return N_in_base
This solution does not require reversing the final result, but it's actually not optimized. Refer to this answer to see why: https://stackoverflow.com/a/37133870/7896998
Simple base transformation
def int_to_str(x, b):
    s = ""
    while x:
        s = str(x % b) + s
        x //= b
    return s
Example: converting to base 9 with no 0 digit (the digits are 1 to 9)
s = ""
x = int(input())
while x:
    if x % 9 == 0:
        s = "9" + s
        x = x // 9 - 1  # the digit 9 was used, so borrow one from the next position
    else:
        s = str(x % 9) + s
        x = x // 9
print(s)
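A representation without a zero digit is usually called bijective numeration. The base-9 loop above can be generalized to any base k; the following is a minimal sketch under that reading, with `to_bijective` and `from_bijective` as hypothetical names and digits returned as a list of integers from 1 to k.

```python
def to_bijective(n, k):
    """Digits of positive n in bijective base k (each digit is 1..k)."""
    if n <= 0:
        raise ValueError("n must be positive")
    digits = []
    while n > 0:
        n, r = divmod(n - 1, k)   # shift by one so remainders map onto 1..k
        digits.append(r + 1)
    return digits[::-1]


def from_bijective(digits, k):
    """Inverse of to_bijective."""
    n = 0
    for d in digits:
        n = n * k + d
    return n


print(to_bijective(90, 9))   # [9, 9], since 9*9 + 9 == 90
print(to_bijective(10, 9))   # [1, 1], since 1*9 + 1 == 10
```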
def dec_to_radix(input, to_radix=2, power=None):
    if not isinstance(input, int):
        raise TypeError('Not an integer!')
    elif power is None:
        power = 1
    if input == 0:
        return 0
    else:
        remainder = input % to_radix**power
        digit = str(int(remainder/to_radix**(power-1)))
        return int(str(dec_to_radix(input-remainder, to_radix, power+1)) + digit)
def radix_to_dec(input, from_radix):
    if not isinstance(input, int):
        raise TypeError('Not an integer!')
    return sum(int(digit)*(from_radix**power) for power, digit in enumerate(str(input)[::-1]))
def radix_to_radix(input, from_radix=10, to_radix=2, power=None):
    dec = radix_to_dec(input, from_radix)
    return dec_to_radix(dec, to_radix, power)
Another short one (and easier to understand imo):
def int_to_str(n, b, symbols='0123456789abcdefghijklmnopqrstuvwxyz'):
    return (int_to_str(n//b, b, symbols) if n >= b else "") + symbols[n%b]
And with proper exception handling:
def int_to_str(n, b, symbols='0123456789abcdefghijklmnopqrstuvwxyz'):
    try:
        # pass symbols along so a custom alphabet is used at every level
        return (int_to_str(n//b, b, symbols) if n >= b else "") + symbols[n%b]
    except IndexError:
        raise ValueError(
            "The symbols provided are not enough to represent this number in "
            "this base")
Here is a recursive version that handles signed integers and custom digits.
import string
def base_convert(x, base, digits=None):
    """Convert integer `x` from base 10 to base `base` using `digits` characters as digits.
    If `digits` is omitted, it will use decimal digits + lowercase letters + uppercase letters.
    """
    digits = digits or (string.digits + string.ascii_letters)
    assert 2 <= base <= len(digits), "Unsupported base: {}".format(base)
    if x == 0:
        return digits[0]
    sign = '-' if x < 0 else ''
    x = abs(x)
    first_digits = base_convert(x // base, base, digits).lstrip(digits[0])
    return sign + first_digits + digits[x % base]
Strings aren't the only choice for representing numbers: you can use a list of integers to represent the order of each digit. Those can easily be converted to a string.
None of the answers reject base < 2; and most will run very slowly or crash with stack overflows for very large numbers (such as 56789 ** 43210). To avoid such failures, reduce quickly like this:
def n_to_base(n, b):
    if b < 2: raise ValueError("invalid base")
    if abs(n) < b: return [n]
    ret = [y for d in n_to_base(n, b*b) for y in divmod(d, b)]
    return ret[1:] if ret[0] == 0 else ret # remove leading zeros
def base_to_n(v, b):
    h = len(v) // 2
    if h == 0: return v[0]
    return base_to_n(v[:-h], b) * (b**h) + base_to_n(v[-h:], b)
assert ''.join(['0123456789'[x] for x in n_to_base(56789**43210,10)])==str(56789**43210)
Speedwise, n_to_base is comparable with str for large numbers (about 0.3s on my machine), but if you compare against hex you may be surprised (about 0.3ms on my machine, or 1000x faster). The reason is because the large integer is stored in memory in base 256 (bytes). Each byte can simply be converted to a two-character hex string. This alignment only happens for bases that are powers of two, which is why there are special cases for 2,8, and 16 (and base64, ascii, utf16, utf32).
Consider the last digit of a decimal string. How does it relate to the sequence of bytes that forms its integer? Let's label the bytes s[i] with s[0] being the least significant (little endian). Then the last digit is sum(s[i]*(256**i) for i in range(n)) % 10. Well, it happens that 256**i ends with a 6 for i > 0 (6*6=36) so that last digit is (s[0]*5 + sum(s)*6)%10. From this, you can see that the last digit depends on the sum of all the bytes. This nonlocal property is what makes converting to decimal harder.
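The last-digit formula can be checked against Python's own arithmetic. This sketch assumes the little-endian byte layout described above and obtains the bytes with int.to_bytes; `last_decimal_digit_from_bytes` is a hypothetical name.

```python
def last_decimal_digit_from_bytes(n):
    """Last decimal digit of non-negative n, computed from its bytes only."""
    length = (n.bit_length() + 7) // 8 or 1   # at least one byte, even for 0
    s = n.to_bytes(length, "little")          # s[0] is the least significant byte
    return (s[0] * 5 + sum(s) * 6) % 10


for n in (0, 7, 255, 256, 1234567, 56789 ** 4321):
    print(last_decimal_digit_from_bytes(n) == n % 10)
```

Every byte contributes to the result, which illustrates why the digit cannot be read off from a fixed-size piece of the integer.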
import string
def baseConverter(x, b):
    s = ""
    d = string.printable.upper()
    while x > 0:
        s += d[x % b]
        x = x // b  # floor division keeps x an integer on Python 3
    return s[::-1]
I'm stuck on a problem where I have to write a function that converts a denary number into a binary number using the repeated division by two algorithm. The steps include:
The number to be converted is divided by two.
The remainder from the division is the next binary digit. Digits are added to the front of the sequence.
The result is truncated so that the input to the next division by two is always an integer.
The algorithm continues until the result is 0.
Please click the link below to see what the output should be like:
https://i.stack.imgur.com/pifUO.png
def dentobi(user):
denary = user
divide = user / 2
remainder = user % 2
binary = remainder
if user != 0:
print("Denary:", denary)
print("Divide by 2:", divide)
print("Remainder:", remainder)
print("Binary:", binary)
user = int(input("Please enter a number: "))
dentobi(user)
This is what I have done so far, but I'm not getting anywhere.
Can someone explain how I would do this?
The answer provided by @user2390182 is functionally correct except that it returns an empty string when num is zero. However, I have noted on several occasions that divmod() is rather slow. Here are three slightly different techniques and their performance statistics.
import time
# This is the OP's original code edited to allow for num == 0
def binaryx(num):
    b = ""
    while num:
        num, digit = divmod(num, 2)
        b = f"{digit}{b}"
    return b or '0'
# This is my preferred solution
def binaryo(n):
    r = []
    while n > 0:
        r.append('1' if n & 1 else '0')
        n >>= 1
    return ''.join(reversed(r)) or '0'
# This uses techniques suggested by my namesake
def binaryy(n):
    r = ''
    while n > 0:
        r = str(n & 1) + r
        n >>= 1
    return r or '0'
M = 250_000
for func in [binaryx, binaryo, binaryy]:
    s = time.perf_counter()
    for _ in range(M):
        func(987654321)
    e = time.perf_counter()
    print(f'{func.__name__} -> {e-s:.4f}s')
Output:
binaryx -> 1.3817s
binaryo -> 0.9861s
binaryy -> 1.6052s
One way, using divmod to divide by 2 and get the remainder in one step:
def binary(num):
    b = ""
    while num:
        num, digit = divmod(num, 2)
        b = f"{digit}{b}"
    return b
>>> binary(26)
'11010'
This assumes a positive number but can easily be extended to work for 0 and negatives.
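One possible extension for 0 and negatives is sketched below. It assumes a sign-and-magnitude convention (a leading "-" as int() and bin() use), not two's complement; `binary_signed` is a hypothetical name.

```python
def binary_signed(num):
    """Binary string of any integer, with a leading '-' for negatives."""
    if num == 0:
        return "0"
    sign = "-" if num < 0 else ""
    num = abs(num)
    b = ""
    while num:
        num, digit = divmod(num, 2)
        b = f"{digit}{b}"
    return sign + b


print(binary_signed(26))    # 11010
print(binary_signed(0))     # 0
print(binary_signed(-26))   # -11010
```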
Let's say we're trying to multiply 10011 and 1101 (or in arithmetic terms, 19 x 13). We all know that this is the same as adding 10011 to itself 13 times, or vice versa. I found some code at https://www.w3resource.com/python-exercises/challenges/1/python-challenges-1-exercise-31.php that shows how to add two binary numbers. My question is: in general, if we multiply two binary numbers A and B, how do we iterate so that A is added to itself B times? Obviously, in order to do that we have to convert B to a decimal integer first.
def add_binary_nums(x, y):
    max_len = max(len(x), len(y))
    x = x.zfill(max_len)
    y = y.zfill(max_len)
    result = ''
    carry = 0
    for i in range(max_len-1, -1, -1):
        r = carry
        r += 1 if x[i] == '1' else 0
        r += 1 if y[i] == '1' else 0
        result = ('1' if r % 2 == 1 else '0') + result
        carry = 0 if r < 2 else 1
    if carry !=0 : result = '1' + result
    return result.zfill(max_len)
print(add_binary_nums('11', '1'))
You can count up to a number by starting at 0 and adding 1 until you are done. Since you already have defined a binary add, you only need to add the loop:
def binary_range(stop: str):
    """Count `stop` times"""
    current = '0'
    while stop != current:
        yield current
        current = add_binary_nums(current, '1')
This is enough to do something "n times". You can now do "a * b" as "add a to itself b times":
def binary_mul(a: str, b: str):
    """Multiply the binary ``a`` by the binary ``b``"""
    result = '0'
    for _ in binary_range(b):
        result = add_binary_nums(result, a)
    return result
If you don't care about building a binary calculator, use Python to convert binary strings to integers and back. int(bin_string, 2) converts a string such as "01101" to the appropriate integer, and bin(integer) converts it back to a string with a "0b" prefix and no leading zeros, here "0b1101".
For example, a binary multiplication that takes and returns strings looks like this:
def binary_mul(a: str, b: str):
    return bin(int(a, 2) * int(b, 2))[2:]  # strip the '0b' prefix
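Adding a to itself b times takes a number of additions proportional to the value of b. A different technique, shift-and-add (long multiplication), needs only one addition per digit of b and never leaves the string representation. The sketch below is an alternative under that approach, not the method from the answer above; `add_binary` and `binary_mul_shift_add` are hypothetical names.

```python
def add_binary(x, y):
    """Add two binary strings digit by digit, right to left."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry, out = 0, []
    for a, b in zip(reversed(x), reversed(y)):
        total = carry + (a == "1") + (b == "1")
        out.append("1" if total % 2 else "0")
        carry = total // 2
    if carry:
        out.append("1")
    return "".join(reversed(out))


def binary_mul_shift_add(a, b):
    """Multiply binary strings by adding shifted copies of `a`."""
    result = "0"
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            # appending zeros to a binary string multiplies it by 2**shift
            result = add_binary(result, a + "0" * shift)
    return result.lstrip("0") or "0"


print(binary_mul_shift_add("10011", "1101"))  # 11110111 (19 x 13 = 247)
```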
So I was studying recursive functions online, and one question asks me to write a function to add up a number's digits. For example (1023) -> 1 + 0 + 2 + 3 = 6. I used % and // to get rid of a digit each time. However, I don't know how to add them up together. The closest I can get is to print out each digit. Can anyone help me solve it or give me a hint please?
def digitalSum(n):
    if n < 10:
        sum_total = n
        print(sum_total)
    else:
        sum_total = n % 10
        digitalSum((n - (n % 10))//10)
        print(sum_total)
digitalSum(1213)
Your function should return the current digit plus the sum of the rest of the digits:
def digitalSum(n):
    if n < 10: return n
    return n % 10 + digitalSum(n // 10)
print(digitalSum(1213))
For completeness, you can also handle negative numbers:
def digitalSum(n):
    if n < 0: sign = -1
    else: sign = 1
    n = abs(n)
    if n < 10: return sign * n  # keep the sign for single-digit negatives too
    return sign * (n % 10 + digitalSum(n // 10))
print(digitalSum(1213))
A correct version of your function is as follows:
from math import log10
def sum_digits(n, i=None):
    if i is None:
        i = int(log10(abs(n)))
    e = float(10**i)
    a, b = (n / e), (abs(n) % e)
    if i == 0:
        return int(a)
    else:
        return int(a) + sum_digits(b, (i - 1))
print(sum_digits(1234))
print(sum_digits(-1234))
Example:
$ python -i foo.py
10
8
>>>
Updated: Updated to properly (IMHO) cope with negative numbers. e.g: -1234 == -1 + 2 + 3 + 4 == 8
NB: Whilst this answer has been accepted (Thank you) I really think that perreal's answer should have been accepted for simplicity and clarity.
Also note that whilst my solution handles negative numbers and summing their respective digits, perreal clearly points out in our comments that there are at least three different ways to interpret the summing of digits of a negative number.
How does one convert a base-10 floating point number in Python to a base-N floating point number?
Specifically in my case, I would like to convert numbers to base 3 (obtain the representation of floating point numbers in base 3), for calculations with the Cantor set.
After a bit of fiddling, here's what I came up with. I present it to you humbly, keeping in mind Ignacio's warning. Please let me know if you find any flaws. Among other things, I have no reason to believe that the precision argument provides anything more than a vague assurance that the first precision digits are pretty close to correct.
import math
def base3int(x):
    x = int(x)
    exponents = range(int(math.log(x, 3)), -1, -1)
    for e in exponents:
        d = int(x // (3 ** e))
        x -= d * (3 ** e)
        yield d
def base3fraction(x, precision=1000):
    x = x - int(x)
    exponents = range(-1, (-precision - 1) * 2, -1)
    for e in exponents:
        d = int(x // (3 ** e))
        x -= d * (3 ** e)
        yield d
        if x == 0: break
These are iterators returning ints. Let me know if you need string conversion; but I imagine you can handle that.
EDIT: Actually looking at this some more, it seems like a if x == 0: break line after the yield in base3fraction gives you pretty much arbitrary precision. I went ahead and added that. Still, I'm leaving in the precision argument; it makes sense to be able to limit that quantity.
Also, if you want to convert back to decimal fractions, this is what I used to test the above.
sum(d * (3 ** (-i - 1)) for i, d in enumerate(base3fraction(x)))
Update
For some reason I've felt inspired by this problem. Here's a much more generalized solution. This returns two generators that generate sequences of integers representing the integral and fractional part of a given number in an arbitrary base. Note that this only returns two generators to distinguish between the parts of the number; the algorithm for generating digits is the same in both cases.
import itertools
import math
def convert_base(x, base=3, precision=None):
    length_of_int = int(math.log(x, base))
    iexps = range(length_of_int, -1, -1)
    if precision is None: fexps = itertools.count(-1, -1)
    else: fexps = range(-1, -int(precision + 1), -1)
    def cbgen(x, base, exponents):
        for e in exponents:
            d = int(x // (base ** e))
            x -= d * (base ** e)
            yield d
            if x == 0 and e < 0: break
    return cbgen(int(x), base, iexps), cbgen(x - int(x), base, fexps)
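Since the generators above work on floats, the later digits are only as good as the float itself. If the input can be given as an exact rational, a variant built on fractions.Fraction sidesteps that. The following is a rough sketch under that assumption, not a drop-in replacement; the name fraction_digits and its parameters are made up for illustration, and it only handles non-negative input.

```python
from fractions import Fraction

def fraction_digits(x, base=3, precision=20):
    """Return (integer_digits, fractional_digits) of x in `base`.

    Hypothetical helper: x is converted to an exact Fraction, so pass an int,
    a Fraction, or a string such as "1/4" to avoid float rounding on input.
    """
    x = Fraction(x)
    if x < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    whole, frac = divmod(x, 1)
    whole = int(whole)
    # integer part: repeated division by the base, least significant digit first
    int_digits = []
    while whole:
        whole, d = divmod(whole, base)
        int_digits.append(d)
    int_digits = int_digits[::-1] or [0]
    # fractional part: repeated multiplication by the base
    frac_digits = []
    for _ in range(precision):
        if frac == 0:
            break
        frac *= base
        d = int(frac)  # frac >= 0, so int() is the floor
        frac_digits.append(d)
        frac -= d
    return int_digits, frac_digits
```

For the Cantor set use case, a number in [0, 1] with a terminating base-3 expansion can then be checked by looking for the digit 1 in the fractional digits.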
Although 8 years have passed, I think it is worthwhile to mention a more compact solution.
def baseConversion( x=1, base=3, decimals=2 ):
import math
n_digits = math.floor(-math.log(x, base))#-no. of digits in front of decimal point
x_newBase = 0#initialize
for i in range( n_digits, decimals+1 ):
x_newBase = x_newBase + int(x*base**i) % base * 10**(-i)
return x_newBase
For example, calling the function to convert the number 5+1/9+1/27:
baseConversion(x=5+1/9+1/27, base=3, decimals=2)
12.01
baseConversion(x=5+1/9+1/27, base=3, decimals=3)
12.011
You may try this solution to evaluate a string that holds a number with a fractional part, written in a given base, as a float.
def eval_strint(s, base=2):
assert type(s) is str
assert 2 <= base <= 36
return int(s,base)
def is_valid_strfrac(s, base=2):
return all([is_valid_strdigit(c, base) for c in s if c != '.']) \
and (len([c for c in s if c == '.']) <= 1)
def eval_strfrac(s, base=2):
assert is_valid_strfrac(s, base), "'{}' contains invalid digits for a base-{} number.".format(s, base)
stg = s.split(".")
float_point=0.0
if len(stg) > 1:
float_point = (eval_strint(stg[1],base) * (base**(-len(stg[1]))))
stg_float = eval_strint(stg[0],base) + float_point
return stg_float
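The helper is_valid_strdigit is used above but not shown. One possible definition, purely an assumption about what was intended, checks a single character against the first `base` symbols of the usual 0-9, a-z digit alphabet that int(s, base) accepts:

```python
DIGIT_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def is_valid_strdigit(c, base=2):
    """Return True if c is a single valid digit in `base` (2..36).

    Assumed definition; the original helper is not shown in the answer.
    """
    assert 2 <= base <= 36
    # case-insensitive, like int(s, base)
    return len(c) == 1 and c.lower() in DIGIT_ALPHABET[:base]
```

With this in place, eval_strfrac("101.01", 2) from the code above would be expected to evaluate to 5.25.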
Suppose you take the strings 'a' and 'z' and list all the strings that come between them in alphabetical order: ['a','b','c' ... 'x','y','z']. Take the midpoint of this list and you find 'm'. So this is kind of like taking an average of those two strings.
You could extend it to strings with more than one character, for example the midpoint between 'aa' and 'zz' would be found in the middle of the list ['aa', 'ab', 'ac' ... 'zx', 'zy', 'zz'].
Might there be a Python method somewhere that does this? If not, even knowing the name of the algorithm would help.
I began making my own routine that simply goes through both strings and finds midpoint of the first differing letter, which seemed to work great in that 'aa' and 'az' midpoint was 'am', but then it fails on 'cat', 'doggie' midpoint which it thinks is 'c'. I tried Googling for "binary search string midpoint" etc. but without knowing the name of what I am trying to do here I had little luck.
I added my own solution as an answer
If you define an alphabet of characters, you can just convert each string to an integer, take the average, and convert back to base N, where N is the size of the alphabet.
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def enbase(x):
n = len(alphabet)
if x < n:
return alphabet[x]
return enbase(x // n) + alphabet[x % n]
def debase(x):
n = len(alphabet)
result = 0
for i, c in enumerate(reversed(x)):
result += alphabet.index(c) * (n**i)
return result
def average(a, b):
a = debase(a)
b = debase(b)
return enbase((a + b) // 2)
print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('cat', 'doggie') #budeel
print average('google', 'microsoft') #gebmbqkil
print average('microsoft', 'google') #gebmbqkil
Edit: Based on comments and other answers, you might want to handle strings of different lengths by appending the first letter of the alphabet to the shorter word until they're the same length. This will result in the "average" falling between the two inputs in a lexicographical sort. Code changes and new outputs below.
def pad(x, n):
p = alphabet[0] * (n - len(x))
return '%s%s' % (x, p)
def average(a, b):
n = max(len(a), len(b))
a = debase(pad(a, n))
b = debase(pad(b, n))
return enbase((a + b) // 2)
print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('aa', 'az') #m (the leading 'a' is dropped, i.e. am)
print average('cat', 'doggie') #cumqec
print average('google', 'microsoft') #jlilzyhcw
print average('microsoft', 'google') #jlilzyhcw
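One wrinkle in the padded version is that enbase drops leading 'a's (they act as leading zeros), so the midpoint of 'aa' and 'az' prints as 'm' rather than 'am'. A sketch that keeps the width, written with explicit integer division so it also runs on Python 3, might look like this. It reuses the names from the answer, but the width parameter is an addition of this sketch, not part of the original code.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def debase(s):
    """Interpret s as a base-26 integer with a=0 ... z=25."""
    n = len(ALPHABET)
    value = 0
    for ch in s:
        value = value * n + ALPHABET.index(ch)
    return value

def enbase(x, width=1):
    """Convert a non-negative integer back to letters, left-padded to `width`."""
    n = len(ALPHABET)
    digits = []
    while x:
        x, d = divmod(x, n)
        digits.append(ALPHABET[d])
    # pad on the left with the zero digit so leading 'a's are not lost
    return "".join(reversed(digits)).rjust(width, ALPHABET[0])

def average(a, b):
    width = max(len(a), len(b))
    # pad the shorter word on the right with the zero digit, as in the edit above
    a = debase(a.ljust(width, ALPHABET[0]))
    b = debase(b.ljust(width, ALPHABET[0]))
    return enbase((a + b) // 2, width)
```

With this variant average('aa', 'az') gives 'am', which sorts between the two inputs.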
If you mean alphabetically, simply use FogleBird's algorithm but reverse the parameters and the result!
>>> print average('cat'[::-1], 'doggie'[::-1])[::-1]
cumdec
or rewriting average like so
>>> def average(a, b):
... a = debase(a[::-1])
... b = debase(b[::-1])
... return enbase((a + b) / 2)[::-1]
...
>>> print average('cat', 'doggie')
cumdec
>>> print average('google', 'microsoft')
jlvymlupj
>>> print average('microsoft', 'google')
jlvymlupj
It sounds like what you want is to treat alphabetical characters as a base-26 value between 0 and 1. When you have strings of different lengths (an example in base 10), say 305 and 4202, you're coming out with a midpoint of 3, since you're looking at the characters one at a time. Instead, treat them as a floating point mantissa: 0.305 and 0.4202. From that, it's easy to come up with a midpoint of 0.3626 (you can round if you'd like).
Do the same with base 26 (a=0...z=25, ba=26, bb=27, etc.) to do the calculations for letters:
cat becomes 'a.cat' and doggie becomes 'a.doggie', doing the math gives cat a decimal value of 0.078004096, doggie a value of 0.136390697, with an average of 0.107197397 which in base 26 is roughly "cumcqo"
Based on your proposed usage, consistent hashing ( http://en.wikipedia.org/wiki/Consistent_hashing ) seems to make more sense.
Thanks to everyone who answered, but I ended up writing my own solution because the others weren't exactly what I needed. I am trying to average app engine key names, and after studying them a bit more I discovered they actually allow any 7-bit ASCII characters in the names. Additionally, I couldn't really rely on the solutions that first converted the key names to floating point, because I suspected floating point accuracy just isn't enough.
To take an average, first you add two numbers together and then divide by two. These are both such simple operations that I decided to just make functions to add and divide base 128 numbers represented as lists. This solution hasn't been used in my system yet so I might still find some bugs in it. Also it could probably be a lot shorter, but this is just something I needed to get done instead of trying to make it perfect.
# Given two lists representing a number with one digit left to decimal point and the
# rest after it, for example 1.555 = [1,5,5,5] and 0.235 = [0,2,3,5], returns a similar
# list representing those two numbers added together.
#
def ladd(a, b, base=128):
i = max(len(a), len(b))
lsum = [0] * i
while i > 1:
i -= 1
av = bv = 0
if i < len(a): av = a[i]
if i < len(b): bv = b[i]
lsum[i] += av + bv
if lsum[i] >= base:
lsum[i] -= base
lsum[i-1] += 1
return lsum
# Given a list of digits after the decimal point, returns a new list of digits
# representing that number divided by two.
#
def ldiv2(vals, base=128):
vs = vals[:]
vs.append(0)
i = len(vs)
while i > 0:
i -= 1
if (vs[i] % 2) == 1:
vs[i] -= 1
vs[i+1] += base / 2
vs[i] = vs[i] / 2
if vs[-1] == 0: vs = vs[0:-1]
return vs
# Given two app engine key names, returns the key name that comes between them.
#
def average(a_kn, b_kn):
m = lambda x:ord(x)
a = [0] + map(m, a_kn)
b = [0] + map(m, b_kn)
avg = ldiv2(ladd(a, b))
return "".join(map(lambda x:chr(x), avg[1:]))
print average('a', 'z') # m#
print average('aa', 'zz') # n-#
print average('aa', 'az') # am#
print average('cat', 'doggie') # d(mstr#
print average('google', 'microsoft') # jlim.,7s:
print average('microsoft', 'google') # jlim.,7s:
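Because Python integers have arbitrary precision, the same add-then-halve idea can also be sketched without digit lists: scale both names to a common number of base-128 digits (plus one extra digit so that halving is exact), average the integers, and convert back. This is only an illustration under the 7-bit ASCII assumption stated above; the name key_midpoint is made up and the code is not the original implementation.

```python
def key_midpoint(a, b, base=128):
    """Return the string halfway between a and b, read as base-128 fractions.

    Hypothetical sketch; assumes every character code is below `base`.
    """
    n = max(len(a), len(b)) + 1  # one spare digit keeps the halving exact

    def to_int(s):
        value = 0
        for ch in s:
            if ord(ch) >= base:
                raise ValueError("character out of range: %r" % ch)
            value = value * base + ord(ch)
        # right-pad with zero digits up to n digits
        return value * base ** (n - len(s))

    mid = (to_int(a) + to_int(b)) // 2
    digits = []
    for _ in range(n):
        mid, d = divmod(mid, base)
        digits.append(d)
    digits.reverse()
    # trailing zero digits carry no information
    while digits and digits[-1] == 0:
        digits.pop()
    return "".join(chr(d) for d in digits)
```

When the sum at the last position is odd, the result ends in chr(64), which is '@', since 64 is half of the base.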
import math
def avg(str1,str2):
y = ''
s = 'abcdefghijklmnopqrstuvwxyz'
for i in range(len(str1)):
x = s.index(str2[i])+s.index(str1[i])
x = math.floor(x/2)
y += s[x]
return y
print(avg('z','a')) # m
print(avg('aa','az')) # am
print(avg('cat','dog')) # chm
Still working on strings with different lengths... any ideas?
This version thinks 'abc' is a fraction like 0.abc. In this approach space is zero and a valid input/output.
MAX_ITER = 10
letters = " abcdefghijklmnopqrstuvwxyz"
def to_double(name):
d = 0
for i, ch in enumerate(name):
idx = letters.index(ch)
d += idx * len(letters) ** (-i - 1)
return d
def from_double(d):
name = ""
for i in range(MAX_ITER):
d *= len(letters)
name += letters[int(d)]
d -= int(d)
return name
def avg(w1, w2):
w1 = to_double(w1)
w2 = to_double(w2)
return from_double((w1 + w2) * 0.5)
print avg('a', 'a') # 'a'
print avg('a', 'aa') # 'a mmmmmmmm'
print avg('aa', 'aa') # 'a zzzzzzzz'
print avg('car', 'duck') # 'cxxemmmmmm'
Unfortunately, the naïve algorithm is not able to detect the periodic 'z's; this is like 0.99999... in decimal. Therefore 'a zzzzzzzz' is actually 'aa' (the space before the run of 'z's must be increased by one).
In order to normalise this, you can use the following function
def remove_z_period(name):
if len(name) != MAX_ITER:
return name
if name[-1] != 'z':
return name
n = ""
overflow = True
for ch in reversed(name):
if overflow:
if ch == 'z':
ch = ' '
else:
ch=letters[(letters.index(ch)+1)]
overflow = False
n = ch + n
return n
print remove_z_period('a zzzzzzzz') # 'aa'
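The periodic 'z's come from float rounding, so another option is to do the arithmetic exactly. Below is a small sketch of the same fraction idea using fractions.Fraction instead of floats; the function names are made up for illustration, and the alphabet (with space as digit zero) is the one used above.

```python
from fractions import Fraction

LETTERS = " abcdefghijklmnopqrstuvwxyz"  # space is the zero digit, as above

def to_fraction(name):
    """Read name as the exact base-27 fraction 0.name."""
    base = len(LETTERS)
    return sum(
        (Fraction(LETTERS.index(ch), base ** (i + 1)) for i, ch in enumerate(name)),
        Fraction(0),
    )

def from_fraction(q, max_digits=10):
    """Emit up to max_digits base-27 digits of q, stopping once q is used up."""
    base = len(LETTERS)
    out = []
    for _ in range(max_digits):
        if q == 0:
            break
        q *= base
        d = int(q)  # 0 <= q < base here, so d is a valid digit index
        out.append(LETTERS[d])
        q -= d
    return "".join(out)

def avg_exact(w1, w2):
    return from_fraction((to_fraction(w1) + to_fraction(w2)) / 2)
```

With exact arithmetic, avg_exact('aa', 'aa') comes out as 'aa' directly, so no normalisation pass is needed for that case.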
I haven't programmed in python in a while and this seemed interesting enough to try.
Bear with my recursive programming. Too many functional languages look like python.
def stravg_half(a, ln):
# If you have a problem it will probably be in here.
# The floor of the character's value is 0, but you may want something different
f = 0
#f = ord('a')
L = ln - 1
if 0 == L:
return ''
A = ord(a[0])
return chr(A/2) + stravg_half( a[1:], L)
def stravg_helper(a, b, ln, x):
L = ln - 1
A = ord(a[0])
B = ord(b[0])
D = (A + B)/2
if 0 == L:
if 0 == x:
return chr(D)
# NOTE: The caller of helper makes sure that len(a)>=len(b)
return chr(D) + stravg_half(a[1:], x)
return chr(D) + stravg_helper(a[1:], b[1:], L, x)
def stravg(a, b):
la = len(a)
lb = len(b)
if 0 == la:
if 0 == lb:
return a # which is empty
return stravg_half(b, lb)
if 0 == lb:
return stravg_half(a, la)
x = la - lb
if x > 0:
return stravg_helper(a, b, lb, x)
return stravg_helper(b, a, la, -x) # Note the order of the args