Average of two strings in alphabetical/lexicographical order - python

Suppose you take the strings 'a' and 'z' and list all the strings that come between them in alphabetical order: ['a','b','c' ... 'x','y','z']. Take the midpoint of this list and you find 'm'. So this is kind of like taking an average of those two strings.
You could extend it to strings with more than one character, for example the midpoint between 'aa' and 'zz' would be found in the middle of the list ['aa', 'ab', 'ac' ... 'zx', 'zy', 'zz'].
Might there be a Python method somewhere that does this? If not, even knowing the name of the algorithm would help.
I began writing my own routine that simply walks through both strings and takes the midpoint of the first differing letter. That seemed to work well, in that the midpoint of 'aa' and 'az' was 'am', but it fails on 'cat' and 'doggie', whose midpoint it thinks is 'c'. I tried Googling for "binary search string midpoint" etc., but without knowing the name of what I am trying to do I had little luck.
I added my own solution as an answer.

If you define an alphabet of characters, you can treat each string as a base-N number (where N is the size of the alphabet), convert it to an integer, take the average, and convert the result back to base N.
alphabet = 'abcdefghijklmnopqrstuvwxyz'

def enbase(x):
    n = len(alphabet)
    if x < n:
        return alphabet[x]
    return enbase(x/n) + alphabet[x%n]

def debase(x):
    n = len(alphabet)
    result = 0
    for i, c in enumerate(reversed(x)):
        result += alphabet.index(c) * (n**i)
    return result

def average(a, b):
    a = debase(a)
    b = debase(b)
    return enbase((a + b) / 2)

print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('cat', 'doggie') #budeel
print average('google', 'microsoft') #gebmbqkil
print average('microsoft', 'google') #gebmbqkil
Edit: Based on comments and other answers, you might want to handle strings of different lengths by appending the first letter of the alphabet to the shorter word until they're the same length. This will result in the "average" falling between the two inputs in a lexicographical sort. Code changes and new outputs below.
def pad(x, n):
    p = alphabet[0] * (n - len(x))
    return '%s%s' % (x, p)

def average(a, b):
    n = max(len(a), len(b))
    a = debase(pad(a, n))
    b = debase(pad(b, n))
    return enbase((a + b) / 2)

print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('aa', 'az') #m (enbase drops the leading 'a'; the value is am)
print average('cat', 'doggie') #cumqec
print average('google', 'microsoft') #jlilzyhcw
print average('microsoft', 'google') #jlilzyhcw

If you mean alphabetically, simply use FogleBird's algorithm but reverse the parameters and the result!
>>> print average('cat'[::-1], 'doggie'[::-1])[::-1]
cumdec
or rewrite average like so:
>>> def average(a, b):
...     a = debase(a[::-1])
...     b = debase(b[::-1])
...     return enbase((a + b) / 2)[::-1]
...
>>> print average('cat', 'doggie')
cumdec
>>> print average('google', 'microsoft')
jlvymlupj
>>> print average('microsoft', 'google')
jlvymlupj

It sounds like what you want is to treat alphabetical characters as a base-26 value between 0 and 1. When you have strings of different lengths (an example in base 10), say 305 and 4202, you're coming out with a midpoint of 3, since you're looking at the characters one at a time. Instead, treat them as a floating point mantissa: 0.305 and 0.4202. From that, it's easy to come up with a midpoint of 0.3626 (you can round if you'd like).
Do the same with base 26 (a=0...z=25, ba=26, bb=27, etc.) to do the calculations for letters:
cat becomes 'a.cat' and doggie becomes 'a.doggie'. Doing the math gives cat a decimal value of 0.078004096 and doggie a value of 0.136390697, with an average of 0.107197397, which in base 26 is roughly "cumcqo".
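The fraction idea above can be sketched with exact rational arithmetic so that no floating point rounding creeps in. This is only an illustration under stated assumptions: the names ALPHABET, to_fraction, from_fraction and midpoint are hypothetical, the alphabet is assumed to be lowercase a-z with a=0, and trailing 'a' characters are stripped because they act like trailing zeros.

```python
from fractions import Fraction

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'   # assumed alphabet, a=0 ... z=25
BASE = len(ALPHABET)

def to_fraction(word):
    """Read word as the base-26 fraction 0.word."""
    value = Fraction(0)
    for i, ch in enumerate(word, start=1):
        value += Fraction(ALPHABET.index(ch), BASE ** i)
    return value

def from_fraction(value, digits):
    """Emit the first `digits` base-26 digits of a fraction in [0, 1)."""
    out = []
    for _ in range(digits):
        value *= BASE
        digit = int(value)          # integer part is the next digit
        out.append(ALPHABET[digit])
        value -= digit
    # Trailing 'a's are trailing zeros, so they can be dropped.
    return ''.join(out).rstrip('a') or 'a'

def midpoint(a, b):
    # Halving a number with n base-26 digits needs at most one extra digit.
    digits = max(len(a), len(b)) + 1
    return from_fraction((to_fraction(a) + to_fraction(b)) / 2, digits)

print(midpoint('a', 'z'))         # mn
print(midpoint('cat', 'doggie'))  # sorts between 'cat' and 'doggie'
```

Because the midpoint is computed exactly, the result always sorts between the two inputs when they differ in value.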

Based on your proposed usage, consistent hashing ( http://en.wikipedia.org/wiki/Consistent_hashing ) seems to make more sense.

Thanks to everyone who answered, but I ended up writing my own solution because the others weren't exactly what I needed. I am trying to average App Engine key names, and after studying them a bit more I discovered they actually allow any 7-bit ASCII characters in the names. Additionally, I couldn't really rely on the solutions that first convert the key names to floating point, because I suspected floating point accuracy just isn't enough.
To take an average, first you add two numbers together and then divide by two. These are both such simple operations that I decided to just make functions to add and divide base 128 numbers represented as lists. This solution hasn't been used in my system yet so I might still find some bugs in it. Also it could probably be a lot shorter, but this is just something I needed to get done instead of trying to make it perfect.
# Given two lists representing a number with one digit left to decimal point and the
# rest after it, for example 1.555 = [1,5,5,5] and 0.235 = [0,2,3,5], returns a similar
# list representing those two numbers added together.
#
def ladd(a, b, base=128):
    i = max(len(a), len(b))
    lsum = [0] * i
    while i > 1:
        i -= 1
        av = bv = 0
        if i < len(a): av = a[i]
        if i < len(b): bv = b[i]
        lsum[i] += av + bv
        if lsum[i] >= base:
            lsum[i] -= base
            lsum[i-1] += 1
    return lsum

# Given a list of digits after the decimal point, returns a new list of digits
# representing that number divided by two.
#
def ldiv2(vals, base=128):
    vs = vals[:]
    vs.append(0)
    i = len(vs)
    while i > 0:
        i -= 1
        if (vs[i] % 2) == 1:
            vs[i] -= 1
            vs[i+1] += base / 2
        vs[i] = vs[i] / 2
    if vs[-1] == 0: vs = vs[0:-1]
    return vs

# Given two app engine key names, returns the key name that comes between them.
#
def average(a_kn, b_kn):
    m = lambda x:ord(x)
    a = [0] + map(m, a_kn)
    b = [0] + map(m, b_kn)
    avg = ldiv2(ladd(a, b))
    return "".join(map(lambda x:chr(x), avg[1:]))

print average('a', 'z') # m#
print average('aa', 'zz') # n-#
print average('aa', 'az') # am#
print average('cat', 'doggie') # d(mstr#
print average('google', 'microsoft') # jlim.,7s:
print average('microsoft', 'google') # jlim.,7s:
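Since Python integers have arbitrary precision, the same base-128 averaging can also be sketched without digit lists or floats. This is a hedged alternative, not the code used in the answer above: the helper names to_int, from_int and average_key are hypothetical, it targets Python 3, and it assumes that key names contain only 7-bit ASCII and that padding with NUL characters on the right is acceptable.

```python
def to_int(name, width, base=128):
    """Interpret name, right-padded with NULs to `width`, as a base-128 integer."""
    value = 0
    for ch in name.ljust(width, '\0'):
        value = value * base + ord(ch)
    return value

def from_int(value, width, base=128):
    """Convert a base-128 integer back to a string of `width` characters."""
    chars = []
    for _ in range(width):
        value, digit = divmod(value, base)
        chars.append(chr(digit))
    # NUL padding at the end carries no information, so strip it.
    return ''.join(reversed(chars)).rstrip('\0')

def average_key(a, b):
    # One extra digit guarantees that halving the sum loses nothing.
    width = max(len(a), len(b)) + 1
    return from_int((to_int(a, width) + to_int(b, width)) // 2, width)

print(average_key('a', 'z'))                     # m@
print('cat' < average_key('cat', 'doggie') < 'doggie')
```

The extra digit plays the same role as the `vs.append(0)` step in ldiv2: it leaves room for the half that appears when the sum is odd.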

import math

def avg(str1,str2):
    y = ''
    s = 'abcdefghijklmnopqrstuvwxyz'
    for i in range(len(str1)):
        x = s.index(str2[i])+s.index(str1[i])
        x = math.floor(x/2)
        y += s[x]
    return y

print(avg('z','a')) # m
print(avg('aa','az')) # am
print(avg('cat','dog')) # chm
Still working on strings with different lengths... any ideas?

This version thinks 'abc' is a fraction like 0.abc. In this approach space is zero and a valid input/output.
MAX_ITER = 10
letters = " abcdefghijklmnopqrstuvwxyz"

def to_double(name):
    d = 0
    for i, ch in enumerate(name):
        idx = letters.index(ch)
        d += idx * len(letters) ** (-i - 1)
    return d

def from_double(d):
    name = ""
    for i in range(MAX_ITER):
        d *= len(letters)
        name += letters[int(d)]
        d -= int(d)
    return name

def avg(w1, w2):
    w1 = to_double(w1)
    w2 = to_double(w2)
    return from_double((w1 + w2) * 0.5)

print avg('a', 'a') # 'a'
print avg('a', 'aa') # 'a mmmmmmmm'
print avg('aa', 'aa') # 'a zzzzzzzz'
print avg('car', 'duck') # 'cxxemmmmmm'
Unfortunately, the naïve algorithm is not able to detect the periodic 'z's. This is like 0.99999 in decimal, so 'a zzzzzzzz' is actually 'aa' (the character before the run of 'z's, here the space, must be increased by one).
To normalise this, you can use the following function:
def remove_z_period(name):
    if len(name) != MAX_ITER:
        return name
    if name[-1] != 'z':
        return name
    n = ""
    overflow = True
    for ch in reversed(name):
        if overflow:
            if ch == 'z':
                ch = ' '
            else:
                ch=letters[(letters.index(ch)+1)]
                overflow = False
        n = ch + n
    return n

print remove_z_period('a zzzzzzzz') # 'aa'

I haven't programmed in python in a while and this seemed interesting enough to try.
Bear with my recursive programming. Too many functional languages look like python.
def stravg_half(a, ln):
    # If you have a problem it will probably be in here.
    # The floor of the character's value is 0, but you may want something different
    f = 0
    #f = ord('a')
    L = ln - 1
    if 0 == L:
        return ''
    A = ord(a[0])
    return chr(A/2) + stravg_half( a[1:], L)

def stravg_helper(a, b, ln, x):
    L = ln - 1
    A = ord(a[0])
    B = ord(b[0])
    D = (A + B)/2
    if 0 == L:
        if 0 == x:
            return chr(D)
        # NOTE: The caller of helper makes sure that len(a)>=len(b)
        return chr(D) + stravg_half(a[1:], x)
    return chr(D) + stravg_helper(a[1:], b[1:], L, x)

def stravg(a, b):
    la = len(a)
    lb = len(b)
    if 0 == la:
        if 0 == lb:
            return a # which is empty
        return stravg_half(b, lb)
    if 0 == lb:
        return stravg_half(a, la)
    x = la - lb
    if x > 0:
        return stravg_helper(a, b, lb, x)
    return stravg_helper(b, a, la, -x) # Note the order of the args

Related

Trying to convert an integer into an ACGT DNA sequence

I am trying to reverse my stringtobin function so that bintostring([3]) returns "AAAT", where A=0, C=1, G=2, T=3. For example, stringtobin("CCCC") returns 85 because (1 * 64) + (1 * 16) + (1 * 4) + (1 * 1) = 85. My bintostring function currently just returns an empty string.
dna = {'A':0, 'C':1, 'G':2, 'T':3}
dna2 = {0:'A', 1:'C', 2:'G', 3:'T'}

def bintostring(num):
    seq = []
    nums = [64,16,4,1]
    #main while
    i = 0
    while i<len(num):
        #nums while (iterate through nums)
        k = 0
        while k<len(nums):
            #dna2 while (iterate through dna2)
            x = 0
            while x<len(dna2):
                check = 0
                if num[i]//nums[k] == dna2[x]:
                    seq.append(dna2[x])
                    check+=1
                elif check>0:
                    seq.append('A')
                x+=1
            k+=1
        i+=1
    return("".join(seq))

print(bintostring([3]))

def stringtobin(seq):
    power_of_4 = 1
    num = 0
    if len(seq)!=4: return None
    i = len(seq)-1
    while i>=0:
        power_of_4*=4
        Digitval = dna[seq[i]]
        num+=Digitval*power_of_4//4
        i-=1
    return num

print(stringtobin("AAAT"))
Your encoding is in base 4 which can't hold the length information of your sequence.
Without the length information the encoded value 3 could mean T or TA or TAAA or TAAAA... (there would be no way to know).
If the sequences are always 4 letters long (or the length is stored/provided separately), you can implement the functions like this
def stringToBin(S):
    return sum( 4**i*"ACGT".index(p) for i,p in enumerate(S))

def binToString(N,size=4):
    result = ""
    for _ in range(size):
        N,p = divmod(N,4)
        result += "ACGT"[p]
    return result

print(stringToBin("AAAT")) # 192
print(binToString(192)) # AAAT
print(stringToBin("TA")) # 3
print(stringToBin("TAAA")) # 3
print(binToString(3)) # TAAA
print(binToString(3,2)) # TA (length has to be supplied separately)
If you want your numeric encoding to also carry the length information, you should make it base 5 and use a non-zero value for each letter. This way, TA and TAAA would give different numbers.
def stringToBin(S):
    return sum( 5**i*" ACGT".index(p) for i,p in enumerate(S))

def binToString(N):
    result = ""
    while N:
        N,p = divmod(N,5)
        result += " ACGT"[p]
    return result

print(stringToBin("TA")) # 9
print(stringToBin("TAAA")) # 159
print(binToString(9)) # TA
print(binToString(159)) # TAAA
Obviously this produces larger numbers, so a 32-bit unsigned integer will only hold 13 letters as opposed to 16 in base 4. If you're doing this to reduce storage size, text compression (e.g. zip) will probably be more efficient than converting to a fixed-base binary representation.
Your attempt seems inordinately complex. Just map the bottom two bits to a value, then shift them off.
def bintostring(num):
    seq = []
    for n in num:
        subseq = []
        for b in range(4):
            subseq.append(dna2[n & 3])
            n >>= 2
        seq.append("".join(reversed(subseq)))
    return seq
In case it's not obvious, & is bitwise AND; value & 3 obtains the bottom two bits of value.
The stringtobin function could be similarly simplified. Demo: https://ideone.com/RlzegN
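For the digit order used in the question (leftmost letter most significant, so "CCCC" is 85 and 3 is "AAAT"), a matching encode/decode pair might look like the sketch below. The function names string_to_int and int_to_string and the fixed default width of 4 are assumptions made for illustration.

```python
DIGITS = "ACGT"   # A=0, C=1, G=2, T=3

def string_to_int(seq):
    """Encode a DNA string as a base-4 integer, leftmost letter most significant."""
    value = 0
    for ch in seq:
        value = value * 4 + DIGITS.index(ch)
    return value

def int_to_string(value, size=4):
    """Decode a base-4 integer into exactly `size` letters."""
    letters = []
    for _ in range(size):
        value, digit = divmod(value, 4)   # peel off the lowest digit
        letters.append(DIGITS[digit])
    return "".join(reversed(letters))

print(string_to_int("CCCC"))   # 85
print(int_to_string(3))        # AAAT
```

The length still has to be supplied separately, for the reason given above: base 4 alone cannot distinguish leading A's from no letters at all.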

Simple Fun #159: Middle Permutation

Task
You are given a string s. Every letter in s appears once.
Consider all strings formed by rearranging the letters in s. After ordering these strings in dictionary order, return the middle term. (If the sequence has an even length n, define its middle term to be the (n/2)th term.)
The problem
Hey guys. I am totally stuck. I've got an algorithm that calculates the answer in O(n), and all basic tests pass. But I constantly fail all tests where the length of the string equals 23, 24 or 25. Some scary stuff happens, always like this:
'noabcdefgijklymzustvpxwrq' should equal 'nmzyxwvutsrqpolkjigfedcba'
'lzyxvutsrqonmeijhkcafdgb' should equal 'lzyxvutsrqonmkjihgfedcba'
I mean that it starts out in the right direction but then suddenly goes wrong. Give me a hint about what I should check or fix. Thanks a lot!
P.S. The question "execute middlePermutation in under 12000ms" gave me the idea for this solution.
Code
import math

def middle_permutation(string):
    ans, tmp = '', sorted(list(string))
    dividend = math.factorial(len(tmp)) / 2
    for i in range(len(tmp)):
        perms = math.factorial(len(tmp)) / len(tmp)
        if len(tmp) == 1:
            ans += tmp[0]
            break
        letter = tmp[math.ceil(dividend / perms) - 1]
        ans += letter
        tmp.remove(letter)
        dividend -= perms * (math.floor(dividend / perms))
    print(len(string))
    return ans
Here are some basic inputs
Test.describe("Basic tests")
Test.assert_equals(middle_permutation("abc"),"bac")
Test.assert_equals(middle_permutation("abcd"),"bdca")
Test.assert_equals(middle_permutation("abcdx"),"cbxda")
Test.assert_equals(middle_permutation("abcdxg"),"cxgdba")
Test.assert_equals(middle_permutation("abcdxgz"),"dczxgba")
You're not far from a good answer.
Because of 0 versus 1 indexing, you should start with
dividend = math.factorial(len(tmp)) // 2 - 1
You also pick a slightly wrong letter; replace that line with
letter = tmp[dividend // perms]
Also, as everything is an integer here, it's better to use 'a // b' instead of math.floor(a / b): 'a / b' produces a float, and a float cannot represent integers as large as 23! exactly, which is why the longer strings fail.
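To see why the failures start at length 23, here is a small check comparing integer division with true division on factorials. The helper name halves_agree is hypothetical; the behaviour relies on Python floats being IEEE 754 doubles, which carry 53 bits of precision.

```python
import math

def halves_agree(n):
    exact = math.factorial(n) // 2          # stays an int, always exact
    via_float = int(math.factorial(n) / 2)  # "/" rounds the result to a float
    return exact == via_float

print(halves_agree(22))   # True: 22!/2 still fits in a double exactly
print(halves_agree(23))   # False: 23!/2 needs more than 53 significant bits
```

Once the float is off by even one, the chosen letter index drifts, which matches output that starts correctly and then goes wrong.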
All in all, here's a corrected version of your code:
import math

def middle_permutation(string):
    ans, tmp = '', sorted(list(string))
    dividend = math.factorial(len(tmp)) // 2 - 1
    for i in range(len(tmp)):
        perms = math.factorial(len(tmp)) // len(tmp)
        if len(tmp) == 1:
            ans += tmp[0]
            break
        letter = tmp[dividend // perms]
        ans += letter
        tmp.remove(letter)
        dividend -= perms * (dividend // perms)
    return ans
and just for the beauty of it, a generalization:
def n_in_base(n, base):
    r = []
    for b in base:
        r.append(n % b)
        n //= b
    return reversed(r)

def nth_permutation(s, n):
    digits = n_in_base(n, range(1, len(s)+1))
    alphabet = sorted(s)
    return ''.join(alphabet.pop(ri) for ri in digits)

def middle_permutation(s):
    return nth_permutation(s, math.factorial(len(s)) // 2 - 1)
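One way to gain confidence in an integer-only version is to cross-check it against a brute-force enumeration on short inputs. The sketch below is self-contained and uses its own compact middle_permutation (an equivalent rewrite, not the exact code above); middle_by_brute_force is a hypothetical helper that is only practical for strings of up to about eight letters.

```python
import math
from itertools import permutations

def middle_permutation(s):
    letters = sorted(s)
    index = math.factorial(len(letters)) // 2 - 1   # zero-based middle term
    out = []
    while letters:
        # Each choice of first letter covers (len - 1)! permutations.
        block = math.factorial(len(letters) - 1)
        pos, index = divmod(index, block)
        out.append(letters.pop(pos))
    return ''.join(out)

def middle_by_brute_force(s):
    perms = sorted(''.join(p) for p in permutations(s))
    return perms[len(perms) // 2 - 1]

for word in ("abc", "abcd", "abcdx", "abcdxg", "abcdxgz"):
    assert middle_permutation(word) == middle_by_brute_force(word)

print(middle_permutation("abcdx"))   # cbxda
```

Because every intermediate value stays an int, the same function also handles the 23- to 25-letter cases from the question.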

How to use the former value of a variable after the value has changed?

I wrote code that is supposed to convert a number from base 10 to another base. This is the code, where n is the number to convert and l is the base to convert to:
def convert_from_base_10(n,l):
    import math
    counter = 0
    m=n
    z=0
    string = ""
    if n<2:
        return n
    else:
        while m>=l:
            m=m/l
            counter +=1
        while counter >= 0:
            z= math.floor(n/(l**counter))
            string = string + str(z)
            n = n-z*(l**counter)
            counter = counter - 1
        return string
Because in the first else statement I change the value of the number I want to convert by dividing m by l, I had to assign m = n and use m instead of n. Is there a way to get around this and use only one variable?
I don't have enough reputation to add a comment, but I'm adding an answer as a workaround. Please take a look at the Math library from Python.
Please let me know if something like this might help:
log(x)/log(base)
math.log(x[, base]) With one argument, return the natural logarithm of
x (to base e).
With two arguments, return the logarithm of x to the given base,
calculated as log(x)/log(base).
Source: https://docs.python.org/3.6/library/math.html
Not sure what you are trying to accomplish, but if you do not want to use the built-in functions that NellMartinez pointed out, I think moving the counting into a separate function might be what you are looking for to get rid of the extra variable.
import math

def set_count(n, l):
    count = 0
    if n >= 2:
        while n >= l:
            n = n / l
            count += 1
    return count

def convert_from_base_10(n, l):
    string = ""
    counter = set_count(n, l)
    while counter >= 0:
        z = math.floor(n / (l ** counter))
        string = string + str(z)
        n = n - z * (l ** counter)
        counter = counter - 1
    return string
You may choose to use a dictionary or a list.
Instead of assigning n to m, and using two variables, you may choose to use a list list_name = [n, n]. Now you can keep list_name[0] as constant and vary list_name[1].
With dictionary, you may do something like dict_name = {"original": n, "variable": n}.
I suggest you rewrite your code like this (it is the same code, just more readable and better formatted; some names are left for you to choose).
Always return a coherent type: if the result is a string, always return a string.
I don't see a problem in saving a variable's content if you need it later.
Note that number is a local function parameter, so you won't alter the caller's original (which is immutable anyway in your case). Since you want to change the local value a bit later, it's fine to copy it to m.
There are other ways to do the conversion, but I suppose you want to code the algorithm yourself rather than use a brief solution.
import math

def convert_base10_to(number, base):
    counter = 0
    m = number # choose a proper name to m: what is m?
               # That will be the best name for m
    z = 0 # same for z
    result = ""
    if number < 2:
        return str(number)
    else:
        while m >= base:
            m = m / base
            counter += 1
        while counter >= 0:
            z = math.floor(number / (base ** counter))
            result = result + str(z)
            number = number - z *(base ** counter)
            counter = counter - 1
        return result

print(convert_base10_to(7, 2))
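As one of the "other ways" mentioned above, the conversion can be written with repeated divmod, which needs neither a counter nor a copy of the input. This is a minimal sketch, assuming a non-negative integer and a base between 2 and 10; the name to_base is hypothetical.

```python
def to_base(number, base):
    """Convert a non-negative int to a digit string in the given base (2..10)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # divmod yields the remaining quotient and the next lowest digit.
        number, remainder = divmod(number, base)
        digits.append(str(remainder))
    return "".join(reversed(digits))

print(to_base(7, 2))    # 111
print(to_base(10, 3))   # 101
```

Everything stays in integer arithmetic here, so large inputs do not run into float rounding the way `m / base` can.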

Python Swap two digits in a number?

What is the fastest way to swap two digits in a number in Python? I am given the numbers as strings, so it'd be nice if I could have something as fast as
string[j] = string[j] ^ string[j+1]
string[j+1] = string[j] ^ string[j+1]
string[j] = string[j] ^ string[j+1]
Everything I've seen has been much more expensive than it would be in C, and involves making a list and then converting the list back or some variant thereof.
This is faster than you might think, at least faster than Jon Clements' current answer in my timing test:
i, j = (i, j) if i < j else (j, i) # make sure i < j
s = s[:i] + s[j] + s[i+1:j] + s[i] + s[j+1:]
Here's my test bed should you want to compare any other answers you get:
import timeit
import types

N = 10000
R = 3
SUFFIX = '_test'
SUFFIX_LEN = len(SUFFIX)

def setup():
    import random
    global s, i, j
    s = 'abcdefghijklmnopqrstuvwxyz'
    i = random.randrange(len(s))
    while True:
        j = random.randrange(len(s))
        if i != j: break

def swapchars_martineau(s, i, j):
    i, j = (i, j) if i < j else (j, i) # make sure i < j
    return s[:i] + s[j] + s[i+1:j] + s[i] + s[j+1:]

def swapchars_martineau_test():
    global s, i, j
    swapchars_martineau(s, i, j)

def swapchars_clements(text, fst, snd):
    ba = bytearray(text)
    ba[fst], ba[snd] = ba[snd], ba[fst]
    return str(ba)

def swapchars_clements_test():
    global s, i, j
    swapchars_clements(s, i, j)

# find all the functions named *SUFFIX in the global namespace
funcs = tuple(value for id,value in globals().items()
              if id.endswith(SUFFIX) and type(value) is types.FunctionType)

# run the timing tests and collect results
timings = [(f.func_name[:-SUFFIX_LEN],
            min(timeit.repeat(f, setup=setup, repeat=R, number=N))
           ) for f in funcs]
timings.sort(key=lambda x: x[1]) # sort by speed
fastest = timings[0][1] # time fastest one took to run
longest = max(len(t[0]) for t in timings) # len of longest func name (w/o suffix)

print 'fastest to slowest *_test() function timings:\n' \
      ' {:,d} chars, {:,d} timeit calls, best of {:d}\n'.format(len(s), N, R)

def times_slower(speed, fastest):
    return speed/fastest - 1.0

for i in timings:
    print "{0:>{width}}{suffix}() : {1:.4f} ({2:.2f} times slower)".format(
        i[0], i[1], times_slower(i[1], fastest), width=longest, suffix=SUFFIX)
Addendum:
For the special case of swapping digit characters in a positive decimal number given as a string, the following also works and is a tiny bit faster than the general version at the top of my answer.
The somewhat involved conversion back to a string at the end with the format() method is to deal with cases where a zero got moved to the front of the string. I present it mainly as a curiosity, since it's fairly incomprehensible unless you grasp what it does mathematically. It also doesn't handle negative numbers.
n = int(s)
len_s = len(s)
ord_0 = ord('0')
di = ord(s[i])-ord_0
dj = ord(s[j])-ord_0
pi = 10**(len_s-(i+1))
pj = 10**(len_s-(j+1))
s = '{:0{width}d}'.format(n + (dj-di)*pi + (di-dj)*pj, width=len_s)
It has to be of a mutable type of some sort, the best I can think of is (can't make any claims as to performance though):
def swapchar(text, fst, snd):
    ba = bytearray(text)
    ba[fst], ba[snd] = ba[snd], ba[fst]
    return ba
>>> swapchar('thequickbrownfox', 3, 7)
bytearray(b'thekuicqbrownfox')
You can still utilise the result as a str/list - or explicitly convert it to a str if needs be.
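The bytearray(text) call above works on Python 2 byte strings. On Python 3, building a bytearray from a str requires an explicit encoding, so a hedged port might look like this sketch, which assumes the text is pure ASCII (true for digit strings):

```python
def swapchar(text, fst, snd):
    # Python 3: a str must be encoded before it can become a bytearray.
    ba = bytearray(text, "ascii")
    ba[fst], ba[snd] = ba[snd], ba[fst]
    return ba.decode("ascii")

print(swapchar('thequickbrownfox', 3, 7))   # thekuicqbrownfox
```

Returning a decoded str keeps the result directly usable wherever the original string was.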
>>> int1 = 2
>>> int2 = 3
>>> eval(str(int1)+str(int2))
23
I know you've already accepted an answer, so I won't bother coding it in Python, but here's how you could do it in JavaScript which also has immutable strings:
function swapchar(string, j)
{
    return string.replace(RegExp("(.{" + j + "})(.)(.)"), "$1$3$2");
}
Obviously if j isn't in an appropriate range then it just returns the original string.
Given an integer n and two (zero-started) indexes i and j of digits to swap, this can be done using powers of ten to locate the digits, division and modulo operations to extract them, and subtraction and addition to perform the swap.
def swapDigits(n, i, j):
    # These powers of 10 encode the locations i and j in n.
    power_i = 10 ** i
    power_j = 10 ** j
    # Retrieve digits [i] and [j] from n.
    digit_i = (n // power_i) % 10
    digit_j = (n // power_j) % 10
    # Remove digits [i] and [j] from n.
    n -= digit_i * power_i
    n -= digit_j * power_j
    # Insert digit [i] in position [j] and vice versa.
    n += digit_i * power_j
    n += digit_j * power_i
    return n
For example:
>>> swapDigits(9876543210, 4, 0)
9876503214
>>> swapDigits(9876543210, 7, 2)
9826543710

Optimizing python code

Any tips on optimizing this Python code for finding the next palindrome?
The input number can have up to 1000000 digits.
COMMENTS ADDED
#! /usr/bin/python
def inc(lst,lng):
    #this function first extract the left half of the string then
    #convert it to int then increment it then reconvert it to string
    #then reverse it and finally append it to the left half.
    #lst is input number and lng is its length
    if(lng%2==0):
        olst=lst[:lng/2]
        l=int(lng/2)
        olst=int(olst)
        olst+=1
        olst=str(olst)
        p=len(olst)
        if l<p:
            olst2=olst[p-2::-1]
        else:
            olst2=olst[::-1]
        lst=olst+olst2
        return lst
    else:
        olst=lst[:lng/2+1]
        l=int(lng/2+1)
        olst=int(olst)
        olst+=1
        olst=str(olst)
        p=len(olst)
        if l<p:
            olst2=olst[p-3::-1]
        else:
            olst2=olst[p-2::-1]
        lst=olst+olst2
        return lst

t=raw_input()
t=int(t)
while True:
    if t>0:
        t-=1
    else:
        break
    num=raw_input()#this is input number
    lng=len(num)
    lst=num[:]
    if(lng%2==0):
        #this if find next palindrome to num variable
        #without incrementing the middle digit and store it in lst.
        olst=lst[:lng/2]
        olst2=olst[::-1]
        lst=olst+olst2
    else:
        olst=lst[:lng/2+1]
        olst2=olst[len(olst)-2::-1]
        lst=olst+olst2
    if int(num)>=int(lst):#chk if lst satisfies criteria for next palindrome
        num=inc(num,lng)#otherwise call inc function
        print num
    else:
        print lst
I think most of the time in this code is spent converting strings to integers and back. The rest is slicing strings and bouncing around in the Python interpreter. What can be done about these three things? There are a few unnecessary conversions in the code, which we can remove. I see no way to avoid the string slicing. To minimize your time in the interpreter you just have to write as little code as possible :-) and it also helps to put all your code inside functions.
The code at the bottom of your program, which takes a quick guess to try and avoid calling inc(), has a bug or two. Here's how I might write that part:
def nextPal(num):
    lng = len(num)
    guess = num[:lng//2] + num[(lng-1)//2::-1] # works whether lng is even or odd
    if guess > num: # don't bother converting to int
        return guess
    else:
        return inc(num, lng)
This simple change makes your code about 100x faster for numbers where inc doesn't need to be called, and about 3x faster for numbers where it does need to be called.
To do better than that, I think you need to avoid converting to int entirely. That means incrementing the left half of the number without using ordinary Python integer addition. You can use an array and carry out the addition algorithm "by hand":
import array

def nextPal(numstr):
    # If we don't need to increment, just reflect the left half and return.
    n = len(numstr)
    h = n//2
    guess = numstr[:n-h] + numstr[h-1::-1]
    if guess > numstr:
        return guess
    # Increment the left half of the number without converting to int.
    a = array.array('b', numstr)
    zero = ord('0')
    ten = ord('9') + 1
    for i in range(n - h - 1, -1, -1):
        d = a[i] + 1
        if d == ten:
            a[i] = zero
        else:
            a[i] = d
            break
    else:
        # The left half was all nines. Carry the 1.
        # Update n and h since the length changed.
        a.insert(0, ord('1'))
        n += 1
        h = n//2
    # Reflect the left half onto the right half.
    a[n-h:] = a[h-1::-1]
    return a.tostring()
This is another 9x faster or so for numbers that require incrementing.
You can make this a touch faster by using a while loop instead of for i in range(n - h - 1, -1, -1), and about twice as fast again by having the loop update both halves of the array rather than just updating the left-hand half and then reflecting it at the end.
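For readers on Python 3, the same "reflect, then increment by hand" idea can be sketched on a plain list of digit characters, with no int conversion of the large number. This is an illustrative rewrite under assumptions, not the benchmarked code above: next_palindrome is a hypothetical name, the input is assumed to be a non-empty decimal string without leading zeros, and no timing claims are made for it.

```python
def next_palindrome(num):
    """Return the smallest palindrome strictly greater than the digit string num."""
    n = len(num)
    # All nines is the only case where the answer gains a digit: 999 -> 1001.
    if all(ch == '9' for ch in num):
        return '1' + '0' * (n - 1) + '1'

    half = (n + 1) // 2              # left half, including the middle digit
    left = list(num[:half])

    def mirror(digits):
        # Reflect the first n // 2 digits onto the right-hand side.
        return ''.join(digits) + ''.join(reversed(digits[:n // 2]))

    guess = mirror(left)
    if guess > num:                  # equal-length digit strings compare numerically
        return guess

    # Add 1 to the left half digit by digit, carrying through nines.
    for i in range(half - 1, -1, -1):
        if left[i] == '9':
            left[i] = '0'
        else:
            left[i] = chr(ord(left[i]) + 1)
            break
    return mirror(left)

print(next_palindrome('12345'))   # 12421
print(next_palindrome('999'))     # 1001
```

The carry loop cannot overflow here: if the left half were all nines, the reflected guess would already be all nines and would have been returned unless the whole input was nines, which is handled up front.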
You don't have to find the palindrome, you can just generate it.
Split the input number, and reflect it. If the generated number is too small, then increment the left hand side and reflect it again:
def nextPal(n):
    ns = str(n)
    oddoffset = 0
    if len(ns) % 2 != 0:
        oddoffset = 1
    leftlen = len(ns) / 2 + oddoffset
    lefts = ns[0:leftlen]
    right = lefts[::-1][oddoffset:]
    p = int(lefts + right)
    if p < n:
        ## Need to increment middle digit
        left = int(lefts)
        left += 1
        lefts = str(left)
        right = lefts[::-1][oddoffset:]
        p = int(lefts + right)
    return p

def test(n):
    print n
    p = nextPal(n)
    assert p >= n
    print p

test(1234567890)
test(123456789)
test(999999)
test(999998)
test(888889)
test(8999999)
EDIT
NVM, just look at this page: http://thetaoishere.blogspot.com/2009/04/finding-next-palindrome-given-number.html
Using strings. n >= 0
from math import floor, ceil, log10

def next_pal(n):
    # returns next palindrome, param is an int
    n10 = str(n)
    m = len(n10) / 2.0
    s, e = int(floor(m - 0.5)), int(ceil(m + 0.5))
    start, middle, end = n10[:s], n10[s:e], n10[e:]
    assert (start, middle[0]) == (end[-1::-1], middle[-1]) #check that n is actually a palindrome
    r = int(start + middle[0]) + 1 #where the actual increment occurs (i.e. add 1)
    r10 = str(r)
    i = 3 - len(middle)
    if len(r10) > len(start) + 1:
        i += 1
    return int(r10 + r10[-i::-1])
Using log, more optimized. n > 9
def next_pal2(n):
    k = log10(n + 1)
    l = ceil(k)
    s, e = int(floor(l/2.0 - 0.5)), int(ceil(l/2.0 + 0.5))
    mmod, emod = 10**(e - s), int(10**(l - e))
    start, end = divmod(n, emod)
    start, middle = divmod(start, mmod)
    r1 = 10*start + middle%10 + 1
    i = middle > 9 and 1 or 2
    j = s - i + 2
    if k == l:
        i += 1
    r2 = int(str(r1)[-i::-1])
    return r1*10**j + r2
