Related
I have this function hash() that encrypts a given string to an integer.
letters = 'weiojpknasdjhsuert'
def hash(string_input):
h = 3
for i in range(0, len(string_input)):
h = h * 43 + letters.index(string_input[i])
return h
So if I do print(hash('post')), my output is: 11231443
How can I find what my input needs to be to get an output like 1509979332193868 if the input can only be a string from letters? There is a formula inside the loop body but I couldn't figure out how to reverse it.
It seem like since 43 is larger than your alphabet, you can just reverse the math. I don't know how to prove there are no hash collisions, so this may have edge cases. For example:
letters = 'weiojpknasdjhsuert'
def hash(string_input):
h = 3
for i in range(0, len(string_input)):
h = h * 43 + letters.index(string_input[i])
return h
n = hash('wepostjapansand')
print(n)
# 9533132150649729383107184
def rev(n):
s = ''
while n:
l = n % 43 # going in reverse this is the index of the next letter
n -= l # now reverse the math...subtract that index
n //= 43 # and divide by 43 to get the next n
if n:
s = letters[l] + s
return s
print(rev(n))
# wepostjapansand
With a more reasonable alphabet, like lowercase ascii and a space, this still seem to be okay:
from string import ascii_lowercase
letters = ascii_lowercase + ' '
def hash(string_input):
h = 3
for i in range(0, len(string_input)):
h = h * 43 + letters.index(string_input[i])
return h
n = hash('this is some really long text to test how well this works')
print(n)
# 4415562436659212948343839746913950248717359454765313646276852138403823934037244869651587521298
def rev(n):
s = ''
# with more compact logic
while n > 3:
s = letters[n % 43] + s
n = (n - (n % 43)) // 43
return s
print(rev(n))
# this is some really long text to test how well this works
The basic idea is that after all the math, the last number is:
prev * 43 + letter_index
This means you can recover the final letter index by taking the prev modulus 43. Then subtract that and divide by 43 (which is just the reverse of the math) and do it again until your number is zero.
When I use this code my terminal shows full ASCII characters but it does not fullfil the requirement of my question which is to display 10 ASCII characters per line where characters are separated by one space using a while loop:
i = 1
while i < 127:
result = chr(i)
print(result)
i = i + 1
every character is on it's own line.
When I use this code
i = 1
while i < 127:
result = chr(i)
print(result,sep=' ', end=" ")
i = i + 1
ASCII character are displayed like this after 17 spaces
I'm still not sure what you mean with 17 decimal number but I think I know what you want to do.
The problem is that you don't know how to switch between spaces and a newline if I understand correctly.
We want to print a newline every 10 characters, and otherwise print a space. An easy way to do this in python is to use the modulo function. With it you can get the remainder of a division. If we divide our iterator i by 10, we get 0 as a remainder every 10 iterations. We can use this to print a newline every 10 iterations like this:
i = 1
while i < 127-32:
result = chr(i+32)
if i % 10 == 0:
print(result)
else:
print(result, end=' ')
i = i + 1
You also said it had to be from ! to ~. In asciim printable characters start at 32, so we need to start add 32 to the number in chr() to make sure we are not printing the non-printable characters.
This will print the ascii table with all printable characters like this:
! " # $ % & ' ( ) *
+ , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = >
? # A B C D E F G H
I J K L M N O P Q R
S T U V W X Y Z [ \
] ^ _ ` a b c d e f
g h i j k l m n o p
q r s t u v w x y z
{ | } ~
You can gather the start and endvalues of '!' and '~' via the ord() function to remove magic numbers from your code:
ord(c)
Given a string representing one Unicode character, return an integer representing
the Unicode code point of that character. For example, ord('a') returns the integer
97 and ord('€') (Euro sign) returns 8364. This is the inverse of chr().
Then print them, add a space if the line is not yet done or a newline if you are done for this line.
start = ord("!")
end = ord("~")
i = start
while i <= end:
print(chr(i), end="") # print the character
i += 1
if (i - start) % 10 == 0: # check if we printed 10 characters in this line
print()
else:
print(" ", end="")
Output:
! " # $ % & ' ( ) *
+ , - . / 0 1 2 3 4
5 6 7 8 9 : ; < = >
? # A B C D E F G H
I J K L M N O P Q R
S T U V W X Y Z [ \
] ^ _ ` a b c d e f
g h i j k l m n o p
q r s t u v w x y z
{ | } ~
I would like to make a alphabetical list for an application similar to an excel worksheet.
A user would input number of cells and I would like to generate list.
For example a user needs 54 cells. Then I would generate
'a','b','c',...,'z','aa','ab','ac',...,'az', 'ba','bb'
I can generate the list from [ref]
from string import ascii_lowercase
L = list(ascii_lowercase)
How do i stitch it together?
A similar question for PHP has been asked here. Does some one have the python equivalent?
Use itertools.product.
from string import ascii_lowercase
import itertools
def iter_all_strings():
for size in itertools.count(1):
for s in itertools.product(ascii_lowercase, repeat=size):
yield "".join(s)
for s in iter_all_strings():
print(s)
if s == 'bb':
break
Result:
a
b
c
d
e
...
y
z
aa
ab
ac
...
ay
az
ba
bb
This has the added benefit of going well beyond two-letter combinations. If you need a million strings, it will happily give you three and four and five letter strings.
Bonus style tip: if you don't like having an explicit break inside the bottom loop, you can use islice to make the loop terminate on its own:
for s in itertools.islice(iter_all_strings(), 54):
print s
You can use a list comprehension.
from string import ascii_lowercase
L = list(ascii_lowercase) + [letter1+letter2 for letter1 in ascii_lowercase for letter2 in ascii_lowercase]
Following #Kevin 's answer :
from string import ascii_lowercase
import itertools
# define the generator itself
def iter_all_strings():
size = 1
while True:
for s in itertools.product(ascii_lowercase, repeat=size):
yield "".join(s)
size +=1
The code below enables one to generate strings, that can be used to generate unique labels for example.
# define the generator handler
gen = iter_all_strings()
def label_gen():
for s in gen:
return s
# call it whenever needed
print label_gen()
print label_gen()
print label_gen()
I've ended up doing my own.
I think it can create any number of letters.
def AA(n, s):
r = n % 26
r = r if r > 0 else 26
n = (n - r) / 26
s = chr(64 + r) + s
if n > 26:
s = AA(n, s)
elif n > 0:
s = chr(64 + n) + s
return s
n = quantity | r = remaining (26 letters A-Z) | s = string
To print the list :
def uprint(nc):
for x in range(1, nc + 1):
print AA(x,'').lower()
Used VBA before convert to python :
Function AA(n, s)
r = n Mod 26
r = IIf(r > 0, r, 26)
n = (n - r) / 26
s = Chr(64 + r) & s
If n > 26 Then
s = AA(n, s)
ElseIf n > 0 Then
s = Chr(64 + n) & s
End If
AA = s
End Function
Using neo's insight on a while loop.
For a given iterable with chars in ascending order. 'abcd...'.
n is the Nth position of the representation starting with 1 as the first position.
def char_label(n, chars):
indexes = []
while n:
residual = n % len(chars)
if residual == 0:
residual = len(chars)
indexes.append(residual)
n = (n - residual)
n = n // len(chars)
indexes.reverse()
label = ''
for i in indexes:
label += chars[i-1]
return label
Later you can print a list of the range n of the 'labels' you need using a for loop:
my_chrs = 'abc'
n = 15
for i in range(1, n+1):
print(char_label(i, my_chrs))
or build a list comprehension etc...
Print the set of xl cell range of lowercase and uppercase charterers
Upper_case:
from string import ascii_uppercase
import itertools
def iter_range_strings(start_colu):
for size in itertools.count(1):
for string in itertools.product(ascii_uppercase, repeat=size):
yield "".join(string)
input_colume_range = ['A', 'B']
input_row_range= [1,2]
for row in iter_range_strings(input_colume_range[0]):
for colum in range(int(input_row_range[0]), int(input_row_range[1]+1)):
print(str(row)+ str(colum))
if row == input_colume_range[1]:
break
Result:
A1
A2
B1
B2
In two lines (plus an import):
from string import ascii_uppercase as ABC
count = 100
ABC+=' '
[(ABC[x[0]] + ABC[x[1]]).strip() for i in range(count) if (x:= divmod(i-26, 26))]
Wrap it in a function/lambda if you need to reuse.
code:
alphabet = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
for i in range(len(alphabet)):
for a in range(len(alphabet)):
print(alphabet[i] + alphabet[a])
result:
aa
ab
ac
ad
ae
af
ag
ah
ai
aj
ak
al
am
...
What is the fastest way to swap two digits in a number in Python? I am given the numbers as strings, so it'd be nice if I could have something as fast as
string[j] = string[j] ^ string[j+1]
string[j+1] = string[j] ^ string[j+1]
string[j] = string[j] ^ string[j+1]
Everything I've seen has been much more expensive than it would be in C, and involves making a list and then converting the list back or some variant thereof.
This is faster than you might think, at least faster than Jon Clements' current answer in my timing test:
i, j = (i, j) if i < j else (j, i) # make sure i < j
s = s[:i] + s[j] + s[i+1:j] + s[i] + s[j+1:]
Here's my test bed should you want to compare any other answers you get:
import timeit
import types
N = 10000
R = 3
SUFFIX = '_test'
SUFFIX_LEN = len(SUFFIX)
def setup():
import random
global s, i, j
s = 'abcdefghijklmnopqrstuvwxyz'
i = random.randrange(len(s))
while True:
j = random.randrange(len(s))
if i != j: break
def swapchars_martineau(s, i, j):
i, j = (i, j) if i < j else (j, i) # make sure i < j
return s[:i] + s[j] + s[i+1:j] + s[i] + s[j+1:]
def swapchars_martineau_test():
global s, i, j
swapchars_martineau(s, i, j)
def swapchars_clements(text, fst, snd):
ba = bytearray(text)
ba[fst], ba[snd] = ba[snd], ba[fst]
return str(ba)
def swapchars_clements_test():
global s, i, j
swapchars_clements(s, i, j)
# find all the functions named *SUFFIX in the global namespace
funcs = tuple(value for id,value in globals().items()
if id.endswith(SUFFIX) and type(value) is types.FunctionType)
# run the timing tests and collect results
timings = [(f.func_name[:-SUFFIX_LEN],
min(timeit.repeat(f, setup=setup, repeat=R, number=N))
) for f in funcs]
timings.sort(key=lambda x: x[1]) # sort by speed
fastest = timings[0][1] # time fastest one took to run
longest = max(len(t[0]) for t in timings) # len of longest func name (w/o suffix)
print 'fastest to slowest *_test() function timings:\n' \
' {:,d} chars, {:,d} timeit calls, best of {:d}\n'.format(len(s), N, R)
def times_slower(speed, fastest):
return speed/fastest - 1.0
for i in timings:
print "{0:>{width}}{suffix}() : {1:.4f} ({2:.2f} times slower)".format(
i[0], i[1], times_slower(i[1], fastest), width=longest, suffix=SUFFIX)
Addendum:
For the special case of swapping digit characters in a positive decimal number given as a string, the following also works and is a tiny bit faster than the general version at the top of my answer.
The somewhat involved conversion back to a string at the end with the format() method is to deal with cases where a zero got moved to the front of the string. I present it mainly as a curiosity, since it's fairly incomprehensible unless you grasp what it does mathematically. It also doesn't handle negative numbers.
n = int(s)
len_s = len(s)
ord_0 = ord('0')
di = ord(s[i])-ord_0
dj = ord(s[j])-ord_0
pi = 10**(len_s-(i+1))
pj = 10**(len_s-(j+1))
s = '{:0{width}d}'.format(n + (dj-di)*pi + (di-dj)*pj, width=len_s)
It has to be of a mutable type of some sort, the best I can think of is (can't make any claims as to performance though):
def swapchar(text, fst, snd):
ba = bytearray(text)
ba[fst], ba[snd] = ba[snd], ba[fst]
return ba
>>> swapchar('thequickbrownfox', 3, 7)
bytearray(b'thekuicqbrownfox')
You can still utilise the result as a str/list - or explicitly convert it to a str if needs be.
>>> int1 = 2
>>> int2 = 3
>>> eval(str(int1)+str(int2))
23
I know you've already accepted an answer, so I won't bother coding it in Python, but here's how you could do it in JavaScript which also has immutable strings:
function swapchar(string, j)
{
return string.replace(RegExp("(.{" + j + "})(.)(.)"), "$1$3$2");
}
Obviously if j isn't in an appropriate range then it just returns the original string.
Given an integer n and two (zero-started) indexes i and j of digits to swap, this can be done using powers of ten to locate the digits, division and modulo operations to extract them, and subtraction and addition to perform the swap.
def swapDigits(n, i, j):
# These powers of 10 encode the locations i and j in n.
power_i = 10 ** i
power_j = 10 ** j
# Retrieve digits [i] and [j] from n.
digit_i = (n // power_i) % 10
digit_j = (n // power_j) % 10
# Remove digits [i] and [j] from n.
n -= digit_i * power_i
n -= digit_j * power_j
# Insert digit [i] in position [j] and vice versa.
n += digit_i * power_j
n += digit_j * power_i
return n
For example:
>>> swapDigits(9876543210, 4, 0)
9876503214
>>> swapDigits(9876543210, 7, 2)
9826543710
Suppose you take the strings 'a' and 'z' and list all the strings that come between them in alphabetical order: ['a','b','c' ... 'x','y','z']. Take the midpoint of this list and you find 'm'. So this is kind of like taking an average of those two strings.
You could extend it to strings with more than one character, for example the midpoint between 'aa' and 'zz' would be found in the middle of the list ['aa', 'ab', 'ac' ... 'zx', 'zy', 'zz'].
Might there be a Python method somewhere that does this? If not, even knowing the name of the algorithm would help.
I began making my own routine that simply goes through both strings and finds midpoint of the first differing letter, which seemed to work great in that 'aa' and 'az' midpoint was 'am', but then it fails on 'cat', 'doggie' midpoint which it thinks is 'c'. I tried Googling for "binary search string midpoint" etc. but without knowing the name of what I am trying to do here I had little luck.
I added my own solution as an answer
If you define an alphabet of characters, you can just convert to base 10, do an average, and convert back to base-N where N is the size of the alphabet.
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def enbase(x):
n = len(alphabet)
if x < n:
return alphabet[x]
return enbase(x/n) + alphabet[x%n]
def debase(x):
n = len(alphabet)
result = 0
for i, c in enumerate(reversed(x)):
result += alphabet.index(c) * (n**i)
return result
def average(a, b):
a = debase(a)
b = debase(b)
return enbase((a + b) / 2)
print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('cat', 'doggie') #budeel
print average('google', 'microsoft') #gebmbqkil
print average('microsoft', 'google') #gebmbqkil
Edit: Based on comments and other answers, you might want to handle strings of different lengths by appending the first letter of the alphabet to the shorter word until they're the same length. This will result in the "average" falling between the two inputs in a lexicographical sort. Code changes and new outputs below.
def pad(x, n):
p = alphabet[0] * (n - len(x))
return '%s%s' % (x, p)
def average(a, b):
n = max(len(a), len(b))
a = debase(pad(a, n))
b = debase(pad(b, n))
return enbase((a + b) / 2)
print average('a', 'z') #m
print average('aa', 'zz') #mz
print average('aa', 'az') #m (equivalent to ma)
print average('cat', 'doggie') #cumqec
print average('google', 'microsoft') #jlilzyhcw
print average('microsoft', 'google') #jlilzyhcw
If you mean the alphabetically, simply use FogleBird's algorithm but reverse the parameters and the result!
>>> print average('cat'[::-1], 'doggie'[::-1])[::-1]
cumdec
or rewriting average like so
>>> def average(a, b):
... a = debase(a[::-1])
... b = debase(b[::-1])
... return enbase((a + b) / 2)[::-1]
...
>>> print average('cat', 'doggie')
cumdec
>>> print average('google', 'microsoft')
jlvymlupj
>>> print average('microsoft', 'google')
jlvymlupj
It sounds like what you want, is to treat alphabetical characters as a base-26 value between 0 and 1. When you have strings of different length (an example in base 10), say 305 and 4202, your coming out with a midpoint of 3, since you're looking at the characters one at a time. Instead, treat them as a floating point mantissa: 0.305 and 0.4202. From that, it's easy to come up with a midpoint of .3626 (you can round if you'd like).
Do the same with base 26 (a=0...z=25, ba=26, bb=27, etc.) to do the calculations for letters:
cat becomes 'a.cat' and doggie becomes 'a.doggie', doing the math gives cat a decimal value of 0.078004096, doggie a value of 0.136390697, with an average of 0.107197397 which in base 26 is roughly "cumcqo"
Based on your proposed usage, consistent hashing ( http://en.wikipedia.org/wiki/Consistent_hashing ) seems to make more sense.
Thanks for everyone who answered, but I ended up writing my own solution because the others weren't exactly what I needed. I am trying to average app engine key names, and after studying them a bit more I discovered they actually allow any 7-bit ASCII characters in the names. Additionally I couldn't really rely on the solutions that converted the key names first to floating point, because I suspected floating point accuracy just isn't enough.
To take an average, first you add two numbers together and then divide by two. These are both such simple operations that I decided to just make functions to add and divide base 128 numbers represented as lists. This solution hasn't been used in my system yet so I might still find some bugs in it. Also it could probably be a lot shorter, but this is just something I needed to get done instead of trying to make it perfect.
# Given two lists representing a number with one digit left to decimal point and the
# rest after it, for example 1.555 = [1,5,5,5] and 0.235 = [0,2,3,5], returns a similar
# list representing those two numbers added together.
#
def ladd(a, b, base=128):
i = max(len(a), len(b))
lsum = [0] * i
while i > 1:
i -= 1
av = bv = 0
if i < len(a): av = a[i]
if i < len(b): bv = b[i]
lsum[i] += av + bv
if lsum[i] >= base:
lsum[i] -= base
lsum[i-1] += 1
return lsum
# Given a list of digits after the decimal point, returns a new list of digits
# representing that number divided by two.
#
def ldiv2(vals, base=128):
vs = vals[:]
vs.append(0)
i = len(vs)
while i > 0:
i -= 1
if (vs[i] % 2) == 1:
vs[i] -= 1
vs[i+1] += base / 2
vs[i] = vs[i] / 2
if vs[-1] == 0: vs = vs[0:-1]
return vs
# Given two app engine key names, returns the key name that comes between them.
#
def average(a_kn, b_kn):
m = lambda x:ord(x)
a = [0] + map(m, a_kn)
b = [0] + map(m, b_kn)
avg = ldiv2(ladd(a, b))
return "".join(map(lambda x:chr(x), avg[1:]))
print average('a', 'z') # m#
print average('aa', 'zz') # n-#
print average('aa', 'az') # am#
print average('cat', 'doggie') # d(mstr#
print average('google', 'microsoft') # jlim.,7s:
print average('microsoft', 'google') # jlim.,7s:
import math
def avg(str1,str2):
y = ''
s = 'abcdefghijklmnopqrstuvwxyz'
for i in range(len(str1)):
x = s.index(str2[i])+s.index(str1[i])
x = math.floor(x/2)
y += s[x]
return y
print(avg('z','a')) # m
print(avg('aa','az')) # am
print(avg('cat','dog')) # chm
Still working on strings with different lengths... any ideas?
This version thinks 'abc' is a fraction like 0.abc. In this approach space is zero and a valid input/output.
MAX_ITER = 10
letters = " abcdefghijklmnopqrstuvwxyz"
def to_double(name):
d = 0
for i, ch in enumerate(name):
idx = letters.index(ch)
d += idx * len(letters) ** (-i - 1)
return d
def from_double(d):
name = ""
for i in range(MAX_ITER):
d *= len(letters)
name += letters[int(d)]
d -= int(d)
return name
def avg(w1, w2):
w1 = to_double(w1)
w2 = to_double(w2)
return from_double((w1 + w2) * 0.5)
print avg('a', 'a') # 'a'
print avg('a', 'aa') # 'a mmmmmmmm'
print avg('aa', 'aa') # 'a zzzzzzzz'
print avg('car', 'duck') # 'cxxemmmmmm'
Unfortunately, the naïve algorithm is not able to detect the periodic 'z's, this would be something like 0.99999 in decimal; therefore 'a zzzzzzzz' is actually 'aa' (the space before the 'z' periodicity must be increased by one.
In order to normalise this, you can use the following function
def remove_z_period(name):
if len(name) != MAX_ITER:
return name
if name[-1] != 'z':
return name
n = ""
overflow = True
for ch in reversed(name):
if overflow:
if ch == 'z':
ch = ' '
else:
ch=letters[(letters.index(ch)+1)]
overflow = False
n = ch + n
return n
print remove_z_period('a zzzzzzzz') # 'aa'
I haven't programmed in python in a while and this seemed interesting enough to try.
Bear with my recursive programming. Too many functional languages look like python.
def stravg_half(a, ln):
# If you have a problem it will probably be in here.
# The floor of the character's value is 0, but you may want something different
f = 0
#f = ord('a')
L = ln - 1
if 0 == L:
return ''
A = ord(a[0])
return chr(A/2) + stravg_half( a[1:], L)
def stravg_helper(a, b, ln, x):
L = ln - 1
A = ord(a[0])
B = ord(b[0])
D = (A + B)/2
if 0 == L:
if 0 == x:
return chr(D)
# NOTE: The caller of helper makes sure that len(a)>=len(b)
return chr(D) + stravg_half(a[1:], x)
return chr(D) + stravg_helper(a[1:], b[1:], L, x)
def stravg(a, b):
la = len(a)
lb = len(b)
if 0 == la:
if 0 == lb:
return a # which is empty
return stravg_half(b, lb)
if 0 == lb:
return stravg_half(a, la)
x = la - lb
if x > 0:
return stravg_helper(a, b, lb, x)
return stravg_helper(b, a, la, -x) # Note the order of the args