Convert a long number to corresponding letter combinations - python

Given a number, translate it to all possible combinations of corresponding letters. For example, given the number 1234, it should produce abcd, lcd, and awd, because the digits can be grouped as 1 2 3 4, 12 3 4, or 1 23 4.
I have been thinking about ways to do this in Python and I am honestly stumped. Any hints?
So far I have only set up a simple system to convert single digits to letters.

Convert the number to a str.
Implement partition as in here.
Filter out the partitions that contain a number over 26.
Write a function that returns the letter for a number.
def alphabet(n):
    # return " abcde..."[n]
    return chr(n + 96)
def partition(lst):
    for i in range(1, len(lst)):
        for r in partition(lst[i:]):
            yield [lst[:i]] + r
    yield [lst]
def int2words(x):
    for lst in partition(str(x)):
        ints = [int(i) for i in lst]
        if all(i <= 26 for i in ints):
            yield "".join(alphabet(i) for i in ints)
x = 12121
print(list(int2words(x)))
# ['ababa', 'abau', 'abla', 'auba', 'auu', 'laba', 'lau', 'lla']

I'm not going to give you a complete solution, just an idea of where to start:
I would convert the number to a string and iterate over the string. Since the alphabet has 26 characters, you only have to check one- and two-digit numbers.
As noted in a comment above, a recursive approach will do the trick, e.g.:
Number is 1234
*) Take first character -> number is 1
*) From there combine it with all remaining 1-digit numbers -->
1 2 3 4
*) Then combine it with the next 2 digit number (if <= 26) and the remaining 1 digit numbers -->
1 23 4
*) ...and so on
As I said, it's just an idea of where to start, but basically it's a recursive approach using combinatorics, with a check that no two-digit number is greater than 26 and thus beyond the alphabet.
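The idea above can be sketched as a short recursive generator. This is only one possible reading of the hint: the name decode is made up for illustration, and the sketch assumes 1=a through 26=z and that a group may not start with 0.

```python
def decode(digits):
    """Yield every letter string the digit string can stand for (1=a ... 26=z)."""
    if not digits:
        # Nothing left to consume: one way to finish, the empty suffix.
        yield ""
        return
    for size in (1, 2):
        chunk = digits[:size]
        # Skip groups that are too short, start with 0, or exceed 26.
        if len(chunk) < size or chunk[0] == "0" or int(chunk) > 26:
            continue
        for rest in decode(digits[size:]):
            yield chr(int(chunk) + 96) + rest


print(list(decode("1234")))
# ['abcd', 'awd', 'lcd']
```

Taking one digit before two digits at each step is what fixes the order of the results; the set of results does not depend on it.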

Related

How does this python palindrome function work?

can someone explain this function to me?
#from the geeksforgeeks website
def isPalimdrome(str):
    for i in range(0, int(len(str)/2)):
        if str[i] != str[len(str)-i-1]:
            return False
    return True
I don't understand the for loop and the if statement.
A - why is the range from 0 to the length of the string divided by 2?
B - what does "str[len(str)-i-1]" do?
//sorry, ik I ask stupid questions
To determine if a string is a palindrome, we can split the string in half and compare each letter of the first half with the matching letter of the second half.
Consider the example
string ABCCBA
The range in the for loop sets this up by iterating over only the first n/2 characters. int(n/2) is used to force an integer (question A), since range does not accept a float.
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(ex_str[s])
A
B
C
We now have to look at the letters in the other half, CBA, in reverse order.
Adding an index to our example helps to visualize this:
string ABCCBA
index  012345
To determine if the string is a palindrome, we compare index 0 to 5, 1 to 4, and 2 to 3.
len(str)-i-1 gives us the correct index in the other half for each i (question B).
Example:
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(f'compare index {s} to index {len(ex_str)-s-1}')
    print(f"{ex_str[s]} to {ex_str[len(ex_str) - s - 1]}")
compare index 0 to index 5
A to A
compare index 1 to index 4
B to B
compare index 2 to index 3
C to C
for i in range(0, int(len(str)/2)):
Iterate (go one by one) from 0, because the first letter of a string has index 0, up to half the length of the string.
Why only half the length?
Because in a palindrome you only need to compare one half of the string to the other half.
E.g., RADAR: 0=R, 1=A, 2=D, 3=A, 4=R. Number of letters = 5.
int(len(str)/2) evaluates to 2, so the first two letters are compared with the last two letters; the middle one is shared by both halves and is not compared.
if str[i] != str[len(str)-i-1]:
The length of the string is 5, but the indices of its letters go from 0 to 4, which is why we use len(str)-1 (5-1 = 4, i.e., the last letter, R).
In len(str)-1-i, i is the loop variable, so it increases by 1 every time the for loop runs: in the first run i is 0, in the second it is 1, and so on.
The for loop will run two times.
str[i] != str[len(str)-1-i] will be evaluated as:
0 != 4 i.e. R != R FALSE
1 != 3 i.e. A != A FALSE
This code is not very readable and can be simplified as pointed out by others. This also reflects why code readability is important.
1. why is the range from 0 to length of the string divided by 2?
That's because we don't need to iterate all the way through the string but just halfway through it.
2. what does "str[len(str)-i-1]" do?
It returns the ith element from the end, i.e., for the string "noon", when i is 0 it gets str[3], which is n.
The easiest way to check for a palindrome is this:
def isPalimdrome(s):
    return s == s[::-1]
A palindrome reads the same from the beginning as it does in reverse.
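To see that the half-length loop and the slice comparison agree, here is a small side-by-side sketch. The function names are made up for illustration, and len(s) // 2 is used in place of int(len(s)/2), which gives the same value for these lengths.

```python
def is_palindrome_loop(s):
    # Compare s[i] with its mirror s[len(s)-i-1], for the first half only.
    for i in range(len(s) // 2):
        if s[i] != s[len(s) - i - 1]:
            return False
    return True


def is_palindrome_slice(s):
    # A palindrome equals its own reverse.
    return s == s[::-1]


for word in ["ABCCBA", "RADAR", "noon", "python", ""]:
    print(repr(word), is_palindrome_loop(word), is_palindrome_slice(word))
```

For an odd length such as RADAR the loop never touches the middle letter, which is fine because it only ever mirrors onto itself.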

Generate all the possible combinations of n bits with different prefix of a given word in Python3

I got the following problem. Given a word (a binary one) I want to generate all the combinations with length n, and the given word can not be a prefix of any of the combinations.
For instance, with n = 3 and the word is 00 I would like to generate:
010
011
100
101
110
111
Is there any pythonic way to do this?
Edit: Sorry, I am trying modifications of this standard pseudo-code
combinations:
    if depth = 0 return result
    for i in start..size
        out+=combinations(depth-1, i+1, result)
    return out
I can't figure out how to add the restriction of not starting by the given word. By "pythonic" I mean with something like comprehension lists, or a beautiful one-liner :D
You can do all the work in a one-liner, but it takes a bit of setup. This takes advantage of the fact that you basically want all the binary numbers within a range of 0 to 2**n, except if their leftmost bits represent a particular binary number. Note that in general you will be keeping most of the numbers in the range (all but 1/2**len(word)), so it's reasonably efficient just to generate all the numbers and then filter out the ones you don't want.
word = '00'
word_int = int(word, base=2)
m = len(word)
n = 3
results = ['{0:b}'.format(num).zfill(n) for num in range(2**n) if num >> (n-m) != word_int]
print('\n'.join(results))
# 010
# 011
# 100
# 101
# 110
# 111
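To make the num >> (n-m) test concrete, the sketch below prints each 3-bit number next to its top two bits. The helper name prefix_of is introduced here only for illustration.

```python
n = 3
word = "00"
m = len(word)
word_int = int(word, 2)


def prefix_of(num):
    """Top m bits of an n-bit number, as an int."""
    return num >> (n - m)


for num in range(2**n):
    bits = format(num, "b").zfill(n)
    top = format(prefix_of(num), "b").zfill(m)
    verdict = "drop" if prefix_of(num) == word_int else "keep"
    print(bits, top, verdict)
```

Shifting right by n-m discards the low bits, so only the leading m bits are compared against the word.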
You can eliminate some of the setup, but the one-liner gets harder to read:
word = '00'
n = 3
[
    num
    for num in ('{0:b}'.format(p).zfill(n) for p in range(2**n))
    if not num.startswith(word)
]
Or you can use itertools.product:
import itertools
word = '00'
n = 3
[
    num
    for num in (''.join(p) for p in itertools.product('01', repeat=n))
    if not num.startswith(word)
]

Short Unique Hexadecimal String in Python

I need to generate a unique hexadecimal string in Python 3 that meets following requirements:
It should contain 6 characters
it should not contain just digits. There must be at least one character.
These generated strings should be random. They should not be in any order.
There should be minimum probability of conflict
I have considered uuid4(). But the problem is that it generates strings with too many characters and any substring of the generated string can contain all digits(i.e. no character) at some point.
Is there any other way to fulfill this conditions? Thanks in advance!
EDIT
Can we use a hash for example SHA-1 to fulfill above requirements?
Here's a simple method that samples evenly from all allowed strings. Sampling uniformly makes conflicts as rare as possible, short of keeping a log of previous keys or using a hash based on a counter (see below).
import random
digits = '0123456789'
letters = 'abcdef'
all_chars = digits + letters
length = 6
while True:
    val = ''.join(random.choice(all_chars) for i in range(length))
    # The following line might be faster if you only want hex digits.
    # It makes an int with 24 random bits, formats it as hex (without
    # the '0x' prefix), then pads with zeros up to six places if needed
    # val = format(random.getrandbits(4*length), 'x').zfill(length)
    # test whether it contains at least one letter
    if not val.isdigit():
        break
# now val is a suitable string
print(val)
# 5d1d81
Alternatively, here's a somewhat more complex approach that also samples uniformly, but doesn't use any open-ended loops:
import random, bisect
digits = '0123456789'
letters = 'abcdef'
all_chars = digits + letters
length = 6
# find how many valid strings there are with their first letter in position i
pos_weights = [10**i * 6 * 16**(length-1-i) for i in range(length)]
pos_c_weights = [sum(pos_weights[0:i+1]) for i in range(length)]
# choose a random slot among all the allowed strings
# (randrange excludes the upper bound, so every slot is equally likely)
r = random.randrange(pos_c_weights[-1])
# find the position for the first letter in the string
first_letter = bisect.bisect_right(pos_c_weights, r)
# generate a random string matching this pattern
val = ''.join(
    [random.choice(digits) for i in range(first_letter)]
    + [random.choice(letters)]
    + [random.choice(all_chars) for i in range(first_letter + 1, length)]
)
# now val is a suitable string
print(val)
# 4a99f0
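As a quick check on the weights used above: every allowed string has exactly one first-letter position, so the weights should add up to the number of 6-character hex strings minus the all-digit ones. The name total_allowed is introduced here only for illustration.

```python
length = 6

# Strings whose first letter sits at position i: i leading digits (10 choices
# each), one letter (6 choices), then any hex character (16 choices each).
pos_weights = [10**i * 6 * 16**(length - 1 - i) for i in range(length)]

# All hex strings of this length minus the ones made only of digits.
total_allowed = 16**length - 10**length

print(pos_weights)
print(sum(pos_weights), total_allowed)
```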
And finally, here's an even more complex method that uses the random number r to index directly into the entire range of allowed values, i.e., this converts any number in the range 0 to 15,777,215 into a suitable hex string (there are 16**6 - 10**6 = 15,777,216 allowed strings). This could be used to completely avoid conflicts (discussed more below).
import random, bisect
digits = '0123456789'
letters = 'abcdef'
all_chars = digits + letters
length = 6
# find how many valid strings there are with their first letter in position i
pos_weights = [10**i * 6 * 16**(length-1-i) for i in range(length)]
# cumulative counts starting from 0: pos_c_weights[i] is the number of
# valid strings whose first letter comes before position i
pos_c_weights = [sum(pos_weights[0:i]) for i in range(length + 1)]
# choose a random slot among all the allowed strings
r = random.randrange(pos_c_weights[-1])
# find the position for the first letter in the string
first_letter = bisect.bisect_right(pos_c_weights, r) - 1
# choose the corresponding string from among all that fit this pattern
offset = r - pos_c_weights[first_letter]
val = ''
# convert the offset to a collection of indexes within the allowed strings
# the space of allowed strings has dimensions
# 10 x 10 x ... (for digits) x 6 (for first letter) x 16 x 16 x ... (for later chars)
# so we can index across it by dividing into appropriate-sized slices
for i in range(length):
    if i < first_letter:
        offset, v = divmod(offset, 10)
        val += digits[v]
    elif i == first_letter:
        offset, v = divmod(offset, 6)
        val += letters[v]
    else:
        offset, v = divmod(offset, 16)
        val += all_chars[v]
# now val is a suitable string
print(val)
# eb3493
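The index-to-string idea can be checked exhaustively on a shorter length. The sketch below wraps the same mixed-radix decoding in a function (the name index_to_key and the plain loop used to find the first-letter position are choices made here, not part of the answer) and confirms that every index gives a different valid string.

```python
DIGITS = "0123456789"
LETTERS = "abcdef"
ALL_CHARS = DIGITS + LETTERS


def index_to_key(r, length=6):
    """Map r in [0, 16**length - 10**length) to a hex string with a letter."""
    weights = [10**i * 6 * 16**(length - 1 - i) for i in range(length)]
    if not 0 <= r < sum(weights):
        raise ValueError("index out of range")
    # Skip whole blocks of strings whose first letter comes earlier.
    first_letter = 0
    while r >= weights[first_letter]:
        r -= weights[first_letter]
        first_letter += 1
    out = []
    for i in range(length):
        if i < first_letter:
            r, v = divmod(r, 10)   # leading positions are digits only
            out.append(DIGITS[v])
        elif i == first_letter:
            r, v = divmod(r, 6)    # the first letter
            out.append(LETTERS[v])
        else:
            r, v = divmod(r, 16)   # anything goes afterwards
            out.append(ALL_CHARS[v])
    return "".join(out)


# Exhaustive check on 3-character strings: 16**3 - 10**3 = 3096 indexes.
keys = {index_to_key(r, 3) for r in range(16**3 - 10**3)}
print(len(keys))
```

Because the mapping is one-to-one, feeding it distinct integers (for example a counter) yields distinct strings.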
Uniform Sampling
I mentioned above that this samples uniformly across all allowed strings. Some other answers here choose 5 characters completely at random and then force a letter into the string at a random position. That approach produces more strings with multiple letters than you would get randomly. e.g., that method always produces a 6-letter string if letters are chosen for the first 5 slots; however, in this case the sixth selection should actually only have a 6/16 chance of being a letter. Those approaches can't be fixed by forcing a letter into the sixth slot only if the first 5 slots are digits. In that case, all 5-digit strings would automatically be converted to 5 digits plus 1 letter, giving too many 5-digit strings. With uniform sampling, there should be a 10/16 chance of completely rejecting the string if the first 5 characters are digits.
Here are some examples that illustrate these sampling issues. Suppose you have a simpler problem: you want a string of two binary digits, with a rule that at least one of them must be a 1. Conflicts will be rarest if you produce 01, 10 or 11 with equal probability. You can do that by choosing random bits for each slot, and then throwing out the 00's (similar to my approach above).
But suppose you instead follow this rule: Make two random binary choices. The first choice will be used as-is in the string. The second choice will determine the location where an additional 1 will be inserted. This is similar to the approach used by the other answers here. Then you will have the following possible outcomes, where the first two columns represent the two binary choices:
0 0 -> 10
0 1 -> 01
1 0 -> 11
1 1 -> 11
This approach has a 0.5 chance of producing 11, or 0.25 for 01 or 10, so it will increase the risk of collisions among 11 results.
You could try to improve this as follows: Make three random binary choices. The first choice will be used as-is in the string. The second choice will be converted to a 1 if the first choice was a 0; otherwise it will be added to the string as-is. The third choice will determine the location where the second choice will be inserted. Then you have the following possible outcomes:
0 0 0 -> 10 (second choice converted to 1)
0 0 1 -> 01 (second choice converted to 1)
0 1 0 -> 10
0 1 1 -> 01
1 0 0 -> 10
1 0 1 -> 01
1 1 0 -> 11
1 1 1 -> 11
This gives 0.375 chance for 01 or 10, and 0.25 chance for 11. So this will slightly increase the risk of conflicts between duplicate 10 or 01 values.
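The two-bit example can be reproduced by listing every equally likely outcome and counting. This sketch covers the first forced-insertion rule and plain rejection of 00; the helper names are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distribution(outcomes):
    """Exact probability of each string when all outcomes are equally likely."""
    outcomes = list(outcomes)
    counts = Counter(outcomes)
    return {s: Fraction(c, len(outcomes)) for s, c in counts.items()}


def forced_one(first, where):
    # Keep the first bit as-is and insert a 1 at index `where`.
    bits = [str(first)]
    bits.insert(where, "1")
    return "".join(bits)


# Two random bits, with a 1 always forced into the string.
forced = distribution(forced_one(a, b) for a, b in product((0, 1), repeat=2))

# Rejection: draw both bits at random and discard 00.
rejection = distribution(
    f"{a}{b}" for a, b in product((0, 1), repeat=2) if (a, b) != (0, 0)
)

print(forced)
print(rejection)
```

Rejection gives each of 01, 10 and 11 probability 1/3, while the forced rule gives 11 probability 1/2.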
Reducing Conflicts
If you are open to using all letters instead of just 'a' through 'f' (hexadecimal digits), you could alter the definition of letters as noted in the comments. This will give much more diverse strings and much less chance of conflict. If you generated 1,000 strings allowing all upper- and lowercase letters, you'd only have about a 0.0009% chance of generating any duplicates, vs. 3% chance with hex strings only. (This will also virtually eliminate double-passes through the loop.)
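The duplicate probabilities quoted above can be estimated with the usual birthday-problem approximation. The sketch assumes that "all upper- and lowercase letters" means a 62-character alphabet (digits plus 52 letters) with the at-least-one-letter rule still applied; the function name is made up for illustration.

```python
import math


def p_duplicate(n_keys, n_possible):
    """Approximate chance of at least one repeat among n_keys uniform draws."""
    # 1 - exp(-n(n-1) / 2N), written with expm1 to stay accurate for tiny values.
    return -math.expm1(-n_keys * (n_keys - 1) / (2 * n_possible))


hex_space = 16**6 - 10**6      # hex strings with at least one letter
alnum_space = 62**6 - 10**6    # assumed alphabet: 0-9, a-z, A-Z

print(f"hex only:     {p_duplicate(1000, hex_space):.4%}")
print(f"alphanumeric: {p_duplicate(1000, alnum_space):.6%}")
```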
If you really want to avoid conflicts between strings, you could store all the values you've generated previously in a set and check against that before breaking from the loop. This would be good if you are going to generate fewer than about 5 million keys. Beyond that, you'd need quite a bit of RAM to hold the old keys, and it might take a few runs through the loop to find an unused key.
If you need to generate more keys than that, you could encrypt a counter, as described at Generating non-repeating random numbers in Python. The counter and its encrypted version would both be ints in the range of 0 to 15,777,215. The counter would just count up from 0, and the encrypted version would look like a random number. Then you would convert the encrypted version to hex using the third code example above. If you do this, you should generate a random encryption key at the start, and change the encryption key each time the counter rolls past your maximum, to avoid producing the same sequence again.
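For the simpler option of remembering previously issued values in a set, a minimal sketch could look like this. The function name new_key and the seen parameter are made up for illustration; the set lives only in memory, so it would need to be persisted separately if keys must stay unique across runs.

```python
import random

HEX_CHARS = "0123456789abcdef"


def new_key(seen, length=6):
    """Return a hex string with at least one letter that is not yet in seen."""
    while True:
        val = "".join(random.choice(HEX_CHARS) for _ in range(length))
        # Reject all-digit strings and anything handed out before.
        if not val.isdigit() and val not in seen:
            seen.add(val)
            return val


issued = set()
batch = [new_key(issued) for _ in range(1000)]
print(batch[:3], len(set(batch)))
```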
The following approach works as follows: first pick one random letter to satisfy rule 2, then select 4 random entries from the list of all available characters, and shuffle the resulting list. Lastly, prepend one value taken from the list of all entries except 0, so that the string has 6 characters.
import random
all_chars = "0123456789abcdef"  # renamed from all to avoid shadowing the built-in
result = [random.choice('abcdef')] + [random.choice(all_chars) for _ in range(4)]
random.shuffle(result)
result.insert(0, random.choice(all_chars[1:]))
print(''.join(result))
Giving you something like:
3b7a4e
This approach avoids having to repeatedly check the result to ensure that it satisfies the rules.
Note: Updated the answer for a unique hexadecimal string. Earlier I had assumed an alphanumeric string.
You may create your own unique function using uuid and random library
>>> import uuid
>>> import random
# Step 1: Slice uuid with 5 i.e. new_id = str(uuid.uuid4())[:5]
# Step 2: Convert string to list of char i.e. new_id = list(new_id)
>>> uniqueval = list(str(uuid.uuid4())[:5])
# uniqueval = ['f', '4', '4', '4', '5']
# Step 3: Generate random number between 0-4 to insert new char i.e.
# random.randint(0, 4)
# Step 4: Get random char between a-f (for Hexadecimal char) i.e.
# chr(random.randint(ord('a'), ord('f')))
# Step 5: Insert random char to random index
>>> uniqueval.insert(random.randint(0, 4), chr(random.randint(ord('a'), ord('f'))))
# uniqueval = ['f', '4', '4', '4', 'f', '5']
# Step 6: Join the list
>>> uniqueval = ''.join(uniqueval)
# uniqueval = 'f444f5'
This function returns the nth string conforming to your requirements, so you can simply generate unique integers and convert them using this function.
def inttohex(number, digits):
    # there must be at least one character:
    fullhex = 16**(digits - 1)*6
    assert number < fullhex
    partialnumber, remainder = divmod(number, digits*6)
    # remainder // digits picks the letter (0-5), remainder % digits its position
    charindex, charposition = divmod(remainder, digits)
    char = ['a', 'b', 'c', 'd', 'e', 'f'][charindex]
    hexconversion = list("{0:0{1}x}".format(partialnumber, digits-1))
    hexconversion.insert(charposition, char)
    return ''.join(hexconversion)
Now you can get a particular one using for instance
import random
digits = 6
# randrange excludes the upper bound, so the assert above always holds
inttohex(random.randrange(6*16**(digits-1)), digits)
You can't have maximum randomness along with minimum probability of conflict. I recommend keeping track of which numbers you have handed out or if you are looping through all of them somehow, using a randomly sorted list.

Getting bad max and min values from list

I used split to remove the whitespace and turn the string into a list, then tried to find the max and min values of the list with the built-in functions, but I got an incorrect answer. I also want the answer in the format "x y", where x and y are the max and min respectively.
When I print the list, every element of the list is wrapped in ' '.
Thanks in Advance.
My code:
def high_and_low(numbers):
    numbers = numbers.split()
    numbers = list(numbers)
    return max(numbers),min(numbers)
print(high_and_low("4 5 29 54 4 0 -214 542 -64 1 -3 6 -6"))
split returns strings, and you don't convert the strings to actual numbers. When comparing strings, the meaning of the comparison is different than when comparing numbers:
>>> '2' > '10'
True
So you need to change your function to something like this:
In [1]: def high_and_low(s):
   ...:     numbers = [int(x) for x in s.split()]
   ...:     return max(numbers), min(numbers)
   ...:
In [2]: high_and_low("4 5 29 54 4 0 -214 542 -64 1 -3 6 -6")
Out[2]: (542, -214)
min and max take a key argument, so if you don't actually want ints returned you can use key=int to compare the strings as integers rather than as strings, which is what you are currently doing:
def high_and_low(numbers):
    numbers = numbers.split()
    return max(numbers,key=int),min(numbers,key=int)
Or use map to cast the strings to int after splitting if you want ints:
def high_and_low(numbers):
    # list() so the values can be scanned twice (map is a one-shot iterator in Python 3)
    numbers = list(map(int,numbers.split()))
    return max(numbers),min(numbers)
numbers is already a list after splitting, so numbers = list(numbers) is redundant.
from the docs:
sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted
So for min you only get the correct answer, -214, because '-2' sorts before '-6' and '-3'; once you add -1 or anything starting with -1, you would again see incorrect output.
For the max you get '6' because "6" is greater than the first character of every other substring.
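The difference between comparing strings and comparing numbers can be seen directly on the sample input. The function name high_and_low_str is made up for illustration; it returns the "x y" format asked for in the question.

```python
s = "4 5 29 54 4 0 -214 542 -64 1 -3 6 -6"
parts = s.split()

# Lexicographic comparison of strings: character by character.
print(max(parts), min(parts))            # 6 -214
# Comparing by numeric value while still returning the original strings.
print(max(parts, key=int), min(parts, key=int))  # 542 -214


def high_and_low_str(numbers):
    """Return "max min" as a single space-separated string."""
    values = [int(x) for x in numbers.split()]
    return f"{max(values)} {min(values)}"


print(high_and_low_str(s))               # 542 -214
```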

Word ranking partial completion [duplicate]

This question already has answers here:
Finding the ranking of a word (permutations) with duplicate letters
(6 answers)
Closed 8 years ago.
I am not sure how to solve this problem within the constraints.
Shortened problem formulation:
"Word" as any sequence of capital letters A-Z (not limited to just "dictionary words").
Consider list of permutations of all characters in a word, sorted lexicographically
Find a position of original word in such a list
Do not generate all possible permutations of a word, since it won't fit in time-memory constraints.
Constraints: word length <= 25 characters; memory limit 1Gb, any answer should fit in 64-bit integer
Original problem formulation:
Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words"). For any word with at least two different letters, there are other words composed of the same letters but in a different order (for instance, STATIONARILY/ANTIROYALIST, which happen to both be dictionary words; for our purposes "AAIILNORSTTY" is also a "word" composed of the same letters as these two). We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same set of letters. One way to do this would be to generate the entire list of words and find the desired one, but this would be slow if the word is long. Write a program which takes a word as a command line argument and prints to standard output its number. Do not use the method above of generating the entire list. Your program should be able to accept any word 25 letters or less in length (possibly with some letters repeated), and should use no more than 1 GB of memory and take no more than 500 milliseconds to run. Any answer we check will fit in a 64-bit integer.
Sample words, with their rank:
ABAB = 2
AAAB = 1
BAAA = 4
QUESTION = 24572
BOOKKEEPER = 10743
examples:
AAAB - 1
AABA - 2
ABAA - 3
BAAA - 4
AABB - 1
ABAB - 2
ABBA - 3
BAAB - 4
BABA - 5
BBAA - 6
I came up with what I think is only a partial solution.
Imagine I have the word JACBZPUC. I sort the word and get ABCCJPUZ This should be rank 1 in the word rank. From ABCCJPUZ to the first alphabetical word right before the word starting with J I want to find the number of permutations between the 2 words.
ex:
for `JACBZPUC`
sorted --> `ABCCJPUZ`
permutations that start with A -> 8!/2!
permutations that start with B -> 8!/2!
permutations that start with C -> 8!/2!
Add the 3 values -> 60480
The other C is disregarded as the permutations would have the same values as the previous C (duplicates)
At this point I have the ranks from ABCCJPUZ to the word right before the word that starts with J
ABCCJPUZ rank 1
...
... 60480 values
...
*HERE*
JABCCPUZ rank 60481 LOCATION A
...
...
...
JACBZPUC rank ??? LOCATION B
I'm not sure how to get the values between Locations A and B:
Here is my code to find the 60480 values
import itertools
def perm(word):
    return len(set(itertools.permutations(word)))
def swap(word, i, j):
    word = list(word)
    word[i], word[j] = word[j], word[i]
    print(word)
    return ''.join(word)
def compute(word):
    if ''.join(sorted(word)) == word:
        return 1
    total = 0
    sortedWord = ''.join(sorted(word))
    beforeFirstCharacterSet = set(sortedWord[:sortedWord.index(word[0])])
    print(beforeFirstCharacterSet)
    for i in beforeFirstCharacterSet:
        total += perm(swap(sortedWord,0,sortedWord.index(i)))
    return total
Here is a solution I found online to solve this problem.
Consider the n-letter word { x1, x2, ... , xn }. My solution is based on the idea that the word number will be the sum of two quantities:
The number of combinations starting with letters lower in the alphabet than x1, and
how far we are into the the arrangements that start with x1.
The trick is that the second quantity happens to be the word number of the word { x2, ... , xn }. This suggests a recursive implementation.
Getting the first quantity is a little complicated:
Let uniqLowers = { u1, u2, ... , um } = all the unique letters lower than x1
For each uj, count the number of permutations starting with uj.
Add all those up.
I think I complete step number 1 but not number 2. I am not sure how to complete this part
Here is the Haskell solution...I don't know Haskell =/ and I am trying to write this program in Python
https://github.com/david-crespo/WordNum/blob/master/comb.hs
The idea of finding the number of permutations of the letters before the actual first letter is good. But your calculation:
for `JACBZPUC`
sorted --> `ABCCJPUZ`
permutations that start with A -> 8!/2!
permutations that start with B -> 8!/2!
permutations that start with C -> 8!/2!
Add the 3 values -> 60480
is wrong. There are only 8!/2! = 20160 permutations of JACBZPUC, so the starting position can't be as large as 60480. In your method the first letter is fixed, so you can only permute the seven letters that follow it. So:
permutations that start with A: 7! / 2! == 2520
permutations that start with B: 7! / 2! == 2520
permutations that start with C: 7! / 1! == 5040
-----
10080
You don't divide by 2! to find the permutations beginning with C, because the seven remaining letters are unique; there's only one C left.
Here's a Python implementation:
def fact(n):
    """factorial of n, n!"""
    f = 1
    while n > 1:
        f *= n
        n -= 1
    return f
def rrank(s):
    """Back-end to rank for 0-based rank of a list permutation"""
    # trivial case
    if len(s) < 2: return 0
    order = s[:]
    order.sort()
    denom = 1
    # account for multiple occurrences of letters
    for i, c in enumerate(order):
        n = 1
        while i + n < len(order) and order[i + n] == c:
            n += 1
        denom *= n
    # starting letters alphabetically before current letter
    pos = order.index(s[0])
    # recurse to list without its head
    # (// keeps the result an exact integer in Python 3)
    return fact(len(s) - 1) * pos // denom + rrank(s[1:])
def rank(s):
    """Determine 1-based rank of string permutation"""
    return rrank(list(s)) + 1
strings = [
    "ABC", "CBA",
    "ABCD", "BADC", "DCBA", "DCAB", "FRED",
    "QUESTION", "BOOKKEEPER", "JACBZPUC",
    "AAAB", "AABA", "ABAA", "BAAA"
]
for s in strings:
    print(s, rank(s))
The second part of the solution you have found is also --I think-- what I was about to suggest:
To go from what you call "Location A" to "Location B", you have to find the position of word ACBZPUC among its possible permutations. Consider that a new question to your algorithm, with a new word that just happens to be one position shorter than the original one.
The words in the alphabetical list between JABCCPUZ, which you know the position of, and JACBZPUC, which you want to find the position of, all start with J. Finding the position of JACBZPUC relative to JABCCPUZ, then, is equivalent to finding the relative positions of those two words with the initial J removed, which is the same as the problem you were trying to solve initially but with a word one character shorter.
Repeat that process enough times and you will be left with a word that contains a single character, C. The position of a word with a single character is known to always be 1, so you can then sum that and all of the previous relative positions for an absolute position.
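The "strip the first letter and repeat" process can also be written as a loop instead of a recursion. The sketch below is one way to do it, using a letter count that shrinks as each position is consumed; the names arrangements and word_rank are made up for illustration, and the test values are the sample ranks from the question.

```python
from collections import Counter
from math import factorial


def arrangements(counts):
    """Number of distinct orderings of a multiset given as letter -> count."""
    total = factorial(sum(counts.values()))
    for c in counts.values():
        total //= factorial(c)
    return total


def word_rank(word):
    """1-based position of word among the sorted distinct permutations of its letters."""
    counts = Counter(word)
    rank = 1
    for ch in word:
        # Count every arrangement that puts a smaller letter at this position.
        for smaller in sorted(counts):
            if smaller >= ch:
                break
            counts[smaller] -= 1
            rank += arrangements(counts)
            counts[smaller] += 1
        # Fix this letter and continue with the remaining ones.
        counts[ch] -= 1
        if counts[ch] == 0:
            del counts[ch]
    return rank


for w in ["ABAB", "AAAB", "BAAA", "QUESTION", "BOOKKEEPER"]:
    print(w, word_rank(w))
```

Only integer arithmetic is used, so the result stays exact for words up to the 25-letter limit.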
