I would like to find all subsets of a sorted string, disregarding order and which characters are next to each other. I think the best way to explain this is through an example. The results should also be ordered from longest to shortest.
These are the results for bell.
bell
bel
bll
ell
be
bl
el
ll
b
e
l
I have thought of ways to do this, but none for any length of input.
Thank you!
There are generally two ways to approach such things: generate "everything" and weed out duplicates later, or create custom algorithms to avoid generating duplicates to begin with. The former is almost always easier, so that's what I'll show here:
def gensubsets(s):
    import itertools
    for n in reversed(range(1, len(s)+1)):
        seen = set()
        for x in itertools.combinations(s, n):
            if x not in seen:
                seen.add(x)
                yield "".join(x)

for x in gensubsets("bell"):
    print(x)
That prints precisely what you said you wanted, and how it does so should be more-than-less obvious.
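The second approach mentioned above, avoiding duplicates from the start, could be sketched roughly as follows. This is only an illustration and not part of the original answer: the name multiset_subsets is made up, and it assumes Python 3.7+ so that Counter keeps letters in first-seen order. Instead of choosing positions in the string, it chooses how many copies of each distinct letter to keep.

```python
from collections import Counter
from itertools import product

def multiset_subsets(s):
    """Return each distinct non-empty sub-multiset of s once, longest first."""
    counts = Counter(s)       # 'bell' -> b: 1, e: 1, l: 2
    letters = list(counts)    # distinct letters, in first-seen order
    # For every distinct letter, pick how many copies (0..count) to keep.
    choices = [range(counts[c] + 1) for c in letters]
    results = []
    for picks in product(*choices):
        word = "".join(c * k for c, k in zip(letters, picks))
        if word:              # skip the empty selection
            results.append(word)
    # sorted() is stable, so only the length ordering is imposed here.
    return sorted(results, key=len, reverse=True)

for word in multiset_subsets("bell"):
    print(word)
```

Note that the order among strings of the same length may differ from the listing in the question; only the longest-to-shortest ordering is guaranteed by this sketch.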
Here is one way using itertools.combinations.
If the order for strings of the same length is important, see @TimPeters' answer.
from itertools import combinations

mystr = 'bell'

res = sorted({''.join(sorted(x, key=lambda j: mystr.index(j)))
              for i in range(1, len(mystr)+1) for x in combinations(mystr, i)},
             key=lambda k: -len(k))

# ['bell', 'ell', 'bel', 'bll', 'be', 'll', 'bl', 'el', 'l', 'e', 'b']
Explanation
Find all combinations of length in range(1, len(mystr)+1).
Sort by original string via key argument of sorted. This step may be omitted if not required.
Use set of ''.join on elements for unique strings.
Outer sorted call to go from largest to smallest.
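You can try it in one line. The range has to run from 1 to len(data) inclusive, otherwise the full string 'bell' is left out:
import itertools
data = 'bell'
print(set("".join(i) for t in range(1, len(data) + 1) for i in itertools.combinations(data, r=t)))
Output (a set, so the order is arbitrary):
{'bell', 'bel', 'bll', 'ell', 'el', 'be', 'bl', 'e', 'b', 'l', 'll'}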
Related
I would like to generate a list of combinations. I will try to simplify my problem to make it understandable.
We have 3 variables :
x : number of letters
k : number of groups
n : number of letters per group
I would like to use Python to generate a list of every possible combination, without any duplicates, knowing that I don't care about the order of the groups or the order of the letters within a group.
As an example, with x = 4, k = 2, n = 2 :
# we start with 4 letters, we want to make 2 groups of 2 letters
letters = ['A','B','C','D']
# here would be a code that generate the list
# Here is the result that is very simple, only 3 combinations exist.
combos = [ ['AB', 'CD'], ['AC', 'BD'], ['AD', 'BC'] ]
Since I don't care about the order of the groups or of the letters within a group, ['AB', 'CD'] and ['DC', 'BA'] are duplicates.
This is a simplification of my real problem, which has these values: x = 12, k = 4, n = 3. I tried to use some functions from itertools, but with that many letters my computer freezes because there are too many combinations.
Another way of seeing the problem: you have 12 players and you want to make 4 teams of 3 players. What are all the possibilities?
Could anyone help me to find an optimized solution to generate this list?
There will certainly be more sophisticated/efficient ways of doing this, but here's an approach that works in a reasonable amount of time for your example and should be easy enough to adapt for other cases.
It generates unique teams and unique combinations thereof, as per your specifications.
from itertools import combinations

# this assumes that team_size * team_num == len(players) is a given
team_size = 3
team_num = 4
players = list('ABCDEFGHIJKL')

unique_teams = [set(c) for c in combinations(players, team_size)]

def duplicate_player(combo):
    """Returns True if a player occurs in more than one team"""
    return len(set.union(*combo)) < len(players)

result = (combo for combo in combinations(unique_teams, team_num) if not duplicate_player(combo))
result is a generator that can be iterated or turned into a list with list(result). On kaggle.com, it takes a minute or so to generate the whole list of all possible combinations (a total of 15400, in line with the computations by @beaker and @John Coleman in the comments). Each combination is a tuple of sets, one set per team, and looks like this:
[({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}),
({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'J'}, {'I', 'K', 'L'}),
({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'K'}, {'I', 'J', 'L'}),
...
]
If you want, you can cast them into strings by calling ''.join() on each of them.
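As a sanity check on the total of 15400, the number of ways to split the players into unlabeled teams of equal size can be computed directly: take the permutations of all players and divide by the orderings within each team and by the orderings of the teams themselves. The sketch below is an illustration only; the names count_equal_teams and combo_to_strings are made up. It also shows one way to turn a combination of sets into strings, sorting each team first because sets have no fixed order.

```python
from math import factorial

def count_equal_teams(num_players, team_size):
    """Number of ways to split num_players into unordered teams of team_size."""
    if num_players % team_size != 0:
        raise ValueError("team_size must divide num_players")
    team_num = num_players // team_size
    # n! / ((size!)^teams * teams!)
    return factorial(num_players) // (factorial(team_size) ** team_num * factorial(team_num))

def combo_to_strings(combo):
    """Turn a tuple of sets such as ({'B', 'A'}, {'D', 'C'}) into ['AB', 'CD']."""
    return sorted("".join(sorted(team)) for team in combo)

print(count_equal_teams(12, 3))  # 15400
print(count_equal_teams(4, 2))   # 3
print(combo_to_strings(({'B', 'A'}, {'D', 'C'})))  # ['AB', 'CD']
```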
Another solution (players are numbered 0, 1, ...):
import itertools

def equipartitions(base_count: int, group_size: int):
    if base_count % group_size != 0:
        raise ValueError("group_size must divide base_count")
    return set(_equipartitions(frozenset(range(base_count)), group_size))

def _equipartitions(base_set: frozenset, group_size: int):
    if not base_set:
        # nothing left to split: one empty partition, then stop
        yield frozenset()
        return
    for combo in itertools.combinations(base_set, group_size):
        for rest in _equipartitions(base_set.difference(frozenset(combo)), group_size):
            yield frozenset({frozenset(combo), *rest})

all_combinations = [
    [tuple(team) for team in combo]
    for combo in equipartitions(12, 3)
]
print(all_combinations)
print(len(all_combinations))
And another:
import itertools
from typing import Collection

def equipartitions(players: Collection, team_size: int):
    if len(players) % team_size != 0:
        raise ValueError("team_size must divide the number of players")
    return _equipartitions(set(players), team_size)

def _equipartitions(players: set, team_size: int):
    if not players:
        yield []
        return
    # Always put the first remaining player in the first team,
    # so the same partition is never produced twice.
    first_player, *other_players = players
    for other_team_members in itertools.combinations(other_players, team_size-1):
        first_team = {first_player, *other_team_members}
        for other_teams in _equipartitions(set(other_players) - set(first_team), team_size):
            yield [first_team, *other_teams]

all_combinations = [
    {''.join(sorted(team)) for team in combo} for combo in equipartitions(players='ABCDEFGHIJKL', team_size=3)
]
print(all_combinations)
print(len(all_combinations))
Firstly, you can use a list comprehension to give you all of the possible combinations (regardless of the duplicates):
comb = [(a,b) for a in letters for b in letters if a != b]
And, afterwards, you can use the sorted function to sort the tuples. After that, to remove the duplicates, you can convert all of the items to a set and then back to a list.
var = [tuple(sorted(sub)) for sub in comb]
var = list(set(var))
You could use the list comprehension approach, which produces n(n-1) ordered pairs, or a more verbose loop that produces only half as many, n(n-1)/2. Both are O(n^2), but the second never generates the mirrored duplicates in the first place:
comb = []
for first_letter_idx, _ in enumerate(letters):
    for sec_letter_idx in range(first_letter_idx + 1, len(letters)):
        comb.append(letters[first_letter_idx] + letters[sec_letter_idx])
print(comb)

comb2 = []
for first_letter_idx, _ in enumerate(comb):
    for sec_letter_idx in range(first_letter_idx + 1, len(comb)):
        if (comb[first_letter_idx][0] not in comb[sec_letter_idx]
                and comb[first_letter_idx][1] not in comb[sec_letter_idx]):
            comb2.append([comb[first_letter_idx], comb[sec_letter_idx]])
print(comb2)
This algorithm needs more work to handle dynamic inputs. Maybe with recursion.
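Use combinations from itertools (this only covers the example of 4 letters in 2 pairs):
from itertools import combinations

x = list(combinations(['A', 'B', 'C', 'D'], 2))
t = [a + b for a, b in x]  # join each pair: ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']

# With 4 letters split into 2 pairs, the complement of the i-th pair is the
# i-th pair counted from the end, so match them up from both ends.
g = [[t[i], t[-1 - i]] for i in range(len(t) // 2)]
print(g)  # [['AB', 'CD'], ['AC', 'BD'], ['AD', 'BC']]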
I need help with a basic Python script. I want to order the words of a sentence from longest to shortest, without repeats. Up to that point everything is fine; what I don't know is how to order words of the same length alphabetically.
Since you need two sort keys with opposite directions (length descending, words ascending), you'll have to do it differently. You can consult the question "Sort by multiple keys using different orderings", but since it doesn't cover exactly what you want, I'll answer it here:
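from itertools import groupby

s = ['ddd', 'bb', 'ab', 'aa', 'cc', 'dab']
# groupby only groups consecutive items, so order by length (longest first) beforehand
by_length = sorted(s, key=len, reverse=True)
l = [sorted(g) for _, g in groupby(by_length, key=len)]
l = [e for x in l for e in x]
>>> l
['dab', 'ddd', 'aa', 'ab', 'bb', 'cc']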
This sorts by descending length but ascending alphabetical order within each length. Explanation: the first list comprehension turns the list of strings into a list of lists, each holding the alphabetically sorted strings of one length. The second list comprehension flattens that list of lists into a single list.
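If groupby is not needed, the same ordering can probably be had from a single sorted call with a tuple key: negate the length so that longer words come first, then fall back to the word itself for ties. A minimal sketch, assuming duplicates should be dropped as the question asks (the function name order_words is made up):

```python
def order_words(words):
    """Longest first; words of equal length alphabetically; duplicates removed."""
    return sorted(set(words), key=lambda w: (-len(w), w))

print(order_words(['ddd', 'bb', 'ab', 'aa', 'cc', 'dab', 'bb']))
# ['dab', 'ddd', 'aa', 'ab', 'bb', 'cc']
```

Because the key is a tuple, Python compares lengths first and only looks at the word itself when two lengths are equal.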
I have a simple list where the numbers are strings:
simple_list = ['1','2','3','4','5','K','P']
I would like to sort this first by alpha, then numerically.
Currently I'm doing:
# Probably a faster way to handle this
alpha_list = [x for x in simple_list if not x.isnumeric()]
grade_list = [x for x in simple_list if x.isnumeric()]
# Put the alpha grades at the beginning of the grade_list
if alpha_list:
    grade_list = sorted(alpha_list) + sorted(grade_list)
I'm sure there is a faster way to handle this - I just can't seem to find it.
The result I currently get is correct ['K','P','1','2','3','4','5']
I just wanted to know if there was a way I could condense all that down that would be more efficient than multiple list comprehensions.
You can sort the list with a key function that returns a tuple of str.isdigit() test and the string, and if the string is found to be digits, convert it to an integer:
sorted(simple_list, key=lambda c: (c.isdigit(), int(c) if c.isdigit() else c))
This returns:
['K', 'P', '1', '2', '3', '4', '5']
I am using the sorted() function to sort the text based on the last character, which works perfectly:
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-1]
    return sorted(strings,key=last_letter)

print(sort_by_last_letter(["hello","from","last","letter","a"]))
Output
['a', 'from', 'hello', 'letter', 'last']
My requirement is to sort based on the third-to-last character. The problem is that a few of the words have fewer than 3 characters; in that case they should be sorted by the next available character (the second-to-last if present, else the last). I am looking for a Pythonic way to do this.
Presently I am getting
IndexError: string index out of range
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-3]
    return sorted(strings,key=last_letter)

print(sort_by_last_letter(["hello","from","last","letter","a"]))
You can use:
return sorted(strings,key=lambda x: x[max(0,len(x)-3)])
Here we first compute len(x) - 3, the index of the third-to-last character. If the string is shorter than three characters this is negative, and max(0, ...) clamps it to 0, so we take the first character instead (that is, the second-to-last or the last character of the short string).
This works as long as every string has at least one character. It produces:
>>> sorted(["hello","from","last","letter","a"],key=lambda x: x[max(0,len(x)-3)])
['last', 'a', 'hello', 'from', 'letter']
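To see why that order comes out, it can help to look at the character each word is actually sorted by. A small sketch (the helper name third_from_end is made up for illustration):

```python
def third_from_end(s):
    """Third-to-last character, or the first character if s is shorter than 3."""
    return s[max(0, len(s) - 3)]

words = ["hello", "from", "last", "letter", "a"]

# Sort keys: hello -> 'l', from -> 'r', last -> 'a', letter -> 't', a -> 'a'
for w in words:
    print(w, "->", third_from_end(w))

# 'last' and 'a' share the key 'a'; sorted() is stable, so they keep input order.
print(sorted(words, key=third_from_end))
# ['last', 'a', 'hello', 'from', 'letter']
```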
If you do not mind how ties are broken (in other words, if words such as 'a' and 'abc' may end up in a different relative order), you can use a more elegant approach that compares the last three characters as a whole:
from operator import itemgetter
return sorted(strings,key=itemgetter(slice(-3,None)))
Here we build a slice covering the last three characters and compare those substrings. This gives:
>>> sorted(strings,key=itemgetter(slice(-3,None)))
['a', 'last', 'hello', 'from', 'letter']
Since we compare with:
['a', 'last', 'hello', 'from', 'letter']
# ['a', 'ast', 'llo', 'rom', 'ter'] (comparison key)
You can simply use the minimum of the string length and 3:
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-min(len(s), 3)]
    return sorted(strings,key=last_letter)

print(sort_by_last_letter(["hello","from","last","letter","a"]))
I know how to generate combinations of a set and that's a builtin in Python (what I use), anyway. But how to generate combinations with replacements?
Suppose I have a set with, say, two identical elements - for example, AABCDE.
Combinations of 3 items could be:
"AAB"
"ABC"
"CDE"
However, the program would count ABC twice - once when using the first A, and the second one using the second A.
What is a good way to generate such combinations without duplicates?
Thanks.
Convert it to a set; that's the easiest way to get rid of duplicates.
>>> import itertools
>>> ["".join(x) for x in (itertools.combinations(set("AABCDE"),3))]
['ACB', 'ACE', 'ACD', 'ABE', 'ABD', 'AED', 'CBE', 'CBD', 'CED', 'BED']
>>>
From your other comments, I think I misunderstood what you are asking.
>>> import itertools
>>> set("".join(x) for x in (itertools.combinations("AABCDE",3)))
set(['AAE', 'AAD', 'ABC', 'ABD', 'ABE', 'AAC', 'AAB', 'BCD', 'BCE', 'ACD', 'CDE', 'ACE', 'ADE', 'BDE'])
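Building every combination and then discarding repeats is fine for short inputs. If the input has many repeated elements, one alternative is to never generate the repeats in the first place, by deciding how many copies of each distinct element to take. The following is only a sketch of that idea, not code from the answers above, and the name multiset_combinations is made up:

```python
from collections import Counter

def multiset_combinations(items, r):
    """Return each distinct r-element combination of items exactly once."""
    counts = sorted(Counter(items).items())  # [(value, how_many_available), ...]

    def rec(idx, remaining):
        if remaining == 0:
            yield ()          # combination is complete
            return
        if idx == len(counts):
            return            # ran out of distinct values
        value, available = counts[idx]
        # Take k copies of the current value, then fill the rest from later values.
        for k in range(min(available, remaining), -1, -1):
            for tail in rec(idx + 1, remaining - k):
                yield (value,) * k + tail

    return ["".join(combo) for combo in rec(0, r)]

result = multiset_combinations("AABCDE", 3)
print(len(result))  # 14
print(result)
```

For "AABCDE" this yields the same 14 strings as the set-based version, without ever producing "ABC" twice.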
def stepper_w_w(l,stop):#stepper_with_while
    """l is a list of any size usually you would input [1,1,1,1...],
    stop is the highest number you want to stop at so if you put in stop=5
    the sequence would stop at [5,5,5,5...]
    This stepper shows the first number that equals the last.
    This generates combinations with replacement. """
    numb1=1
    while numb1<stop:
        #print(numb1)
        l[0]=numb1
        NeL=0
        while l[len(l)-1]<=numb1:
            if l[NeL]==l[len(l)-1]:
                l[NeL]+=1
                l[(NeL+1):]=[1]*((len(l))-(NeL+1))
                print(l)
                """iter_2s=NeL+1
                while iter_2s<=(len(l)-1): #this is different from above
                    l[iter_2s]=2
                    iter_2s+=1
                print(l)"""
                NeL=-1
            NeL+=1
        numb1+=1
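For comparison, the standard library already provides combinations with replacement as itertools.combinations_with_replacement, which may be simpler than a hand-written stepper. A minimal usage sketch:

```python
from itertools import combinations_with_replacement

# Every way to pick 2 values from [1, 2, 3] when a value may be picked again
pairs = list(combinations_with_replacement([1, 2, 3], 2))
print(pairs)
# [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]
```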