Replacing chars in a string in every way - python

I'm looking for help on a function that takes a string, and replaces every character in that string in every way. I'm not quite sure how to word my question so that it makes sense so I'll show you what it's supposed to do.
stars('1')
returns ['*']
stars('12')
returns ['*2', '1*', '**']
stars('123')
returns ['*23', '1*3', '12*', '**3', '*2*', '1**', '***']
stars('1234')
returns ['*234', '1*34', '12*4', '123*', '**34', '*2*4', '*23*', '1**4', '1*3*',
'12**', '***4', '**3*', '*2**', '1***', '****']
Did that all out by hand, but even if I made a mistake, you should get the idea of what I'm looking for now. The final case (all *'s) isn't required but I put it in there to make sure the problem was understood.
Here is what I've come up with so far but it doesn't quite work.
def stars(n):
    lst = []
    length = len(n)
    for j in xrange(0, length):
        p = list(n)
        for k in xrange(j, length):
            p[k] = '*'
            lst += [''.join(p)]
    return lst
Output:
'1' returns ['*']
'12' returns ['*2', '**', '1*']
'123' returns ['*23', '**3', '***', '1*3', '1**', '12*']
'1234' returns ['*234', '**34', '***4', '****', '1*34', '1**4', '1***', '12*4', '12**', '123*']
Any help would be greatly appreciated. Would like this answered in Python if possible, but if you don't know Python, then pseudocode or another language would be acceptable. If it's written clearly, I'm sure I could convert it into Python on my own.

I think the canonical approach in Python would be to use the itertools module:
>>> from itertools import product, cycle
>>> s = 'abcde'
>>> [''.join(chars) for chars in product(*zip(s, cycle('*')))]
['abcde', 'abcd*', 'abc*e', 'abc**', 'ab*de', 'ab*d*', 'ab**e', 'ab***',
'a*cde', 'a*cd*', 'a*c*e', 'a*c**', 'a**de', 'a**d*', 'a***e', 'a****',
'*bcde', '*bcd*', '*bc*e', '*bc**', '*b*de', '*b*d*', '*b**e', '*b***',
'**cde', '**cd*', '**c*e', '**c**', '***de', '***d*', '****e', '*****']
and then you could just toss the first one without any stars, but that might seem a little magical.
ISTM you have two other approaches if you don't want to use the built-in Cartesian product function: you can use recursion, or you can take advantage of the fact that you want to turn each star on and off, a binary switch. That means with n letters you'll have 2^n (-1, if you remove the no-star case) possibilities to return, and whether or not to put a star somewhere corresponds to whether or not the corresponding bit in the number is set (e.g. for 'abc' you'd loop from 1 to 7 inclusive, 1 = 001 so you'd put a star in the last place, 7 = 111 so you'd put a star everywhere, etc.)
This last one is pretty simple to implement, so I'll leave that for you. :^)
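For reference, here is a minimal sketch of that bit-switch idea, written for Python 3 (so range instead of xrange). The name stars comes from the question; the rest is just one possible way to write it, with the leftmost character mapped to the highest bit.

```python
def stars(s):
    """Return every variant of s with at least one character replaced by '*'."""
    n = len(s)
    results = []
    # Each mask from 1 to 2**n - 1 selects a non-empty set of positions.
    for mask in range(1, 2 ** n):
        chars = []
        for pos, ch in enumerate(s):
            # The leftmost character corresponds to the highest bit.
            bit = 1 << (n - pos - 1)
            chars.append('*' if mask & bit else ch)
        results.append(''.join(chars))
    return results

print(stars('abc'))
# ['ab*', 'a*c', 'a**', '*bc', '*b*', '**c', '***']
```

Starting the loop at 0 instead of 1 would also include the unchanged string.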

You can look at this as a problem of finding and iterating over all subsequences of characters in your original string. (For every subsequence, replace the characters in it by '*', and leave the rest alone).
For a given subsequence, each character is either in it or not, so for an N-character string, there are 2^N subsequences. Probably the easiest way to iterate over them is to iterate over the integers from 0 to (2^N)-1, and use their binary representations to indicate whether each character should be replaced or not.
For N=3, it looks like this:
0 000 abc
1 001 ab*
2 010 a*c
3 011 a**
4 100 *bc
5 101 *b*
6 110 **c
7 111 ***
In Python, you could do it like this:
def stars(input):
    l = len(input)
    for i in xrange(2**l):
        yield ''.join([('*' if i&(2**(l-pos-1)) else ch) for pos, ch in enumerate(input)])
Try it out:
>>> print list(stars('abc'))
['abc', 'ab*', 'a*c', 'a**', '*bc', '*b*', '**c', '***']

Here's a way using combinations :
from itertools import combinations
def stars(str):
    N,L = len(str), []
    for k in range(0,N+1):
        for com in combinations(range(N),k):
            S = list(str)
            for x in com: S[x] = '*'
            L.append(''.join(S))
    return L
Try it out:
>>> stars('abc')
['abc', '*bc', 'a*c', 'ab*', '**c', '*b*', 'a**', '***']
>>> stars('1234')
['1234', '*234', '1*34', '12*4', '123*', '**34', '*2*4', '*23*', '1**4', '1*3*', '12**', '***4', '**3*', '*2**', '1***', '****']

Or specific to Python see this function: http://docs.python.org/2/library/itertools.html#itertools.combinations

Related

Is there anyway I can check whether part of a string in python is somewhere in another string? (beginner)

Let's say we have a string 'abc' and another string 'bcd'. If I do 'abc' in 'bcd' it will return False. I want to say "if there is a character of 'abc' in the string 'bcd', then return True". (python)
edit: thank you for the spelling changes. It makes me feel dumb. They were typos though.
I have tried iterating through the string using for loops, but this is clunky and I am assuming it is not good practice. Anyway I couldn't make it flexible enough for my needs.
import random
symb1 = random.choice('abc#')  # I am trying to test if it chose at least one symbol
symb2 = random.choice('abc!')
mystring = (symb1+symb2)  # let's say mystring is 'a!'
if mystring in '#!':  # I want to test here somehow if part of mystring is in '#!'
    ...
I want it to output true, and the output is false. I understand why, I just need help creating a way to test for the symbol in mystring
Iterate one of the Strings while doing in checks:
any(c in "#!" for c in mystring)
"Is there any c from mystring in '#!'?"
You could use list comprehensions.
Example;
>>> a = 'abc'
>>> b = 'bcd'
>>> [letter for letter in a if letter in b] # list comprehensions
['b', 'c']
>>> any(letter for letter in a if letter in b) # generator expression
True
as mentioned by #asikorski; change to use generator expression so the loop stops on the first match.
Okay the comprehensions made more sense but I decided to do a more expanded for loop. I found a way to make it less terrible. I'm just teaching python to a friend and I want him to be able to read the code more easily. here's the section of code in the program.
import random
import string

def punc():
    characters = int(input('How long would you like your password to be? : '))
    passlist = []
    for x in range(characters):
        add = random.choice(string.ascii_letters + '#####$$$$$~~~~!!!!!?????')
        passlist.append(add)
    l = []
    for x in passlist:
        if x in ['#','$','~','!','?']:
            l.append(0)
    if len(l) == 0:
        punc()
    else:
        print(''.join(passlist))
There are a few solutions:
the best one (both efficient and pythonic):
if any(char in 'bcd' for char in 'abc'):
    ...
Uses a generator + built-in any. It does not have to create a list with a list comprehension and doesn't waste time gathering all the letters that are in both strings, so it's memory efficient.
the boring one (~ C style):
def check_letters(word1, word2):
    for char in word1:
        if char in word2:
            return True
    return False
if check_letters('abc', 'bcd'):
    ...
Quite obvious.
the fancy one:
if set(list('abc')) & set(list('bcd')):
    ...
This one uses some tricks. list converts the string to a list of letters, and set creates a set of letters from the list. Then the intersection of the two sets is created with the & operator, and the if condition evaluates to True if there's any element in the intersection.
It's not very efficient, though; it has to create two lists and two sets.
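As a side note, set() accepts a string directly, so the list() calls are not needed, and set.isdisjoint can answer the question without building the intersection. A small sketch in Python 3; the helper name shares_a_character is made up for illustration:

```python
def shares_a_character(a, b):
    """Return True if at least one character of a also appears in b."""
    # isdisjoint is True when the set and the iterable have nothing in common,
    # so negate it to answer "is there any overlap?"
    return not set(a).isdisjoint(b)

print(shares_a_character('abc', 'bcd'))  # True
print(shares_a_character('a!', '#!'))    # True
print(shares_a_character('abc', 'xyz'))  # False
```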

How to loop to generate string in sequence?

I am trying to create a loop that generates strings. What I want is a small collection of strings from 1 character up to 5 characters long.
So, starting from the string 1, I want to go up to 55555. With numbers this seems easy because I can just add them, but with alphanumeric characters it gets tricky.
Here is the explanation:
I have a collection of alphanumeric characters as a string, s = "123ABC". I want to create all possible 1-character strings out of it, so I will have 1, 2, 3, A, B, C. After that I want to add one more character to the length so I get 11, 12, 13 and so on, until I have every possible combination up to CA, CB, CC, and I want to continue up to CCCCCC. I am confused about the loop: I can generate a temp string, but looping inside to rotate the characters is tricky.
this is what I have done so far,
i = 0
strr = "123ABC"
while i < len(strr):
    t = strr[0] * (i+1)
    for q in range(0, len(t)):
        # Here I need help to rotate more
        pass
    i += 1
Can anyone explain this to me or point me to a resource where I can find a solution?
You may want to use itertools.permutations function:
import itertools
chars = '123ABC'
for i in xrange(1, len(chars)+1):
    print list(itertools.permutations(chars, i))
EDIT:
To get a list of strings, try this:
import itertools
chars = '123ABC'
strings = []
for i in xrange(1, len(chars)+1):
    strings.extend(''.join(x) for x in itertools.permutations(chars, i))
This is a nested loop. Different depths of recursion produce all possible combinations.
strr = "123ABC"
def prod(items, level):
    if level == 0:
        yield []
    else:
        for first in items:
            for rest in prod(items, level-1):
                yield [first] + rest
for ln in range(1, len(strr)+1):
    print("length:", ln)
    for s in prod(strr, ln):
        print(''.join(s))
It is also called cartesian product and there is a corresponding function in itertools.
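For comparison, a short sketch using that itertools function, itertools.product with repeat, in Python 3. Unlike permutations, it allows a character to be reused, so strings such as 11 or CC are produced. The helper name all_strings and the shortened alphabet '12A' are only for illustration.

```python
import itertools

def all_strings(chars, max_len):
    """Yield every string over chars of length 1 up to max_len, shortest first."""
    for length in range(1, max_len + 1):
        # product(chars, repeat=length) is the Cartesian product chars x ... x chars
        for combo in itertools.product(chars, repeat=length):
            yield ''.join(combo)

print(list(all_strings('12A', 2)))
# ['1', '2', 'A', '11', '12', '1A', '21', '22', '2A', 'A1', 'A2', 'AA']
```

With the full '123ABC' alphabet and max_len of 6 this yields 6 + 36 + ... + 6**6 strings, so iterating lazily rather than building a list is probably the safer choice.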

Realizing if there is a pattern in a string (Does not need to start at index 0, could be any length)

A program to detect an n-length pattern in a string, even without knowing where the pattern starts, could easily be written by creating a list of n-length substrings and checking whether, from some starting point, the same items repeat through the rest of the list. Without any information other than the string itself, is the only way to recognize the pattern to brute-force through all lengths and check, or is there a more efficient algorithm?
(I'm just a beginner in Python, so this may be easy to code... )
Current code that only suits checking for starting at index 0:
def search(s):
    match=s[0]+s[1]
    while (match != s) and (match[0] != match[-1]):
        for matchLen in range(len(match),len(s)-1):
            letter = s[matchLen]
            if letter == match[-1]:
                match += s[len(match)]
                break
    if match == s:
        return None
    else:
        return match[:-1]
You can use re.findall(r'(.{2,})\1+', string). The parentheses create a capture group that is later backreferenced by \1. The . matches any character (except for line breaks). The {2,} requires the pattern to be at least two characters long (otherwise strings like ss would be considered a pattern). Finally, the + requires that pattern to repeat 1 or more times (in addition to the first time it occurred inside the capture group). You can see it working in action.
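A small runnable sketch of that regex in Python 3. It uses re.finditer instead of re.findall so that both the repeated unit (group 1) and the whole repeated run (group 0) are visible; the helper name repeated_chunks and the sample strings are made up for illustration.

```python
import re

def repeated_chunks(text):
    """Return (unit, full_run) pairs for each back-to-back repetition in text."""
    pattern = re.compile(r'(.{2,})\1+')
    return [(m.group(1), m.group(0)) for m in pattern.finditer(text)]

print(repeated_chunks('xxabcabcabcyy'))
# [('abc', 'abcabcabc')]
print(repeated_chunks('abcdef'))
# []
```

One thing to be aware of: .{2,} is greedy, so the engine prefers the longest unit that repeats. For 'abababab' it reports 'abab' rather than 'ab'.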
Pattern is a far too vague term, but assuming you mean some string repeating itself, the regexp (?P<pat>.+)(?P=pat) will work.
Given a string what you could do is -
You start with length = 1, and take two pointer variables i and j which you shall use to traverse the string.
Set i = 0 and j = i+length
if str[i]==str[j]:
    i++,j++ // till j not equal to length of string
else:
    length = length + 1
    //increase length by 1 and start the algorithm over from i = 0
Take the example abcdeabcde :
In this we see
Initially i = 0, j = 1 ,
but str[0]!=str[1] i.e. a!=b,
Then we get length = 2 i.e., i = 0,j = 2
but str[0]!=str[2] i.e. a!=c,
Continuing in the same fashion,
We see when length = 5 and i = 0 and j = 5,
str[0]==str[5]
and thus you can see that i and j increment till j is equal to string length.
And you have your answer, which is the pattern length. It may not seem obvious, but I would suggest you dry-run this algorithm over some of your test cases and let me know the results.
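The pointer walk described above amounts to finding the smallest shift at which the string lines up with itself. A rough Python 3 sketch of that reading, assuming the pattern starts at index 0; the function name smallest_period is made up for illustration:

```python
def smallest_period(s):
    """Return the smallest length such that s[i] == s[i + length] for every valid i."""
    n = len(s)
    for length in range(1, n):
        # i runs from 0 while j = i + length stays inside the string
        if all(s[i] == s[i + length] for i in range(n - length)):
            return length
    # No shorter shift works, so the string only "repeats" with its own length
    return n

print(smallest_period('abcdeabcde'))  # 5
print(smallest_period('aaaa'))        # 1
print(smallest_period('abcd'))        # 4
```

Restarting from i = 0 on every mismatch makes this quadratic in the worst case, which is probably fine for short strings.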
You can use re.findall() to find all matches:
import re
s = "somethingabcdeabcdeabcdeabcdeabcdeelseabcdeabcdeabcde"
li = re.findall(r'abcde',s)
print(li)
Output:
['abcde', 'abcde', 'abcde', 'abcde', 'abcde', 'abcde', 'abcde', 'abcde']

Python - build new string of specific length with n replacements from specific alphabet

I have been working on a fast, efficient way to solve the following problem, but so far I have only been able to solve it with a rather slow nested-loop solution. Anyway, here is the description:
So I have a string of length L, let's say 'BBBX'. I want to find all possible strings of length L, starting from 'BBBX', that differ in at most 2 positions and at least 0 positions. On top of that, when building the new strings, new characters must be selected from a specific alphabet.
I guess the size of the alphabet doesn't matter, so let's say in this case the alphabet is ['B', 'G', 'C', 'X'].
So, some sample output would be, 'BGBG', 'BGBC', 'BBGX', etc. For this example with a string of length 4 with up to 2 substitutions, my algorithm finds 67 possible new strings.
I have been trying to use itertools to solve this problem, but I am having a bit of difficulty finding a solution. I try to use itertools.combinations(range(4), 2) to find all the possible positions. I am then thinking of using product() from itertools to build all of the possibilities, but I am not sure if there is a way I could connect it somehow to the indices from the output of combinations().
Here's my solution.
The first for loop tells us how many replacements we will perform. (0, 1 or 2 - we go through each)
The second loop tells us which letters we will change (by their indexes).
The third loop goes through all of the possible letter changes for those indexes. There's some logic to make sure we actually change the letter (changing "C" to "C" doesn't count).
import itertools
def generate_replacements(lo, hi, alphabet, text):
    for count in range(lo, hi + 1):
        for indexes in itertools.combinations(range(len(text)), count):
            for letters in itertools.product(alphabet, repeat=count):
                new_text = list(text)
                actual_count = 0
                for index, letter in zip(indexes, letters):
                    if new_text[index] == letter:
                        continue
                    new_text[index] = letter
                    actual_count += 1
                if actual_count == count:
                    yield ''.join(new_text)
for text in generate_replacements(0, 2, 'BGCX', 'BBBX'):
    print text
Here's its output:
BBBX GBBX CBBX XBBX BGBX BCBX BXBX BBGX BBCX BBXX BBBB BBBG BBBC GGBX
GCBX GXBX CGBX CCBX CXBX XGBX XCBX XXBX GBGX GBCX GBXX CBGX CBCX CBXX
XBGX XBCX XBXX GBBB GBBG GBBC CBBB CBBG CBBC XBBB XBBG XBBC BGGX BGCX
BGXX BCGX BCCX BCXX BXGX BXCX BXXX BGBB BGBG BGBC BCBB BCBG BCBC BXBB
BXBG BXBC BBGB BBGG BBGC BBCB BBCG BBCC BBXB BBXG BBXC
Not tested much, but it does find 67 for the example you gave. The easy way to connect the indices to the products is via zip():
def sub(s, alphabet, minsubs, maxsubs):
    from itertools import combinations, product
    origs = list(s)
    alphabet = set(alphabet)
    for nsubs in range(minsubs, maxsubs + 1):
        for ix in combinations(range(len(s)), nsubs):
            prods = [alphabet - set(origs[i]) for i in ix]
            s = origs[:]
            for newchars in product(*prods):
                for i, char in zip(ix, newchars):
                    s[i] = char
                yield "".join(s)
count = 0
for s in sub('BBBX', 'BGCX', 0, 2):
    count += 1
    print s
print count
Note: the major difference from FogleBird's is that I posted first - LOL ;-) The algorithms are very similar. Mine constructs the inputs to product() so that no substitution of a letter for itself is ever attempted; FogleBird's allows "identity" substitutions, but counts how many valid substitutions are made and then throws the result away if any identity substitutions occurred. On longer words and a larger number of substitutions, that can be much slower (potentially the difference between len(alphabet)**nsubs and (len(alphabet)-1)**nsubs times around the ... in product(): loop).
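The count of 67 can also be checked by hand: choose which k positions change, then pick one of the other len(alphabet)-1 letters for each, giving C(4,0)*3^0 + C(4,1)*3^1 + C(4,2)*3^2 = 1 + 12 + 54 = 67. A small Python 3 sketch of that arithmetic (math.comb needs Python 3.8+; the function name count_variants is made up, and it assumes every character of the original string is in the alphabet):

```python
from math import comb

def count_variants(length, alphabet_size, max_subs):
    """Count strings that differ from the original in at most max_subs positions."""
    # For each k: choose k positions, then one of (alphabet_size - 1) new letters per position
    return sum(comb(length, k) * (alphabet_size - 1) ** k
               for k in range(max_subs + 1))

print(count_variants(4, 4, 2))  # 67
```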

How to determine the sum of a group of integers without using recursion

This is my first post on Stack Overflow, and I'm hoping that it'll be a good one.
This is a problem that I thought up myself, and now I'm a bit embarrassed to say, but it's beating the living daylights out of me. Please note that this is not a homework exercise, scout's honor.
Basically, the program takes (as input) a string made up of integers from 0 to 9.
strInput = '2415043'
Then you need to break up that string of numbers into smaller groups of numbers, until eventually, the sum of those groups give you a pre-defined total.
In the case of the above string, the target is 289.
iTarget = 289
For this example, there are two correct answers (but most likely only one will be displayed, since the program stops once the target has been reached):
Answer 1 = 241, 5, 043 (241 + 5 + 043 = 289)
Answer 2 = 241, 5, 0, 43 (241 + 5 + 0 + 43 = 289)
Note that the integers do not change position. They are still in the same order that they were in the original string.
Now, I know how to solve this problem using recursion. But the frustrating part is that I'm NOT ALLOWED to use recursion.
This needs to be solved using only 'while' and 'for' loops. And obviously lists and functions are okay as well.
Below is some of the code that I have so far:
My Code:
#Pre-defined input values, for the sake of simplicity
lstInput = ['2','4','1','5','0','4','3'] #This is the kind of list the user will input
sJoinedList = "".join(lstInput) #sJoinedList = '2415043'
lstWorkingList = [] #All further calculations are performed on lstWorkingList
lstWorkingList.append(sJoinedList) #lstWorkingList = ['2415043']
iTarget = 289 #Target is pre-defined

def SumAll(_lst): #Adds up all the elements in a list
    iAnswer = 0 #E.g. lstEg = [2,41,82]
    for r in _lst: # SumAll(lstEg) = 125
        iAnswer += int(r)
    return(iAnswer)

def AddComma(_lst):
    #Adds 1 more comma to a list and resets all commas to start of list
    #E.g. lstEg = [5,1001,300] (Note only 3 groups / 2 commas)
    # AddComma(lstEg)
    # [5,1,0,001300] (Now 4 groups / 3 commas)
    iNoOfCommas = len(_lst) - 1 #Current number of commas in list
    sResetString = "".join(_lst) #Make a string with all the elements in the list
    lstTemporaryList = []
    sTemp = ""
    i = 0
    while i < iNoOfCommas +1:
        sTemp += sResetString[i]+',' #Add a comma after every element
        i += 1
    sTemp += sResetString[i:]
    lstTemporaryList = sTemp.split(',') #Split sTemp into a list, using ',' as a separator
    #Returns list in format ['2', '415043'] or ['2', '4', '15043']
    return(lstTemporaryList)
So basically, the Pseudo-code will look something like this:
Pseudo-Code:
while SumAll(lstWorkingList) != iTarget: #While Sum != 289
    if(len(lstWorkingList[0]) == iMaxLength): #If max possible length of first element is reached
        AddComma(lstWorkingList) #then add a new comma / group and
        Reset(lstWorkingList) #reset all the commas to the beginning of the list to start again
    else:
        ShiftGroups() #Keep shifting the commas until all possible combinations
                      #for this number of commas have been tried
                      #Otherwise, Add another comma and repeat the whole process
Phew! That was quite a mouthful.
I have worked through the process that the program will follow on a piece of paper, so below is the expected output:
OUTPUT:
[2415043] #Element 0 has reached maximum size, so add another group
#AddComma()
#Reset()
[2, 415043] #ShiftGroups()
[24, 15043] #ShiftGroups()
[241, 5043] #ShiftGroups()
#...etc...etc...
[241504, 3] #Element 0 has reached maximum size, so add another group
#AddComma()
#Reset()
[2, 4, 15043] #ShiftGroups()
[2, 41, 5043] #ShiftGroups()
#etc...etc...
[2, 41504, 3] #Tricky part
Now here is the tricky part.
In the next step, the first element must become 24, and the other two must reset.
#Increase Element 0
#All other elements Reset()
[24, 1, 5043] #ShiftGroups()
[24, 15, 043] #ShiftGroups()
#...etc...etc
[24, 1504, 3]
#Increase Element 0
#All other elements Reset()
[241, 5, 043] #BINGO!!!!
Okay. That is the basic flow of the program logic. Now the only thing I need to figure out, is how to get it to work without recursion.
For those of you that have been reading up to this point, I sincerely thank you and hope that you still have the energy left to help me solve this problem.
If anything is unclear, please ask and I'll clarify (probably in excruciating detail X-D).
Thanks again!
Edit: 1 Sept 2011
Thank you everyone for responding and for your answers. They are all very good, and definitely more elegant than the route I was following.
However, my students have never worked with 'import' or any data-structures more advanced than lists. They do, however, know quite a few list functions.
I should also point out that the students are quite gifted mathematically, many of them have competed and placed in international math olympiads. So this assignment is not beyond the scope of their intelligence, perhaps only beyond the scope of their python knowledge.
Last night I had a Eureka! moment. I have not implemented it yet, but will do so over the course of the weekend and then post my results here. It may be somewhat crude, but I think it will get the job done.
Sorry it took me this long to respond, my internet cap was reached and I had to wait until the 1st for it to reset. Which reminds me, happy Spring everyone (for those of you in the Southern Hemisphere).
Thanks again for your contributions. I will choose the top answer after the weekend.
Regards!
A program that finds all solutions can be expressed elegantly in functional style.
Partitions
First, write a function that partitions your string in every possible way. (The following implementation is based on http://code.activestate.com/recipes/576795/.) Example:
def partitions(iterable):
    'Returns a list of all partitions of the parameter.'
    from itertools import chain, combinations
    s = iterable if hasattr(iterable, '__getslice__') else tuple(iterable)
    n = len(s)
    first, middle, last = [0], range(1, n), [n]
    return [map(s.__getslice__, chain(first, div), chain(div, last))
            for i in range(n) for div in combinations(middle, i)]
Predicate
Now, you'll need to filter the list to find those partitions that add to the desired value. So write a little function to test whether a partition satisfies this criterion:
def pred(target):
    'Returns a function that returns True iff the numbers in the partition sum to target.'
    return lambda partition: target == sum(map(int, partition))
Main program
Finally, write your main program:
strInput = '2415043'
iTarget = 289
# Run through the list of partitions and find partitions that satisfy pred
print filter(pred(iTarget), partitions(strInput))
Note that the result is calculated in a single line of code.
Result: [['241', '5', '043'], ['241', '5', '0', '43']]
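The code above relies on Python 2 features (__getslice__, the print statement, list-returning map and filter). A rough Python 3 sketch of the same idea, using plain slicing instead of __getslice__; the helper name solutions is made up for illustration and this is not the original recipe:

```python
from itertools import combinations

def partitions(s):
    """Return every way to cut s into consecutive, non-empty pieces."""
    n = len(s)
    result = []
    for k in range(n):                       # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)      # piece boundaries, including both ends
            result.append([s[a:b] for a, b in zip(bounds, bounds[1:])])
    return result

def solutions(digits, target):
    """Keep only the partitions whose pieces sum to target."""
    return [p for p in partitions(digits) if sum(map(int, p)) == target]

print(solutions('2415043', 289))
# [['241', '5', '043'], ['241', '5', '0', '43']]
```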
Recursion isn't the best tool for the job anyways. itertools.product is.
Here's how I search it:
Imagine the search space as all the binary strings of length l, where l is the length of your string minus one.
Take one of these binary strings
Write the numbers in the binary string in between the numbers of your search string.
2 4 1 5 0 4 3
1 0 1 0 1 0
Turn the 1's into commas and the 0's into nothing.
2,4 1,5 0,4 3
Add it all up.
2,4 1,5 0,4 3 = 136
Is it 289? Nope. Try again with a different binary string.
2 4 1 5 0 4 3
1 0 1 0 1 1
You get the idea.
Onto the code!
import itertools
strInput = '2415043'
intInput = map(int,strInput)
correctOutput = 289
# Somewhat inelegant, but what the heck
JOIN = 0
COMMA = 1
for combo in itertools.product((JOIN, COMMA), repeat = len(strInput) - 1):
    solution = []
    # The first element is ALWAYS a new one.
    for command, character in zip((COMMA,) + combo, intInput):
        if command == JOIN:
            # Append the new digit to the end of the most recent entry
            newValue = (solution[-1] * 10) + character
            solution[-1] = newValue
        elif command == COMMA:
            # Create a new entry
            solution.append(character)
        else:
            # Should never happen
            raise Exception("Invalid command code: " + str(command))
    if sum(solution) == correctOutput:
        print solution
EDIT:
agf posted another version of the code. It concatenates the string instead of my somewhat hacky multiply by 10 and add approach. Also, it uses true and false instead of my JOIN and COMMA constants. I'd say the two approaches are equally good, but of course I am biased. :)
import itertools
strInput = '2415043'
correctOutput = 289
for combo in itertools.product((True, False), repeat = len(strInput) - 1):
    solution = []
    for command, character in zip((False,) + combo, strInput):
        if command:
            solution[-1] += character
        else:
            solution.append(character)
    solution = [int(x) for x in solution]
    if sum(solution) == correctOutput:
        print solution
To expand on pst's hint, instead of just using the call stack as recursion does, you can create an explicit stack and use it to implement a recursive algorithm without actually calling anything recursively. The details are left as an exercise for the student ;)
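For anyone who wants to see the shape of that, here is one possible sketch in Python 3. It uses only a while loop, a for loop and lists (no imports and no recursive calls), which seems to fit the constraints in the question. The function name split_sums is made up for illustration.

```python
def split_sums(digits, target):
    """Find every way to cut digits into groups that sum to target, without recursion."""
    found = []
    # Each stack entry is (index of the next unread digit, groups chosen so far)
    stack = [(0, [])]
    while stack:
        pos, groups = stack.pop()
        if pos == len(digits):
            # All digits are used up, so check this complete split
            if sum(int(g) for g in groups) == target:
                found.append(groups)
            continue
        # Push every possible next group, starting at pos and ending before `end`
        for end in range(len(digits), pos, -1):
            stack.append((end, groups + [digits[pos:end]]))
    return found

for answer in split_sums('2415043', 289):
    print(answer)
```

For the example in the question this finds both answers listed there. To stop at the first hit, return from inside the loop instead of appending to found.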
