return string without punctuation


I'm a beginner who'd like to return strings in pairs of characters. If the input to the function has an odd length, then the last pair is to include an _.
Example: solution("asdfadb") should return ['as', 'df', 'ad', 'b_']
My code however, returns: ['a', 's']['d', 'f']['a', 'd']['b', '_']
I've tried multiple ways and cannot get it to return the correctly formatted result:
def solution(s):
    if len(s)%2 != 0:
        s = "".join((s, "_"))
    s = list(s)
    s = [ s[i:i+2] for i in range(0 , len(s) , 2) ]
    s = ''.join(str(pair) for pair in s )
    print(s)
solution("asdfadb")
['a', 's']['d', 'f']['a', 'd']['b', '_']

You had a small confusion in the last list comprehension, try this (see my comment):
def solution(s):
    if len(s)%2 != 0:
        s = "".join((s, "_"))
    s = list(s)
    s = [ s[i:i+2] for i in range(0 , len(s) , 2) ]
    s = [''.join(pair) for pair in s] # For each sublist (aka pair) - do join.
    print(s)
Output:
['as', 'df', 'ad', 'b_']
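To see why the original last line misbehaved, here is a minimal sketch comparing str() and ''.join() on a single pair. The variable names are illustrative only and do not come from the question.

```python
pair = ['a', 's']            # one slice of the list produced by list(s)

# str() renders the list's repr, brackets and quotes included
as_repr = str(pair)

# ''.join() concatenates the elements themselves
as_text = ''.join(pair)

print(as_repr)   # ['a', 's']
print(as_text)   # as
```

Joining the str() results therefore glues list reprs together, which is what the question's output shows.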

Just a bit more compact than @idanz's answer, but the principle is the same:
def solution(s: str):
    s = s + "_" if len(s) % 2 != 0 else s
    pairs = [s[i:i+2] for i in range(0, len(s), 2)]
    print(pairs)
solution("asdfadb")
Output:
['as', 'df', 'ad', 'b_']

Here is a solution using a list comprehension, string slicing, and zip_longest:
from itertools import zip_longest
def solution(string):
    return ["".join(pair) for pair in zip_longest(string[0::2], string[1::2], fillvalue="_")]
print(solution("asdfadb"))
Output:
['as', 'df', 'ad', 'b_']
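The same slicing idea can probably be generalized to pieces of any size. The sketch below assumes a hypothetical helper named chunks with size and pad parameters; neither the name nor the parameters come from the answers above.

```python
def chunks(s, size=2, pad="_"):
    """Split s into pieces of `size` characters, padding the last piece with `pad`."""
    # Number of pad characters needed to reach a multiple of `size`
    missing = -len(s) % size
    s = s + pad * missing
    return [s[i:i + size] for i in range(0, len(s), size)]

print(chunks("asdfadb"))          # ['as', 'df', 'ad', 'b_']
print(chunks("asdfadb", size=3))  # ['asd', 'fad', 'b__']
```

Returning the list instead of printing it keeps the function usable from other code.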

Related

How to construct a string from letters of each word from list?

I am wondering how to construct a string, which takes 1st letter of each word from list. Then it takes 2nd letter from each word etc.
For example :
Input --> my_list = ['good', 'bad', 'father']
Every word has different length (but the words in the list could have equal length)
The output should be: 'gbfoaaodtdher'.
I tried:
def letters(my_list):
    string = ''
    for i in range(len(my_list)):
        for j in range(len(my_list)):
            string += my_list[j][i]
    return string
print(letters(['good', 'bad', 'father']))
and I got:
'gbfoaaodt'.

That's a good job for itertools.zip_longest:
from itertools import zip_longest
s = ''.join([c for x in zip_longest(*my_list) for c in x if c])
print(s)
Or more_itertools.interleave_longest:
from more_itertools import interleave_longest
s = ''.join(interleave_longest(*my_list))
print(s)
Output: gbfoaaodtdher
Used input:
my_list = ['good', 'bad', 'father']
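As a rough illustration of what zip_longest does here, the sketch below prints two of the intermediate tuples before flattening them. The variable name columns is an assumption made for this example.

```python
from itertools import zip_longest

my_list = ['good', 'bad', 'father']

# Each tuple holds the i-th letter of every word; missing letters become None
columns = list(zip_longest(*my_list))
print(columns[0])   # ('g', 'b', 'f')
print(columns[3])   # ('d', None, 'h')

# Flatten the columns and skip the None fill values
s = ''.join(c for column in columns for c in column if c is not None)
print(s)            # gbfoaaodtdher
```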

The answer by @mozway is the best approach, but if you want to stay with your original method, this is how:
def letters(my_list):
    string = ''
    max_len = max([len(s) for s in my_list])
    for i in range(max_len):
        for j in range(len(my_list)):
            if i < len(my_list[j]):
                string += my_list[j][i]
    return string
print(letters(['good', 'bad', 'father']))
Output: gbfoaaodtdher

We can do without zip_longest as well:
l = ['good', 'bad', 'father']
longest_string=max(l,key=len)
''.join(''.join([e[i] for e in l if len(e) > i]) for i in range(len(longest_string)))
#'gbfoaaodtdher'

Splitting consecutive similar characters of a specific length in an array of strings

I have an array
["ejjjjmmtthh", "zxxuueeg", "aanlljrrrxx", "dqqqaaabbb", "oocccffuucccjjjkkkjyyyeehhh"]
and need to extract consecutive characters in each string element of length k (in this case 3) without using regex or groupby.
This is what I have so far:
s = ["ejjjjmmtthh", "zxxuueeg", "aanlljrrrxx", "dqqqaaabbb", "oocccffuucccjjjkkkjyyyeehhh"]
k = 3
output = []
for i in s:
    result = ""
    for j in range(1,len(i)-1):
        if i[j]==i[j-1] or i[j]==i[j+1]:
            result+=i[j]
    if i[-1] == result[-1]:
        result+=i[-1]
    if i[0]==result[0]:
        result=i[0]+result
    output.append(result)
print(output)
#current output = ['jjjjmmtthh', 'xxuuee', 'aallrrrxx', 'qqqaaabbb', 'oocccffuucccjjjkkkyyyeehhh']
#expected outcome(for k =3) = ['rrr','qqq','aaa','bbb','ccc','ccc','jjj','kkk','yyy','hhh']
My questions:
How can I accommodate the k condition?
Is there a more optimal way to do this?

This solution is more readable and not too long. It works for k > 0.
s = ["ejjjjmmtthh", "zxxuueeg", "aanlljrrrxx", "dqqqaaabbb", "oocccffuucccjjjkkkjyyyeehhh"]
k = 3
output = []
for element in s:
    state = "" #State variable (reset on every list item)
    for char in element: #For each character
        if state != "" and char == state[-1]: # Check if the last character is the same (only if state isn't empty)
            state += char #Add it to the state
        else:
            if len(state) == k: #Otherwise, check if we have k characters
                output.append(state) #Append the result if we do
            state = char #Reset the state
    #If there are no more characters (end of element), check too
    if len(state) == k:
        output.append(state)
print(output)
Output for k = 3
['rrr', 'qqq', 'aaa', 'bbb', 'ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']
Output for k = 1
['e', 'z', 'g', 'n', 'j', 'd', 'j']

Here I group the letters manually into runs of consecutive identical letters. Then I add a group to the result only if its length equals k. This works, but I am sure there is a more optimal way:
s = ["ejjjjmmtthh", "zxxuueeg", "aanlljrrrxx", "dqqqaaabbb", "oocccffuucccjjjkkkjyyyeehhh"]
k = 3
def _next_group(st):
    if not st:
        return None
    first = st[0]
    res = [first]
    for s in st[1:]:
        if s == first:
            res.append(s)
        else:
            break
    return res
result = []
for st in s:
    while True:
        group = _next_group(st)
        if not group:
            break
        if len(group) == k:
            result.append("".join(group))
        if len(group) == len(st):
            break
        st = st[len(group):]
print(result)
Output:
['rrr', 'qqq', 'aaa', 'bbb', 'ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']

A for-loop approach.
Remark: I suggest a divide-and-conquer solution: focus on a single string (and not on a list of strings) and make a function that works and then generalize it with loops/comprehension...
def repeated_chars(string, k=3):
    out = []
    c, tmp = 0, '' # counter, tmp char
    for char in string: # iterate over the argument, not a global variable
        if tmp == '':
            tmp = char
            c += 1
            continue
        if tmp == char:
            c += 1
        else:
            if c == k:
                out.append((tmp, c))
            tmp = char
            c = 1
    # last term
    if c == k:
        out.append((tmp, c))
    return [char * i for char, i in out]
data = ['jjjjmmtthh', 'xxuuee', 'aallrrrxx', 'qqqaaabbb', 'oocccffuucccjjjkkkyyyeehhh']
# apply the function to all strings
out = []
for s in data:
    out.extend(repeated_chars(s, k=3))
print(out)
#['rrr', 'qqq', 'aaa', 'bbb', 'ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']

Edit: Yes, groupby shouldn't be used as per the requirement, but doing the job requires grouping in some way (see the accepted answer, for example), so it seems a good idea to split the responsibilities into multiple functions, as is good practice, by reimplementing a groupby.
groupby seems the obvious tool here; if you can't use the one from itertools, just write one.
Also, the core function should work on a single string, not a list of strings; just loop over the list if you have several strings.
Once you have your groupby, it is straightforward:
def extract_groups(s: str, k: int):
    return [group for group in groupby(s) if len(group) == k]
Let's try it out:
input_strings = [
"ejjjjmmtthh",
"zxxuueeg",
"aanlljrrrxx",
"dqqqaaabbb",
"oocccffuucccjjjkkkjyyyeehhh",
]
expected_outputs = [
[],
[],
["rrr"],
["qqq", "aaa", "bbb"],
["ccc", "ccc", "jjj", "kkk", "yyy", "hhh"],
]
outputs = [extract_groups(s, k=3) for s in input_strings]
print(outputs == expected_outputs) # True
As it stands, outputs is a list of lists of groups, one inner list per input string:
In [ ]: outputs
Out[ ]: [[], [], ['rrr'], ['qqq', 'aaa', 'bbb'], ['ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']]
If you really want it flat, flatten it:
In [ ]: from itertools import chain
... : list(chain.from_iterable(outputs))
Out[ ]: ['rrr', 'qqq', 'aaa', 'bbb', 'ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']
In [ ]: [group for s in input_strings for group in extract_groups(s, k=3)]
Out[ ]: ['rrr', 'qqq', 'aaa', 'bbb', 'ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']
The groupby function for reference:
def groupby(s: str):
    if not s:
        return []
    result = []
    tgt = s[0]
    counter = 1
    for c in s[1:]:
        if c == tgt:
            counter += 1
        else:
            result.append(tgt * counter)
            tgt = c
            counter = 1
    result.append(tgt * counter)
    return result
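For comparison only, since the question rules it out: where that restriction does not apply, the standard library's itertools.groupby could be used along these lines. The function name extract_groups_stdlib is hypothetical and chosen to avoid clashing with the hand-written groupby above.

```python
from itertools import groupby

def extract_groups_stdlib(s, k):
    # itertools.groupby yields (key, iterator over a run of equal characters)
    runs = ("".join(run) for _, run in groupby(s))
    return [run for run in runs if len(run) == k]

print(extract_groups_stdlib("oocccffuucccjjjkkkjyyyeehhh", 3))
# ['ccc', 'ccc', 'jjj', 'kkk', 'yyy', 'hhh']
```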

How to merge duplicate elements in a list while adding their "coefficient"?

I'm writing a function 'simplify' to simplify polynomials so that simplify("2xy-yx") can return "xy", simplify("-a+5ab+3a-c-2a") can return "-c+5ab" and so on.
I am at the stage where I have broken the polynomials into multiple monomials as elements for a list and have separated the coefficient of the monomials and the letter (variable) parts.
For instance
input = '3xy+y-2x+2xy'
My process gives me:
Var = ['xy', 'y', 'x', 'xy']
Coe = ['+3', '+1', '-2', '+2']
What I want to do is to merge the same monomials and add up their corresponding coefficients in the other list simultaneously.
My code was:
Play1 = Letter[:]
Play2 = Coe[:]
for i in range(len(Play1) - 1):
    for j in range(i+1, len(Play1)):
        if Play1[i] == Play1[j]:
            Letter.pop(j)
            Coe[i] = str(int(Play2[i]) + int(Play2[j]))
            Coe.pop(j)
But this seems to only work with lists where each duplicate element appears no more than twice. For instance, input of "-a+5ab+3a-c-2a" gives me:
IndexError: pop index out of range
I thought of using set, but that will change the order.
What's the best way to proceed? Thanks.

Combine your lists with zip() for easier processing, and create a new list:
newVar = []
newCoe = []
for va, co in zip(Var, Coe):
    # try/except (EAFP) is very Pythonic
    try:
        # See if this var is seen
        ind = newVar.index(va)
        # Yeah, seen, let's add the coefficient
        newCoe[ind] = str(int(newCoe[ind]) + int(co))
    except ValueError:
        # No it's not seen, add both to the new lists
        newVar.append(va)
        newCoe.append(co)
Because all items are processed in their original order, as well as using list appending instead of hash tables (like set and dict), the order is preserved.

This is typically a use case where a dict comes in handy:
from collections import defaultdict
Var = ['xy', 'y', 'x', 'xy']
Coe = ['+3', '+1', '-2', '+2']
polynom = defaultdict(int)
for var, coeff in zip(Var, Coe):
    polynom[var] += int(coeff)
Var, Coe = list(polynom.keys()), list(polynom.values())
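On the ordering concern raised in the question: since Python 3.7 a dict (and therefore a defaultdict) keeps keys in insertion order, so the first-seen order of the monomials should survive. A small sketch using the input that failed in the question; the name nonzero is an assumption made for this example.

```python
from collections import defaultdict

Var = ['a', 'ab', 'a', 'c', 'a']
Coe = ['-1', '+5', '+3', '-1', '-2']

polynom = defaultdict(int)
for var, coeff in zip(Var, Coe):
    polynom[var] += int(coeff)

# Keys come back in first-seen order (guaranteed for dict since Python 3.7)
print(list(polynom.items()))   # [('a', 0), ('ab', 5), ('c', -1)]

# Drop the terms whose coefficients cancelled out
nonzero = {var: coeff for var, coeff in polynom.items() if coeff != 0}
print(nonzero)                 # {'ab': 5, 'c': -1}
```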

Your input was:
input = '3xy+y-2x+2xy'
You got as far as:
Var = ['xy', 'y', 'x', 'xy']
Coe = ['+3', '+1', '-2', '+2']
Use the code below to get --> +5xy-y-2x (note that the coefficient of y is taken as -1 here):
def varCo(Var, Coe):
    aa = {}
    for k, i in enumerate(Var):
        # Sum coefficients as ints; format them only when building the string
        aa[i] = aa.get(i, 0) + int(Coe[k])
    def fmt(v):
        # 1 -> "+", -1 -> "-", any other value keeps its explicit sign
        return "+" if v == 1 else "-" if v == -1 else "%+d" % v
    return "".join(fmt(v) + i for i, v in aa.items() if v != 0)
Var = ['xy', 'y', 'x', 'xy']
Coe = ['+3', '-1', '-2', '+2']
print(varCo(Var, Coe))
#Result --> +5xy-y-2x

Try this, using regex:
import re
# a = '3xy+y-2x+2xy'
a = "-a+5ab+3a-c-2a"
i = re.findall(r"[\w]+", a)  # terms such as '5ab'
j = re.findall(r"[\W]+", a)  # signs between terms
if len(i)!=len(j):
    # the first term has no explicit sign
    j.insert(0,'+')
d = []
e = []
for k in i:
    match = re.match(r"([0-9]+)([a-z]+)", k, re.I)
    if match:
        items = match.groups()
        d.append(items[0])
        e.append(items[1])
    else:
        # no leading digits: the coefficient is 1
        d.append('1')
        e.append(k)
print(e)
f = []
for ii,jj in zip(j,d):
    f.append(ii+jj)
print(f)
Input:
a = "-a+5ab+3a-c-2a"
Output:
['a', 'ab', 'a', 'c', 'a']
['-1', '+5', '+3', '-1', '-2']
Input:
a = '3xy+y-2x+2xy'
Output:
['xy', 'y', 'x', 'xy']
['+3', '+1', '-2', '+2']

Combine elements of a list with all possible separators

I have the following requirement.
I have a list which has, say, 3 elements: [X,Y,2]
What I would like to do is generate strings with a separator (say "-") between each pair of adjacent elements, or without one. The order of the elements in the list should be preserved.
So the output would be:
'XY2'
'X-Y-2'
'X-Y2'
'XY-2'
Is there an elegant way to do this in Python?

>>> import itertools
>>> for c in itertools.product(' -', repeat=2): print(('X%sY%s2' % c).replace(' ', ''))
XY2
XY-2
X-Y2
X-Y-2
Or, with the elements coming from a Python list:
import itertools
a = ['X', 'Y', 2]
for c in itertools.product(' -', repeat=2):
    print(('%s%s%s%s%s' % (a[0],c[0],a[1],c[1],a[2])).replace(' ', ''))
Or, in a slightly different style:
import itertools
a = ['X', 'Y', '2']
for c in itertools.product(' -', repeat=2):
    print(('%s'.join(a) % c).replace(' ', ''))
To capture the output to a list:
import itertools
a = ['X', 'Y', '2']
output = []
for c in itertools.product(' -', repeat=len(a)-1):
    output.append( ('%s'.join(a) % c).replace(' ', '') )
print('output=', output)
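One caveat with the %s-and-replace trick is that it breaks if an element itself contains a space or a % character. A sketch that avoids string formatting altogether is shown below; the function name with_separators and its separators parameter are assumptions for illustration, and the input list is assumed to be non-empty.

```python
import itertools

def with_separators(items, separators=("", "-")):
    """Return every string formed by placing one separator in each gap."""
    parts = [str(item) for item in items]   # allow non-string elements such as 2
    results = []
    for gaps in itertools.product(separators, repeat=len(parts) - 1):
        # Interleave: first part, then (separator, part) for each remaining part
        pieces = [parts[0]]
        for sep, part in zip(gaps, parts[1:]):
            pieces.append(sep)
            pieces.append(part)
        results.append("".join(pieces))
    return results

print(with_separators(['X', 'Y', 2]))
# ['XY2', 'XY-2', 'X-Y2', 'X-Y-2']
```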

A little more generalized: this works for any number of separators and is hopefully easy to understand at each step:
import itertools
a = ['X', 'Y', '2']
all_separators = ['', '-', '+']
results = []
# this product puts all separators in all positions for len-1 (spaces between each element)
for this_separators in itertools.product(all_separators, repeat=len(a)-1):
    this_result = []
    for pair in itertools.zip_longest(a, this_separators, fillvalue=''):
        for element in pair:
            this_result.append(element)
    # if you want it, here it is as a comprehension
    # this_result = [element for pair
    #                in itertools.zip_longest(a, this_separators, fillvalue='')
    #                for element in pair]
    this_result_string = ''.join(this_result) # check out join docs if it's new to you
    results.append(this_result_string)
print(results)
>>> ['XY2', 'XY-2', 'XY+2', 'X-Y2', 'X-Y-2', 'X-Y+2', 'X+Y2', 'X+Y-2', 'X+Y+2']
These are the results for your case with just '' and '-' as separators:
>>> ['XY2', 'XY-2', 'X-Y2', 'X-Y-2']
If you want everything in one comprehension:
results = [''.join(element for pair
                   in itertools.zip_longest(a, this_separators, fillvalue='')
                   for element in pair)
           for this_separators in itertools.product(all_separators, repeat=len(a)-1)]

I don't know whether itertools has a function for this, but I always find this kind of thing a fun and useful exercise. So here is a solution with a recursive generator:
def generate(liste):
    if len(liste) == 1:
        yield [liste]
    else:
        for i in generate(liste[1:]):
            # first element on its own, followed by the remaining groups
            yield [[liste[0]]]+i
            # first element merged into the first of the remaining groups
            yield [ [liste[0]]+i[0] ] + i[1:]
if __name__ == "__main__":
    for i in generate (["X","Y","2"]):
        print("test : " + str(i))
        # join the elements inside each group, then put the separator between groups
        print("-".join("".join(group) for group in i))

Something like this?
from itertools import permutations
i = ["X","Y","2"]
for result in permutations(i, 3):
    print("-".join(result))
Result:
X-Y-2
X-2-Y
Y-X-2
Y-2-X
2-X-Y
2-Y-X

improve this very-simple dictionary generator in python

I'm trying to make a simple dict generator. It works but it isn't very functional yet.
I'd like to improve it by being able to change the max size of the output without touching the code.
letr='abcdefghijklmnopqrstuvwxyz'
for i in range(len(letr)):
    t=letr[i]
    print(t)
    for t2 in letr:
        print(t+t2)
        for t3 in letr:
            print(t+t2+t3)
            for t4 in letr:
                print(t+t2+t3+t4)
                for t5 in letr:
                    print(t+t2+t3+t4+t5)

import itertools
def dict_gen(n):
    letr = 'abcdefghijklmnopqrstuvwxyz'
    return itertools.chain(''.join(j) for i in range(n)
                           for j in itertools.product(letr, repeat=i+1))
Usage:
for word in dict_gen(n): # replace n with the max word length you want
    print(word)
Unlike some of the other answers this will include duplicates like your example ('aa', 'bb', etc).
dict_gen() will return a generator, but you can always just pass it into list() if you need to access elements by index:
>>> words = list(dict_gen(5))
>>> len(words) == 26 + 26**2 + 26**3 + 26**4 + 26**5 # verify correct length
True
>>> words[20:30] # transition from one letter to two letters
['u', 'v', 'w', 'x', 'y', 'z', 'aa', 'ab', 'ac', 'ad']
>>> words[-10:] # last 10 elements
['zzzzq', 'zzzzr', 'zzzzs', 'zzzzt', 'zzzzu', 'zzzzv', 'zzzzw', 'zzzzx', 'zzzzy', 'zzzzz']
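Because the full list for n = 5 holds roughly 12 million strings, it may be preferable to keep the generator lazy and only peek at it. The sketch below is a variant written as a plain generator function; the alphabet parameter is an assumption added for illustration and is not part of the answer above.

```python
import itertools
import string

def dict_gen(n, alphabet=string.ascii_lowercase):
    # Lazily yield every word of length 1..n over `alphabet`
    for length in range(1, n + 1):
        for letters in itertools.product(alphabet, repeat=length):
            yield ''.join(letters)

# Peek at the first few words without building the full list
first = list(itertools.islice(dict_gen(5), 28))
print(first[-3:])                        # ['z', 'aa', 'ab']

# A smaller alphabet makes the full output easy to inspect
print(list(dict_gen(2, alphabet='ab')))  # ['a', 'b', 'aa', 'ab', 'ba', 'bb']
```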

letr = ''.join(chr(o) for o in range(ord('a'), ord('z') + 1))
import itertools
print([''.join(word) for word in itertools.permutations(letr, 5)])

Itertools is your best friend.
>>> import itertools
>>> gen = ("".join(i) for i in itertools.permutations(letr, 5))
>>> list(gen)[-10:]
['zyxwm', 'zyxwn', 'zyxwo', 'zyxwp', 'zyxwq', 'zyxwr', 'zyxws', 'zyxwt', 'zyxwu', 'zyxwv']
If you want to get the permutations of every length, you could write a generator yourself:
import itertools
def perms(seq):
    for n in range(len(seq)+1):
        for i in itertools.permutations(seq, n):
            yield i
Check the Python documentation for itertools and generators for more info.
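The answers in this thread differ in whether a letter may repeat within a word. A short sketch on a three-letter alphabet, with illustrative variable names, shows the difference between itertools.product and itertools.permutations:

```python
import itertools

letters = 'abc'

# product allows a letter to repeat within a word
with_repeats = [''.join(p) for p in itertools.product(letters, repeat=2)]
# permutations uses each position of the input at most once
without_repeats = [''.join(p) for p in itertools.permutations(letters, 2)]

print(with_repeats)     # ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
print(without_repeats)  # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']
```

The nested loops in the question print words such as 'aa', so product is the closer match to that behaviour.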
