Printing alphabets advanced by n in Python - python

How can I write a Python program that takes a string of letters and prints each letter advanced by n positions in the alphabet? Example:
my_string = 'abc'
expected_output = 'cde' # n=2
One way I've thought of is using str.maketrans to map the original input to (alphabets + n). Is there any other way?
PS: xyz should translate to abc
I've tried to write my own code as well for this, (apart from the infinitely better answers mentioned):
number = 2
prim = """abc! fgdf """
final = prim.lower()
for x in final:
    if(x =="y"):
        print("a", end="")
    elif(x=="z"):
        print("b", end="")
    else:
        conv = ord(x)
        x = conv+number
        print(chr(x),end="")
Any comments on how to avoid converting special characters? Thanks.

If you don't care about wrapping around, you can just do:
def shiftString(string, number):
    return "".join(map(lambda x: chr(ord(x)+number),string))
If you do want to wrap around (think Caesar cipher), you'll need to specify where the alphabet starts and how many symbols it contains:
def shiftString(string, number, start=97, num_of_symbols=26):
    # Shift only code points in [start, start + num_of_symbols); leave everything else unchanged.
    return "".join(map(lambda x: chr(((ord(x)+number-start) %
        num_of_symbols)+start) if start <= ord(x) < start+num_of_symbols
        else x,string))
That would, e.g., convert abcxyz, when given a shift of 2, into cdezab.
If you actually want to use it for "encryption", make sure to exclude non-alphabetic characters (like spaces etc.) from it.
Edit: Shameless plug of my Vigenère tool in Python.
Edit 2: It now only converts characters within its range.
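The wrap-around idea above, together with the advice to leave non-alphabetic characters alone, could also be sketched with str.maketrans, the approach the question mentions. The function name shift_letters is hypothetical, and handling uppercase letters is an assumption that goes beyond what the question asks for:

```python
import string

def shift_letters(text, n):
    """Shift a-z and A-Z by n positions with wrap-around; leave other characters as they are."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = n % 26  # normalise the shift so slicing also works for n >= 26 or negative n
    table = str.maketrans(lower + upper,
                          lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return text.translate(table)

print(shift_letters('abc', 2))
print(shift_letters('abc! fgdf ', 2))
```

Because the translation table only contains letters, spaces and punctuation pass through untouched without any extra range check.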

How about something like
>>> my_string = "abc"
>>> n = 2
>>> "".join([ chr(ord(i) + n) for i in my_string])
'cde'
Note: As mentioned in the comments, the question is a bit vague about what to do when edge cases such as xyz are encountered.
Edit: To take care of edge cases, you can write something like
>>> from string import ascii_lowercase
>>> lower = ascii_lowercase
>>> input = "xyz"
>>> "".join([ lower[(lower.index(i)+2)%26] for i in input ])
'zab'
>>> input = "abc"
>>> "".join([ lower[(lower.index(i)+2)%26] for i in input ])
'cde'

I've made the following change to the code:
number = 2
prim = """Special() ops() chars!!"""
final = prim.lower()
for x in final:
    if(x =="y"):
        print("a", end="")
    elif(x=="z"):
        print("b", end="")
    elif (ord(x) in range(97, 123)):  # 97..122 are the code points of 'a'..'z'
        conv = ord(x)
        x = conv+number
        print(chr(x),end="")
    else:
        print(x, end="")
**Output**: urgekcn() qru() ejctu!!

test_data = (('abz', 2), ('abc', 3), ('aek', 26), ('abcd', 25))
# translate every character
def shiftstr(s, k):
    if not (isinstance(s, str) and isinstance(k, int) and k >=0):
        return s
    a = ord('a')
    return ''.join([chr(a+((ord(c)-a+k)%26)) for c in s])
for s, k in test_data:
    print(shiftstr(s, k))
print('----')
# translate at most 26 characters, rest look up dictionary at O(1)
def shiftstr(s, k):
    if not (isinstance(s, str) and isinstance(k, int) and k >=0):
        return s
    a = ord('a')
    d = {}
    l = []
    for c in s:
        v = d.get(c)
        if v is None:
            v = chr(a+((ord(c)-a+k)%26))
            d[c] = v
        l.append(v)
    return ''.join(l)
for s, k in test_data:
    print(shiftstr(s, k))
Testing shiftstr_test.py (above code):
$ python3 shiftstr_test.py
cdb
def
aek
zabc
----
cdb
def
aek
zabc
It covers wrapping.

Related

Python - removing repeated letters in a string

Say I have a string in alphabetical order, based on the amount of times that a letter repeats.
Example: "BBBAADDC".
There are 3 B's, so they go at the start, 2 A's and 2 D's, so the A's go in front of the D's because they are in alphabetical order, and 1 C. Another example would be CCCCAAABBDDAB.
Note that there can be 4 letters in the middle somewhere (i.e. CCCC), as there could be 2 pairs of 2 letters.
However, let's say I can only have n letters in a row. For example, if n = 3 in the second example, then I would have to omit one "C" from the first substring of 4 C's, because there can only be a maximum of 3 of the same letters in a row.
Another example would be the string "CCCDDDAABC"; if n = 2, I would have to remove one C and one D to get the string CCDDAABC
Example input/output:
n=2: Input: AAABBCCCCDE, Output: AABBCCDE
n=4: Input: EEEEEFFFFGGG, Output: EEEEFFFFGGG
n=1: Input: XXYYZZ, Output: XYZ
How can I do this with Python? Thanks in advance!
This is what I have right now, although I'm not sure if it's on the right track. Here, z is the length of the string.
for k in range(z+1):
    if final_string[k] == final_string[k+1] == final_string[k+2] == final_string[k+3]:
        final_string = final_string.translate({ord(final_string[k]): None})
return final_string
Ok, based on your comment, you're either pre-sorting the string or it doesn't need to be sorted by the function you're trying to create. You can do this more easily with itertools.groupby():
import itertools
def max_seq(text, n=1):
    result = []
    for k, g in itertools.groupby(text):
        result.extend(list(g)[:n])
    return ''.join(result)
max_seq('AAABBCCCCDE', 2)
# 'AABBCCDE'
max_seq('EEEEEFFFFGGG', 4)
# 'EEEEFFFFGGG'
max_seq('XXYYZZ')
# 'XYZ'
max_seq('CCCDDDAABC', 2)
# 'CCDDAABC'
In each group g, it's expanded and then sliced until n elements (the [:n] part) so you get each letter at most n times in a row. If the same letter appears elsewhere, it's treated as an independent sequence when counting n in a row.
Edit: Here's a shorter version, which may also perform better for very long strings. And while we're using itertools, this one additionally utilises itertools.chain.from_iterable() to create the flattened list of letters. And since each of these is a generator, it's only evaluated/expanded at the last line:
import itertools
def max_seq(text, n=1):
    sequences = (list(g)[:n] for _, g in itertools.groupby(text))
    letters = itertools.chain.from_iterable(sequences)
    return ''.join(letters)
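A small variation on the groupby approach above, in case very long runs are a concern: itertools.islice takes at most n items from each group without first building the whole run as a list. This is only a sketch, and the name cap_runs is hypothetical:

```python
from itertools import chain, groupby, islice

def cap_runs(text, n=1):
    """Keep at most n consecutive copies of each character."""
    # islice stops after n items; groupby skips the rest of the run
    # when the next group is requested.
    capped = (islice(group, n) for _, group in groupby(text))
    return ''.join(chain.from_iterable(capped))

print(cap_runs('AAABBCCCCDE', 2))
print(cap_runs('CCCDDDAABC', 2))
```

As with max_seq, a letter that reappears later in the string starts a new run and is counted independently.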
hello = "hello frrriend"
def replacing() -> str:
    global hello
    j = 0
    for i in hello:
        if j == 0:
            pass
        else:
            if i == prev:
                hello = hello.replace(i, "")
                prev = i
        prev = i
        j += 1
    return hello
replacing()
It looks a bit primitive, but I think it works; that's what I came up with on the go anyway. Hope it helps :D
Here's my solution:
def snip_string(string, n):
    list_string = list(string)
    list_string.sort()
    chars = set(string)
    for char in chars:
        while list_string.count(char) > n:
            list_string.remove(char)
    return ''.join(list_string)
Calling the function with various values for n gives the following output:
>>> string = "AAAABBBCCCDDD"
>>> snip_string(string, 1)
'ABCD'
>>> snip_string(string, 2)
'AABBCCDD'
>>> snip_string(string, 3)
'AAABBBCCCDDD'
>>>
Edit
Here is the updated version of my solution, which only removes characters if the group of repeated characters exceeds n.
import itertools
def snip_string(string, n):
    groups = [list(g) for k, g in itertools.groupby(string)]
    string_list = []
    for group in groups:
        while len(group) > n:
            del group[-1]
        string_list.extend(group)
    return ''.join(string_list)
Output:
>>> string = "DDDAABBBBCCABCDE"
>>> snip_string(string, 3)
'DDDAABBBCCABCDE'
from itertools import groupby
n = 2
def rem(string):
    out = "".join(["".join(list(g)[:n]) for _, g in groupby(string)])
    print(out)
So this is the entire code for your question.
s = "AABBCCDDEEE"
s2 = "AAAABBBDDDDDDD"
s3 = "CCCCAAABBDDABBB"
s4 = "AAAAAAAA"
z = "AAABBCCCCDE"
Calling rem on each of the strings above prints:
AABBCCDDEE
AABBDD
CCAABBDDABB
AA
AABBCCDE

Need help regarding random string generation python

What I want is to generate a string in this specific format: l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l With each l and d being a different string or number.
The issue is when I try to generate, the whole thing is the same value/string. But I want it different.
Here is an example:
What I am getting:
lll9999l9llll9999l9llll9999l9llll9999l9l
What I need:
bfb7491w3anfr4530x2zzbg9891u2rbep8421m9s
def id_gen():
    l = random.choice(string.ascii_lowercase)
    d = random.choice(string.digits)
    id = l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l
    print(id)
The result:
lll9999l9llll9999l9llll9999l9llll9999l9l
I need this to generate something different :)
This seems to work for me:
def gen_id() :
    pattern = 'lllddddldllllddddldllllddddldllllddddldl'
    digits = [random.choice(string.digits) for i in range(len(pattern))]
    letters = [random.choice(string.ascii_lowercase) for i in range(len(pattern))]
    return ''.join( digits[i] if pattern[i] == 'd' else letters[i] for i in range(len(pattern)) )
testing:
>>> gen_id()
'lnx1066k0hnrd5409d1nhgo1254t6rzyw5165f8v'
>>> gen_id()
'sbc7119f4ythd8845i1afay1900f4wjcv0659b4e'
>>> gen_id()
'yan6228r0nebj5097y7jnwh7065s7osra0391j5f'
>>>
seems different enough... please, don't forget to import string, random =)
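For what it's worth, the original id_gen repeats itself because random.choice is called only once for l and once for d, so the long concatenation just reuses those two fixed characters. A minimal sketch that draws a fresh character for every position of the pattern (the names PATTERN, POOLS and gen_id_fresh are hypothetical):

```python
import random
import string

# 'l' marks a lowercase letter, 'd' marks a digit (same layout as in the question).
PATTERN = 'lllddddldllllddddldllllddddldllllddddldl'
POOLS = {'l': string.ascii_lowercase, 'd': string.digits}

def gen_id_fresh(pattern=PATTERN):
    # random.choice runs once per position, so every character is drawn independently.
    return ''.join(random.choice(POOLS[symbol]) for symbol in pattern)

print(gen_id_fresh())
```

Each call makes exactly one random draw per output character, so nothing is generated and then thrown away.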
To avoid drawing more random values than needed, IMHO this is the best solution:
def gen_id(pattern) :
    l = len(pattern)
    d = pattern.count('d')
    # k must be passed by keyword: the second positional argument of random.choices is weights
    digits = random.choices(string.digits, k=d)
    letters = random.choices(string.ascii_lowercase, k=l-d)
    return ''.join( digits.pop() if pattern[i] == 'd' else letters.pop() for i in range(l) )
You can use this to get a random combination of letters and digits in the desired order:
def letter():
    return random.choice(string.ascii_lowercase)
def digit():
    return random.choice(string.digits)
def id_gen():
    return letter() + digit() + letter() + letter() # ldll

Find longest unique substring in string python

I am trying that age old question (there are multitudes of versions around) of finding the longest substring of a string which doesn't contain repeated characters. I can't work out why my attempt doesn't work properly:
def findLongest(inputStr):
    resultSet = []
    substr = []
    for c in inputStr:
        print ("c: ", c)
        if substr == []:
            substr.append([c])
            continue
        print(substr)
        for str in substr:
            print ("c: ",c," - str: ",str,"\n")
            if c in str:
                resultSet.append(str)
                substr.remove(str)
            else:
                str.append(c)
        substr.append([c])
    print("Result set:")
    print(resultSet)
    return max(resultSet, key=len)
print (findLongest("pwwkewambb"))
When my output gets to the second 'w', it doesn't iterate over all the substr elements. I think I've done something silly, but I can't see what it is so some guidance would be appreciated! I feel like I'm going to kick myself at the answer...
The beginning of my output:
c: p
c: w
[['p']]
c: w - str: ['p']
c: w
[['p', 'w'], ['w']]
c: w - str: ['p', 'w'] # I expect the next line to say c: w - str: ['w']
c: k
[['w'], ['w']] # it is like the w was ignored as it is here
c: k - str: ['w']
c: k - str: ['w']
...
EDIT:
I replaced the for loop with
for idx, str in enumerate(substr):
    print ("c: ",c," - str: ",str,"\n")
    if c in str:
        resultSet.append(str)
        substr[idx] = []
    else:
        str.append(c)
and it produces the correct result. The only thing is that the empty element arrays get set with the next character. It seems a bit pointless; there must be a better way.
My expected output is kewamb.
e.g.
c: p
c: w
[['p']]
c: w - str: ['p']
c: w
[['p', 'w'], ['w']]
c: w - str: ['p', 'w']
c: w - str: ['w']
c: k
[[], [], ['w']]
c: k - str: []
c: k - str: []
c: k - str: ['w']
c: e
[['k'], ['k'], ['w', 'k'], ['k']]
c: e - str: ['k']
c: e - str: ['k']
c: e - str: ['w', 'k']
c: e - str: ['k']
...
Edit, per comment by #seymour on incorrect responses:
from itertools import groupby
def find_longest(s):
    _longest = set()
    def longest(x):
        if x in _longest:
            _longest.clear()
            return False
        _longest.add(x)
        return True
    return ''.join(max((list(g) for _, g in groupby(s, key=longest)), key=len))
And test:
In [101]: assert find_longest('pwwkewambb') == 'kewamb'
In [102]: assert find_longest('abcabcbb') == 'abc'
In [103]: assert find_longest('abczxyabczxya') == 'abczxy'
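One caveat, offered as an observation rather than a correction of the answer: when a repeat is found, the set is cleared and the repeated character itself is not carried into the next group, so an input such as 'dvdf' yields 'dv' rather than 'vdf'. A sliding-window sketch that handles that case (the name longest_unique_window is hypothetical, and this is a different technique from the groupby one above):

```python
def longest_unique_window(s):
    """Return the first longest substring of s that has no repeated characters."""
    last_seen = {}               # character -> index of its most recent occurrence
    start = 0                    # left edge of the current duplicate-free window
    best_start, best_len = 0, 0
    for i, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge just past it.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    return s[best_start:best_start + best_len]

print(longest_unique_window('pwwkewambb'))
print(longest_unique_window('dvdf'))
```

The window only ever moves forward, so the string is scanned once.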
Old answer:
from itertools import groupby
s = set() ## for mutable access
''.join(max((list(g) for _, g in groupby('pwwkewambb', key=lambda x: not ((s and x == s.pop()) or s.add(x)))), key=len))
'kewamb'
groupby returns an iterator grouped based on the function provided in the key argument, which by default is lambda x: x. Instead of the default we are utilizing some state by using a mutable structure (which could have been done a more intuitive way if using a normal function)
lambda x: not ((s and x == s.pop()) or s.add(x))
What is happening here is since I can't reassign a global assignment in a lambda (again I can do this, using a proper function), I just created a global mutable structure that I can add/remove. The key (no pun) is that I only keep elements that I need by using a short circuit to add/remove items as needed.
max and len are fairly self explanatory, to get the longest list produced by groupby
Another version without the mutable global structure business:
def longest(x):
    if hasattr(longest, 'last'):
        result = not (longest.last == x)
        longest.last = x
        return result
    longest.last = x
    return True
''.join(max((list(g) for _, g in groupby('pwwkewambb', key=longest)), key=len))
'kewamb'
Not sure what is wrong in your attempt, but it's complex and in:
for str in substr:
    print ("c: ",c," - str: ",str,"\n")
    if c in str:
        resultSet.append(str)
        substr.remove(str)
you're removing elements from a list while iterating on it: don't do that, it gives unexpected results.
Anyway, my solution, not sure it's intuitive, but it's probably simpler & shorter:
slice the string with an increasing index
for each slice, create a set and store letters until you reach the end of the string or a letter is already in the set. Your index is the max length
compute the max of this length for every iteration & store the corresponding string
Code:
def findLongest(s):
    maxlen = 0
    longest = ""
    for i in range(0,len(s)):
        subs = s[i:]
        chars = set()
        for j,c in enumerate(subs):
            if c in chars:
                break
            else:
                chars.add(c)
        else:
            # add 1 when end of string is reached (no break)
            # handles the case where the longest string is at the end
            j+=1
        if j>maxlen:
            maxlen=j
            longest=s[i:i+j]
    return longest
print(findLongest("pwwkewambb"))
result:
kewamb
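The warning above about removing elements from a list while iterating over it can be seen in a tiny sketch. Both helper names are hypothetical; the second one iterates over a copy, which is one common way around the problem:

```python
def remove_while_iterating(items):
    """Mutates the list it is looping over, so some elements are never visited."""
    visited = []
    for item in items:
        visited.append(item)
        items.remove(item)   # shifts later elements left; the loop index still advances
    return visited

def remove_over_copy(items):
    """Loops over a snapshot, so every element is visited exactly once."""
    visited = []
    for item in list(items):
        visited.append(item)
        items.remove(item)
    return visited

print(remove_while_iterating(['a', 'b', 'c', 'd']))
print(remove_over_copy(['a', 'b', 'c', 'd']))
```

In the first function every other element is skipped, which matches the symptom in the question where the ['w'] entry is never inspected.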
It depends on your definition of repeated characters: if you mean consecutive repeats, then the accepted solution is slick, but it does not handle characters that appear more than once anywhere in the substring (e.g. pwwkewabmb -> 'kewabmb').
Here's what I came up with (Python 2):
def longest(word):
    begin = 0
    end = 0
    longest = (0,0)
    for i in xrange(len(word)):
        try:
            j = word.index(word[i],begin,end)
            # longest?
            if end-begin >= longest[1]-longest[0]:
                longest = (begin,end)
            begin = j+1
            if begin==end:
                end += 1
        except:
            end = i+1
    end=i+1
    if end-begin >= longest[1]-longest[0]:
        longest = (begin,end)
    return word[slice(*longest)]
Thus
>>> print longest('pwwkewabmb')
kewabm
>>> print longest('pwwkewambb')
kewamb
>>> print longest('bbbb')
b
My 2-cents:
from collections import Counter
def longest_unique_substr(s: str) -> str:
    # get all substr-ings from s, starting with the longest one
    for substr_len in range(len(s), 0, -1):
        for substr_start_index in range(0, len(s) - substr_len + 1):
            substr = s[substr_start_index : substr_start_index + substr_len]
            # check if all substr characters are unique
            c = Counter(substr)
            if all(v == 1 for v in c.values()):
                return substr
    # ensure empty string input returns ""
    return ""
Run:
In : longest_unique_substr('pwwkewambb')
Out: 'kewamb'
s=input()
ma=0
n=len(s)
l=[]
a=[]
d={}
st=0;i=0
while i<n:
    if s[i] not in d:
        d[s[i]]=i
        l.append(s[i])
    else:
        t=d[s[i]]
        d[s[i]]=i
        s=s[t+1:]
        d={}
        n=len(s)
        if len(l)>=3:
            a.append(l)
            ma=max(ma,len(l))
        l=[];i=-1
    i=i+1
if len(l)!=0 and len(l)>=3:
    a.append(l)
    ma=max(ma,len(l))
if len(a)==0:
    print("-1")
else:
    for i in a:
        if len(i)==ma:
            for j in i:
                print(j,end="")
            break

How to compress by removing duplicates in python?

I have strings with blocks of the same character in, eg '1254,,,,,,,,,,,,,,,,982'. What I'm aiming to do is replace that with something along the lines of '1254(,16)982' so that the original string can be reconstructed. If anyone could point me in the right direction that would be greatly appreciated
You're looking for run-length encoding: here is a Python implementation based loosely on this one.
import itertools
def runlength_enc(s):
    '''Return a run-length encoded version of the string'''
    enc = ((x, sum(1 for _ in gp)) for x, gp in itertools.groupby(s))
    removed_1s = [((c, n) if n > 1 else c) for c, n in enc]
    joined = [["".join(g)] if n == 1 else list(g)
              for n, g in itertools.groupby(removed_1s, key=len)]
    return list(itertools.chain(*joined))
def runlength_decode(enc):
    # Check the type rather than len(c) == 2, so a two-character literal such as '98'
    # is not mistaken for a (char, count) pair.
    return "".join((c[0] * c[1] if isinstance(c, tuple) else c) for c in enc)
For your example:
print(runlength_enc("1254,,,,,,,,,,,,,,,,982"))
# ['1254', (',', 16), '982']
print(runlength_decode(runlength_enc("1254,,,,,,,,,,,,,,,,982")))
# 1254,,,,,,,,,,,,,,,,982
(Note that this will be efficient only if there are very long runs in your string).
If you don't care about the exact compressed form, you may want to look at zlib.compress and zlib.decompress. zlib is a standard Python library that can compress a single string and will probably achieve better compression than a self-implemented algorithm.
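A minimal round-trip sketch of the zlib suggestion, assuming Python 3, where zlib works on bytes and the text therefore has to be encoded first (the variable names are illustrative):

```python
import zlib

original = '1254' + ',' * 16 + '982'

# zlib operates on bytes, so encode the text before compressing.
packed = zlib.compress(original.encode('utf-8'))
restored = zlib.decompress(packed).decode('utf-8')

print(len(original), len(packed))
print(restored == original)
```

For a string this short the fixed overhead of the zlib format may outweigh the savings, so it is worth comparing the two lengths on realistic data.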
Using regular expressions:
s = '1254,,,,,,,,,,,,,,,,982'
import re
c = re.sub(r'(.)\1+', lambda m: '(%s%d)' % (m.group(1), len(m.group(0))), s)
print(c) # 1254(,16)982
Using itertools:
import itertools
c = ''
for ch, g in itertools.groupby(s):  # ch rather than chr, to avoid shadowing the built-in
    k = len(list(g))
    c += ch if k == 1 else '(%s%d)' % (ch, k)
print(c) # 1254(,16)982
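Since the question asks for a form from which the original can be reconstructed, here is a hedged sketch of the reverse step for the '(,16)' notation. The function names encode and decode are hypothetical, and the decoder assumes the original text contains no parentheses and no newlines; otherwise the notation is ambiguous:

```python
import re

def encode(s):
    # Replace each run of two or more identical characters with "(<char><count>)".
    return re.sub(r'(.)\1+',
                  lambda m: '(%s%d)' % (m.group(1), len(m.group(0))), s)

def decode(c):
    # Expand every "(<char><count>)" group back into <count> copies of <char>.
    # Assumes the original text contained no parentheses.
    return re.sub(r'\((.)(\d+)\)',
                  lambda m: m.group(1) * int(m.group(2)), c)

text = '1254' + ',' * 16 + '982'
packed = encode(text)
print(packed)
print(decode(packed) == text)
```

If parentheses can occur in the data, a list-based representation like the one returned by runlength_enc avoids the ambiguity.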

removing non-numeric characters from a string

strings = ["1 asdf 2", "25etrth", "2234342 awefiasd"] #and so on
Which is the easiest way to get [1, 25, 2234342]?
How can this be done without a regex module or expression like (^[0-9]+)?
One could write a helper function to extract the prefix:
def numeric_prefix(s):
    n = 0
    for c in s:
        if not c.isdigit():
            return n
        else:
            n = n * 10 + int(c)
    return n
Example usage:
>>> strings = ["1asdf", "25etrth", "2234342 awefiasd"]
>>> [numeric_prefix(s) for s in strings]
[1, 25, 2234342]
Note that this will produce correct output (zero) when the input string does not have a numeric prefix (as in the case of empty string).
Working from Mikel's solution, one could write a more concise definition of numeric_prefix:
import itertools
def numeric_prefix(s):
    n = ''.join(itertools.takewhile(lambda c: c.isdigit(), s))
    return int(n) if n else 0
new = []
for item in strings:
    new.append(int(''.join(i for i in item if i.isdigit())))
print(new)
[1, 25, 2234342]
Basic usage of regular expressions:
import re
strings = ["1asdf", "25etrth", "2234342 awefiasd"]
regex = re.compile(r'^(\d*)')  # raw string, so \d is not treated as a string escape
for s in strings:
    mo = regex.match(s)
    print(s, '->', mo.group(0))
1asdf -> 1
25etrth -> 25
2234342 awefiasd -> 2234342
Building on sahhhm's answer, you can fix the "1 asdf 1" problem by using takewhile.
from itertools import takewhile
def isdigit(char):
    return char.isdigit()
numbers = []
for string in strings:
    result = takewhile(isdigit, string)
    resultstr = ''.join(result)
    if resultstr:
        number = int(resultstr)
        if number:
            numbers.append(number)
So you only want the leading digits? And you want to avoid regexes? Probably there's something shorter but this is the obvious solution.
nlist = []
for s in strings:
    if not s or s[0].isalpha(): continue
    for i, c in enumerate(s):
        if not c.isdigit():
            nlist.append(int(s[:i]))
            break
    else:
        nlist.append(int(s))
