how to use python to find the first not repeating character? - python

I am solving a problem: Given a string s consisting of small English letters, find and return the first instance of a non-repeating character in it. If there is no such character, return '_'.
For example: s = "abacabad", the output should be firstNotRepeatingCharacter(s) = 'c'.
I wrote a simple code, it got through all the test, but when I submit it, it reports error, anyone know what's wrong with my code? Thank you!
def firstNotRepeatingCharacter(s):
for i in list(s):
if list(s).count(i) == 1:
return i
return '_'

Could be a performance issue as your repeated count (and unnecessary list conversions) calls make this approach quadratic. You should aim for a linear solution:
from collections import Counter
def firstNotRepeatingCharacter(s):
c = Counter(s)
for i in s:
if c[i] == 1:
return i
return '_'
You can also use next with a generator and a default value:
def firstNotRepeatingCharacter(s):
c = Counter(s)
return next((i for i in s if c[i] == 1), '_')
If you can only use built-ins, just make your own counter (or any other data structure that allows you to identify duplicates)
def firstNotRepeatingCharacter(s):
c = {}
for i in s:
c[i] = c.get(i, 0) + 1
return next((i for i in s if c[i] == 1), '_')

The task at hand is to find first non repeating character from a given string e.g. s = 'aabsacbhsba'
def solution(s):
# This will take each character from the given string s one at a time
for i in s:
if s.index(i) == s.rindex(i): # rindex() returns last index of i in s
return i
return '_'
Here s.rindex(i) method finds the last occurrence of the specified value [value at i in s in our case] we are comparing it with current index s.index(i), if they return the same value we found the first occurance of specified value which is not repeated
You can find definition and usage of rindex() at : W3School rindex()

Related

how to recursively create nested list from string input

So, I would like to convert my string input
'f(g,h(a,b),a,b(g,h))'
into the following list
['f',['g','h',['a','b'],'a','b',['g','h']]]
Essentially, I would like to replace all '(' into [ and all ')' into ].
I have unsuccessfully tried to do this recursively. I thought I would iterate through all the variables through my word and then when I hit a '(' I would create a new list and start extending the values into that newest list. If I hit a ')', I would stop extending the values into the newest list and append the newest list to the closest outer list. But I am very new to recursion, so I am struggling to think of how to do it
word='f(a,f(a))'
empty=[]
def newlist(word):
listy=[]
for i, letter in enumerate(word):
if letter=='(':
return newlist([word[i+1:]])
if letter==')':
listy.append(newlist)
else:
listy.extend(letter)
return empty.append(listy)
Assuming your input is something like this:
a = 'f,(g,h,(a,b),a,b,(g,h))'
We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:
tokens = list(a)
Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:
def next_token():
return tokens[0] if tokens else None
def pop_token():
tokens.pop(0)
Your input consist of "items", separated by a comma. Schematically, it can be expressed as
items = item ( ',' item )*
In the python code, we first read one item and then keep reading further items while the next token is a comma:
def items():
result = [item()]
while next_token() == ',':
pop_token()
result.append(item())
return result
An "item" is either a sublist in parentheses or a letter:
def item():
return sublist() or letter()
To read a sublist, we check if the token is a '(', the use items above the read the content and finally check for the ')' and panic if it is not there:
def sublist():
if next_token() == '(':
pop_token()
result = items()
if next_token() == ')':
pop_token()
return result
raise SyntaxError()
letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:
def letter():
result = next_token()
pop_token()
return result
You can organize the above code like this: have one function parse that accepts a string and returns a list and put all functions above inside this function:
def parse(input_string):
def items():
...
def sublist():
...
...etc
tokens = list(input_string)
return items()
Quite an interesting question, and one I originally misinterpreted. But now this solution works accordingly. Note that I have used list concatenation + operator for this solution (which you usually want to avoid) so feel free to improve upon it however you see fit.
Good luck, and I hope this helps!
# set some global values, I prefer to keep it
# as a set incase you need to add functionality
# eg if you also want {{a},b} or [ab<c>ed] to work
OPEN_PARENTHESIS = set(["("])
CLOSE_PARENTHESIS = set([")"])
SPACER = set([","])
def recursive_solution(input_str, index):
# base case A: when index exceeds or equals len(input_str)
if index >= len(input_str):
return [], index
char = input_str[index]
# base case B: when we reach a closed parenthesis stop this level of recursive depth
if char in CLOSE_PARENTHESIS:
return [], index
# do the next recursion, return it's value and the index it stops at
recur_val, recur_stop_i = recursive_solution(input_str, index + 1)
# with an open parenthesis, we want to continue the recursion after it's associated
# closed parenthesis. and also the recur_val should be within a new dimension of the list
if char in OPEN_PARENTHESIS:
continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i + 1)
return [recur_val] + continued_recur_val, continued_recur_stop_i
# for spacers eg "," we just ignore it
if char in SPACER:
return recur_val, recur_stop_i
# and finally with normal characters, we just extent it
return [char] + recur_val, recur_stop_i
You can get the expected answer using the following code but it's still in string format and not a list.
import re
a='(f(g,h(a,b),a,b(g,h))'
ans=[]
sub=''
def rec(i,sub):
if i>=len(a):
return sub
if a[i]=='(':
if i==0:
sub=rec(i+1,sub+'[')
else:
sub=rec(i+1,sub+',[')
elif a[i]==')':
sub=rec(i+1,sub+']')
else:
sub=rec(i+1,sub+a[i])
return sub
b=rec(0,'')
print(b)
b=re.sub(r"([a-z]+)", r"'\1'", b)
print(b,type(b))
Output
[f,[g,h,[a,b],a,b,[g,h]]
['f',['g','h',['a','b'],'a','b',['g','h']] <class 'str'>

Recursive function to obtain non-repeated characters from string

I have this exercise:
Write a recursive function that takes a string and returns all the characters that are not repeated in said string.
The characters in the output don't need to have the same order as in the input string.
First I tried this, but given the condition for the function to stop, it never evaluates the last character:
i=0
lst = []
def list_of_letters_rec(str=""):
if str[i] not in lst and i < len(str) - 1:
lst.append(str[i])
list_of_letters_rec(str[i+1:])
elif str[i] in lst and i < len(str) - 1:
list_of_letters_rec(str[i+1:])
elif i > len(str) - 1:
return lst
return lst
word = input(str("Word?"))
print(list_of_letters_rec(word))
The main issue with this function is that it never evaluates the last character.
An example of an output:
['a', 'r', 'd', 'v'] for input 'aardvark'.
Since the characters don't need to be ordered, I suppose a better approach would be to do the recursion backwards, and I also tried another approach (below), but no luck:
lst = []
def list_of_letters_rec(str=""):
n = len(str) - 1
if str[n] not in lst and n >= 0:
lst.append(str[n])
list_of_letters_rec(str[:n-1])
elif str[n] in lst and n >= 0:
list_of_letters_rec(str[:n-1])
return lst
word = input(str("Word?"))
print(list_of_letters_rec(word))
Apparently, the stop conditions are not well defined, especially in the last one, as the output I get is
IndexError: string index out of range
Could you give me any hints to help me correct the stop condition, either in the 1st or 2nd try?
You can try:
word = input("> ")
result = [l for l in word if word.count(l) < 2]
> aabc
['b', 'c']
Demo
One improvement I would offer on #trincot's answer is the use of a set, which has better look-up time, O(1), compared to lists, O(n).
if the input string, s, is empty, return the empty result
(inductive) s has at least one character. if the first character, s[0] is in the memo, mem, the character has already been seen. Return the result of the sub-problem, s[1:]
(inductive) The first character is not in the memo. Add the first character to the memo and prepend the first character to the result of the sub-problem, s[1:]
def list_of_letters(s, mem = set()):
if not s:
return "" #1
elif s[0] in mem:
return list_of_letters(s[1:], mem) #2
else:
return s[0] + list_of_letters(s[1:], {*mem, s[0]}) #3
print(list_of_letters("aardvark"))
ardvk
Per your comment, the exercise asks only for a string as input. We can easily modify our program to privatize mem -
def list_of_letters(s): # public api
def loop(s, mem): # private api
if not s:
return ""
elif s[0] in mem:
return loop(s[1:], mem)
else:
return s[0] + loop(s[1:], {*mem, s[0]})
return loop(s, set()) # run private func
print(list_of_letters("aardvark")) # mem is invisible to caller
ardvk
Python's native set data type accepts an iterable which solves this problem instantly. However this doesn't teach you anything about recursion :D
print("".join(set("aardvark")))
akdrv
Some issues:
You miss the last character because of i < len(str) - 1 in the conditionals. That should be i < len(str) (but read the next points, as this still needs change)
The test for if i > len(str) - 1 should come first, before doing anything else, otherwise you'll get an invalid index reference. This also makes the other conditions on the length unnecessary.
Don't name your variable str, as that is already a used name for the string type.
Don't populate a list that is global. By doing this, you can only call the function once reliably. Any next time the list will still have the result of the previous call, and you'll be adding to that. Instead use the list that you get from the recursive call. In the base case, return an empty list.
The global i has no use, since you never change its value; it is always 0. So you should just reference index [0] and check that the string is not empty.
Here is your code with those corrections:
def list_of_letters_rec(s=""):
if not s:
return []
result = list_of_letters_rec(s[1:])
if s[0] not in result:
result.append(s[0])
return result
print(list_of_letters_rec("aardvark"))
NB: This is not the most optimal way to do it. But I guess this is what you are asked to do.
A possible solution would be to just use an index instead of splicing the string:
def list_of_letters_rec(string="", index = 0, lst = []):
if(len(string) == index):
return lst
char = string[index]
if string.count(char) == 1:
lst.append(char)
return list_of_letters_rec(string, index+1, lst)
word = input(str("Word?"))
print(list_of_letters_rec(word))

How do I find the predominant letters in a list of strings

I want to check for each position in the string what is the character that appears most often on that position. If there are more of the same frequency, keep the first one. All strings in the list are guaranteed to be of identical length!!!
I tried the following way:
print(max(((letter, strings.count(letter)) for letter in strings), key=lambda x:[1])[0])
But I get: mistul or qagic
And I can not figure out what's wrong with my code.
My list of strings looks like this:
Input: strings = ['mistul', 'aidteh', 'mhfjtr', 'zxcjer']
Output: mister
Explanation: On the first position, m appears twice. Second, i appears twice twice. Third, there is no predominant character, so we chose the first, that is, s. On the fourth position, we have t twice and j twice, but you see first t, so we stay with him, on the fifth position we have e twice and the last r twice.
Another examples:
Input: ['qagic', 'cafbk', 'twggl', 'kaqtc', 'iisih', 'mbpzu', 'pbghn', 'mzsev', 'saqbl', 'myead']
Output: magic
Input: ['sacbkt', 'tnqaex', 'vhcrhl', 'obotnq', 'vevleg', 'rljnlv', 'jdcjrk', 'zuwtee', 'xycbvm', 'szgczt', 'imhepi', 'febybq', 'pqkdfg', 'swwlds', 'ecmrut', 'buwruy', 'icjwet', 'gebgbq', 'djtfzr', 'uenleo']
Expected Output: secret
Some help?
Finally a use case for zip() :-)
If you like cryptic code, it could even be done in one statement:
def solve(strings):
return ''.join([max([(letter, letters.count(letter)) for letter in letters], key=lambda x: x[1])[0] for letters in zip(*strings)])
But I prefer a more readable version:
def solve(strings):
result = ''
# "zip" the strings, so in the first iteration `letters` would be a list
# containing the first letter of each word, the second iteration it would
# be a list of all second letters of each word, and so on...
for letters in zip(*strings):
# Create a list of (letter, count) pairs:
letter_counts = [(letter, letters.count(letter)) for letter in letters]
# Get the first letter with the highest count, and append it to result:
result += max(letter_counts, key=lambda x: x[1])[0]
return result
# Test function with input data from question:
assert solve(['mistul', 'aidteh', 'mhfjtr', 'zxcjer']) == 'mister'
assert solve(['qagic', 'cafbk', 'twggl', 'kaqtc', 'iisih', 'mbpzu', 'pbghn',
'mzsev', 'saqbl', 'myead']) == 'magic'
assert solve(['sacbkt', 'tnqaex', 'vhcrhl', 'obotnq', 'vevleg', 'rljnlv',
'jdcjrk', 'zuwtee', 'xycbvm', 'szgczt', 'imhepi', 'febybq',
'pqkdfg', 'swwlds', 'ecmrut', 'buwruy', 'icjwet', 'gebgbq',
'djtfzr', 'uenleo']) == 'secret'
UPDATE
#dun suggested a smarter way of using the max() function, which makes the one-liner actually quite readable :-)
def solve(strings):
return ''.join([max(letters, key=letters.count) for letters in zip(*strings)])
Using collections.Counter() is a nice strategy here. Here's one way to do it:
from collections import Counter
def most_freq_at_index(strings, idx):
chars = [s[idx] for s in strings]
char_counts = Counter(chars)
return char_counts.most_common(n=1)[0][0]
strings = ['qagic', 'cafbk', 'twggl', 'kaqtc', 'iisih',
'mbpzu', 'pbghn', 'mzsev', 'saqbl', 'myead']
result = ''.join(most_freq_at_index(strings, idx) for idx in range(5))
print(result)
## 'magic'
If you want something more manual without the magic of Python libraries you can do something like this:
def f(strings):
dic = {}
for string in strings:
for i in range(len(string)):
word_dic = dic.get(i, { string[i]: 0 })
word_dic[string[i]] = word_dic.get(string[i], 0) + 1
dic[i] = word_dic
largest_string = max(strings, key = len)
result = ""
for i in range(len(largest_string)):
result += max(dic[i], key = lambda x : dic[i][x])
return result
strings = ['qagic', 'cafbk', 'twggl', 'kaqtc', 'iisih', 'mbpzu', 'pbghn', 'mzsev', 'saqbl', 'myead']
f(strings)
'magic'

How to delete repeating letters in a string?

I am trying to write a function which will return me the string of unique characters present in the passed string. Here's my code:
def repeating_letters(given_string):
counts = {}
for char in given_string:
if char in counts:
return char
else:
counts[char] = 1
if counts[char] > 1:
del(char)
else:
return char
I am not getting expected results with it. How can I get the desired result.
Here when I am passing this string as input:
sample_input = "abcadb"
I am expecting the result to be:
"abcd"
However my code is returning me just:
nothing
def repeating_letters(given_string):
seen = set()
ret = []
for c in given_string:
if c not in seen:
ret.append(c)
seen.add(c)
return ''.join(ret)
Here we add each letter to the set seen the first time we see it, at the same time adding it to a list ret. Then we return the joined list.
Here's the one-liner to achieve this if the order in the resultant string matters via using set with sorted as:
>>> my_str = 'abcadbgeg'
>>> ''.join(sorted(set(my_str),key=my_str.index))
'abcdge'
Here sorted will sort the characters in the set based on the first index of each in the original string, resulting in ordered list of characters.
However if the order in the resultant string doesn't matter, then you may simply do:
>>> ''.join(set(my_str))
'acbedg'

Why won't my Python function be recursive/call itself?

I am making a String Rewriting Function that takes a string and rewrites it according to the rules in a dictionary. The code works perfectly fine once but I need it to call itself n times and print the 'nth' rewritten string. The code that works that I need to be recursive is:
S = "AB"
def srs_print(S, n, rules):
'''
A function that takes a dictionary as SRS rules and prints
the output
'''
axiom = list(S)
key = []
value = []
output = ''
for k in rules:
#Inputs the keys of the rules dictionary into a new list
key.append(k)
#Inputs the value of the rules dictionary into a new list
value.append(rules[k])
for x in axiom:
if x in key:
axiomindex = key.index(x)
output += value[axiomindex]
else:
output += x
S = output
return S
#j = srs_print(S, 5, {'A':'AB', 'B': 'A'})
#print(j)
#while len(range(n)) > 0:
# S = srs_print(S, n, rules)
# n = n-1
#print("The", n, "th rewrite is " )
#j = srs_print(S, 5, {'A':'AB', 'B': 'A'})
print(srs_print("A", 5, {'A':'AB', 'B': 'A'}))
This turns "A" into "AB" but I need it to put 'S' back into the function and run again 'n'times. As you can see, some of the commented code are lines I have tried to use but have failed.
I hope I've explained myself well enough.
If I understand correctly, you pass "n" to the function as a way to count how many times you need to call it.
You could probably get away with enclosing the whole body of the function in a for or a whileloop.
If you want it to be recursive anyway, here's how it could work:
You need to have "two" return statements for your function. One that returns the result (in your case, "S"), and another one, that returns srs_print(S, n-1, rules)
Ex:
if n > 0:
return srs_print(S, n-1, rules)
else:
return S
I suggest you take some time and read this, to get a better understanding of what you want to do and whether or not it is what you should do.
You definitly don't need recursion here. First let's rewrite your function in a more pythonic way:
def srs_transform(string, rules):
'''
A function that takes a dictionary as SRS rules and prints
the output
'''
# the short way:
# return "".join(rules.get(letter, letter) for letter in string)
# the long way
output = []
for letter in string:
transformed = rules.get(letter, letter)
output.append(transformed)
return "".join(output)
Now we add a wrapper function that will take care of applying srs_transform a given number of times to it's arguments:
def srs_multi_transform(string, rules, times):
output = string
for x in range(times):
output = srs_transform(output, rules)
return output
And now we just have to call it:
print(srs_transform("A", {'A':'AB', 'B': 'A'}, 5))
>> ABAABABAABAAB
Why not simply use a for-loop?
def srs_print(S, n, rules):
for _ in range(n):
S = ''.join(rules.get(x, x) for x in S)
return S

Categories