how to recursively create nested list from string input - python

So, I would like to convert my string input
'f(g,h(a,b),a,b(g,h))'
into the following list
['f',['g','h',['a','b'],'a','b',['g','h']]]
Essentially, I would like to replace every '(' with [ and every ')' with ].
I have unsuccessfully tried to do this recursively. My idea was to iterate over the characters of my word, and when I hit a '(' I would create a new list and start extending the values into that newest list. If I hit a ')', I would stop extending the values into the newest list and append the newest list to the closest outer list. But I am very new to recursion, so I am struggling to think of how to do it.
    word='f(a,f(a))'
    empty=[]
    def newlist(word):
        listy=[]
        for i, letter in enumerate(word):
            if letter=='(':
                return newlist([word[i+1:]])
            if letter==')':
                listy.append(newlist)
            else:
                listy.extend(letter)
        return empty.append(listy)

Assuming your input is something like this:
a = 'f,(g,h,(a,b),a,b,(g,h))'
We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:
tokens = list(a)
Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:
    def next_token():
        return tokens[0] if tokens else None

    def pop_token():
        tokens.pop(0)
Your input consists of "items", separated by commas. Schematically, it can be expressed as
items = item ( ',' item )*
In the python code, we first read one item and then keep reading further items while the next token is a comma:
    def items():
        result = [item()]
        while next_token() == ',':
            pop_token()
            result.append(item())
        return result
An "item" is either a sublist in parentheses or a letter:
    def item():
        return sublist() or letter()
To read a sublist, we check if the token is a '(', then use items above to read the content, and finally check for the ')' and panic if it is not there:
    def sublist():
        if next_token() == '(':
            pop_token()
            result = items()
            if next_token() == ')':
                pop_token()
                return result
            raise SyntaxError()
letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:
    def letter():
        result = next_token()
        pop_token()
        return result
You can organize the above code like this: have one function parse that accepts a string and returns a list, and put all the functions above inside this function:
    def parse(input_string):
        def items():
            ...
        def sublist():
            ...
        ...etc
        tokens = list(input_string)
        return items()
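Put together, the pieces above could look like the sketch below. It only assembles the functions already shown; the error message text and the explicit `return None` in `sublist` are my additions, and it assumes the comma-separated input format from the start of this answer.

```python
def parse(input_string):
    # Tokens are single characters, as in the answer above.
    tokens = list(input_string)

    def next_token():
        return tokens[0] if tokens else None

    def pop_token():
        tokens.pop(0)

    def items():
        # items = item ( ',' item )*
        result = [item()]
        while next_token() == ',':
            pop_token()
            result.append(item())
        return result

    def item():
        # A sublist is never empty here, so a falsy value means "not a sublist".
        return sublist() or letter()

    def sublist():
        if next_token() == '(':
            pop_token()
            result = items()
            if next_token() == ')':
                pop_token()
                return result
            raise SyntaxError("expected ')'")  # message text is an addition
        return None

    def letter():
        result = next_token()
        pop_token()
        return result

    return items()


print(parse('f,(g,h,(a,b),a,b,(g,h))'))
```

Keeping `tokens` as a local variable of `parse` means each call works on its own token list instead of a shared global.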

Quite an interesting question, and one I originally misinterpreted, but this solution now works as intended. Note that I have used list concatenation (the + operator) for this solution, which you usually want to avoid, so feel free to improve upon it however you see fit.
Good luck, and I hope this helps!
    # set some global values, I prefer to keep them
    # as sets in case you need to add functionality
    # eg if you also want {{a},b} or [ab<c>ed] to work
    OPEN_PARENTHESIS = set(["("])
    CLOSE_PARENTHESIS = set([")"])
    SPACER = set([","])

    def recursive_solution(input_str, index):
        # base case A: when index exceeds or equals len(input_str)
        if index >= len(input_str):
            return [], index
        char = input_str[index]
        # base case B: when we reach a closed parenthesis stop this level of recursive depth
        if char in CLOSE_PARENTHESIS:
            return [], index
        # do the next recursion, return its value and the index it stops at
        recur_val, recur_stop_i = recursive_solution(input_str, index + 1)
        # with an open parenthesis, we want to continue the recursion after its associated
        # closed parenthesis, and the recur_val should be nested one level deeper in the list
        if char in OPEN_PARENTHESIS:
            continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i + 1)
            return [recur_val] + continued_recur_val, continued_recur_stop_i
        # for spacers eg "," we just ignore it
        if char in SPACER:
            return recur_val, recur_stop_i
        # and finally with normal characters, we just prepend them
        return [char] + recur_val, recur_stop_i

You can get the expected answer using the following code, but the result is still a string and not a list. Note that the input is wrapped in one extra pair of parentheses so that the outermost level also becomes a list.
    import re
    a='(f(g,h(a,b),a,b(g,h)))'
    ans=[]
    sub=''
    def rec(i,sub):
        if i>=len(a):
            return sub
        if a[i]=='(':
            if i==0:
                sub=rec(i+1,sub+'[')
            else:
                sub=rec(i+1,sub+',[')
        elif a[i]==')':
            sub=rec(i+1,sub+']')
        else:
            sub=rec(i+1,sub+a[i])
        return sub
    b=rec(0,'')
    print(b)
    b=re.sub(r"([a-z]+)", r"'\1'", b)
    print(b,type(b))
Output
    [f,[g,h,[a,b],a,b,[g,h]]]
    ['f',['g','h',['a','b'],'a','b',['g','h']]] <class 'str'>
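To go from such a string to an actual list, one option is `ast.literal_eval` from the standard library. The sketch below is a compact variant of the same string-rewriting idea, not the code above; `to_nested_list` is a hypothetical helper name, and it assumes balanced parentheses and lowercase names only.

```python
import ast
import re


def to_nested_list(expr):
    # Hypothetical helper: assumes balanced input such as 'f(g,h(a,b),a,b(g,h))'.
    text = expr.replace('(', ',[').replace(')', ']')
    # Quote every run of lowercase letters so it becomes a string literal.
    text = re.sub(r"([a-z]+)", r"'\1'", text)
    # Wrap in brackets for the outermost list and let literal_eval build it.
    return ast.literal_eval('[' + text + ']')


print(to_nested_list('f(g,h(a,b),a,b(g,h))'))
```

`ast.literal_eval` only accepts Python literals, so it is a safer choice than `eval` for text that comes from user input.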

Related

First Unique Character

Given a string, find the first non-repeating character in it and return its index. If it doesn't exist, return -1. Input string already all lowercase.
Why does my code not work?
    str1 = input("give me a string: ")
    def unique(x):
        stack = []
        if x is None:
            return (-1)
        i = 0
        while i < len(x):
            stack = stack.append(x[i])
            if x[i] in stack:
                return(i)
            else:
                i += 1
    unique(str1)
    str1 = input("give me a string: ")
    def unique(x):
        for i in x:
            if x.count(i) == 1:
                return x.index(i)
        # only give up after every character has been checked
        return -1
    print(unique(str1))
This will work.
Explanation
Instead of using the list stack, use the count() method of the string. The function unique(x) returns the index of the first character whose count in the string is 1, or -1 if there is no such character.
You need to know what your code is doing to figure out why it doesn't work, so let's walk through it step by step.
You create an empty list stack for later use; that's fine.
if x is None is a strange way to check whether a string was given, and it doesn't work here because even an empty string "" is not None. The is operator checks whether both sides are the same object, while == checks whether their values are equal. So if x == "" is better, and if not x is better still for checking whether something is empty.
Using the variable i and a while loop to iterate over the string is fine.
append() changes the list in place and returns None, so stack = stack.append(x[i]) assigns None to stack.
The in stack test then raises TypeError because NoneType is not iterable. If we change that line to stack.append(x[i]), it runs, but since x[0] has just been appended to stack, if x[0] in stack is always True and the function returns 0.
That's what your code is doing, you just append the first character and return the first index. You need to go through the whole string to know if a character is unique.
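The point about append() is easy to check in isolation. A tiny sketch (the variable names are only for illustration):

```python
stack = []
result = stack.append("a")  # append mutates the list in place

print(result)  # None, because append has no useful return value
print(stack)   # ['a']
```

So `stack = stack.append(...)` throws the list away and leaves `stack` bound to None.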
Although Rishabh's answer is cleaner, here is a way to do it using lists to store the seen and repeated characters, then reading the string again to find the index of the first unique character.
    x = input("give me a string: ")
    def unique(x):
        seen = []
        repeated = []
        for char in x:
            if char in seen:
                repeated.append(char)
            else:
                seen.append(char)
        for idx, char in enumerate(x):
            if char not in repeated:
                return idx
        return -1
    print(unique(x))

Why isn't my return command being obeyed?

I'm trying to write a function to return the longest common prefix from a series of strings. Using a debugger, I saw that my function reaches the longest common prefix correctly, but then when it reaches the return statement, it begins reverting to earlier stages of the algorithm.
For test case strs = ["flower","flow","flight"]
The output variable holds the following values:-
f > fl > f
instead of returning fl.
Any help would be appreciated, because I don't really know how to Google for this one. Thank you.
    class Solution(object):
        def longestCommonPrefix(self, strs, output = ''):
            #return true if all chars in string are the same
            def same(s):
                return s == len(s) * s[0]
            #return new list of strings with first char removed from each string
            def slicer(list_, list_2 = []):
                for string in list_:
                    string1 = string[1:]
                    list_2.append(string1)
                return list_2
            #return string containing first char from each string
            def puller(list_):
                s = ''
                for string in list_:
                    s += string[0]
                return s
            #pull first character from each string
            s = puller(strs)
            #if they are the same
            #add one char to output
            #run again on sliced list
            if same(s):
                output += s[0]
                self.longestCommonPrefix(slicer(strs), output)
            return output
This can be handled with os.path.commonprefix.
>>> import os
>>> strs = ["flower","flow","flight"]
>>> os.path.commonprefix(strs)
'fl'
It doesn't "revert". longestCommonPrefix potentially calls itself - what you're seeing is simply the call-stack unwinding, and flow of execution is returning to the calling code (the line that invoked the call to longestCommonPrefix from which you are returning).
That being said, there's really no need to implement a recursive solution in the first place. I would suggest something like:
    def get_common_prefix(strings):
        def get_next_prefix_char():
            for chars in zip(*strings):
                if len(set(chars)) != 1:
                    break
                yield chars[0]
        return "".join(get_next_prefix_char())

    print(get_common_prefix(["hello", "hey"]))
You are looking at the behavior...the final result...of recursive calls to your method. However, the recursive calls don't do anything to affect the result of the initial execution of the method. If we look at the few lines that matter at the end of your method:
    if same(s):
        output += s[0]
        self.longestCommonPrefix(slicer(strs), output)
    return output
The problem here is that since output is immutable, its value won't be changed by calling longestCommonPrefix recursively. So from the standpoint of the outermost call to longestCommonPrefix, the result it will return is determined only by if same(s) is true or false. If it is true it will return s[0], otherwise it will return ''.
The easiest way to fix this behavior and have your recursive call affect the result of the prior call to the method would be to have its return value become the value of output, like this:
    if same(s):
        output += s[0]
        output = self.longestCommonPrefix(slicer(strs), output)
    return output
This is a common code pattern when using recursion. Just this change does seem to give you the result you expect! I haven't analyzed your whole algorithm, so I don't know if it becomes "correct" with just this change.
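A minimal sketch of that pattern, assuming a standalone function named `common_prefix` (a hypothetical rewrite, not the original class method): each call returns the result of the call below it, so the final prefix travels back up the call stack. The check for an exhausted string is an added guard that the original code does not have.

```python
def common_prefix(strs, output=""):
    # Stop when there is nothing to compare or any string is exhausted.
    if not strs or any(not s for s in strs):
        return output
    # Stop when the first characters differ.
    if len({s[0] for s in strs}) != 1:
        return output
    # Return the recursive call's result so it propagates to the caller.
    return common_prefix([s[1:] for s in strs], output + strs[0][0])


print(common_prefix(["flower", "flow", "flight"]))
```

If the last line called `common_prefix(...)` without `return`, the deeper result would be computed and then discarded, which is the behaviour described in the question.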
Can you try this?
    class Solution(object):
        def longestCommonPrefix(self, strs, output = ''):
            #return true if all chars in string are the same
            def same(s):
                return s == len(s) * s[0]
            #return new list of strings with first char removed from each string
            def slicer(list_, list_2 = []):
                for string in list_:
                    string1 = string[1:]
                    list_2.append(string1)
                return list_2
            #return string containing first char from each string
            def puller(list_):
                s = ''
                for string in list_:
                    s += string[0]
                return s
            #pull first character from each string
            s = puller(strs)
            # Can you try this revision?
            # I think the problem is that your new version of output is being lost when the fourth called function returns to the third and the third returns to the second, etc...
            # You need to calculate a new output value before you call recursively, that is true, but you also need a way to 'store' that output when that recursively called function 'returns'. Right now it disappears, I believe.
            if same(s):
                output += s[0]
                output = self.longestCommonPrefix(slicer(strs), output)
            return output

how to use python to find the first not repeating character?

I am solving a problem: Given a string s consisting of small English letters, find and return the first instance of a non-repeating character in it. If there is no such character, return '_'.
For example: s = "abacabad", the output should be firstNotRepeatingCharacter(s) = 'c'.
I wrote some simple code that passes all the tests, but when I submit it, it reports an error. Does anyone know what's wrong with my code? Thank you!
    def firstNotRepeatingCharacter(s):
        for i in list(s):
            if list(s).count(i) == 1:
                return i
        return '_'
This could be a performance issue: the repeated count calls (and the unnecessary list conversions) make this approach quadratic. You should aim for a linear solution:
    from collections import Counter

    def firstNotRepeatingCharacter(s):
        c = Counter(s)
        for i in s:
            if c[i] == 1:
                return i
        return '_'
You can also use next with a generator and a default value:
    from collections import Counter

    def firstNotRepeatingCharacter(s):
        c = Counter(s)
        return next((i for i in s if c[i] == 1), '_')
If you can only use built-ins, just make your own counter (or any other data structure that allows you to identify duplicates)
    def firstNotRepeatingCharacter(s):
        c = {}
        for i in s:
            c[i] = c.get(i, 0) + 1
        return next((i for i in s if c[i] == 1), '_')
The task at hand is to find first non repeating character from a given string e.g. s = 'aabsacbhsba'
    def solution(s):
        # This will take each character from the given string s one at a time
        for i in s:
            if s.index(i) == s.rindex(i):  # rindex() returns last index of i in s
                return i
        return '_'
Here s.rindex(i) finds the last occurrence of the character i in s, and we compare it with the first occurrence, s.index(i). If both return the same index, the character occurs only once, so it is the first non-repeating character.
You can find the definition and usage of rindex() at: W3School rindex()

How to delete repeating letters in a string?

I am trying to write a function which will return me the string of unique characters present in the passed string. Here's my code:
    def repeating_letters(given_string):
        counts = {}
        for char in given_string:
            if char in counts:
                return char
            else:
                counts[char] = 1
            if counts[char] > 1:
                del(char)
            else:
                return char
I am not getting the expected results with it. How can I get the desired result?
Here when I am passing this string as input:
sample_input = "abcadb"
I am expecting the result to be:
"abcd"
However my code is returning me just:
nothing
    def repeating_letters(given_string):
        seen = set()
        ret = []
        for c in given_string:
            if c not in seen:
                ret.append(c)
                seen.add(c)
        return ''.join(ret)
Here we add each letter to the set seen the first time we see it, at the same time adding it to a list ret. Then we return the joined list.
If the order of the characters in the result matters, here is a one-liner that uses set with sorted:
>>> my_str = 'abcadbgeg'
>>> ''.join(sorted(set(my_str),key=my_str.index))
'abcdge'
Here sorted will sort the characters in the set based on the first index of each in the original string, resulting in ordered list of characters.
However if the order in the resultant string doesn't matter, then you may simply do:
>>> ''.join(set(my_str))
'acbedg'
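For the order-preserving case, another option is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order (guaranteed for dicts since Python 3.7). This is a sketch of an alternative, not the method from the answers above; `dedupe` is a hypothetical name.

```python
def dedupe(text):
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this keeps the first occurrence of every character.
    return ''.join(dict.fromkeys(text))


print(dedupe('abcadbgeg'))
```

Unlike `sorted(set(my_str), key=my_str.index)`, this does not call `index` for each distinct character, so it stays linear in the length of the string.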

'NoneType' object is not iterable - looping w/returned value

The purpose of this code is to find the longest string in alphabetical order that occurs first and return that subset.
I can execute the code once, but when I try to loop it I get 'NoneType' object is not iterable (points to last line). I have made sure that what I return and input are all not of NoneType, so I feel like I'm missing a fundamental.
This is my first project in the class, so the code doesn't need to be the "best" or most efficient way - it's just about learning the basics at this point.
    s = 'efghiabcdefg'
    best = ''
    comp = ''
    temp = ''
    def prog(comp, temp, best, s):
        for char in s:
            if comp <= char: #Begins COMParison of first CHARacter to <null>
                comp = char #If the following character is larger (alphabetical), stores that as the next value to compare to.
                temp = temp + comp #Creates a TEMPorary string of characters in alpha order.
                if len(temp) > len(best): #Accepts first string as longest string, then compares subsequent strings to the "best" length string, replacing if longer.
                    best = temp
                if len(best) == len(s): #This is the code that was added...
                    return(s, best) #...to fix the problem.
            else:
                s = s.lstrip(temp) #Removes those characters considered in this pass
                return (str(s), str(best)) #Provides new input for subsequent passes

    while len(s) != 0:
        (s, best) = prog(comp, temp, best, s)
prog is returning None. The error occurs when you try to unpack that result into the tuple (s, best).
You need to fix your logic so that prog is guaranteed to not return None. It will return None if your code never executes the else clause in the loop.
You don't return in all cases. In Python, if a function ends without an explicit return statement, it will return None.
Consider returning something if, for example, the input string is empty.
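A small sketch of the same situation, using a made-up function `find_first_even` rather than the code from the question: the loop can finish without hitting a return, so the function returns None and the tuple unpacking fails.

```python
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            return n, "even"
    # No return here: falling off the end of the function returns None.


result = find_first_even([1, 3, 5])
print(result)  # None

try:
    value, label = result  # unpacking None raises TypeError
except TypeError as exc:
    print("unpacking failed:", exc)
```

Adding an explicit return after the loop, for example `return None, "none found"`, keeps the unpacking at the call site valid in every case.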
