Creating a list given an equation with no spaces - python

I want to create a list given a string such as 'b123+xyz=1+z1$' so that the list equals ['b123', '+', 'xyz', '=', '1', '+', 'z1', '$']
Without spaces or a single repeating pattern, I do not know how to split the string into a list.
I tried using if statements in a for loop to append to the list whenever the loop reaches a character that is neither a digit nor a letter (checked with isdigit and isalpha), but I could not differentiate between variables and digits.

You can use a regular expression to split your string. This works by using positive lookaheads and lookbehinds for non-word characters. Note that re.split only splits on zero-width matches like these in Python 3.7 and later.
import re
sample = "b123+xyz=1+z1$"
split_sample = re.split(r"(?=\W)|(?:(?<=\W)(?!$))", sample)
print(split_sample)
OUTPUT
['b123', '+', 'xyz', '=', '1', '+', 'z1', '$']

Another regex approach giving the same result is:
split_sample = re.split(r"(\+|=|\$)", sample)[:-1]
The [:-1] is to remove the final empty string.

"""
Given the equation b123+xyz=1+z1$, break it down
into a list of variables and operators
"""
operators = ['+', '-', '/', '*', '=']
equation = 'b123+xyz=1+z1$'
equation_by_variable_and_operator = []
text = ''
for character in equation:
    if character not in operators:
        text = text + character
    elif character in operators and len(text):
        equation_by_variable_and_operator.append(text)
        equation_by_variable_and_operator.append(character)
        text = ''
# For the final variable
equation_by_variable_and_operator.append(text)
print(equation_by_variable_and_operator)
Output ('$' is not in operators, so it stays attached to the final variable)
['b123', '+', 'xyz', '=', '1', '+', 'z1$']

A straightforward regex solution is:
import re
equation = "b123+xyz=1+z1$"
equation_list = re.findall(r'\W+|\w+', equation)
print(equation_list)
This would also work with strings such as -b**10.
Using re.split() returns empty strings at the start and end of the result when the string itself starts or ends with a delimiter (see this question). These can be filtered out, or look-behind and look-ahead conditions can be used instead, at the cost of a more complex pattern, as earlier answers to this question demonstrate.
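To illustrate the filtering option mentioned above, here is a minimal sketch. The function name tokenize is made up for this example, and the pattern assumes every non-word character should become its own token (so `**` comes back as two `*` items, unlike the findall pattern).

```python
import re

def tokenize(expression):
    # The capturing group makes re.split keep each delimiter in the result
    parts = re.split(r'(\W)', expression)
    # Drop the empty strings produced by leading, trailing or adjacent delimiters
    return [part for part in parts if part]

print(tokenize('b123+xyz=1+z1$'))  # ['b123', '+', 'xyz', '=', '1', '+', 'z1', '$']
print(tokenize('-b**10'))          # ['-', 'b', '*', '*', '10']
```

Whether `**` should be one token or two depends on the grammar you have in mind. Use `\W+` inside the group if runs of symbols belong together.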

Well, my answer may not be the simplest of them all, but I hope it helps you.
data: str = "b123+xyz=1+z1$"
symbols: str = "+=$"
merge_text: str = ""
for char in data:
    if char not in symbols:
        merge_text += char
    else:
        # insert a unique character for splitting
        merge_text += ","
        merge_text += char
        merge_text += ","
# drop the empty strings left where a symbol starts or ends the string
final_result: list = [part for part in merge_text.split(",") if part]

Related

How to replace given index in String with Dictionary value in python?

The instructions are to replace certain characters within a string to the corresponding value in the dictionary.
Here is my code:
word = input()
password = ''
wordDict = {
    'i': '!',
    'a': '#',
    'm': 'M',
    'B': '8',
    'o': '.',
}
for i in range(len(word)):
    if word[i] in wordDict.keys():
        word.replace(word[i], wordDict.get(word[i]))
        i += 1
    else:
        i += 1
print(word)
The problem with my code is that nothing about the given password is changing nor does it seem to be iterating through the for loop.
Your problem is with this line:
word.replace(word[i], wordDict.get(word[i]))
Strings in Python, as well as many other languages, are immutable, meaning you can't edit the string.
The function you're calling (str.replace) doesn't replace the character in the string, it returns a new str with the character replaced.
The easiest fix, though a naive one if efficiency matters, is to replace it with this line:
word = word.replace(word[i], wordDict.get(word[i]))
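As a possible alternative to replacing inside the loop, the built-in str.maketrans and str.translate can apply every substitution in one pass. This is only a sketch: the function name make_password and the sample input are made up for illustration, and it assumes every dictionary key is a single character.

```python
word_dict = {'i': '!', 'a': '#', 'm': 'M', 'B': '8', 'o': '.'}

def make_password(word, mapping=word_dict):
    # str.maketrans accepts a dict whose keys are single characters
    table = str.maketrans(mapping)
    # translate returns a new string; the original is left unchanged
    return word.translate(table)

print(make_password("Bigmama"))  # 8!gM#M#
```

Because each character is looked up once in the original string, a replacement can never be replaced again by a later rule.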

How to replace list of chars from the string [duplicate]

I have a search string. If a character in the search string matches one of the special characters, it should be removed (replaced with nothing).
sear = '!%'
special_characters = ['!', '"', '#', '$', '%','(',')']
for remove_char in special_characters:
    search_value = re.sub(remove_char, '', sear)
My code raised an error.
Expected output is None
sear = 'ABC!%DEF'
Expected is 'ABCDEF'
sear = 'ABC,DEF'
Expected is 'ABC,DEF'
Just do a list comprehension and ''.join:
sear = '!%'
special_characters = ['!', '"', '#', '$', '%']
sear = ''.join([i for i in sear if i not in special_characters])
print(sear)
This code iterates over the string character by character and keeps each character that is not in the special_characters list. The comprehension only gives us a list of single-character strings, so we need ''.join to turn it back into a string.
You can make a regex character class out of your special characters and then use a regex substitution to do the replacements. For longer strings or larger lists of special characters, you should find this runs 2-3x faster than the list comprehension solution.
import re
special_characters = ['!', '"', '#', '$', '%','(',')']
regex = re.compile('[' + ''.join(f'\\{c}' for c in special_characters) + ']')
sear = '!%'
search_value = regex.sub('', sear)
print(search_value)
sear = 'ABC!%DEF'
search_value = regex.sub('', sear)
print(search_value)
sear = 'ABC,DEF'
search_value = regex.sub('', sear)
print(search_value)
Output:
<blank line>
ABCDEF
ABC,DEF
Note I've prefixed all characters in the character class with \ so that you don't have to worry about using characters such as - and ] which have special meaning within a character class.
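A variation on the same idea, sketched here under the assumption that you would rather let the standard library decide which characters need a backslash: re.escape does that for each character. The function name strip_characters is made up for this example.

```python
import re

def strip_characters(text, unwanted):
    if not unwanted:
        # An empty character class "[]" is not a valid pattern
        return text
    # re.escape adds a backslash only where the character needs one
    char_class = '[' + ''.join(re.escape(c) for c in unwanted) + ']'
    return re.sub(char_class, '', text)

special_characters = ['!', '"', '#', '$', '%', '(', ')']
print(strip_characters('ABC!%DEF', special_characters))  # ABCDEF
print(strip_characters('ABC,DEF', special_characters))   # ABC,DEF
```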

How to use split strings with closed brackets as the separator

If I have a messy string like '[Carrots] [Broccoli] (cucumber)-(tomato) irrelevant [spinach]' and I want to split it into a list so that each part within any bracket is an item like ['Carrots', 'Broccoli', 'cucumber', 'tomato', 'spinach'] How would I do this? I can't figure out a way to make the .split() method work.
You can use regex
import re
s = '[Carrots] [Broccoli] (cucumber)-(tomato) irrelevant [spinach]'
lst = [x[0] or x[1] for x in re.findall(r'\[(.*?)\]|\((.*?)\)', s)]
print(lst)
Output
['Carrots', 'Broccoli', 'cucumber', 'tomato', 'spinach']
Explanation
Regex pattern to match
r'\[(.*?)\]|\((.*?)\)'
Subpattern 1: To match items in square brackets, i.e. [...]
\[(.*?)\]   # Use \[ and \] since [ and ] are special characters
            # that must be escaped to be matched literally
(.*?)       # A lazy match of any characters
Subpattern 2: To match items in parentheses, i.e. (...)
\((.*?)\)   # Use \( and \) since ( and ) are special characters
            # that must be escaped to be matched literally
Since we are looking for either of the two patterns, we use:
'|'         # which is "or" between the two subpatterns,
            # to match Subpattern 1 or Subpattern 2
The expression
re.findall(r'\[(.*?)\]|\((.*?)\)', s)
returns
[('Carrots', ''), ('Broccoli', ''), ('', 'cucumber'), ('', 'tomato'), ('spinach', '')]
The result is in the first or second element of each tuple. So we use:
[x[0] or x[1] for x in re.findall(r'\[(.*?)\]|\((.*?)\)', s)]
to extract the non-empty element from each tuple and place it into a list.
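The same pattern can also be read through match objects instead of tuples. The following is a sketch, not part of the answer above: the function name bracketed_items is made up, and it relies on Match.lastindex, which holds the number of the group that took part in the match.

```python
import re

PATTERN = re.compile(r'\[(.*?)\]|\((.*?)\)')

def bracketed_items(text):
    # Exactly one of the two groups participates in each match,
    # and lastindex tells us which one it was
    return [m.group(m.lastindex) for m in PATTERN.finditer(text)]

s = '[Carrots] [Broccoli] (cucumber)-(tomato) irrelevant [spinach]'
print(bracketed_items(s))  # ['Carrots', 'Broccoli', 'cucumber', 'tomato', 'spinach']
```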
Without any error handling whatsoever (like checking for nested or unbalanced brackets):
def parse(expr):
    opening = "(["
    closing = ")]"
    result = []
    current_item = ""
    for char in expr:
        if char in opening:
            current_item = ""
            continue
        if char in closing:
            result.append(current_item)
            continue
        current_item += char
    return result
print(parse("(a)(b) stuff (c) [d] more stuff - (xxx)."))
Output:
['a', 'b', 'c', 'd', 'xxx']
Depending on your needs, this might already be good enough...
Assuming no other brackets or operators (e.g. '-') than the ones present in your example string are used, try
s = '[Carrots] [Broccoli] (cucumber)-(tomato) irrelevant [spinach]'
words = []
for elem in s.replace('-', ' ').split():
    if '[' in elem or '(' in elem:
        words.append(elem.strip('[]()'))
Or with a list comprehension:
words = [elem.strip('[]()') for elem in s.replace('-', ' ').split() if '[' in elem or '(' in elem]

How to modify existing Regex expression to ignore words in brackets

I have the following code:
listnew = ['E-Textbooks','Dynamic', 'Case', 'Management', '(', 'DCM', ')']
nounbreak = list(itertools.chain(*[re.findall(r"\b\w+\b(?![\(\w+\)])", i) for i in listnew]))
While the above code successfully removes '-' and even '/', it somehow is not able to ignore the words in the brackets.
The ideal output required is
['E', 'Textbooks','Dynamic', 'Case', 'Management']
How do I tweak the above regex itself to produce the desired output?
Your problem is that your regex looks at each list element separately: it cannot "see" that there are "(" and ")" elements before or after the element it is currently looking at.
I propose cleaning your list beforehand:
import re
from itertools import chain
listnew = ['E-Textbooks','Dynamic', 'Case', 'Management', '(', 'DCM', ')']
# collect indexes of elements that are ( or ) or things between them
# does not work for ((())) - you might need to do something more elaborate
# if that can happen
remove = set()
inside = False  # True while we are between a "(" and its ")"
for i, k in enumerate(listnew):
    if k == "(":
        inside = True
        remove.add(i)
    elif k == ")":
        inside = False
        remove.add(i)
    elif inside:
        remove.add(i)
data = [k for i, k in enumerate(listnew) if i not in remove]
# did not touch your regex per se - you might want to simplify it using regex101.com
nounbreak = list(chain(*[re.findall(r"\b\w+\b(?![\(\w+\)])", i) for i in data]))
print(nounbreak)
Output:
['E', 'Textbooks', 'Dynamic', 'Case', 'Management']
If you only have short lists, you could also ' '.join(..) them and strip everything inside parentheses from the resulting string. See, for example, "Regular expression to return text between parenthesis" for how to find that text and remove it from the string.
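For the nested case that the code comments above flag as unsupported, one possible approach is to count nesting depth instead of collecting indexes. This is a hedged sketch rather than part of the answer: the function name drop_parenthesized is made up, and it assumes each parenthesis is its own list element, as in the question.

```python
def drop_parenthesized(tokens):
    kept = []
    depth = 0  # how many "(" are currently open
    for token in tokens:
        if token == "(":
            depth += 1
        elif token == ")":
            depth = max(depth - 1, 0)  # ignore a stray ")"
        elif depth == 0:
            kept.append(token)
    return kept

listnew = ['E-Textbooks', 'Dynamic', 'Case', 'Management', '(', 'DCM', ')']
print(drop_parenthesized(listnew))  # ['E-Textbooks', 'Dynamic', 'Case', 'Management']
```

The cleaned list can then be passed to the same re.findall call to split 'E-Textbooks' into its words.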
This is a sparse solution just demonstrating the regex.
It basically joins the list on a non-word character (a comma in this case), then
runs a regex on the result using findall.
The parenthesized elements come back as empty strings, which can be filtered out
with a list comprehension.
The regex :
\( .*? \)
| \b
( \w+ ) # (1)
\b
Python code :
>>> import re
>>> list_orig = ['E-Textbooks','Dynamic', 'Case', 'Management', '(', 'DCM', ')']
>>> joined = ','.join( list_orig )  # avoid shadowing the built-in str
>>> list_new = re.findall( r"\(.*?\)|\b(\w+)\b", joined )
>>> list_new = [i for i in list_new if i]
>>> print( list_new )
['E', 'Textbooks', 'Dynamic', 'Case', 'Management']

Loop Through Pattern Of Characters and Print Out Corresponding Text

Hi, I'm working on a sort of Morse code problem where a user inputs a pattern in a format like this:
pattern = ['.', '.', '_', '.', '_', '.']
and the code would print out the resulting word for each like so:
"dot-dot-dash-dot-dash-dot"
I've tried this:
def ss(pattern):
    dotdash = ""
    for s in pattern:
        if s == ".":
            dotdash+=("dot")
        elif s == "_":
            dotdash+=("dash")
    x = "-".join(dotdash)
    print(x)
ss(['.', '.', '_', '.', '_', '.'])
but that's just giving me output like this:
d-o-t-d-o-t-d-a-s-h-d-o-t-d-a-s-h-d-o-t
I'm looking for a way to separate those dots and dashes with a hyphen, and I'm a bit stumped. I was thinking of splitting on the words, but I'm unsure how to accomplish that. Any help is hugely appreciated.
You could add the hyphens first with join, which gives you a string, and then apply two replacements to get the final string:
def ss(pattern):
    return '-'.join(pattern).replace('.', 'dot').replace('_', 'dash')
print(ss(['.', '.', '_', '.', '_', '.']))  # dot-dot-dash-dot-dash-dot
What you are passing to join() is a string, so it iterates over every letter in the string; that's why you are getting this output. What you really want to do is use a list, so that join() iterates over every word in the list:
...
    dotdash = []
    for s in pattern:
        if s == ".":
            dotdash.append("dot")
        elif s == "_":
            dotdash.append("dash")
    x = "-".join(dotdash)
...
Well, I'll just go ahead and respond to my own question. Whoops. Easy solution here: I just changed dotdash to a list, and that took care of all my problems.
Change ("dot") to ("dot",) with a trailing comma, and do the same for ("dash").
The trailing comma makes it a one-element tuple instead of a parenthesized string.
Edit: Also, change dotdash to be initialised as a list, dotdash = [], so that += adds each word as a single item.
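Another way to express the same fix is a lookup table combined with join. This is a minimal sketch: the names SYMBOL_NAMES and describe are made up for illustration, and unlike the if/elif versions above, it raises a KeyError on an unknown symbol instead of skipping it.

```python
SYMBOL_NAMES = {".": "dot", "_": "dash"}

def describe(pattern):
    # join receives whole words, so hyphens go between words, not letters
    return "-".join(SYMBOL_NAMES[s] for s in pattern)

print(describe(['.', '.', '_', '.', '_', '.']))  # dot-dot-dash-dot-dash-dot
```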
