Let's say I have a string a = 31 4 5 + + and a string b = 31 4 5 ++. I need to check that all numbers and operators in the string are delimited by at least one whitespace character. Therefore, string a is correct and string b is incorrect. c = 31 4 5+ + is also incorrect. Is there a way to check for this? I could not come up with anything reasonable.
You can check it with the following steps:
Break the string into a list using .split()
Check that every item in the list whose length is more than 1 is numeric.
Code snippet:
def is_valid_string(st, delimiter=' '):
    lst = st.split(delimiter)  # split the string into tokens on the delimiter
    for item in lst:
        # a token longer than one character must be a number
        if len(item) > 1 and not item.isdigit():
            return False
    return True
If you also need to accept floating-point numbers, you can use item.replace('.', '', 1).isdigit() instead.
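As a rough sketch, the float check can be folded into the validation so that one-character tokens are checked as well. The names OPERATORS, is_number and is_valid_expression are made up for this illustration, and the operator set is an assumption rather than something given in the question:

```python
OPERATORS = {"+", "-", "*", "/"}  # assumed operator set, not taken from the question

def is_number(token):
    # allow at most one decimal point, then require digits only
    return token.replace(".", "", 1).isdigit()

def is_valid_expression(st):
    # split() with no argument also collapses runs of whitespace
    tokens = st.split()
    return bool(tokens) and all(is_number(t) or t in OPERATORS for t in tokens)

print(is_valid_expression("31 4 5 + +"))  # True
print(is_valid_expression("31 4 5 ++"))   # False
print(is_valid_expression("31 4 5+ +"))   # False
```

Unlike the length-based check, this version rejects any token that is neither a number nor a known operator, which may or may not be what you want.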
The first thing to do would be splitting the string on whitespace into "words", so something like words = a.split() (split's delimiter is whitespace by default, so no arguments are needed).
I'm guessing you're only going to use integers or floats and a set of operators like addition, subtraction, multiplication and division. One thing you could do is check whether each word can be cast to a number with int or float and, if it can't, check whether the word is in your set of operators, so something like:
a = "31 4 5 + +"
operators = ["+", "-", "*", "/"]
# Every string is valid by default
valid = True
words = a.split()  # ["31", "4", "5", "+", "+"]
for word in words:
    # try to cast word into a number
    try:
        float(word)
    except ValueError:
        # if you can't, check if it's an operator
        if word not in operators:
            valid = False  # if it's not, the string isn't valid
if valid:
    print("String is valid")
else:
    print("String is not valid")
More complex stuff like equations and variables is obviously more difficult to code.
EDIT: Python's isdigit() checks whether a string consists only of digits, and it is simpler than a try block that casts the string. However, it returns False for floats, so they would be reported as invalid (you could still remove the decimal point before calling it).
Try the regex ^((\d+|[+*/-])(\s+|$))+$. It matches one or more items, each of which is either a number (\d+) or an operator ([+*/-]), followed by either one or more spaces (\s+) or the end of the string ($). The ^ at the beginning and the $ at the end force the regex to match the whole string. Example:
>>> import re
>>> a = '31 4 5 + +'
>>> b = '31 4 5 ++'
>>> c = '31 4 5+ +'
>>> print(re.match(r'^((\d+|[+*/-])(\s+|$))+$', a))
<re.Match object; span=(0, 10), match='31 4 5 + +'>
>>> print(re.match(r'^((\d+|[+*/-])(\s+|$))+$', b))
None
>>> print(re.match(r'^((\d+|[+*/-])(\s+|$))+$', c))
None
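If decimal numbers should be accepted too, one possible variation is to build the pattern from named pieces and use re.fullmatch. This is only a sketch: NUMBER, OPERATOR, TOKEN, PATTERN and is_delimited are names invented here, and the decision to allow trailing whitespace is an assumption.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"   # integer or decimal such as 31 or 3.5
OPERATOR = r"[+*/-]"
TOKEN = rf"(?:{NUMBER}|{OPERATOR})"
# one token, then any number of whitespace-separated tokens, then optional trailing whitespace
PATTERN = re.compile(rf"{TOKEN}(?:\s+{TOKEN})*\s*")

def is_delimited(text):
    # fullmatch anchors the pattern at both ends of the string
    return PATTERN.fullmatch(text) is not None

print(is_delimited("31 4 5 + +"))  # True
print(is_delimited("31 4 5 ++"))   # False
print(is_delimited("3.5 2 *"))     # True
```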
Related
As part of a bigger Python homework problem, I'm trying to check if a string input contains a positive/negative integer or float, and returns True if it does, False if it does not.
The homework problem will assume that the user enters in a very simple math expression, separated by whitespace (e.g. 8 / ( 2 + 2 )), after which, the expression will be broken down into separate characters in a list (e.g. ["8", "/", "(", "2", "+", "2", ")"]). Then, I will run this list through a 'for' loop that increases the number of a counter variable, based on the current token's type (e.g. if currentToken is "2", then operandCount += 1).
So far, I'm having difficulty trying to accomplish this with a currentToken that contains a positive/negative float or int-type number (e.g. "8", "-9", "4.6"). The main challenge here is that my professor does not allow us to use try-except anywhere in our code, so that rules out a lot of the solutions I've found online so far as many of them involve catching an error message.
The closest I've come to solving this was using the 'isnumeric()' and 'isdigit()' methods.
# Assume that the user will enter in a simple math expression, separated by whitespace.
expression = input("Enter an expression: ")
expr = expression.split()
operandCount = 0
for currentToken in expr:
    if currentToken.isnumeric() == True:
        operandCount += 1
# operandCount should increase when the 'for' loop encounters
# a positive/negative integer/float.
However, it only returns True for strings that contain purely positive integers. I need it to be able to return True for integers and floats, both positive and negative. Thanks in advance for any advice!
Apply a regular expression that matches a numeric string which can optionally:
be a negative number: -1
have comma separation: 120,000
have a decimal point: 0.34, .1415926
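import re
# expr is the token list from the question (expression.split())
number_re = re.compile(r"^-?(?:(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d*)?|\.\d+)$")
for currentToken in expr:
    m = number_re.match(currentToken)
    if m:
        print("number:", m.group(0))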
As skymon pointed out, using a regex is a good idea. Here is an example:
import re

def isAdigit(text, pat=r'^-*\d+\.*\d*[ \t]*$'):
    # True when the whole token looks like a (possibly negative) number
    return re.search(pat, text) is not None

expression = input("Enter an expression: ")
expr = expression.split()
operandCount = 0
for currentToken in expr:
    if isAdigit(currentToken):
        operandCount += 1
print(operandCount)
Example usage:
input: 1 + 2 / ( 69 * -420 ) - -3.14 => output: 5
I also recommend you learn more about regex.
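Note that the default pattern in isAdigit also accepts tokens such as --5 or 5... because of the -* and \.* parts. A somewhat stricter sketch, still without try/except, could look like the following; NUMBER_RE, is_number and count_operands are names chosen for this example only.

```python
import re

# optional sign, then digits with an optional fraction, or a bare fraction like .5
NUMBER_RE = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")

def is_number(token):
    # fullmatch requires the whole token to fit the pattern
    return NUMBER_RE.fullmatch(token) is not None

def count_operands(expression):
    return sum(1 for token in expression.split() if is_number(token))

print(count_operands("1 + 2 / ( 69 * -420 ) - -3.14"))  # 5
```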
How can I match exactly two identical characters in a string like '4003' or '1030'?
import re
s = '1030'
if re.search('0{2}', s):
    print(True)
But the above code matches only '1002' but not '1030'.
Assume you don't have to use regex:
Note that a string with 4 characters has exactly one pair of duplicate characters if and only if it has 3 unique characters. So:
Make a set of its characters
Check if there are 3 distinct elements in the set.
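The two steps above can be sketched as follows. The function names are hypothetical, and the second function is an extra generalisation (using collections.Counter) for strings that are not exactly 4 characters long:

```python
from collections import Counter

def has_one_pair_len4(s):
    # 4 characters with 3 distinct values means exactly one value occurs twice
    return len(s) == 4 and len(set(s)) == 3

def has_exactly_one_pair(s):
    counts = list(Counter(s).values())
    # one character occurs twice, every other character occurs once
    return counts.count(2) == 1 and all(c in (1, 2) for c in counts)

print(has_one_pair_len4("4003"))  # True
print(has_one_pair_len4("1030"))  # True
print(has_one_pair_len4("1000"))  # False
```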
Do you HAVE to use regex? Just use .count()
>>> '1002'.count('0')
2
>>> '1030'.count('0')
2
>>> '2002200220'.count('20')
3
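If a regex is required after all, the reason 0{2} fails on '1030' is that it only matches two adjacent zeros. A backreference can find a character that repeats anywhere later in the string. This is a minimal sketch with made-up names (REPEATED, has_repeated_char); it detects "at least one repeat", not "exactly two", so it may need to be combined with .count() for the stricter check.

```python
import re

# (.) captures one character, .* skips anything, \1 requires the same character again
REPEATED = re.compile(r"(.).*\1")

def has_repeated_char(s):
    return REPEATED.search(s) is not None

print(has_repeated_char("1002"))  # True
print(has_repeated_char("1030"))  # True
print(has_repeated_char("1234"))  # False
```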
This code snippet just checks whether the character at a given index of the string number1 (for example index 3) is equal to the character at the same index of the string number2.
number1 = '1002'
number2 = '1030'
counter = 0
for i in number1:
    # compare with ==, not is: is tests object identity, not equality
    if number1[counter] == number2[counter]:
        print("It's a match")
    counter = counter + 1
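The same position-by-position comparison can also be written with zip and enumerate, which avoids the manual counter and stops at the shorter string. The helper name matching_positions is an invention for this sketch:

```python
def matching_positions(first, second):
    # zip pairs up characters at the same index; enumerate supplies that index
    return [i for i, (a, b) in enumerate(zip(first, second)) if a == b]

print(matching_positions("1002", "1030"))  # [0, 1]
```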
My strings are something like that:
str1 = "3,5 of 5 stars"
str2 = "4 of 5 stars"
I want to extract the first number of each string, like this:
str1 = 3,5
str2 = 4
The problem is that the numbers come in two formats (int and float).
I hope you can help me. Thanks!
If there is a space before the "of", you can use (avoids regex):
>>> print([item.split()[0] for item in [str1, str2]])
['3,5', '4']
string = "3 o 4 k 5"
for char in string:
    entry = ""
    try:
        entry = int(char)
    except ValueError:
        # not a digit, move on to the next character
        continue
    if entry != "":
        print(entry)
        break
Here's the explanation. The variable string holds the text. When the for loop begins, char is set to the first character of the string, and the loop tries to convert it to an integer. If that succeeds, the character is a digit. It is the first one found, so it is printed and the loop stops.
If the conversion fails, int() raises an error (hence the except part), and because of the try/except the loop moves on to the next character immediately. The for loop continues until a digit has been found or the string runs out of characters.
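For comparison, the same "first digit wins" idea can be sketched without try/except by filtering with str.isdecimal and taking the first hit with next(). The function name first_digit is hypothetical, and like the loop above it only looks at single characters, so it will not pick up multi-digit numbers or values such as 3,5.

```python
def first_digit(text):
    # next() returns the first item of the generator, or None if there is none
    return next((int(ch) for ch in text if ch.isdecimal()), None)

print(first_digit("3 o 4 k 5"))  # 3
print(first_digit("no digits"))  # None
```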
If the pattern of strings is always "X of Y stars" you can do the following:
str1 = "3,5 of 5 stars"
str2 = "4 of 5 stars"
lst = [str1, str2]  # add further strings as needed
nums = [float(x.split(' of ')[0].replace(',','.')) for x in lst]
print(nums) # prints [3.5, 4.0]
To match numbers and floats (using the , delimiter) in a string you could use the re module:
>>> re.findall(r"[-+]?\d*\,\d+|\d+", "5,5 of 5 stars")
['5,5', '5']
>>> re.findall(r"[-+]?\d*\,\d+|\d+", "5,5 of 5 stars")[0]
'5,5'
>>> re.findall(r"[-+]?\d*\,\d+|\d+", "4 of 5 stars")[0]
'4'
I've used the regex from this StackOverflow answer (from @miku) but modified it to use a comma as the decimal separator instead of a period.
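Putting the regex and the conversion together, a small helper might look like this. It is a sketch under the assumption that the comma is always a decimal separator (never a thousands separator); FIRST_NUMBER and first_number are names made up for the example.

```python
import re

# optional sign, digits, and optionally a comma followed by more digits
FIRST_NUMBER = re.compile(r"[-+]?\d+(?:,\d+)?")

def first_number(text):
    m = FIRST_NUMBER.search(text)
    if m is None:
        return None  # no number anywhere in the text
    # swap the decimal comma for a period so float() can parse it
    return float(m.group(0).replace(",", "."))

print(first_number("3,5 of 5 stars"))  # 3.5
print(first_number("4 of 5 stars"))    # 4.0
```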
I guess your string format is - X of Y stars
You can extract X in this way.
>>> my_str = "3,5 of 5 stars"
>>> my_str.strip().split(' ')[0]
'3,5'
Let's say you want to convert 3,5 to a float to do some math on it. Then you first replace , with . and then wrap the result in float(...).
>>> float(my_str.strip().split(' ')[0].replace(',','.'))
3.5
My code is meant to identify the first non-repeating character in a string and to handle empty strings and fully repeating strings (e.g. abba or aa). It should also treat lower- and upper-case input as the same character while returning the non-repeating character in its original case.
def first_non_repeat(string):
    order = []
    counts = {}
    for x in string:
        if x in counts and x.islower() == True:
            counts[x] += 1
        else:
            counts[x] = 1
            order.append(x)
    for x in order:
        if counts[x] == 1:
            return x
    return ''
My logic on line 5 was that if I treated all letters as lowercase, the loop would go through the input without distinguishing by case. But as of now, the input 'sTreSS' gives 's' when I really need 'T'. If the last two S's were lowercase, the output would be 'T', but I need code flexible enough to handle input in any case.
When comparing two letters, call lower() on both so that the comparison ignores case. For example:
string = "aabcC"
count = 0
while count < len(string) - 1:
    if string[count].lower() == string[count + 1].lower():
        print("Characters " + string[count] + " and " + string[count + 1] + " are repeating.")
    count += 1
Here's a small change you can make to your code to make it work.
def first_non_repeat(string):
    counts = {}
    for x in string:
        char_to_look = x.lower()  # convert to lowercase for all counting
        if char_to_look in counts:
            counts[char_to_look] += 1
        else:
            counts[char_to_look] = 1
    # Search the original string instead of a separate order list: the characters
    # and their order are the same except for case, so look up x.lower() in counts.
    for x in string:
        if counts[x.lower()] == 1:
            return x
    return ''
The point is that x in counts is not searched for in a case-insensitive way. You have to implement your own case-insensitive dictionary, or use regular expressions to detect repeating letters:
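import re

def first_non_repeat(string):
    r = re.compile(r'([a-z])(?=.*\1)', re.I | re.S)
    m = r.search(string)
    while m:
        # flags must be passed by keyword; the fourth positional argument is count
        string = re.sub(m.group(1), '', string, flags=re.I)
        m = r.search(string)
    # an empty result means every letter was repeated
    return string[0] if string else ''

print(first_non_repeat('sTreSS'))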
The ([a-z])(?=.*\1) regex finds any ASCII letter that also appears somewhere ahead (note that ([a-z]) captures the char into Group 1 and the (?=.*\1) is a lookahead where \1 matches the same char captured into Group 1 after any 0+ characters matched with .* pattern, and re.S flag helps support strings with linebreaks).
The re.sub will remove all the found letters in a case insensitive way, so we will only get unique characters in the string after the while block.
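As an alternative to both the hand-rolled dictionary and the regex loop, the counting can be sketched with collections.Counter over lowercased characters. This is an illustration, not a replacement for the answers above, and it treats every character (not only ASCII letters) case-insensitively:

```python
from collections import Counter

def first_non_repeat(string):
    # count characters with case folded away
    counts = Counter(ch.lower() for ch in string)
    # walk the original string so the result keeps its original case
    for ch in string:
        if counts[ch.lower()] == 1:
            return ch
    return ""

print(first_non_repeat("sTreSS"))  # T
```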
I wrote a function that converts the string in its argument to a number. If a character is not a digit, the loop breaks and the variable holding the number is returned.
If the argument is "123" the function returns 6. I don't want to return the sum, just the digits placed in a row. How do I get the result 123? I don't know what to use instead of string2 += float(c).
def try_parse(string):
    string2=0
    for c in string:
        if c.isdigit() == True:
            string2 += float(c)
        else:
            break
    return string2
I modified your code:
def try_parse(string):
    string2 = ""
    for c in string:
        if not c.isdigit() and c != '.':
            break
        string2 += c
    return string2
You can see that I now use string2 as a string and not a number (when + is used on numbers it adds them, and when it is used on strings it concatenates them).
Also, I used a more readable if condition.
Update:
The condition now also accepts '.', so decimal numbers are kept.
Tests:
>>> try_parse('123')
'123'
>>> try_parse('12n3')
'12'
>>> try_parse('')
''
>>> try_parse('4.13n3')
'4.13'
Note
The return type is a string; you can call float() on it wherever you need a number :)
You need to use a string for string2, and str instead of float.
You want string2 = "", and string2 += c. (You don't need to call str on c because it is already a string.)
You could leave the conversion to a number to Python (using int(), rather than float(); you only filter on digits), and only worry about filtering:
def try_parse(string):
    digits = []
    for c in string:
        if c.isdigit():
            digits.append(c)
    return int(''.join(digits))
but if you really want to build a number yourself, you need to take into account that digits are not just their face value. 1 in 123 does not have the value of one. It has a value of 100.
The easiest way to build your number is then to multiply the number you have so far by 10 before adding the next digit. That way 1 stays 1, and 12 starts as 1, which becomes 10 and then 12 once you add the 2, etc:
def try_parse(string):
    result = 0
    for c in string:
        if c.isdigit():
            result = result * 10 + int(c)
    return result
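The two versions above skip non-digit characters instead of stopping at them, which differs from the break in the question. If stopping at the first non-digit is what you want, the multiply-by-10 idea can be combined with a break, roughly like this (parse_leading_digits is a hypothetical name, and returning 0 when there are no leading digits is an assumption):

```python
def parse_leading_digits(string):
    result = 0
    for c in string:
        if not c.isdecimal():
            break  # stop at the first character that is not a digit
        # shift the digits read so far one place to the left, then add the new one
        result = result * 10 + int(c)
    return result

print(parse_leading_digits("123"))   # 123
print(parse_leading_digits("12n3"))  # 12
```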