How to check for certain things in my string?


My code is supposed to return None if the string contains unsupported operators or non-numbers. Supported operators are: **, *, ^, -, +, /, (, )
Examples:
"4.0 + 2" is valid
"3.88327 - $3.4" is invalid (since "$")
"a + 24" is invalid (since "a")
"2+6" is valid
"4+/3" is invalid (since "+/" is two operators next to each other)
"4**-3" is valid
How would I do this?
Here's my code:
def checkvalid(string1):
    temp = string1.split()
    for i in len(temp):
        if i in "*^-+/()":
            return None
        if not i.isnumeric():
            return None
    return string1
But this doesn't always work. It only works for plain integers such as "22 66", where it returns the string. For anything else it returns None.

Updated Answer
Since my original answer, you've added seven new requirements to this question. I'm disengaging as I think you need to better understand the scope of the problem you're facing before asking for more help.
However, I will throw one more snippet up that might set you on the right path, as it appears that you're trying to find valid mathematical expressions. The following code will do that:
def check_valid(data):
    errors = (SyntaxError, NameError)
    try:
        eval(data)
    except errors:
        for i in data.split():
            try:
                eval(i)
            except errors:
                return None
    return data
test = ["4++2", "4+-2", "4.0 + 2", "3.88327 - $3.4", "a + 24", "2+6", "4+/3"]
for t in test:
    try:
        assert check_valid(t)
        print(f"{t} valid")
    except AssertionError:
        print(f"{t} not valid")
Output
4++2 valid
4+-2 valid
4.0 + 2 valid
3.88327 - $3.4 not valid
a + 24 not valid
2+6 valid
4+/3 not valid
In Python, a unary + or - can be repeated any number of times in front of an operand and the result is still a valid expression: each extra sign is applied as a unary operator (+ leaves the value unchanged, - negates it).
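A quick way to see this for yourself, as a small sketch using the standard ast module (the variable names are only for illustration): the second sign in "4++2" is parsed as a unary operator on the right-hand operand, not as a second binary operator.

```python
import ast

# "4++2" is read as 4 + (+2): a binary addition whose right side is a unary plus.
tree = ast.parse("4++2", mode="eval")
binop = tree.body
print(type(binop).__name__)        # BinOp
print(type(binop.right).__name__)  # UnaryOp

# Stacked signs still evaluate: 4 + (+2), 4 + (-2), 4 + (-(+(-2)))
results = (eval("4++2"), eval("4+-2"), eval("4+-+-2"))
print(results)                     # (6, 2, 6)
```

This is why an eval-based check reports "4++2" as valid even though it looks like two operators in a row.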
Original Answer
There are a number of ways to approach this. Given your example, there are a few flaws in your logic:
"4.0" is not numeric. str.isnumeric() returns True only if every character is a numeric character (0-9 or other Unicode numerics), and "." is not one.
You're checking a string against another string with the in keyword. With your first example string, the sequence "4.0" is clearly not in the sequence "*^-+/()". Example of how this works:
>>> "4.0" in "asdf4.012345"
True
>>> "4.0" in "0.4"
False
A quick fix using similar logic would be to check character-by-character rather than word-by-word, and combine the two conditionals with and. Try the following snippet:
def check_valid(data):
    for word in data.split():
        for character in word:
            if character not in "*^-+/()." and not character.isnumeric():
                return None
    return data
test = ["4.0 + 2", "3.88327 - $3.4", "a + 24", "22 66", "2+6"]
for t in test:
    print(f"Test: {check_valid(t)}")
Output
Test: 4.0 + 2
Test: None
Test: None
Test: 22 66
Test: 2+6
Note: I changed some names to follow Python code style best practices more closely.

Adding a few checks before calling eval can make it a bit safer, although it is still not ideal.
import re
def checkvalid(string1):
    string1 = string1.replace(" ", "")
    checkStatements = ["++", "-+", "---"]
    checkOut = [x for x in checkStatements if x not in string1]
    # If you need to reject any further characters, add them to the regex.
    # re.search is used so that a letter anywhere in the string is caught;
    # re.match would only look at the start of the string.
    if re.search('[a-zA-Z]', string1) or len(checkOut) != len(checkStatements):
        return False
    else:
        try:
            output = eval(string1)
            if isinstance(output, float) or isinstance(output, int):
                return True
            else:
                return False
        except Exception:
            return False

An alternative might be to use a regex to check whether the expression contains invalid characters, or to use a string parser.
Because your expressions are simple, let's make Python do the job for us:
def check_valid(expression: str):
    try:
        # Evaluate the expression; an invalid expression raises an exception.
        eval(expression)
        return True
    except Exception as _:
        return False
if __name__ == '__main__':
    print("{expression}: {validity}".format(expression="4.0 + 2", validity=check_valid("4.0 + 2")))
    print("{expression}: {validity}".format(expression="3.88327 - $3.4", validity=check_valid("3.88327 - $3.4")))
    print("{expression}: {validity}".format(expression="a + 24", validity=check_valid("a + 24")))
    print("{expression}: {validity}".format(expression="2+6", validity=check_valid("2+6")))
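For the regex alternative mentioned in this answer, here is one possible sketch that never calls eval. It is not taken from any of the answers: the TOKEN pattern and the rules are assumptions read off the examples in the question (numbers are digits with an optional decimal part, a single unary minus is allowed where an operand is expected, any other pair of adjacent operators is rejected).

```python
import re

# One token: a number, the two-character operator **, or a single-character symbol.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|\*\*|[-+*/^()])")
BINARY_OPERATORS = ("**", "*", "/", "+", "-", "^")


def check_valid(expr):
    """Return expr if it looks like a well-formed expression, else None."""
    text = expr.rstrip()
    pos, depth = 0, 0
    expect_operand = True   # True when a number or "(" must come next
    seen_sign = False       # True after a unary minus, until its operand arrives
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            return None     # unsupported character such as "$" or "a"
        tok, pos = match.group(1), match.end()
        if expect_operand:
            if tok == "-" and not seen_sign:
                seen_sign = True
            elif tok == "(":
                depth += 1
                seen_sign = False
            elif tok[0].isdigit():
                expect_operand = False
                seen_sign = False
            else:
                return None  # e.g. the "/" in "4+/3"
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return None
        elif tok in BINARY_OPERATORS:
            expect_operand = True
        else:
            return None      # a number or "(" directly after an operand
    return expr if not expect_operand and depth == 0 else None


for sample in ["4.0 + 2", "3.88327 - $3.4", "a + 24", "2+6", "4+/3", "4**-3"]:
    print(sample, "->", check_valid(sample))
```

Unlike the eval-based versions, this rejects "4++2" and "22 66". Whether that is what you want depends on how strictly you read the requirements.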

Related

Why doesn't my specific implementation work for returning middle letter from a function?

The aim is to return the middle letter of a string, or " " if the string's even. I'm failing to see why my code doesn't work. Any sort of clarification would be great, thanks!
def mid(ml):
    x=len(ml)
    y=x+1
    z=y/2
    if isinstance(z,int)==True:
        return (ml[z-1])
    else:
        return (" ")
print (mid("abc"))

/ doesn't return an int, even if the number it returns can be represented as one:
>>> 4 / 2
2.0 # A float
It would be better to check explicitly whether the length x is odd or even:
# Read as: "If x is odd", or, "If division of x by 2 has a remainder of 1"
if x % 2 == 1:
    return ml[x // 2]
else:
    return " "

This behavior occurs because the / operator returns a float. The smarter way to solve this would be the modulo operator, but if you want to stick to your code, you could use the is_integer() method of float like this:
def mid(ml):
    x=len(ml)
    y=x+1
    z=y/2
    if z.is_integer():
        return (ml[int(z)-1])
    else:
        return (" ")
print (mid("abc"))
Better way to do it:
def mid(ml):
    return ml[len(ml)//2] if len(ml) % 2 else " "

Following the answer from @Carcigenicate above, you can try the following code:
def mid(ml):
    x=len(ml)
    y=x+1
    z=y/2
    return (ml[int(z-1)])
print (mid("abc"))

How to check if the order of operators in an equation is correct?

I am trying to make a function that takes an equation as input and evaluates it based on the operations. The rule is that the operators (*, +, -, %, ^) must appear between correct mathematical expressions. Examples:
Input: 6**8
Result: Not correct
Reason: * has another * next to it instead of a digit or a mathematical expression
Input: -6+2
Result: Not correct
Reason: "-" was in the beginning and it didn't fall between two numbers.
Input: 6*(2+3)
Result: Correct
Reason: "*" was next to a mathematically correct expression "(2+3)"

1. Option: eval
Evaluate the expression with eval inside a try-except:
try:
    result = eval(expression)
    correct_sign = True
except SyntaxError:
    correct_sign = False
Advantages:
Very easy and fast
Disadvantages:
Python accepts expressions that you probably don't want (e.g. ** is valid in Python)
eval is not secure
2. Option: Algorithm
Compilers use algorithms to turn a math expression into a form the computer can process. These algorithms can also be used to check whether the expression is valid.
I don't aim to explain these algorithms here; there are plenty of resources available.
This is a very brief structure of what you can do:
Parsing an infix expression
Converting infix expression to a postfix expression
Evaluating the postfix expression
You need to understand what postfix and infix expressions mean.
Resources:
Shunting yard algorithm: https://en.wikipedia.org/wiki/Shunting-yard_algorithm
Reverse polish notation/ post fix notation: https://en.wikipedia.org/wiki/Reverse_Polish_notation
Python builtin tokenizer: https://docs.python.org/3.7/library/tokenize.html
Advantages:
Reliable
Works for complicated expressions
You don't have to reinvent the wheel
Disadvantages:
Complicated to understand
Complicated to implement
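The three steps listed above (parse the infix expression, convert it to postfix, evaluate the postfix form) could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function names are made up, "^" is assumed to mean right-associative exponentiation, and a leading "-" is rejected because the question treats it as incorrect.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*%^()])")
# Assumed precedence table; "^" is treated as right-associative power.
PREC = {"+": 1, "-": 1, "*": 2, "%": 2, "^": 3}
RIGHT_ASSOC = {"^"}
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "%": operator.mod, "^": operator.pow}


def tokenize(expr):
    """Step 1: split the infix string into number/operator/parenthesis tokens."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def to_postfix(tokens):
    """Step 2: shunting-yard conversion, rejecting misplaced operators."""
    output, stack = [], []
    expect_operand = True  # True when a number or "(" must come next
    for tok in tokens:
        if tok[0].isdigit() or tok == "(":
            if not expect_operand:
                raise ValueError(f"operator expected before {tok!r}")
            if tok == "(":
                stack.append(tok)
            else:
                output.append(tok)
                expect_operand = False
        elif expect_operand:
            raise ValueError(f"operand expected before {tok!r}")
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')'")
            stack.pop()  # discard the matching "("
        else:
            # Pop operators that bind more tightly, or equally for left-associative ones.
            while stack and stack[-1] in PREC and (
                PREC[stack[-1]] > PREC[tok]
                or (PREC[stack[-1]] == PREC[tok] and tok not in RIGHT_ASSOC)
            ):
                output.append(stack.pop())
            stack.append(tok)
            expect_operand = True
    if expect_operand or "(" in stack:
        raise ValueError("incomplete expression")
    output.extend(reversed(stack))
    return output


def eval_postfix(postfix):
    """Step 3: evaluate the postfix token list with a stack."""
    stack = []
    for tok in postfix:
        if tok in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack[0]


def is_correct(expr):
    """True if the expression tokenizes and converts without errors."""
    try:
        to_postfix(tokenize(expr))
    except ValueError:
        return False
    return True


for text in ["6**8", "-6+2", "6*(2+3)"]:
    print(text, "->", "Correct" if is_correct(text) else "Not correct")
print(eval_postfix(to_postfix(tokenize("6*(2+3)"))))  # 30.0
```

The expect_operand flag is what catches "6**8" and "-6+2"; the plain shunting-yard algorithm on its own only reorders tokens and would let some malformed input through.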

As mentioned in comments, this is called parsing and requires a grammar.
See an example with parsimonious, a PEG parser:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
from parsimonious.exceptions import ParseError
grammar = Grammar(
    r"""
    expr = (term operator term)+
    term = (lpar factor rpar) / number
    factor = (number operator number)
    operator = ws? (mod / mult / sub / add) ws?
    add = "+"
    sub = "-"
    mult = "*"
    mod = "/"
    number = ~"\d+(?:\.\d+)?"
    lpar = ws? "(" ws?
    rpar = ws? ")" ws?
    ws = ~"\s+"
    """
)
class SimpleCalculator(NodeVisitor):
    def generic_visit(self, node, children):
        return children or node
    def visit_expr(self, node, children):
        return self.calc(children[0])
    def visit_operator(self, node, children):
        _, operator, *_ = node
        return operator
    def visit_term(self, node, children):
        child = children[0]
        if isinstance(child, list):
            _, factor, *_ = child
            return factor
        else:
            return child
    def visit_factor(self, node, children):
        return self.calc(children)
    def calc(self, params):
        """ Calculates the actual equation. """
        x, op, y = params
        op = op.text
        if not isinstance(x, float):
            x = float(x.text)
        if not isinstance(y, float):
            y = float(y.text)
        if op == "+":
            return x+y
        elif op == "-":
            return x-y
        elif op == "/":
            return x/y
        elif op == "*":
            return x*y
equations = ["6 *(2+3)", "2+2", "4*8", "123-23", "-1+1", "100/10", "6**6"]
c = SimpleCalculator()
for equation in equations:
    try:
        tree = grammar.parse(equation)
        result = c.visit(tree)
        print("{} = {}".format(equation, result))
    except ParseError:
        print("The equation {} could not be parsed.".format(equation))
This yields
6 *(2+3) = 30.0
2+2 = 4.0
4*8 = 32.0
123-23 = 100.0
The equation -1+1 could not be parsed.
100/10 = 10.0
The equation 6**6 could not be parsed.

You need to use the right data structures and algorithms to parse a mathematical equation and evaluate it.
You also have to be familiar with two concepts, stacks and trees, to create a parser.
I think the best approach you can use is RPN (Reverse Polish Notation).

For issue #1, you could always strip out the parentheses before evaluating.
input_string = "6*(2+3)"
it = filter(lambda x: x != '(' and x != ')', input_string)
after = ' '.join(list(it))
print(after)
# prints "6 * 2 + 3"

It looks like you might just be starting to use python. There are always many ways to solve a problem. One interesting one to sort of get you jump started would be to consider splitting the equation based on the operators.
For example the following uses what's called a regular expression to split the formula:
>>> import re
>>> formula2 = '6+3+5--5'
>>> re.split(r'\*|\/|\%|\^|\+|\-',formula2)
['6', '3', '5', '', '5']
>>> formula3 = '-2+5'
>>> re.split(r'\*|\/|\%|\^|\+|\-',formula3)
['', '2', '5']
It may look complex, but in the r'\*|\/|\%|\^|\+|\-' piece the \ means to take the next character literally and the | means 'or' so it evaluates to split on any one of those operators.
In this case you'd notice that any time there are two operators together, or when a formula starts with an operator you will have a blank value in your list - one for the second - in the first formula and one for the leading - in the second formula.
Based on that you could say something like:
if '' in re.split(r'\*|\/|\%|\^|\+|\-',formula):
    correctsign = False
Maybe this can serve as a good starting point to get the brain thinking about interesting ways to solve the problem.
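As a sketch, the split-based check could be wrapped in a small function. The name operators_well_placed is made up for this example, and the pattern is the same set of operators written as a character class instead of alternatives.

```python
import re

# Same operators as in the split above, written as a character class.
OPERATORS = re.compile(r"[*/%^+\-]")


def operators_well_placed(formula):
    # An empty string in the result means an operator had nothing on one
    # side: a leading or trailing operator, or two operators in a row.
    parts = OPERATORS.split(formula.replace(" ", ""))
    return "" not in parts


print(operators_well_placed("6+3+5--5"))  # False
print(operators_well_placed("-2+5"))      # False
print(operators_well_placed("6*(2+3)"))   # True
print(operators_well_placed("6**8"))      # False
```

Note that this only looks at what sits next to each operator. It does not check that parentheses are balanced, and something like "6*(-2)" slips through because the "(" counts as a non-empty piece.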

Important to first mention that ** stands for exponentiation, i.e 6**8: 6 to the power of 8.
The logic behind your algorithm is wrong because in your code the response depends only on whether the last digit/sign satisfies your conditions. This is because once the loop is complete, your boolean correctsigns defaults to True or False based on the last digit/sign.
You can also use elif instead of nested else statements for cleaner code.
Without changing your core algorithm, your code would look something like this:
def checksigns(equation):
    signs = ["*","/","%","^","+","-"]
    for i in signs:
        if i in equation:
            index = equation.index((i))
            if (equation[index] == equation[0]):
                return "Not correct"
            elif (equation[index] == equation[len(equation) - 1]):
                return "Not correct"
            elif (equation[index + 1].isdigit() and equation[index - 1].isdigit()):
                return "Correct"
            else:
                return "Not correct"

You can use Python's ast module for parsing the expression:
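import ast
import itertools as it
def check(expr):
    allowed = (ast.Add, ast.Sub, ast.Mult, ast.Mod)
    try:
        # Skip the first two nodes (the Module and Expr wrappers).
        for node in it.islice(ast.walk(ast.parse(expr)), 2, None):
            if isinstance(node, ast.BinOp):
                continue
            # Numeric literals; ast.Constant replaces the deprecated ast.Num.
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                continue
            if not isinstance(node, allowed):
                return False
    except SyntaxError:
        return False
    return True
print(check('6**8')) # False
print(check('-6+2')) # False
print(check('6*(2+3)')) # True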
The first case 6**8 evaluates as False because it is represented by ast.Pow node and the second one because -6 corresponds to ast.UnaryOp.

True and False Statements, finding variable types

I'm trying to decide whether a string variable is a valid integer or float. I've tried try statements, but using ASCII seems to be better. I can't get this to work; it's probably something to do with the returns and the boolean.
def validation(valid):
    var=input("no...")
    state=True
    y=0
    for x in range(len(var)):
        if state==False or y==2:
            valid=False
            return valid
        if var[x] == chr(46) or chr(48) <= var[x] <= chr(57):
            if var[x] == chr(46):
                y+=1
            state=True
        else:
            state=False
valid=True
valid = validation(valid)
print(valid)
The ASCII characters 46 and 48-57 are the decimal point and the digits 0-9. If there is more than one decimal point (counted by y), then it also returns False.

Simple and powerful approach:
def isNum(txt):
    try:
        float(txt)
        return True
    except ValueError:
        return False

There is a much simpler way to do this:
try:
    float(x)
except ValueError as e:
    print(e)
    print("Not a float")
else:
    try:
        int(x)
    except ValueError as e:
        print(e)
        print("not an integer")
You can combine these into a function pretty easily.

Your code isn't very efficient, and it doesn't handle negative numbers, or floats written in scientific notation. Using try, as in fiacre's answer is the standard way to do this in Python.
But to answer your question, you don't have a return statement after the for loop, so if all your tests succeed your function returns the default value of None. So you just need to put a return True at the end of your function.
You can make your code a little more efficient by iterating over the string values and testing the characters directly rather than indexing into the string and using chr(). Eg,
for c in var:
    if c == '.' or '0' <= c <= '9':
        # etc
Alternatively,
valid_chars = set('-.0123456789')
for c in var:
    if c in valid_chars:
        # etc
FWIW, in your existing code Python has to re-calculate chr(46) etc on every loop. Sure, that's a pretty fast operation, but it's still wasteful.
Also, there's no need to have both state and valid in the function.
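Putting those suggestions together, a character-by-character version might look like the sketch below. The name is_number is hypothetical, and the function deliberately keeps the limitations mentioned above: no sign and no scientific notation, which float() would handle for you.

```python
def is_number(text):
    """Check for an unsigned int or float such as '12' or '3.14'."""
    dots = 0
    digits = 0
    for c in text:
        if c == ".":
            dots += 1
            if dots > 1:
                return False   # a second decimal point
        elif "0" <= c <= "9":
            digits += 1
        else:
            return False       # any other character is invalid
    # Reaching the end of the loop means no test failed, so return explicitly
    # instead of falling off the end and returning None.
    return digits > 0


for sample in ["12", "3.14", "1.2.3", "abc", ".", ""]:
    print(repr(sample), is_number(sample))
```

A single counter replaces both state and valid, and the explicit return after the loop is the piece that was missing from the original function.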

make a program return True if there is more than one dot in the string?

so I'm new to programming (and python) and I have to make this program that returns True if the string has zero or one dot characters ("." characters) and return False if the string contains two or more dots
here is what I currently have, I cannot get it to work for me, please correct me if I am wrong, thanks!
def check_dots(text):
    text = []
    for char in text:
        if '.' < 2 in text:
            return True
        else:
            return False

Use the built-in string method str.count():
if text.count('.') < 2:
    return True
It can be even shorter if, instead of an if-else statement, you do
return text.count('.') < 2
Also, there are some errors in your function. All you need is
def check_dots(text):
    return text.count('.') < 2

A correct and shorter version would be:
return text.count('.') <= 1

Python strings have a method called count().
You can do the following.
if text.count('.') < 2:  # counts the number of '.' occurring in your string
    return True
else:
    return False
A shortcut would be:
return text.count('.')<2
Let's analyze the above statement.
text.count('.')<2 basically says "count the periods in the string and return True or False depending on whether there are fewer than two." So if text.count('.') was 3, then that would be 3<2, which is False.
Another example: say you want to get True if a string is longer than 7 characters.
x = input("Enter a string.")
print(len(x)>7)
The code snippet len(x)>7 means that the program checks the length of x. Let's pretend the string length is 9. In this case, len(x) would evaluate to 9, then the comparison would become 9>7, which is True.

I shall now analyze your code.
def check_dots(text):
    text = []  ################ Don't do this. This makes it a list,
               # and the thing that you are trying to
               # do involves strings, not lists. Remove it.
    for char in text:  # not needed, delete
        if '.' < 2 in text:  # I see your thinking, but you can use count()
                             # to do this. so -> if text.count('.')<2: <- That
                             # will do the same thing as you attempted.
            return True
        else:
            return False
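If you would rather keep the for loop from the original attempt instead of using count(), a version along these lines should behave the same way. This is only a sketch of the loop-based idea, not something from the answers above.

```python
def check_dots(text):
    dots = 0
    for char in text:
        if char == ".":
            dots += 1
            if dots >= 2:
                return False  # two dots found, no need to look further
    return True               # zero or one dot in the whole string


print(check_dots("hello"))   # True
print(check_dots("3.14"))    # True
print(check_dots("1.2.3"))   # False
```

The important differences from the original are that text is not overwritten and that the function only returns True after the whole string has been examined.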

Pass Variables within Python Class

I have the following class. When I try to pass the variable x to re.match it doesn't appear to work correctly: whatever input I give it, it returns "invalid".
class validate:
    def __init__(self, input_value):
        self.input_value = input_value
    def macaddress(self, oui):
        self.oui = oui
        #oui = 0, input deemed valid if it matches {OUI:DEVICE ID}.
        #oui = 1, input deemed valid if it matches {OUI}.
        if self.oui == 0:
            x = 5
        elif self.oui == 1:
            x = 2
        if re.match("[0-9a-fA-F]{2}([.-: ][0-9a-fA-F]{2}){x}$", self.input_value):
            return "valid"
        else:
            return "invalid"
Should I be passing the variable x in some other manner?
Thanks,

Insert x into the string using string formatting.
With %-formatting:
if re.match("[0-9a-fA-F]{2}([.-: ][0-9a-fA-F]{2}){%d}$" % x, self.input_value):
If you use str.format() instead, the braces that belong to the regex clash with the format fields, so every literal brace has to be doubled:
if re.match("[0-9a-fA-F]{{2}}([.-: ][0-9a-fA-F]{{2}}){{{0}}}$".format(x), self.input_value):
With concatenation, x has to be converted with str() first, because Python does not implicitly convert an int to a string:
if re.match("[0-9a-fA-F]{2}([.-: ][0-9a-fA-F]{2}){" + str(x) + "}$", self.input_value):
If you just put {x} in the middle of your string, Python doesn't do anything with it unless string formatting is applied.
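On Python 3.6 or later an f-string is another option. The sketch below uses a hypothetical helper is_mac with an oui_only flag in place of the 0/1 argument. It also writes the separator class as [.: -], since inside a character class ".-:" is a range from "." to ":" that also matches "/" and the digits, which is probably not what was intended.

```python
import re


def is_mac(value, oui_only=False):
    # 2 further groups for an OUI, 5 for a full address (hypothetical helper).
    groups = 2 if oui_only else 5
    # In an f-string, {{ and }} produce literal braces for the regex,
    # while {groups} is replaced by the number.
    pattern = rf"[0-9a-fA-F]{{2}}([.: -][0-9a-fA-F]{{2}}){{{groups}}}$"
    return re.match(pattern, value) is not None


print(is_mac("00:1A:2B:3C:4D:5E"))            # True
print(is_mac("00:1A:2B", oui_only=True))      # True
print(is_mac("00:1A:2B"))                     # False
```

Putting the hyphen last in the character class keeps it a literal hyphen rather than the start of a range.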
