I'm trying to create a calculator program in which the user can type an equation and get an answer. I don't want the full code for this, I just need help with a specific part.
The approach I am trying to take is to have the user input the equation as a string (raw_input) and then convert the numbers from their input to integers. After that I need to know how I can get the operators to do what I want them to do, depending on which operator the user uses and where it is in the equation.
What are some methods I might use to accomplish this task?
Here is basically what I have right now:
equation_number = raw_input("\nEnter your equation now: ")
[int(d) for d in equation_number if d.isdigit()]
Those lines are just for collecting input and attempting to convert the numbers into integers. Unfortunately, it does not seem to be working very well and .isdigit will only work for positive numbers anyway.
Edit- aong152 mentioned recursive parsing, which I looked into, and it appears to have desirable results:
http://blog.erezsh.com/how-to-write-a-calculator-in-70-python-lines-by-writing-a-recursive-descent-parser/
However, I do not understand the code that the author of this post is using. Could anyone familiarize me with the basics of recursive parsing?
The type of program you are trying to make is probably more complicated than you think.
The first step would be separating the string into each argument.
Let's say that the user inputs:
1+2.0+3+4
Before you can even convert to ints, you are going to need to split the string up into its components:
1
+
2.0
+
3
+
4
This will require a recursive parser, which (seeing as you are new to Python) may be a bit of a hurdle.
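For flat input like the example above, the splitting step on its own can be sketched with a regular expression. This is only an illustration: the names TOKEN_RE and tokenize are made up here, and the pattern assumes unsigned numbers and the operators + - * / plus parentheses.

```python
import re

# Hypothetical token pattern: decimals like "2.0" or ".5", plain integers,
# or a single operator/parenthesis character.
TOKEN_RE = re.compile(r"\d+\.\d*|\.\d+|\d+|[-+*/()]")

def tokenize(text):
    """Split an expression string into a list of number and operator strings."""
    return TOKEN_RE.findall(text)

print(tokenize("1+2.0+3+4"))
```

Characters the pattern does not know (letters, stray symbols) are silently skipped by findall, so a real program would want to check for them.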
Assuming that you now have each part separately as strings,
float("2.0") = 2.0
int(2.0) = 2
Here is a helper function
def num(s):
    # Try a plain integer first; fall back to parsing a float and truncating it.
    try:
        return int(s)
    except ValueError:
        return int(float(s))
Instead of raw_input, just use input: in Python 2, raw_input returns a string, whereas input evaluates what the user typed as a Python expression, so typing a number gives you an int.
This is a very simple calculator:
def calculate():
    x = input("Equation: ")
    print x

while True:
    calculate()
The function takes the input and prints it; then the while loop executes it again.
I'm not sure if this is what you want, but here you go. You should also add a way to end the loop.
After using raw_input() you can use eval() on the result to compute the value of this string. eval() evaluates any valid Python expression and returns the outcome.
But I think this is not to your liking. You probably want to do more by yourself.
So I think you should have a look at the re module to split the input into tokens (something like numbers and operators) using regular expressions. After this you should write a parser which gets the token stream as input. You should decide whether this parser shall just return the computed value (e.g. a number) or maybe an abstract syntax tree (AST), i.e. a data structure which represents the expression in an object-oriented (instead of character-oriented) way. Such an AST could then be evaluated to get the final result.
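To make the "tree that is evaluated afterwards" idea concrete, here is a minimal sketch. It assumes the tree is represented as nested tuples of the form (operator, left, right) with plain numbers as leaves; that representation and the names OPS and evaluate are my own choices for illustration, not something a parser has to produce.

```python
import operator

# Map each operator symbol to the function that implements it.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node):
    """Evaluate a nested-tuple expression tree bottom-up."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node  # a leaf is already a number

# The tree a parser might build for 1 + 2 * 3
tree = ("+", 1, ("*", 2, 3))
print(evaluate(tree))
```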
Are you familiar with regular expressions? If not, it's probably a good idea to first learn about them. They are the weak, non-recursive cousin of parsing. Don't go deep, just understand the building blocks — A then B, A many times, A or B.
The blog post you found is hard because it implements the parsing by hand. It's using recursive descent, which is the only way to write a parser by hand and keep your sanity, but it's still tricky.
What people do most of the time is only write a high level grammar and use a library (or code generator) to do the hard work of parsing.
Indeed he had an earlier post where he uses a library:
http://blog.erezsh.com/how-to-write-a-calculator-in-50-python-lines-without-eval/
At least the beginning should be very easy. Things to pay attention to:
How precedence arises from the structure of the grammar — add consists of muls, not vice versa.
The moment he adds a rule for parentheses:
atom: neg | number | '(' add ')';
This is where it really becomes recursive!
6-2-1 should parse as (6-2)-1, not 6-(2-1). He doesn't discuss it, but if you look carefully, it also arises from the structure of the grammar. Don't waste time on this; just know for future reference that this is called associativity.
The result of parsing is a tree. You can then compute its value in a bottom-up manner.
In the "Calculating!" chapter he does that, but in a sort of magic way.
Don't worry about that.
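As a rough illustration of the points above (this is not the blog post's code), here is a hand-written sketch where each grammar level is one function. The names parse_add, parse_mul and parse_atom are hypothetical, and the tokens are assumed to be already split into a list of strings.

```python
def parse_atom(tokens, pos):
    """atom: number | '(' add ')'. Returns (value, next position)."""
    tok = tokens[pos]
    if tok == "(":
        value, pos = parse_add(tokens, pos + 1)  # recursion happens here
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return value, pos + 1
    return float(tok), pos + 1

def parse_mul(tokens, pos):
    """mul: atom (('*' | '/') atom)*, folded left to right."""
    value, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_atom(tokens, pos + 1)
        value = value * right if op == "*" else value / right
    return value, pos

def parse_add(tokens, pos=0):
    """add: mul (('+' | '-') mul)*, folded left to right."""
    value, pos = parse_mul(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_mul(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos
```

Because parse_add only ever combines whole parse_mul results, 2 + 3 * 4 groups the multiplication first. The while loops fold operands from left to right, which is what makes 6 - 2 - 1 come out as (6-2)-1.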
To build a calculator yourself, I suggest you strip the problem as much as possible.
Recognizing where numbers end etc. is a bit messy. It could be part of the grammar, or done by a separate pass called lexer or tokenizer.
I suggest you skip it — require the user to type spaces around all operators and parens. Or just assume you're already given a list of the form [2.0, "*", "(", 3.0, "+", -1.0, ")"].
Start with a trivial parser(tokens) function that only handles 3-element expressions — [number, op, number].
Return a single number, the result of the computation. (I previously said parsers output a tree which is processed later. Don't worry about that, returning a number is simpler.)
Write a function that expects either a number or parentheses; in the latter case it calls parser().
>>> number_or_expr([1.0, "rest..."])
(1.0, ["rest..."])
>>> number_or_expr(["(", 2.0, "+", 2.0, ")", "rest..."])
(4.0, ["rest..."])
Note that I'm now returning a second value - the remaining part of the input. Change parser() to also use this convention.
Now rewrite parser() to call number_or_expr() instead of directly assuming tokens[0] and tokens[2] are numbers.
Voilà! You now have a (mutually) recursive calculator that can compute anything; it just has to be written in verbose style with parens around everything.
Now stop and admire your code, for at least a day :-) It's still simple but has the essential recursive nature of parsing. And the code structure reflects the grammar 1:1 (which is the nice property of recursive descent. You don't want to know how the other algorithms look).
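If you get stuck, one possible shape for the two functions described in the steps above is sketched below. It assumes the tokens are already a list of floats and operator strings; the OPS table is my own addition, and error handling is kept to a minimum.

```python
import operator

# Hypothetical lookup table from operator token to the function that applies it.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def number_or_expr(tokens):
    """Consume a number or a parenthesized expression; return (value, rest)."""
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        value, rest = parser(rest)  # mutual recursion
        if not rest or rest[0] != ")":
            raise ValueError("expected ')'")
        return value, rest[1:]
    return head, rest

def parser(tokens):
    """Parse exactly 'operand op operand'; return (value, rest)."""
    left, rest = number_or_expr(tokens)
    op, rest = rest[0], rest[1:]
    right, rest = number_or_expr(rest)
    return OPS[op](left, right), rest

print(parser([2.0, "*", "(", 3.0, "+", -1.0, ")"]))
```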
From here there are many possible improvements, such as supporting 2+2+2, allowing (1), and precedence, but there are 2 ways to go about it:
Improve your code step by step. You'll have to refactor a lot.
Stop working hard and use a parsing library, e.g. pyparsing.
This will allow you to experiment with grammar changes faster.
Is there a way to check satisfiability of a python string like 'p or p -> p' in Z3 if you do not know the variable names before hand?
For example I have seen this:
p = Bool('p')
solve(Implies(Or(p, p), p))
However I cannot define the variables in Z3 in advance because the proposition is given to me as a string. How can I do this with z3?
I have also seen Python's eval function, but it seems I need to have the variable names defined in z3 prior to that too.
Some questions to ponder: What would be the meaning of that string? What if it has syntax-errors in it? How do you discern what are the valid operators/variables? Do you allow just booleans, or other sorts as well? What about grouping, precedence, and associativity of operators?
Bottom line, if you want to go directly from a string, you really have no choice but to agree on a syntax and a semantics of what those strings mean. And the only way to do that is to write a parser for those strings, and "interpret" that result in the z3 context.
One choice is to "stick" to SMTLib, i.e., ask your input to be well-formatted SMTLib scripts. If you go with this choice, then z3 already has a built-in parser for them that you can readily use. See here: https://z3prover.github.io/api/html/namespacez3py.html#a09fe122cbfbc6d3fa30a79850b2a2414 But I'm pretty sure you'll find this rather ugly and not quite what you wanted. Still, this is the only "out-of-the-box" solution.
The proper way to handle this issue is to write a basic parser over boolean-expressions, whose syntax (and to some extent semantics) you'll have freedom to define however way you want. Also, this isn't a particularly difficult thing to do. If you're doing this in Python, you can use ply (https://www.dabeaz.com/ply/), or go with a hand-written recursive-descent parser (https://www.booleanworld.com/building-recursive-descent-parsers-definitive-guide/).
Feel free to explore and ask further questions, though make sure to tag them appropriately if they are about parsing strings in Python, which really has nothing to do with z3/z3py.
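To give an idea of the hand-written route, here is a small recursive-descent sketch for strings like 'p or p -> p'. The syntax is an assumption on my part: the keywords not, and, or, a right-associative -> with the lowest precedence, and parentheses. All function names are hypothetical, and the result is a plain nested tuple rather than a z3 object.

```python
import re

TOKEN_RE = re.compile(r"\s*(->|[()]|\w+)")
NAME_RE = re.compile(r"[A-Za-z_]\w*$")
KEYWORDS = ("not", "and", "or")

def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_implies(tokens):      # implies: or ('->' implies)?
    left = parse_or(tokens)
    if tokens and tokens[0] == "->":
        tokens.pop(0)
        return ("implies", left, parse_implies(tokens))
    return left

def parse_or(tokens):           # or: and ('or' and)*
    left = parse_and(tokens)
    while tokens and tokens[0] == "or":
        tokens.pop(0)
        left = ("or", left, parse_and(tokens))
    return left

def parse_and(tokens):          # and: not ('and' not)*
    left = parse_not(tokens)
    while tokens and tokens[0] == "and":
        tokens.pop(0)
        left = ("and", left, parse_not(tokens))
    return left

def parse_not(tokens):          # not: 'not' not | atom
    if tokens and tokens[0] == "not":
        tokens.pop(0)
        return ("not", parse_not(tokens))
    return parse_atom(tokens)

def parse_atom(tokens):         # atom: NAME | '(' implies ')'
    if not tokens:
        raise ValueError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        node = parse_implies(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("expected ')'")
        return node
    if tok in KEYWORDS or not NAME_RE.match(tok):
        raise ValueError("expected a variable name, got %r" % tok)
    return ("var", tok)

def parse(text):
    tokens = tokenize(text)
    tree = parse_implies(tokens)
    if tokens:
        raise ValueError("unexpected token %r" % tokens[0])
    return tree

def variables(node):
    """Collect the variable names that occur in a parsed tree."""
    if node[0] == "var":
        return {node[1]}
    names = set()
    for child in node[1:]:
        names |= variables(child)
    return names
```

From a tree like this, the variable names are known, so each one can be declared with Bool and the nodes translated one by one to z3py's Not, And, Or and Implies before calling solve.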
I'm trying to write an iterative LL(k) parser, and I've gotten strings down pretty well, because they have a start and end token, and so you can just "".join(tokenlist[string_start:string_end]).
Numbers, however, do not, and only consist of .0123456789. They can occur at any given point in a program, have any arbitrary length and are delimited purely by non-numerals.
Some examples, because that definition is pretty vague:
56 123.45/! is 56 and 123.45 followed by two other tokens
565.5345.345 % is 565.5345, 0.345 and two other tokens (incl. whitespace)
The problem I'm trying to solve is how the parser should figure out where a numeric literal ends. (Note that this is a context-free, self-modifying interpretive grammar thus there is no separate lexical analysis to be done.)
I could and have solved this with iteration:
def _next_notinst(self, atindex, subs = DIGITS):
    """return the next index of a char not in subs"""
    for i, e in enumerate(self.toklist[atindex:]):
        if e not in subs:
            return i - len(self.toklist)
        else:
            break
    return self.idx.v
(I don't think I need to clarify the variables, since it's an example and extremely straightforward.)
Great! That works, but there are at least two issues:
It's O(n) for a number with digit-length n. Not ideal.*
The parser class of which this method is a member is already using a while True: to cycle over arbitrary parts of the string, and I would prefer not having remotely nested loops when I don't need to.
From the previous bullet: since the parser uses arbitrary k lookahead and skipahead, parsing each individual token is absolutely not what I want.
I don't want to use RegEx mostly because I don't know it, and using it for this right now would make my code incomprehensible to me, its creator.
There must be a simple, < O(n) solution to this, that simply collects the contiguous numerals in a string given a starting point, up until a non-numeral.
*Yes, I'm fully aware the parser itself is O(n), but we don't also need the number catenator to be > O(n). If you don't believe me, the string catenator is O(1) because it simply looks for the next unescaped " in the program and then joins all the chars up to that. Can't I do the same thing for numbers?
My other answer was actually erroneous due to lack of testing.
I decided to suck it up and learn a little bit of RegEx just because it's the only other way to solve this.
^([.\d]+[.\d]+|[.\d]) works for what I want, and matches these:
123.43.453""
.234234!/%
but not, for example:
"1233
I am new to Python and am trying to write a calculator program. I have been trying to do the following but with no success, so please point me in the right direction:
I would like to input an equation as a user, for example:
f(t) = 2x^5 + 8
the program should recognize the different parts of a string and in this case make a variable f(t) and assign 2x^5 + 8 to it.
Though, if I input an equation followed by an equals sign, for example
2x^5 + 8 =
the program will instead just output the answer.
I am not asking how to code for the math-logic of solving the equation, just how to get the program to recognize the different parts of a string and make decisions accordingly.
I am sorry I don't have any code to show as an attempt as I'm not sure how to go about this and am looking for a bit of help to get started.
Thank you.
For a little bit of context: The problem you're describing is more generally known as parsing, and it can get rather complicated, depending on the grammar. The grammar is the description of the language; the language, in your case, is the set of all valid formulas for your calculator.
The first recommended step, even before you start coding, is to formalize your grammar. This is mainly for your own benefit, as it will make the programming easier. A well established way to do this is to describe the grammar using EBNF, and there exist tools like PLY for Python that you can use to generate parsers for such languages.
Let's try a simplified version of your calculator grammar:
digit := "0" | "1" # our numbers are in binary
number := digit | number digit # these numbers are all nonnegative
variable := "x" | "y" # we recognize two variable names
operator := "+" | "-" # we could have more operators
expression := number | variable | "(" expression operator expression ")"
definition := variable "=" expression
evaluation := expression "="
Note that there are multiple problems with this grammar. For example:
What about whitespace?
What about negative numbers?
What do you do about inputs like x = x (this is a valid definition)?
The first two are probably problems with the grammar itself, while the last one might need to be handled at a later stage (is the language perhaps context sensitive?).
But anyway, given such a grammar a tool like PLY can generate a parser for you, but leaving it up to you to handle any additional logic (like x = x). First, however, I'd suggest you try to implement it on your own. One idea is to write a so called Top Down Parser using recursion.
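Before writing a full parser, the very first decision (is this line a definition or an evaluation?) can be sketched by looking at where the equals sign sits. The function name classify and the returned tuples are assumptions for illustration; the expression text itself is left unparsed.

```python
def classify(line):
    """Decide whether a line is 'name = expression' or 'expression ='."""
    left, sep, right = line.partition("=")
    if not sep:
        raise ValueError("expected an '=' sign")
    left, right = left.strip(), right.strip()
    if not left:
        raise ValueError("nothing before the '=' sign")
    if right == "":
        # Nothing after '=': the user wants the expression evaluated.
        return ("evaluation", left)
    # Otherwise the left side names what is being defined.
    return ("definition", left, right)

print(classify("f(t) = 2x^5 + 8"))
print(classify("2x^5 + 8 ="))
```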
I am trying to match a string with a regular expression but it is not working.
What I am trying to do is simple: it is the typical situation where a user enters a range of pages, or single pages. I am reading the string and checking whether it is correct or not.
Expressions I am expecting, for a range of pages are like: 1-3, 5-6, 12-67
Expressions I am expecting, for single pages are like: 1,5,6,9,10,12
This is what I have done so far:
pagesOption1 = re.compile(r'\b\d\-\d{1,10}\b')
pagesOption2 = re.compile(r'\b\d\,{1,10}\b')
Seems like the first expression works, but not the second.
Also, would it be possible to merge both of them into one single regular expression, so that if the user enters either something like 1-2, 7-10 or something like 3,5,6,7 the expression will be recognized as valid?
Simpler is better
Matching the entire input isn't simple, as the proposed solutions show; at least it is not as simple as it could or should be. Such a regex becomes unmaintainable very quickly, and anyone who isn't regex savvy will probably scrap it for a simpler, more explicit solution when they need to modify it.
Simplest
First .split(",") the entire string into individual data entries. You have to do this anyway to parse out the usable numbers.
Then the test becomes very simple:
^(\d+)(?:-(\d+))?$
It says that the string must start with one or more digits, optionally followed by a single - and one or more digits, and then the string must end.
This makes your logic as simple and maintainable as possible. You also get the benefit of knowing exactly what part of the input is wrong and why so you can report it back to the user.
The capturing groups are there because you are going to need the input parsed out to actually use it anyway. This way you get the numbers if they match, without having to add more code to parse them again.
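Put together, the split-then-test approach might look like the sketch below. It uses the per-entry pattern with the second group written as (\d+); the function name parse_pages and the choice to return (first, last) tuples are my own assumptions.

```python
import re

# One entry is either a single page ("12") or a range ("12-67").
ENTRY_RE = re.compile(r"^(\d+)(?:-(\d+))?$")

def parse_pages(text):
    """Return a list of (first, last) page tuples, or raise on a bad entry."""
    pages = []
    for entry in text.split(","):
        entry = entry.strip()  # tolerate spaces after commas
        m = ENTRY_RE.match(entry)
        if m is None:
            # We know exactly which entry is wrong and can report it.
            raise ValueError("invalid page entry: %r" % entry)
        first = int(m.group(1))
        last = int(m.group(2)) if m.group(2) else first
        pages.append((first, last))
    return pages

print(parse_pages("1-3, 5-6, 12-67"))
```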
This regex should work -
^(?:(\d+\-\d+)|(\d+))(?:\,[ ]*(?:(\d+\-\d+)|(\d+)))*$
Testing this -
>>> test_vals = [
...     '1-3, 5-6, 12-67',
...     '1,5,6,9,10,12',
...     '1-3,1,2,4.5',
...     'abcd',
... ]
>>> regex = re.compile(r'^(?:(\d+\-\d+)|(\d+))(?:\,[ ]*(?:(\d+\-\d+)|(\d+)))*$')
>>> for val in test_vals:
...     print val
...     if regex.match(val) is None:
...         print "Fail"
...     else:
...         print "Pass"
...
1-3, 5-6, 12-67
Pass
1,5,6,9,10,12
Pass
1-3,1,2,4.5
Fail
abcd
Fail
I am converting some MATLAB code to C. Currently I have some lines that use ^ for powers, which is rather easy to handle with something along the lines of \(?(\w*)\)?\^\(?(\w*)\)?
This works fine for converting (glambda)^(galpha) using the sub routine in Python: pattern.sub(r'pow(\g<1>,\g<2>)', '(glambda)^(galpha)')
My problem comes with nested parenthesis
So I have a string like:
glambdastar^(1-(1-gphi)*galpha)*(glambdaq)^(-(1-gphi)*galpha);
And I can not figure out how to convert that line to:
pow(glambdastar,(1-(1-gphi)*galpha))*pow(glambdaq,(-(1-gphi)*galpha));
Unfortunately, regular expressions aren't the right tool for handling nested structures. There are some regular expressions engines (such as .NET) which have some support for recursion, but most — including the Python engine — do not, and can only handle as many levels of nesting as you build into the expression (which gets ugly fast).
What you really need for this is a simple parser. For example, iterate over the string counting parentheses and storing their locations in a list. When you find a ^ character, put the most recently closed parenthesis group into a "left" variable, then watch the group formed by the next opening parenthesis. When it closes, use it as the "right" value and print the pow(left, right) expression.
I think you can use recursion here.
Once you figure out the Left and Right parts, pass each of those to your function again.
The base case would be that no ^ operator is found, so you will not need to add the pow() function to your result string.
The function will return a string with all the correct pow()'s in place.
I'll come up with an example of this if you want.
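Here is a rough sketch of that idea in Python, combining parenthesis counting with recursion on the right-hand operand. The function names are hypothetical, and it makes simplifying assumptions: operands are either a word/number, a parenthesized group, or a call like name(...), and a signed exponent such as x^-2 without parentheses is not handled.

```python
def _operand_start(expr, caret):
    """Index where the operand ending just before expr[caret] begins."""
    j = caret - 1
    if j >= 0 and expr[j] == ")":
        depth = 0
        while j >= 0:  # walk left to the matching '('
            if expr[j] == ")":
                depth += 1
            elif expr[j] == "(":
                depth -= 1
                if depth == 0:
                    break
            j -= 1
        if j < 0:
            raise ValueError("unbalanced parentheses")
        j -= 1  # step before '(' so a function name is picked up too
    while j >= 0 and (expr[j].isalnum() or expr[j] in "_."):
        j -= 1
    return j + 1

def _operand_end(expr, k):
    """Index just past the operand that starts at expr[k]."""
    while k < len(expr) and (expr[k].isalnum() or expr[k] in "_."):
        k += 1
    if k < len(expr) and expr[k] == "(":
        depth = 0
        while k < len(expr):  # walk right to the matching ')'
            if expr[k] == "(":
                depth += 1
            elif expr[k] == ")":
                depth -= 1
                if depth == 0:
                    return k + 1
            k += 1
        raise ValueError("unbalanced parentheses")
    return k

def convert_powers(expr):
    """Rewrite every left^right in expr as pow(left,right)."""
    while "^" in expr:
        caret = expr.index("^")  # leftmost ^, so the left side has none
        start = _operand_start(expr, caret)
        end = _operand_end(expr, caret + 1)
        left = expr[start:caret]
        right = convert_powers(expr[caret + 1:end])  # recurse into nesting
        expr = expr[:start] + "pow(%s,%s)" % (left, right) + expr[end:]
    return expr

print(convert_powers("glambdastar^(1-(1-gphi)*galpha)*(glambdaq)^(-(1-gphi)*galpha);"))
```

The sketch keeps the original parentheses around each operand, so the output contains some redundant ones such as pow((glambdaq),...), which is still valid C.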
Nested parentheses cannot be described by a regexp and require a full parser (able to understand a grammar, which is something more powerful than a regexp). I do not think there is a solution.
See recent discussion function-parser-with-regex-in-python (one of many similar discussions). Then follow the suggestion to pyparsing.
An alternative would be to iterate until all ^ have been exhausted, no?
Ruby code:
# assuming str contains the string of data with the expressions you wish to convert
while str.include?('^')
  str.gsub!(/(\w+)\^(\w+)/, 'pow(\1,\2)')
end