Could anyone help me with this python singpath program? - python

First of all, this is not for any class.
I have been working on these 2 programs for a long time and cannot make heads or tails of it. I really want to get past these problems so I can move onto other lessons.
"Create a function that transforms the prefix notation into postfix notation, and postfix notation into prefix notation. The function takes two arguments. The first is a string of an expression without spaces or syntax errors, and the second is another string contains all the operators. Characters not in the second string are regarded as operands. The lengths of all the operators and operands are 1, and all the operators are binary operators."
ex:
>>> fix_trans('ab33c2c11','abc')
'33b211cca'
and Convert to (reverse) Polish notation:
>>> toPolish('(3+5)*(7-2)',D,0)
'*+35-72'

Can you provide any examples of how far you've gotten with this, or what methods haven't worked for you? Also, are you familiar with the shunting-yard algorithm?

Related

Indexing the wrong character for an expression

My program seems to be indexing the wrong character or not at all.
I wrote a basic calculator that allows expressions to be used. It works by having the user enter the expression, then turning it into a list, and indexing the first number at position 0 and then using try/except statements to index number2 and the operator. All this is in a while loop that is finished when the user enters done at the prompt.
The program seems to work fine if I type the expression like this "1+1" but if I add spaces "1 + 1" it cannot index it or it ends up indexing the operator if I do "1+1" followed by "1 + 1".
I have asked in a group chat before and someone told me to use tokenization instead of my method, but I want to understand why my program is not running properly before moving on to something else.
Here is my code:
https://hastebin.com/umabukotab.py
Thank you!
Strings are basically lists of characters. 1+1 contains three characters, whereas 1 + 1 contains five, because of the two added spaces. Thus, when you access the third character in this longer string, you're actually accessing the middle element.
Parsing input is often not easy, and certainly parsing arithmetic expressions can get tricky quite quickly. Removing spaces from the input, as suggested by #Sethroph is a viable solution, but will only go that far. If you all of a sudden need to support stuff like 1+2+3, it will still break.
Another solution would be to split your input on the operator. For example:
input = '1 + 2'
terms = input.split('+') # ['1 ', ' 2'] note the spaces
terms = map(int, terms) # [1, 2] since int() can handle leading/trailing whitespace
output = terms[0] + terms[1]
Still, although this can handle situations like 1 + 2 + 3, it will still break when there's multiple different operators involved, or there are parentheses (but that might be something you need not worry about, depending on how complex you want your calculator to be).
IMO, a better approach would indeed be to use tokenization. Personally, I'd use parser combinators, but that may be a bit overkill. For reference, here's an example calculator whose input is parsed using parsy, a parser combinator library for Python.
You could remove the spaces before processing the string by using replace().
Try adding in:
clean_input = hold_input.replace(" ", "")
just after you create hold_input.

I know of f-strings, but what are r-strings? Are there others?

I started learning python for the first time in an accelerated course on data science a few weeks ago and we were introduced early on to f-strings.
The simple code:
name = 'Tim'
print(f'There are some who call me {name}...')
outputs the string "There are some who call me Tim..."
Through my browsing of various packages out of curiosity, I came upon pages like this one detailing a function you can call in matplotlib to render $\LaTeX$-like expressions within the generated images. In the example code they use something similar to f-strings but with an r instead of an f.
import matplotlib.pyplot as plt
plt.title(r'$\alpha > \beta$')
plt.show()
The resulting (otherwise empty) graph has a title using text which has been formatted similarly to how one would expect using MathJax or $\LaTeX$ with a greek character alpha and a greek character beta.
My questions are the following:
What precisely is an r-string and how does it compare to an f-string? Are r-strings specifically used for matplotlib's mathtext and usetex?
Apart from f-strings and r-strings, are there any other notable similar string variants or alternates that I should familiarize myself with or be made aware of?
An r-string is a raw string.
It ignores escape characters. For example, "\n" is a string containing a newline character, and r"\n" is a string containing a backslash and the letter n.
If you wanted to compare it to an f-string, you could think of f-strings as being "batteries-included." They have tons of flexibility in the ability to escape characters and execute nearly arbitrary expressions. The r-string on the other hand is stripped down and minimalist, containing precisely the characters between its quotation marks.
As far as actually using the things, typically you would use an r-string if you're passing the string into something else that uses a bunch of weird characters or does its own escaping so that you don't have to think too hard about how many backslashes you really need to get everything to work correctly. In your example, they at least needed r-strings to get the \a bit working correctly without double escapes. Note that '$\\alpha > \\beta$' is identical to r'$\alpha > \beta$'.
Since you're using f-strings, I'll assume you have at least Python 3.6. Not all of these options are supported for older versions but any of the following prefixes are valid in Python 3.6+ in any combination of caps and lowers: r, u, f, rf, fr, b, rb, br
The b-strings are binary literals. In Python 2 they do nothing and only exist so that the source code is compatible with Python 3. In Python 3, they allow you to create a bytes object. Strings can be thought of as a view of the underlying bytes, often restricted as to which combinations are allowed. The distinction in types helps to prevent errors from blindly applying text techniques to raw data. In Python 3, note that 'A'==b'A' is False. These are not the same thing.
The u-strings are unicode literals. Strings are unicode by default in Python 3, but the u prefix is allowed for backward compatibility with Python 2. In Python 2, strings are ASCII by default, and the u prefix allows you to include non-ASCII characters in your strings. For example, note the accented character in the French phrase u"Fichier non trouvé".
In the kind of code I write, I rarely need anything beyond r, u, f, and b. Even b is a bit out there. Other people deal with those prefixes every day (presumably). They aren't necessarily anything you need to familiarize yourself with, but knowing they exist and being able to find their documentation is probably a good skill to have.
Just so that it's in an answer instead of buried in a comment, Peter Gibson linked the language specification, and that's the same place I pulled the prefix list from. With your math background, a formal language specification might be especially interesting — depending a little on how much you like algebra and mathematical logic.
Even if it's just for a semantically trivial language like Forth, I think many programmers would enjoy writing a short interpreter and gain valuable insight into how their language of choice works.

Finding the end of a contiguous substring of a string without iteration or RegEx

I'm trying to write an iterative LL(k) parser, and I've gotten strings down pretty well, because they have a start and end token, and so you can just "".join(tokenlist[string_start:string_end]).
Numbers, however, do not, and only consist of .0123456789. They can occur at any given point in a program, have any arbitrary length and are delimited purely by non-numerals.
Some examples, because that definition is pretty vague:
56 123.45/! is 56 and 123.45 followed by two other tokens
565.5345.345 % is 565.5345, 0.345 and two other tokens (incl. whitespace)
The problem I'm trying to solve is how the parser should figure out where a numeric literal ends. (Note that this is a context-free, self-modifying interpretive grammar thus there is no separate lexical analysis to be done.)
I could and have solved this with iteration:
def _next_notinst(self, atindex, subs = DIGITS):
"""return the next index of a char not in subs"""
for i, e in enumerate(self.toklist[atindex:]):
if e not in subs:
return i - len(self.toklist)
else:
break
return self.idx.v
(I don't think I need to clarify the variables, since it's an example and extremely straightforward.)
Great! That works, but there are at least two issues:
It's O(n) for a number with digit-length n. Not ideal.*
The parser class of which this method is a member is already using a while True: to cycle over arbitrary parts of the string, and I would prefer not having remotely nested loops when I don't need to.
From the previous bullet: since the parser uses arbitrary k lookahead and skipahead, parsing each individual token is absolutely not what I want.
I don't want to use RegEx mostly because I don't know it, and using it for this right now would make my code uncomprehendable to me, its creator.
There must be a simple, < O(n) solution to this, that simply collects the contiguous numerals in a string given a starting point, up until a non-numeral.
*Yes, I'm fully aware the parser itself is O(n), but we don't also need the number catenator to be > O(n). If you don't believe me, the string catenator is O(1) because it simply looks for the next unescaped " in the program and then joins all the chars up to that. Can't I do the same thing for numbers?
My other answer was actually erroneous due to lack of testing.
I decided to suck it up and learn a little bit of RegEx just because it's the only other way to solve this.
^([.\d]+[.\d]+|[.\d]) works for what I want, and matches these:
123.43.453""
.234234!/%
but not, for example:
"1233

Python- stuck trying to create a "free hand" calculator

I'm trying to create a calculator program in which the user can type an equation and get an answer. I don't want the full code for this, I just need help with a specific part.
The approach I am trying to take is to have the user input the equation as a string (raw_input) and then I am trying to convert the numbers from their input to integers. After that I need to know how I can get the operands to do what I want them to do depending on which operand the user uses and where it is in the equation.
What are some methods I might use to accomplish this task?
Here is basically what I have right now:
equation_number = raw_input("\nEnter your equation now: ")
[int(d) for d in equation_number if d.isdigit()]
Those lines are just for collecting input and attempting to convert the numbers into integers. Unfortunately, it does not seem to be working very well and .isdigit will only work for positive numbers anyway.
Edit- aong152 mentioned recursive parsing, which I looked into, and it appears to have desirable results:
http://blog.erezsh.com/how-to-write-a-calculator-in-70-python-lines-by-writing-a-recursive-descent-parser/
However, I do not understand the code that the author of this post is using, could anyone familiarize me with the basics of recursive parsing?
The type of program you are trying to make is probably more complicated than you think
The first step would be separating the string into each argument.
Let's say that the user inputs:
1+2.0+3+4
Before you can even convert to ints, you are going to need to split the string up into its components:
1
+
2.0
+
3
+
4
This will require a recursive parser, which (seeing as you are new to python) maybe be a bit of a hurdle.
Assuming that you now have each part seperately as strings,
float("2.0") = 2.0
int(2.0) = 2
Here is a helper function
def num (s):
try:
return int(s)
except exceptions.ValueError:
return int(float(s))
instead of raw_input just use input because raw_input returns a string and input returns ints
This is a very simple calculator:
def calculate():
x = input("Equation: ")
print x
while True:
calculate()
the function takes the input and prints it then the while loop executes it again
im not sure if this is what you want but here you go and also you should make a way to end the loop
After using raw_input() you can use eval() on the result to compute the value of this string. eval() evaluates any valid Python expression and returns the outcome.
But I think this is not to your liking. You probably want to do more by yourself.
So I think you should have a look at the re module to split the input using regular expressions into tokens (sth like numbers and operators). After this you should write a parser which gets the token stream as input. You should decide whether this parser shall just return the computed value (e. g. a number) or maybe an abstract syntax tree, i. e. a data structure which represents the expression in an object-oriented (instead of character-oriented) way. Such an Absy could then be evaluated to get the final result.
Are you familiar with regular expressions? If not, it's probably a good idea to first learn about them. They are the weak, non-recursive cousin of parsing. Don't go deep, just understand the building blocks — A then B, A many times, A or B.
The blog post you found is hard because it implements the parsing by hand. It's using recursive descent, which is the only way to write a parser by hand and keep your sanity, but it's still tricky.
What people do most of the time is only write a high level grammar and use a library (or code generator) to do the hard work of parsing.
Indeed he had an earlier post where he uses a library:
http://blog.erezsh.com/how-to-write-a-calculator-in-50-python-lines-without-eval/
At least the beginning should be very easy. Things to pay attention to:
How precedence arises from the structure of the grammar — add consists of muls, not vice versa.
The moment he adds a rule for parentheses:
atom: neg | number | '(' add ')';
This is where it really becomes recursive!
6-2-1 should parse as (6-2)-1, not 6-(2-1). He doesn't discuss it, but if you look
carefully, it also arises from the structure of the grammar. Don't waste tome on this; just know for future reference that this is called associativity.
The result of parsing is a tree. You can then compute its value in a bottom-up manner.
In the "Calculating!" chapter he does that, but the in a sort of magic way.
Don't worry about that.
To build a calculator yourself, I suggest you strip the problem as much as possible.
Recognizing where numbers end etc. is a bit messy. It could be part of the grammar, or done by a separate pass called lexer or tokenizer.
I suggest you skip it — require the user to type spaces around all operators and parens. Or just assume you're already given a list of the form [2.0, "*", "(", 3.0, "+", -1.0, ")"].
Start with a trivial parser(tokens) function that only handles 3-element expressions — [number, op, number].
Return a single number, the result of the computation. (I previously said parsers output a tree which is processed later. Don't worry about that, returning a number is simpler.)
Write a function that expects either a number or parentheses — in the later case it calls parser().
>>> number_or_expr([1.0, "rest..."])
(1.0, ["rest..."])
>>> number_or_expr(["(", 2.0, "+", 2.0, ")", "rest..."])
(4.0, ["rest..."])
Note that I'm now returning a second value - the remaining part of the input. Change parser() to also use this convention.
Now Rewrite parser() to call number_or_expr() instead of directly assuming tokens[0] and tokens[2] are numbers.
Viola! You now have a (mutually) recursive calculator that can compute anything — it just has to be written in verbose style with parens around everything.
Now stop and admire your code, for at least a day :-) It's still simple but has the essential recursive nature of parsing. And the code structure reflects the grammar 1:1 (which is the nice property of recursive descent. You don't want to know how the other algorithms look).
From here there many improvements possible — support 2+2+2, allow (1), precedence... — but there are 2 ways to go about it:
Improve your code step by step. You'll have to refactor a lot.
Stop working hard and use a parsing library, e.g. pyparsing.
This will allow you to experiment with grammar changes faster.

regular expression help with converting exp1^exp2 to pow(exp1, exp2)

I am converting some matlab code to C, currently I have some lines that have powers using the ^, which is rather easy to do with something along the lines \(?(\w*)\)?\^\(?(\w*)\)?
works fine for converting (glambda)^(galpha),using the sub routine in python pattern.sub(pow(\g<1>,\g<2>),'(glambda)^(galpha)')
My problem comes with nested parenthesis
So I have a string like:
glambdastar^(1-(1-gphi)*galpha)*(glambdaq)^(-(1-gphi)*galpha);
And I can not figure out how to convert that line to:
pow(glambdastar,(1-(1-gphi)*galpha))*pow(glambdaq,-(1-gphi)*galpha));
Unfortunately, regular expressions aren't the right tool for handling nested structures. There are some regular expressions engines (such as .NET) which have some support for recursion, but most — including the Python engine — do not, and can only handle as many levels of nesting as you build into the expression (which gets ugly fast).
What you really need for this is a simple parser. For example, iterate over the string counting parentheses and storing their locations in a list. When you find a ^ character, put the most recently closed parenthesis group into a "left" variable, then watch the group formed by the next opening parenthesis. When it closes, use it as the "right" value and print the pow(left, right) expression.
I think you can use recursion here.
Once you figure out the Left and Right parts, pass each of those to your function again.
The base case would be that no ^ operator is found, so you will not need to add the pow() function to your result string.
The function will return a string with all the correct pow()'s in place.
I'll come up with an example of this if you want.
Nested parenthesis cannot be described by a regexp and require a full parser (able to understand a grammar, which is something more powerful than a regexp). I do not think there is a solution.
See recent discussion function-parser-with-regex-in-python (one of many similar discussions). Then follow the suggestion to pyparsing.
An alternative would be to iterate until all ^ have been exhausted. no?.
Ruby code:
# assuming str contains the string of data with the expressions you wish to convert
while str.include?('^')
str!.gsub!(/(\w+)\^(\w+)/, 'pow(\1,\2)')
end

Categories