I'm studying operators in Python and came across multiple concepts that decide order of evaluation when there are multiple operators in an expression.
I understand the concept of operator precedence, and came across the operator precedence table in the Python docs. A few things there confused me:
Why are assignment and augmented-assignment operators not included in the list?
What really counts as an operator in Python? (And is there any difference between operator and a keyword).
The latter question stems from the categorization of operators that I've read at various places on the internet, which sorts operators into the following categories:
Arithmetic Operators
Comparison Operators
Assignment Operators
Logical Operators
Bitwise Operators
Membership Operators
Identity Operators
But when I saw keywords like lambda, if-else in the operator precedence table in the Python documentation, it confused me. Moreover the operator mapping table in documentation for operator module includes keywords like del which are part of neither the usual categorization on the internet nor the precedence table in Python docs.
My final question is "is there any grouping that can be done about the
categories of operators and their behavior (precedence, chaining, associativity, etc) in Python? Or should I be studying every operator and its behavior independently?"
Why are assignment and augmented-assignment operators not included in the list?
Because they're not true operators. We sometimes call them operators for convenience, but they cannot form expressions and therefore do not have precedence relative to real operators.
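A minimal sketch of that distinction, using only the standard ast module and the built-in compile(). The helper name is_valid_expression is made up for illustration:

```python
import ast

# An assignment parses as a statement node, not as an expression node.
tree = ast.parse("x = 1")
is_statement = isinstance(tree.body[0], ast.Assign)

def is_valid_expression(source):
    """Return True if `source` compiles as a single expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(is_statement)                    # True
print(is_valid_expression("x + 1"))    # True
print(is_valid_expression("(x = 1)"))  # False
print(is_valid_expression("x += 1"))   # False
```

Because = and += cannot appear inside an expression, there is nothing for a precedence table to rank them against.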
What really counts as an operator in Python? (And is there any difference between operator and a keyword).
According to the doc on operators, it seems to be any punctuation that can form an expression. For simplicity, I prefer to define an operator as any punctuation or keyword that can form an expression.
But when I saw keywords like lambda, if-else in the operator precedence table in the Python documentation, it confused me.
Those are keywords that can form expressions, therefore they must have operator precedence. Note that if-else can be an expression or a block statement, depending on the syntax:
# Expression
a if condition else b
# Statement
if condition:
    pass
else:
    pass
Moreover the operator mapping table in documentation for operator module includes keywords like del which are part of neither the usual categorization on the internet nor the precedence table in Python docs.
del is not an operator, because it's used only to get a side-effect. However, it can potentially modify an object in place, so it makes sense to include a function in the operator library that does the same thing. The other use for del is to remove a variable, which is something a function can't do.
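For illustration, here is how operator.delitem mirrors the in-place form of del. The sample list and dict are arbitrary:

```python
import operator

items = ["a", "b", "c"]
operator.delitem(items, 0)      # same effect as: del items[0]
print(items)                    # ['b', 'c']

counts = {"x": 1, "y": 2}
operator.delitem(counts, "x")   # same effect as: del counts["x"]
print(counts)                   # {'y': 2}
```

There is no function equivalent for del name, since a function has no way to unbind a variable in its caller's scope.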
My final question is "is there any grouping that can be done about the
categories of operators and their behavior (precedence, chaining, associativity, etc) in Python? Or should I be studying every operator and its behavior independently?"
Operators can always be combined to form larger expressions, so they must have precedence and associativity to define the meaning of a non-trivial expression. Non-operator syntax usually forms either a statement or a group of statements.
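A few quick checks of how precedence, associativity, and chaining play out in practice. The variable names are only labels for this sketch:

```python
precedence = -2 ** 2        # parsed as -(2 ** 2)
right_assoc = 2 ** 3 ** 2   # parsed as 2 ** (3 ** 2)
left_assoc = 10 - 4 - 3     # parsed as (10 - 4) - 3
chained = 3 > 2 > 1         # parsed as (3 > 2) and (2 > 1)
nested = (3 > 2) > 1        # True > 1, so no chaining here
negated = not 1 == 2        # parsed as not (1 == 2)

print(precedence, right_assoc, left_assoc, chained, nested, negated)
# -4 512 3 True False True
```

Chaining applies to the whole comparison group (including in and is), so the behavior can largely be learned per precedence level rather than per operator.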
Related
I'm learning how parsers work by creating a simple recursive descent parser. However I'm having a problem defining my grammar to be LL(1). I want to be able to parse the following two statements:
a = 1
a + 1
To do this I've created the following grammar rules:
statement: assignment | expression
assignment: NAME EQUALS expression
expression: term [(PLUS|MINUS) term]
term: NAME | NUMBER
However, this leads to ambiguity with an LL(1) parser: when a NAME token is encountered in the statement rule, the parser cannot tell whether it starts an assignment or an expression without further look-ahead.
Python's grammar is LL(1) so I know this is possible to do but I can't figure out how to do it. I've looked at Python's grammar rules found here (https://docs.python.org/3/reference/grammar.html) but I'm still not sure how they implement this.
Any help would be greatly appreciated :)
Just treat = as an operator with very low precedence. However (unless you want a language like C where = really is an operator with very low precedence), you need to exclude it from internal (e.g. parenthetic) expressions.
If you had only multiplication and addition, you could use:
expression: factor ['+' factor]
factor: term ['*' term]
term: ID | NUMBER | '(' expression ')'
That is a guide for operator precedence: * has higher precedence because the arguments to + can include *s but not vice versa. So we could just add assignment:
statement: expression ['=' expression]
Unfortunately, that would allow, for example:
(a + 1) = b
which is undesirable. So it needs to be eliminated, but it is possible to eliminate it when the production is accepted (by a check of the form of the first expression), rather than in the grammar itself. As I understand it, that's what the Python parser does; see the long comment about test and keywords.
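A rough sketch of that idea as a recursive descent parser: the grammar accepts expression ['=' expression], and the left side is checked only after it has been parsed. The tokenizer, the Parser class, and the tuple-based tree are all invented for this example, and the optional [...] parts are treated as repetition so that a + b + c parses. This is not how CPython itself is implemented.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def tokenize(text):
    """Split text into (kind, value) pairs; kind is NUMBER, ID, or the symbol."""
    tokens = []
    for number, name, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("NUMBER", number))
        elif name:
            tokens.append(("ID", name))
        elif symbol.strip():
            tokens.append((symbol, symbol))
    tokens.append(("EOF", ""))
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def take(self, kind):
        tok_kind, value = self.tokens[self.pos]
        if tok_kind != kind:
            raise SyntaxError(f"expected {kind}, got {tok_kind}")
        self.pos += 1
        return value

    def statement(self):
        # statement: expression ['=' expression]
        left = self.expression()
        if self.peek() == "=":
            self.take("=")
            # The grammar let any expression through; reject non-names here.
            if left[0] != "ID":
                raise SyntaxError("cannot assign to expression")
            left = ("assign", left[1], self.expression())
        self.take("EOF")
        return left

    def expression(self):
        # expression: factor ('+' factor)*
        node = self.factor()
        while self.peek() == "+":
            self.take("+")
            node = ("+", node, self.factor())
        return node

    def factor(self):
        # factor: term ('*' term)*
        node = self.term()
        while self.peek() == "*":
            self.take("*")
            node = ("*", node, self.term())
        return node

    def term(self):
        # term: ID | NUMBER | '(' expression ')'
        kind = self.peek()
        if kind == "(":
            self.take("(")
            node = self.expression()
            self.take(")")
            return node
        if kind in ("ID", "NUMBER"):
            return (kind, self.take(kind))
        raise SyntaxError(f"unexpected {kind}")

print(Parser("a = 1").statement())   # ('assign', 'a', ('NUMBER', '1'))
print(Parser("a + 1").statement())   # ('+', ('ID', 'a'), ('NUMBER', '1'))
```

Parsing (a + 1) = b with this sketch raises SyntaxError in statement(), after the left-hand expression has already been accepted by the grammar. Only one token of look-ahead is ever needed.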
If you used an LR(1) parser instead, you wouldn't have this problem.
When I input a logical expression as a string and evaluate it, I get the proper output:
str1 = "(1|0)&(1|1&(0|1))"
print(eval(str1))
o/p: 1
But when I include the not operator as ~ in the same way, the output is wrong:
str1 = "(~0|~1)&(~1|0)"
print(eval(str1))
o/p: -2
Is there any other way of representing the not operator here to get the proper answer?
These are not logical expressions but bitwise expressions. That is the reason why ~0 == -1. Instead you can look for a parser that parses these expressions the way you want. A quick google search showed up this stackoverflow question.
Sympy seems to implement a similar thing: sympy logic
The logic module for SymPy allows to form and manipulate logic expressions using symbolic and boolean values
&, | and ~ are bitwise operators.
For logic operators use and, or and not.
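A short comparison of the two operator families on the values from the question. The variable names are only for this sketch:

```python
# Bitwise operators act on the bits of integers.
print(~0, ~1)                        # -1 -2
bitwise = (~0 | ~1) & (~1 | 0)
print(bitwise)                       # -2

# Logical operators act on truth values.
print(not 0, not 1)                  # True False
logical = (not 0 or not 1) and (not 1 or 0)
print(logical)                       # 0
print(bool(logical))                 # False
```

Note that and / or return one of their operands rather than a strict bool, which is why the logical result prints as 0 here.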
If your intention is to do logical operations, prefer to use the appropriate boolean values:
True / False
# "0|not 1" is a syntax error because not binds more loosely than |, so use or instead
str1 = "(not 0 or not 1) and (not 1 or 0)"
print(eval(str1))
In Python, logical NOT is spelled not.
Ref : https://docs.python.org/2/library/stdtypes.html
I can't find in PEPs information about style of bitwise operators (|, &), in this code in particular:
class MySplashScreen(wx.SplashScreen):
    def __init__(self, parent=None):
        wx.SplashScreen.__init__(self, logo.getBitmap(), wx.SPLASH_CENTRE_ON_SCREEN | wx.SPLASH_TIMEOUT, 2000, parent)
Should I use spaces in this case (wx.SPLASH_CENTRE_ON_SCREEN | wx.SPLASH_TIMEOUT)?
I would definitely use spaces on either side. Otherwise, it'd be hard to spot the | in between the variable/constant names.
The place to find this, if it existed, would be Whitespace in Expressions in PEP 8. However, these operators are not mentioned:
Always surround these binary operators with a single space on either side: assignment (=), augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), Booleans (and, or, not).
I think there's a good reason for this. While you almost certainly want the space in wx.SPLASH_CENTRE_ON_SCREEN | wx.SPLASH_TIMEOUT, I'm not sure you want it in a|b. In fact, an expression like a&b | c&d seems a pretty good parallel to the recommended x*x + y*y.
The reason you want it here has nothing to do with the operator being |, but with the value being wx.SPLASH_CENTRE_ON_SCREEN. In fact, I think you'd make the same decision with BIG_LONG_CONSTANT_1 + BIG_LONG_CONSTANT_2. So, maybe there should be an additional rule in the style guide about whitespace around operators when the operands are ugly capitalized constants.
But meanwhile, I don't think there is, or needs to be, a specific rule about the bitwise operators. Treat them the same way you do the arithmetic operators. (And note that there is no specific rule for whether or not to put spaces around, e.g., +, except in the case where operators with different priorities are used in the same expression. In fact, you see it both ways within PEP8 itself. That implies that it's acceptable either way in general, and you have to use your own judgment in specific cases.)
All that said, the style checker pep8 flags both bitwise and arithmetic operators without whitespace with an E225. And it even flags the "different priorities" examples like x = x/2 - 1 (which PEP 8 lists as "good") with the optional E226 warning. See missing_whitespace_around_operator for details. I don't think this counts as any kind of official endorsement—but I think "I put the spaces here so the code would pass the style checker we've chosen to use for this project" is a pretty valid reason. (But you might want to check alternatives like pep8ify, and see if pylint, pyflakes, etc. have anything to say on the topic.)
Does anyone have some good resources on learning more advanced regular expressions?
I keep having problems where I want to make sure something is not enclosed in quotation marks.
i.e. I am trying to make an expression that will match lines in a python file containing an equality, i.e.
a = 4
which is easy enough, but I am having trouble devising an expression that would be able to separate out multiple terms or ones wrapped in quotes like these:
a, b = b, a
a,b = "You say yes, ", "i say no"
Parsing code with regular expressions is generally not a good idea, as the grammar of a programming language is not a regular language. I'm not much of a python programmer, but I think you would be a lot better off parsing python code with python modules such as this one or this one
I think that you have to tokenize the expression for correct evaluation, but you can detect the pattern using the following regex:
r'\s*(\w+)(\s*,\s*\w+)*\s*=\s*(.*?)(\s*,\s*.*?)*$'
If group(2) and group(4) are not empty you have to tokenize the expression
Note that if you have
a,b = f(b,a), g(a,b)
It is hard to analyze
Python has an excellent Language Reference that also includes descriptions of the lexical analysis and syntax.
In your case both statements are assignments with a list of targets on the left-hand side and a list of expressions on the right-hand side.
But since parts of that grammar are context-free and not regular, you can't use regular expressions (unless they support some kind of recursive patterns). So better use a proper parser as Jonas H suggested.
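One way to do that with the standard library is to let the ast module parse the source and then pick out the assignment nodes. This is only a sketch: describe_assignments is a made-up helper, it looks at top-level statements only, and ast.get_source_segment requires Python 3.8 or later.

```python
import ast

def describe_assignments(source):
    """Return (targets, value) source strings for each top-level assignment."""
    found = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            targets = [ast.get_source_segment(source, t) for t in node.targets]
            value = ast.get_source_segment(source, node.value)
            found.append((targets, value))
    return found

code = (
    'a = 4\n'
    'a, b = b, a\n'
    'a,b = "You say yes, ", "i say no"\n'
    'print("x = 1")\n'   # an = inside a string literal is not an assignment
)
found = describe_assignments(code)
for targets, value in found:
    print(targets, "<-", value)
```

Commas and equals signs inside string literals never confuse this approach, because the parser has already separated strings from syntax.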
I am converting some MATLAB code to C. Currently I have some lines that raise powers using ^, which is rather easy to handle with something along the lines of \(?(\w*)\)?\^\(?(\w*)\)?
This works fine for converting (glambda)^(galpha), using the sub routine in Python: pattern.sub(r'pow(\g<1>,\g<2>)', '(glambda)^(galpha)')
My problem comes with nested parentheses.
So I have a string like:
glambdastar^(1-(1-gphi)*galpha)*(glambdaq)^(-(1-gphi)*galpha);
And I can not figure out how to convert that line to:
pow(glambdastar,(1-(1-gphi)*galpha))*pow(glambdaq,(-(1-gphi)*galpha));
Unfortunately, regular expressions aren't the right tool for handling nested structures. There are some regular expression engines (such as .NET) which have some support for recursion, but most, including the Python engine, do not, and can only handle as many levels of nesting as you build into the expression (which gets ugly fast).
What you really need for this is a simple parser. For example, iterate over the string counting parentheses and storing their locations in a list. When you find a ^ character, put the most recently closed parenthesis group into a "left" variable, then watch the group formed by the next opening parenthesis. When it closes, use it as the "right" value and print the pow(left, right) expression.
I think you can use recursion here.
Once you figure out the Left and Right parts, pass each of those to your function again.
The base case would be that no ^ operator is found, so you will not need to add the pow() function to your result string.
The function will return a string with all the correct pow()'s in place.
I'll come up with an example of this if you want.
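A possible sketch of the recursive approach, combined with parenthesis counting. The function names are invented here, and it assumes there is no whitespace around ^ and that each operand is a name, a number, a parenthesized group, or a function call. It keeps the original parentheses around operands, which is redundant but still valid C.

```python
def _left_start(s, end):
    """Index where the operand that ends just before `end` begins."""
    i = end - 1
    if i >= 0 and s[i] == ")":
        depth = 0
        while i >= 0:               # walk back to the matching '('
            if s[i] == ")":
                depth += 1
            elif s[i] == "(":
                depth -= 1
                if depth == 0:
                    break
            i -= 1
        if i < 0:
            raise ValueError("unbalanced parentheses")
        i -= 1
    # A bare name/number, or a function name in front of a group.
    while i >= 0 and (s[i].isalnum() or s[i] in "_."):
        i -= 1
    return i + 1

def _right_end(s, start):
    """Index just past the operand that begins at `start`."""
    i = start
    while i < len(s) and (s[i].isalnum() or s[i] in "_."):
        i += 1
    if i < len(s) and s[i] == "(":
        depth = 0
        while i < len(s):           # walk forward to the matching ')'
            if s[i] == "(":
                depth += 1
            elif s[i] == ")":
                depth -= 1
                if depth == 0:
                    return i + 1
            i += 1
        raise ValueError("unbalanced parentheses")
    return i

def convert_powers(expr):
    """Rewrite every a^b in expr as pow(a,b), recursing into the operands."""
    pos = expr.find("^")
    if pos == -1:
        return expr                 # base case: no ^ left
    start = _left_start(expr, pos)
    end = _right_end(expr, pos + 1)
    if start == pos or end == pos + 1:
        raise ValueError("missing operand around ^")
    left = convert_powers(expr[start:pos])
    right = convert_powers(expr[pos + 1:end])
    return convert_powers(expr[:start] + f"pow({left},{right})" + expr[end:])

line = "glambdastar^(1-(1-gphi)*galpha)*(glambdaq)^(-(1-gphi)*galpha);"
print(convert_powers(line))
# pow(glambdastar,(1-(1-gphi)*galpha))*pow((glambdaq),(-(1-gphi)*galpha));
```

Repeated powers such as a^b^c come out as pow(pow(a,b),c), which matches MATLAB's left-to-right grouping for ^.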
Nested parentheses cannot be described by a regexp and require a full parser (able to understand a grammar, which is something more powerful than a regexp). I do not think there is a regexp-only solution.
See recent discussion function-parser-with-regex-in-python (one of many similar discussions). Then follow the suggestion to pyparsing.
An alternative would be to iterate until all ^ have been exhausted, no?
Ruby code:
# assuming str contains the string of data with the expressions you wish to convert
while str.include?('^')
  # gsub! returns nil when nothing matched; stop then to avoid looping forever
  break unless str.gsub!(/(\w+)\^(\w+)/, 'pow(\1,\2)')
end