This question already has answers here:
When are parentheses required around a tuple?
(3 answers)
Closed 8 years ago.
So I stumbled into a particular behaviour of tuples in python that I was wondering if there is a particular reason for it happening.
While we are perfectly capable of assigning a tuple to a variable without
explicitely enclosing it in parentheses:
>>> foo_bar_tuple = "foo","bar"
>>>
we are not able to print or check in a conditional if statement the variable containing
the tuple in the previous fashion (without explicitely typing the parentheses):
>>> print foo_bar_tuple == "foo","bar"
False bar
>>> if foo_bar_tuple == "foo","bar": pass
SyntaxError: invalid syntax
>>>
>>> print foo_bar_tuple == ("foo","bar")
True
>>>
>>> if foo_bar_tuple == ("foo","bar"): pass
>>>
Does anyone why?
Thanks in advance and although I didn't find any similar topic please inform me if you think it is a possible dublicate.
Cheers,
Alex
It's because the expressions separated by commas are evaluated before the whole comma-separated tuple (which is an "expression list" in the terminology of the Python grammar). So when you do foo_bar_tuple=="foo", "bar", that is interpreted as (foo_bar_tuple=="foo"), "bar". This behavior is described in the documentation.
You can see this if you just write such an expression by itself:
>>> 1, 2 == 1, 2 # interpreted as "1, (2==1), 2"
(1, False, 2)
The SyntaxError for the unparenthesized tuple is because an unparenthesized tuple is not an "atom" in the Python grammar, which means it's not valid as the sole content of an if condition. (You can verify this for yourself by tracing around the grammar.)
Considering an example of if 1 == 1,2: which should cause SyntaxError, following the full grammar:
if 1 == 1,2:
Using the if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite], we get to shift the if keyword and start parsing 1 == 1,2:
For the test rule, only first production matches:
test: or_test ['if' or_test 'else' test] | lambdef
Then we get:
or_test: and_test ('or' and_test)*
And step down into and_test:
and_test: not_test ('and' not_test)*
Here we just step into not_test at the moment:
not_test: 'not' not_test | comparison
Notice, our input is 1 == 1,2:, thus the first production doesn't match and we check the other one: (1)
comparison: expr (comp_op expr)*
Continuing on stepping down (we take the only the first non-terminal as the zero-or-more star requires a terminal we don't have at all in our input):
expr: xor_expr ('|' xor_expr)*
xor_expr: and_expr ('^' and_expr)*
and_expr: shift_expr ('&' shift_expr)*
shift_expr: arith_expr (('<<'|'>>') arith_expr)*
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
Now we use the power production:
power: atom trailer* ['**' factor]
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' |
NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False')
And shift NUMBER (1 in our input) and reduce. Now we are back at (1) with input ==1,2: to parse. == matches comp_op:
comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
So we shift it and reduce, leaving us with input 1,2: (current parsing output is NUMBER comp_op, we need to match expr now). We repeat the process for the left-hand side, going straight to the atom nonterminal and selecting the NUMBER production. Shift and reduce.
Since , does not match any comp_op we reduce the test non-terminal and receive 'if' NUMBER comp_op NUMBER. We need to match else, elif or : now, but we have , so we fail with SyntaxError.
I think the operator precedence table summarizes this nicely:
You'll see that comparisons come before expressions, which are actually dead last.
in, not in, is, is not, Comparisons, including membership tests
<, <=, >, >=, <>, !=, == and identity tests
...
(expressions...), [expressions...], Binding or tuple display, list display,
{key: value...}, `expressions...` dictionary display, string conversion
Related
Given:
e = 'a' + 'b' + 'c' + 'd'
How do I write the above in two lines?
e = 'a' + 'b' +
'c' + 'd'
What is the line? You can just have arguments on the next line without any problems:
a = dostuff(blahblah1, blahblah2, blahblah3, blahblah4, blahblah5,
blahblah6, blahblah7)
Otherwise you can do something like this:
if (a == True and
b == False):
or with explicit line break:
if a == True and \
b == False:
Check the style guide for more information.
Using parentheses, your example can be written over multiple lines:
a = ('1' + '2' + '3' +
'4' + '5')
The same effect can be obtained using explicit line break:
a = '1' + '2' + '3' + \
'4' + '5'
Note that the style guide says that using the implicit continuation with parentheses is preferred, but in this particular case just adding parentheses around your expression is probably the wrong way to go.
From PEP 8 -- Style Guide for Python Code:
The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.
Backslashes may still be appropriate at times. For example, long, multiple with-statements cannot use implicit continuation, so backslashes are acceptable:
with open('/path/to/some/file/you/want/to/read') as file_1, \
open('/path/to/some/file/being/written', 'w') as file_2:
file_2.write(file_1.read())
Another such case is with assert statements.
Make sure to indent the continued line appropriately. The preferred place to break around a binary operator is after the operator, not before it. Some examples:
class Rectangle(Blob):
def __init__(self, width, height,
color='black', emphasis=None, highlight=0):
if (width == 0 and height == 0 and
color == 'red' and emphasis == 'strong' or
highlight > 100):
raise ValueError("sorry, you lose")
if width == 0 and height == 0 and (color == 'red' or
emphasis is None):
raise ValueError("I don't think so -- values are %s, %s" %
(width, height))
Blob.__init__(self, width, height,
color, emphasis, highlight)file_2.write(file_1.read())
PEP8 now recommends the opposite convention (for breaking at binary operations) used by mathematicians and their publishers to improve readability.
Donald Knuth's style of breaking before a binary operator aligns operators vertically, thus reducing the eye's workload when determining which items are added and subtracted.
From PEP8: Should a line break before or after a binary operator?:
Donald Knuth explains the traditional rule in his Computers and Typesetting series: "Although formulas within a paragraph always break after binary operations and relations, displayed formulas always break before binary operations"[3].
Following the tradition from mathematics usually results in more readable code:
# Yes: easy to match operators with operands
income = (gross_wages
+ taxable_interest
+ (dividends - qualified_dividends)
- ira_deduction
- student_loan_interest)
In Python code, it is permissible to break before or after a binary operator, as long as the convention is consistent locally. For new code Knuth's style is suggested.
[3]: Donald Knuth's The TeXBook, pages 195 and 196
The danger in using a backslash to end a line is that if whitespace is added after the backslash (which, of course, is very hard to see), the backslash is no longer doing what you thought it was.
See Python Idioms and Anti-Idioms (for Python 2 or Python 3) for more.
Put a \ at the end of your line or enclose the statement in parens ( .. ). From IBM:
b = ((i1 < 20) and
(i2 < 30) and
(i3 < 40))
or
b = (i1 < 20) and \
(i2 < 30) and \
(i3 < 40)
You can break lines in between parenthesises and braces. Additionally, you can append the backslash character \ to a line to explicitly break it:
x = (tuples_first_value,
second_value)
y = 1 + \
2
From the horse's mouth: Explicit line
joining
Two or more physical lines may be
joined into logical lines using
backslash characters (\), as follows:
when a physical line ends in a
backslash that is not part of a string
literal or comment, it is joined with
the following forming a single logical
line, deleting the backslash and the
following end-of-line character. For
example:
if 1900 < year < 2100 and 1 <= month <= 12 \
and 1 <= day <= 31 and 0 <= hour < 24 \
and 0 <= minute < 60 and 0 <= second < 60: # Looks like a valid date
return 1
A line ending in a backslash cannot
carry a comment. A backslash does not
continue a comment. A backslash does
not continue a token except for string
literals (i.e., tokens other than
string literals cannot be split across
physical lines using a backslash). A
backslash is illegal elsewhere on a
line outside a string literal.
If you want to break your line because of a long literal string, you can break that string into pieces:
long_string = "a very long string"
print("a very long string")
will be replaced by
long_string = (
"a "
"very "
"long "
"string"
)
print(
"a "
"very "
"long "
"string"
)
Output for both print statements:
a very long string
Notice the parenthesis in the affectation.
Notice also that breaking literal strings into pieces allows to use the literal prefix only on parts of the string and mix the delimiters:
s = (
'''2+2='''
f"{2+2}"
)
One can also break the call of methods (obj.method()) in multiple lines.
Enclose the command in parenthesis "()" and span multiple lines:
> res = (some_object
.apply(args)
.filter()
.values)
For instance, I find it useful on chain calling Pandas/Holoviews objects methods.
It may not be the Pythonic way, but I generally use a list with the join function for writing a long string, like SQL queries:
query = " ".join([
'SELECT * FROM "TableName"',
'WHERE "SomeColumn1"=VALUE',
'ORDER BY "SomeColumn2"',
'LIMIT 5;'
])
Taken from The Hitchhiker's Guide to Python (Line Continuation):
When a logical line of code is longer than the accepted limit, you need to split it over multiple physical lines. The Python interpreter will join consecutive lines if the last character of the line is a backslash. This is helpful in some cases, but should usually be avoided because of its fragility: a white space added to the end of the line, after the backslash, will break the code and may have unexpected results.
A better solution is to use parentheses around your elements. Left with an unclosed parenthesis on an end-of-line the Python interpreter will join the next line until the parentheses are closed. The same behaviour holds for curly and square braces.
However, more often than not, having to split a long logical line is a sign that you are trying to do too many things at the same time, which may hinder readability.
Having that said, here's an example considering multiple imports (when exceeding line limits, defined on PEP-8), also applied to strings in general:
from app import (
app, abort, make_response, redirect, render_template, request, session
)
I'm writing a profile manager for Stellaris game and I've hit a wall with their format in which they keep the info about mods and settings.
Mod file:
name="! (Ship Designer UI Fix) !"
path="mod/ship_designer_ui_fix"
tags={
"Fixes"
}
remote_file_id="879973318"
supported_version="1.6"
Settings:
language="l_english"
graphics={
size={
x=1920
y=1200
}
min_gui={
x=1920
y=1200
}
gui_scale=1.000000
gui_safe_ratio=1.000000
refreshRate=59
fullScreen=no
borderless=no
display_index=0
shadowSize=2048
multi_sampling=8
maxanisotropy=16
gamma=50.000000
vsync=yes
}
last_mods={
"mod/ship_designer_ui_fix.mod"
"mod/ugc_720237457.mod"
"mod/ugc_775944333.mod"
}
I've thought pyparsing will be of help there (and it probably will be) but it has been a long time since I've actually did something like this and this I'm clueless atm.
I've got to extract the simple key=value but I'm struggling to actually move from there to be able to extract the arrays, not to mention the multilevel arrays.
lbrack = Literal("{").suppress()
rbrack = Literal("}").suppress()
equals = Literal("=").suppress()
nonequals = "".join([c for c in printables if c != "="]) + " \t"
keydef = ~lbrack + Word(nonequals) + equals + restOfLine
conf = Dict( ZeroOrMore( Group(keydef) ) )
tokens = conf.parseString(data)
I haven't got very far as you can see. Can anyone point me towards next step? I'm not asking a finished and working solution for the whole thing - it would move me forward a lot but where's the fun in that :)
Well, it is awfully tempting to just dive in and write this parser, but you want some of that fun for yourself, that's great.
Before writing any code, write a BNF. That way you'll write a decent and robust parser, instead of just "everything that's not an equals sign must be an identifier".
There are a lot of "something = something" bits here, look at the kinds of things on the right- and left-hand sides of the '='. The left-hand sides all look like pretty well-mannered identifiers: alphas, underscores. I could envision numeric digits too, as long as they aren't the leading character. So let's say the left-hand sides will be identifiers:
identifier_leading = 'A'..'Z' 'a'..'z' '_'
identifier_body = identifier_leading '0'..'9'
identifier ::= identifier_leading + identifier_body*
The right-hand sides are a mix of things:
integers
floats
'yes' or 'no' booleans
quoted strings
something in braces
The "something in braces" are either a list of quoted strings, or a list of 'identifer = value' pairs. I'll skip the awful details of defining floats and integers and quoted strings, let's just assume we have those defined:
boolean_value ::= 'yes' | 'no'
value ::= float | integer | boolean_value | quoted_string | string_list_in_braces | key_value_list_in_braces
string_list_in_braces ::= '{' quoted_string * '}'
key_value ::= identifier '=' value
key_value_list_in_braces ::= '{' key_value* '}'
You will have to use a pyparsing Forward to declare value before it is fully defined, since it is used in key_value, but key_value is used in key_value_list_in_braces, which is used to define value - a recursive grammar. You are already familiar with the Dict(OneOrMore(Group(named_item))) pattern, and this should be good to give you a structure of fields that are accessible by name. For identifier, a Word would work, or you could just use the pre-defined pyparsing_common.identifier which was introduced as part of the pyparsing_common namespace class last year.
The translation from BNF to pyparsing should be pretty much 1-to-1 from here. For that matter, from the BNF, you could use PLY, ANTLR, or another parsing lib too. The BNF is really worth taking the 1/2 hour or 1/2 day to get sorted out.
The following line of code outputs SyntaxError: invalid syntax
for (i in range(-WIDTH,WIDTH)):
The next one works without errors. I have no idea what the syntax error is supposed to be here. So I am just asking out of curiosity. My guess is that the brackets prevent the expression from being evaluated.
for i in range(-WIDTH,WIDTH):
Your parentheses are essentially just confusing the parser.
There are a couple of reasons you could have an open paren after a for, most notably using tuple unpacking:
>>> for (x, y) in zip(range(5), range(6, 11)):
... print(x, '->', y)
...
0 -> 6
1 -> 7
2 -> 8
3 -> 9
4 -> 10
Additionally, parens can be used in loads of places in Python for simple grouping, such as when breaking up long lines:
>>> s = ("This is "
... "a really awkward way "
... "to write a "
... "long string "
... "over several lines")
>>>
>>> s
'This is a really awkward way to write a long string over several lines'
So the parser won't really complain about it.
However, as you know, for is supposed to read like this:
for_stmt ::= "for" target_list "in" expression_list ":" suite
["else" ":" suite]
Which means that by grouping this way, you're constructing an invalid loop. Essentially, yours reads that there is no in because it's grouped into the target_list by your parentheses. Hope this makes sense.
A way to see more clearly what's happening: write the rest of your for loop (in expression_list) after your close paren. Then you will get a clearer error about how it is interpreting this statement.
>>> for (i in range(-WIDTH, WIDTH)) in range(-WIDTH, WIDTH):
... print(i)
...
File "<stdin>", line 1
SyntaxError: can't assign to comparison
So it will let you do it, but the result of x in y will be a boolean, which cannot be the target of an assignment. The original error you got is because it got to your : before it found your in, which is plain old invalid syntax, as if you just wrote for x:.
I have a boolean expression string, that I would like to take apart:
condition = "a and (b or (c and d))"
Or let's say:
I want to be able to access the string contents between two parenthesis.
I want following outcome:
"(b or (c and d))"
"(c and d)"
I've tried the following with regular expressions (not really working)
x = re.match(".*(\(.*\))", condition)
print x.group(1)
Question:
What is the nicest way to take a boolean expression string apart?
This is the sort of thing you can't do with a simple regex. You need to actually parse the text. pyparsing is apparently excellent for doing that.
Like everyone said, you need a parser.
If you don't want to install one, you can start from this simple top-down parser (take the last code sample here)
Remove everything not related to your need (+, -, *, /, is, lambda, if, else, ...). Just keep parenthesis, and, or.
You will get a binary tree structure generated from your expression.
The tokenizer use the build-in tokenize (import tokenize), which is a lexical scanner for Python source code but works just fine for simple cases like yours.
If your requirements are fairly simple, you don't really need a parser.
Matching parentheses can easily be achieved using a stack.
You could do something like the following:
condition = "a and (b or (c and d))"
stack = []
for c in condition:
if c != ')':
stack.append(c)
else:
d = c
contents = []
while d != '(':
contents.insert(0, d)
d = stack.pop()
contents.insert(0, d)
s = ''.join(contents)
print(s)
stack.append(s)
produces:
(c and d)
(b or (c and d))
Build a parser:
Condition ::= Term Condition'
Condition' ::= epsilon | OR Term Condition'
Term ::= Factor Term'
Term' ::= epsilon | AND Factor Term'
Factor ::= [ NOT ] Primary
Primary ::= Literal | '(' Condition ')'
Literal ::= Id
I have been trying to convert a simple execution such as:
for x in xrange(10):
if x % 2 == 0:
print x, 'is even'
to a one liner version:
for x in xrange(10): if x % 2 == 0: print x, 'is even'
which gives me:
File "foo.py", line 1
for x in xrange(10): if x % 2 == 0: print x, 'is even'
^
SyntaxError: invalid syntax
I don't see any ambiguity in here. Is there a particular reason why this fails?
It's simply not allowed by the grammar. The relevant productions are:
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
As you can see, after a for you can either put a simple statement or a "suite", i.e. an indented block. An if statement is a compound statement, not a simple one.
Two lines are the minimum to express this program:
for x in xrange(10):
if x % 2 == 0: print x, 'is even'
(Of course, you can write equivalent programs that take only one line, such as
for x in xrange(0, 10, 2): print x, "is even"
or any of the other one-liners posted in response to this question.)
From the formal grammar for 2.7:
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt |
import_stmt | global_stmt | exec_stmt | assert_stmt)
If the suite had allowed a compound_stmt then what you suggest would be accepted. But that would also allow something like this:
if True: try:
# do something
except:
# handle
foo()
Is that except outside the enclosing if? Is the call to foo outside the enclosing if? I think this shows that we really don't want in-lining compound statements to be allowed by the formal grammar. Simply adding suite: compound_stmt makes the grammar ambiguous as I read it, where the same code can be interpreted with two or more different meanings, neither disprovable.
Basically, it's by design that what you ask is a parse error. Reworking the formal grammar could allow the code in your example to work without other funny stuff, but it requires careful attention to ambiguity and other problems.
See also Dangling Else, a grammar problem that afflicted the standard Algol-60 language. It's not always easy to find these kinds of problems, so a healthy fear of changing a working grammar is a good thing.
try something like :
In [14]: from __future__ import print_function
In [17]: for x in xrange(10): print (x,'is even') if x%2==0 else None
....:
0 is even
2 is even
4 is even
6 is even
8 is even
If you want something similar, use a list comprehension:
print '\n'.join('{0} is even'.format(x) for x in xrange(10) if x % 2 == 0)
Prints:
0 is even
2 is even
4 is even
6 is even
8 is even
You can try this:
for y in (x for x in xrange(10) if x % 2 == 0): print y
The Python docs about compound statements states a reason for the decision on why does the grammar disallow this:
A suite can be one or more semicolon-separated simple statements on
the same line as the header, following the header’s colon, or it can
be one or more indented statements on subsequent lines. Only the
latter form of suite can contain nested compound statements; the
following is illegal, mostly because it wouldn’t be clear to which if
clause a following else clause would belong:
if test1: if test2: print x
So in wberry's answer it was right about the Dangling Else.