I was experimenting in the Python console and noticed that no matter how many spaces you put between the function name and the (), Python still manages to call the function.
>>> def foo():
...     print 'hello'
...
>>> foo ()
hello
>>> foo      ()
hello
How is that possible? Shouldn't that raise some sort of exception?
From the Lexical Analysis documentation on whitespace between tokens:
Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens. Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).
Inverting the last sentence, whitespace is allowed between any two tokens as long as they should not instead be interpreted as one token without the whitespace. There is no limit on how much whitespace is used.
Earlier sections define what constitutes a logical line; the above applies only within a logical line. The following is legal too:
result = (foo
          ())
because the logical line is extended across newlines by the parentheses.
The call expression is a separate series of tokens from what precedes it; foo is just a name to look up in the global namespace. You could equally have looked up the object in a dictionary, or it could have been returned from another call. As such, the () part is two separate tokens, and any amount of whitespace in and around them is allowed.
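If you want to check this yourself, the standard tokenize module will show the tokens for each spelling. This is only a minimal sketch for Python 3; the helper name significant_tokens is made up for illustration.

```python
import io
import tokenize

def significant_tokens(source):
    """Return (type name, text) pairs for the names and operators in source."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NAME, tokenize.OP)
    ]

tight = significant_tokens("foo()\n")
spaced = significant_tokens("foo      ()\n")

print(tight)            # [('NAME', 'foo'), ('OP', '('), ('OP', ')')]
print(tight == spaced)  # True
```

The whitespace never becomes a token, so both spellings hand the parser the same three tokens.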
You should understand that
foo()
in Python is composed of two parts: foo and ().
The first one is a name that in your case is found in the globals() dictionary and the value associated to it is a function object.
An open parenthesis following an expression means that a call operation should be made. Consider for example:
def foo():
    print "Hello"

def bar():
    return foo

bar()()  # Will print "Hello"
So the key point to understand is that () can be applied to whatever expression precedes it. For example, mylist[i]() will get the i-th element of mylist and call it, passing no arguments.
The syntax also allows optional spaces between an expression and the ( character, and there's nothing strange about it. Note that you can also write, for example, p . x to mean p.x.
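A small sketch of the same idea, written so it runs on Python 3 as well. The names greet, handlers, registry and Point are invented for the example.

```python
def greet():
    return "hello"

class Point(object):
    x = 3

handlers = [greet]            # a list holding the function object
registry = {"greet": greet}   # a dict holding the same object
p = Point()

results = [
    greet (),                 # a plain name, then the call
    handlers[0]   (),         # a subscription, then the call
    registry["greet"]    (),  # a dict lookup, then the call
    p . x,                    # spaces around the dot are allowed too
]
print(results)  # ['hello', 'hello', 'hello', 3]
```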
Python takes many cues from the C programming language. This is certainly one of them. Consider the fact that the following compiles:
int add(int a, int b)
{
    return a + b;
}

int main(int argc, char **argv)
{
    return add    (10, 11);
}
There can be an arbitrary number of spaces between the function name and the arguments to it.
This also holds true in other languages. Consider Ruby, for example,
def add(a, b)
  a + b
end

add (10, 11)
Sure, you get a warning, but it still works. I don't know how offhand, but I'm sure that warning could be suppressed.
Perfectly fine. Whitespace between a name and the parentheses is ignored; only non-whitespace characters placed there would change the name of the symbol (in this case a function) being referred to.
This is also common in various C-based languages, where padding spaces between the end of a function name and the parameter list can be done without issue.
I am trying to figure out whether the different quote types make a functional difference. I have seen people say that choosing between "" and '' is a matter of preference, but what about """ """? I tested it in some simple code to see if it would work, and it does. I was wondering if """ triple quotes """ have a functional purpose for function arguments, or whether they are just another quote option that can be used interchangeably with "" and ''.
I have seen many posts about "" and '', but none about """ """ or ''' ''' being used in functions.
My question is: Does the triple quote have a unique use as an argument or is it simply interchangeable with "" and ''? The reason I think it might have a unique function is because it is a multi line quote and I was wondering if it would allow a multi line argument to be submitted. I am not sure if something like that would even be useful but it could be.
Here is an example that prints out what you would expect using all the quote types I know of.
def myFun(var1="""different""",var2="quote",var3='types'):
    return var1, var2, var3

print (myFun('All','''for''','one!'))
Result:
('All', 'for', 'one!')
EDIT:
After some more testing of the triple quote I did find some variation in how it works using return vs printing in the function.
def myFun(var1="""different""",var2="""quote""",var3='types'):
    return (var1, var2, var3)

print(myFun('This',
'''Can
Be
Multi''',
'line!'))
Result:
('This', 'Can\nBe\nMulti', 'line!')
Or:
def myFun(var1="""different""",var2="""quote""",var3='types'):
    print (var1, var2, var3)

myFun('This',
'''Can
Be
Multi''',
'line!')
Result:
This Can
Be
Multi line!
From the docs:
String literals can be enclosed in matching single quotes (') or double quotes ("). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). [...other rules applying identically to all string literal types omitted...]
In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A “quote” is the character used to open the string, i.e. either ' or ".)
Thus, triple-quoted string literals can span multiple lines, and can contain literal quotes without use of escape sequences, but are otherwise exactly identical to string literals expressed with other quoting types (including those using escape sequences such as \n or \' to express the same content).
Also see the Python 3 documentation: Bytes and String Literals -- which expresses an effectively identical set of rules with slightly different verbiage.
A more gentle introduction is also available in the language tutorial, which explicitly introduces triple-quotes as a way to permit strings to span multiple lines:
String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. End of lines are automatically included in the string, but it’s possible to prevent this by adding a \ at the end of the line. The following example:
print("""\
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
""")
produces the following output (note that the initial newline is not included):
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
To be clear, though: These are different syntax, but the string literals they create are indistinguishable from each other. That is to say, given the following code:
s1 = '''foo
'bar'
baz
'''
s2 = 'foo\n\'bar\'\nbaz\n'
there's no possible way to tell s1 and s2 apart from each other by looking at their values: s1 == s2 is true, and so is repr(s1) == repr(s2). The Python interpreter is even allowed to intern them to the same value, so it may (or may not) make id(s1) == id(s2) true depending on details (such as whether the code was run at the REPL or imported as a module).
FWIW, my understanding is that there's a convention whereby """ """ and ''' ''' are used for docstrings, which are kind of like a # comment, but are stored as an attribute that can be looked up later. https://www.python.org/dev/peps/pep-0257/
I'm a beginner too, but my understanding is that using triple quotes for ordinary strings is not the best idea, even if there's little functional difference with what you're doing currently (I don't know if there might be later). Sticking with conventions is helpful to others reading and using your code, and it seems to be a good rule of thumb that some conventions will bite you if you don't follow them. In this case, a stray or malformed triple-quoted string may be silently accepted (for instance, taken as a docstring) rather than raising an error, and you'll need to search through a bunch of code to find the issue.
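To make the docstring point concrete, here is a short sketch (the function names are invented). As far as I understand it, what makes a string a docstring is its position as the first statement in the body, not the kind of quotes used.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height

def area_plain_quotes(width, height):
    'Return the area of a rectangle.'
    return width * height

def no_docstring(width, height):
    result = width * height
    """This string comes too late to be a docstring; it is just discarded."""
    return result

print(area.__doc__)                                # Return the area of a rectangle.
print(area.__doc__ == area_plain_quotes.__doc__)   # True
print(no_docstring.__doc__)                        # None
```

Triple quotes are the PEP 257 convention because docstrings often grow to several lines, but a single-quoted first statement is stored in __doc__ just the same.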
I am still exploring python. Today I came across multi-line strings. If I do:
a = '''
some-text
'''
The content of variable a is '\nsome-text\n'. But this leaves me confused. I always thought that if you enclose something within three single quotes ('''), you are commenting it out. So the above statement would be equivalent to something like this in C++:
a = /*
some-text
*/
What am I missing?
Technically such multi-line-comments enclosed in triple quotes are not really comments but string literals.
The reason why you can still use them to comment stuff out is that a string literal itself does not represent any kind of operation. It gets parsed, but nothing is done with it and it does not get assigned to a variable name, so it gets ignored.
You could also place any other literal into your code. As long as it is not involved in any kind of operation or assignment, it gets basically ignored like a comment. It is not a comment though, just useless code if you want to name it that way.
Here's an example of code that does... well, nothing:
# This is a real comment.
"useless normal string"
"""useless triple-quoted
multi-line
string"""
[1, "two"] # <-- useless list
42 # <-- useless number
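You can see the difference between a comment and a stray literal by parsing a snippet with the standard ast module. This is a rough sketch for Python 3, and the variable names are arbitrary.

```python
import ast

source = '''# This is a real comment.
"useless normal string"
[1, "two"]
42
'''

tree = ast.parse(source)
statement_kinds = [type(node).__name__ for node in tree.body]
print(statement_kinds)  # ['Expr', 'Expr', 'Expr']

# A comment never reaches the parser, so it produces no statement at all.
comment_only = ast.parse("# nothing but a comment\n")
print(comment_only.body)  # []
```

The three literals each become an expression statement whose value is simply thrown away, while the comment leaves no trace in the tree.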
I always thought that if you enclose something within three single quotes ('''), you are commenting it out.
This is not the case, actually. Enclosing something in triple quotes, '''string''', creates a string expression that yields a string containing the characters within the quotes. The difference between this and a single-quoted string 'string' is that the former can span multiple lines. People often use this to comment out multiple lines.
However, if you don't assign the string expression to a variable, then you'll get something a lot like a comment.
'''this is
a useless piece of python
text that does nothing for
your program'''
In Python, wrapping your code in ''' turns it into a string, effectively commenting it out, unless that code already contains a ''' multi-line string literal anywhere. In that case, the string is terminated early at the inner '''.
print('''hello!
How are you?''')
# this will not have the intended comment effect
'''
print('''hello!
How are you?''')
'''
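A quick way to confirm this without breaking your own file is to hand both variants to the built-in compile(). This is a sketch only; the helper name compiles is made up.

```python
inner = "print('''hello!\nHow are you?''')\n"

def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Wrapping in the same quote style ends the outer string at the inner '''.
wrapped_same_quotes = "'''\n" + inner + "'''\n"
# Wrapping in the other triple-quote style leaves the inner ''' untouched.
wrapped_other_quotes = '"""\n' + inner + '"""\n'

print(compiles(wrapped_same_quotes))   # False
print(compiles(wrapped_other_quotes))  # True
```

So if the code you want to disable already uses ''', wrap it in """ instead, or prefix the lines with #.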
Let's say I have this string :
<div>Object</div><img src=#/><p> In order to be successful...</p>
I want to substitute every letter between < and > with a #.
So, after some operation, I want my string to look like:
<###>Object<####><##########><#> In order to be successful...<##>
Notice that every character between the two symbols was replaced with # (including whitespace).
This is the closest I could get:
r = re.sub('<.*?>', '<#>', string)
The problem with my code is that all characters between < and > are replaced by a single #, whereas I would like every individual character to be replaced by a #.
I tried a mixture of various back references, but to no avail. Could someone point me in the right direction?
What about...:
def hashes(mo):
    replacing = mo.group(1)
    return '<{}>'.format('#' * len(replacing))
and then
r = re.sub(r'<(.*?)>', hashes, string)
The ability to use a function as the second argument to re.sub gives you huge flexibility in building up your substitutions (and, as usual, a named def results in much more readable code than any cramped lambda -- you can use meaningful names, normal layouts, etc, etc).
The re.sub function can be called with a function as the replacement, rather than a new string. Each time the pattern is matched, the function will be called with a match object, just like you'd get using re.search or re.finditer.
So try this:
re.sub(r'<(.*?)>', lambda m: "<{}>".format("#" * len(m.group(1))), string)
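Putting the pieces together, here is a complete sketch run against the string from the question. The names mask_tag and html are mine, not from the question.

```python
import re

def mask_tag(match):
    """Replace the text between < and > with one # per character."""
    inner = match.group(1)
    return '<{}>'.format('#' * len(inner))

html = '<div>Object</div><img src=#/><p> In order to be successful...</p>'
masked = re.sub(r'<(.*?)>', mask_tag, html)

print(masked)
# <###>Object<####><##########><#> In order to be successful...<##>
```

Because every character is replaced one for one, the result has the same length as the input.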
I have code like this:
def escape_query(query):
    special_chars = ['\\','+','-','&&','||','!','(',')','{','}','[',']',
                     '^','"','~','*','?',':']
    for character in special_chars:
        query = query.replace(character, '\\%s' % character)
    return query
This function should escape every occurrence of each substring in special_chars (notice && and ||) with a backslash.
I think my approach is pretty ugly, and I can't help wondering whether there is a better way to do this. Answers should be limited to the standard library.
Using reduce:
from functools import reduce  # a builtin on Python 2; must be imported on Python 3

def escape_query(query):
    special_chars = ['\\','+','-','&&','||','!','(',')','{','}','[',']',
                     '^','"','~','*','?',':']
    return reduce(lambda q, c: q.replace(c, '\\%s' % c), special_chars, query)
The following code works on exactly the same principle as steveha's.
But I think it better meets your requirement of clarity and maintainability, since the special chars are still listed in the same list as yours.
import re

special_chars = ['\\','+','-','&&','||','!','(',')','{','}','[',']',
                 '^','"','~','*','?',':']
escaped_special_chars = map(re.escape, special_chars)
special_chars_pattern = '|'.join(escaped_special_chars).join('()')

def escape_query(query, reg = re.compile(special_chars_pattern) ):
    return reg.sub(r'\\\1',query)
With this code:
when the function definition is executed, an object is created (the regex re.compile(special_chars_pattern)) and stored as the default value of the parameter reg.
This happens only once, at the moment the def statement is executed, not on each call.
That means that each time the function is called afterwards, this creation and assignment are not done again: the regex object already exists and is permanently stored in the function's tuple of defaults (func_defaults in Python 2, __defaults__ in Python 3).
That's useful if the function is called many times, because Python does not have to look up the regex in the global namespace, as it would if it were defined outside, nor rebind it to the parameter reg, as it would if it were passed as an ordinary argument.
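Here is a small sketch showing that a default value is evaluated once, when the def statement runs. The names make_default, use_default and calls are invented for the demonstration, and __defaults__ is the Python 3 spelling of func_defaults.

```python
calls = []

def make_default():
    """Record each evaluation so we can count them."""
    calls.append("evaluated")
    return object()

# make_default() runs right here, while the def statement is executed.
def use_default(value=make_default()):
    return value

first = use_default()
second = use_default()

print(len(calls))                            # 1
print(first is second)                       # True
print(use_default.__defaults__[0] is first)  # True
```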
If I understand your requirements correctly, some of the special "chars" are two-character strings (specifically: "&&" and "||"). The best way to do such an odd collection is with a regular expression. You can use a character class to match anything that is one character long, then use vertical bars to separate some alternative patterns, and these can be multi-character. The trickiest part is the backslash-escaping of chars; for example, to match "||" you need to put r'\|\|' because the vertical bar is special in a regular expression. In a character class, backslash is special and so are '-' and ']'. The code:
import re

_s_pat = r'([\\+\-!(){}[\]^"~*?:]|&&|\|\|)'
_pat = re.compile(_s_pat)

def escape_query(query):
    return re.sub(_pat, r'\\\1', query)
I suspect the above is the fastest solution to your problem possible in Python, because it pushes the work down to the regular expression machinery, which is written in C.
If you don't like the regular expression, you can make it easier to look at by using the verbose format, and compile using the re.VERBOSE flag. Then you can sprawl the regular expression across multiple lines, and put comments after any parts you find confusing.
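For illustration, the verbose form might look something like the sketch below. The layout and comments are my own, and I have escaped the [ inside the character class as \[, which matches the same character but avoids the "possible nested set" FutureWarning on recent Python versions.

```python
import re

_pat = re.compile(r'''
    (                            # capture the match so the replacement can reuse it
        [\\+\-!(){}\[\]^"~*?:]   # any one of the single special characters
      | &&                       # or the two-character sequence &&
      | \|\|                     # or the two-character sequence ||
    )
''', re.VERBOSE)

def escape_query(query):
    # Put a backslash in front of whatever group 1 matched.
    return _pat.sub(r'\\\1', query)

print(escape_query('a+b && c'))  # a\+b \&& c
print(escape_query('x || y?'))   # x \|| y\?
```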
Or, you can build your list of special characters, just like you already did, and run it through this function which will automatically compile a regular expression pattern that matches any alternative in the list. I made sure it will match nothing if the list is empty.
import re
def make_pattern(lst_alternatives):
    if lst_alternatives:
        temp = '|'.join(re.escape(s) for s in lst_alternatives)
        s_pat = '(' + temp + ')'
    else:
        # '$^' would still match an empty string; an empty negative lookahead
        # never matches, and the outer group keeps \1 valid in replacements
        s_pat = '((?!))'
    return re.compile(s_pat)
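As a usage sketch, here is the same idea applied to the list from the question. Sorting the alternatives longest first is my own precaution and is not part of the answer above; it makes no difference for this particular list, but it matters if one alternative is a prefix of another.

```python
import re

special_chars = ['\\', '+', '-', '&&', '||', '!', '(', ')', '{', '}', '[', ']',
                 '^', '"', '~', '*', '?', ':']

# Longest alternatives first, so a longer string wins over any shorter prefix.
ordered = sorted(special_chars, key=len, reverse=True)
pattern = re.compile('(' + '|'.join(re.escape(s) for s in ordered) + ')')

def escape_query(query):
    # Put a backslash in front of each special substring that was matched.
    return pattern.sub(r'\\\1', query)

print(escape_query('(a && b) || c:d'))  # \(a \&& b\) \|| c\:d
```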
By the way, I recommend you put the string and the pre-compiled pattern outside the function, as I showed above. In your code, Python will run code on each function invocation to build the list and bind it to the name special_chars.
If you want to not put anything but the function into the namespace, here's a way to do it without any run-time overhead:
import re
def escape_query(query):
    return re.sub(escape_query.pat, r'\\\1', query)

escape_query.pat = re.compile(r'([\\+\-!(){}[\]^"~*?:]|&&|\|\|)')
The above uses the function's name to look up the attribute, which won't work if you rebind the function's name later. There is a discussion of this and a good solution here: how can python function access its own attributes?
(Note: The above paragraph replaces some stuff including a question that was discussed in the discussion comments below.)
Actually, upon further thought, I think this is cleaner and more Pythonic:
import re
_pat = re.compile(r'([\\+\-!(){}[\]^"~*?:]|&&|\|\|)')

def escape_query(query, pat=_pat):
    return re.sub(pat, r'\\\1', query)

del _pat  # not required but you can do it
At the time escape_query() is defined, the object bound to the name _pat is bound to a name inside the function's namespace (that name is pat). Then you can use del to unbind the name _pat if you like. This nicely encapsulates the pattern inside the function, does not depend at all on the function's name, and allows you to pass in an alternate pattern if you wish.
P.S. If your special characters were always a single character long, I would use the code below:
_special = set(['[', ']', '\\', '+'])  # add other characters as desired, but only single chars

def escape_query(query):
    return ''.join('\\' + ch if (ch in _special) else ch for ch in query)
Not sure if this is any better, but it works and is probably faster.
def escape_query(query):
    special_chars = ['\\','+','-','&&','||','!','(',')','{','}','[',']', '^','"','~','*','?',':']
    query = "".join(map(lambda x: "\\%s" % x if x in special_chars else x, query))
    for sc in filter(lambda x: len(x) > 1, special_chars):
        query = query.replace(sc, "\\%s" % sc)
    return query
In python 2.6, why is the following line valid?
my_line = 'foo' 'bar'
and if that is valid, why isn't the following:
my_list = 1 2
The first example is string concatenation; however, the following isn't valid either (thank god):
foo = 'foo'
bar = 'bar'
foo_bar = foo bar
This is doing string literal concatenation. As noted in the documentation, advantages include the following:
This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, or even to add comments to parts of strings...
It goes on to note that this concatenation is done at compilation time rather than run time.
The history and rationale behind this, and a rejected suggestion to remove the feature, is described in PEP 3126.
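A short sketch of what the quoted passage describes: adjacent literals, each with its own comment, joined into one string. The name date_pattern is just an example.

```python
import ast

date_pattern = (
    r"\d{4}"   # year
    "-"
    r"\d{2}"   # month
    "-"
    r"\d{2}"   # day
)
print(date_pattern)  # \d{4}-\d{2}-\d{2}

# literal_eval only accepts literals, so this shows that the pair of
# adjacent strings is treated as one string literal, not as an operation.
print(ast.literal_eval("'foo' 'bar'"))  # foobar
```

This only applies to literals: two names side by side, as in foo bar, are still a syntax error.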
my_line = 'foo' 'bar' is string concatenation.
Perhaps this is inherited from C. In C, the following is perfectly valid:
char* ptr = "hello " "world";
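It is handled at compile time (strictly speaking, adjacent string literals are joined just after preprocessing, in translation phase 6 of the C standard), and the rationale given in that link is: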
this allows long strings to be split over multiple lines, and also allows string literals resulting from C preprocessor defines and macros to be appended to strings at compile time
It isn't inconsistent. Strings and integers have different methods.
Integer concatenation is meaningless.
String concatenation is a meaningful default behavior.