Python unit-test assertions are being joined by newlines

While using unittest.TestCase.run(test_class(test)), the correct errors are being reported but they are being joined by \n.
AssertionError: False is not true : Failures: [], Errors: [(<module1 testMethod=method1>, 'Traceback (most recent call last):\n File "<file_name>", line 20, in method1\n \'resource_partitions\')\n File "error_source_file_path", line 23, in error_function\n error_line_statement\nKeyError: \'gen_data\'\n')]
How can these be removed and replaced with actual newlines instead?
Does it have something to do with the line endings on my machine (currently set to \n)?

That is an intended behaviour.
The string is being displayed as part of an object. Such displays use the repr() of each contained string, so they always show escape sequences rather than converting them to the characters they represent.
Look at this short example with a string in the interpreter:
>>> "spam\neggs"
'spam\neggs'
>>> print("spam\neggs")
spam
eggs
The first one is displayed that way because the interactive console echoes the repr() of an expression; in normal code that would never happen. But that's also how strings inside other objects behave.
Printing list containing a string vs printing each element separately:
>>> print(["spam\neggs"])
['spam\neggs']
>>> for element in ["spam\neggs"]: print(element)
...
spam
eggs
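Applied to the test results from the question, a minimal sketch might look like the following. It assumes the failures and errors come from a standard unittest.TestResult, whose entries are (test, traceback_string) pairs; run_sample and format_problems are hypothetical helper names, not part of unittest.

```python
import unittest


def run_sample():
    """Run a tiny test case that raises KeyError and return the TestResult."""
    class Sample(unittest.TestCase):
        def test_lookup(self):
            {}["gen_data"]  # raises KeyError, recorded as an error

    result = unittest.TestResult()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Sample)
    suite.run(result)
    return result


def format_problems(result):
    """Join each recorded traceback as text instead of showing the list's repr."""
    parts = []
    for test, traceback_text in result.failures + result.errors:
        parts.append(str(test))
        parts.append(traceback_text)  # already contains real newlines
    return "\n".join(parts)


result = run_sample()
print(result.errors)            # list repr: newlines shown as \n
print(format_problems(result))  # each traceback printed over several lines
```

Passing the formatted string as the assertion message, rather than the raw lists, would give a multi-line report.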


How to re-match a group that did not capture anything?

I'm trying to parse a string in which a certain section can either be enclosed between " or ' or not be enclosed at all. However, I'm struggling to find a syntax that works when there are no quotation marks at all.
See the following (simplified) example:
>>> print re.match(r'\w(?P<quote>(\'|"))?\w', 'f"oo').group('quote')
"
>>> print re.match(r'\w(?P<quote>(\'|"))?\w', 'foo').group('quote')
None
>>> print re.match(r'\w(?P<quote>(\'|"))?\w(?P=quote)', 'f"o"o').group('quote')
"
>>> print re.match(r'\w(?P<quote>(\'|"))?\w(?P=quote)', 'foo').group('quote')
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
The desired result for the last attempt is None, as for the second command in the example.
Based on the suggestions I got in response to another question, I was able to produce a slightly different regex that provides the correct answers:
>>> re.match(r'\w(?P<quote>[\'"]?)\w(?P=quote)\w', 'foo').group('quote')
u''
>>> re.match(r'\w(?P<quote>[\'"]?)\w(?P=quote)\w', 'f"o"o').group('quote')
u'"'
>>> re.match(r'\w(?P<quote>[\'"]?)\w(?P=quote)\w', 'f\'o\'o').group('quote')
u"'"
The trick was really to use a quantifier on the character matched rather than on the entire group.
[The leading and trailing \w in this example are just there to prevent the regex from matching the full string (as an unquoted string). In the real-case scenario this was not needed, as this match is part of a larger regex with earlier and later groups matched.]
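The same idea as a small Python 3 sketch. The helper name quote_of is made up for illustration; the pattern is the one from the session above, with the optional quantifier inside the named group so that the group always participates in the match (possibly as an empty string).

```python
import re

# The "?" sits inside the group, so (?P=quote) always has something to refer
# to: either a quote character or the empty string.
QUOTED = re.compile(r"""\w(?P<quote>['"]?)\w(?P=quote)\w""")


def quote_of(text):
    """Return the quote character used, '' if unquoted, or None if no match."""
    match = QUOTED.match(text)
    return match.group("quote") if match else None


print(repr(quote_of("foo")))     # ''
print(repr(quote_of('f"o"o')))   # '"'
print(repr(quote_of("f'o'o")))   # "'"
print(repr(quote_of('f"oo')))    # None (unbalanced quote)
```

Note that an unquoted match yields an empty string here rather than None, so a caller that needs None can map '' to None afterwards.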

Why does the python interpreter prompt with ... for a comment "#"?

In the Python interpreter, if I put in a # for a comment, why does it prompt with ...? I expected a >>> prompt.
e.g.,
>>> # my comment
... x = 4
>>> x
4
>>> # my comment
... foo
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
NameError: name 'foo' is not defined
Here's my educated guess about what is going on; I haven't actually looked at Python's REPL code. As you know, the Python interactive shell uses the ... prompt when it is expecting further input based on having parsed the contents of the preceding line(s).
For example:
>>> if True:
...
... because if ... :<newline> must be followed by an indented block according to the lexical structure of Python.
Notice that you can trigger the same slightly odd behavior with a line that is empty except for whitespace, e.g.:
>>> <space><enter>
...
According to the lexical rules of Python, in most contexts a line that contains only whitespace should not be treated as a pass statement, or an empty block, but it should be handled as if it didn't appear at all. Take this example (with | to emphasize the lack of whitespace at the end of each line):
if False:|
|
    print "Foo"|
# comment|
    print "Bar"|
|
print "Baz"|
If you run this code, it will print only Baz. The first two print statements are treated as part of the same block despite the fact that there are non-indented empty or comment-only lines before, after, and in the middle of them.
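The same experiment can be reproduced in Python 3 with a short sketch. The source is held in a string and run with exec so the output can be captured; run_and_capture is a hypothetical helper name.

```python
import contextlib
import io

SOURCE = (
    "if False:\n"
    "\n"                    # empty line inside the block
    "    print('Foo')\n"
    "# comment\n"           # unindented comment-only line inside the block
    "    print('Bar')\n"
    "\n"
    "print('Baz')\n"
)


def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<example>", "exec"), {})
    return buffer.getvalue()


output = run_and_capture(SOURCE)
print(output, end="")  # Baz
```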
When the Python interpreter reads a line that is blank or contains only a comment, it pretends it didn't read any line at all. The interactive interpreter follows the same behavior: it's waiting for input, and if it gets no input, it asks for more input. Hence the ... continued input prompt.
It appears that the case of a totally blank line (line=='' after chopping off the EOL character(s)) is special-cased into the interactive interpreter, but that this special-casing is not extended to lines which contain only comments and/or whitespace.
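The "is this input complete yet?" decision can be illustrated with the standard-library codeop module, which the code module's interactive console uses. This is only a sketch of the general idea; it is not a claim about how the built-in REPL is implemented, and prompt_after is a hypothetical helper.

```python
import codeop


def prompt_after(source):
    """Guess which prompt a console would show after reading source."""
    try:
        code = codeop.compile_command(source)
    except SyntaxError:
        return ">>> "  # the error is reported and input starts over
    # compile_command returns None when the input is valid so far but incomplete
    return "... " if code is None else ">>> "


print(repr(prompt_after("x = 4")))     # '>>> '
print(repr(prompt_after("if True:")))  # '... '
```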
It is the prompt, just like when you type in def foo(): and press enter: it shows an ellipsis (...), and some prompts automatically indent for you. It's just its way of saying that it accepts that you are doing something over multiple lines.

Syntax error with exec call in Python

Quick Python question about the exec statement. I have Python 2.7.6 and am trying to use exec to run some code stored in a .txt file. I've run into a syntax error and am not entirely sure what is causing it.
Traceback (most recent call last):
File "/Users/XYZ/Desktop/parser.py", line 46, in <module>
try_code(block)
File "<string>", line 1
x = 'Hello World!'
^
SyntaxError: invalid syntax
I initially thought it was complaining about carriage returns, but when I tried to .replace them with ' ' I still received this error message. I've tried variations to see what the issue is, and the error always points at the first ' or " the program encounters when it runs exec.
Here is the try_code(block) method
def try_code(block):
    exec block
And the main body of the program
inputFile = open('/Users/XYZ/Desktop/test.txt', 'r+')
starter = False
finished = False
check = 1
block = ""
for val in inputFile:
    starter = lookForStart(val)
    finished = lookForEnd(val)
    if lookForStart:
        check = 1
    elif finished:
        try_code(block)
    if check == 1:
        check = 0
    elif finished == False:
        block = block + val
Basically I'm trying to read a file (test.txt) and then look for some embedded code in it. To make it easier I surrounded the code with indicators, hence starter and finished. Then I concatenate all the hidden code into one string and call try_code on it. Then try_code attempts to execute it (it does get that far; I checked with print statements) and fails with the syntax error.
As a note it works fine if I have hidden something like...
x = 5
print x
so whatever the problem is appears to be dealing with strings for some reason.
EDIT
It would appear that TextEdit includes some extra characters that aren't displayed normally. I rewrote the test file in a different text editor (TextWrangler) and the characters have disappeared. Thank you all very much for helping me solve my problem, I appreciate it.
It is a character encoding issue. Unless you've explicitly declared the character encoding of your Python source code at the top of the file, it is 'ascii' by default on Python 2.
The code that you're trying to execute contains non-ascii quotes:
>>> print b'\xe2\x80\x98Hello World\xe2\x80\x99'.decode('utf-8')
‘Hello World’
To fix it, use ordinary single quotes instead: 'Hello World'.
You can check that block doesn't contain non-ascii characters using the decode method: block.decode('ascii') raises an exception if there are any.
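A Python 3 sketch of the same check, plus one possible clean-up step. The names find_non_ascii, straighten_quotes and SMART_QUOTES are made up for this example, and replacing curly quotes automatically is an assumption about what is wanted; re-saving the file as plain text, as the asker did, works too.

```python
# Curly quotes that some editors substitute for plain ASCII quotes.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def find_non_ascii(text):
    """Return (index, character) pairs for every non-ASCII character."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


def straighten_quotes(text):
    """Replace curly quotes with their plain ASCII equivalents."""
    return text.translate(str.maketrans(SMART_QUOTES))


block = "x = \u2018Hello World!\u2019"
print(find_non_ascii(block))  # positions of the two curly quotes

namespace = {}
exec(straighten_quotes(block), namespace)
print(namespace["x"])  # Hello World!
```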
A gentle introduction to the topic: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.

Doctest Involving Escape Characters

I have a function fix(), a helper function to an output function which writes strings to a text file.
def fix(line):
    """
    returns the corrected line, with all apostrophes prefixed by an escape character
    >>> fix('DOUG\'S')
    'DOUG\\\'S'
    """
    if '\'' in line:
        return line.replace('\'', '\\\'')
    return line
Turning on doctests, I get the following error:
Failed example:
fix('DOUG'S')
Exception raised:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py", line 1254, in __run
compileflags, 1) in test.globs
File "<doctest convert.fix[0]>", line 1
fix('DOUG'S')
^
No matter what combination of \ and 's I use, the doctest doesn't seem to want to work, even though the function itself works perfectly. I have a suspicion that it is a result of the doctest being in a block comment. Any tips to resolve this?
Is this what you want?:
def fix(line):
    r"""
    returns the corrected line, with all apostrophes prefixed by an escape character
    >>> fix("DOUG\'S")
    "DOUG\\'S"
    >>> fix("DOUG'S") == r"DOUG\'S"
    True
    >>> fix("DOUG'S")
    "DOUG\\'S"
    """
    return line.replace("'", r"\'")

import doctest
doctest.testmod()
raw strings are your friend...
First, this is what happens if you actually call your function in the interactive interpreter:
>>> fix("Doug's")
"Doug\\'s"
Note that you don't need to escape single quotes in double-quoted strings, and that Python does not do this in the representation of the resulting string – only the backslash gets escaped.
This means the correct docstring should be (untested!)
"""
returns the corrected line, with all apostrophes prefixed by an escape character
>>> fix("DOUG'S")
"DOUG\\\\'S"
"""
I'd use a raw string literal for this docstring to make this more readable:
r"""
returns the corrected line, with all apostrophes prefixed by an escape character
>>> fix("DOUG'S")
"DOUG\\'S"
"""
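For anyone who wants to verify the raw-docstring version, here is a Python 3 sketch that runs the doctest programmatically and reports the counts. The helper run_fix_doctests is a hypothetical name; DocTestFinder and DocTestRunner are the standard doctest classes.

```python
import doctest


def fix(line):
    r"""
    Return line with every apostrophe prefixed by a backslash.

    >>> fix("DOUG'S")
    "DOUG\\'S"
    >>> print(fix("DOUG'S"))
    DOUG\'S
    """
    return line.replace("'", r"\'")


def run_fix_doctests():
    """Run the examples in fix.__doc__ and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(fix, "fix", globs={"fix": fix}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


print(run_fix_doctests())  # (0, 2)
```

The first example compares against the repr (backslash doubled), the second against the printed text (single backslash); both are written literally because the docstring is raw.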

Passing string with (accidental) escape character loses character even though it's a raw string

I have a function with a python doctest that fails because one of the test input strings has a backslash that's treated like an escape character even though I've encoded the string as a raw string.
My doctest looks like this:
>>> infile = [ "Todo: fix me", "/** todo: fix", "* me", "*/", r"""//\todo stuff to fix""", "TODO fix me too", "toDo bug 4663" ]
>>> find_todos( infile )
['fix me', 'fix', 'stuff to fix', 'fix me too', 'bug 4663']
And the function, which is intended to extract the todo texts from a single line following some variation over a todo specification, looks like this:
todos = list()
for line in infile:
    print line
    if todo_match_obj.search( line ):
        todos.append( todo_match_obj.search( line ).group( 'todo' ) )
And the regular expression called todo_match_obj is:
r"""(?:/{0,2}\**\s?todo):?\s*(?P<todo>.+)"""
A quick conversation with my ipython shell gives me:
In [35]: print "//\todo"
// odo
In [36]: print r"""//\todo"""
//\todo
And, just in case the doctest implementation uses stdout (I haven't checked, sorry):
In [37]: sys.stdout.write( r"""//\todo""" )
//\todo
My regex-fu is not high by any standards, and I realize that I could be missing something here.
EDIT: Following Alex Martelli's answer, I would like suggestions on what regular expression would actually match the blasted r"""//\todo fix me""". I know that I did not originally ask for someone to do my homework, and I will accept Alex's answer as it really did answer my question (or confirm my fears). But I promise to upvote any good solutions to my problem here :)
EDITEDIT: for reference, a bug has been filed with the kodos project: bug #437633
I'm using Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
Thank you for reading this far (If you skipped directly down here, I understand)
Read your original regex carefully:
r"""(?:/{0,2}\**\s?todo):?\s*(?P<todo>.+)"""
It matches: zero to two slashes, then 0+ stars, then 0 or 1 "whitespace characters" (blanks, tabs etc), then the literal characters 'todo' (and so on).
Your rawstring is:
r"""//\todo stuff to fix"""
so there's a literal backslash between the slashes and the 'todo', therefore of course the regex doesn't match it. It can't -- nowhere in that regex are you expressing any desire to optionally match a literal backslash.
Edit:
A RE pattern, very close to yours, that would accept and ignore an optional backslash right before the 't' would be:
r"""(?:/{0,2}\**\s?\\?todo):?\s*(?P<todo>.+)"""
Note that the backslash does have to be repeated, to "escape itself", in this case.
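A small Python 3 comparison of the two patterns. Two things here are assumptions rather than facts from the question: the re.IGNORECASE flag (the question does not show how todo_match_obj was compiled, but its expected output includes "Todo" and "TODO"), and the use of match() to anchor at the start of the line, which makes the difference between the patterns visible; the question's code calls search(). The helper todo_text is a hypothetical name.

```python
import re

FLAGS = re.IGNORECASE  # assumed; not shown in the question

ORIGINAL = re.compile(r"""(?:/{0,2}\**\s?todo):?\s*(?P<todo>.+)""", FLAGS)
AMENDED = re.compile(r"""(?:/{0,2}\**\s?\\?todo):?\s*(?P<todo>.+)""", FLAGS)


def todo_text(pattern, line):
    """Return the todo text if pattern matches at the start of line, else None."""
    match = pattern.match(line)
    return match.group("todo") if match else None


line = r"//\todo stuff to fix"
print(todo_text(ORIGINAL, line))  # None
print(todo_text(AMENDED, line))   # stuff to fix

infile = ["Todo: fix me", "/** todo: fix", "* me", "*/",
          r"//\todo stuff to fix", "TODO fix me too", "toDo bug 4663"]
found = [todo_text(AMENDED, item) for item in infile]
print([text for text in found if text is not None])
```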
This gets even more strange as I venture down the road of doctests.
Consider this python script.
If you uncomment the lines 22 and 23, the script passes just fine, as the method returns True, which is both asserted and explicitly compared.
But if you run the file as it stands in the link, the doctest will fail with the message:
% python doctest_test.py
**********************************************************************
File "doctest_test.py", line 3, in __main__.doctest_test
Failed example:
doctest_test( r"""// odo""" )
Exception raised:
Traceback (most recent call last):
File "/usr/lib/python2.6/doctest.py", line 1241, in __run
compileflags, 1) in test.globs
File "<doctest __main__.doctest_test[0]>", line 1, in <module>
doctest_test( r"""// odo""" )
File "doctest_test.py", line 14, in doctest_test
assert input_string == compare_string
AssertionError
**********************************************************************
1 items had failures:
1 of 1 in __main__.doctest_test
***Test Failed*** 1 failures.
Can someone enlighten me here?
I'm still using python 2.6.4 for this.
I'm placing this answer under 'community wiki', as it really does not reputation-wise relate to the question.
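One likely explanation, judging by the // odo in the failure output: the docstring that holds the doctest is itself an ordinary (non-raw) string literal, so \t is turned into a tab when the module is compiled, before doctest ever sees the r"""...""" inside it. The Python 3 sketch below shows what doctest extracts from a plain and a raw docstring; plain_doc, raw_doc, example_sources and the show call inside the examples are hypothetical names, and the examples are only parsed, never executed.

```python
import doctest


def plain_doc():
    """
    >>> show(r'//\todo')
    """


def raw_doc():
    r"""
    >>> show(r'//\todo')
    """


def example_sources(func):
    """Return the source text of each doctest example in func's docstring."""
    parser = doctest.DocTestParser()
    return [example.source for example in parser.get_examples(func.__doc__)]


print(example_sources(plain_doc))  # backslash and 't' are gone
print(example_sources(raw_doc))    # backslash is preserved
```

Marking the enclosing docstring as raw (r"""...""") keeps the backslash intact all the way to the example.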
