Overriding the newline generation behaviour of Python's print statement - python

I have a bunch of legacy code for encoding raw emails that contains a lot of print statements such as
print >>f, "Content-Type: text/plain"
This is all well and good for emails, but we're now leveraging the same code for outputting HTTP requests. The problem is that the Python print statement outputs '\n' whilst HTTP requires '\r\n'.
It looks like Python (2.6.4 at least) generates a trailing PRINT_NEWLINE byte code for a print statement which is implemented as
ceval.c:1582: err = PyFile_WriteString("\n", w);
Thus it appears there's no easy way to override the default newline behaviour of print. I have considered the following solutions
After writing the output simply do a .replace('\n', '\r\n'). This will interfere with HTTP messages that use multipart encoding.
Create a wrapper around the destination file object and proxy the .write method
def write(self, data):
    if data == '\n':
        data = '\r\n'
    return self._file.write(data)
Write a regular expression that translates print >>f, text to f.write(text + line_end) where line_end can be '\n' or '\r\n'.
I believe the third option would be the most appropriate. It would be interesting to hear what your Pythonic approach to the problem would be.

You should solve your problem now and for forever by defining a new output function. Were print a function, this would have been much easier.
I suggest writing a new output function, mimicking as much of the modern print function signature as possible (because reusing a good interface is good), for example:
def output(*items, end="\n", file=sys.stdout):
    pass
Once you have replaced all prints in question, you no longer have this problem -- you can always change the behavior of your function instead! This is a big reason why print was made a function in Python 3 -- because in Python 2.x, "all" projects invariably go through the stage where all the print statements are no longer flexible, and there is no easy way out.
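A minimal sketch of such a function might look like the following. It is written for Python 3, because keyword-only parameters after *items are not valid Python 2 syntax; the "\r\n" default and the space separator are assumptions chosen for the HTTP case, not part of the answer above.

```python
import io
import sys


def output(*items, end="\r\n", file=None):
    """Write the items separated by spaces, followed by `end`."""
    if file is None:
        # Look up sys.stdout at call time so later redirection is honoured.
        file = sys.stdout
    file.write(" ".join(str(item) for item in items) + end)


# Usage: collect two header lines in an in-memory buffer.
buf = io.StringIO()
output("Content-Type:", "text/plain", file=buf)
output("Content-Length:", 0, file=buf)
header_text = buf.getvalue()
```

Because the line ending lives in one place, switching between email and HTTP output becomes a matter of passing a different end value.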

(Not sure how/if this fits with the wrapper you intend to use, but in case...)
In Python 2.6 (and many preceding versions), you can suppress the newline by adding a comma at the end of the print statement, as in:
data = 'some msg\r\n'
print data, # note the comma
The downside of this approach, however, is that the print syntax and behavior changed in Python 3.

In Python 2.x, I think you can do:
print >>f, "some msg\r\n",
to suppress the trailing newline.
In python3.x, it's a lot simpler:
print("some msg", end = "\r\n", file = f)

I think I would define a new function writeline in an inherited file/stream class and update the code to use writeline instead of print. The file object itself can hold the line ending style as a member. That should give you some flexibility in behavior and also make the code a little clearer i.e. f.writeline(text) as opposed to f.write(text+line_end).
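One way this could look, as a rough sketch: the class name LineWriter and the choice of io.StringIO as the base class are assumptions made for illustration; a real version would inherit from whatever file or stream class the legacy code already writes to.

```python
import io


class LineWriter(io.StringIO):
    """In-memory stream that remembers its own line-ending style."""

    def __init__(self, line_ending="\n"):
        super().__init__()
        self.line_ending = line_ending

    def writeline(self, text):
        # Append the configured line ending to every line written.
        return self.write(text + self.line_ending)


# Usage: the same calling code works for both email and HTTP output.
email_out = LineWriter("\n")
http_out = LineWriter("\r\n")
for stream in (email_out, http_out):
    stream.writeline("Content-Type: text/plain")
```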

I also prefer your third solution, but there is no need to use f.write; any user-written function or callable would do. That makes later changes easy. If you use an object, you can even hide the target file inside it, removing some syntactic noise such as the file or the kind of newline.
Too bad print is a statement in Python 2.x; with Python 3.x, print could simply be overloaded by something user-defined.

Python has modules both to handle email and http headers in an easy compliant way. I suggest you use them instead of solving already solved problems again.

Related

Python f-string: replacing newline/linebreak [duplicate]

This question already has answers here:
How can I use newline '\n' in an f-string to format output?
First of all, sorry: I'm quite certain this might be a "duplicate", but I didn't succeed in finding the right solution.
I simply want to replace all line breaks within my SQL code so I can log it on one line, but Python's f-strings don't support backslashes, so:
# Works fine (but is useless ;))
self.logger.debug(f"Executing: {sql.replace( 'C','XXX')}")
# Results in SyntaxError:
# f-string expression part cannot include a backslash
self.logger.debug(f"Executing: {sql.replace( '\n',' ')}")
Of course there are several ways to accomplish that before the f-string, but I'd really like to keep my "log the line"-code in one line and without additional helper variables.
(Besides, I think it's quite stupid behavior: either you can execute code within the curly brackets or you can't... not "you can, but only without backslashes"...)
This one isn't a desired solution because of additional variables:
How to use newline '\n' in f-string to format output in Python 3.6?
General Update
The suggestion in mkrieger1s comment:
self.logger.debug("Executing %s", sql.replace('\n',' '))
Works fine for me, but as it doesn't use f-strings at all (whether that in itself is good or bad ;)), I think I can leave this question open.
I found possible solutions
from os import linesep
print(f'{string_with_multiple_lines.replace(linesep, " ")}')
You can do this
newline = '\n'
self.logger.debug(f"Executing: {sql.replace( newline,' ')}")
Don't use f-strings, especially for logging.
Assign the newline to a constant and use that, which you apparently don't want to do.
Use another way of expressing a newline, chr(10) for instance.
(Besides, I think it's quite stupid behavior: either you can execute code within the curly brackets or you can't... not "you can, but only without backslashes"...)
Feel free to take a shot at fixing it, I'm pretty sure this restriction was not added because the PEP authors and feature developers wanted it to be a pain in the ass.
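The workarounds mentioned in this thread can be put side by side. This is only an illustration: the sql value, the NEWLINE constant, and the logger name are made up for the example. As far as I know, the backslash restriction was lifted in Python 3.12 (PEP 701), so this mainly matters on older interpreters.

```python
import logging

logger = logging.getLogger("sql-example")  # hypothetical logger name
NEWLINE = "\n"  # defined outside the f-string, so no backslash inside the braces
sql = "SELECT id\nFROM users\nWHERE active = 1"  # made-up query

# Option 1: refer to a constant inside the f-string expression.
with_constant = f"Executing: {sql.replace(NEWLINE, ' ')}"

# Option 2: spell the newline without a backslash.
with_chr = f"Executing: {sql.replace(chr(10), ' ')}"

# Option 3: skip f-strings and let logging do the formatting lazily.
logger.debug("Executing: %s", sql.replace("\n", " "))
```

The third form also defers the string formatting until the logger actually emits the record, although the replace call itself still runs eagerly.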

Is there something similar to __END__ of perl in python? [duplicate]

Am I correct in thinking that Python doesn't have a direct equivalent for Perl's __END__?
print "Perl...\n";
__END__
End of code. I can put anything I want here.
One thought that occurred to me was to use a triple-quoted string. Is there a better way to achieve this in Python?
print "Python..."
"""
End of code. I can put anything I want here.
"""
The __END__ block in perl dates from a time when programmers had to work with data from the outside world and liked to keep examples of it in the program itself.
Hard to imagine I know.
It was useful, for example, if you had a moving target like a hardware log file with mutating messages due to firmware updates, where you wanted to compare old and new versions of the line, or to keep notes not strictly related to the program's operations ("Code seems slow on day x of month every month"), or, as mentioned above, to keep a reference set of data to run the program against. Telcos are an example of an industry where this was a frequent requirement.
Lastly, Python's cult-like restrictiveness seems to have a real and tiresome effect on the mindset of its advocates. If your only response to a question is "Why would you want to do that when you could do X?" when X is not as useful, please keep quiet++.
The triple-quote form you suggested will still create a python string, whereas Perl's parser simply ignores anything after __END__. You can't write:
"""
I can put anything in here...
Anything!
"""
import os
os.system("rm -rf /")
Comments are more suitable in my opinion.
#__END__
#Whatever I write here will be ignored
#Woohoo !
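The difference between the two approaches can be checked with the standard ast module. In this small sketch (the source snippets are invented for the demonstration), the triple-quoted text shows up as a real statement in the parsed module, while the commented text leaves no trace:

```python
import ast

# The same program, once with a trailing string literal and once with comments.
with_string = 'print("Python...")\n"""\nEnd of code.\n"""\n'
with_comments = 'print("Python...")\n# End of code.\n'

# A bare string literal is an expression statement: it is parsed and kept.
string_statements = len(ast.parse(with_string).body)

# Comments are dropped by the tokenizer and never reach the syntax tree.
comment_statements = len(ast.parse(with_comments).body)
```

Here string_statements is 2 and comment_statements is 1, which is why arbitrary text is only safe inside comments.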
What you're asking for does not exist.
Proof: http://www.mail-archive.com/python-list#python.org/msg156396.html
A simple solution is to escape any " as \" and do a normal multi line string -- see official docs: http://docs.python.org/tutorial/introduction.html#strings
( Also, atexit doesn't work: http://www.mail-archive.com/python-list#python.org/msg156364.html )
Hm, what about sys.exit(0) ? (assuming you do import sys above it, of course)
As to why it would be useful, sometimes I sit down to do a substantial rewrite of something and want to mark my "good up to this point" place.
By using sys.exit(0) in a temporary manner, I know nothing below that point will get executed, therefore if there's a problem (e.g., server error) I know it had to be above that point.
I like it slightly better than commenting out the rest of the file, just because there are more chances to make a mistake and uncomment something (stray key press at beginning of line), and also because it seems better to insert 1 line (which will later be removed), than to modify X-many lines which will then have to be un-modified later.
But yeah, this is splitting hairs; commenting works great too... assuming your editor supports easily commenting out a region, of course; if not, sys.exit(0) all the way!
I use __END__ all the time for several of the reasons given. I've been doing it for so long now that I put it in (usually preceded by an exit('0');), along with BEGIN {} / END{} routines, by force of habit. It is a shame that Python doesn't have an equivalent, but I just comment out the lines at the bottom: extraneous, but that's about what you get with one-way-to-rule-them-all languages.
Python does not have a direct equivalent to this.
Why do you want it? It doesn't sound like a really great thing to have when there are more consistent ways like putting the text at the end as comments (that's how we include arbitrary text in Python source files. Triple quoted strings are for making multi-line strings, not for non-code-related text.)
Your editor should be able to make using many lines of comments easy for you.

Adding a new string method in Python

I am parsing a bunch of HTML and am encountering a lot of "\n" and "\t" inside the code. So I am using
"something\t\n here".replace("\t","").replace("\n","")
This works, but I'm using it often. Is there a way to define a string function, along the lines of replace itself (or find, index, format, etc.) that will pretty my code a little, something like
"something\t\n here".noTabsOrNewlines()
I tried
class str:
    def noTabNewline(self):
        self.replace("\t","").replace("\n","")
but that was no good. Thanks for any help.
While you could do something along these lines (https://stackoverflow.com/a/4698550/1867876), the more Pythonic thing to do would be:
myString = "something\t\n here"
' '.join(myString.split())
You can see this thread for more information:
Strip spaces/tabs/newlines - python
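For comparison, here is a hedged sketch of the two usual ways to get a reusable helper without touching the built-in str type: a plain function and a str subclass. The names no_tabs_or_newlines and CleanStr are invented for this example. Note that the attempt in the question also lacked a return statement, so it would have returned None even if the method had been attached to str.

```python
def no_tabs_or_newlines(text):
    """Return text with every tab and newline removed."""
    return text.replace("\t", "").replace("\n", "")


class CleanStr(str):
    """str subclass that carries the helper as a method."""

    def no_tabs_or_newlines(self):
        # Wrap the result so further calls can be chained.
        return CleanStr(no_tabs_or_newlines(self))


raw = "something\t\n here"
via_function = no_tabs_or_newlines(raw)
via_subclass = CleanStr(raw).no_tabs_or_newlines()
via_split_join = " ".join(raw.split())
```

The subclass only helps for strings you construct yourself; string literals and values returned by other libraries remain plain str, which is one reason the plain function is usually the simpler choice.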
You can try encoding='utf-8'. Otherwise, in my opinion, there is no way other than replacing it. Python also replaces some spaces with '\xa0', so either way you have to replace them. Or you can read the input line by line with readline() instead of reading it all at once with read().

Python style for `chained` function calls

More and more we use chained function calls:
value = get_row_data(original_parameters).refine_data(level=3).transfer_to_style_c()
It can get long. To avoid overly long lines in the code, which is preferred?
value = get_row_data(
    original_parameters).refine_data(
    level=3).transfer_to_style_c()
or:
value = get_row_data(original_parameters)\
    .refine_data(level=3)\
    .transfer_to_style_c()
I like using the backslash \ and putting each .function on a new line. That gives each function call its own line, which is easy to read. But many people seem not to prefer this. And when the code has subtle errors that are hard to debug, I always start to worry that there might be a space or something after the backslash (\).
To quote from the Python style guide:
Long lines can be broken over multiple lines by wrapping expressions
in parentheses. These should be used in preference to using a
backslash for line continuation. Make sure to indent the continued
line appropriately. The preferred place to break around a binary
operator is after the operator, not before it.
I tend to prefer the following, which eschews the non-recommended \ at the end of a line, thanks to an opening parenthesis:
value = (get_row_data(original_parameters)
         .refine_data(level=3)
         .transfer_to_style_c())
One advantage of this syntax is that each method call is on its own line.
A similar kind of \-less structure is also often useful with string literals, so that they don't go beyond the recommended 79 character per line limit:
message = ("This is a very long"
           " one-line message put on many"
           " source lines.")
This is a single string literal, which is created efficiently by the Python interpreter (this is much better than summing strings, which creates multiple strings in memory and copies them multiple times until the final string is obtained).
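To try the parenthesized style, here is a self-contained sketch. The Rows class and the bodies of get_row_data, refine_data, and transfer_to_style_c are stand-ins invented so that the chain from the question can actually run; only the layout of the final expressions is the point.

```python
class Rows:
    """Toy container standing in for whatever get_row_data really returns."""

    def __init__(self, values):
        self.values = list(values)

    def refine_data(self, level):
        # Keep only the values at or above the requested level.
        return Rows(v for v in self.values if v >= level)

    def transfer_to_style_c(self):
        return ", ".join(str(v) for v in self.values)


def get_row_data(original_parameters):
    return Rows(original_parameters)


# The opening parenthesis lets every method call sit on its own line.
value = (get_row_data([1, 2, 3, 4])
         .refine_data(level=3)
         .transfer_to_style_c())

# Adjacent string literals are joined into one literal at compile time.
message = ("This is a very long"
           " one-line message put on many"
           " source lines.")
```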
Python's code formatting is nice.
What about this option:
value = get_row_data(original_parameters,
    ).refine_data(level=3,
    ).transfer_to_style_c()
Note that commas are redundant if there are no other parameters but I keep them to maintain consistency.
The answer that neither quotes my own preference (although see the comments on your question :)) nor lists alternatives is:
Stick to the style guidelines on any project you have already - if not stated, then keep as consistent as you can with the rest of the code base in style.
Otherwise, pick a style you like and stick with that - and let others know somehow that's how you'd appreciate chained function calls to be written if not reasonably readable on one-line (or however you wish to describe it).

Why not python implicit line continuation on period?

Is there any reason Python does not allow implicit line continuations after (or before) periods? That is:
data.where(lambda d: e.name == 'Obama').
    count()
or:
data.where(lambda d: e.name == 'Obama')
    .count()
Does this conflict with some feature of Python? With the rise of method chaining APIs this seems like a nice feature.
Both of those situations can lead to valid, complete constructs, so continuing on them would complicate the parser.
print 3.
1415926
print 'Hello, world'
.lower()
Python allows line continuations within parentheses (), so you might try:
(data.where(lambda d: e.name == 'Obama').
    count())
I know that's not answering your question ("why?"), but maybe it's helpful.
Use a '\' at the end. (looks ugly though)
data.where(lambda d: e.name == 'Obama').\
    count()
Not sure about after periods, but in your example the newline before a period leads to the first line being a valid statement on its own. Then Python would have to look ahead to the second line to know whether the first line was a statement or not.
One of the goals when defining the language syntax was to be able to parse it without having ambiguities that require looking ahead like that.
It'd get annoying in the interactive interpreter if you had to press enter twice after every single line just so Python knew you'd finished your statement and weren't going to put a .foo() after it.
In the cases where a period could be leading in to a method call, it will always(?) be a syntax error for it to just occur at the end of a line by itself. So it would be unambiguous to read it as starting a continuation.
However, Python generally speaking doesn't continue a line just because there's an incomplete binary operator there. For instance, the following is not valid:
2 +
4
In the second example, the first line is valid by itself and it would be really inconsistent for Python to look for a following line "just in case" there is one.
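These claims are easy to check with the built-in compile function. The helper name parses is made up for this sketch, and the snippets are small stand-ins for the examples discussed in the answers:

```python
def parses(source):
    """Return True if source compiles as a Python module, else False."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A dangling binary operator does not continue the line.
dangling_operator = parses("x = 2 +\n4\n")

# A leading period on the next line is not a continuation either.
leading_period = parses("s = 'Hello, world'\n.lower()\n")

# Inside parentheses, both layouts are accepted.
wrapped_operator = parses("x = (2 +\n     4)\n")
wrapped_period = parses("s = ('Hello, world'\n     .lower())\n")
```

The first two results are False and the last two are True, which matches the advice to wrap a chained expression in parentheses.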
I would just break after the opening paren of the method call.
{Because Python uses line breaks to end statements, not depending on braces or semicolons;}
