Clearing the terminal window in Python, memory/performance - python

I'm using a script that runs for many hours, which prints statements to verify whether or not issues might have arisen (data is downloaded from the web, which sometimes gets distorted).
I've noticed a significant drop in performance after a while. I suspect that the many thousands of lines of print statements might be the reason.
It is commonly known that the terminal can be cleared of these print statements by the following line of code:
import os
os.system('cls') # for windows
Still, I suspect that this doesn't actually improve the performance speed and that it's merely a perceived improvement due to the fact that the screen is cleared. Is that true or not?
I've also considered suppressing certain print statements with the following code:
import sys
class NullWriter(object):
    def write(self, arg):
        pass
nullwrite = NullWriter()
oldstdout = sys.stdout
sys.stdout = oldstdout # enable output
print("text that I want to see")
sys.stdout = nullwrite # disable output
print("text I don't want to see")
My question: How can I improve the performance (speed) of my script, given that I still want to see the most recent print statements?

If you like, you can write a carriage return without a line feed and overwrite the last line:
sys.stdout.write("\rDoing things")
sys.stdout.flush()
Printing over time shouldn't use any extra memory within Python, but your terminal's scrollback buffer might be set too high, which can use a lot of memory. Or the slowdown may simply come from flushing the buffer, because you're writing to stdout so quickly.
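If the sheer rate of writes is the problem, one option is to overwrite a single status line and skip updates that arrive too soon after the previous one. The following is only a minimal sketch of that idea: the class name ThrottledStatus, its min_interval parameter, and the injectable clock are made up for illustration and do not come from the question.

```python
import sys
import time


class ThrottledStatus:
    """Overwrite one status line, at most once per `min_interval` seconds."""

    def __init__(self, stream=None, min_interval=0.5, clock=time.monotonic):
        self.stream = stream if stream is not None else sys.stdout
        self.min_interval = min_interval
        self.clock = clock  # injectable so the timing can be tested
        self._last = None   # time of the last write, None before the first one

    def update(self, message):
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False  # too soon after the last write: drop this message
        self.stream.write("\r" + message)  # carriage return, no line feed
        self.stream.flush()
        self._last = now
        return True


if __name__ == "__main__":
    status = ThrottledStatus(min_interval=0.5)
    for i in range(100):
        time.sleep(0.01)  # stand-in for real work
        status.update("processed %d items" % i)
    print()  # end the status line with a newline
```

Messages that are dropped are lost, so this only suits progress output where the most recent state is what matters.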
You can also use the print function
Python 2.6+
From Python 2.6 you can import the print function from Python 3:
from __future__ import print_function
This allows you to use the Python 3 solution below.
Python 3
In Python 3, the print statement has been changed into a function, so you can instead do:
print('.', end='')
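The question's idea of switching sys.stdout to a null writer can also be expressed with the standard library in Python 3.4+. This is a sketch under that assumption; the helper name quiet is hypothetical. Whether it speeds up the script depends on how much time the terminal output actually costs.

```python
import contextlib
import os


@contextlib.contextmanager
def quiet():
    """Temporarily send everything printed to stdout to the null device."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull):
            yield


print("text that I want to see")
with quiet():
    print("text I don't want to see")  # discarded
print("visible again")
```

Because redirect_stdout restores the previous stream on exit, output is re-enabled even if the suppressed block raises an exception.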

Related

`time.sleep()` causing previous `print()` with `end=''` to delay [duplicate]

I have a python script that performs a simulation. It takes a fairly long, varying time to run through each iteration, so I print a . after each loop as a way to monitor how fast it runs and how far it went through the for statement as the script runs. So the code has this general structure:
for step in steps:
    run_simulation(step)
    # Python 3.x version:
    print('.', end='')
    # for Python 2.x:
    # print '.',
However, when I run the code, the dots do not appear one by one. Instead, all the dots are printed at once when the loop finishes, which makes the whole effort pointless. How can I print the dots inline as the code runs?
This problem can also occur when iterating over data fed from another process and trying to print results, for example to echo input from an Electron app. See Python not printing output.
The issue
By default, output from a Python program is buffered to improve performance. The terminal is a separate program from your code, and it is more efficient to store up text and communicate it all at once, rather than separately asking the terminal program to display each symbol.
Since terminal programs are usually meant to be used interactively, with input and output progressing a line at a time (for example, the user is expected to hit Enter to indicate the end of a single input item), the default is to buffer the output a line at a time.
So, if no newline is printed, the print function (in 3.x; print statement in 2.x) will simply add text to the buffer, and nothing is displayed.
Outputting in other ways
Every now and then, someone will try to output from a Python program by using the standard output stream directly:
import sys
sys.stdout.write('test')
This will have the same problem: if the output does not end with a newline, it will sit in the buffer until it is flushed.
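The buffering behaviour described here can be observed without a real terminal. The following is a minimal sketch, not how the interpreter actually sets up sys.stdout: it assumes an in-memory io.BytesIO as a stand-in for the terminal and wraps it in an io.TextIOWrapper with line_buffering=True. The names raw and stream are made up for illustration.

```python
import io

# Stand-in for a terminal: bytes only show up in `raw` once the text layer flushes.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n", line_buffering=True)

stream.write("....")        # no newline: the text is held in the wrapper's buffer
pending = raw.getvalue()    # snapshot of what has reached `raw` so far

stream.write(" done\n")     # a newline makes the line-buffered wrapper flush
after_newline = raw.getvalue()

stream.write("more")        # held back again
stream.flush()              # an explicit flush pushes it through
after_flush = raw.getvalue()

print(repr(pending), repr(after_newline), repr(after_flush))
```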
Fixing the issue
For a single print
We can explicitly flush the output after printing.
In 3.x, the print function has a flush keyword argument, which allows for solving the problem directly:
import time
for _ in range(10):
    print('.', end=' ', flush=True)
    time.sleep(.2) # or other time-consuming work
In 2.x, the print statement does not offer this functionality. Instead, flush the stream explicitly, using its .flush method. The standard output stream (where text goes when printed, by default) is made available by the sys standard library module, and is named stdout. Thus, the code will look like:
import sys
import time
for _ in range(10):
    print '.',
    sys.stdout.flush()
    time.sleep(.2) # or other time-consuming work
For multiple prints
Rather than flushing after every print (or deciding which ones need flushing afterwards), it is possible to disable the output line buffering completely. There are many ways to do this, so please refer to the linked question.
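One of those ways, sketched here for Python 3, is to define a print variant that always flushes. The name print_now is made up for this example. Running the interpreter with the -u option or setting the PYTHONUNBUFFERED environment variable are alternatives that need no code changes.

```python
import functools
import time

# A print variant that flushes the stream after every call.
print_now = functools.partial(print, flush=True)

for _ in range(5):
    print_now('.', end='')
    time.sleep(0.1)  # stand-in for time-consuming work
print_now()  # finish the line
```

The partial still accepts the usual keyword arguments such as end, sep and file, so it can replace print call by call.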

Threading with Python Curses giving me weird characters?

Hey there Stack Overflow. I'm trying to build a testing script that should mix outputting changing characters (using curses) on multiple lines (creating them over time), creating new lines based on the thread number.
I have the below code:
# -*- coding: utf-8 -*-
import curses, time, threading
def threadedFunction(linePos):
    stdscr = curses.initscr()
    curses.noecho()
    curses.cbreak()
    try:
        stdscr.clear()
        for i in range(50):
            stdscr.addstr(linePos, 0, "testing %s..." % i)
            stdscr.refresh()
            time.sleep(.1)
    finally:
        curses.echo()
        curses.nocbreak()
        curses.endwin()
        pass
    pass
if __name__ == "__main__":
    for x in xrange(0, 4): # should produce 5 lines maximum
        exec("process" + str(x) + " = threading.Thread(target = threadedFunction, args = (" + str(x) + ",))")
        exec("process" + str(x) + ".start()")
I tried using the multithreading library before, but I had no hope with it. The threading library at least will display the numbers I want on a few lines before it goes crazy. Here's an example of what it does when I run it:
All I want is for the program to just simply start a new thread, and display a line that counts to 50 while adding new lines doing the same thing. How would I go about doing this?? Thanks in advance :)
Printing to the terminal from multiple threads will give you intermingled output like that. It is a very simple example of a race condition. Use some kind of locking mechanism to coordinate writes to the terminal, or make sure to write from only one thread (for example, by using a FIFO queue to pass messages to a dedicated writing thread, which then writes them to the terminal).
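The single-writer approach can be sketched with the standard queue and threading modules (Python 3). This sketch uses plain line output rather than curses, and the function names writer, worker and run are hypothetical.

```python
import queue
import sys
import threading


def writer(messages, stream):
    """Only this thread ever touches the output stream."""
    while True:
        line = messages.get()
        if line is None:  # sentinel: no more messages will arrive
            break
        stream.write(line + "\n")
        stream.flush()


def worker(worker_id, messages, count):
    """Workers never write directly; they hand messages to the queue."""
    for i in range(count):
        messages.put("worker %d: testing %d..." % (worker_id, i))


def run(num_workers=4, count=50, stream=None):
    stream = stream if stream is not None else sys.stdout
    messages = queue.Queue()
    out = threading.Thread(target=writer, args=(messages, stream))
    out.start()
    workers = [threading.Thread(target=worker, args=(n, messages, count))
               for n in range(num_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    messages.put(None)  # all workers are done, so stop the writer
    out.join()


if __name__ == "__main__":
    run(num_workers=2, count=3)
```

Lines from different workers can still appear in any order, but each line is written whole because only one thread performs the writes.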
The weird numbers you see are part of the ANSI escape sequences that programs use to access special features of the terminal: writing \x1B[nF to the output will make your terminal move the cursor n lines up, for example. Curses is outputting such codes for you, and because the terminal interprets them according to their ANSI meaning, you don't usually see them. But because of the multithreading issue, those sequences become mingled and invalid, and parts of them get printed to the screen.
Even if you only use curses in one thread, other processing-heavy threads can disrupt the escape sequences in the curses thread. The environment variable $ESCDELAY indicates how long (in ms) to wait after an escape code (0x1B) is sent; and if more than that time elapsed, a ^[ keystroke (ESC) is returned by get_wch().
Use stdscr.noutrefresh() instead of stdscr.refresh(), then call curses.doupdate() in a designated thread that handles the update. The idea is to do curses.doupdate() only in one thread.

Curious python print behavior

I'm using a print statement in a python 2.7 script in which I'm creating instances of data modeling classes. They're fairly large classes which do a good number of calculations in property setters during the init, so it's not the fastest executing script. I use print statements to have some idea of progress, but what's interesting is how they're executing. The code looks something like this:
from __future__ import absolute_import, division, print_function, unicode_literals
print('Loading data...', end='\t')
data = LoadData(data_path)
first_model = FirstModel(parameters).fit(data)
print('Done.\nFitting second model...', end='\t')
# prints 'Done.' and then there's a very long pause...
# suddenly 'Fitting second model...' prints and the next model initializes almost immediately
second_model = SecondModel(parameters).fit(data)
results = second_model.forecast(future_dates)
Why would the statement print('Done.\nFitting second model...', end='\t') first print 'Done.' and then pause for a long period of time? There was one instance when I was running this code, and after the 'Done.' printed I got an error before the rest of the statement printed. The error returned was an error in SecondModel where I tried to access a method as an attribute. What's going on here? How or why is Python executing this print statement in such a counterintuitive way? It's as if the interpreter views the newline character as an indication that it should start looking at later parts of the code.
By default, print calls are buffered. The buffer is flushed whenever a newline character is encountered (therefore, you see Done\n appear). However, the subsequent text is kept in the buffer until the next event that flushes it (in the absence of some subsequent newline character to print, that'll probably be Python either returning to the command prompt or exiting completely to the shell, depending on how you're running this script). Therefore, your time-consuming call to SecondModel().fit() is occurring between the display of the two lines.
To avoid this, you can flush the buffer manually by calling sys.stdout.flush() immediately after the print. Or, if you were ever to move to Python 3.3 or higher, you would be able to shortcut this by passing the additional argument flush=True into print().
Error messages can interrupt printed output, and vice versa, because by default they are handled by two separate streams: sys.stderr and sys.stdout, respectively. The two streams have separate buffers.
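That the two streams are independent can be seen by capturing them separately. This is a small Python 3.5+ sketch; the variable names and the example error text are invented for illustration and do not reproduce the asker's actual traceback.

```python
import contextlib
import io
import sys

captured_out = io.StringIO()
captured_err = io.StringIO()

with contextlib.redirect_stdout(captured_out), contextlib.redirect_stderr(captured_err):
    print("Fitting second model...", end="\t")         # goes to sys.stdout
    print("AttributeError: example", file=sys.stderr)  # goes to sys.stderr
    print("Done.")                                     # goes to sys.stdout

print(repr(captured_out.getvalue()))
print(repr(captured_err.getvalue()))
```

On a real terminal both streams end up on the same screen, so the order in which their text appears depends on when each one is flushed.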

Problems refreshing stdout line using print with python

I've been trying to print out the progress of a for loop in python2.7 using the following code:
for i in range(100):
    if float(i) % 10.0 == 0:
        print i, "\r",
The behaviour I'm after is the refreshing of the same line on std out rather than writing to a new line every time.
EDIT 1:
Testing in my console (Xfce Terminal 0.4.8), I actually don't get any output regardless of whether I include the if statement or not.
Why is there no output?
I originally said the behaviour of the stdout changed depending on the if statement being there or not because I simplified the code that produced the problem to its most simple form (only to produce the above mentioned effect). My apologies.
EDIT 2:
Thanks to senderle, this is solved. If you miss out the sleep() command, the prints and carriage return happen so quickly you can't see them.
EDIT 3:
One last thing. If you don't catch for the final number in range(100), i.e. 99, the number is cleared off the screen.
EDIT 4:
Note the comma after print i in senderle's answer.
I have found that using sys.stdout is a more system-independent way of doing this, for various reasons having to do with the way print works. But you have to flush the buffer explicitly, so I put it in a function.
def carriage_return():
    sys.stdout.write('\r')
    sys.stdout.flush()
This is kind of a WAG. Let me know if it helps.
I tried this and it works for me. The time.sleep is just for dramatization.
import sys, time
def carriage_return():
    sys.stdout.write('\r')
    sys.stdout.flush()
for i in range(100):
    if i % 10 == 0:
        print i,
        carriage_return()
    time.sleep(1)
Finally, I have seen people do this as well. Using terminal control codes like this seems right in some ways, but it also seems more brittle to me. This works for me with the above code as well (on OS X).
def carriage_return():
    if sys.platform.lower().startswith('win'):
        print '\r'
    else:
        print chr(27) + '[A'
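For Python 3, the same carriage-return approach might look like the sketch below. It also covers the two points raised in the question's edits: the line is padded so that a shorter message fully covers a longer one, and the final value is followed by a newline so that it is not wiped off the screen. The helper names progress_line and show_progress, and the width parameter, are assumptions made for this example.

```python
import sys
import time


def progress_line(current, total, width=20):
    """Build a status string that begins with a carriage return."""
    text = "%d/%d" % (current, total)
    # Pad with spaces so a shorter message fully covers a longer earlier one.
    return "\r" + text.ljust(width)


def show_progress(total, stream=None, delay=0.0):
    stream = stream if stream is not None else sys.stdout
    for i in range(total):
        if i % 10 == 0:
            stream.write(progress_line(i, total))
            stream.flush()
        time.sleep(delay)  # stand-in for real work
    # Write the final state and end the line so the shell prompt does not erase it.
    stream.write(progress_line(total, total) + "\n")
    stream.flush()


if __name__ == "__main__":
    show_progress(100, delay=0.01)
```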
Testing your code as is, just adding a colon at the end of the first line, works fine with Python 2.7 32-bit on Windows 7 64-bit.
Do you have any other writes to stdout in your if or for block that could be causing the newlines to be written out?
