I'm trying to decide whether to use input() or sys.stdin.readline() when I need to read lines of input from STDIN, and I wonder how to choose between them in different situations.
I found a previous post (https://codereview.stackexchange.com/questions/23981/how-to-optimize-this-simple-python-program) saying that:
How can I optimize this code in terms of time and memory used? Note that I'm using different functions to read the input, as sys.stdin.readline() is the fastest one when reading strings and input() when reading integers.
Is that statement true?
The builtin input and sys.stdin.readline functions don't do exactly the same thing, and which one is faster may depend on the details of exactly what you're doing. As aruisdante commented, the difference is smaller in Python 3 than it was in Python 2, which is when the quote you provided was written, but there are still some differences.
The first difference is that input has an optional prompt parameter that will be displayed if the interpreter is running interactively. This leads to some overhead, even if the prompt is empty (the default). On the other hand, it may be faster than doing a print before each readline call, if you do want a prompt.
The next difference is that input strips off any newline from the end of the input. If you're going to strip that anyway, it may be faster to let input do it for you, rather than doing sys.stdin.readline().strip().
A final difference is how the end of the input is indicated. input will raise an EOFError when you call it if there is no more input (stdin has been closed on the other end). sys.stdin.readline on the other hand will return an empty string at EOF, which you need to know to check for.
There's also a third option, using the file iteration protocol on sys.stdin. This is likely to behave much like calling readline, but the logic may read a little more nicely.
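For illustration, here is a minimal sketch of the three patterns side by side (in a real program you would pick just one, since each of them consumes stdin; process is a placeholder, not a real function):

import sys

# 1. input(): the end of input is signalled by an exception.
try:
    while True:
        line = input()              # trailing newline already stripped
        # process(line) would go here
except EOFError:
    pass

# 2. sys.stdin.readline(): the end of input is signalled by an empty string.
while True:
    line = sys.stdin.readline()
    if not line:                    # '' only at EOF; a blank input line is '\n'
        break
    # process(line.rstrip('\n')) would go here

# 3. File iteration: the loop simply ends at EOF.
for line in sys.stdin:
    pass                            # process(line.rstrip('\n')) would go here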
I suspect that while differences in performance between your various options may exist, they're likely to be smaller than the time cost of simply reading the file from the disk (if it is large) and doing whatever you are doing with it. I suggest that you avoid the trap of premature optimization: just do what is most natural for your problem, and if the program turns out to be too slow (where "too slow" is very subjective), do some profiling to see what is taking the most time. Don't put a whole lot of effort into deciding between the different ways of taking input unless it actually matters.
As Linn1024 says, for reading large amounts of data input() is much slower.
A simple example is this:
import sys
for i in range(int(sys.argv[1])):
    sys.stdin.readline()
This takes about 0.25μs per iteration:
$ time yes | py readline.py 1000000
yes 0.05s user 0.00s system 22% cpu 0.252 total
Changing that to sys.stdin.readline().strip() takes that to about 0.31μs.
Changing readline() to input() is about 10 times slower:
$ time yes | py input.py 1000000
yes 0.05s user 0.00s system 1% cpu 2.855 total
Notice that it's still pretty fast, though, so you only really need to worry when you are reading a very large number of entries, as in the million-line example above.
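For reference, the input() variant of the benchmark isn't shown above; presumably it is the same loop with the call swapped, something like this:

import sys
# input.py -- same benchmark loop, but using the builtin input()
for i in range(int(sys.argv[1])):
    input()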
input() checks whether stdin is a TTY (via a syscall) every time it runs, which makes it much slower than sys.stdin.readline():
https://github.com/python/cpython/blob/af2f5b1723b95e45e1f15b5bd52102b7de560f7c/Python/bltinmodule.c#L1981
import sys

def solve(N, A):
    for _ in range(N):
        A.append(int(sys.stdin.readline().strip()))
    return A

def main():
    N = int(sys.stdin.readline().strip())
    A = []
    result = solve(N, A)
    print(result)

main()
Related
We've got a script that uses itertools.combinations() and it seems to hang with a large input size.
I'm a relatively inexperienced Python programmer, so I'm not sure how to fix this problem. Is there a more suitable library? Or is there a way to enable verbose logging so that I can debug why the method call is hanging?
Any help is much appreciated.
[Edit]
import itertools

def findsubsets(S, m):
    return set(itertools.combinations(S, m))

# setup not shown in the original snippet, assumed here so the code runs
AllSearchTerms = []   # the list of search terms (contents not shown in the question)
S = []
itemsize = 0
sublist = []
PComb = []

for s in AllSearchTerms:
    S.append(itemsize)
    itemsize = itemsize + 1

for i in range(1, 6):
    Subset = findsubsets(S, i)
    for sub in Subset:
        for s in sub:
            sublist.append(AllSearchTerms[s])
        PComb.append(sublist)
        sublist = []
You have two things in your code that will hang for large input sizes.
First, your function findsubsets calls itertools.combinations and then converts the result to a set. The result of itertools.combinations is a generator, yielding each combination one at a time without storing or calculating them all at once. When you convert that to a set, you force Python to calculate and store them all at once. Therefore the line return set( itertools.combinations(S, m) ) is almost certainly where your program hangs. You can check that by placing print statements (or some other kind of logging statements) immediately before and after that line; if you see the preceding print and the program hangs before you see the succeeding one, you have found the problem. The solution is not to convert the combinations to a set. Leave it as a generator, and your program can grab one combination at a time, as needed.
Second, even if you do what I just suggested, your loop for sub in Subset: is a fairly tight loop that uses every combination. If the input size is large, that loop will take a very long time, and the change from the previous paragraph will not help. You probably should reorganize your program to avoid the large input sizes, or at least show some kind of progress during that loop. The combinations function has a predictable output size, so you can even show the percent done in a progress bar.
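A minimal sketch of both suggestions, reusing the names from your code (AllSearchTerms is left as placeholder data here, and math.comb needs Python 3.8+):

import itertools
import math

AllSearchTerms = []   # your real list of search terms goes here
S = list(range(len(AllSearchTerms)))   # indices into AllSearchTerms
PComb = []

for i in range(1, 6):
    total = math.comb(len(AllSearchTerms), i)   # predictable number of combinations
    for done, sub in enumerate(itertools.combinations(S, i), start=1):
        PComb.append([AllSearchTerms[s] for s in sub])   # one combination at a time
        if done % 100000 == 0:
            print("size %d: %.1f%% done" % (i, 100.0 * done / total))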
There is no logging inside itertools.combinations since it is not needed when used properly, and there is no logging in the conversion of the generator to a set. You can implement logging in your own tight loop, if needed.
I inadvertently ran across a phenomenon that has me a bit perplexed. I was using IDLE for some quick testing, and I had some very simple code like this (which I have simplified for the purpose of illustration):
from time import clock  # I am presently using Windows

def test_speedup():
    c = clock()
    for i in range(1000):
        print i,
    print '=>', clock() - c
Now I ran this code like so (several times, with the same basic results):
# without pressing enter
>>> test_speedup()
0 1 2 3 4 . . . 997 998 999 => 12.8300956124 # the time to run code in seconds
# pressing enter ONLY 3 TIMES while the code ran
>>> test_speedup()
0 1 2 3 4 . . . 997 998 999 => 4.8656890089
# Pressing enter several times while the code ran
>>> test_speedup()
0 1 2 3 4 . . . 997 998 999 => 1.91522580283
My first hunch was that perhaps the code ran faster because the output system perhaps did not have to collate as many strings when I pressed enter (beginning anew each time I pressed enter). In fact, the output system always seemed to receive a boost in speed immediately after I pressed enter.
I have also reviewed the documentation here, but I am still a little puzzled why three newline characters would speed things up so substantially.
This question is admittedly somewhat trivial, unless one would like to know how to speed up the output system in IDLE without aborting the script. Still, I would like to understand what is happening with the output system here.
(I am using Python 2.7.x.)
Under the covers, IDLE is simulating a terminal on top of a Tk widget, which I'm pretty sure is ultimately derived from Text.
The presence of long lines slows down that widget a little bit. And appending to long lines takes longer than appending to short ones. If you really want to understand why this happens, you need to look at the Tcl code underlying the Tk Text widget, which Tkinter.Text is just a thin wrapper around.
Meanwhile, the Tkinter loop that IDLE runs does some funky things to allow it to accept input without blocking the loop. When it thinks there's nothing else going on, it may sometimes block for a short time until it sees input or a Tk event, and all of those short blocks may add up; pressing Enter may just cancel a few of them.
I'm not actually sure which of these two is more relevant here. You'd have to test it with a program that just spams long lines vs. one that inserts newlines every, say, 10 numbers, and see how much of the performance improvement you get that way.
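For example, a rough way to compare the two cases, in the question's Python 2.7 style (the function names here are mine, and time.time() may track the perceived sluggishness better than clock() on non-Windows systems):

from time import clock

def spam_one_long_line(n=1000):
    # everything is appended to one ever-growing line, as in the question
    c = clock()
    for i in range(n):
        print i,
    print '=>', clock() - c

def newline_every_ten(n=1000):
    # same numbers, but a fresh, short line is started every 10 of them
    c = clock()
    for i in range(n):
        print i,
        if i % 10 == 9:
            print
    print '=>', clock() - c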
From a quick test on my Mac, the original program visibly slows down gradually, and also has two quantum jumps in sluggishness around 500 and 920. So, it makes sense that hitting enter every 333 or so would speed things up a lot—you likely avoid both of those quantum slowdowns. If I change it to just remove the comma, the problem goes away.
Printing a newline for every number could of course cause a different slowdown, because that makes the terminal long enough to need to scroll, increases the scrollback buffer, etc. I didn't see that cost in IDLE, but run the same thing on the Windows command line, and you'll see problems with too many newlines just as bad as the problems with too few in IDLE. The best tradeoff should probably come from "square"-ish data, or data that's as close to 80 columns as possible without going over.
I believe this has something to do with the IDE that you are using to execute your code.
I have run your code (with syntax alterations for 3.3) and got the same time to execute both times:
999
=> 8.09542283367021
Edit: This was performed in stock IDLE with Python 3.3. I spammed the enter key across multiple experiments and saw no time difference between the control run and the experiment.
On a hunch, I decided to remove a print function and condense it into one line:
print(i, "=>", clock()-c)
This produced 999 => 7.141783124325343
On this basis, I believe that the time differential you are seeing comes from how the IDE allocates threads to process your code, which results in faster times. Clearly, the time at the end is the total amount of time to compute and print your for loop.
To confirm my suspicion, I decided to put everything into a list of tuples and then print that list.
from time import clock

def new_speed_test():
    speed_list = []
    c = clock()
    for i in range(1000):
        speed_list.append((i, clock() - c))
    print(speed_list[-1])
which output: (999, 0.0005668934240929957)
To conclude: My experiment confirms it is how your IDE handles output and CPU consumption.
I am trying to make a program which has a raw_input in a loop. If anyone presses a key while the long loop is running, the next raw_input takes that as its input. How do I avoid that?
I don't know what else to add to this simple question. Do let me know if more is required.
EDIT
Some code
for i in range(1000):
    var = raw_input("Enter the number")
    # .... do some long magic and stuff here which takes a few seconds
    print 'Output is ' + str(output)
So if someone presses something during the magic phase, that is taken as the input for the next loop. That is where the problem begins. (And yes, the loop has to run 1000 times.)
This works for me on Windows 7 64-bit, Python 2.7.

import msvcrt

def flush_input():
    while msvcrt.kbhit():
        msvcrt.getch()
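Dropping that into the loop from the question would look roughly like this (Python 2, Windows only; flush_input is the function defined just above, and the long "magic" part is left as a placeholder):

for i in range(1000):
    # throw away anything typed while the previous iteration was busy
    flush_input()
    var = raw_input("Enter the number: ")
    # ... do some long magic and stuff here which takes a few seconds ...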
I put the OS in the title, Windows 7 64-bit to be specific. I saw the answers there. They do apply, but by god they are so big. Aren't there other n00b-friendly and safer ways to take inputs?
Let me try to explain why you need to do such an elaborate process. When you press a key, it is stored in a section of computer memory called the keyboard buffer (not to be confused with the stdin buffer). This buffer stores the keys pressed until they are processed by your program. Python doesn't provide any platform-independent wrapper to do this task; you have to rely on OS-specific system calls to access this buffer and flush it, read it, or query it. msvcrt is the MS VC++ runtime library, and Python's msvcrt module provides a wrapper over it. Unless you want a platform-independent solution, it is quite straightforward.
Use msvcrt.getch() to read a character from the console, msvcrt.kbhit() to test whether a key press is waiting in the keyboard buffer, and so on. So, as MattH has shown, it's just a couple of lines of code. And if you think you are a noob, take this opportunity to learn something new.
Just collect your input outside of the loop (before you enter the loop). Do you really want the user to enter 1000 numbers? Well, maybe you do. But just include a loop at the top, collect the 1000 numbers at the start, and store them in an array.
Then, in the bottom half, change your loop so it just does all the work. Then if someone enters something on the keyboard, it doesn't really matter anymore.
Something like this:
def getvars(top=1000):
    vars = []
    for i in range(0, top):
        anum = int(raw_input('%d) Please enter another number: ' % i))
        vars.append(anum)
    return vars

def doMagic(numbers):
    top = len(numbers)
    for number in numbers:
        # do magic number stuff
        print 'this was my raw number %s' % number

if __name__ == "__main__":
    numbers = getvars(top=10)
    doMagic(numbers)
Presented in a different sort of way, and less OS-dependent.
There is another way to do it that should work. I don't have a Windows box handy to test it out on, but it's a trick I used to use, and it's rather undocumented. Perhaps I'm giving away secrets... but it's basically like this: trick the OS into thinking your app is a screensaver by calling the API that turns on the screensaver function at the start of your magic calculations. At the end of your magic calculations, or when you are ready to accept input again, call the API again and turn off the screensaver functionality.
That would work.
There is another way to do it as well. Since you are on Windows, this will work too, but it's a fair amount of work (though not really too much). In Windows, the window that is in the foreground (at the top of the Z order) gets the 'raw input thread', which receives the mouse and keyboard input. So to capture all input, all you need to do is create a function that stands up a transparent (or non-transparent) window that sits at the top of the Z order (SetWindowPos would do the trick), have it cover the entire screen, and perhaps display a message such as 'Even geduld' ('please wait' in Dutch) or just 'Please wait'.
When you are ready to prompt the user for more input, use ShowWindow() to hide the window, show the previous results, get the input, and then re-show the window to capture the keys/mouse all over again.
Of course, all these solutions tie you to a particular OS unless you implement some sort of try/except handling and/or wrapping of the low-level Windows SDK calls.
I'm working on a SPOJ problem, INTEST. The goal is to specify the number of test cases (n) and a divisor (k), then feed your program n numbers. The program will accept each number on a newline of stdin and after receiving the nth number, will tell you how many were divisible by k.
The only challenge in this problem is getting your code to be FAST because k can be anything up to 10^7 and n can be as high as 10^9.
I'm trying to write it in Python and have trouble speeding it up. Any ideas?
Edit 2: I finally got it to pass at 10.54 seconds. I used nearly all of your answers to get there, and thus it was hard to choose one as 'correct', but I believe the one I chose sums it up the best. Thanks to you all. Final passing code is below.
Edit: I included some of the suggested updates in the included code.
Extensions and third-party modules are not allowed. The code is also run by the SPOJ judge machine, so I do not have the option of changing interpreters.
import sys
import psyco
psyco.full()

def main():
    from sys import stdin, stdout
    first_in = stdin.readline()
    thing = first_in.split()
    n = int(thing[0])
    k = int(thing[1])
    total = 0
    lines = stdin.readlines()
    for item in lines:
        if int(item) % k == 0:
            total += 1
    stdout.write(str(total) + "\n")

if __name__ == "__main__":
    main()
[Edited to reflect new findings and passing code on spoj]
Generally, when using Python for SPOJ:
Don't use "raw_input", use sys.stdin.readlines(). That can make a difference for large input. Also, if possible (and it is, for this problem), read everything at once (sys.stdin. readlines()), instead of reading line by line ("for line in sys.stdin...").
Similarly, don't use "print", use sys.stdout.write() - and don't forget "\n". Of course, this is only relevant when printing multiple times.
As S.Mark suggested, use psyco. It's available for both Python 2.5 and Python 2.6 at SPOJ (test it, it's there, and easy to spot: solutions using psyco usually have a ~35 MB memory usage offset). It's really simple: just add, after "import sys": import psyco; psyco.full()
As Justin suggested, put your code (except the psyco incantation) inside a function, and simply call it at the end of your code.
Sometimes creating a list and checking its length can be faster than creating a list and adding its components.
Favour list comprehensions (and generator expressions, when possible) over "for" and "while" as well. For some constructs, map/reduce/filter may also speed up your code.
Using (some of) these guidelines, I've managed to pass INTEST. Still testing alternatives, though.
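Put together, those guidelines amount to something like the following sketch (this is not my exact submission; psyco is only available on the judge's old Python 2 interpreters):

import sys
import psyco
psyco.full()

def main():
    data = sys.stdin.readlines()   # read everything at once
    k = int(data[0].split()[1])    # n, the line count, isn't needed when reading all lines
    # count the divisible entries with a generator expression instead of an explicit loop
    total = sum(1 for line in data[1:] if int(line) % k == 0)
    sys.stdout.write(str(total) + "\n")

if __name__ == "__main__":
    main()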
Hey, I got it to be within the time limit. I used the following:
Psyco with Python 2.5.
a simple loop with a variable to keep count in
my code was all in a main() function (except the psyco import) which I called.
The last one is what made the difference. I believe that it has to do with variable visibility, but I'm not completely sure. My time was 10.81 seconds. You might get it to be faster with a list comprehension.
Edit:
Using a list comprehension brought my time down to 8.23 seconds. Bringing the line from sys import stdin, stdout inside the function shaved off a little more, bringing my time down to 8.12 seconds.
Use psyco; it will JIT your code, which is very effective when there are big loops and lots of calculations.
Edit: Looks like third-party modules are not allowed, so you may try converting your loop to a list comprehension instead; it is supposed to run at C level, so it should be a little bit faster.
sum(1 if int(line) % k == 0 else 0 for line in sys.stdin)
Just recently Alex Martelli said that invoking code inside a function outperforms code run at module level (I can't find the post, though).
So, why don't you try:
import sys
import psyco
psyco.full()

def main():
    first_in = raw_input()
    thing = first_in.split()
    n = int(thing[0])
    k = int(thing[1])
    total = sum(1 if int(line) % k == 0 else 0 for line in sys.stdin)
    print total

if __name__ == "__main__":
    main()
IIRC, the reason was that code inside a function can be optimized better: for example, local variable lookups are faster than global ones.
Using list comprehensions with psyco is counterproductive.
This code:
count = 0
for l in sys.stdin:
    count += not int(l) % k
runs twice as fast as
count = sum(not int(l)%k for l in sys.stdin)
when using psyco.
For other readers, here is the INTEST problem statement. It's intended to be an I/O throughput test.
On my system, I was able to shave 15% off the execution time by replacing the loop with the following:
print sum(1 for line in sys.stdin if int(line) % k == 0)