Python thread fails to time out when terminal command requires user input

The Task
I'm building a Python script to audit a number of .tex files. One step in this auditing process is to test whether each file will compile, each file being compiled using the terminal command xelatex filename.tex.
These are the methods with which I'm testing whether a given file compiles:
def run_xelatex(self):
    """ Ronseal. """
    self.latex_process = Popen(["xelatex", "current.tex"], stdout=PIPE)
    lines = self.latex_process.stdout.readlines()
    for line in self.latex_process.stdout:
        self.screentext = self.screentext+line.decode("utf-8")+"\n"

def attempt_to_compile(self):
    """ Attempt to compile an article, and kill the process if
    necessary. """
    thread = Thread(target=self.run_xelatex())
    thread.start()
    thread.join(3)
    if thread.is_alive():
        self.latex_process.kill()
        thread.join()
        return False
    return True
In English: I create a thread, which in turn creates a process, which in turn tries to compile a given file. If the thread times out, then that file is marked as being uncompilable.
The Problem
The problem is that, if xelatex finds some bad syntax, it asks the user for manual input in order to resolve the issue. But then, for some reason, the thread does not time out when the process is waiting for user input. This means that, when I try to run the script, it stops in mid-flow at several points, until I mash the return key to get things going again. This is not ideal.
What I Want
An explanation of why a thread fails to time out when a process within it asks for user input.
A solution to the problem, either by forcing the thread to time out in the above circumstances, or by preventing xelatex from asking for user input (a sketch of the latter follows this list).
Alternatively, an explanation for why what I'm trying to achieve is totally insane, and a suggestion for a better line of attack.
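For reference, a minimal sketch of the second option (not from the question; it assumes Python 3 and uses xelatex's -interaction flag together with communicate()'s timeout instead of a separate thread):

from subprocess import Popen, PIPE, TimeoutExpired

def attempt_to_compile(path="current.tex"):
    # "-interaction=nonstopmode" stops xelatex pausing for user input on errors;
    # "batchmode" would additionally silence most console output.
    process = Popen(["xelatex", "-interaction=nonstopmode", path],
                    stdout=PIPE, stderr=PIPE)
    try:
        out, err = process.communicate(timeout=30)  # seconds; adjust to taste
    except TimeoutExpired:
        process.kill()
        process.communicate()  # reap the killed process
        return False
    return process.returncode == 0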

Related

Python3 prompt user input for a limited time, then enter a default if user doesn't respond

Python3: I'm trying to allow a user a given amount of time to enter a response into an input, but if that time elapses and they haven't entered anything, I want to abort the input and assign a default value to the variable that is storing the input, or otherwise feed default values into the input statement after the given time period.
I've tried this:
from threading import Timer
timeout = 2
t = Timer(timeout, print, ["\nSorry, time is up"])
t.start()
answer = input("You have 2 seconds to answer:")
t.cancel()
print(answer)
from a different stack overflow post, but the problem is that the interpreter still prompts the user for input even after the final line is executed and answer is printed, and this won't work for what I'm trying to do (essentially, a command line game that needs to keep going when the player isn't giving it input but update when it does receive input).
What is the best way to do this? I know python doesn't really have a timeout function or something like that, but is there any way to achieve this via system commands or a module?
There are several plausible approaches (some of which are probably Unix-specific):
Read the input in a subprocess that can be killed after the timeout. (Many functions, including subprocess.run, accept a timeout parameter to automate this.)
Use alarm or similar to send your own process a signal (and install a signal handler that throws an exception); a sketch of this approach follows the list.
Have another thread close the descriptor being read after the timeout; this sounds drastic but is generally said to work, so long as you don’t accidentally open a file on the closed descriptor before restoring it with dup2.
Read the input with lower-level facilities like non-blocking read and select—which will unfortunately disable nice things like readline.
In any case, you have to decide what to do with incomplete but non-empty input entered before the timeout. The terminal driver will likely transfer it invisibly to the next input prompt by default.
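A minimal sketch of the alarm-based approach from the list above (Unix-only; the handler and exception names are illustrative):

import signal

class InputTimeout(Exception):
    pass

def on_alarm(signum, frame):
    raise InputTimeout

signal.signal(signal.SIGALRM, on_alarm)
signal.alarm(2)                       # deliver SIGALRM in 2 seconds
try:
    answer = input("You have 2 seconds to answer: ")
except InputTimeout:
    answer = "default"
finally:
    signal.alarm(0)                   # cancel any pending alarm
print(answer)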
Using a select call may be an easier way.
import sys, select

print("You have ten seconds to answer!")
i, o, e = select.select([sys.stdin], [], [], 10)
if i:
    print("You said", sys.stdin.readline().strip())
else:
    print("You said nothing!")
Refer to Keyboard input with timeout?

pexpect did not block my script with pexpect.expect

I am making a task scheduler with Python's pexpect.
This was implemented with a simple idea:
term = spawnu('tcsh') # I need a tcsh instead of its default bash
term.sendline('FIRST_TASK')
term.expect('MY_SHELL_PROMPT') # When the parent receives the prompt, the previous task has ended.
term.sendline('SECOND_TASK')
...(and so on)
But I found pexpect.expect did not block this line:
term.expect('MY_SHELL_PROMPT') # Go through this line before finish of previous task.
Since it works when the matching pattern is set to the last output of the previous task, I suspect that pexpect.expect matched MY_SHELL_PROMPT before the child started its job. I have added some delay before matching. However, this happens even if I add a delay before pexpect.expect.
time.sleep(2) # delay for 2 second
term.expect('MY_SHELL_PROMPT')
Does anyone know how to debug this? Any help would be appreciated.
I think I found the answer myself.
pexpect does not distinguish the echoed command from the child's output.
So it is difficult to accomplish this with my previous attempt.
I worked around this by saving my check string in a text file.
That file can then be fed back by the child by calling 'cat' in the child shell.
For Example:
#check_code.txt
----YOUR JOB IS DONE----
#In testPexpect.py
term.sendline('cat check_code.txt') # this prevents matching its echoed command
term.expect('----YOUR JOB IS DONE----') # blocks and matches successfully
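Putting the workaround together, a rough end-to-end sketch (the shell, task names, and sentinel are placeholders carried over from the question):

import pexpect

term = pexpect.spawnu('tcsh')            # tcsh instead of the default shell
term.sendline('FIRST_TASK')              # start the long-running job
term.sendline('cat check_code.txt')      # executed only after FIRST_TASK finishes
term.expect('----YOUR JOB IS DONE----')  # sentinel differs from the echoed command, so this blocks correctly
term.sendline('SECOND_TASK')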

Performing an action upon unexpected exit python

I was wondering if there was a way to perform an action before the program closes. I am running a program over a long time and I want to be able to close it and have the data saved in a text file or something, but there is no way of me interfering with the while True loop I have running, and simply saving the data on each loop would be highly inefficient.
So is there a way that I can save data, say a list, when I hit the x or destroy the program? I have been looking at the atexit module but have had no luck, except when I set the program to finish at a certain point.
def saveFile(list):
    print "Saving List"
    with open("file.txt", "a") as test_file:
        test_file.write(str(list[-1]))

atexit.register(saveFile(list))
That is my whole atexit part of the code and like I said, it runs fine when I set it to close through the while loop.
Is this possible, to save something when the application is terminated?
Your atexit usage is wrong. It expects a function and its arguments, but you're just calling your function right away and passing the result to atexit.register(). Try:
atexit.register(saveFile, list)
Be aware that this uses the list reference as it exists at the time you call atexit.register(), so if you assign to list afterwards, those changes will not be picked up. Modifying the list itself without reassigning should be fine, though.
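A small self-contained illustration of the corrected registration (the names here are illustrative, not from the question):

import atexit

def save_file(data):
    print("Saving list")
    with open("file.txt", "a") as test_file:
        test_file.write(str(data[-1]) + "\n")

results = []
atexit.register(save_file, results)  # pass the function and its argument separately

results.append("some measurement")   # later mutations are still visible to the handler
# On normal interpreter exit (including sys.exit), save_file(results) runs.
# It will not run on SIGKILL or an unhandled fatal signal.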
You could use the handle_exit context manager from this ActiveState recipe:
http://code.activestate.com/recipes/577997-handle-exit-context-manager/
It handles SystemExit, KeyboardInterrupt, SIGINT, and SIGTERM, with a simple interface:
def cleanup():
    print 'do some cleanup here'

def main():
    print 'do something'

if __name__ == '__main__':
    with handle_exit(cleanup):
        main()
There's nothing you can do in reaction to a SIGKILL. It kills your process immediately, without any allowed cleanup.
Catch the SystemExit exception at the top of your application, then rethrow it.
There are a couple of approaches to this. As some have commented, you could use signal handling ... your [Ctrl]+[C] from the terminal where this is running in the foreground dispatches a SIGINT signal to your process (from the terminal driver).
Another approach would be to use a non-blocking os.read() on sys.stdin.fileno(), such that you're polling your keyboard once during every loop to see if an "exit" keystroke or sequence has been entered.
A similarly non-blocking polling approach can be implemented using the select module's functionality. I've seen that used with the termios and tty modules. (It seems inelegant that it needs all those to save, change, and restore the terminal settings, and I've also seen some examples using os and fcntl; I'm not sure when or why one would prefer one over the other if os.isatty(sys.stdin.fileno()).)
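A rough sketch of that select-based poll (line-buffered, so it only sees input after the user presses Enter; the termios/tty refinements mentioned above are omitted, and the 'q'-to-quit convention is just an example):

import select
import sys
import time

def user_wants_to_exit():
    """Return True if the user has typed a line starting with 'q'."""
    readable, _, _ = select.select([sys.stdin], [], [], 0)  # timeout 0 = pure poll
    if readable:
        return sys.stdin.readline().strip().lower().startswith("q")
    return False

while True:
    time.sleep(0.1)  # placeholder for one iteration of the real work
    if user_wants_to_exit():
        print("Saving and exiting.")
        break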
Yet another approach would be to use the curses module with window.nodelay() or window.timeout() to set your desired input behavior and then either window.getch() or window.getkey() to poll for any input.

Is multiprocessing or threading appropriate in this case in Python/Django?

I have a function like this in Django:
def uploaded_files(request):
    global source
    global password
    global destination
    username = request.user.username
    log_id = request.user.id
    b = File.objects.filter(users_id=log_id, flag='F') # Get the user id from session .delete() to use delete
    source = 'sachet.adhikari#69.43.202.97:/home/sachet/my_files'
    password = 'password'
    destination = '/home/zurelsoft/my_files/'
    a = Host.objects.all() #Lists hosts
    command = subprocess.Popen(['sshpass', '-p', password, 'rsync', '--recursive', source],
                               stdout=subprocess.PIPE)
    command = command.communicate()[0]
    lines = (x.strip() for x in command.split('\n'))
    remote = [x.split(None, 4)[-1] for x in lines if x]
    base_name = [os.path.basename(ok) for ok in remote]
    files_in_server = base_name[1:]
    total_files = len(files_in_server)
    info = subprocess.Popen(['sshpass', '-p', password, 'rsync', source, '--dry-run'],
                            stdout=subprocess.PIPE)
    information = info.communicate()[0]
    command = information.split()
    filesize = command[1]
    #st = int(os.path.getsize(filesize))
    #filesize = size(filesize, system=alternative)
    date = command[2]
    users_b = User.objects.all()
    return render_to_response('uploaded_files.html', {'files': b, 'username':username, 'host':a, 'files_server':files_in_server, 'file_size':filesize, 'date':date, 'total_files':total_files, 'list_users':users_b}, context_instance=RequestContext(request))
The main purpose of the function is to transfer the file from the server to the local machine and write the data into the database. What I want is this: there is a single file of 10GB which will take a long time to copy. Since the copying happens using rsync on the command line, I want to let the user play with other menus while the file is being transferred. How can I achieve that? For example, if the user presses OK, the file will be transferring on the command line, so I want to show the user a "The file is being transferred" message and stop rolling the cursor or something like that. Is multiprocessing or threading appropriate in this case? Thanks
Assuming that function runs inside a view, your browser will time out before the 10GB file has finished transferring. Maybe you should re-think your architecture for this?
There are probably several ways to do this, but here are some that come to my mind right now:
One solution is to have an intermediary storing the status of the file transfer. Before you begin the process that transfers the file, set a flag somewhere like a database saying the process has begun. Then if you make your subprocess call blocking, wait for it to complete, check the output of the command if possible and update the flag you set earlier.
Then have whatever front end you have poll the status of the file transfer.
Another solution: if you make the subprocess call non-blocking as in your example, use a thread which sits there reading stdout and updating an intermediary store, which your front end can query to get a more 'real time' update of the transfer progress.
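A rough sketch of that thread-based variant (the in-memory status dict is purely illustrative; a real deployment would use the database or a cache shared between worker processes, and the function names are placeholders):

import subprocess
import threading

transfer_status = {}  # illustrative; use a DB row or cache key in practice

def run_transfer(transfer_id, source, destination, password):
    transfer_status[transfer_id] = 'running'
    process = subprocess.Popen(
        ['sshpass', '-p', password, 'rsync', '--recursive', source, destination],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for line in process.stdout:  # read rsync's output as it arrives
        transfer_status[transfer_id] = line.decode('utf-8', 'replace').strip()
    process.wait()
    transfer_status[transfer_id] = 'done' if process.returncode == 0 else 'failed'

def start_transfer(transfer_id, source, destination, password):
    thread = threading.Thread(target=run_transfer,
                              args=(transfer_id, source, destination, password))
    thread.daemon = True  # don't block interpreter shutdown
    thread.start()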
What you need is Celery.
It lets you spawn the job as a parallel task and return an HTTP response straight away.
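For instance, a minimal Celery task might look like this (assuming a configured Celery app; the task name and arguments are illustrative):

from celery import shared_task
import subprocess

@shared_task
def transfer_file(source, destination, password):
    # Runs in a Celery worker, not in the web process, so the view returns immediately.
    return subprocess.call(
        ['sshpass', '-p', password, 'rsync', '--recursive', source, destination])

# In the view: kick off the job and respond at once.
# transfer_file.delay(source, destination, password)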
RaviU's solutions would certainly work.
Another option is to call a blocking subprocess in its own Thread. This thread could be responsible for setting a flag or information (in memcache, the db, or just a file on the hard drive) as well as clearing it when it's complete. Personally, there is no love lost between rsync's stdout and me, so I usually just ask the OS for the file size.
Also, if you don't need the file absolutely ASAP, adding "-c" to do a checksum can be good for those giant files. Source: personal experience trying to transfer giant video files over a spotty campus network.
I will say the one problem with all of the solutions so far is that they don't work for "N" files. Even if you make sure each file can only be transferred one at a time, if you have a lot of different files then eventually it'll bog down the system. You might be better off just using some sort of task queue unless you know it will only ever be one file at a time. I haven't used one recently, but a quick google search yielded Celery, which doesn't look too bad.
Every web server has a facility for uploading files, and what it does for large files is divide the file into chunks and merge them after every chunk is received. What you can do here is have a hidden tag in your HTML page with a value attribute; whenever your upload web service returns an OK message, change the hidden value to something relevant, and also write a function that keeps reading the value of that hidden HTML element to check whether your file upload has finished or not.

Overriding basic signals (SIGINT, SIGQUIT, SIGKILL??) in Python

I'm writing a program that adds normal UNIX accounts (i.e. modifying /etc/passwd, /etc/group, and /etc/shadow) according to our corp's policy. It also does some slightly fancy stuff like sending an email to the user.
I've got all the code working, but there are three pieces of code that are very critical, which update the three files above. The code is already fairly robust because it locks those files (ex. /etc/passwd.lock), writes to a temporary file (ex. /etc/passwd.tmp), and then overwrites the original file with the temporary one. I'm fairly pleased that it won't interfere with other running versions of my program or the system useradd, usermod, passwd, etc. programs.
The thing that I'm most worried about is a stray ctrl+c, ctrl+d, or kill command in the middle of these sections. This has led me to the signal module, which seems to do precisely what I want: ignore certain signals during the "critical" region.
I'm using an older version of Python, which doesn't have signal.SIG_IGN, so I have an awesome "pass" function:
def passer(*a):
    pass
The problem that I'm seeing is that signal handlers don't work the way that I expect.
Given the following test code:
def passer(a=None, b=None):
    pass

def signalhander(enable):
    signallist = (signal.SIGINT, signal.SIGQUIT, signal.SIGABRT, signal.SIGPIPE, signal.SIGALRM, signal.SIGTERM, signal.SIGKILL)
    if enable:
        for i in signallist:
            signal.signal(i, passer)
    else:
        for i in signallist:
            signal.signal(i, abort)
    return

def abort(a=None, b=None):
    sys.exit('\nAccount was not created.\n')
    return

signalhander(True)
print('Enabled')
time.sleep(10) # ^C during this sleep
The problem with this code is that a ^C (SIGINT) during the time.sleep(10) call causes that function to stop, and then my signal handler takes over as desired. However, that doesn't solve my "critical" region problem above, because I can't tolerate having whatever statement encounters the signal fail.
I need some sort of signal handler that will just completely ignore SIGINT and SIGQUIT.
The Fedora/RH command "yum" is written in Python and does basically exactly what I want. If you do a ^C while it's installing anything, it will print a message like "Press ^C within two seconds to force kill." Otherwise, the ^C is ignored. I don't really care about the two second warning since my program completes in a fraction of a second.
Could someone help me implement a signal handler for CPython 2.3 that doesn't cause the current statement/function to cancel before the signal is ignored?
As always, thanks in advance.
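For reference, on interpreters that do provide signal.SIG_IGN, the usual pattern is to ignore the signals around the critical region and then restore the previous handlers; a minimal sketch (not tailored to the asker's Python 2.3 setup):

import signal

def critical_region():
    # Ignore SIGINT/SIGQUIT entirely while the files are being rewritten;
    # an ignored signal is never delivered, so no statement is interrupted.
    old_handlers = {}
    for sig in (signal.SIGINT, signal.SIGQUIT):
        old_handlers[sig] = signal.signal(sig, signal.SIG_IGN)
    try:
        pass  # overwrite /etc/passwd with /etc/passwd.tmp, remove the lock, etc.
    finally:
        for sig, handler in old_handlers.items():
            signal.signal(sig, handler)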
Edit: After S.Lott's answer, I've decided to abandon the signal module.
I'm just going to go back to try: except: blocks. Looking at my code there are two things that happen for each critical region that cannot be aborted: overwriting file with file.tmp and removing the lock once finished (or other tools will be unable to modify the file, until it is manually removed). I've put each of those in their own function inside a try: block, and the except: simply calls the function again. That way the function will just re-call itself in the event of KeyBoardInterrupt or EOFError, until the critical code is completed.
I don't think that I can get into too much trouble since I'm only catching user-provided exit commands, and even then, only for two to three lines of code. Theoretically, if those exceptions could be raised fast enough, I suppose I could get the "maximum recursion depth exceeded" error, but that seems far-fetched.
Any other concerns?
Pseudo-code:
def criticalRemoveLock(file):
    try:
        if os.path.isfile(file):
            os.remove(file)
        else:
            return True
    except (KeyboardInterrupt, EOFError):
        return criticalRemoveLock(file)

def criticalOverwrite(tmp, file):
    try:
        if os.path.isfile(tmp):
            shutil.copy2(tmp, file)
            os.remove(tmp)
        else:
            return True
    except (KeyboardInterrupt, EOFError):
        return criticalOverwrite(tmp, file)
There is no real way to make your script really safe. Of course you can ignore signals and catch a keyboard interrupt using try: except:, but it is up to your application to be idempotent against such interrupts, and it must be able to resume operations after dealing with an interrupt at some kind of savepoint.
The only thing that you can really do is to work on temporary files (and not the original files) and move them into the final destination after doing the work. I think such file operations are supposed to be "atomic" from the filesystem perspective. Otherwise, in case of an interrupt, restart your processing from the start with clean data.
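A sketch of that write-temporary-then-move pattern (on POSIX, renaming within the same filesystem replaces the target atomically; os.replace requires Python 3.3+, and the helper name here is illustrative):

import os
import tempfile

def atomic_write(path, data):
    # Write to a temporary file in the same directory, then atomically
    # replace the target, so readers never see a half-written file.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        os.replace(tmp_path, path)  # atomic rename over the original
    except BaseException:
        os.remove(tmp_path)
        raise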
