How to get open files of a subprocess?
I opened a subprocess which generates files, and I want to get the file descriptors of these files so I can call fsync on them.
So if I have code like this:
import subprocess

p = subprocess.Popen([
    'some_program'
])
the process p generates some files. I can get the process id of the subprocess with:
p.pid
But how can I get the file descriptors of these files to call flush and fsync() on them?
I did find a utility called "lsof" (list open files), but it is not installed or supported on my system, so I did not investigate it further, as I really need a standard way.
Thanks
Each process has its own table of file descriptors. Even if you know that a child process has a certain file open with FD 8 (which is easy enough: just list /proc/<pid>/fd), calling fsync(8) in your process syncs a file of your process, not the child's.
The same applies to all functions that use file descriptors: fread, fwrite, dup, close...
To get the effect of fsync without the child's cooperation, you could call sync instead, which flushes all filesystem buffers system-wide.
What you could do instead is implement some kind of RPC mechanism. For example, you could add a signal handler to the child that makes it run fsync on all of its open FDs when it receives SIGUSR1.
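If you control the child's source, a minimal sketch of that signal-based approach might look like this (the open_files registry is illustrative, not a standard mechanism):

# Child-side sketch: fsync everything on SIGUSR1 (POSIX only).
import os
import signal

open_files = []  # the child appends every file object it opens here

def sync_all(signum, frame):
    for f in open_files:
        f.flush()              # flush Python's userspace buffer
        os.fsync(f.fileno())   # then push the kernel buffers to disk

signal.signal(signal.SIGUSR1, sync_all)

The parent would then trigger it with os.kill(p.pid, signal.SIGUSR1).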
If you want a packaged solution instead of walking /proc/<pid>/fd yourself, an option is to use lsof or psutil.
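For example, a hedged sketch with psutil (assuming p is the Popen object from the question; on Linux, open_files() also reports the FD number inside the child):

import psutil

child = psutil.Process(p.pid)
for f in child.open_files():
    print("%s (fd %d in the child)" % (f.path, f.fd))

Note that this only lists the child's open files; as explained above, you still cannot fsync them from the parent.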
You can't fsync on behalf of another process. Also, you probably want flushing, not fsync. You can't flush on behalf of another process either. Rethink your requirements.
I've read that there is a way to open a file in Python with:
os.startfile('file.exe')
Is there a way to close the same file once it is open?
Thank you in advance!
From the os.startfile() doc:
startfile() returns as soon as the associated application is launched. There is no option to wait for the application to close, and no way to retrieve the application’s exit status.
So, basically, no, there isn't a way to close a file opened with startfile.
It isn't clear from the question whether you want to launch a file or to open it (for reading/writing).
If you want to launch a process, subprocess is a better candidate for running other programs and controlling them (including killing them).
If you want to open a file for read/write, then open() would be a good choice to start with.
As the Python documentation says, this function does the following:
Start a file with its associated application.
So the best idea would be to use os.kill to kill the application in which the file is opened. The problem lies in identifying which application is associated with a file of a given extension, and in finding the PID of the exact instance that opened the file.
You used a .exe file in your example, which is an executable file extension, so you have probably misunderstood what this function does. What are you trying to accomplish? Are you sure this is the correct way of doing it?
If you really want to launch an executable file, you should probably use os.system(). If you want to create a new file, write something to it, and close it, look into Python file operations; there are good examples here: http://www.tutorialspoint.com/python/python_files_io.htm
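For what it's worth, a hedged sketch of the os.kill idea above, using psutil to locate whichever process has the file open (psutil and the file name are assumptions, not part of os.startfile):

import os
import psutil

target = os.path.abspath('file.txt')  # illustrative path
for proc in psutil.process_iter():
    try:
        if any(f.path == target for f in proc.open_files()):
            proc.terminate()  # portable alternative to os.kill
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        continue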
I have a web server in Python (2.7) that uses Popen to delegate some work to a child process:
import tempfile
from subprocess import Popen, PIPE

url_arg = "http://localhost/index.html?someparam=somevalue"
call = ('phantomjs', 'some/phantom/script.js', url_arg)
imageB64data = tempfile.TemporaryFile()
errordata = tempfile.TemporaryFile()
p = Popen(call, stdout=imageB64data, stderr=errordata, stdin=PIPE)
p.communicate(input="")
I am seeing intermittent issues where, after some number of these Popens have occurred (roughly 64), the process runs out of file descriptors and is unable to function: it becomes completely unresponsive, and all threads seem to block forever if they attempt to open any files or sockets.
(Possibly relevant: the phantomjs child process loads a URL that calls back into the server that spawned it.)
Based on this Python bug report, I believe I need to set close_fds=True on all Popen calls from inside my server process in order to mitigate the leaking of file descriptors. However, I am unfamiliar with the machinery around exec-ing subprocesses and inheritance of file descriptors so much of the Popen documentation and the notes in the aforementioned bug report are unclear to me.
It sounds like close_fds=True would actually close all open file descriptors (which includes active request sockets, log file handles, etc.) in my process before executing the subprocess. This sounds strictly better than leaking the sockets, but it would still result in errors.
However, in practice, when I use close_fds=True during a web request, it seems to work fine, and thus far I have been unable to construct a scenario where it actually closes any other request sockets, database connections, etc.
The docs state:
If close_fds is true, all file descriptors except 0, 1 and 2 will be closed before the child process is executed.
So my question is: is it "safe" and "correct" to pass close_fds=True to Popen in a multithreaded Python web server? Or should I expect this to have side effects if other requests are doing file/socket IO at the same time?
I tried the following test with the subprocess32 backport of Python 3.2/3.3's subprocess:
import tempfile
import subprocess32 as subprocess

fp = open('test.txt', 'w')
fp.write("some stuff")                # buffered in the parent; not yet on disk
echoed = tempfile.TemporaryFile()
p = subprocess.Popen(("echo", "this", "stuff"), stdout=echoed, close_fds=True)
p.wait()
echoed.seek(0)
fp.write("whatevs")                   # fp is still usable after the Popen
fp.write(echoed.read())
fp.close()
and I got the expected result of some stuffwhatevsthis stuff in test.txt.
So it appears that the close in close_fds does not mean that open files (sockets, etc.) in the parent process become unusable after executing a child process.
Also worth noting: subprocess32 defaults to close_fds=True on POSIX systems, AFAICT. This implies to me that it is not as dangerous as it sounds.
I suspect that close_fds solves the problem of file descriptors leaking to subprocesses. Imagine opening a file and then running some task using subprocess. Without close_fds, the file descriptor is copied to the subprocess, so even if the parent process closes the file, the file remains open because of the subprocess.
Now, let's say we want to delete the directory containing the file from another thread, using shutil.rmtree. On a regular filesystem, this is not an issue: the directory is removed as expected. However, when the file resides on NFS, the following happens: first, Python tries to delete the file; since the file is still in use, it gets renamed to .nfsXXX instead, where XXX is a long hexadecimal number. Next, Python tries to delete the directory, but that has become impossible, because the .nfsXXX file still resides in it.
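A minimal sketch of that leak, assuming a POSIX system and Python 2's default of close_fds=False (the paths and the sleep child are illustrative):

import os
import subprocess
import tempfile

d = tempfile.mkdtemp()
f = open(os.path.join(d, 'data.txt'), 'w')
# Without close_fds=True, the child inherits f's descriptor.
p = subprocess.Popen(['sleep', '60'], close_fds=False)
f.close()  # the parent's copy is closed, but the child's copy stays open,
           # so on NFS deleting the file now leaves a .nfsXXX file behind
           # that makes shutil.rmtree(d) fail until the child exits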
I'm trying to build an application that displays in a GUI the contents of a log file written by a separate program that I call through subprocess. The program runs on Windows and is a binary that I have no control over. Also, this application (Actel Designer, if anyone cares) writes its output to a log file regardless of how I redirect the output of subprocess, so using a pipe for the output doesn't seem to be an option. The bottom line is that I seem to be forced into reading from a log file at the same time another process may be writing to it. My question is: is there a way to keep the GUI's display of the log file's contents up to date in a robust way?
I've tried the following:
Naively opening the file for reading periodically while the child process is running causes Python to crash (I'm guessing because the child process is writing to the file while I'm attempting to read its contents).
Next I tried to open a file handle to the log filename before invoking the child process, with GENERIC_READ and SHARED_READ | SHARED_WRITE | SHARED_DELETE, and reading back from that handle. With this approach, the file appears empty.
Thanks for any help you can provide - I'm not a professional programmer and I've been pulling my hair out over this for a week.
You should register for file-change notifications, the way tail -f does (you can find out which system calls it uses by running strace tail -f logfile).
pyinotify provides a Python interface for these file change notifications.
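A minimal pyinotify sketch (note: pyinotify is Linux-only, so on the Windows setup from the question you would need a different mechanism, such as polling or the Win32 ReadDirectoryChangesW API; the log path is illustrative):

import pyinotify

class LogChanged(pyinotify.ProcessEvent):
    def process_IN_MODIFY(self, event):
        print('log modified: ' + event.pathname)  # re-read the new tail here

wm = pyinotify.WatchManager()
wm.add_watch('designer.log', pyinotify.IN_MODIFY)
pyinotify.Notifier(wm, LogChanged()).loop()  # blocks, dispatching events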
We have several cron jobs that ftp proxy logs to a centralized server. These files can be rather large and take some time to transfer. Part of the requirement of this project is to provide a logging mechanism in which we log the success or failure of these transfers. This is simple enough.
My question is: is there a way to check whether a file is currently being written to? My first solution was to check the file size twice within a given timeframe and compare the results. But a co-worker said there might be a way to hook into the EXT3 file system via Python and check the file's attributes to see if it is currently being appended to. My Google-Fu came up empty.
Is there a module for EXT3 or something else that would allow me to check the state of a file? The server is running Fedora Core 9 with EXT3 file system.
There is no need for ext3-specific hooks; just check lsof, or more precisely /proc/<pid>/fd/* and /proc/<pid>/fdinfo/* (that's where lsof gets its info, AFAICT). There you can check whether the file is open, whether it's writable, and the 'cursor' position.
That's not the whole picture, though; anything beyond that happens in process space, in the writing process's stdlib: most writes are buffered and the kernel only sees bigger chunks of data, so an 'ext3-aware' monitor wouldn't catch those either.
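A hedged sketch of that /proc approach (Linux-specific; the helper name writers_of is made up for illustration):

import os

def writers_of(path):
    # Return the PIDs that currently have 'path' open for writing.
    path = os.path.realpath(path)
    pids = []
    for pid in filter(str.isdigit, os.listdir('/proc')):
        fd_dir = '/proc/%s/fd' % pid
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process vanished, or permission denied
            continue
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) != path:
                    continue
                with open('/proc/%s/fdinfo/%s' % (pid, fd)) as info:
                    flags = int(info.read().split('flags:')[1].split()[0], 8)
            except (OSError, IOError, IndexError, ValueError):
                continue
            if (flags & 0o3) in (0o1, 0o2):  # O_WRONLY or O_RDWR
                pids.append(int(pid))
    return pids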
There are no ext3 hooks to check what you want directly.
I suppose you could dig through the source code of the fuser Linux command, replicate the part that finds which processes have a file open, and watch that resource. When no one has the file open any longer, the transfer is done.
Another approach: have your cron jobs announce that they are finished.
Our cron jobs that transport files write an empty filename.finished after filename has been transferred. Another approach is to transfer to a temporary name, e.g. filename.part, and then rename it to filename; renaming is atomic. In either case you check repeatedly until filename or filename.finished appears.
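A sketch of the rename variant (shutil.copyfile stands in for the real FTP transfer, and the paths are illustrative):

import os
import shutil
import time

def publish(src, dst):
    tmp = dst + '.part'
    shutil.copyfile(src, tmp)  # the slow transfer targets the temporary name
    os.rename(tmp, dst)        # atomic on the same filesystem, so readers
                               # only ever see a complete 'dst'

# Receiver side: poll until the final name shows up.
while not os.path.exists('/logs/proxy.log'):
    time.sleep(5)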