closing files of a killed process - python

python: 3.4
OS: win7 / win10
I want to kill a running process with python and close all the files it opened:
import io
import psutil

for proc in psutil.process_iter():
    if proc.name() == 'myprocess.exe':
        opened = proc.open_files()
        proc.kill()
        for i in opened:
            print(i.path)
            io.FileIO(i.path).close()
            print(io.FileIO(i.path).closed)
Somehow io.IOBase(i.path).close() does not work.
Explanation:
It's like I would like to kill Microsoft Word with python, but it leaves some files open. And I would like to close those files as well.
Microsoft Word is just an example. It is a self-written Python program. The opened files are:
fonts (.ttf)
clr.pyd
and .dll-s
How should I close these files?

You don't need to close any files that were opened by the process. That is done automatically:
Terminating a process has the following results:
Any remaining threads in the process are marked for termination.
Any resources allocated by the process are freed.
All kernel objects are closed.
The process code is removed from memory.
The process exit code is set.
The process object is signaled.
The important bit is "All kernel objects are closed." For every open file handle, there is an associated kernel object--that's actually what a handle is, a mapping from a number to a kernel object. When the process exits, the kernel will walk behind and close all associated file handles, sockets, etc.
Additionally, your original approach has a few problems. First, the list of open files is only a snapshot of which ones were open at that time. In between asking for the list of open files and killing the process, the process could have opened many more, or closed and removed many as well. Second, the Python 3 docs say that the constructor for IOBase isn't public, so using it in this way is wrong:
class io.IOBase
The abstract base class for all I/O classes, acting on streams of bytes. There is no public constructor.
Generally, you'd use something like io.open(), which takes the path. This leads to the third issue: all you have to work with is the path. In order to close a file, you really need the handle, and those handles are process-specific. This means that in one process 0x5555AAAA may correspond to "file1.txt", but in another process it might correspond to "file2.txt", or maybe not to a file at all (it could be a socket or something else). So even if you had the kernel handle, there's no real way of saying "close this handle in the context of this other process"; that would violate some of the security goals of processes. It also means that what you're actually doing here is creating your own handle only to turn around and close it (or, in this case, it possibly does nothing at all, since the object wasn't created correctly).
So, if you're having a problem with files still being held, perhaps the problem is that the process hadn't actually exited yet before you tried to do whatever work you needed to get done. You may need to wait for the process to exit before attempting to move on if there are files the process was using that you want to use again. It looks like you can use psutil.wait_procs() to do that.
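For example, a rough sketch of killing and then waiting, reusing the process name from the question (untested; the 3-second timeout is an arbitrary choice):
import psutil

procs = [p for p in psutil.process_iter() if p.name() == 'myprocess.exe']
for p in procs:
    p.kill()

# Block until the killed processes have really exited before touching
# any files they had open; anything still running ends up in `alive`.
gone, alive = psutil.wait_procs(procs, timeout=3)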
Also, on Windows I find that anti-virus tools often get in the way. They hold open files accessed by applications making it look like a process is still holding onto them when it's actually the virus scanner doing its thing. I remember one instance of having to deal with this in Subversion. The code still exists today. So you might need to simply wait a bit and try again.
Update
Microsoft Word is just an example. It is a self-written Python program. The opened files are:
fonts (.ttf)
clr.pyd
and .dll-s
How should I close these files?
The answer is that you shouldn't need to. Just make sure the process has actually exited. It's not an instantaneous operation, so there is some time between killing it and it actually exiting during which it still retains the file handles.
Given that you've actually written the process being killed, I think a far better approach would be to introduce a way to launch that process, have it do its work, then exit gracefully. Then use subprocess.run() to run the script and wait for it to exit.
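For example (a sketch only; the script name is made up, and subprocess.run() needs Python 3.5+, so on 3.4 subprocess.call() does the same job here):
import subprocess

# Run the worker script and block until it has exited, so all of its
# file handles are released before we carry on.
result = subprocess.run(["python", "myprocess_script.py"])
print("worker exited with", result.returncode)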

It's like I would like to kill Microsoft Word with python, but it leaves some files open. And I would like to close those files as well.
There is some misunderstanding here. When you terminate Word with kill, all files are closed from a system point of view, but they will be closed dirty. When Word terminates normally, it flushes its internal buffers, removes any temporary files and marks the files as clean. When it crashes or is abruptly terminated, none of that cleanup occurs. Some modifications may not be written to disk, and temp files are still there, so on the next execution Word will warn you that the files have not been closed in an orderly way and have to be repaired.
So you do not want to kill Microsoft Word, but to close it, meaning posting a WM_QUIT message to its main window. Unfortunately, there is no clean and neat support in Python for that. There is an example of closing Excel via the win32com module here. The conversion for Word should be (beware, untested):
import win32com.client

wd = win32com.client.Dispatch("Word.Application")
wd.Quit()  # quit Word, as if the user hit the close button / clicked File -> Exit

Take a look at the with statement syntax. There's a brief overview here
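For instance (the file name is only illustrative):
# The with statement closes the file for you, even if an exception is raised.
with open("data.txt", "w") as f:
    f.write("hello\n")
# f is guaranteed to be closed here.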

Related

Python Save Sets To File On Windows Shutdown?

I do not want to lose my sets if Windows is about to shutdown/restart/log off/sleep. Is it possible to save them before shutdown? Or is there an alternative to save information without worrying it will get lost on Windows shutdown? JSON, CSV, DB? Anything?
import pickle

s = {1, 2, 3, 4}
with open("s.pick", "wb") as f:  # pickle it to a file when the PC is about to shut down
    pickle.dump(s, f)
I do not want to lose my sets if Windows is about to shutdown/restart/log off/sleep. Is it possible to save them before shutdown?
Yes, if you've built an app with a message loop, you can receive the WM_QUERYENDSESSION message. If you want to have a GUI, most GUI libraries will probably wrap this up in their own way. If you don't need a GUI, your simplest solution is probably to use PyWin32. Somewhere in the docs there's a tutorial on creating a hidden window and writing a simple message loop. Just do that on the main thread, and do your real work on a background thread, and signal your background thread when a WM_QUERYENDSESSION message comes in.
Or, much more simply, as Evgeny Prokurat suggests, just use SetConsoleCtrlHandler (again through PyWin32). This can also catch ^C, ^BREAK, and the user closing your console, as well as the logoff and shutdown messages that WM_QUERYENDSESSION catches. More importantly, it doesn't require a message loop, so if you don't have any other need for one, it's a lot simpler.
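A minimal sketch of the SetConsoleCtrlHandler route (untested; assumes PyWin32 is installed and reuses the file name from the question):
import pickle
import win32api
import win32con

s = {1, 2, 3, 4}

def save_sets():
    with open("s.pick", "wb") as f:
        pickle.dump(s, f)

def on_ctrl_event(ctrl_type):
    # Save when the console closes, the user logs off, or Windows shuts down.
    if ctrl_type in (win32con.CTRL_CLOSE_EVENT,
                     win32con.CTRL_LOGOFF_EVENT,
                     win32con.CTRL_SHUTDOWN_EVENT):
        save_sets()
        return True   # tell Windows the event was handled
    return False      # fall through to the default handler

win32api.SetConsoleCtrlHandler(on_ctrl_event, True)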
Or is there an alternative to save information without worrying it will get lost on Windows shutdown? JSON, CSV, DB? Anything?
The file format isn't going to magically solve anything. However, a database could have two advantages.
First, you can reduce the problem by writing as often as possible. But with most file formats, that means rewriting the whole file as often as possible, which will be very slow. The solution is to stream to a simpler "journal" file, pack it into the real file less often, and look for a leftover journal at every launch. You can do that manually, but a database will usually do it for you automatically.
Second, if you get killed in the middle of a write, you end up with half a file. You can solve that by the atomic writing trick—write a temporary file, then replace the old file with the temporary—but this is hard to get right on Windows (especially with Python 2.x) (see Getting atomic writes right), and again, a database will usually do it for you.
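For the atomic-write trick on Python 3, a minimal sketch (simplified; the linked article covers the edge cases):
import os
import tempfile

def atomic_write(path, data):
    # Write to a temporary file in the same directory, then swap it in.
    # os.replace() is an atomic rename on both POSIX and Windows (Python 3.3+).
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.replace(tmp, path)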
The "right" way to do this is to create a new window class with a msgproc that dispatches to your handler on WM_QUERYENDSESSION. Just as MFC makes this easier than raw Win32 API code, win32ui (which wraps MFC) makes this easier than win32api/win32gui (which wraps raw Win32 API). And you can find lots of samples for that (e.g., a quick search for "pywin32 msgproc example" turned up examples like this, and searches for "python win32ui" and similar terms worked just as well).
However, in this case, you don't have a window that you want to act like a normal window, so it may be easier to go right to the low level and write a quick&dirty message loop. Unfortunately, that's a lot harder to find sample code for—you basically have to search the native APIs for C sample code (like Creating a Message Loop at MSDN), then figure out how to translate that to Python with the pywin32 documentation. Less than ideal, especially if you don't know C, but not that hard. Here's an example to get you started:
def msgloop():
    while True:
        msg = win32gui.GetMessage(None, 0, 0)
        if msg and msg.message == win32con.WM_QUERYENDSESSION:
            handle_shutdown()
        win32api.TranslateMessage(msg)
        win32api.DispatchMessage(msg)
        if msg and msg.message == win32con.WM_QUIT:
            return msg.wparam

worker = threading.Thread(target=real_program)
worker.start()
exitcode = msgloop()
worker.join()
sys.exit(exitcode)
I haven't shown the "how to create a minimal hidden window" part, or how to signal the worker to stop with, e.g., a threading.Condition, because there are a lot more (and easier-to-find) good samples for those parts; this is the tricky part to find.
You can detect Windows shutdown/log off with win32api.SetConsoleCtrlHandler.
There is a good example: How To Catch "Kill" Events with Python.

What happens if I don't close a txt file

I'm about to write a program for a racecar that creates a txt file and continuously adds new lines to it. Unfortunately I can't close the file, because when the car shuts off, the Raspberry Pi (which the program is running on) also gets shut down. So I have no chance of closing the txt file.
Is this a problem?
Yes and no. Data is buffered at different places in the process of writing: the Python file object, the underlying C functions, the operating system, the disk controller. Even closing the file does not guarantee that all these buffers are physically written. Only the first two levels are forced to write their buffers to the next level. The same can be done by flushing the file handle without closing it.
As long as the power-off can occur anytime, you have to deal with the fact, that some data is lost or partially written.
Closing a file is important to free up the operating system's limited resources, but that is not a concern in your setup.
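To make "flushing without closing" concrete, a small sketch (the file name is made up):
import os

log = open("laps.txt", "a")

def write_line(line):
    log.write(line + "\n")
    log.flush()               # push Python's buffer down to the OS
    os.fsync(log.fileno())    # ask the OS to push its buffer to the disk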

Using the Output of Sysinternals Process Monitor in another program/script in real time

I'm working on a script that should check on certain system events (like opening of a file, or changing of a registry key) and start further actions depending on that. But I haven't found a clean way to get the information into my script.
I'm looking for a way to get the output of Sysinternals Process Monitor into another program. This should happen without user interaction in close to real time, so saving to a CSV/XML and then using that doesn't work.
I've checked on using the backing file, but this is in Process Monitor's PML format, which I haven't found documented anywhere.
Does anybody know a way how I can get the output of Process Monitor into my script?
Or an other (not too messy) way to get a real time list of opened files, registry keys etc into a python program?
Thanks!
If you want to parse stdout or a file, and you're OK with a 32-bit-only solution, try Dr Strace or ntstrace.
You could also look into ospy or another ProcMon alternative. ospy is open source, so at the very least you could look at the source code for capturing events.
Here is a list of alternatives to ProcMon.

COM: excelApplication.Application.Quit() preserves the process

I'm using COM integration to drive MS Excel from Python 2.7. I noticed a peculiar thing: when I run the following bit of code:
import win32com.client
excelApp = win32com.client.dynamic.Dispatch('Excel.Application')
an EXCEL.EXE process appears in the process list (which I view using the Windows Task Manager or subprocess.Popen('tasklist')), as expected. I then do all the stuff I need to do with no problem. However, when I close Excel:
excelApp.Application.Quit()
The process persists, even if I close the Python interpreter which started it (this kind of makes sense as Excel runs in a different process but just to be sure). The only way I've found to terminate this process is either by hand, using the Task Manager, or by calling:
subprocess.Popen("taskkill /F /im EXCEL.EXE",shell=True)
The forceful /F flag is necessary; otherwise the process doesn't terminate.
This isn't really a problem (I hope) but I wanted to ask whether this could cause issues when I first edit the documents "normally", then when calling Excel from python and then "normally" again? Potentially many (couple dozens) times in a row? What I'm worried about is creating conflicting versions of documents etc. Or should I just terminate the EXCEL.EXE process each time just to be safe?
Also, I noticed that subprocess.Popen("taskkill") doesn't raise any exceptions that I can catch and analyse (or am I doing something wrong here?). I'm particularly interested in distinguishing between a "non-existent process" kill attempt and a failed attempt to terminate the process.
Try closing any open workbooks, telling the app to quit, and deleting any references to the app. I usually wrap my COM objects in a class. This is what my quit method looks like:
def quit(self):
    self.xlBook.Close(SaveChanges=0)
    self.xlApp.Quit()
    del self.xlApp
Are you calling Dispatch from the main thread? If not, be sure to call pythoncom.CoInitialize() before Dispatch, and pythoncom.CoUninitialize() after Quit.
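If the COM work does happen on a non-main thread, the pattern would look roughly like this (untested sketch):
import threading
import pythoncom
import win32com.client

def worker():
    pythoncom.CoInitialize()                 # once per thread that uses COM
    try:
        xl = win32com.client.dynamic.Dispatch('Excel.Application')
        # ... drive Excel here ...
        xl.Quit()
        del xl
    finally:
        pythoncom.CoUninitialize()

threading.Thread(target=worker).start()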

File copy completion?

In Linux, how can we know if a file has completed copying before reading it? In Windows, an OSError is raised.
You can use the inotify mechanisms (via pyinotify) to catch events like CREATE, WRITE and CLOSE, and based on them you can decide whether the copy has finished or not.
However, since you provided no details on what you are trying to do, I can't tell if inotify would be suitable for you (by the way, inotify is Linux-specific, so you can't use it on Windows or other platforms).
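A rough pyinotify sketch that reacts when a writer closes a file (the watched directory is a placeholder):
import pyinotify

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # The writer closed the file, so the copy has most likely finished.
        print("finished:", event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch("/path/to/watched/dir", pyinotify.IN_CLOSE_WRITE)
notifier = pyinotify.Notifier(wm, Handler())
notifier.loop()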
In Linux, you can open a file while another process is writing to it without Python throwing an OSError, so in general, you cannot know for sure whether the other side has finished writing into that file. You can try some hacks, though:
You can check the file size regularly to see whether it increased since the last check. If it hasn't increased in, say, five seconds, you might be safe to assume that the copy has finished. I'm saying might since this is not true in all circumstances. If the other process that is writing the file is blocked for whatever reason, it might temporarily stop writing to the file and resume it later. So this is not 100% fool-proof, but might work for local file copies if the system is never under a heavy load that would stall the writing process.
You can check the output of fuser (this is a shell command), which will list the process IDs for all the files that are holding a file handle to a given file name. If this list includes any process other than yours, you can assume that the copying process hasn't finished yet. However, you will have to make sure that fuser is installed on the target system in order to make it work.
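A rough sketch of both heuristics (the path and the settle time are placeholders; neither check is fool-proof, as explained above):
import os
import subprocess
import time

def wait_until_stable(path, settle_seconds=5):
    # Heuristic 1: wait until the file size stops changing for `settle_seconds`.
    last = -1
    while True:
        size = os.path.getsize(path)
        if size == last:
            return
        last = size
        time.sleep(settle_seconds)

def in_use(path):
    # Heuristic 2: fuser lists the PIDs holding the file open
    # (Linux only, and fuser must be installed).
    result = subprocess.run(["fuser", path],
                            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return bool(result.stdout.strip())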
