Is it safe to call os.unlink(__file__) in Python? - python

I'm using Python 2.6 on linux.
I have a run.py script which starts up multiple services in the background and generates kill.py to kill those processes.
Inside kill.py, is it safe to unlink itself when it's done its job?
import os
# kill services
os.unlink(__file__)
# is it safe to do something here?
I'm new to Python. My concern was that since Python is a scripting language, the whole script might not be in memory. After it's unlinked, there will be no further code to interpret.
I tried this small test.
import os
import time
time.sleep(10) # sleep 1
os.unlink(__file__)
time.sleep(10) # sleep 2
I ran stat kill.py when this file was being run and the number of links was always 1, so I guess the Python interpreter doesn't hold a link to the file.
As a higher level question, what's the usual way of creating a list of processes to be killed later easily?

Don't have your scripts write new scripts if you can avoid it – just write out a list of the PIDs, and then loop through them.
It's not very clear what you're trying to do, but creating and deleting scripts sounds like too much fragile magic.
To answer the question:
Python compiles all of the source and closes the file before executing it, so this is safe.
In general, unlinking an opened file is safe on Linux. (But not everywhere: on Windows you can't delete a file that is in use.)
Note that when you import a module, Python 2 compiles it into a .pyc bytecode file and interprets that. If you remove the .py file, Python will still use the .pyc, and vice versa.

Just don't call reload!
There's no need for Python to hold locks on the files since they are compiled and loaded at import time. Indeed, the ability to swap files out while a program is running is often very useful.

IIRC(!): On *nix, an unlink only removes the name from the filesystem; the inode is removed when the last file handle is closed. Therefore this should not cause any problems, unless Python tries to reopen the file.

As a higher level question, what's the usual way of creating a list of processes to be killed later easily?
I would put the PIDs in a list and iterate over that with os.kill. I don't see why you're creating and executing a new script for this.
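A minimal sketch of that approach, assuming the launcher records one PID per line in a plain text file. The function names save_pids and kill_saved_pids and the choice of SIGTERM are illustrative, not from the question:

```python
import os
import signal


def save_pids(pids, path):
    """Write one PID per line to the file at `path`."""
    with open(path, "w") as fp:
        for pid in pids:
            fp.write("%d\n" % pid)


def kill_saved_pids(path, sig=signal.SIGTERM):
    """Send `sig` to every PID listed in `path`; return the PIDs signalled."""
    with open(path) as fp:
        pids = [int(line) for line in fp if line.strip()]
    signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
            signalled.append(pid)
        except OSError:
            # The process has already exited (or is not ours); skip it.
            pass
    return signalled
```

run.py would call save_pids() once the services are started, and a fixed kill script would call kill_saved_pids() and then remove the PID file, so no script has to be generated or deleted.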

Python reads in a whole source file and compiles it before executing it, so you don't have to worry about deleting or changing your running script file.
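One way to check this yourself is a small experiment along these lines. It is only a sketch for a POSIX system: it writes a throwaway script that unlinks itself, runs it in a child interpreter, and reports whether the code after the unlink still ran. The helper name run_self_deleting_script is made up for illustration:

```python
import os
import subprocess
import sys
import tempfile

# Source of a throwaway script that deletes itself and then keeps going.
SCRIPT = (
    "import os\n"
    "os.unlink(__file__)\n"
    "print('still running: %s' % (not os.path.exists(__file__)))\n"
)


def run_self_deleting_script():
    """Run SCRIPT from a temp file; return (its output, whether the file remains)."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "w") as fp:
        fp.write(SCRIPT)
    proc = subprocess.Popen([sys.executable, path], stdout=subprocess.PIPE)
    output, _ = proc.communicate()
    return output.decode().strip(), os.path.exists(path)
```

If the interpreter has already compiled the whole file, the line after os.unlink still executes and the temp file is gone afterwards.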

Related

how can I run python file from another file, then have the new file restart the first file?

So far I don't think this is actually possible, but basically what I am trying to do is have one python program call another and run it, like how you would use import.
But then I need to be able to go from the second file back to the beginning of the first.
Doing this with import doesn't work because the first program never closed and will be still running, so running it again will only return to where it left off when it ran the second file.
Without understanding a bit more about what you want to do, I would suggest looking into the threading or multiprocessing libraries. These should allow you to create multiple instances of a program or function.
This is vague and I'm not quite sure what you're trying to do, but you can also explore the Subprocess module for Python. It will allow you to spawn new processes similarly to if you were starting them from the command-line, and your processes will also be able to talk to the child processes via stdin and stdout.
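As a rough illustration of the subprocess route, the sketch below starts a separate Python interpreter, sends it text on stdin, and reads back its stdout. The helper name run_child and the use of sys.executable with -c are assumptions made for the example; in practice you would pass the path of your second script instead:

```python
import subprocess
import sys


def run_child(code, text):
    """Run `code` in a new Python process, feed `text` to its stdin,
    and return (exit status, captured stdout)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate(text.encode())
    return proc.returncode, out.decode()
```

Because the child is a fresh process, it always starts from the top of its script, which is the behaviour that import cannot give you.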
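If you don't want to import any modules:
# exec() needs the file's source code, not its name
exec(open("file.py").read())
Otherwise:
import os
# run the script through the interpreter; waits until it finishes
os.system('python file.py')
Or:
import subprocess
# same idea, with the command given as an argument list
subprocess.call(['python', 'file.py'])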

Simultaneous Python and C++ run with read and write files

So this one is a doozie, and a little too specific to find an answer online.
I am writing to a file in C++ and reading that file in Python at the same time to move a robot. Or trying to.
When I try running both programs at the same time, the C++ one runs first and then the Python one runs.
Here's the command I use:
./ColorFollow & python fileToHex.py
This happens even if I switch the order of commands.
Even if I run them in different terminals (which is the same thing, just covering all bases).
Both the Python and C++ code read / write in 'infinite' loops, so these two should run until I say stop.
The code works fine; when the Python script finally runs the robot moves as intended. It's just that the code doesn't run at the same time.
Is there a way to make this happen, or is this impossible?
If you need more information, lemme know, but the code is pretty much what you'd expect it to be.
If you are using Linux, & puts ColorFollow in the background and returns control to the shell, so in this case ColorFollow and fileToHex.py will run as separate processes at the same time.
At the same time, the composition ./ColorFollow | python fileToHex.py looks interesting, because it redirects the stdout of ColorFollow to the stdin of fileToHex.py. That can synchronize the two programs: ColorFollow prints some code string upon exit, and fileToHex.py reads it and exits as well.
I would create some empty file like /var/run/ColorFollow.flag and write 1 there when one of the processes exits. Not a pipe, because we do not care which process starts first. So, if the next loop step of ColorFollow sees 1 in the file, it deletes the file and exits (this means that fileToHex has already exited). The same goes for fileToHex: check the flag file on each loop step and, if it exists, delete it and exit.
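On the Python side, the flag-file idea might look roughly like the sketch below. It simplifies the scheme to "the flag file exists or it does not" rather than checking for a 1 inside it, and the function names are hypothetical; the C++ program would need the equivalent checks:

```python
import os


def announce_exit(flag_path):
    """Leave a flag behind so the other program knows this one has exited."""
    with open(flag_path, "w") as fp:
        fp.write("1")


def peer_has_exited(flag_path):
    """Return True, and remove the flag, if the other program left one."""
    try:
        os.remove(flag_path)
        return True
    except OSError:
        # No flag file yet: the other program is still running.
        return False
```

Each loop iteration would call peer_has_exited() and break out when it returns True; a program that stops for its own reasons would call announce_exit() on the way out.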

When does Python write a file to disk?

I have a library that interacts with a configuration file. When the library is imported, the initialization code reads the configuration file, possibly updates it, and then writes the updated contents back to the file (even if nothing was changed).
Very occasionally, I encounter a problem where the contents of the configuration file simply disappear. Specifically, this happens when I run many invocations of a short script (using the library), back-to-back, thousands of times. It never occurs during the same directories, which leads me to believe it's a somewhat random problem--specifically a race condition with IO.
This is a pain to debug, since I can never reliably reproduce the problem and it only happens on some systems. I have a suspicion about what might happen, but I wanted to see if my picture of file I/O in Python is correct.
So the question is, when does a Python program actually write file contents to a disk? I thought that the contents would make it to disk by the time that the file closed, but then I can't explain this error. When python closes a file, does it flush the contents to the disk itself, or simply queue it up to the filesystem? Is it possible that file contents can be written to disk after Python terminates? And can I avoid this issue by using fp.flush(); os.fsync(fp.fileno()) (where fp is the file handle)?
If it matters, I'm programming on a Unix system (Mac OS X, specifically). Edit: Also, keep in mind that the processes are not running concurrently.
Appendix: Here is the specific race condition that I suspect:
Process #1 is invoked.
Process #1 opens the configuration file in read mode and closes it when finished.
Process #1 opens the configuration file in write mode, erasing all of its contents. The erasing of the contents is synced to the disk.
Process #1 writes the new contents to the file handle and closes it.
Process #1: Upon closing the file, Python tells the OS to queue writing these contents to disk.
Process #1 closes and exits
Process #2 is invoked
Process #2 opens the configuration file in read mode, but new contents aren't synced yet. Process #2 sees an empty file.
The OS finally finishes writing the contents to disk, after process 2 reads the file
Process #2, thinking the file is empty, sets defaults for the configuration file.
Process #2 writes its version of the configuration file to disk, overwriting the last version.
It is almost certainly not python's fault. If python closes the file, OR exits cleanly (rather than killed by a signal), then the OS will have the new contents for the file. Any subsequent open should return the new contents. There must be something more complicated going on. Here are some thoughts.
What you describe sounds more likely to be a filesystem bug than a Python bug, and a filesystem bug is pretty unlikely.
Filesystem bugs are far more likely if your files actually reside in a remote filesystem. Do they?
Do all the processes use the same file? Do "ls -li" on the file to see its inode number, and see if it ever changes. In your scenario, it should not. Is it possible that something is moving files, or moving directories, or deleting directories and recreating them? Are there symlinks involved?
Are you sure that there is no overlap in the running of your programs? Are any of them run from a shell with "&" at the end (i.e. in the background)? That could easily mean that a second one is started before the first one is finished.
Are there any other programs writing to the same file?
This isn't your question, but if you need atomic changes (so that any program running in parallel only sees either the old version or the new one, never the empty file), the way to achieve it is to write the new content to another file (e.g. "foo.tmp"), then do os.rename("foo.tmp", "foo"). Rename is atomic.
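A minimal sketch of that write-then-rename pattern, combined with the fp.flush(); os.fsync(fp.fileno()) calls mentioned in the question. The helper name atomic_write and the ".tmp" suffix are illustrative. It assumes a POSIX system, where os.rename replaces an existing target atomically as long as both paths are on the same filesystem:

```python
import os


def atomic_write(path, data):
    """Replace the contents of `path` with `data` so that readers see
    either the old file or the new one, never an empty file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fp:
        fp.write(data)
        fp.flush()             # push Python's buffer to the OS
        os.fsync(fp.fileno())  # ask the OS to push it to the disk
    os.rename(tmp_path, path)  # atomic replacement on POSIX
```

With this in place the configuration file is never opened in write mode directly, so the "empty file" window from the suspected race no longer exists.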

How can I delay execution until after os.system finishes?

I am using os.system to copy a file from a system to another. The logic of a very simple program is to execute another set of commands after this file gets copied.
The problem is that os.system does not actually wait for the file to be copied, and gets to executing the next line. This causes issues to the system. I could actually give some wait functions, through time.sleep(), but we have to copy files with sizes ranging from 500 MB to sometimes 20 GB, and the times taken are very different.
What's the solution? I need to somehow tell my program that the files are copied, and then to execute the next line.
The first thing I'd try is to use shutil.copyfile() instead of an external program to copy the file. If you have to use an external program, you should call it via subprocess.Popen(), not via os.system(). You can then call Popen.wait() to block until the subprocess has finished.
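Both options can be sketched as follows. The function names are made up for the example, and the external command is assumed to be the POSIX cp program; substitute whatever copy tool you actually use:

```python
import shutil
import subprocess


def copy_in_process(src, dst):
    """Copy with shutil; the call returns only after the copy is complete."""
    shutil.copyfile(src, dst)


def copy_with_external_program(src, dst):
    """Copy with an external command and block until it exits.
    Returns the command's exit status (0 means success for cp)."""
    proc = subprocess.Popen(["cp", src, dst])
    return proc.wait()
```

In either case the next line of your program runs only after the copy has finished, so no time.sleep() guess is needed regardless of file size.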
I think you should rather use shutil.copyfile than os.system to copy a file.
(Edit: woops, copy, not move)
use the shutil module for copying files.
The shutil module offers a number of high-level operations on files and collections of files. In particular, functions are provided which support file copying and removal.
also, use the subprocess module instead of os.system()
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several other, older modules and functions, such as: os.system
for a better answer, you need to provide more detail about what exactly you are trying to do and how (programmatically) you are stuck.

How do I watch a folder for changes and when changes are done using Python?

i need to watch a folder for incoming files. i did that with the following help:
How do I watch a file for changes?
the problem is that the files that are being moved are pretty big (10gb)
and i want to be notified when all files are done moving.
i tried comparing the size of the folder every 20 seconds, but the file shows its correct size even though windows shows that it is still moving.
i am using windows with python
i found a solution using open and waiting for an io exception.
if the file is still being moved i get errno 13.
You should take a look at this link:
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
There you can see a comparison of the method you are speaking about (simple polling) with two other windows-specific techniques which, in my opinion, offer a much better solution to your problem!
Otherwise, if you are using linux, there's inotify and the related Python wrapper:
Pyinotify is a pure Python module used for monitoring filesystems events on Linux platforms through inotify
Here: http://trac.dbzteam.org/pyinotify
If you have control over the process of importing the files, I would create a lock file when starting to copy files in, and remove it when you are done. By lock file I mean an empty tmp file, which is just there to indicate that you are copying a file. Then your py script can check for the existence of the lock files.
You may be able to use os.stat() to monitor the mtime of the file. However be aware that under various network conditions, the copy may stall momentarily and so the mtime is not updated for a few seconds, so you need to make allowance for this.
Another option is to try opening the file with exclusive read/write which should fail under windows if the file is still opened by the other process
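A rough sketch of that check, matching what the asker reports (an open that fails with errno 13 while the move is in progress). The helper name is hypothetical, and the behaviour is an assumption about Windows file sharing: on Linux an open like this normally succeeds even while another process is still writing, so it is not a reliable test there:

```python
def can_open_for_update(path):
    """Return True if `path` can be opened for reading and writing.

    On Windows this is expected to fail while another process still
    holds the file open without sharing it.
    """
    try:
        with open(path, "r+b"):
            return True
    except (IOError, OSError):
        # Typically errno 13 (permission denied) while the file is in use.
        return False
```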
The most reliable method would be to write your own program to move the files.
try checking for the last-modified time change instead of the filesize during your poll.
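A polling loop along those lines could look like the sketch below. It treats a file as finished once its size and mtime have both stayed unchanged for a quiet period; the function name and the default timings are arbitrary choices for illustration, and the quiet period should be long enough to ride out the momentary stalls mentioned above:

```python
import os
import time


def wait_until_stable(path, quiet_seconds=5.0, poll_interval=1.0):
    """Block until the size and mtime of `path` have not changed for
    `quiet_seconds`; return the final (size, mtime) pair."""
    last_seen = None
    stable_since = time.time()
    while True:
        st = os.stat(path)
        current = (st.st_size, st.st_mtime)
        now = time.time()
        if current != last_seen:
            # Something changed: restart the quiet-period clock.
            last_seen = current
            stable_since = now
        elif now - stable_since >= quiet_seconds:
            return current
        time.sleep(poll_interval)
```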
