Linux IPC: Locking, but not file based locking - python

I need a way to ensure only one python process is processing a directory.
The lock/semaphore should be local to the machine (linux operating system).
Networking or NFS is not involved.
I would like to avoid file based locks, since I don't know where I should put the lock file.
There are libraries on PyPI which provide POSIX IPC.
Is there no way to use Linux semaphores with Python without a third-party library?
The lock provided by multiprocessing.Lock does not help, since the two Python interpreters don't share the same parent.
Threading is not involved. All processes have only one thread.
I am using Python 2.7 on Linux.
How do I synchronize two Python scripts on Linux (without file-based locking)?
Required feature: If one process dies, then the lock/semaphore should get released by the operating system.

flock the directory itself, and then you never need to worry about where to put the lock file:
import errno
import fcntl
import os
import sys

# This will work on Linux
dirfd = os.open(THE_DIRECTORY, os.O_RDONLY)  # FIXME: FD_CLOEXEC
try:
    fcntl.flock(dirfd, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError as ex:
    if ex.errno != errno.EAGAIN:
        raise
    print "Somebody else is working here; quitting."  # FIXME: logging
    sys.exit(1)

do_the_work()

os.close(dirfd)

I would like to avoid file based locks, since I don't know where I should put the lock file.
You can lock the existing file or directory (the one being processed).
Required feature: If one process dies, then the lock/semaphore should get released by the operating system.
That is exactly how file locks work.
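For illustration, here is a minimal sketch (my example, not part of the original answer) of how a second script could probe the same directory lock; the kernel drops the flock automatically when the holding process exits or crashes. THE_DIRECTORY is the same placeholder used above.

import errno
import fcntl
import os

dirfd = os.open(THE_DIRECTORY, os.O_RDONLY)
try:
    fcntl.flock(dirfd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print "Lock acquired; the other worker must have exited."
except IOError as ex:
    if ex.errno != errno.EAGAIN:
        raise
    print "Still locked by the other worker."
finally:
    os.close(dirfd)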

Related

How to lock file from write operations of other applications in operating system?

I have a file. My application must open it and protect it from writing done by other applications in my operating system.
Is it possible to do such thing? I tried fcntl.flock() but it seems that this lock is working only for my application. Other programs like text editors can write to file without any problems.
Test example
#!/usr/bin/env python3
import time
import fcntl
with open('./somefile.txt', 'r') as f:
    # acquire lock
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    # some long-running operation;
    # the file should be protected from writes
    # by other programs on this computer
    time.sleep(20)
    # release lock
    fcntl.flock(f, fcntl.LOCK_UN)
How can I lock a file for every application in the system except mine?
flock() provides advisory locking; it is ignored unless other programs explicitly check for the lock.
There are other alternatives that do mandatory locking, like this one.
Found in this thread, and there are some more alternatives in there as well.
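To make the point concrete, here is a minimal sketch (my own, not from the thread) of a cooperating writer: advisory locks only help if every program that touches the file also calls flock() before writing.

import fcntl

def write_if_unlocked(path, data):
    with open(path, 'a') as f:
        try:
            # fail immediately if another process holds the lock
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except (IOError, OSError):
            return False  # someone else holds the lock; back off
        try:
            f.write(data)
            return True
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

A text editor that never calls flock() simply ignores this lock, which is why true mandatory locking needs support from the filesystem or kernel instead.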

Python Flush Input before raw_input()

As a PHP programmer (of sorts) very new to OS and command-line processes, I'm surprised that within Python, everything a user inputs while interacting with a program seems to be buffered, waiting to pour out at the first use of raw_input (for example).
I found some code to call prior to raw_input which seems to "solve" the problem on OS X, although supposedly it is providing access to Windows capabilities:
class FlushInput(object):
    def flush_input(self):
        try:
            import msvcrt
            while msvcrt.kbhit():
                msvcrt.getch()
        except ImportError:
            import sys, termios
            termios.tcflush(sys.stdin, termios.TCIOFLUSH)
Am I understanding correctly that the handling of stdin, stdout, and stderr will vary between OSs?
I imagine that maybe a framework like Django has built-in methods that simplify the interactivity, but does it basically take a few lines of code just to tell python "don't accept any input until it's invited?"
Ahhh. If I'm understanding this correctly (and I'm sure the understanding needs much refining), the answer is that yes: stdin, stdout, and stderr, the "standard" input, output, and error streams, and their handling may vary from (operating) system to system, because they are products of the OS and NOT of any particular programming language.
The expectation that "telling Python to ignore stdin until input is requested" would be automatic stems from thinking of the "terminal" as if it were a typewriter. Where the goal of a typewriter is to record strings of information in a human-readable format, the goal of a terminal is to transmit information which will ultimately be converted to a machine-readable format, and to return human-readable responses.
What most of us coming to computing today think of as a "terminal" is actually a virtual recreation of a physical machine known as a terminal, which used to be the method by which data was input to and read from a computer processor, right? And a text editor is an application that creates a virtual typewriter out of the keyboard, monitor, and processing capabilities of the operating system and its included libraries of programs.
An application like the macOS terminal, or even the tty we use to engage with another server via an ssh connection, is actually creating a virtual terminal through which we can engage with the processor, by sending information to stdin and receiving from stdout and stderr. When the letters we type appear in the terminal screen, it is because they are being "echoed" back into the terminal window.
So there's no reason to expect that the relationship between python or any other language and a terminal, would by default block the input stream coming from a terminal.
The above code uses Python's exception handling to provide two alternative ways of flushing the input stream prior to some activity on behalf of the program. On the OS X platform, the code:
import sys, termios
termios.tcflush(sys.stdin, termios.TCIOFLUSH)
imports sys so we have access to stdin, and termios, Python's module for the POSIX (Linux, UNIX) terminal interface, which (as I understand it) manages TWO virtual terminals: one between itself and the user, and another between itself and the operating system. tcflush is a function that accepts two parameters: the first is WHICH stream to flush (the file descriptor, fd), and the second is the queue to flush. I'm not sure what the difference between the file descriptor and the queue is in this case, except that maybe the fd contains data that hasn't yet been added to the queue and the queue contains data that is no longer contained in the fd.
msvcrt is the Python module for interacting with (managing) whatever the Windows version of a terminal is, and I guess msvcrt.kbhit() and msvcrt.getch() are the functions used to drain its input queue (kbhit() checks whether a keypress is waiting and getch() reads one character).
The UNIX and Windows calls of the function could be swapped, so that rather than saying try: doing it the Windows way and, if an ImportError is raised, do it the UNIX way, we try: the UNIX way first:
class FlushInput(object):
    def flush_input(self):
        try:
            import sys, termios
            termios.tcflush(sys.stdin, termios.TCIOFLUSH)
        except ImportError:
            import msvcrt
            while msvcrt.kbhit():
                msvcrt.getch()
Here's a termios introduction that helped clarify the process.
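For completeness, a hypothetical usage sketch (class and method names as above): drain whatever the user typed while the program was busy, then prompt.

import time

flusher = FlushInput()
time.sleep(5)          # pretend to be busy; stray keystrokes accumulate here
flusher.flush_input()  # discard the buffered keystrokes
answer = raw_input("Ready now, type your answer: ")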

python ensure file consistency

I have a Python 2.7.x process running in an infinite loop that monitors a folder in Ubuntu server.
Whenever it finds a file, it checks the file against a set of known files that have been processed already, and acts accordingly. In pseudocode:
found = set()
while True:
    for file in all_files("<DIR>"):
        if file not in found:
            process_file(file, found)
How can I make sure that the file hasn't just begun being copied there? I wouldn't want to, say, take the MD5 sum of the file or open it with another process until I'm sure it's all there and ready.
The safest solution is to use the Linux kernel's inotify API via the pyinotify library. Experiment with the IN_CREATE and IN_MOVED_TO events depending on your needs. Also note this blog post warning of some implementation problems with the pyinotify library.
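As a rough sketch (my own, under the assumption that files are either written in place or moved into the directory once complete), the pyinotify approach could look like the following; process_file and found are the names from the pseudocode above, and I watch IN_CLOSE_WRITE rather than IN_CREATE because it fires only after the writer has closed the file.

import pyinotify

found = set()

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # the writer closed the file, so it should be complete
        if event.pathname not in found:
            process_file(event.pathname, found)

    def process_IN_MOVED_TO(self, event):
        # the file was renamed into the directory after a finished copy
        if event.pathname not in found:
            process_file(event.pathname, found)

wm = pyinotify.WatchManager()
wm.add_watch("<DIR>", pyinotify.IN_CLOSE_WRITE | pyinotify.IN_MOVED_TO)
pyinotify.Notifier(wm, Handler()).loop()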
Due to locks and other system-level operations, you will not be able to do anything to the file until it has completed copying.
A file cannot be in two operations at once.

How to start daemon process from python on windows?

Can my python script spawn a process that will run indefinitely?
I'm not too familiar with Python, nor with spawning daemons, so I came up with this:
si = subprocess.STARTUPINFO()
si.dwFlags = subprocess.CREATE_NEW_PROCESS_GROUP | subprocess.CREATE_NEW_CONSOLE
subprocess.Popen(executable, close_fds = True, startupinfo = si)
The process continues to run past python.exe, but is closed as soon as I close the cmd window.
Using the answer Janne Karila pointed out, this is how you can run a process that doesn't die when its parent dies; there is no need to use the win32process module.
DETACHED_PROCESS = 8
subprocess.Popen(executable, creationflags=DETACHED_PROCESS, close_fds=True)
DETACHED_PROCESS is a Process Creation Flag that is passed to the underlying CreateProcess function.
This question was asked 3 years ago, and though the fundamental details of the answer haven't changed, given its prevalence in "Windows Python daemon" searches, I thought it might be helpful to add some discussion for the benefit of future Google arrivals.
There are really two parts to the question:
Can a Python script spawn an independent process that will run indefinitely?
Can a Python script act like a Unix daemon on a Windows system?
The answer to the first is an unambiguous yes; as already pointed out; using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword will suffice:
import subprocess

independent_process = subprocess.Popen(
    'python /path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
Note that, at least in my experience, CREATE_NEW_CONSOLE is not necessary here.
That being said, the behavior of this strategy isn't quite the same as what you'd expect from a Unix daemon. What constitutes a well-behaved Unix daemon is better explained elsewhere, but to summarize:
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
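As an illustration only (not part of the original answer), here is a minimal sketch of the classic double-fork recipe that covers most of the checklist above on a Unix system; a production daemon would also install a SIGTERM handler and close every inherited descriptor.

import os
import sys

def daemonize(workdir='/'):
    if os.fork() > 0:      # first fork: let the shell get its prompt back
        sys.exit(0)
    os.setsid()            # new session: dissociate from the controlling terminal
    if os.fork() > 0:      # second fork: can never reacquire a terminal
        sys.exit(0)
    os.chdir(workdir)      # avoid "directory busy" style errors
    os.umask(0o027)        # reset the file access creation mask
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):   # redirect stdin, stdout and stderr to devnull
        os.dup2(devnull, fd)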
The reality of the situation is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Rolling your application into a Windows service, though a viable alternative in many cases, also doesn't quite fit. The same is true of using pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries). In particular, they fail to improve the situation for signal handling, and they cannot easily launch an application from a terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing". Additionally, Windows services require installation, which -- though perfectly possible to do quickly at runtime when you first call up your "daemon" -- modifies the user's system (registry, etc), which would be highly unexpected if you're coming from a Unix world.
In light of that, I would argue that launching a pythonw.exe subprocess using subprocess.CREATE_NEW_PROCESS_GROUP is probably the closest Windows equivalent for a Python process to emulate a traditional Unix daemon. However, that still leaves you with the added challenge of signal handling and startup communications (not to mention making your code platform-dependent, which is always frustrating).
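A hedged sketch of that suggestion (the path to pythonw.exe is an assumption; adjust it for your installation):

import subprocess

# Launch the script with pythonw.exe (no console window) in its own process group.
independent_process = subprocess.Popen(
    [r'C:\Python27\pythonw.exe', r'C:\path\to\file.py'],
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)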
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above strategy. It also implements signal handling (for both Unix and Windows systems), and allows you to pass objects to the "daemon" process using pickle. Best of all, it has a cross-platform API:
from daemoniker import Daemonizer

with Daemonizer() as (is_setup, daemonizer):
    if is_setup:
        # This code is run before daemonization.
        do_things_here()

    # We need to explicitly pass resources to the daemon; other variables
    # may not be correct
    is_parent, my_arg1, my_arg2 = daemonizer(
        path_to_pid_file,
        my_arg1,
        my_arg2
    )

    if is_parent:
        # Run code in the parent after daemonization
        parent_only_code()

# We are now daemonized, and the parent just exited.
code_continues_here()
For that purpose you could daemonize your Python process or, since you are using a Windows environment, run it as a Windows service.
You know I hate posting only web links, but for more information relevant to your requirement:
A simple way to implement a Windows service. Read all the comments; they will resolve any doubts.
If you really want to learn more
First read this
what is daemon process or creating-a-daemon-the-python-way
Update: subprocess is not the right way to achieve this kind of thing.

python: functions from math and os modules are interrupted by EINTR

I have a Linux board with a Samsung s3c6410 SoC (ARM11).
I built the rootfs with Buildroot:
Python 2.7.1, uClibc-0.9.31.
Linux kernel:
Linux buildroot 2.6.28.6 #177 Mon Oct 3 12:50:57 EEST 2011 armv6l GNU/Linux
My app, written in Python, raises these exceptions under some mysterious conditions:
1)
exception:
File "./dfbUtils.py", line 3209, in setItemData
ValueError: (4, 'Interrupted system call')
code:
currentPage=int(math.floor(float(rowId)/self.pageSize))==self.selectedPage
2)
exception:
File "./terminalGlobals.py", line 943, in getFirmawareName
OSError: [Errno 4] Interrupted system call: 'firmware'
code:
for fileName in os.listdir('firmware'):
Some info about the app: it has 3-7 threads, listens to serial ports via the 'serial' module, and uses a GUI implemented via a C extension that wraps DirectFB. I can't reproduce these exceptions; they are not predictable.
I googled for EINTR exceptions in Python, but only found that EINTR can occur only on slow system calls, and that Python's socket, subprocess, and some other modules already handle EINTR. So what happens in my app? Why can a simple call to a math function interrupt the program at any time? It's not reliable at all. I have only guesses: a uClibc bug, or a kernel/hardware handling bug. But these guesses don't show me a solution.
For now I have created wrapper functions (that restart the operation in case of EINTR) around some functions from the os module, but wrapping the math module would roughly double execution time. There is another question: if math can be interrupted, then other modules can be too, so how do I get reliability?
P.S. I realize that a library call (to libm, for example) is not a system call, so why do I get "Interrupted system call"?
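For reference, a minimal sketch (my own, not the poster's actual code) of the kind of retry wrapper described above; it re-runs a call whenever it fails with EINTR, whether that surfaces as OSError/IOError or, as in the math example above, a ValueError carrying errno 4 in its arguments.

import errno

def retry_on_eintr(func, *args, **kwargs):
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError, ValueError) as ex:
            err = getattr(ex, 'errno', None)
            if err is None and ex.args:
                err = ex.args[0]   # e.g. ValueError: (4, 'Interrupted system call')
            if err != errno.EINTR:
                raise

# e.g. file_names = retry_on_eintr(os.listdir, 'firmware')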
There was an old bug with threads and EINTR in uClibc (#4994) that they fixed in 0.9.30. The fix was tested against pthreads, so I would second the suggestion by tMC to check how you configured threads when building uClibc.
Also, you could try compiling with the malloc-simple option. It is slow, but if your issue disappears it may suggest threading issues as well:
malloc-simple is trivially simple and slow as molasses. It
was written from scratch for uClibc, and is the simplest possible
(and therefore smallest) malloc implementation.
This uses only the mmap() system call to allocate and free memory,
and does not use the brk() system call at all, making it a fine
choice for MMU-less systems with very limited memory. It's 100%
standards compliant, thread safe, very small, and releases freed
memory back to the OS immediately rather than keeping it in the
process's heap for reallocation. It is also VERY SLOW.
