python: monitor updates in /proc/mydev file

I wrote a kernel module that writes to /proc/mydev to notify a Python program in userspace. I want to trigger a function in the Python program whenever the kernel module updates the data in /proc/mydev. What is the best way to listen for an update here? I am thinking about using "watchdog" (https://pythonhosted.org/watchdog/). Is there a better way to do this?

This is an easy and efficient way:

import os
from time import sleep
from datetime import datetime

def myfunction(_time):
    print("file modified, time: " + datetime.fromtimestamp(_time).strftime("%H:%M:%S"))

if __name__ == "__main__":
    _time = 0
    while True:
        last_modified_time = os.stat("/proc/mydev").st_mtime
        if last_modified_time > _time:
            myfunction(last_modified_time)
            _time = last_modified_time
        sleep(1)  # prevent high CPU usage
result:
file modified, time: 11:44:09
file modified, time: 11:46:15
file modified, time: 11:46:24
The while loop guarantees that the program keeps listening for changes forever.
You can set the polling interval by changing the sleep time; a lower sleep time means higher CPU usage.

Another option is to poll the file descriptor with select.poll():

import os
import select

# get a file descriptor for the proc file
fd = os.open("/proc/mydev", os.O_RDONLY)

# create a polling object and register the descriptor for read events
poller = select.poll()
poller.register(fd, select.POLLIN)

# loop forever, waiting for the file to become readable
while True:
    events = poller.poll(10000)  # timeout in milliseconds
    if len(events) > 0:
        # seek back to the start and read the updated contents
        os.lseek(fd, 0, os.SEEK_SET)
        print(os.read(fd, 1024))

sudo pip install inotify
Example
Code for monitoring a simple, flat path (see “Recursive Watching” for watching a hierarchical structure):
import inotify.adapters

def _main():
    i = inotify.adapters.Inotify()
    i.add_watch('/tmp')

    with open('/tmp/test_file', 'w'):
        pass

    for event in i.event_gen(yield_nones=False):
        (_, type_names, path, filename) = event
        print("PATH=[{}] FILENAME=[{}] EVENT_TYPES={}".format(
            path, filename, type_names))

if __name__ == '__main__':
    _main()
Expected output:
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_MODIFY']
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_OPEN']
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_CLOSE_WRITE']
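If you need to watch a directory tree rather than a single flat path, the same package provides a recursive adapter. A minimal sketch, assuming the same inotify package as above (the /tmp/watch_root path is just a placeholder):

import inotify.adapters

def _main():
    # InotifyTree watches the given directory and all of its subdirectories
    i = inotify.adapters.InotifyTree('/tmp/watch_root')
    for event in i.event_gen(yield_nones=False):
        (_, type_names, path, filename) = event
        print("PATH=[{}] FILENAME=[{}] EVENT_TYPES={}".format(
            path, filename, type_names))

if __name__ == '__main__':
    _main()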

I'm not sure if this will work for your situation, since it seems you want to watch a folder, but this program watches one file at a time until the main() loop repeats:
import os
import time

def main():
    contents = os.listdir("/proc/mydev")
    for file in contents:
        with open("/proc/mydev/" + file, "r") as f:
            init = f.read()
        different = False
        while not different:
            with open("/proc/mydev/" + file, "r") as f:
                check = f.read()
            if init != check:
                different = True
            else:
                time.sleep(1)
        # Write what you would want to happen if a change occurred here...

while True:
    main()
You could then write what you would want to happen at the commented line, once a change has been detected; the outer while True loop makes the whole check repeat.
Also, this may contain errors, since I rushed this.
Hope this at least helps!

You can't do this efficiently without modifying your kernel driver.
Instead of using procfs, have it register a new character device under /dev, and write the driver so that new content becomes available to read from that device only when new content has actually come in from the underlying hardware. The application layer can then issue a blocking read that returns only when new content exists.
A good example to work from (which also has plenty of native Python clients) is the evdev devices in the input core.
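On the userspace side the Python client then becomes simple. A minimal sketch, assuming the rewritten driver exposes a hypothetical /dev/mydev node whose read() blocks until new data is available (the path and buffer size are placeholders):

import os

DEVICE = "/dev/mydev"   # hypothetical device node exposed by the rewritten driver

def handle_update(data):
    print("new data from kernel:", data)

def main():
    fd = os.open(DEVICE, os.O_RDONLY)
    try:
        while True:
            # blocks until the driver has new content to hand out
            data = os.read(fd, 4096)
            if not data:          # EOF: the driver ended the stream
                break
            handle_update(data)
    finally:
        os.close(fd)

if __name__ == "__main__":
    main()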

Related

Sensor data with Python does not get written to file

I'm currently working on a script for my sensor on my Raspberry Pi. The code below should get the values from my sensor and write them into the data.json file. My problem is: if I run the script with the Thonny editor everything works, but if I add the script to my crontab the data does not get written to the data.json file.
The Code:
import time
import board
import adafruit_dht
import psutil
import io
import json
import os
from gpiozero import LED
from datetime import date
from datetime import datetime

# We first check if a libgpiod process is running. If yes, we kill it!
for proc in psutil.process_iter():
    if proc.name() == "libgpiod_pulsein" or proc.name() == "libgpiod_pulsei":
        proc.kill()

sensor = adafruit_dht.DHT11(board.D23)

# init
temp_values = [10]
hum_values = [10]
counter = 0
dataLED = LED(13)
dataList = []

def errSignal():
    for i in range(0, 3):
        dataLED.on()
        time.sleep(0.1)
        dataLED.off()
        time.sleep(0.1)

# on startup
def runSignal():
    for i in range(0, 5):
        dataLED.on()
        time.sleep(0.2)
        dataLED.off()
        time.sleep(0.2)

def getExistingData():
    with open('data.json') as fp:
        dataList = json.load(fp)
    print(dataList)

def startupCheck():
    if os.path.isfile("data.json") and os.access("data.json", os.R_OK):
        # checks if file exists
        print("File exists and is readable.")
        # get json data and push into arr on startup
        getExistingData()
    else:
        print("Either file is missing or is not readable, creating file...")
        # create json file
        with open("data.json", "w") as f:
            print("The json file is created.")

def calc_avgValue(values):
    sum = 0
    for iterator in values:
        sum += iterator
    return sum / len(values)

def onOFF():
    dataLED.on()
    time.sleep(0.7)
    dataLED.off()

# data led blinking on startup
runSignal()
# checks if file exists
startupCheck()

while True:
    try:
        temp_values.insert(counter, sensor.temperature)
        hum_values.insert(counter, sensor.humidity)
        counter += 1
        time.sleep(6)
        if counter >= 10:
            print(
                "Temperature: {}*C Humidity: {}% ".format(
                    round(calc_avgValue(temp_values), 2),
                    round(calc_avgValue(hum_values), 2)
                )
            )
            # get time
            today = date.today()
            now = datetime.now()
            # create json obj
            data = {
                "temperature": round(calc_avgValue(temp_values), 2),
                "humidity": round(calc_avgValue(hum_values), 2),
                "fullDate": str(today),
                "fullDate2": str(today.strftime("%d/%m/%Y")),
                "fullDate3": str(today.strftime("%B %d, %Y")),
                "fullDate4": str(today.strftime("%b-%d-%Y")),
                "date_time": str(now.strftime("%d/%m/%Y %H:%M:%S"))
            }
            # push data into list
            dataList.append(data)
            # writing to data.json
            with open("data.json", "w") as f:
                json.dump(dataList, f, indent=4, separators=(',', ': '))
            # if data is written signal appears
            onOFF()
            print("Data has been written to data.json...")
            counter = 0
    except RuntimeError as error:
        continue
    except Exception as error:
        sensor.exit()
        while True:
            errSignal()
            raise error
            time.sleep(0.2)
Crontab Menu (screenshot omitted; the line in the center is the script).
Investigation areas:
Do not put & in crontab, it serves no purpose.
You should capture the output of your scripts to see what is going on. You do this by adding >/tmp/stats.out 2>/tmp/stats.err (and similar for the other 2 lines). You will see what output and errors your scripts encounter.
cron does not run your scripts in the same environment, and from the same directory you are running them. Load what you require in the script.
cron might not have permission to write into data.json in the directory it is running from. Specify a full path, and ensure cron can write in that directory.
Look at https://unix.stackexchange.com/questions/109804/crontabs-reboot-only-works-for-root for usage of @reboot. Things that should occur at startup should be configured through systemd or init.d (I do not know what Raspberry Pi OS uses versus other distros). Cron is for scheduling jobs, not for running things at startup.
It could be as simple as not having python3 in the PATH configured in cron (see the example crontab entry below).
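As a sketch only (the schedule and paths here are placeholders, not taken from the screenshot), an entry that captures output, runs from the script's own directory, and uses an absolute interpreter path could look like:

# run every minute, from the script's directory, capturing stdout and stderr
* * * * * cd /home/pi/sensor && /usr/bin/python3 stats.py >/tmp/stats.out 2>/tmp/stats.err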

Python shutil module move exception handling

The destination path /tmp/abc is the same in both Process 1 and Process 2.
Say there are N processes running.
We need to retain the file generated by the latest one.
Process1
import shutil
shutil.move(src_path, destination_path)
Process 2
import os
os.remove(destination_path)
Solution
1. Handle it in the process: catch the error if the move fails with [Errno 2] No such file or directory.
Is this the correct solution? Is there a better way to handle this?
Useful link: A safe, atomic file-copy operation
You can catch the FileNotFoundError:

import shutil

try:
    shutil.move(src_path, destination_path)
except FileNotFoundError:
    print('File Not Found')
    # Add whatever logic you want to execute
except Exception:
    print('Some other error')
The primary solutions that come to mind are either
Keep partial information in per-process staging file
Rely on the os for atomic moves
or
Keep partial information in per-process memory
Rely on interprocess communication / locks for atomic writes
For the first, e.g.:
import tempfile
import os

FINAL = '/tmp/something'

def do_stuff():
    fd, name = tempfile.mkstemp(suffix="-%s" % os.getpid())
    while keep_doing_stuff():
        os.write(fd, get_output())
    os.close(fd)
    os.rename(name, FINAL)

if __name__ == '__main__':
    do_stuff()
You can choose to invoke individually from a shell (as shown above) or with some process wrappers (subprocess or multiprocessing would be fine), and either way will work.
For the interprocess approach, you would probably want to spawn everything from a parent process:

from multiprocessing import Process, Lock
from io import StringIO

FINAL = '/tmp/something'  # same destination as above

def do_stuff(lock):
    output = StringIO()
    while keep_doing_stuff():
        output.write(get_output())
    with lock:
        with open(FINAL, 'w') as f:
            f.write(output.getvalue())
    output.close()

if __name__ == '__main__':
    lock = Lock()
    for num in range(2):
        Process(target=do_stuff, args=(lock,)).start()

multiprocessing to a python function

How can I implement multiprocessing in my function? I tried it like this, but it did not work.
def steric_clashes_parallel(system):
    rna_st = system[MolWithResID("G")].molecule()
    for i in system.molNums():
        peg_st = system[i].molecule()
        if rna_st != peg_st:
            print(peg_st)
            for i in rna_st.atoms(AtomIdx()):
                for j in peg_st.atoms(AtomIdx()):
                    # print(Vector.distance(i.evaluate().center(), j.evaluate().center()))
                    dist = Vector.distance(i.evaluate().center(), j.evaluate().center())
                    if dist < 2:
                        return print("there is a steric clash")
    return print("there is no steric clashes")

mix = PDB().read("clash_1.pdb")
system = System()
system.add(mix)

from multiprocessing import Pool
p = Pool(4)
p.map(steric_clashes_parallel, system)
I have thousands of pdb or system files to test through this function. It took 2 hours for one file on a single core without the multiprocessing module. Any suggestion would be a great help.
My traceback looks something like this:
  self.run()
  File "/home/sajid/sire.app/bundled/lib/python3.3/threading.py", line 858, in run
    self._target(*self._args, **self._kwargs)
  File "/home/sajid/sire.app/bundled/lib/python3.3/multiprocessing/pool.py", line 351, in _handle_tasks
    put(task)
  File "/home/sajid/sire.app/bundled/lib/python3.3/multiprocessing/connection.py", line 206, in send
    ForkingPickler(buf, pickle.HIGHEST_PROTOCOL).dump(obj)
RuntimeError: Pickling of "Sire.System._System.System" instances is not enabled
(boost.org/libs/python/doc/v2/pickle.html)
The problem is that Sire.System._System.System can't be serialized so it can't be sent to the child process. Multiprocessing uses the pickle module for serialization and you can frequently do a sanity check in the main program with pickle.dumps(my_mp_object) to verify.
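For instance, a quick hedged sketch of that sanity check (the object name here just mirrors the one in the question):

import pickle

try:
    pickle.dumps(system)   # the object you intend to pass through Pool.map
except Exception as exc:
    print("not picklable, so it cannot cross a process boundary:", exc)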
You have another problem, though (or I think you do, based on variable names). The map method takes an iterable and fans its iterated objects out to pool members, but it appears that you want to process system itself, not something that it iterates over.
One trick to multiprocessing is to keep the payload that you send from the parent to the child simple and let the child do the heavy lifting of creating its objects. Here, you might be better off just sending down filenames and letting the children do most of the work.
def steric_clashes_from_file(filename):
    mix = PDB().read(filename)
    system = System()
    system.add(mix)
    steric_clashes_parallel(system)

def steric_clashes_parallel(system):
    rna_st = system[MolWithResID("G")].molecule()
    for i in system.molNums():
        peg_st = system[i].molecule()
        if rna_st != peg_st:
            print(peg_st)
            for i in rna_st.atoms(AtomIdx()):
                for j in peg_st.atoms(AtomIdx()):
                    # print(Vector.distance(i.evaluate().center(), j.evaluate().center()))
                    dist = Vector.distance(i.evaluate().center(), j.evaluate().center())
                    if dist < 2:
                        return print("there is a steric clash")
    return print("there is no steric clashes")

filenames = ["clash_1.pdb",]

from multiprocessing import Pool
p = Pool(4)
p.map(steric_clashes_from_file, filenames, chunksize=1)
@martineau:
I tested the pickle command and it gave me:
----> 1 pickle.dumps(clash_1.pdb)
RuntimeError: Pickling of "Sire.Mol._Mol.MoleculeGroup" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)
----> 1 pickle.dumps(system)
RuntimeError: Pickling of "Sire.System._System.System" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)
With your script it took the same time, using only a single core. The dist line is iterable though. Can I run this single line over multiple cores? I modified the line as:
for i in rna_st.atoms(AtomIdx()):
    icent = i.evaluate().center()
    for j in peg_st.atoms(AtomIdx()):
        dist = Vector.distance(icent, j.evaluate().center())
There is one trick you can do to get a faster computation for each file -- processing each file sequentially, but processing the contents of the file in parallel. This relies on a number of caveats:
You are running on a system that can fork processes (such as Linux).
The computations you are doing do not have side effects that affect the result of future computations.
It seems like this is the case in your situation, but I can't be 100% sure.
When a process is forked, all the memory in the child process is duplicated from the parent process (what's more, it is duplicated in an efficient manner -- bits of memory that are only read from aren't duplicated). This makes it easy to share a big, complex initial state between processes. However, once the child processes have started, they will not see any changes to objects made in the parent process (and vice versa).
Sample code:
import multiprocessing

system = None
rna_st = None

class StericClash(Exception):
    """Exception used to halt processing of a file. Could be modified to
    include information about what caused the clash if this is useful."""
    pass

def steric_clashes_parallel(system_index):
    peg_st = system[system_index].molecule()
    if rna_st != peg_st:
        for i in rna_st.atoms(AtomIdx()):
            for j in peg_st.atoms(AtomIdx()):
                dist = Vector.distance(i.evaluate().center(),
                                       j.evaluate().center())
                if dist < 2:
                    raise StericClash()

def process_file(filename):
    global system, rna_st

    # initialise global values before creating pool
    mix = PDB().read(filename)
    system = System()
    system.add(mix)
    rna_st = system[MolWithResID("G")].molecule()

    with multiprocessing.Pool() as pool:
        # contents of file processed in parallel
        try:
            pool.map(steric_clashes_parallel, range(system.molNums()))
        except StericClash:
            # terminate called to halt current jobs and further processing
            # of file
            pool.terminate()
            # wait for pool processes to terminate before returning
            pool.join()
            return False
        else:
            pool.close()
            pool.join()
            return True
        finally:
            # reset globals
            system = rna_st = None

if __name__ == "__main__":
    for filename in get_files_to_be_processed():
        # files are being processed in serial
        result = process_file(filename)
        save_result_to_disk(filename, result)

What File Descriptor object does Python AsyncIO's loop.add_reader() expect?

I'm trying to understand how to use the new AsyncIO functionality in Python 3.4 and I'm struggling with how to use event_loop.add_reader(). From the limited discussions that I've found, it looks like it's for reading the standard output of a separate process, as opposed to the contents of an open file. Is that true? If so, it appears that there's no AsyncIO-specific way to integrate standard file IO; is this also true?
I've been playing with the following code. Running it raises PermissionError: [Errno 1] Operation not permitted from line 399 of /python3.4/selectors.py (self._epoll.register(key.fd, epoll_events)), which is triggered by the add_reader() line below.
import asyncio
import urllib.parse
import sys
import pdb
import os

def fileCallback(*args):
    pdb.set_trace()

path = sys.argv[1]

loop = asyncio.get_event_loop()
#fd = os.open(path, os.O_RDONLY)
fd = open(path, 'r')
#data = fd.read()
#print(data)
#fd.close()
pdb.set_trace()
task = loop.add_reader(fd, fileCallback, fd)
loop.run_until_complete(task)
loop.close()
EDIT
For those looking for an example of how to use AsyncIO to read more than one file at a time like I was curious about, here's an example of how it can be accomplished. The secret is in the line yield from asyncio.sleep(0). This essentially pauses the current function, putting it back in the event loop queue, to be called after all other ready functions are executed. Functions are determined to be ready based on how they were scheduled.
import asyncio

@asyncio.coroutine
def read_section(file, length):
    yield from asyncio.sleep(0)
    return file.read(length)

@asyncio.coroutine
def read_file(path):
    fd = open(path, 'r')
    retVal = []
    cnt = 0
    while True:
        cnt = cnt + 1
        data = yield from read_section(fd, 102400)
        print(path + ': ' + str(cnt) + ' - ' + str(len(data)))
        if len(data) == 0:
            break
    fd.close()

paths = ["loadme.txt", "loadme also.txt"]

loop = asyncio.get_event_loop()
tasks = []
for path in paths:
    tasks.append(asyncio.async(read_file(path)))
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
These functions expect a file descriptor, that is, the underlying integer the operating system uses, not a Python file object. File objects based on file descriptors return that descriptor from their fileno() method, so for example:
>>> sys.stderr.fileno()
2
In Unix, file descriptors can be attached to files or a lot of other things, including other processes.
Edit for the OP's edit:
As Max in the comments says, you cannot use epoll on local files (and asyncio uses epoll). Yes, that's kind of weird. You can use it on pipes, though, for example:
import asyncio
import urllib.parse
import sys
import pdb
import os

def fileCallback(*args):
    print("Received: " + sys.stdin.readline())

loop = asyncio.get_event_loop()
task = loop.add_reader(sys.stdin.fileno(), fileCallback)
loop.run_forever()
This will echo stuff you write on stdin.
You cannot use add_reader on local files, because:
It cannot be done using select/poll/epoll
It depends on the operating system
It cannot be fully asynchronous because of OS limitations (Linux does not support async fs metadata read/write)
But technically, yes, you should be able to do async filesystem read/write: (almost) all systems have a DMA mechanism for doing I/O "in the background". And no, local I/O is not so fast that no one would want it asynchronous; the CPU is on the order of a million times faster than disk I/O.
Look at aiofile or aiofiles if you want to try async file I/O.
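A minimal aiofiles sketch, assuming the aiofiles package is installed and using the newer async/await syntax (Python 3.7+, unlike the 3.4-era yield from style above); the file path is just a placeholder:

import asyncio
import aiofiles

async def read_file(path):
    # aiofiles runs the blocking file I/O in a background thread pool
    async with aiofiles.open(path, 'r') as f:
        return await f.read()

async def main():
    data = await read_file('loadme.txt')
    print(len(data))

asyncio.run(main())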

pyinotify bug with reading file on creation?

I want to parse a file every time a new file is created in a certain directory. For this, I'm trying to use pyinotify to set up a directory watch for IN_CREATE kernel events and fire the parse() method.
Here is the module:
from pyinotify import WatchManager, ThreadedNotifier, ProcessEvent, IN_CREATE

class Watcher(ProcessEvent):
    watchdir = '/tmp/watch'

    def __init__(self):
        ProcessEvent.__init__(self)
        wm = WatchManager()
        self.notifier = ThreadedNotifier(wm, self)
        wdd = wm.add_watch(self.watchdir, IN_CREATE)
        self.notifier.start()

    def process_IN_CREATE(self, event):
        pfile = self._parse(event.pathname)
        print(pfile)

    def _parse(self, filename):
        f = open(filename)
        file = [line.strip() for line in f.readlines()]
        f.close()
        return file

if __name__ == '__main__':
    Watcher()
The problem is that the list returned by _parse is empty when triggered by a new file creation event, like so (the file is created in another window while watcher.py is running):
$ python watcher.py
[]
...but strangely enough, it works from an interpreter session when called directly.
>>> import watcher
>>> w = watcher.Watcher()
>>> w._parse('/tmp/watch/sample')
['This is a sample file', 'Another line', 'And another...']
Why is this happening? The farthest I've come in debugging this is knowing that something is making pyinotify not read the file correctly. But... why?
Maybe you want to wait till the file is closed?
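That would mean watching for IN_CLOSE_WRITE instead of IN_CREATE (which is what the asker eventually settled on further below). A minimal sketch of that change, assuming a pyinotify release (0.8+) that exposes the event constants at module level and event.pathname on events:

from pyinotify import WatchManager, ThreadedNotifier, ProcessEvent, IN_CLOSE_WRITE

class Watcher(ProcessEvent):
    watchdir = '/tmp/watch'

    def __init__(self):
        ProcessEvent.__init__(self)
        wm = WatchManager()
        self.notifier = ThreadedNotifier(wm, self)
        wm.add_watch(self.watchdir, IN_CLOSE_WRITE)
        self.notifier.start()

    def process_IN_CLOSE_WRITE(self, event):
        # by now the writer has closed the file, so its contents are complete
        with open(event.pathname) as f:
            print([line.strip() for line in f])

if __name__ == '__main__':
    Watcher()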
Here's some code that works for me, with a 2.6.18 kernel, Python 2.4.3, and pyinotify 0.7.1 -- you may be using different versions of some of these, but it's important to make sure we're talking about the same versions, I think...:
#!/usr/bin/python2.4

import os.path
from pyinotify import pyinotify

class Watcher(pyinotify.ProcessEvent):
    watchdir = '/tmp/watch'

    def __init__(self):
        pyinotify.ProcessEvent.__init__(self)
        wm = pyinotify.WatchManager()
        self.notifier = pyinotify.ThreadedNotifier(wm, self)
        wdd = wm.add_watch(self.watchdir, pyinotify.EventsCodes.IN_CREATE)
        print "Watching", self.watchdir
        self.notifier.start()

    def process_IN_CREATE(self, event):
        print "Seen:", event
        pathname = os.path.join(event.path, event.name)
        pfile = self._parse(pathname)
        print(pfile)

    def _parse(self, filename):
        f = open(filename)
        file = [line.strip() for line in f]
        f.close()
        return file

if __name__ == '__main__':
    Watcher()
when this is running in a terminal window, and in another terminal window I do
echo "ciao" >/tmp/watch/c3
this program's output is:
Watching /tmp/watch
Seen: event_name: IN_CREATE is_dir: False mask: 256 name: c3 path: /tmp/watch wd: 1
['ciao']
as expected. So can you please try this script (fixing the Python version in the hashbang if needed, of course) and tell us the exact releases of the Linux kernel, pyinotify, and Python that you are using, and what you observe in these exact circumstances? Quite possibly with more detailed info we can identify exactly which bug or anomaly is giving you problems. Thanks!
As @SilentGhost mentioned, you may be reading the file before any content has been added to it (i.e. you are getting notified of the file's creation, not of writes to it).
Update: The loop.py example in the pyinotify tarball will dump the sequence of inotify events to the screen. To determine which event you need to trigger on, launch loop.py to monitor /tmp and then perform the file manipulation you want to track.
I think I solved the problem by using the IN_CLOSE_WRITE event instead. I'm not sure what was happening before that made it not work.
@Alex: Thanks, I tried your script, but I'm using newer versions: Python 2.6.1, pyinotify 0.8.6 and Linux 2.6.28, so it didn't work for me.
It was definitely a matter of trying to parse the file before it was written, so kudos to SilentGhost and DanM for figuring it out.
