Process, memory and network resource tracer

Process, memory and network resource tracer - python

I would like to try to make a process, memory and network resource tracer similar to the one that comes by default in ubuntu for any operating system. But being new in python I don't know how to get these values to be displayed (in principle by console, then I'll do them as graphics). Which library would be easier to do it with?

On linux, you could leverage the /proc filesystem to read the information you need for such a task.
the /proc filesystem is a window into the kernel with a lot of data on each process running. It is displayed as a virtual filesystem, meaning you can access all that info simply by reading and parsing files.
For instance,
from pathlib import Path
proc = Path('/proc')
for proc in proc.iterdir():
if not proc.name.isnumeric():
continue # ignore directories that aren't processes
pid = proc.name
cmdline = (proc / 'cmdline').read_text()
print(f'PROCESS : {pid} : {cmdline}')
This will list all processes running, along with their commandline.
You have a lot of information you can gather in there.
more info on /proc here

Related

How to kill subprocess of xdg?

I spent a lot of time searching for the answer for my question, but I could not find it.
I run xdm-open, using subrocess, to play a video (I do not want to know what applications are available)
I'm waiting for a while
I want to kill the video
import os
import subprocess
import psutil
from time import sleep;
from signal import SIGTERM
file = "test.mkv"
out = subprocess.Popen(['xdg-open', file])
pid = out.pid
print('sleeping...')
sleep(20.0)
print('end of sleep...')
os.kill(pid, SIGTERM) #alternatively: out.terminate()
Unfortunatelly the last line is killing only the xdg-open process. The mplayer process (which was started by xdg) is still running.
I tried to get the sub-processes of the xdg by using the following code:
main_process = psutil.Process(pid)
children_processes = main_process.children(recursive=True)
for child in children_processes:
print("child process: ", child.pid, child.name())
but it did not help either. The list was empty.
Has anybody an idea how to kill the player process?

Programs like xdg-open typically look for a suitable program to open a file with, start that program with the file as argument and then exit.
By the time you get around to checking for child processes, xdg-open has probably already exited.
What happens then is OS dependant. In what follows, I'll be talking about UNIX-like operating systems.
The processes launched by xdg-open will usually get PID 1 as their parent process id (PPID) after xdg-open exits, so it will be practically impossible to find out for certain who started them by looking at the PPID.
But, there will probably be a relatively small number of processes running under your user-ID with PPID 1, so if you list those before and after calling xdg-open and remove all the programs that were in the before-list from the after-list, the program you seek will be in the after-list. Unless your machine is very busy, chances are that there will be only one item in the after-list; the one started by xdg-open.
Edit 1:
You commented:
I want to make the app OS independent.
All operating systems that support xdg-open are basically UNIX-like operating systems. If you use the psutil Python module to get process information, you can run your "app" on all the systems that psutil supports:
Linux
macOS
FreeBSD, OpenBSD, NetBSD
Sun Solaris
AIX
(psutil even works on ms-windows, but I kind of doubt you will find xdg-open there...)

Python multiprocessing copy-on-write behaving differently between OSX and Ubuntu

I'm trying to share objects between the parent and child process in Python. To play around with the idea, I've created a simple Python script:
from multiprocessing import Process
from os import getpid
import psutil
shared = list(range(20000000))
def shared_printer():
mem = psutil.Process(getpid()).memory_info().rss / (1024 ** 2)
print(getpid(), len(shared), '{}MB'.format(mem))
if __name__ == '__main__':
p = Process(target=shared_printer)
p.start()
shared_printer()
p.join()
The code snippet uses the excellent psutil library to print the RSS (Resident Set Size). When I run this on OSX with Python 2.7.15, I get the following output:
(33101, 20000000, '1MB')
(33100, 20000000, '626MB')
When I run the exact same snippet on Ubuntu (Linux 4.15.0-1029-aws #30-Ubuntu SMP x86_64 GNU/Linux), I get the following output:
(4077, 20000000, '632MB')
(4078, 20000000, '629MB')
Notice that the child process' RSS is basicall 0MB on OSX and about the same size as the parent process' RSS in Linux. I had assumed that copy-on-write behavior would work the same way in Linux and allow the child process to refer to the parent process' memory for most pages (perhaps except the one storing the head of the object).
So I'm guessing that there's some difference in the copy-on-write behavior in the 2 systems. My question is: is there anything I can do in Linux to get that OSX-like copy-on-write behavior?

So I'm guessing that there's some difference in the copy-on-write behavior >in the 2 systems. My question is: is there anything I can do in Linux to >get that OSX-like copy-on-write behavior?
The answer is NO. Behind the command psutil.Process(getpid()).memory_info().rss / (1024 ** 2) the OS uses the UNIX command $top [PID] and search for the field RES. Which contains the non-swapped physical memory a task has used in kb. i.e. RES = CODE + DATA.
IMHO, these means that both OS uses different memory managers. So that, it's almost impossible to constrain how much memory a process uses/needs. This is a intern issue of the OS.
In Linux the child process has the same size of the parent process. Indeed, they copy the same stack, code and data. But different PCB (Process Control Block). Therefore, it is impossible to get close to 0 as OSX does. It smells that OSX does not literally copy the code and data. If they are the same code, it will make pointer to the data of the parent process.
PD: I hope that would help you!

Linux IPC: Locking, but not file based locking

I need a way to ensure only one python process is processing a directory.
The lock/semaphore should be local to the machine (linux operating system).
Networking or NFS is not involved.
I would like to avoid file based locks, since I don't know where I should put the lock file.
There are libraries which provide posix IPC at pypi.
Is there no way to use linux semaphores with python without a third party library?
The lock provided by multiprocessing.Lock does not help, since both python interpreter don't share one the same parent.
Threading is not involved. All processes have only one thread.
I am using Python 2.7 on linux.
How to to synchronize two python scripts on linux (without file based locking)?
Required feature: If one process dies, then the lock/semaphore should get released by the operating system.

flock the directory itself — then you never need worry about where to put the lock file:
import errno
import fcntl
import os
import sys
# This will work on Linux
dirfd = os.open(THE_DIRECTORY, os.O_RDONLY) # FIXME: FD_CLOEXEC
try:
fcntl.flock(dirfd, fcntl.LOCK_EX|fcntl.LOCK_NB)
except IOError as ex:
if ex.errno != errno.EAGAIN:
raise
print "Somebody else is working here; quitting." # FIXME: logging
sys.exit(1)
do_the_work()
os.close(dirfd)

I would like to avoid file based locks, since I don't know where I should put the lock file.
You can lock the existing file or directory (the one being processed).
Required feature: If one process dies, then the lock/semaphore should get released by the operating system.
That is exactly how file locks work.

How to start daemon process from python on windows?

Can my python script spawn a process that will run indefinitely?
I'm not too familiar with python, nor with spawning deamons, so I cam up with this:
si = subprocess.STARTUPINFO()
si.dwFlags = subprocess.CREATE_NEW_PROCESS_GROUP | subprocess.CREATE_NEW_CONSOLE
subprocess.Popen(executable, close_fds = True, startupinfo = si)
The process continues to run past python.exe, but is closed as soon as I close the cmd window.

Using the answer Janne Karila pointed out this is how you can run a process that doen't die when its parent dies, no need to use the win32process module.
DETACHED_PROCESS = 8
subprocess.Popen(executable, creationflags=DETACHED_PROCESS, close_fds=True)
DETACHED_PROCESS is a Process Creation Flag that is passed to the underlying CreateProcess function.

This question was asked 3 years ago, and though the fundamental details of the answer haven't changed, given its prevalence in "Windows Python daemon" searches, I thought it might be helpful to add some discussion for the benefit of future Google arrivees.
There are really two parts to the question:
Can a Python script spawn an independent process that will run indefinitely?
Can a Python script act like a Unix daemon on a Windows system?
The answer to the first is an unambiguous yes; as already pointed out; using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword will suffice:
import subprocess
independent_process = subprocess.Popen(
'python /path/to/file.py',
creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
Note that, at least in my experience, CREATE_NEW_CONSOLE is not necessary here.
That being said, the behavior of this strategy isn't quite the same as what you'd expect from a Unix daemon. What constitutes a well-behaved Unix daemon is better explained elsewhere, but to summarize:
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
The reality of the situation is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Rolling your application into a Windows service, though a viable alternative in many cases, also doesn't quite fit. The same is true of using pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries). In particular, they fail to improve the situation for signal handling, and they cannot easily launch an application from a terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing". Additionally, Windows services require installation, which -- though perfectly possible to do quickly at runtime when you first call up your "daemon" -- modifies the user's system (registry, etc), which would be highly unexpected if you're coming from a Unix world.
In light of that, I would argue that launching a pythonw.exe subprocess using subprocess.CREATE_NEW_PROCESS_GROUP is probably the closest Windows equivalent for a Python process to emulate a traditional Unix daemon. However, that still leaves you with the added challenge of signal handling and startup communications (not to mention making your code platform-dependent, which is always frustrating).
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above strategy. It also implements signal handling (for both Unix and Windows systems), and allows you to pass objects to the "daemon" process using pickle. Best of all, it has a cross-platform API:
from daemoniker import Daemonizer
with Daemonizer() as (is_setup, daemonizer):
if is_setup:
# This code is run before daemonization.
do_things_here()
# We need to explicitly pass resources to the daemon; other variables
# may not be correct
is_parent, my_arg1, my_arg2 = daemonizer(
path_to_pid_file,
my_arg1,
my_arg2
)
if is_parent:
# Run code in the parent after daemonization
parent_only_code()
# We are now daemonized, and the parent just exited.
code_continues_here()

For that purpose you could daemonize your python process or as you are using windows environment you would like to run this as a windows service.
You know i like to hate posting only web-links:
But for more information according to your requirement:
A simple way to implement Windows Service. read all comments it will resolve any doubt
If you really want to learn more
First read this
what is daemon process or creating-a-daemon-the-python-way
update:
Subprocess is not the right way to achieve this kind of thing

Printing PDF's using Python,win32api, and Acrobat Reader 9

I have reports that I am sending to a system that requires the reports be in a readable PDF format. I tried all of the free libraries and applications and the only one that I found worked was Adobe's acrobat family.
I wrote a quick script in python that uses the win32api to print a pdf to my printer with the default registered application (Acrobat Reader 9) then to kill the task upon completion since acrobat likes to leave the window open when called from the command line.
I compiled it into an executable and pass in the values through the command line
(for example printer.exe %OUTFILE% %PRINTER%) this is then called within a batch file
import os,sys,win32api,win32print,time
# Command Line Arguments.
pdf = sys.argv[1]
tempprinter = sys.argv[2]
# Get Current Default Printer.
currentprinter = win32print.GetDefaultPrinter()
# Set Default printer to printer passed through command line.
win32print.SetDefaultPrinter(tempprinter)
# Print PDF using default application, AcroRd32.exe
win32api.ShellExecute(0, "print", pdf, None, ".", 0)
# Reset Default Printer to saved value
win32print.SetDefaultPrinter(currentprinter)
# Timer for application close
time.sleep(2)
# Kill application and exit scipt
os.system("taskkill /im AcroRd32.exe /f")
This seemed to work well for a large volume, ~2000 reports in a 3-4 hour period but I have some that drop off and I'm not sure if the script is getting overwhelmed or if I should look into multithreading or something else.
The fact that it handles such a large amount with no drop off leads me to believe that the issue is not with the script but I'm not sure if its an issue with the host system or Adobe Reader, or something else.
Any suggestions or opinions would be greatly appreciated.

Based on your feedback (win32api.ShellExecute() is probably not synchronous), your problem is the timeout: If your computer or the print queue is busy, the kill command can arrive too early.
If your script runs concurrently (i.e. you print all documents at once instead of one after the other), the kill command could even kill the wrong process (i.e. an acrobat process started by another invocation of the script).
So what you need it a better synchronization. There are a couple of things you can try:
Convert this into a server script which starts Acrobat once, then sends many print commands to the same process and terminates afterwards.
Use a global lock to make sure that ever only a single script is running. I suggest to create a folder somewhere; this is an atomic operation on every file system. If the folder exists, the script is active somewhere.
On top of that, you need to know when the job is finished. Use win32print.EnumJobs() for this.
If that fails, another solution could be to install a Linux server somewhere. You can run a Python server on this box which accepts print jobs that you send with the help of a small Python script on your client machine. The server can then print the PDFs for you in the background.
This approach allow you to add any kind of monitoring you like (sending mails if something fails or send a status mail after all jobs have finished).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.