How to get memory usage of an external program - python

How to get memory usage of an external program - python - python

I am trying to get the memory usage of an external program within my python script. I have tried using the script http://code.activestate.com/recipes/286222/ as follows:
m0 = memory()
subprocess.call('My program')
m1 = memory(m0)
print m1
But this seems to be just giving me the memory usage of the python script rather than 'My program'. Is there a way of outputting the memory usage of the program for use within the python script?

Try using Psutil
import psutil
import subprocess
import time
SLICE_IN_SECONDS = 1
p = subprocess.Popen('calling/your/program')
resultTable = []
while p.poll() == None:
resultTable.append(psutil.get_memory_info(p.pid))
time.sleep(SLICE_IN_SECONDS)

If you look at the recipe you will see the line:
_proc_status = '/proc/%d/status' % os.getpid()
I suggest you replace the os.getpid() with the process id of your child process. As #Neal said, as I was typing this you need to use Popen and get the pid attribute of the returned object.
However, you have a possible race condition because you don't know at what state the child process is at, and the memory usage will vary anyway.

You may want to check out the psutil module: http://code.google.com/p/psutil/. The Process Management section on the homepage gives you examples of getting memory usage for a running process specified by the pid.
Do you want to spawn the process you are monitoring in your script as well? If so, you probably don't want to use subprocess.call as this will wait for the program to exit and you won't be able to monitor it while it's running. If you want to spawn the process then monitor it, you probably want to use Popen http://docs.python.org/library/subprocess.html#subprocess.Popen. This will allow you to spawn the process, get the pid, hand the pid to psutil, then monitor the memory usage.

I know this is an older post, but it's the only one that appears when I google this issue, so, I want to add the updated version of this:
import psutil
import humanfriendly
proc = subprocess.Popen("...Your process...")
SLICE_IN_SECONDS = 1
while proc.poll() is None:
p = psutil.Process(proc.pid)
mem_status = "RSS {}, VMS: {}".format(humanfriendly.format_size(p.memory_info().rss),
humanfriendly.format_size(p.memory_info().vms))
time.sleep(SLICE_IN_SECONDS)
print(mem_status)
I used humanfriendly here, to make the values more readable, but it's not required.
The RSS and VMS values are on all os, and there may be other values depending on the os you're using: https://psutil.readthedocs.io/en/latest/#psutil.Process.memory_info

Related

Is there a way to get just the pid of a process with its name for the purpose of suspending it for a set time

I need to gain the pid of a program in order to suspend it temporarily. How can I gain the pid of a program using python with just its username in order to use this pid along with psutil to suspend the process? Let's just call it process.exe for now.
I have tried using item for item along with psuti. However, this gives me additional text along with the pid and I am unsure how to remove this unnecesary text.
I have tried using os.getpid but this gives me the pid of Python rather than the process I want to get the pid of.
1.
import psutil
pid = [item for item in psutil.process_iter() if item.name() == 'process.exe']
print(pid)
2.
import os
pid = os.getpid()
print(pid)
For (1) I want the output to just be
pid=x
However, right now it is:
[psutil.Process(pid=x, name='process.exe', started='14:11:40')]

In 1, you are receiving process object which contains all the information about process, you can use the process object to fetch the pid:
pid[0].pid
In 2nd ,
os.getpid()
returns the process id of python interpreter and is of hardly any use under python interpreter, only use it when you are running some python script to get its process id.

Python subprocess always waits for programm [duplicate]

I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.

While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands its equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.

Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command')
(or, alternatively, you may try the less portable os.P_NOWAIT flag).
See the documentation here.

You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.

I found this here:
On windows (win xp), the parent process will not finish until the longtask.py has finished its work. It is not what you want in CGI-script. The problem is not specific to Python, in PHP community the problems are the same.
The solution is to pass DETACHED_PROCESS Process Creation Flag to the underlying CreateProcess function in win API. If you happen to have installed pywin32 you can import the flag from the win32process module, otherwise you should define it yourself:
DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
creationflags=DETACHED_PROCESS).pid

Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess
if len(sys.argv) == 2:
time.sleep(5)
print 'track end'
if sys.platform == 'darwin':
subprocess.Popen(['say', 'hello'])
else:
print 'main begin'
subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
print 'main end'

Both capture output and run on background with threading
As mentioned on this answer, if you capture the output with stdout= and then try to read(), then the process blocks.
However, there are cases where you need this. For example, I wanted to launch two processes that talk over a port between them, and save their stdout to a log file and stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3
import os
import subprocess
import sys
import threading
def output_reader(proc, file):
while True:
byte = proc.stdout.read(1)
if byte:
sys.stdout.buffer.write(byte)
sys.stdout.flush()
file.buffer.write(byte)
else:
break
with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
open('log1.log', 'w') as file1, \
open('log2.log', 'w') as file2:
t1 = threading.Thread(target=output_reader, args=(proc1, file1))
t2 = threading.Thread(target=output_reader, args=(proc2, file2))
t1.start()
t2.start()
t1.join()
t2.join()
sleep.py
#!/usr/bin/env python3
import sys
import time
for i in range(4):
print(i + int(sys.argv[1]))
sys.stdout.flush()
time.sleep(0.5)
After running:
./main.py
stdout get updated every 0.5 seconds for every two lines to contain:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.

You probably want to start investigating the os module for forking different threads (by opening an interactive session and issuing help(os)). The relevant functions are fork and any of the exec ones. To give you an idea on how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple 'args' as an argument that contains the program's name and its parameters; you may also want to define stdin, out and err for the new thread):
try:
pid = os.fork()
except OSError, e:
## some debug output
sys.exit(1)
if pid == 0:
## eventually use os.putenv(..) to set environment variables
## os.execv strips of args[0] for the arguments
os.execv(args[0], args)

You can use
import os
pid = os.fork()
if pid == 0:
Continue to other code ...
This will make the python process run in background.

I haven't tried this yet but using .pyw files instead of .py files should help. pyw files dosen't have a console so in theory it should not appear and work like a background process.

Python Multiprocessing - sending inputs to child processes

I am using the multiprocessing module in python to launch few processes in parallel. These processes are independent of each other. They generate their own output and write out the results in different files. Each process calls an external tool using the subprocess.call method.
It was working fine until I discovered an issue in the external tool where due to some error condition it goes into a 'prompt' mode and waits for the user input. Now in my python script I use the join method to wait till all the processes finish their tasks. This is causing the whole thing to wait for this erroneous subprocess call. I can put a timeout for each of the process but I do not know in advance how long each one is going to run and hence this option is ruled out.
How do I figure out if any child process is waiting for an user input and how do I send an 'exit' command to it? Any pointers or suggestions to relevant modules in python will be really appreciated.
My code here:
import subprocess
import sys
import os
import multiprocessing
def write_script(fname,e):
f = open(fname,'w')
f.write("Some useful cammnd calling external tool")
f.close()
subprocess.call(['chmod','+x',os.path.abspath(fname)])
return os.path.abspath(fname)
def run_use(mname,script):
print "ssh "+mname+" "+script
subprocess.call(['ssh',mname,script])
if __name__ == '__main__':
dict1 = {}
dict['mod1'] = ['pp1','ext2','les3','pw4']
dict['mod2'] = ['aaa','bbb','ccc','ddd']
machines = ['machine1','machine2','machine3','machine4']
log_file.write(str(dict1.keys()))
for key in dict1.keys():
arr = []
for mod in dict1[key]:
d = {}
arr.append(mod)
if ((mod == dict1[key][-1]) | (len(arr)%4 == 0)):
for i in range(0,len(arr)):
e = arr.pop()
script = write_script(e+"_temp.sh",e)
d[i] = multiprocessing.Process(target=run_use,args=(machines[i],script,))
d[i].daemon = True
for pp in d:
d[pp].start()
for pp in d:
d[pp].join()

Since you're writing a shell script to run your subcommands, can you simply tell them to read input from /dev/null?
#!/bin/bash
# ...
my_other_command -a -b arg1 arg2 < /dev/null
# ...
This may stop them blocking on input and is a really simple solution. If this doesn't work for you, read on for some other options.
The subprocess.call() function is simply shorthand for constructing a subprocess.Popen instance and then calling the wait() method on it. So, your spare processes could instead create their own subprocess.Popen instances and poll them with poll() method on the object instead of wait() (in a loop with a suitable delay). This leaves them free to remain in communication with the main process so you can, for example, allow the main process to tell the child process to terminate the Popen instance with the terminate() or kill() methods and then itself exit.
So, the question is how does the child process tell whether the subprocess is awaiting user input, and that's a trickier question. I would say perhaps the easiest approach is to monitor the output of the subprocess and search for the user input prompt, assuming that it always uses some string that you can look for. Alternatively, if the subprocess is expected to generate output continually then you could simply look for any output and if a configured amount of time goes past without any output then you declare that process dead and terminate it as detailed above.
Since you're reading the output, actually you don't need poll() or wait() - the process closing its output file descriptor is good enough to know that it's terminated in this case.
Here's an example of a modified run_use() method which watches the output of the subprocess:
def run_use(mname,script):
print "ssh "+mname+" "+script
proc = subprocess.Popen(['ssh',mname,script], stdout=subprocess.PIPE)
for line in proc.stdout:
if "UserPrompt>>>" in line:
proc.terminate()
break
In this example we assume that the process either gets hung on on UserPrompt>>> (replace with the appropriate string) or it terminates naturally. If it were to get stuck in an infinite loop, for example, then your script would still not terminate - you can only really address that with an overall timeout, but you didn't seem keen to do that. Hopefully your subprocess won't misbehave in that way, however.
Finally, if you don't know in advance the prompt that will be giving from your process then your job is rather harder. Effectively what you're asking to do is monitor an external process and know when it's blocked reading on a file descriptor, and I don't believe there's a particularly clean solution to this. You could consider running a process under strace or similar, but that's quite an awful hack and I really wouldn't recommend it. Things like strace are great for manual diagnostics, but they really shouldn't be part of a production setup.

How do I know which python script is running in taskmgr?

It seems that in the task manager all I get is the process of the python/pythonwin. So How can I figure out which python script is running?

The usual answer to such questions is Process Explorer. You can see the full command line for any instance of python.exe or pythonw.exe in the tooltip.
To get the same information in Python, you can use the psutil module.
import psutil
pythons = [[" ".join(p.cmdline), p.pid] for p in psutil.process_iter()
if p.name.lower() in ("python.exe", "pythonw.exe")]
The result, pythons, is a list of lists representing Python processes. The first item of each list is the command line that started the process, including any options. The second item is the process ID.
The psutil Process class has a lot of other stuff in it so if you want all that, you can do this instead:
pythons = [p for p in psutil.process_iter() if p.name.lower() in ("python.exe", "pythonw.exe")]
Now, on my system, iterating all processes with psutil.process_iter() takes several seconds, which seems to me ludicrous. The below is significantly faster, as it does the process filtering before Python sees it, but it relies on the wmic command line tool, which not all versions of Windows have (XP Home lacks it, notably). The result here is the same as the first psutil version (a list of lists, each containing the command line and process ID for one Python process).
import subprocess
wmic_cmd = """wmic process where "name='python.exe' or name='pythonw.exe'" get commandline,processid"""
wmic_prc = subprocess.Popen(wmic_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
wmic_out, wmic_err = wmic_prc.communicate()
pythons = [item.rsplit(None, 1) for item in wmic_out.splitlines() if item][1:]
pythons = [[cmdline, int(pid)] for [cmdline, pid] in pythons]
If wmic is not available, you will get an empty list []. Since you know there's at least one Python process (yours!), you can trap this as an error and display an appropriate message.
To get your own process ID, so you can exclude it from consideration if you're going to e.g. start killing processes, try pywin32's win32process.GetCurrentProcessID()

I had some issues with kindall's answer. With python 3.8:
import psutil
for p in psutil.process_iter():
try:
if p.name().lower() in ["python.exe", "pythonw.exe"]:
print(p.pid, p.cmdline)
except:
continue

With Python 3:
import psutil
pythons = [[" ".join(p.cmdline()), p.pid] for p in psutil.process_iter()
if p.name().lower() in ["python.exe", "pythonw.exe"]]

How to tell process id within Python

I am working with a cluster system over linux (www.mosix.org) that allows me to run jobs and have the system run them on different computers. Jobs are run like so:
mosrun ls &
This will naturally create the process and run it on the background, returning the process id, like so:
[1] 29199
Later it will return. I am writing a Python infrastructure that would run jobs and control them. For that I want to run jobs using the mosrun program as above, and save the process ID of the spawned process (29199 in this case). This naturally cannot be done using os.system or commands.getoutput, as the printed ID is not what the process prints to output... Any clues?
Edit:
Since the python script is only meant to initially run the script, the scripts need to run longer than the python shell. I guess it means the mosrun process cannot be the script's child process. Any suggestions?
Thanks

Use subprocess module. Popen instances have a pid attribute.

Looks like you want to ensure the child process is daemonized -- PEP 3143, which I'm pointing to, documents and points to a reference implementation for that, and points to others too.
Once your process (still running Python code) is daemonized, be it by the means offered in PEP 3143 or others, you can os.execl (or other os.exec... function) your target code -- this runs said target code in exactly the same process which we just said is daemonized, and so it keeps being daemonized, as desired.
The last step cannot use subprocess because it needs to run in the same (daemonized) process, overlaying its executable code -- exactly what os.execl and friends are for.
The first step, before daemonization, might conceivably be done via subprocess, but that's somewhat inconvenient (you need to put the daemonize-then-os.exec code in a separate .py): most commonly you'd just want to os.fork and immediately daemonize the child process.
subprocess is quite convenient as a mostly-cross-platform way to run other processes, but it can't really replace Unix's good old "fork and exec" approach for advanced uses (such as daemonization, in this case) -- which is why it's a good thing that the Python standard library also lets you do the latter via those functions in module os!-)

Thanks all for the help. Here's what I did in the end, and seems to work ok. The code uses python-daemon. Maybe something smarter should be done about transferring the process id from the child to the father, but that's the easier part.
import daemon
def run_in_background(command, tmp_dir="/tmp"):
# Decide on a temp file beforehand
warnings.filterwarnings("ignore", "tempnam is a potential security")
tmp_filename = os.tempnam(tmp_dir)
# Duplicate the process
pid = os.fork()
# If we're child, daemonize and run
if pid == 0:
with daemon.DaemonContext():
child_id = os.getpid()
file(tmp_filename,'w').write(str(child_id))
sp = command.split(' ')
os.execl(*([sp[0]]+sp))
else:
# If we're a parent, poll for the new file
n_iter = 0
while True:
if os.path.exists(tmp_filename):
child_id = int(file(tmp_filename, 'r').read().strip())
break
if n_iter == 100:
raise Exception("Cannot read process id from temp file %s" % tmp_filename)
n_iter += 1
time.sleep(0.1)
return child_id

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.