Alternative to psutil.Process(pid).name - python

I have measured the performance of psutil.Process(pid).name and it turns out that it is more than ten times slower than for example psutil.Process(pid).exe. Because the last one of these functions requires different privileges over the path, I cannot just just extract the filename from the path. My question is: Are there any alternatives to psutil.Process(pid).name, which does the same?

You mentioned this is for windows. I took a look at what psutil does for windows. It looks like psutil.Process().name is using the windows tool help API. If you look at psutil's Process code and trace .name, it goes to get_name() in process_info.c. It is looping through all the pids on your system until it finds the one you're looking for. I think this may be a limitation of the toolhelp API. But this is why it's slower than .exe, which uses a different API path, that (as you pointed out), requires additional privilege.
The solution I came up with is to use ctypes and ctypes.windll to call the windows ntapi directly. It only needs PROCESS_QUERY_INFORMATION, which is different than PROCESS_ALL_ACCESS:
import ctypes
import os.path
# duplicate the UNICODE_STRING structure from the windows API
class UNICODE_STRING(ctypes.Structure):
_fields_ = [
('Length', ctypes.c_short),
('MaximumLength', ctypes.c_short),
('Buffer', ctypes.c_wchar_p)
]
# args
pid = 8000 # put your pid here
# define some constants; from windows API reference
MAX_TOTAL_PATH_CHARS = 32767
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_IMAGE_FILE_NAME = 27
# open handles
ntdll = ctypes.windll.LoadLibrary('ntdll.dll')
process = ctypes.windll.kernel32.OpenProcess(PROCESS_QUERY_INFORMATION,
False, pid)
# allocate memory
buflen = (((MAX_TOTAL_PATH_CHARS + 1) * ctypes.sizeof(ctypes.c_wchar)) +
ctypes.sizeof(UNICODE_STRING))
buffer = ctypes.c_char_p(' ' * buflen)
# query process image filename and parse for process "name"
ntdll.NtQueryInformationProcess(process, PROCESS_IMAGE_FILE_NAME, buffer,
buflen, None)
pustr = ctypes.cast(buffer, ctypes.POINTER(UNICODE_STRING))
print os.path.split(pustr.contents.Buffer)[-1]
# cleanup
ctypes.windll.kernel32.CloseHandle(process)
ctypes.windll.kernel32.FreeLibrary(ntdll._handle)

As of psutil 1.1.0 this problem has been fixed, see https://code.google.com/p/psutil/issues/detail?id=426

Related

Detect if Python script is running in focus [duplicate]

I would like to get the active window on the screen using python.
For example, the management interface of the router where you enter the username and password as admin
That admin interface is what I want to capture using python to automate the entry of username and password.
What imports would I require in order to do this?
On windows, you can use the python for windows extensions (http://sourceforge.net/projects/pywin32/):
from win32gui import GetWindowText, GetForegroundWindow
print GetWindowText(GetForegroundWindow())
Below code is for python 3:
from win32gui import GetWindowText, GetForegroundWindow
print(GetWindowText(GetForegroundWindow()))
(Found this on http://scott.sherrillmix.com/blog/programmer/active-window-logger/)
Thanks goes to the answer by Nuno André, who showed how to use ctypes to interact with Windows APIs. I have written an example implementation using his hints.
The ctypes library is included with Python since v2.5, which means that almost every user has it. And it's a way cleaner interface than old and dead libraries like win32gui (last updated in 2017 as of this writing). ((Update in late 2020: The dead win32gui library has come back to life with a rename to pywin32, so if you want a maintained library, it's now a valid option again. But that library is 6% slower than my code.))
Documentation is here: https://docs.python.org/3/library/ctypes.html (You must read its usage help if you wanna write your own code, otherwise you can cause segmentation fault crashes, hehe.)
Basically, ctypes includes bindings for the most common Windows DLLs. Here is how you can retrieve the title of the foreground window in pure Python, with no external libraries needed! Just the built-in ctypes! :-)
The coolest thing about ctypes is that you can Google any Windows API for anything you need, and if you want to use it, you can do it via ctypes!
Python 3 Code:
from typing import Optional
from ctypes import wintypes, windll, create_unicode_buffer
def getForegroundWindowTitle() -> Optional[str]:
hWnd = windll.user32.GetForegroundWindow()
length = windll.user32.GetWindowTextLengthW(hWnd)
buf = create_unicode_buffer(length + 1)
windll.user32.GetWindowTextW(hWnd, buf, length + 1)
# 1-liner alternative: return buf.value if buf.value else None
if buf.value:
return buf.value
else:
return None
Performance is extremely good: 0.01 MILLISECONDS on my computer (0.00001 seconds).
Will also work on Python 2 with very minor changes. If you're on Python 2, I think you only have to remove the type annotations (from typing import Optional and -> Optional[str]). :-)
Enjoy!
Win32 Technical Explanations:
The length variable is the length of the actual text in UTF-16 (Windows Wide "Unicode") CHARACTERS. (It is NOT the number of BYTES.) We have to add + 1 to add room for the null terminator at the end of C-style strings. If we don't do that, we would not have enough space in the buffer to fit the final real character of the actual text, and Windows would truncate the returned string (it does that to ensure that it fits the super important final string Null-terminator).
The create_unicode_buffer function allocates room for that many UTF-16 CHARACTERS.
Most (or all? always read Microsoft's MSDN docs!) Windows APIs related to Unicode text take the buffer length as CHARACTERS, NOT as bytes.
Also look closely at the function calls. Some end in W (such as GetWindowTextLengthW). This stands for "Wide string", which is the Windows name for Unicode strings. It's very important that you do those W calls to get proper Unicode strings (with international character support).
PS: Windows has been using Unicode for a long time. I know for a fact that Windows 10 is fully Unicode and only wants the W function calls. I don't know the exact cutoff date when older versions of Windows used other multi-byte string formats, but I think it was before Windows Vista, and who cares? Old Windows versions (even 7 and 8.1) are dead and unsupported by Microsoft.
Again... enjoy! :-)
UPDATE in Late 2020, Benchmark vs the pywin32 library:
import time
import win32ui
from typing import Optional
from ctypes import wintypes, windll, create_unicode_buffer
def getForegroundWindowTitle() -> Optional[str]:
hWnd = windll.user32.GetForegroundWindow()
length = windll.user32.GetWindowTextLengthW(hWnd)
buf = create_unicode_buffer(length + 1)
windll.user32.GetWindowTextW(hWnd, buf, length + 1)
return buf.value if buf.value else None
def getForegroundWindowTitle_Win32UI() -> Optional[str]:
# WARNING: This code sometimes throws an exception saying
# "win32ui.error: No window is is in the foreground."
# which is total nonsense. My function doesn't fail that way.
return win32ui.GetForegroundWindow().GetWindowText()
iterations = 1_000_000
start_time = time.time()
for x in range(iterations):
foo = getForegroundWindowTitle()
elapsed1 = time.time() - start_time
print("Elapsed 1:", elapsed1, "seconds")
start_time = time.time()
for x in range(iterations):
foo = getForegroundWindowTitle_Win32UI()
elapsed2 = time.time() - start_time
print("Elapsed 2:", elapsed2, "seconds")
win32ui_pct_slower = ((elapsed2 / elapsed1) - 1) * 100
print("Win32UI library is", win32ui_pct_slower, "percent slower.")
Typical result after doing multiple runs on an AMD Ryzen 3900x:
My function: 4.5769994258880615 seconds
Win32UI library: 4.8619983196258545 seconds
Win32UI library is 6.226762715455125 percent slower.
However, the difference is small, so you may want to use the library now that it has come back to life (it had previously been dead since 2017). But you're going to have to deal with that library's weird "no window is in the foreground" exception, which my code doesn't suffer from (see the code comments in the benchmark code).
Either way... enjoy!
The following script should work on Linux, Windows and Mac. It is currently only tested on Linux (Ubuntu Mate Ubuntu 15.10).
Prerequisites
For Linux:
Install wnck (sudo apt-get install python-wnck on Ubuntu, see libwnck.)
For Windows:
Make sure win32gui is available
For Mac:
Make sure AppKit is available
The script
#!/usr/bin/env python
"""Find the currently active window."""
import logging
import sys
logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s',
level=logging.DEBUG,
stream=sys.stdout)
def get_active_window():
"""
Get the currently active window.
Returns
-------
string :
Name of the currently active window.
"""
import sys
active_window_name = None
if sys.platform in ['linux', 'linux2']:
# Alternatives: https://unix.stackexchange.com/q/38867/4784
try:
import wnck
except ImportError:
logging.info("wnck not installed")
wnck = None
if wnck is not None:
screen = wnck.screen_get_default()
screen.force_update()
window = screen.get_active_window()
if window is not None:
pid = window.get_pid()
with open("/proc/{pid}/cmdline".format(pid=pid)) as f:
active_window_name = f.read()
else:
try:
from gi.repository import Gtk, Wnck
gi = "Installed"
except ImportError:
logging.info("gi.repository not installed")
gi = None
if gi is not None:
Gtk.init([]) # necessary if not using a Gtk.main() loop
screen = Wnck.Screen.get_default()
screen.force_update() # recommended per Wnck documentation
active_window = screen.get_active_window()
pid = active_window.get_pid()
with open("/proc/{pid}/cmdline".format(pid=pid)) as f:
active_window_name = f.read()
elif sys.platform in ['Windows', 'win32', 'cygwin']:
# https://stackoverflow.com/a/608814/562769
import win32gui
window = win32gui.GetForegroundWindow()
active_window_name = win32gui.GetWindowText(window)
elif sys.platform in ['Mac', 'darwin', 'os2', 'os2emx']:
# https://stackoverflow.com/a/373310/562769
from AppKit import NSWorkspace
active_window_name = (NSWorkspace.sharedWorkspace()
.activeApplication()['NSApplicationName'])
else:
print("sys.platform={platform} is unknown. Please report."
.format(platform=sys.platform))
print(sys.version)
return active_window_name
print("Active window: %s" % str(get_active_window()))
For Linux users:
All the answers provided required additional modules like "wx" that had numerous errors installing ("pip" failed on build), but I was able to modify this solution quite easily -> original source. There were bugs in the original (Python TypeError on regex)
import sys
import os
import subprocess
import re
def get_active_window_title():
root = subprocess.Popen(['xprop', '-root', '_NET_ACTIVE_WINDOW'], stdout=subprocess.PIPE)
stdout, stderr = root.communicate()
m = re.search(b'^_NET_ACTIVE_WINDOW.* ([\w]+)$', stdout)
if m != None:
window_id = m.group(1)
window = subprocess.Popen(['xprop', '-id', window_id, 'WM_NAME'], stdout=subprocess.PIPE)
stdout, stderr = window.communicate()
else:
return None
match = re.match(b"WM_NAME\(\w+\) = (?P<name>.+)$", stdout)
if match != None:
return match.group("name").strip(b'"')
return None
if __name__ == "__main__":
print(get_active_window_title())
The advantage is it works without additional modules. If you want it to work across multiple platforms, it's just a matter of changing the command and regex strings to get the data you want based on the platform (with the standard if/else platform detection shown above sys.platform).
On a side note: import wnck only works with python2.x when installed with "sudo apt-get install python-wnck", since I was using python3.x the only option was pypie which I have not tested. Hope this helps someone else.
There's really no need to import any external dependency for tasks like this. Python comes with a pretty neat foreign function interface - ctypes, which allows for calling C shared libraries natively. It even includes specific bindings for the most common Win32 DLLs.
E.g. to get the PID of the foregorund window:
import ctypes
from ctypes import wintypes
user32 = ctypes.windll.user32
h_wnd = user32.GetForegroundWindow()
pid = wintypes.DWORD()
user32.GetWindowThreadProcessId(h_wnd, ctypes.byref(pid))
print(pid.value)
In Linux under X11:
xdo_window_id = os.popen('xdotool getactivewindow').read()
print('xdo_window_id:', xdo_window_id)
will print the active window ID in decimal format:
xdo_window_id: 67113707
Note xdotool must be installed first:
sudo apt install xdotool
Note wmctrl uses hexadecimal format for window ID.
This only works on windows
import win32gui
import win32process
def get_active_executable_name():
try:
process_id = win32process.GetWindowThreadProcessId(
win32gui.GetForegroundWindow()
)
return ".".join(psutil.Process(process_id[-1]).name().split(".")[:-1])
except Exception as exception:
return None
I'll recommend checking out this answer for making it work on linux, mac and windows.
I'd been facing same problem with linux interface (Lubuntu 20).
What I do is using wmctrl and execute it with shell command from python.
First, Install wmctrl
sudo apt install wmctrl
Then, Add this code :
import os
os.system('wmctrl -a "Mozilla Firefox"')
ref wmctrl :
https://askubuntu.com/questions/21262/shell-command-to-bring-a-program-window-in-front-of-another
In Linux:
If you already have installed xdotool, you can just use:
from subprocess import run
def get__focused_window():
return run(['xdotool', 'getwindowfocus', 'getwindowpid', 'getwindowname'], capture_output=True).stdout.decode('utf-8').split()
While I was writing this answer I've realised that there were also:
A reference about "xdotool" on comments
& another slightly similar "xdotool" answer
So, I've decided to mention them here, too.
Just wanted to add in case it helps, I have a function for my program (It's a software for my PC's lighting I have this simple few line function:
def isRunning(process_name):
foregroundWindow = GetWindowText(GetForegroundWindow())
return process_name in foregroundWindow
Try using wxPython:
import wx
wx.GetActiveWindow()

python multiprocessing - OverflowError('cannot serialize a bytes object larger than 4GiB')

We are running a script using the multiprocessing library (python 3.6), where a big pd.DataFrame is passed as an argument to a function :
from multiprocessing import Pool
import time
def my_function(big_df):
# do something time consuming
time.sleep(50)
if __name__ == '__main__':
with Pool(10) as p:
res = {}
output = {}
for id, big_df in some_dict_of_big_dfs:
res[id] = p.apply_async(my_function,(big_df ,))
output = {id : res[id].get() for id in id_list}
The problem is that we are getting an error from the pickle library.
Reason: 'OverflowError('cannot serialize a bytes objects larger than
4GiB',)'
We are aware than pickle v4 can serialize larger objects question related, link, but we don't know how to modify the protocol that multiprocessing is using.
does anybody know what to do?
Thanks !!
Apparently is there an open issue about this topic , and there is a few related initiatives described on this particular answer. I Found a way to change the default pickle protocol that is used in the multiprocessing library based on this answer. As was pointed out in the comments this solution Only works with Linux and OS multiprocessing lib
Solution:
You first create a new separated module
pickle4reducer.py
from multiprocessing.reduction import ForkingPickler, AbstractReducer
class ForkingPickler4(ForkingPickler):
def __init__(self, *args):
if len(args) > 1:
args[1] = 2
else:
args.append(2)
super().__init__(*args)
#classmethod
def dumps(cls, obj, protocol=4):
return ForkingPickler.dumps(obj, protocol)
def dump(obj, file, protocol=4):
ForkingPickler4(file, protocol).dump(obj)
class Pickle4Reducer(AbstractReducer):
ForkingPickler = ForkingPickler4
register = ForkingPickler4.register
dump = dump
And then, in your main script you need to add the following:
import pickle4reducer
import multiprocessing as mp
ctx = mp.get_context()
ctx.reducer = pickle4reducer.Pickle4Reducer()
with mp.Pool(4) as p:
# do something
That will probably solve the problem of the overflow.
But, warning, you might consider reading this before doing anything or you might reach the same error as me:
'i' format requires -2147483648 <= number <= 2147483647
(the reason of this error is well explained in the link above). Long story short, multiprocessing send data through all its process using the pickle protocol, if you are already reaching the 4gb limit, that probably means that you might consider redefining your functions more as "void" methods rather than input/output methods. All this inbound/outbound data increase the RAM usage, is probably inefficient by construction (my case) and it might be better to point all process to the same object rather than create a new copy for each call.
hope this helps.
Supplementing answer from Pablo
The following problem can be resolved be Python3.8, if you are okay to use this version of python:
'i' format requires -2147483648 <= number <= 2147483647

Change mtime within Python of a broken symlink

Apparently there is nothing like os.lutime which would allow to change mtime of the symlink itself, even if the file it points to is absent. For that purpose on Linux and on OSX, touch command has -h option to not dereference the link. But I found no way to do it natively cross-platform (at least on OSX and Linux) within Python. So is there a remedy for my desire? ;)
While not easy to make cross-platform, it is possible to use the ctypes module to call the native functions to do this.
Here is the Python 2 code I created to do this on macOS. I imagine with some tweaking it could be made to work on Linux also.
import ctypes
import ctypes.util
class ctype_timeval(ctypes.Structure):
_fields_ = [
('tv_sec', ctypes.c_long),
('tv_usec', ctypes.c_long)
]
ctype_libsystemc = ctypes.cdll.LoadLibrary(ctypes.util.find_library('libsystem.c'))
ctype_libsystemc_lutimes = ctype_libsystemc.lutimes
ctype_libsystemc_lutimes.restype = ctypes.c_int
ctype_libsystemc_lutimes.argtypes = [ctypes.c_char_p, ctype_timeval * 2]
def lutime(filename, time):
times = (ctype_timeval * 2)()
# access:
times[0].tv_sec = time[0]
times[0].tv_usec = 0
# modification:
times[1].tv_sec = time[1]
times[1].tv_usec = 0
return ctype_libsystemc_lutimes(filename, times)
You can use it just like you would os.utime:
lutime('file-or-symlink', (1488079452, 1488079452))

Python - How to get the start/base address of a process?

How do I get the start/base address of a process? Per example Solitaire.exe (solitaire.exe+BAFA8)
#-*- coding: utf-8 -*-
import ctypes, win32ui, win32process
PROCESS_ALL_ACCESS = 0x1F0FFF
HWND = win32ui.FindWindow(None,u"Solitär").GetSafeHwnd()
PID = win32process.GetWindowThreadProcessId(HWND)[1]
PROCESS = ctypes.windll.kernel32.OpenProcess(PROCESS_ALL_ACCESS,False,PID)
print PID, HWND,PROCESS
I would like to calculate a memory address and for this way I need the base address of solitaire.exe.
Here's a picture of what I mean:
I think the handle returned by GetModuleHandle is actually the base address of the given module. You get the handle of the exe by passing NULL.
Install pydbg
Source: https://github.com/OpenRCE/pydbg
Unofficial binaries here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#pydbg
from pydbg import *
from pydbg.defines import *
import struct
dbg = pydbg()
path_exe = "C:\\windows\\system32\\calc.exe"
dbg.load(path_exe, "-u amir")
dbg.debug_event_loop()
parameter_addr = dbg.context.Esp #(+ 0x8)
print 'ESP (address) ',parameter_addr
#attach not working under Win7 for me
#pid = raw_input("Enter PID:")
#print 'PID entered %i'%int(pid)
#dbg.attach(int(pid)) #attaching to running process not working
You might want to have a look at PaiMei, although it's not very active right now https://github.com/OpenRCE/paimei
I couldn't get attach() to work and used load instead. Pydbg has loads of functionality, such as read_proccess_memory, write_process_memory etc.
Note that you can't randomly change memory, because an operating system protects memory of other processes from your process (protected mode). Before the x86 processors there were some which allowed all processors to run in real mode, i.e. the full access of memory for every programm. Non-malicious software usually (always?) doesn't read/write other processes' memory.
The HMDOULE value of GetModuleHandle is the base address of the loaded module and is probably the address you need to compute the offset.
If not, that address is the start of the header of the module (DLL/EXE), which can be displayed with the dumpbin utility that comes with Visual Studio or you can interpret it yourself using the Microsoft PE and COFF Specification to determine the AddressOfEntryPoint and BaseOfCode as offsets from the base address. If the base address of the module isn't what you need, one of these two is another option.
Example:
>>> BaseAddress = win32api.GetModuleHandle(None) + 0xBAFA8
>>> print '{:08X}'.format(BaseAddress)
1D0BAFA8
If The AddressOfEntryPoint or BaseOfCode is needed, you'll have to use ctypes to call ReadProcessMemory following the PE specification to locate the offsets, or just use dumpbin /headers solitaire.exe to learn the offsets.
You can use frida to easy do that.
It is very useful to make hack and do some memory operation just like make address offset, read memory, write something to special memory etc...
https://github.com/frida/frida
2021.08.01 update:
Thanks for #Simas Joneliunas reminding
There some step using frida(windows):
Install frida by pip
pip install frida-tools # CLI tools
pip install frida # Python bindings
Using frida api
session = frida.attach(processName)
script = session.create_script("""yourScript""")
script.load()
sys.stdin.read() #make program always alive
session.detach()
Edit your scrip(using JavaScrip)
var baseAddr = Module.findBaseAddress('solitaire.exe');
var firstPointer = baseAddr.add(0xBAFA8).readPointer();
var secondPointer = firstPointer.add(0x50).readPointer();
var thirdPointer = secondPointer.add(0x14).readPointer();
#if your target pointer points to a Ansi String, you can use #thirdPointer.readAnsiString() to read
The official site https://frida.re/

Why python has limit for count of file handles?

I writed simple code for test, how much files may be open in python script:
for i in xrange(2000):
fp = open('files/file_%d' % i, 'w')
fp.write(str(i))
fp.close()
fps = []
for x in xrange(2000):
h = open('files/file_%d' % x, 'r')
print h.read()
fps.append(h)
and I get a exception
IOError: [Errno 24] Too many open files: 'files/file_509'
The number of open files is limited by the operating system. On linux you can type
ulimit -n
to see what the limit is. If you are root, you can type
ulimit -n 2048
now your program will run ok (as root) since you have lifted the limit to 2048 open files
I see same behavior on Windows when running your code. The limit exists from C runtime. You can use win32file to change the limit value:
import win32file
print win32file._getmaxstdio()
The above shall give you 512, which explains the failure at #509 (+stdin, stderr, stdout as others have already stated)
Execute the following and your code shall run fine:
win32file._setmaxstdio(2048)
Note that 2048 is the hard limit, though (hard limit of the underlying C Stdio). As a result, executing the _setmaxstdio with a value greater than 2048 fails for me.
To check change the limit of open file handles on Linux, you can use the Python module resource:
import resource
# the soft limit imposed by the current configuration
# the hard limit imposed by the operating system.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print 'Soft limit is ', soft
# For the following line to run, you need to execute the Python script as root.
resource.setrlimit(resource.RLIMIT_NOFILE, (3000, hard))
On Windows, I do as Punit S suggested:
import platform
if platform.system() == 'Windows':
import win32file
win32file._setmaxstdio(2048)
Most likely because the operating system has a limit for the number of files that an application can have open.
On Windows one can get or set the limit with the built-in ctypes library:
import ctypes
print("Before: {}".format(ctypes.windll.msvcrt._getmaxstdio()))
ctypes.windll.msvcrt._setmaxstdio(2048)
print("After: {}".format(ctypes.windll.msvcrt._getmaxstdio()))
Since this is not a Python problem, do this:
for x in xrange(2000):
with open('files/file_%d' % x, 'r') as h:
print h.read()
The following is a very bad idea.
fps.append(h)
The append is needed so the garbage collector does not clean up and close the files

Categories