How can I automate this sequence of lldb commands? - python

In order to work around a bug in Apple's lldb (rdar://13702081) I very frequently need to type two commands in sequence, like this:
(lldb) p todo.matA
(vMAT_Array *) $2 = 0x000000010400b5a0
(lldb) po $2.dump
$3 = 0x0000000100503ce0 <vMAT_Int8Array: 0x10400b5a0; size: [9 1]> =
1
1
1
1
1
1
1
1
1
Is it possible to write a new lldb command using the Python library (or something) that could combine those steps for me? Ideally to something like:
(lldb) pmat todo
$3 = 0x0000000100503ce0 <vMAT_Int8Array: 0x10400b5a0; size: [9 1]> =
1
1
1
1
1
1
1
1
1
Solution
Thanks to Jason Molenda here is output from a working lldb command script:
(lldb) pmat Z
$0 = 0x0000000100112920 <vMAT_DoubleArray: 0x101880c20; size: [9 3]> =
7 9 0.848715
3 5 0.993378
0 1 1.11738
4 12 1.2013
11 13 1.20193
6 10 1.29206
14 15 1.53283
8 16 1.53602
2 17 1.68116
I did have to tweak the script provided in the answer below very slightly, using Jason's suggestions for working around the lldb bug with overly-complex expressions. Here is my final script:
# import this into lldb with a command like
# command script import pmat.py
import lldb
import shlex
import optparse
def pmat(debugger, command, result, dict):
    # Use the Shell Lexer to properly parse up command options just like a
    # shell would
    command_args = shlex.split(command)
    parser = create_pmat_options()
    try:
        (options, args) = parser.parse_args(command_args)
    except:
        return
    target = debugger.GetSelectedTarget()
    if target:
        process = target.GetProcess()
        if process:
            frame = process.GetSelectedThread().GetSelectedFrame()
            if frame:
                var = frame.FindVariable(args[0])
                if var:
                    array = var.GetChildMemberWithName("matA")
                    if array:
                        id = array.GetValueAsUnsigned (lldb.LLDB_INVALID_ADDRESS)
                        if id != lldb.LLDB_INVALID_ADDRESS:
                            debugger.HandleCommand ('po [0x%x dump]' % id)
def create_pmat_options():
    usage = "usage: %prog"
    description='''Print a dump of a vMAT_Array instance.'''
    parser = optparse.OptionParser(description=description, prog='pmat',usage=usage)
    return parser
#
# code that runs when this script is imported into LLDB
#
def __lldb_init_module (debugger, dict):
    # This initializer is being run from LLDB in the embedded command interpreter
    # Make the options so we can generate the help text for the new LLDB
    # command line command prior to registering it with LLDB below
    # add pmat
    parser = create_pmat_options()
    pmat.__doc__ = parser.format_help()
    # Add any commands contained in this module to LLDB
    debugger.HandleCommand('command script add -f %s.pmat pmat' % __name__)

You can do this either with a regex command or by creating your own python command and loading it into lldb. In this specific instance the regex command won't help you because you'll hit the same crasher you're already hitting. But just for fun, I'll show both solutions.
First, python. This python code gets the currently selected frame on the currently selected thread. It looks for a variable whose name is provided on the command argument. It finds a child of that variable called matA and it runs GetObjectDescription() on that SBValue object.
# import this into lldb with a command like
# command script import pmat.py
import lldb
import shlex
import optparse
def pmat(debugger, command, result, dict):
    # Use the Shell Lexer to properly parse up command options just like a
    # shell would
    command_args = shlex.split(command)
    parser = create_pmat_options()
    try:
        (options, args) = parser.parse_args(command_args)
    except:
        return
    target = debugger.GetSelectedTarget()
    if target:
        process = target.GetProcess()
        if process:
            frame = process.GetSelectedThread().GetSelectedFrame()
            if frame:
                var = frame.FindVariable(args[0])
                if var:
                    child = var.GetChildMemberWithName("matA")
                    if child:
                        # print() with parentheses works in both Python 2 and 3
                        print(child.GetObjectDescription())
def create_pmat_options():
    usage = "usage: %prog"
    description='''Call po on the child called "matA"'''
    parser = optparse.OptionParser(description=description, prog='pmat',usage=usage)
    return parser
#
# code that runs when this script is imported into LLDB
#
def __lldb_init_module (debugger, dict):
    # This initializer is being run from LLDB in the embedded command interpreter
    # Make the options so we can generate the help text for the new LLDB
    # command line command prior to registering it with LLDB below
    # add pmat
    parser = create_pmat_options()
    pmat.__doc__ = parser.format_help()
    # Add any commands contained in this module to LLDB
    debugger.HandleCommand('command script add -f %s.pmat pmat' % __name__)
In use,
(lldb) br s -p break
Breakpoint 2: where = a.out`main + 31 at a.m:8, address = 0x0000000100000eaf
(lldb) r
Process 18223 launched: '/private/tmp/a.out' (x86_64)
Process 18223 stopped
* thread #1: tid = 0x1f03, 0x0000000100000eaf a.out`main + 31 at a.m:8, stop reason = breakpoint 2.1
#0: 0x0000000100000eaf a.out`main + 31 at a.m:8
5 @autoreleasepool {
6 struct var myobj;
7 myobj.matA = @"hello there";
-> 8 printf ("%s\n", [(id)myobj.matA UTF8String]); // break here
9 }
10 }
(lldb) p myobj
(var) $0 = {
(void *) matA = 0x0000000100001070
}
(lldb) comm scri imp ~/lldb/pmat.py
(lldb) pmat myobj
hello there
(lldb)
You can put the command script import line in your ~/.lldbinit file if you want to use this.
It's easy to use the Python APIs once you have a general idea of how the debugger is structured. I knew that I would find the variable based on the frame, so I looked at the help for the SBFrame object with
(lldb) script help (lldb.SBFrame)
The method FindVariable returns an SBValue so then I looked at the lldb.SBValue help page, etc. There's a lot of boilerplate in my example python above - you're really looking at 4 lines of python that do all the work.
If this is still triggering the code path that is crashing your lldb process, you can do the last little bit of the script in two parts - get the address of the object and run po on that raw address. e.g.
child = var.GetChildMemberWithName("matA")
if child:
    id = child.GetValueAsUnsigned (lldb.LLDB_INVALID_ADDRESS)
    if id != lldb.LLDB_INVALID_ADDRESS:
        debugger.HandleCommand ('po 0x%x' % id)
Second, using a command regex:
(lldb) br s -p break
Breakpoint 1: where = a.out`main + 31 at a.m:8, address = 0x0000000100000eaf
(lldb) r
Process 18277 launched: '/private/tmp/a.out' (x86_64)
Process 18277 stopped
* thread #1: tid = 0x1f03, 0x0000000100000eaf a.out`main + 31 at a.m:8, stop reason = breakpoint 1.1
#0: 0x0000000100000eaf a.out`main + 31 at a.m:8
5 @autoreleasepool {
6 struct var myobj;
7 myobj.matA = @"hello there";
-> 8 printf ("%s\n", [(id)myobj.matA UTF8String]); // break here
9 }
10 }
(lldb) command regex pmat 's/(.*)/po %1.matA/'
(lldb) pmat myobj
$0 = 0x0000000100001070 hello there
(lldb)
You can't use the simpler command alias in this instance - you have to use a regex alias - because you're calling a command which takes raw input. Specifically, po is really an alias to expression and you need to use regex command aliases to substitute values into those.
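To make the %1 substitution concrete, here is a small Python sketch that mimics what a rule like 's/(.*)/po %1.matA/' does to the text typed after the command name. It only illustrates the expansion and is not how lldb implements it; the helper name expand_regex_alias is hypothetical, and whole-string matching is an assumption made for simplicity.

```python
import re

def expand_regex_alias(pattern, template, raw_args):
    """Mimic one `command regex` rule: match raw_args, then fill %N placeholders."""
    match = re.fullmatch(pattern, raw_args)
    if match is None:
        return None  # no rule matched; nothing to run
    # Translate lldb-style %1, %2, ... into Python's \g<1>, \g<2>, ...
    py_template = re.sub(r"%(\d+)", r"\\g<\1>", template)
    return match.expand(py_template)

# The text after "pmat" becomes group 1 and is spliced into the po command.
print(expand_regex_alias(r"(.*)", "po %1.matA", "myobj"))
```

The resulting string is what gets handed to the raw-input command, which is why a plain alias (which only appends arguments) is not enough here.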

Related

Calling arguments from main() in C from Python with ctypes

I'm trying to call main() from open-plc-utils/slac/evse.c via ctypes. Therefore I compiled the C file into a shared object (.so) and called it from Python.
from ctypes import *
so_file = "/home/evse/open-plc-utils/slac/evse.so"
evse = CDLL(so_file)
evse.main.restype = c_int
evse.main.argtypes = c_int,POINTER(c_char_p)
args = (c_char_p * 1)(b'd')
evse.main(len(args),args)
The d should return debug output but the output in the stdout is the same no matter which letter I pass to main(). Do you know what I did wrong here? And how can I pass something like
evse -i eth1 -p evse.ini -c -d
in one command via ctypes?
Try the following. Make sure the first argument is the program name.
from ctypes import *
so_file = "/home/main/open-plc-utils/slac/main.so"
evse = CDLL(so_file)
evse.main.restype = c_int
evse.main.argtypes = c_int,POINTER(c_char_p)
def make_args(cmd):
    args = cmd.encode().split()
    return (c_char_p * len(args))(*args)
args = make_args('main -i eth1 -p main.ini -c -d')
evse.main(len(args), args)
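A slightly more defensive variant of the argument builder is sketched below. It assumes you want shell-style quoting (so an argument may contain spaces) and a NULL entry after the last argument, which C guarantees for argv[argc] and which some option parsers rely on. The helper name make_argv is hypothetical, and the command line is the one from the question.

```python
import ctypes
import shlex

def make_argv(cmd):
    """Return (argc, argv) suitable for a C main(); argv is NULL-terminated."""
    # shlex.split honours quotes, so 'prog "a b"' yields two arguments, not three
    args = [arg.encode() for arg in shlex.split(cmd)]
    # One extra slot holds the terminating NULL pointer (None in ctypes)
    argv = (ctypes.c_char_p * (len(args) + 1))(*args, None)
    return len(args), argv

argc, argv = make_argv("evse -i eth1 -p evse.ini -c -d")
print(argc, argv[0], argv[argc])
```

The pair can then be passed as evse.main(argc, argv) with the same restype and argtypes declarations as above.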

Memory scanner for any program in Python

I am trying to create a memory scanner, similar to Cheat Engine, but only for extracting information.
I know how to get the pid (in this case for "notepad.exe"), but I have no idea how to find out which specific addresses belong to the program I am scanning.
Looking for examples, I saw someone scanning every address from one point to another, but that is too slow. So I tried a batch size (scanning a chunk of memory at a time rather than each address one by one). The problem is that if the batch is too small, the scan still takes a long time, and if it is too large, I may miss many addresses that belong to the program, because ReadProcessMemory returns False when the first address of the batch is unreadable, even though the next one might be readable. Here is my example.
import sys
import ctypes as c
from ctypes import wintypes as w
import psutil
from sys import stdout
write = stdout.write
import numpy as np
def get_client_pid(process_name):
    pid = None
    for proc in psutil.process_iter():
        if proc.name() == process_name:
            pid = int(proc.pid)
            print(f"Found '{process_name}' PID = ", pid,f" hex_value = {hex(pid)}")
            break
    if pid is None:
        print('Program Not found')
    return pid
pid = get_client_pid("notepad.exe")
if pid is None:
    sys.exit()
k32 = c.WinDLL('kernel32', use_last_error=True)
OpenProcess = k32.OpenProcess
OpenProcess.argtypes = [w.DWORD,w.BOOL,w.DWORD]
OpenProcess.restype = w.HANDLE
ReadProcessMemory = k32.ReadProcessMemory
ReadProcessMemory.argtypes = [w.HANDLE,w.LPCVOID,w.LPVOID,c.c_size_t,c.POINTER(c.c_size_t)]
ReadProcessMemory.restype = w.BOOL
GetLastError = k32.GetLastError
GetLastError.argtypes = None
GetLastError.restype = w.DWORD
CloseHandle = k32.CloseHandle
CloseHandle.argtypes = [w.HANDLE]
CloseHandle.restype = w.BOOL
processHandle = OpenProcess(0x10, False, int(pid))
# addr = 0x0FFFFFFFFFFF
data = c.c_ulonglong()
bytesRead = c.c_ulonglong()
start = 0x000000000000
end = 0x7fffffffffff
batch_size = 2**13
MemoryData = np.zeros(batch_size, 'l')
Size = MemoryData.itemsize*MemoryData.size
index = 0
Data_address = []
for c_adress in range(start,end,batch_size):
    result = ReadProcessMemory(processHandle,c.c_void_p(c_adress), MemoryData.ctypes.data,
                               Size, c.byref(bytesRead))
    if result: # Save adress
        Data_address.extend(list(range(c_adress,c_adress+batch_size)))
    e = GetLastError()
CloseHandle(processHandle)
I chose the range 0x000000000000 to 0x7fffffffffff because Cheat Engine scans that range. I am still a beginner at this kind of memory scanning, so maybe there are things I can do to improve the efficiency.
I suggest you take advantage of existing python libraries that can analyse Windows 10 memory.
I'm no specialist but I've found Volatility. Seems to be pretty useful for your problem.
For running that tool you need Python 2 (Python 3 won't work).
For running python 2 and 3 in the same Windows 10 machine, follow this tutorial (The screenshots are in Spanish but it can easily be followed).
Then see this cheat sheet with main commands. You can dump the memory and then operate on the file.
Perhaps this leads you to the solution :) At least the most basic command pslist dumps all the running processes addresses.
psutil has proc.memory_maps().
Pass its result as map to this function (TargetProcess is, for example, 'Calculator.exe'):
def get_memSize(self, TargetProcess, map):
    memSize = None  # stays None if no mapping matches
    for m in map:
        if TargetProcess in m.path:
            memSize = m.rss
            break
    return memSize
This function returns the memory size of your target process.
To get the base address of your program (my_pid is the pid of 'Calculator.exe'):
import win32api
import win32process
def getBaseAddressWmi(self, my_pid):
    PROCESS_ALL_ACCESS = 0x1F0FFF
    processHandle = win32api.OpenProcess(PROCESS_ALL_ACCESS, False, my_pid)
    modules = win32process.EnumProcessModules(processHandle)
    win32api.CloseHandle(processHandle)
    base_addr = modules[0] # for me it worked to select the first item in list...
    return base_addr
So your search range is from base_addr to base_addr + memSize.
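Another way to avoid probing every batch is to ask the OS for the process's region map and jump from region to region. The sketch below shows only the walking logic, with the OS query injected as a function. On Windows that query would typically wrap VirtualQueryEx, but that wrapper is not shown here; the names walk_regions and query, and the (base, size, readable) tuple shape, are assumptions made for this illustration.

```python
def walk_regions(query, start, end):
    """Yield (base, size) for each readable region in [start, end).

    `query(address)` is a hypothetical callable returning
    (region_base, region_size, readable) for the region containing
    `address`, or None when the address cannot be queried.
    """
    address = start
    while address < end:
        info = query(address)
        if info is None:
            break
        base, size, readable = info
        if size <= 0:
            break  # guard against a query that would never advance
        if readable:
            yield base, size
        # Skip the whole region in one step instead of fixed-size batches
        address = base + size
```

Each yielded (base, size) pair can then be read with a single ReadProcessMemory call, so unreadable gaps cost one query each rather than thousands of failed reads.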

Python, Shell: How to extract data and store it in an iterable variable (specifically in a list)?

I am working on my own project, in which these steps have to be performed:
1. Connect to the remote server.
2. Get the pid, process name, CPU usage, and swap memory usage of each running process on the remote server daily at a specific time (say at 4 o'clock).
3. Compare each day's result with the previous day's result (e.g. day 1 pid with day 2 pid, day 1 process name with day 2 process name, etc.).
So far I have done up to step 2. Now I want to know how to extract the pid, process name, CPU usage, and swap memory usage from the remote server and store them in an iterable variable, so that I can compare them to check for memory spikes.
Any approach other than my idea would also be appreciated.
My code sample is like this:
import paramiko
import re
import psutil
class ShellHandler:
    def __init__(self, host, user, psw):
        self.ssh = paramiko.SSHClient()
        self.ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.ssh.connect(host, username=user, password=psw, port=22)
        channel = self.ssh.invoke_shell()
        self.stdin = channel.makefile('wb')
        self.stdout = channel.makefile('r')
    def __del__(self):
        self.ssh.close()
    @staticmethod
    def _print_exec_out(cmd, out_buf, err_buf, exit_status):
        print('command executed: {}'.format(cmd))
        print('STDOUT:')
        for line in out_buf:
            print(line, end="")
        print('end of STDOUT')
        print('STDERR:')
        for line in err_buf:
            print(line, end="")
        print('end of STDERR')
        print('finished with exit status: {}'.format(exit_status))
        print('------------------------------------')
        #print(psutil.pids())
        pass
    def execute(self, cmd):
        """
        :param cmd: the command to be executed on the remote computer
        :examples: execute('ls')
                   execute('finger')
                   execute('cd folder_name')
        """
        cmd = cmd.strip('\n')
        self.stdin.write(cmd + '\n')
        finish = 'end of stdOUT buffer. finished with exit status'
        echo_cmd = 'echo {} $?'.format(finish)
        self.stdin.write(echo_cmd + '\n')
        shin = self.stdin
        self.stdin.flush()
        shout = []
        sherr = []
        exit_status = 0
        for line in self.stdout:
            if str(line).startswith(cmd) or str(line).startswith(echo_cmd):
                # up for now filled with shell junk from stdin
                shout = []
            elif str(line).startswith(finish):
                # our finish command ends with the exit status
                exit_status = int(str(line).rsplit(maxsplit=1)[1])
                if exit_status:
                    # stderr is combined with stdout.
                    # thus, swap sherr with shout in a case of failure.
                    sherr = shout
                    shout = []
                break
            else:
                # get rid of 'coloring and formatting' special characters
                shout.append(re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]').sub('', line).replace('\b', '').replace('\r', ''))
        # first and last lines of shout/sherr contain a prompt
        if shout and echo_cmd in shout[-1]:
            shout.pop()
        if shout and cmd in shout[0]:
            shout.pop(0)
        if sherr and echo_cmd in sherr[-1]:
            sherr.pop()
        if sherr and cmd in sherr[0]:
            sherr.pop(0)
        self._print_exec_out(cmd=cmd, out_buf=shout, err_buf=sherr, exit_status=exit_status)
        return shin, shout, sherr
obj=ShellHandler('Servername','username','password')
pID=[]
## I want this (pid, cmd, swap memory) stored in a variable that is iterable.
pID=ShellHandler.execute(obj,"ps -eo pid,cmd,lstart,%mem,%cpu|awk '{print $1}'")
print(pID[0])##---------------------------------Problem not giving any output.
Your ShellHandler's execute method returns three items, the first of which is the input you sent to it.
You should probably call it directly like this, anyway:
obj = ShellHandler('Servername','username','password')
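_, out, err = obj.execute("ps -eo pid,lstart,%mem,%cpu,cmd")  # "in" is a reserved word, so it cannot be a variable name
for line in out:  # out is already a list of lines
    fields = line.split(None, 8)
    if len(fields) < 9 or not fields[0].isdigit():
        continue  # skip the header row and any leftover shell noise
    pid, lstartwd, lstartmo, lstartdd, lstartm, lstartyy, mem, cpu, cmd = fields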
I moved cmd last because it might contain spaces. The lstart value also contains multiple space-separated fields. Here's what the output looks like in Debian:
19626 Tue Jan 15 15:03:57 2019 0.0 0.0 less filename
There are many questions about how to parse ps output in more detail; I'll refer you to them for figuring out how to handle the results from split exactly.
Splitting out the output of ps using Python
Is there any way to get ps output programmatically?
ps aux command should have all the info you need (pid, process name, cpu, memory)
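To get from those lines to something you can compare day against day, one option is to key the parsed rows by pid. The following is a minimal sketch assuming the `ps -eo pid,lstart,%mem,%cpu,cmd` column order shown above; the function names parse_ps_lines and memory_spikes and the 5-point threshold are hypothetical choices, and %mem is resident memory share, not swap.

```python
def parse_ps_lines(lines):
    """Parse `ps -eo pid,lstart,%mem,%cpu,cmd` lines into {pid: info} dicts."""
    procs = {}
    for line in lines:
        # pid + 5 lstart fields + %mem + %cpu, then cmd keeps its spaces
        fields = line.split(None, 8)
        if len(fields) < 9 or not fields[0].isdigit():
            continue  # header row or shell noise
        procs[int(fields[0])] = {
            "started": " ".join(fields[1:6]),
            "mem": float(fields[6]),
            "cpu": float(fields[7]),
            "cmd": fields[8].strip(),
        }
    return procs

def memory_spikes(previous, current, threshold=5.0):
    """Return pids present on both days whose %mem grew by at least `threshold`."""
    return sorted(
        pid for pid, info in current.items()
        if pid in previous and info["mem"] - previous[pid]["mem"] >= threshold
    )
```

Each day's dict can be saved (for example as JSON) and loaded the next day to feed memory_spikes. Note that pids are reused over time, so comparing the "started" field as well helps confirm it is the same process.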

Automatically print the result of expressions in non-interactive Python

I would like to find a way where the execution of Python scripts automatically write the result of the expressions in the top level, as is done in interactive mode.
For instance, if I have this script.py:
abs(3)
for x in [1,2,3]:
    print abs(x)
abs(-4)
print abs(5)
and execute python script.py, I will get
1
2
3
5
but I would rather have
3
1
2
3
4
5
which is what one would get executing it interactively (modulo prompts).
More or less, I would like to achieve the contrary of Disable automatic printing in Python interactive session. It seems that the code module could help me, but I had no success with it.
Well, I'm not seriously proposing using something like this, but you could (ab)use ast processing:
# -*- coding: utf-8 -*-
import ast
import argparse
_parser = argparse.ArgumentParser()
_parser.add_argument('file')
class ExpressionPrinter(ast.NodeTransformer):
    visit_ClassDef = visit_FunctionDef = lambda self, node: node
    def visit_Expr(self, node):
        node = ast.copy_location(
            ast.Expr(
                ast.Call(ast.Name('print', ast.Load()),
                         [node.value], [], None, None)
                ),
            node
        )
        ast.fix_missing_locations(node)
        return node
def main(args):
    with open(args.file) as source:
        tree = ast.parse(source.read(), args.file, mode='exec')
    new_tree = ExpressionPrinter().visit(tree)
    exec(compile(new_tree, args.file, mode='exec'))
if __name__ == '__main__':
    main(_parser.parse_args())
Output for your example script.py:
% python2 printer.py test2.py
3
1
2
3
4
5
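The script above targets Python 2, where ast.Call takes five arguments and print is a statement. A rough Python 3 adaptation of the same idea is sketched below. Because print(...) is itself an expression statement in Python 3, wrapping everything in print would also echo the None it returns, so this sketch routes values through a helper that skips None, as the interactive prompt does. The names run_echoing and _echo are hypothetical.

```python
import ast

def _echo(value):
    # Like the interactive prompt: show the repr, but stay silent for None
    if value is not None:
        print(repr(value))

class ExpressionPrinter(ast.NodeTransformer):
    # Leave function and class bodies untouched
    visit_ClassDef = visit_FunctionDef = lambda self, node: node

    def visit_Expr(self, node):
        # Rewrite the statement `expr` into `_echo(expr)`
        call = ast.Call(func=ast.Name(id="_echo", ctx=ast.Load()),
                        args=[node.value], keywords=[])
        return ast.copy_location(ast.Expr(value=call), node)

def run_echoing(source, filename="<script>"):
    tree = ExpressionPrinter().visit(ast.parse(source, filename, mode="exec"))
    ast.fix_missing_locations(tree)
    namespace = {"__name__": "__main__", "_echo": _echo}
    exec(compile(tree, filename, mode="exec"), namespace)
```

Calling run_echoing(open("script.py").read(), "script.py") on a Python 3 version of the example script should then echo the bare expressions in order alongside the explicit prints.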

Grep reliably all C #defines

I need to analyse some C files and print out all the #define found.
It's not that hard with a regexp, for example:
def with_regexp(fname):
    print("{0}:".format(fname))
    for line in open(fname):
        match = macro_regexp.match(line)
        if match is not None:
            print(match.groups())
But it doesn't handle multiline defines, for example.
There is a nice way to do it with gcc:
gcc -E -dM file.c
The problem is that it returns all the #defines, not just the ones from the given file, and I can't find any option to restrict it to the given file.
Any hint?
Thanks
EDIT:
This is a first solution to filter out the unwanted defines, simply checking that the name of the define is actually part of the original file, not perfect but seems to work nicely..
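import re
from subprocess import Popen, PIPE
def with_gcc(fname):
    cmd = "gcc -dM -E {0}".format(fname)
    # universal_newlines=True makes the output text rather than bytes
    proc = Popen(cmd, shell=True, stdout=PIPE, universal_newlines=True)
    out, err = proc.communicate()
    source = open(fname).read()
    res = set()
    for define in out.splitlines():
        name = define.split(' ')[1]
        # escape the name so characters like '(' are matched literally
        if re.search(re.escape(name), source):
            res.add(define)
    return res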
Sounds like a job for a shell one-liner!
What I want to do is remove all the #includes from the C file (so we don't get junk from other files), pass that off to gcc -E -dM, then remove all the built-in #defines: those start with _, and apparently linux and unix.
If you have #defines that start with an underscore this won't work exactly as promised.
It goes like this:
sed -e '/#include/d' foo.c | gcc -E -dM - | sed -e '/#define \(linux\|unix\|_\)/d'
You could probably do it in a few lines of Python too.
In PowerShell you could do something like the following:
function Get-Defines {
    param([string] $Path)
    "$Path`:"
    switch -regex -file $Path {
        '\\$' {
            if ($multiline) { $_ }
        }
        '^\s*#define(.*)$' {
            $multiline = $_.EndsWith('\');
            $_
        }
        default {
            if ($multiline) { $_ }
            $multiline = $false
        }
    }
}
Using the following sample file
#define foo "bar"
blah
#define FOO \
do { \
do_stuff_here \
do_more_stuff \
} while (0)
blah
blah
#define X
it prints
\x.c:
#define foo "bar"
#define FOO \
do { \
do_stuff_here \
do_more_stuff \
} while (0)
#define X
Not ideal, at least by the standards of idiomatic PowerShell functions, but it should work well enough for your needs.
Doing this in pure python I'd use a small state machine:
def getdefines(fname):
    """ return a list of all define statements in the file """
    lines = open(fname).read().split("\n") #read in the file as a list of lines
    result = [] #the result list
    current = []#a temp list that holds all lines belonging to a define
    lineContinuation = False #was the last line break escaped with a '\'?
    for line in lines:
        #is the current line the start or continuation of a define statement?
        isdefine = line.startswith("#define") or lineContinuation
        if isdefine:
            current.append(line) #append to current result
            lineContinuation = line.endswith("\\") #is the line break escaped?
            if not lineContinuation:
                #we reached the define statements end - append it to result list
                result.append('\n'.join(current))
                current = [] #empty the temp list
    return result
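The state machine above only recognises lines that begin exactly with "#define". C also allows leading whitespace and spaces between "#" and "define", and editors sometimes leave trailing spaces after the backslash. A variant that tolerates those cases might look like the sketch below; it works on a string rather than a file name so it is easy to try out, and the name get_defines is hypothetical. It still ignores comments and conditional compilation.

```python
import re

# Optional indentation, '#', optional spaces, then the word 'define'
DEFINE_START = re.compile(r"^\s*#\s*define\b")

def get_defines(text):
    """Return each #define in `text`, with continuation lines joined by newlines."""
    result, current = [], []
    continued = False  # did the previous line end with a backslash?
    for line in text.splitlines():
        if continued or DEFINE_START.match(line):
            current.append(line)
            continued = line.rstrip().endswith("\\")
            if not continued:
                result.append("\n".join(current))
                current = []
    if current:  # file ended in the middle of a continued define
        result.append("\n".join(current))
    return result
```

For a file on disk, something like get_defines(open(fname).read()) would give the same kind of list as getdefines(fname).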
