Embed Python source code in C++ as string - python

I'm writing a C++ program that needs to embed Python (3.11) code, and I'm using Python.h to try to accomplish this. The general idea is that a Python script will be stored by the C++ program as a string, since I'll be performing operations on the source at runtime. The script will contain a "main()" function which returns an array of known size.
I'm aware I can do it via:
...
PyObject *pName = PyUnicode_FromString("main");
PyObject *pModule = PyImport_Import(pName);
...
However, in order to actually execute the script, I would need to write it to a file just so that python could read it again. This adds extra time to execution that I'd prefer to avoid. Isn't there some way in which I can pass python the source code directly as a string and work from there? Or am I just screwed?
EDIT: BTW, PyRun_SimpleString does not do what I want, as it doesn't return anything from the executed code.

Found the answer thanks to nick in the comments.
An example of using PyRun_String: https://schneide.blog/2011/10/10/embedding-python-into-cpp/. For extracting list variables from a Python script, see https://docs.python.org/3/c-api/list.html
The final frankenstein:
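PyObject *main = PyImport_AddModule("__main__");      // borrowed reference
PyObject *globalDictionary = PyModule_GetDict(main);  // borrowed reference
PyObject *localDictionary = PyDict_New();
// Py_file_input runs the string like a module body; 'a' ends up in localDictionary.
PyObject *run = PyRun_String("a=[0, 1, 2, 3, 4, 5]", Py_file_input, globalDictionary, localDictionary);
Py_XDECREF(run);  // NULL on error, otherwise a new reference to None
PyObject *result = PyDict_GetItemString(localDictionary, "a");  // borrowed reference
double a[6];
// Copy at most 6 values so the fixed-size array cannot overflow.
for (Py_ssize_t i = 0; i < PyList_Size(result) && i < 6; i++) {
    a[i] = PyFloat_AsDouble(PyList_GetItem(result, i));
}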
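For comparison, the same flow can be sketched in pure Python, where `exec` with separate globals and locals dictionaries plays the role of `PyRun_String` with `Py_file_input`. The names `source`, `global_ns`, and `local_ns` are illustrative only and do not come from the answer above.

```python
# Hypothetical source string; in the C++ program this would be the stored script.
source = "a = [0, 1, 2, 3, 4, 5]"

global_ns = {}  # plays the role of globalDictionary
local_ns = {}   # plays the role of localDictionary

# "exec" mode corresponds to Py_file_input: run the string like a module body.
exec(compile(source, "<embedded>", "exec"), global_ns, local_ns)

# Top-level assignments land in the locals mapping, as in the C version.
result = [float(x) for x in local_ns["a"]]
print(result)
```

Because the assignment is stored in the locals mapping, the C code reads `a` from `localDictionary` rather than from the `__main__` dictionary.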

Related

Getting result of PyRun_String when python code returns an object

I have a problem with my code.
I have a Python file that captures MAVLink messages (I'm using the pymavlink library), and I need to create a library for interfacing the Python results with C/C++ code.
This is the Python code from my .py file:
from pymavlink import mavutil
the_connection = mavutil.mavlink_connection('udpin:localhost:14550')
the_connection.wait_heartbeat()
print("Heartbeat from system (system %u component %u)" % (the_connection.target_system, the_connection.target_component))
while 1:
    attitude=the_connection.messages['ATTITUDE']
    print("attitude: ",attitude)
I need to retrieve the attitude object as a PyObject. The output of the last print is:
attitude: ATTITUDE {time_boot_ms : 1351674, roll : -0.006938610225915909, pitch : -0.009435104206204414, yaw : 1.8100472688674927, rollspeed : 0.0005244240164756775, pitchspeed : -0.0023000920191407204, yawspeed : 0.0002169199287891388}
I have a stream of messages, so I need to open the connection once and then evaluate the result in a loop. I therefore tried running the simple Python commands as strings, first to open the connection and then to access the data. My C code is:
Py_Initialize();
PyRun_SimpleString("from pymavlink import mavutil\n"
                   "the_connection = mavutil.mavlink_connection('udpin:localhost:14550')\n"
                   "the_connection.wait_heartbeat()\n"
                   "print(\"Heartbeat from system (system %u component %u)\" % (the_connection.target_system, the_connection.target_component), flush=True)" );
PyObject* main_module=PyImport_AddModule("__main__");
PyObject* pdict = PyModule_GetDict(main_module);
PyObject* pdict_new = PyDict_New();
while (1) {
    PyObject* pval = PyRun_String("the_connection.messages['ATTITUDE']", Py_single_input, pdict, pdict_new);
    PyObject* repr = PyObject_Str(pval);
    PyObject* str = PyUnicode_AsEncodedString(repr, "utf-8", "~E~");
    const char* bytes = PyBytes_AS_STRING(str);
    PyObject_Print(pval, stdout, 0);
    printf(" end\n");
    // Release the references created in this iteration.
    Py_XDECREF(repr);
    Py_XDECREF(str);
    Py_XDECREF(pval);
}
Py_Finalize();
The output of this code is:
<pymavlink.dialects.v20.ardupilotmega.MAVLink_attitude_message object at 0x7fba218220>
None end
<pymavlink.dialects.v20.ardupilotmega.MAVLink_attitude_message object at 0x7fba218220>
None end
<pymavlink.dialects.v20.ardupilotmega.MAVLink_attitude_message object at 0x7fba218220>
None end
<pymavlink.dialects.v20.ardupilotmega.MAVLink_attitude_message object at 0x7fba218220>
None end
I've tried returning the object, but it didn't work:
PyObject* pval = PyRun_String("return(the_connection.messages['ATTITUDE'])", Py_single_input, pdict, pdict_new);
I'm not an expert in C/C++. Is there a way to obtain the result correctly? I'm not interested in a string format; I only need a way to use the result as a C object.
I'm using Python 3.9 on a Raspberry Pi; the gcc version is 10.2.1.
Thank you.
You want
PyRun_String("the_connection.messages['ATTITUDE']", Py_eval_input, pdict, pdict_new);
Py_eval_input treats the string like the Python builtin eval (so what you're running must be an expression rather than a statement, which it is).
In contrast, Py_single_input compiles a single interactive statement, as if it were typed at the REPL. PyRun_String then returns None, because a statement doesn't necessarily produce a value; the value of an expression statement is only printed through sys.displayhook, which is where the <... object at 0x...> line in your output comes from. (In Python every expression can be used as a statement, but not every statement is an expression.) It's more akin to exec, but limited to a single statement.
Using "return(the_connection.messages['ATTITUDE'])" doesn't work because return is only valid inside a Python function body.
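The difference between the two start symbols can be reproduced from Python itself, since the `compile` modes `"eval"` and `"single"` correspond to `Py_eval_input` and `Py_single_input`. This is a minimal sketch; the `namespace` dictionary is a made-up stand-in for `the_connection.messages`, not real pymavlink data.

```python
import contextlib
import io

# Stand-in for the_connection.messages; the real object comes from pymavlink.
namespace = {"messages": {"ATTITUDE": {"roll": -0.0069, "pitch": -0.0094}}}
expression = "messages['ATTITUDE']"

# "eval" mode (Py_eval_input): the value of the expression is returned.
value = eval(compile(expression, "<embedded>", "eval"), namespace)

# "single" mode (Py_single_input): the value is handed to sys.displayhook
# (printed), and the call itself returns None.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    single_result = eval(compile(expression, "<embedded>", "single"), namespace)

# A top-level "return" is rejected at compile time.
try:
    compile("return messages['ATTITUDE']", "<embedded>", "single")
    return_compiles = True
except SyntaxError:
    return_compiles = False

print(value, single_result, return_compiles)
```

With `"eval"` the caller receives the object; with `"single"` it only receives `None`, which matches the `None end` lines in the question's output.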

What is visible in an executable built with Cython, in case non-compiled Python code is executed?

When we write Cython code (with types), this will eventually be compiled like C-compiled code and we can't recover the source code (except disassembling but then this is similar to disassembling C code), as seen in Are executables produced with Cython really free of the source code?.
But what happens when we write "normal Python code" (interpreted code without types) in a Cython .pyx file and we produce an executable? How much of it will be visible in the strings of the executable?
Example:
import bottle, random, json
app = bottle.Bottle()
@bottle.route('/')
def index():
    return 'hello'
@bottle.route('/random')
def testrand():
    return str(random.randint(0, 100))
@bottle.route('/jsontest')
def testjson():
    x = json.loads('{ "1": "2" }')
    return 'done'
bottle.run()
In this case I see in the test.c:
static const char __pyx_k_1_2[] = "{ \"1\": \"2\" }";
static const char __pyx_k_json[] = "json";
static const char __pyx_k_main[] = "__main__";
static const char __pyx_k_name[] = "__name__";
static const char __pyx_k_test[] = "__test__";
static const char __pyx_k_loads[] = "loads";
static const char __pyx_k_import[] = "__import__";
static const char __pyx_k_cline_in_traceback[] = "cline_in_traceback";
So in example 2, won't all these strings be easily visible in the executable?
In general you won't be able to avoid having those strings in the resulting executable. This is just how Python works: they are needed at run time.
If we look at a simple C-code:
void do_nothing(){...}
int main(){
    do_nothing();
    return 0;
}
Compile and link it statically. When the linker is done, the call of do_nothing (let's assume it is not inlined or optimized out) is just a jump to a memory address; the name of the function is no longer needed and can be erased from the resulting executable.
Python works differently: there is no linker, and we don't use raw memory addresses at run time to call some functionality. Instead, the Python machinery finds it for us given the name of the package/module and of the function, so these names must be available at run time.
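The name-based lookup described above can be sketched with the standard library alone. This is only an illustration of the mechanism, not the code Cython generates.

```python
import importlib

# The module and the attribute are both located by name at run time, which
# is why the strings "json" and "loads" have to exist in the binary.
module = importlib.import_module("json")
loads = getattr(module, "loads")

parsed = loads('{ "1": "2" }')
print(parsed)
```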
However, if you are willing to modify the generated C file, you could make the life of the "hacker" somewhat harder.
When a string is needed for calling Python functionality, the generated code looks like this (e.g. for import json):
static const char __pyx_k_json[] = "json";
static PyObject *__pyx_n_s_json;
static __Pyx_StringTabEntry __pyx_string_tab[] = {
    ...
    {&__pyx_n_s_json, __pyx_k_json, sizeof(__pyx_k_json), 0, 0, 1, 1},
    ...
    {0, 0, 0, 0, 0, 0, 0}
};
static CYTHON_SMALL_CODE int __Pyx_InitGlobals(void) {
    if (__Pyx_InitStrings(__pyx_string_tab) < 0) __PYX_ERR(0, 1, __pyx_L1_error);
    ...
}
...
__pyx_t_1 = __Pyx_Import(__pyx_n_s_json, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error)
So one could store "json" as "irnm" (every character shifted by -1) and then restore the real name at run time, before __Pyx_InitStrings is called in __Pyx_InitGlobals (the array would have to be made non-const for that).
Now simply dumping the strings in the executable would yield only meaningless character combinations. One could even go further and load the real names from somewhere after the program has started, if this is worth the trouble.
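The shift-by-one idea can be sketched in a few lines. The helper name `shift` is hypothetical; in the real setting the restore step would be C code patched into the generated file before `__Pyx_InitStrings` runs.

```python
def shift(text: str, offset: int) -> str:
    """Shift every character's code point by `offset`."""
    return "".join(chr(ord(ch) + offset) for ch in text)

stored = shift("json", -1)    # what a strings dump of the binary would show
restored = shift(stored, 1)   # what gets restored before the names are used

print(stored, restored)
```

This only hides names from a casual strings dump; anyone who reads the restore routine can undo it.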

Python C API free() errors after using Py_SetPath() and Py_GetPath()

I'm trying to figure out why I can't simply get and set the python path through its C API. I am using Python3.6, on Ubuntu 17.10 with gcc version 7.2.0. Compiling with:
gcc pytest.c `python3-config --libs` `python3-config --includes`
#include <Python.h>
int main()
{
    Py_Initialize(); // removes error if put after Py_SetPath
    printf("setting path\n"); // prints
    Py_SetPath(L"/usr/lib/python3.6"); // Error in `./a.out': free(): invalid size: 0x00007fd5a8365030 ***
    printf("success\n"); // doesn't print
    return 0;
}
Setting the path works fine, unless I also try to get the path prior to doing so. If I get the path at all, even just to print without modifying the returned value or anything, I get a "double free or corruption" error.
Very confused. Am I doing something wrong or is this a bug? Anyone know a workaround if so?
Edit: Also errors after calling Py_Initialize();. Updated code. Now errors even if I don't call Py_GetPath() first.
From alk it seems related to this bug: https://bugs.python.org/issue31532
Here is the workaround I am using. Since you can't call Py_GetPath() before Py_Initialize(), and also seemingly you can't call Py_SetPath() after Py_Initialize(), you can add to or get the path like this after calling Py_Initialize():
#include <Python.h>
int main()
{
    Py_Initialize();
    // get handle to python sys.path object
    PyObject *sys = PyImport_ImportModule("sys");
    PyObject *path = PyObject_GetAttrString(sys, "path");
    // make a list of paths to add to sys.path
    PyObject *newPaths = PyUnicode_Split(PyUnicode_FromWideChar(L"a:b:c", -1), PyUnicode_FromWideChar(L":", 1), -1);
    // iterate through list and add all paths
    for(int i=0; i<PyList_Size(newPaths); i++) {
        PyList_Append(path, PyList_GetItem(newPaths, i));
    }
    // print out sys.path after appends
    PyObject *newlist = PyUnicode_Join(PyUnicode_FromWideChar(L":", -1), path);
    printf("newlist = %ls\n", PyUnicode_AsWideCharString(newlist, NULL));
    return 0;
}
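For reference, the workaround above corresponds to the following pure-Python steps. The directories `a`, `b`, and `c` are the same placeholders used in the C code, not real paths.

```python
import sys

# Split a ":"-separated list of placeholder directories, as the C code does.
new_paths = "a:b:c".split(":")

# Append each entry to sys.path after the interpreter is initialized.
for entry in new_paths:
    sys.path.append(entry)

# Join the resulting search path for display, mirroring PyUnicode_Join.
joined = ":".join(sys.path)
print("newlist =", joined)
```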
[the below answer refers to this version of the question.]
From the docs:
void Py_Initialize()
Initialize the Python interpreter. In an application embedding Python, this should be called before using any other Python/C API functions; with the exception of Py_SetProgramName(), Py_SetPythonHome() and Py_SetPath().
But the code you show calls Py_GetPath() before it calls Py_Initialize(), and Py_GetPath() is not among the exceptions listed in the paragraph above, so it should not be called that early.

Get Python to look in different location for Lib using Py_SetPath()

I have embedded Python in an application, foo.exe. When it runs, Python is invoked and immediately looks for Lib. The only way I can get it to work is to place Lib (Python's library directory of modules) in the same location as foo.exe.
Is there a way I can redirect Python to look somewhere else, such as Python/Lib? I am not able to change PATH (this is Windows) and I don't want to hack the Python source code.
Basically, I cannot get Py_SetPath() to work, and I have not been able to find any practical examples on the internet.
Update:
OK, this works:
#define MYMAXPATHLEN 1000
static wchar_t progpath[MYMAXPATHLEN + 1];
wchar_t* pdir = L"\\My_New_Location\\Python\\Lib";
wchar_t* pdelim = L";";
wchar_t* pypath = NULL;
GetModuleFileNameW(NULL, progpath, MYMAXPATHLEN);
reduce(progpath);
wcscat(progpath,pdir);
// I get the present module path and add the extra dirs to access Lib code
wcscat(progpath, pdelim); // I add a path delimiter
pypath = Py_GetPath();
wcscat(progpath, pypath);
// I add the paths that Py_GetPath() produces.
Py_SetPath(progpath);
Py_Initialize();
I also call Py_SetProgramName() AFTER Py_Initialize(). I am not sure if all this extra stuff is needed, but smaller solutions seem to fail.
It seems that calling Py_SetProgramName() AFTER the initialize is very important for the embedding call to work properly.
Before importing the library, run the following line:
sys.path.append(r'C:\path to Lib')
Details can be found here.
I got it to work (on Linux) doing the following:
// method to inspect PyObjects
static void reprint(PyObject *obj)
{
    PyObject* repr = PyObject_Repr(obj);
    PyObject* str = PyUnicode_AsEncodedString(repr, "utf-8", "~E~");
    const char *bytes = PyBytes_AS_STRING(str);
    printf("REPR: %s\n", bytes);
    Py_XDECREF(repr);
    Py_XDECREF(str);
}

int main()
{
    // for our manually compiled and installed usr-local Python 3.6
    //#define PATH L"/usr/local/bin:/usr/local/lib:/usr/local/lib/python3.6/lib-dynload"
    //#define PREFIX L"/usr/local"
    //#define EXEC_PREFIX L"/usr/local"
    //#define FULL_PROG_PATH L"/usr/local/bin/python3.6"
    // for apt installed Python 3.7
    //#define PATH L"/usr/bin:/usr/lib:/usr/lib/python3.7/lib-dynload"
    //#define PREFIX L"/usr"
    //#define EXEC_PREFIX L"/usr"
    //#define FULL_PROG_PATH L"/usr/bin/python3.7"
    // for venv (which uses the /usr/local/lib python 3.6)
    #define PATH L"/home/me/venv_dir/bin:/home/me/venv_dir/lib:/usr/local/lib/python3.6:/usr/local/lib/python3.6/lib-dynload"
    #define PREFIX L"/home/me/venv_dir"
    #define EXEC_PREFIX L"/usr/local"
    #define FULL_PROG_PATH L"/usr/local/bin/python3.6"
    // ------------------------------------------------
    #define CHANGE_THE_INTERPRETER
    #ifdef CHANGE_THE_INTERPRETER
    // TODO : Look at using this: https://www.python.org/dev/peps/pep-0587/
    Py_SetPath(PATH);
    // change the built-in prefix/exec-prefix, in place.
    wchar_t* wpPrefix = Py_GetPrefix();
    wchar_t* wpExecPrefix = Py_GetExecPrefix();
    wchar_t* wpProgramFullPath = Py_GetProgramFullPath();
    wcscpy (wpPrefix, PREFIX);
    wcscpy (wpExecPrefix, EXEC_PREFIX);
    wcscpy (wpProgramFullPath, FULL_PROG_PATH);
    #endif //CHANGE_THE_INTERPRETER
    // inspect the environment variables. With the #define commented out above the "defaults" appear as indicated to the right hand side
    wchar_t* xx;                    //defaults
    xx = Py_GetPrefix();            //<prefix> L"/usr/local"
    xx = Py_GetExecPrefix();        //<exec_prefix> L"/usr/local"
    xx = Py_GetPath();              //L"/usr/local/lib/python36.zip:/usr/local/lib/python3.6:/usr/local/lib/python3.6:/usr/local/lib/python3.6/lib-dynload"
    xx = Py_GetProgramName();       //L"python3"
    xx = Py_GetPythonHome();        //null
    xx = Py_GetProgramFullPath();   //<progpath> L"/usr/local/bin/python3"
    Py_Initialize();
    // some extra bits
    int x2 = PyRun_SimpleString ("import site; print (site.getsitepackages())");
    int x3 = PyRun_SimpleString ("import datetime");
    int x4 = PyRun_SimpleString ("import numpy as np");
    //inspect sys info
    PyObject* sys_executable = PySys_GetObject((char*)"executable"); reprint(sys_executable);
    PyObject* sys_version = PySys_GetObject((char*)"version"); reprint(sys_version);
    PyObject* sys_realPrefix = PySys_GetObject((char*)"real_prefix"); reprint(sys_realPrefix);
    PyObject* sys_basePrefix = PySys_GetObject((char*)"base_prefix"); reprint(sys_basePrefix);
    return 0;
}
Some points to note:
If you look at the Python module getpath.c you will see the following buffers:
static wchar_t prefix[MAXPATHLEN+1];
static wchar_t exec_prefix[MAXPATHLEN+1];
static wchar_t progpath[MAXPATHLEN+1];
Methods such as Py_GetProgramFullPath perform as follows:
if (!module_search_path)
calculate_path();
return progpath;
...so it is possible to use those methods to obtain the buffer pointers and wcscpy the values directly into the buffers. Note this is currently only possible with getpath being implemented in this fashion!
lib-dynload is needed to ensure that some modules (e.g. datetime) can be pulled in
Also note this approach ensures that the .../python3.x/encodings directory can be found, which prevents a runtime error within Py_Initialize
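To check from the Python side whether the prefix settings took effect, a small inspection script along these lines can be run through PyRun_SimpleString. This is a sketch using only standard-library attributes; the variable names are illustrative.

```python
import importlib.util
import sys

# Values the interpreter ended up with after initialization.
print("prefix:", sys.prefix)
print("exec_prefix:", sys.exec_prefix)
print("base_prefix:", sys.base_prefix)

# The encodings package must be locatable, otherwise start-up fails.
encodings_spec = importlib.util.find_spec("encodings")
print("encodings found at:", encodings_spec.origin)

# In a venv, sys.prefix differs from sys.base_prefix.
in_venv = sys.prefix != sys.base_prefix
print("running inside a venv:", in_venv)
```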

Problems using embedded Python in a C++ application

I'm trying to embed Python in my C++ application, somewhat akin to the method found here in section 1.4:
https://docs.python.org/3.5/extending/embedding.html
In short, the problem is that I can't get the C++ application to work with .py files that import the 'emb' module, i.e. the Python extension module that's written into the C++ code.
I have a Python file, testmod.py:
import emb
# define some functions
def printhello(input):
    emb.numargs()
    return 2
def timesfour(input):
    print(input * 4)
In my C++ application, I have this code which works:
PyImport_AppendInittab("emb", &(mynamespace::PyInit_emb) );
Py_Initialize();
PyObject *globals = PyModule_GetDict(PyImport_AddModule("__main__"));
PyObject *testModule = PyImport_ImportModule("emb");
PyObject* pFunc = PyObject_GetAttrString(testModule, "numargs");
After this, pFunc is non-NULL; things look good. So I think the 'embedded module' is fine.
If I change the last two lines from above to:
PyObject* testModule = PyImport_ImportModule("testmod");
PyObject* pFunc = PyObject_GetAttrString(testModule, "printhello");
This also works fine, provided the line emb.numargs() is removed from testmod.py. Once I add that line and re-run the C++ application, testModule becomes NULL, which means something has gone wrong.
Any ideas?
Is this the way this capability is supposed to be used?
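When an import comes back empty, the underlying Python exception usually says why. In the C++ application, calling PyErr_Print() right after PyImport_ImportModule returns NULL prints that traceback. The same idea can be sketched in Python; the helper `try_import` and the module name below are hypothetical, chosen only so that the import fails.

```python
import importlib
import traceback

def try_import(name):
    """Return (module, error_text); error_text is None on success."""
    try:
        return importlib.import_module(name), None
    except Exception:
        # Keep the full traceback text so the real cause is visible.
        return None, traceback.format_exc()

# Hypothetical module name that does not exist, to trigger the error path.
module, error_text = try_import("hypothetical_missing_module_for_demo")
if module is None:
    print(error_text)
```

Running a check like this on testmod would show which exception is actually raised while the module is being imported.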
