How to use basic string operations with PyObject* strings? - python

I would rather not convert every Python string object from PyObject* to std::string or char* with PyUnicode_DecodeUTF8 and PyUnicode_AsUTF8, because the conversion is expensive.
In my last question, How to extend/reuse Python C Extensions/API implementation?, I managed to use the Python open function to give me a PyObject* string directly. After that, sending the string back to the Python program is simple: I can pass its PyObject* pointer back instead of doing a full char-by-char copy, as PyUnicode_DecodeUTF8 or PyUnicode_AsUTF8 do.
In the regex implementation in CPython, I found a function like this:
static void* getstring(PyObject* string, Py_ssize_t* p_length,
int* p_isbytes, int* p_charsize,
Py_buffer *view)
{
/* given a python object, return a data pointer, a length (in
characters), and a character size. return NULL if the object
is not a string (or not compatible) */
/* Unicode objects do not support the buffer API. So, get the data directly. */
if (PyUnicode_Check(string)) {
if (PyUnicode_READY(string) == -1)
return NULL;
*p_length = PyUnicode_GET_LENGTH(string);
*p_charsize = PyUnicode_KIND(string);
*p_isbytes = 0;
return PyUnicode_DATA(string);
}
/* get pointer to byte string buffer */
if (PyObject_GetBuffer(string, view, PyBUF_SIMPLE) != 0) {
PyErr_SetString(PyExc_TypeError, "expected string or bytes-like object");
return NULL;
}
*p_length = view->len;
*p_charsize = 1;
*p_isbytes = 1;
if (view->buf == NULL) {
PyErr_SetString(PyExc_ValueError, "Buffer is NULL");
PyBuffer_Release(view);
view->buf = NULL;
return NULL;
}
return view->buf;
}
It does not seem to be using PyUnicode_DecodeUTF8 or PyUnicode_AsUTF8 to work with the PyObject* coming from the Python Interpreter.
How can I use basic string operations on PyObject* strings without converting them to std::string or char*?
By basic operations I mean something like the following examples. (Just for illustration, I am using Py_BuildValue to build a PyObject* string from a char* or std::string.)
static PyObject* PyFastFile_do_concatenation(PyFastFile* self)
{
PyObject* hello = Py_BuildValue( "s", "Hello" );
PyObject* word = Py_BuildValue( "s", "word" );
// I am just guessing the `->value` property
PyObject* hello_world = hello->value + word->value;
hello_world; // return the `PyObject*` string `Hello word`
}
static PyObject* PyFastFile_do_substring(PyFastFile* self)
{
PyObject* hello = Py_BuildValue( "s", "Hello word" );
PyObject* hello_world = hello->value[5:];
hello_world; // return the `PyObject*` string `word`
}
static PyObject* PyFastFile_do_contains(PyFastFile* self)
{
PyObject* hello = Py_BuildValue( "s", "Hello word" );
if( "word" in hello->value ) {
Py_BuildValue( "p", true ); // return the `PyObject*` boolean `true`
}
Py_BuildValue( "p", false ); // return the `PyObject*` boolean `false`
}
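For reference, the C API does have functions that operate on str objects directly: PyUnicode_Concat, PyUnicode_Substring and PyUnicode_Contains. The sketch below calls them through ctypes.pythonapi only so that their behavior can be tried from a Python prompt; in an extension they would be called from C with the same arguments. The helper names concat, substring and contains are made up for illustration.

```python
import ctypes

api = ctypes.pythonapi  # the running interpreter's C API

# PyObject* PyUnicode_Concat(PyObject *left, PyObject *right) -> new reference
api.PyUnicode_Concat.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyUnicode_Concat.restype = ctypes.py_object

# PyObject* PyUnicode_Substring(PyObject *str, Py_ssize_t start, Py_ssize_t end)
api.PyUnicode_Substring.argtypes = [ctypes.py_object, ctypes.c_ssize_t, ctypes.c_ssize_t]
api.PyUnicode_Substring.restype = ctypes.py_object

# int PyUnicode_Contains(PyObject *container, PyObject *element) -> 1, 0 or -1
api.PyUnicode_Contains.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyUnicode_Contains.restype = ctypes.c_int


def concat(left, right):
    """Hypothetical helper: left + right, done by PyUnicode_Concat."""
    return api.PyUnicode_Concat(left, right)


def substring(text, start, end):
    """Hypothetical helper: text[start:end], done by PyUnicode_Substring."""
    return api.PyUnicode_Substring(text, start, end)


def contains(text, piece):
    """Hypothetical helper: piece in text, done by PyUnicode_Contains."""
    return bool(api.PyUnicode_Contains(text, piece))


hello_word = concat("Hello ", "word")
tail = substring(hello_word, 6, len(hello_word))
found = contains(hello_word, "word")
```

In C, the two functions that return PyObject* hand back a new reference, so the result can be returned to Python as is, while the inputs still need their own Py_DECREF if the function created them.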

Related

How do you correctly pass a pointer between calls with Python C-API

I'm writing a python + c module, and I'm trying to pass a pointer to a certain struct I need. I'm using PyCapsule to encapsulate the pointer, but I'm having problems when retrieving the pointer from it.
The C functions used are like:
static PyObject *
spam_new (PyObject *self, PyObject *args)
{
unsigned int number;
struct spam *pointer;
if (!PyArg_ParseTuple(args, "I", &number)) {
return NULL;
}
pointer = (struct spam*) malloc(sizeof (struct spam));
if (pointer == NULL) {
return NULL;
}
spam_init(*pointer, number);
return PyCapsule_New((void*) pointer, "spam", &spam_destroy);
}
static PyObject *
spam_get (PyObject *self, PyObject *args)
{
PyObject *capsule, *result;
void *raw_pointer;
struct spam *pointer;
unsigned long long int number;
if (!PyArg_ParseTuple(args, "OK", capsule, &number)) {
return NULL;
}
printf("[DEBUG] Number: %llu\n", number);
printf("[DEBUG] Capsule pointer: %p\n", capsule);
raw_pointer = PyCapsule_GetPointer(capsule, "spam");
if (raw_pointer == NULL) {
return NULL;
}
pointer = (struct spam*) raw_pointer;
.
.
.
}
They are both declared with METH_VARARGS.
When in python, custom.new(1) returns a capsule as expected, which I store in a variable c.
When calling custom.get(c, 14) python crashes at the PyCapsule_GetPointer function call. Both prints show the same (14), meaning that PyArg_ParseTuple is not getting the capsule passed as a parameter.
For security reasons passing the pointer as a long is not an option.
Thanks.
The Python documentation states that the "O" format unit stores a pointer to a PyObject (a PyObject*), not a PyObject.
Therefore, when using PyArg_ParseTuple to get a PyObject*, you have to pass the address of a PyObject* variable (that is, a PyObject**).
The provided code was fixed by adding an & before capsule in the line:
if (!PyArg_ParseTuple(args, "OK", &capsule, &number)) {
Fixed thanks to Davis Herring's comment.

Python c++ wrapper : Convert multi-type struct to its python representation (preferably dictionary)

I've chosen setuptools to use my C/C++ code from python scripts.
One of the phases when building such a wrapper is to convert the C/C++ return value into a Python object.
So far I have been able to convert simple primitive values and lists of primitive values. However, I wish to extend this to a multi-value struct, as shown in the example below.
My main challenge right now is how to create the Python representation of the struct (PyObject* ret = PyList_New(...);) and how to set its values properly with the different types.
I tried to create a list of items of the same type (such as std::vector<float>) and managed to set the values properly using Py_BuildValue and PyList_SetItem, but I'm still struggling with multiple types...
typedef struct _fileParams
{
bool valid;
int index;
std::string key;
std::value value;
} fileParams;
FileDataBase * db;
static PyObject *searchFileInDB(PyObject *self, PyObject *args)
{
if (db == NULL)
{
PyErr_SetString(PyExc_RuntimeError, "DB could not be initialized");
return NULL;
}
char* fileName = NULL;
int fileNameSize = 0;
PyArg_ParseTuple(args, "s#", &fileName, &fileNameSize);
try
{
fileParams p;
bool res = db->lookup(fileName, fileNameSize, p);
PyObject* ret = PyList_New(...);
if (res)
{
PyObject* r1 = Py_BuildValue("b", p.valid);
PyList_SetItem(ret, 0, r1);
PyObject* r2 = Py_BuildValue("i", p.index);
PyList_SetItem(ret, 1, r2);
PyObject* r3 = Py_BuildValue("s", p.key.c_str());
PyList_SetItem(ret, 2, r3);
PyObject* r4 = Py_BuildValue("s", p.value);
PyList_SetItem(ret, 3, r4);
}
return ret;
} catch (...) {
PyErr_SetString(PyExc_RuntimeError, "failed with C exception");
return NULL;
}
}
You probably want to look into dictionary objects; see "Dictionary Objects" in the Python C API documentation.
I'm guessing you'd want to set the values with PyDict_SetItemString() as described there.
HTH
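A minimal sketch of that suggestion, driven through ctypes.pythonapi so it can be tried without compiling an extension. PyDict_New and PyDict_SetItemString are the real C API functions; the helper file_params_to_dict and the sample field values are assumptions made up for illustration, with the keys taken from the fileParams struct in the question.

```python
import ctypes

api = ctypes.pythonapi  # the running interpreter's C API

# PyObject* PyDict_New(void) -> new reference
api.PyDict_New.argtypes = []
api.PyDict_New.restype = ctypes.py_object

# int PyDict_SetItemString(PyObject *p, const char *key, PyObject *val) -> 0 or -1
api.PyDict_SetItemString.argtypes = [ctypes.py_object, ctypes.c_char_p, ctypes.py_object]
api.PyDict_SetItemString.restype = ctypes.c_int


def file_params_to_dict(valid, index, key, value):
    """Hypothetical helper: one dictionary entry per struct field."""
    result = api.PyDict_New()
    fields = ((b"valid", valid), (b"index", index), (b"key", key), (b"value", value))
    for name, field in fields:
        # Each field may have a different Python type; the dict does not care.
        api.PyDict_SetItemString(result, name, field)
    return result


params = file_params_to_dict(True, 3, "a.txt", "payload")
```

One difference from the list code in the question: PyList_SetItem steals the reference to the item, whereas PyDict_SetItemString does not, so in C each value built with Py_BuildValue still needs a Py_DECREF after it has been inserted.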

python3 str object cannot pass PyUnicode_Check

I was writing a C extension function, which was supposed to accept a str object as argument. The code is shown below:
static PyObject *py_print_chars(PyObject *self, PyObject *o) {
PyObject *bytes;
char *s;
if (!PyUnicode_Check(o)) {
PyErr_SetString(PyExc_TypeError, "Expected string");
return NULL;
}
bytes = PyUnicode_AsUTF8String(o);
s = PyBytes_AsString(bytes);
print_chars(s);
Py_DECREF(bytes);
Py_RETURN_NONE;
}
But as I test the module in python3 console, I find str objects can't pass the PyUnicode_Check:
>>> from sample2 import *
>>> print_chars('Hello world')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Expected string
As far as I know, Python 3's str type is called PyUnicode in C, and the C code above was written with reference to "python cookbook3", chapter 15.13. I just can't work out the problem. Can anybody tell me what's wrong with my code?
Here is what "python cookbook3" said:
If for some reason, you are working directly with a PyObject * and can’t use PyArg_ParseTuple(), the following code samples show how you can check and extract a suitable char * reference, from both a bytes and string object:
/* Some Python Object (obtained somehow) */
PyObject *obj;
/* Conversion from bytes */
{
char *s;
s = PyBytes_AsString(o);
if (!s) {
return NULL; /* TypeError already raised */
}
print_chars(s);
}
/* Conversion to UTF-8 bytes from a string */
{
PyObject *bytes;
char *s;
if (!PyUnicode_Check(obj)) {
PyErr_SetString(PyExc_TypeError, "Expected string");
return NULL;
}
bytes = PyUnicode_AsUTF8String(obj);
s = PyBytes_AsString(bytes);
print_chars(s);
Py_DECREF(bytes);
}
And the whole code:
#include "Python.h"
#include "sample.h"
static PyObject *py_print_chars(PyObject *self, PyObject *o) {
PyObject *bytes;
char *s;
if (!PyUnicode_Check(o)) {
PyErr_SetString(PyExc_TypeError, "Expected string");
return NULL;
}
bytes = PyUnicode_AsUTF8String(o);
s = PyBytes_AsString(bytes);
print_chars(s);
Py_DECREF(bytes);
Py_RETURN_NONE;
}
/* Module method table */
static PyMethodDef SampleMethods[] = {
{"print_chars", py_print_chars, METH_VARARGS, "print character"},
{ NULL, NULL, 0, NULL}
};
/* Module structure */
static struct PyModuleDef samplemodule = {
PyModuleDef_HEAD_INIT,
"sample",
"A sample module",
-1,
SampleMethods
};
/* Module initialization function */
PyMODINIT_FUNC
PyInit_sample2(void) {
return PyModule_Create(&samplemodule);
}
If the goal is to accept exactly one argument, the function should be declared as METH_O, not METH_VARARGS. The former passes the single argument along without wrapping it. The latter wraps the arguments in a tuple, so o is that tuple (which is why PyUnicode_Check fails), and the tuple would have to be unpacked or parsed to get the str object inside.
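A rough Python-level model of the two calling conventions, not the C calling code itself. The names call_like_meth_varargs, call_like_meth_o and is_str are hypothetical; is_str stands in for PyUnicode_Check.

```python
def call_like_meth_varargs(func, *call_args):
    # METH_VARARGS: the function receives all positional arguments
    # packed into a single tuple.
    return func(call_args)


def call_like_meth_o(func, single_arg):
    # METH_O: the function receives the one argument itself.
    return func(single_arg)


def is_str(obj):
    # Stand-in for PyUnicode_Check(o) in the question's code.
    return isinstance(obj, str)


seen_by_varargs = call_like_meth_varargs(lambda o: o, "Hello world")
varargs_check = call_like_meth_varargs(is_str, "Hello world")
meth_o_check = call_like_meth_o(is_str, "Hello world")
```

With the tuple in hand, the check can only succeed on its first element, which is what PyArg_ParseTuple would extract in C.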

SWIG adds lines to delete variables that don't exist

I have a C++ class that is able to output strings in normal ASCII or wide character format. I am using SWIG (version 3.0.5) to create the bindings for Python. The bindings have to work under Windows (32-bit and 64-bit) and Linux (64-bit). I have written a custom typemap to get strings from Python to my C++ class:
// This typemap is used for getting strings from Python to the C++ class
%typemap(in) const myNamespace::myStringType&
{
// Custom input conversion #7
const char* pChars = "";
PyObject* pyobj = $input;
if(PyString_Check(pyobj))
{
pChars = PyString_AsString( pyobj );
$1 = new myNamespace::myStringType( pChars );
}
else if (PyUnicode_Check( pyobj ))
{
PyObject* tmp = PyUnicode_AsUTF8String( pyobj );
pChars = PyString_AsString( tmp );
$1 = new myNamespace::myStringType( pChars );
}
else
{
std::string strTemp;
int rrr = SWIG_ConvertPtr(pyobj, (void **) &strTemp, $descriptor(String), 0);
if (!SWIG_IsOK(rrr))
SWIG_exception_fail(SWIG_ArgError(rrr), "Expected a String "
"in method '$symname', argument $argnum of type '$type'");
$1 = new myNamespace::myStringType( strTemp );
}
}
This typemap works fine for the 32-bit and 64-bit normal character builds, but a problem arises when I try to build for wide-character. In the wide character builds, I need to include the following SWIG include files in my interface file:
%include "std_wiostream.i"
%include "std_wsstream.i"
When these include files are used with the above typemap we get spurious lines of code inserted into the wrapper like so:
if (SWIG_IsNewObj(res2)) delete arg2;
Here's an example of a complete wrapper function produced by SWIG:
SWIGINTERN PyObject *_wrap_timeStampFromStr(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
myNamespace::myStringType *arg1 = 0 ;
PyObject * obj0 = 0 ;
slx::SlxU64 result;
if (!PyArg_ParseTuple(args,(char *)"O:timeStampFromStr",&obj0)) SWIG_fail;
{
// Custom input conversion #7
const char* pChars = "";
PyObject* pyobj = obj0;
if(PyString_Check(pyobj))
{
pChars = PyString_AsString( pyobj );
arg1 = new myNamespace::myStringType( pChars );
}
else if (PyUnicode_Check( pyobj ))
{
PyObject* tmp = PyUnicode_AsUTF8String( pyobj );
pChars = PyString_AsString( tmp );
arg1 = new myNamespace::myStringType( pChars );
}
else
{
std::string strTemp;
int rrr = SWIG_ConvertPtr(pyobj, (void **) &strTemp, SWIGTYPE_String, 0);
if (!SWIG_IsOK(rrr))
SWIG_exception_fail(SWIG_ArgError(rrr), "Expected a String "
"in method 'timeStampFromStr', argument 1 of type 'myNamespace::myStringType const &'");
arg1 = new myNamespace::myStringType( strTemp );
}
}
result = (slx::SlxU64)slx::timeStampFromStr((std::basic_string< char,std::char_traits< char >,std::allocator< char > > const &)*arg1);
resultobj = SWIG_From_unsigned_SS_long_SS_long(static_cast< unsigned long long >(result));
if (SWIG_IsNewObj(res2)) delete arg2;
return resultobj;
fail:
if (SWIG_IsNewObj(res2)) delete arg2;
return NULL;
}
The code fails to compile because the res2 and arg2 variables are never defined in the wrapper code.
If I leave out the SWIG includes, the extra lines of code disappear but then I won't have the support for wide character iostream that I need.
Currently the workaround is to delete these lines of code manually, but obviously this will not work with automated builds in Windows and Makefiles in Linux.
Does anyone know why this happens? I believe my typemap must have an error in it that is producing the extra lines of code, but again, this is ONLY happening when the SWIG include files for wide characters iostreams are included.
Any ideas would be greatly appreciated. Thanks in advance.
I had the same issue with string const &. It seems that SWIG adds a wrong freearg statement from std_string.i. The workaround is to override the wrongly added delete:
%typemap(freearg) string const & {}
%typemap(freearg) const myNamespace::myStringType& {}

Py_RunString Returns Only None Object

I'm using Python 3.4.3 embedded in my C++ project. When I call PyRun_String, it always returns a None object.
Here is my code:
#include <Python.h>
#include <cstdio>
#include <string>
// defined below
std::wstring PyToStr(PyObject* Object);
int main(){
    // initialize the Python interpreter
    Py_Initialize();
    // create new dictionaries for the global and local definitions
    PyObject *globals = PyDict_New();
    PyObject *locals = PyDict_New();
    // add the built-in definitions (len, str, ...) to the global dictionary
    PyDict_SetItemString(globals, "__builtins__", PyEval_GetBuiltins());
    // evaluate some Python code and get the result; here is my issue, it is always None
    PyObject *string_result = PyRun_StringFlags(
        "1 + 1" /* or whatever Python code */,
        Py_file_input, globals, locals, NULL);
    // check whether the Python code raised an exception and print it
    if (PyErr_Occurred()) {
        PyErr_Print(); PyErr_Clear(); return 1;
    }
    else {
        // no exception: get the string representation of the Python object, but it is always None
        auto str = PyToStr(string_result);
        printf("python result is %ls", str.c_str());
    }
    return 0;
}
And here is my function that converts any Python object to a C++ std::wstring:
std::wstring PyToStr(PyObject* Object)
{
    // equivalent to the Python code: str(Object)
    PyObject* objectsRepresentation = PyObject_Str(Object);
    if (!objectsRepresentation)
        return L"";
    // borrow the wchar_t* buffer owned by the Python str object
    const wchar_t* ws = PyUnicode_AsUnicode(objectsRepresentation);
    // copy before releasing the str object; a NULL ws cannot be converted to std::wstring
    std::wstring result = ws ? ws : L"";
    Py_DECREF(objectsRepresentation);
    return result;
}
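If you just want to evaluate a single expression, use Py_eval_input instead of Py_file_input; the call then returns the value of the expression. With Py_file_input the source is run as a sequence of statements, and the result of a successful run is always None.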
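The same distinction can be seen from Python itself, where the compile() modes "exec" and "eval" correspond to the start symbols Py_file_input and Py_eval_input. A small sketch, with run_as as a hypothetical helper name:

```python
SOURCE = "1 + 1"


def run_as(mode):
    """Compile SOURCE in the given mode and run it with fresh namespaces."""
    code = compile(SOURCE, "<string>", mode)
    # eval() accepts a code object; for code compiled in "exec" mode
    # its return value is None.
    return eval(code, {}, {})


file_input_result = run_as("exec")  # like Py_file_input
eval_input_result = run_as("eval")  # like Py_eval_input
```

So in the embedding code above, swapping Py_file_input for Py_eval_input is the only change needed for an expression such as "1 + 1".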
