Calling thread with char pointer function and std::string produces different results - python

I have a function that returns a char pointer called loop_p and I call it many times on my main_thread like this to pass it to the py_embed thread:
HANDLE handle;
SENDTOPY *cmd=new SENDTOPY();
char* msg=loop_p(ac);
char *argv[4]={"PythonPlugIn2","bridge","test_callsign",msg};
cmd->argc=4;
for(int i = 0; i < NUM_ARGUMENTS; i++ )
{
cmd->argv[i] = argv[i];
}
handle=(HANDLE) _beginthread(py_embed,0,(void*)cmd);
where SENDTOPY is a struct:
typedef struct{
int argc;
char *argv[4];
}SENDTOPY;
The message is sent to Python like this, and Python receives it correctly:
SENDTOPY *arg=(SENDTOPY*)data;
pArgs2=Py_BuildValue("(s)",arg->argv[3]); /* msg is the last of the 4 elements, index 3 */
pValue2 = PyObject_CallObject(pFunc, pArgs2);
To avoid memory allocation problems, I changed loop_p into a function that returns a std::string. I then use that string in the main_thread, with some modifications:
...
std::string msg_python=loop_p(ac);
const char * msg2=msg_python.data();
char *argv[3]={"PythonPlugIn2","bridge","test_callsign"};
cmd->argc=3;
cmd->msg=msg2;
for(...
...
and I changed the struct SENDTOPY to this:
typedef struct{
int argc;
char *argv[3];
const char* msg;
}SENDTOPY;
I print it to a text file in the main_thread, and the message is the same before and after the modifications. But in the py_embed thread the const char* no longer points to what it did; it is just a bunch of gibberish. What am I doing wrong?
Thank you in advance.
Edit:
loop_p code
std::string CNewDisplay::loop_p(int ac){
std::string res("Number of Aircrafts\nHour of simulation\n\n");
for (...
....
//Route
textfile<<fp.GetRoute()<<endl;
std::string route=fp.GetRoute();
std::replace(route.begin(),route.end(),' ',',');
res+=route;
res.append(",\n");
res.append("\n\n");
};
return res;
}

It appears to me that you are storing a pointer to the internal buffer of a local std::string. That buffer is released when the string goes out of scope, which can happen while the py_embed thread is still reading it, so the thread sees garbage. If you make the string static, its buffer stays valid for the lifetime of the program, and you can safely store a pointer to it:
static std::string msg_python; // survives beyond local scope
msg_python=loop_p(ac); // set string to loop_p return value
const char *msg2=msg_python.c_str(); // get ptr each time since it could change
Also, use .c_str() to get the C-style char pointer so that the string is guaranteed to be null-terminated. Before C++11, .data() did not guarantee null termination.
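Since the question starts a new thread on every call, a single static string could still be overwritten while an earlier thread is reading it. One alternative, sketched here in plain C, is to give each thread payload its own heap copy of the message. Payload, payload_create and payload_destroy are hypothetical names that do not appear in the original code, and the struct is only a simplified stand-in for SENDTOPY.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for SENDTOPY (hypothetical): the payload owns msg. */
typedef struct {
    int argc;
    const char *argv[3];
    char *msg;              /* heap copy, released by payload_destroy() */
} Payload;

/* Copy msg into memory owned by the payload, so the caller's buffer may
   change or go out of scope afterwards. Returns NULL if allocation fails. */
Payload *payload_create(const char *msg)
{
    assert(msg != NULL);
    Payload *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    size_t len = strlen(msg) + 1;      /* include the terminating '\0' */
    p->msg = malloc(len);
    if (p->msg == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->msg, msg, len);
    p->argc = 3;
    p->argv[0] = "PythonPlugIn2";
    p->argv[1] = "bridge";
    p->argv[2] = "test_callsign";
    return p;
}

/* Intended to be called by the receiving thread when it has finished. */
void payload_destroy(Payload *p)
{
    if (p != NULL) {
        free(p->msg);
        free(p);
    }
}
```

With this layout the creating thread would pass the Payload pointer to the new thread and never touch it again, and the receiving thread would call payload_destroy once the Python call has returned.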

Related

Pass Python bytes to C/C++ function using swig?

How do you pass a bytes value from Python (like data loaded from a file with open('my file.dat', 'rb').read()) to a C/C++ function using swig?
When I try using char * or uint8_t * and then a size parameter it gives me an error like this:
TypeError: in method 'processData', argument 3 of type 'char *'
I've tried using %pybuffer_mutable_binary and %pybuffer_binary but they don't seem to change the definition of the wrapper and I still get the same error.
Without your code I can't diagnose exactly what is wrong, but you most likely didn't declare the %pybuffer lines before the function definitions. If they are declared afterwards, the generated wrappers won't use them when processing the functions, which would explain why "they don't seem to change the definition of the wrapper".
Here's a working example. Note that passing an immutable object to a function that modifies the string will crash Python. It would be nice if the macros in pybuffer.i type-checked the Python object for mutability; if you need that check, don't use pybuffer.i.
test.i
%module test
%{
#include <stdlib.h>
#include <string.h>
%}
%include <pybuffer.i>
%pybuffer_mutable_string(char* str1)
%pybuffer_string(const char* str2)
%pybuffer_mutable_binary(char* str3, size_t size)
%pybuffer_binary(const char* str4, size_t size)
%inline %{
void funcms(char *str1) {
strupr(str1);
}
size_t funcs(const char *str2) {
return strlen(str2);
}
void funcmb(char* str3, size_t size) {
memset(str3,'A',size);
}
size_t funcb(const char* str4, size_t size) {
size_t tmp = 0;
for(size_t i = 0; i < size; ++i)
tmp += str4[i];
return tmp % 256;
}
%}
Demo:
>>> import test
>>> b=bytearray(b'abc') # mutable string (nul-terminated)
>>> test.funcms(b)
>>> b
bytearray(b'ABC')
>>> test.funcs(b'abc') # immutable string (nul-terminated)
3
>>> b=bytearray(b'ab\0cd\0ef') # mutable data (includes nulls)
>>> test.funcmb(b)
>>> b
bytearray(b'AAAAAAAA')
>>> test.funcb(b'ab\0cd\0ef') # immutable data (compute byte checksum)
85
>>> sum(b'ab\0cd\0ef')%256 # verify result
85
I think the best way to do this is a type map using the Python buffer interface. This passes a pointer to your data to the C/C++ function without any copying of data. For example:
%typemap(in, numinputs=1) (const char *data, unsigned long int size) {
Py_buffer view;
if (PyObject_CheckBuffer($input) != 1) {
PyErr_SetString(
PyExc_TypeError,
"in method '$symname', argument $argnum does not support the buffer interface");
SWIG_fail;
}
if (PyObject_GetBuffer($input, &view, PyBUF_SIMPLE) != 0) {
PyErr_SetString(
PyExc_TypeError,
"in method '$symname', argument $argnum does not export a simple buffer");
SWIG_fail;
}
$1 = view.buf;
$2 = view.len;
PyBuffer_Release(&view);
}
%typemap(doc) const char *data, unsigned long int size "$1_name: readable buffer (e.g. bytes)"

glibc rand() doesn't work with python but works fine in online compiler

I'm trying to embed the glibc rand() function into Python. My purpose is to predict the next values of rand() based on the assumption that it uses an LCG. I've read that it only uses an LCG if it's operating on an 8-byte state, so I'm trying to use the initstate function to set that.
I have the following code in my glibc_random.c file:
#include <stdlib.h>
#include "glibc_random.h"
void initialize()
{
unsigned int seed = 1;
char state[8];
initstate(seed, state, sizeof(state));
}
long int call_glibc_random()
{
long r = rand();
return r;
}
And the following in the respective glibc_random.h:
void initialize();
long int call_glibc_random();
Code in python:
def test():
glibc_random.initialize()
number_of_initial_values = 10
number_of_values_to_predict = 5
initial_values = []
for i in range(number_of_initial_values):
initial_values.extend([glibc_random.call_glibc_random()])
When invoked in python, the code above keeps adding 12345 to my list of initial_values. However, when running the C code in www.onlinegdb.com I get a more reasonable list of numbers (11035275900, 3774015750, etc.). I can only reproduce my problem in onlinegdb when I use setstate(state) after the call to initstate(seed, state, sizeof(state)) in the initialize() method.
Can anybody suggest what is wrong here? I'm using swig and python2.7, btw.
I have never used initstate before but
void initialize()
{
unsigned int seed = 1;
char state[8];
initstate(seed, state, sizeof(state));
}
seems wrong to me. state is a local variable of initialize, and when the
function ends the variable ceases to exist, so rand() might give you garbage
because it is accessing the state through a pointer that is no longer valid.
You can declare state as static so that it doesn't cease to exist when
initialize ends,
void initialize()
{
unsigned int seed = 1;
static char state[8];
initstate(seed, state, sizeof(state));
}
or make state a global variable.
char state[8];
void initialize()
{
unsigned int seed = 1;
initstate(seed, state, sizeof(state));
}
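Once the state buffer stays alive, the prediction the question is after can be sketched without calling rand() at all. The snippet below assumes that glibc's generator for an 8-byte state is the plain LCG with multiplier 1103515245, increment 12345 and a 31-bit mask; that matches the values quoted in the question for seed 1, but treat it as an assumption to verify against your glibc version. lcg_next and lcg_predict are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of the assumed 8-byte-state generator:
   next = (prev * 1103515245 + 12345) mod 2^31. */
uint32_t lcg_next(uint32_t prev)
{
    return (prev * 1103515245u + 12345u) & 0x7fffffffu;
}

/* Fill out[0..n-1] with the values predicted to follow `last`,
   the most recent value observed from rand(). */
void lcg_predict(uint32_t last, uint32_t *out, size_t n)
{
    assert(out != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        last = lcg_next(last);
        out[i] = last;
    }
}
```

Under this assumption, a previous state of 0 yields exactly 12345, which is consistent with rand() reading a state buffer that has been overwritten after initialize() returned.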

Running python script that should accepts arguments from a C program

How can I pass arguments to a Python script from a C program, where the arguments must be supplied when the script is called from inside the C program?
This C code runs the Python script successfully, but how can I also pass arguments that the Python script can accept?
#include <stdio.h>
#include <string.h>
#include <python2.7/Python.h>
#include <getopt.h>
int main (int argc, char * argv[])
{
char command[50] = "python2.7 /alok/analyze.py";
system(command);return(0);
}
From the comments, I see that your real problem is how to build one string from two given strings.
What you can do is write a function that concatenates two strings into one.
For that you need to get the lengths of both strings, add them (plus 1 for the terminating '\0' byte, checking for overflow), use malloc() to reserve buffer space for the new string, and copy both strings into this buffer.
You can do it like this (do not just use this as-is; it is not well tested and the error handling is minimal):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void die(const char *msg)
{
fprintf(stderr,"[ERROR] %s\n",msg);
exit(EXIT_FAILURE);
}
char *catString(const char *a, const char *b)
{
//calculate the buffer length we need
size_t lena = strlen(a);
size_t lenb = strlen(b);
size_t lenTot = lena+lenb+1; //need 1 extra for the end-of-string '\0'
if(lenTot<lena) //check for overflow
{
die("size_t overflow");
}
//reserve memory
char *buffer = malloc(lenTot);
if(!buffer) //check if malloc fail
{
die("malloc fail");
}
strcpy(buffer,a); //copy string a to the buffer
strcpy(&buffer[lena],b);//copy string b to the buffer
return buffer;
}
After this you can use this function to create the string you need from your static string "python2.7 ./myScript " and argv[1]
int main(int argc, char **argv)
{
//without a argument we should not call the python script
if(argc<2)
{
die("need at least one argument");
}
//make one string to call system()
char *combined = catString("python2.7 ./myScript ",argv[1]);
printf("DEBUG complete string is '%s'\n",combined);
int i = system(combined);
//we must free the buffer after use it or we generate memory leaks
free(combined);
if(i<0)
{
die("system()-call failed");
}
printf("DEBUG returned from system()-call\n");
return EXIT_SUCCESS;
}
You need the trailing space in "python2.7 ./myScript "; without it you would get "python2.7 ./myScriptArgumentToMain".
Also note that the caller can execute any command they like, because argv[1] is not escaped: calling your program as yourProgram "argumentToPython ; badProgram argumentToBadProgram" will also execute badProgram, which in most cases you do not want.

What's the difference between tp_clear, tp_dealloc and tp_free?

I have a custom Python module for fuzzy string search that implements Levenshtein distance calculation. It contains a Python type called levtree with two members: a pointer to a wlevtree C type (called tree), which does all the calculations, and a PyObject* (called wordlist) pointing to a Python list of Python strings. Here is what I need:
-When I create a new instance of levtree, I use a constructor that takes a tuple of strings as its only input (the dictionary in which the instance will perform all its searches). This constructor has to create a new wordlist in the new levtree instance and copy the content of the input tuple into it. Here is my first code snippet and my first question:
static int
wlevtree_python_init(wlevtree_wlevtree_obj *self, PyObject *args, PyObject *kwds)
{
int numLines; /* how many lines we passed for parsing */
wchar_t** carg; /* argument to pass to the C function*/
unsigned i;
PyObject * strObj; /* one string in the list */
PyObject* intuple;
/* the O! parses for a Python object (listObj) checked
to be of type PyList_Type */
if (!(PyArg_ParseTuple(args, "O!", &PyTuple_Type, &intuple)))
{
return -1;
}
/* get the number of lines passed to us */
numLines = PyTuple_Size(intuple);
carg = malloc(sizeof(char*)*numLines);
/* should raise an error here. */
if (numLines < 0)
{
return -1; /* Not a list */
}
self->wordlist = PyList_New(numLines);
Py_IncRef(self->wordlist);
for(i=0; i<numLines; i++)
{
strObj = PyTuple_GetItem(intuple, i);
//PyList_Append(self->wordlist, string);
PyList_SetItem(self->wordlist, i, strObj);
Py_IncRef(strObj);
}
/* iterate over items of the list, grabbing strings, and parsing
for numbers */
for (i=0; i<numLines; i++)
{
/* grab the string object from the next element of the list */
strObj = PyList_GetItem(self->wordlist, i); /* Can't fail */
/* make it a string */
if(PyUnicode_Check(strObj))
{
carg[i] = PyUnicode_AsUnicode( strObj );
if(PyErr_Occurred())
{
return -1;
}
}
else
{
strObj = PyUnicode_FromEncodedObject(strObj,NULL,NULL);
if(PyErr_Occurred())
{
return -1;
}
carg[i] = PyUnicode_AsUnicode( strObj );
}
}
self->tree = (wlevtree*) malloc(sizeof(wlevtree));
wlevtree_init(self->tree,carg,numLines);
free(carg);
return 0;
}
Do I have to call Py_IncRef(self->wordlist); after self->wordlist = PyList_New(numLines);, or is it redundant because the reference count is already incremented in PyList_New?
I have the same doubt about PyList_SetItem(self->wordlist, i, strObj); and Py_IncRef(strObj);.
-When I destroy an instance of levtree, I want to call the C function that frees the space occupied by tree, destroy wordlist, and decrement the reference count of all the strings contained in wordlist. Here is my tp_dealloc:
static void
wlevtree_dealloc(wlevtree_wlevtree_obj* self)
{
//wlevtree_clear(self);
if(self->tree!=NULL)
{
wlevtree_free(self->tree);
}
free(self->tree);
PyObject *tmp, *strObj;
unsigned i;
int size = PyList_Size(self->wordlist);
for(i=0; i<size; i++)
{
strObj = PyList_GetItem(self->wordlist, i);
Py_CLEAR(strObj);
}
Py_CLEAR(self->wordlist);
Py_TYPE(self)->tp_free((PyObject *)self);
}
Is it correct to make all the deallocation work here?
At the moment I don't have a tp_clear and a tp_free, do I need them?
My code at the moment works on allocation but not on deallocation because even though I can call init on the same python variable more than once, at the end of every python script (which works correctly) I get a "Segmentation Fault" which makes me think that something in the deallocation process goes wrong..
tp_clear is only needed if you implement cyclic garbage collection. It appears that this is not needed because you only maintain references to Python unicode objects.
tp_dealloc is called when the reference count of the object goes down to zero. This is where you destroy the object and its members. It should then free the memory occupied by the object by calling tp_free.
tp_free is where the memory for the object is freed. Implement this only if you implement tp_alloc yourself.
The reason for the separation between tp_dealloc and tp_free is that if your type is subclassed, then only the subclass knows how the memory was allocated and how to properly free the memory.
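If your type is a subclass of an existing type, your tp_dealloc may need to call the tp_dealloc of the base class, but that depends on the details of the case.
To summarize, tp_dealloc is the right place for this cleanup, but the reference counting needs care. PyList_New already returns a new reference, so the extra Py_IncRef(self->wordlist) keeps the list alive forever. Py_IncRef(strObj) is needed, because PyTuple_GetItem returns a borrowed reference and PyList_SetItem steals one. PyList_GetItem also returns borrowed references, so calling Py_CLEAR on each item in tp_dealloc releases references that the function does not own; Py_CLEAR(self->wordlist) alone is enough, because the list releases its items when it is destroyed. You also leak carg when exiting the init function with an error.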

OUT argument internally allocated to return an array of structs

I'm new to swig and I have the following function which I can't get working:
int get_list(IN const char * string, OUT struct entry ** results);
where struct entry is defined:
struct entry
{
char * addr_str;
char cc[2];
};
The entry struct is properly converted to a Python class.
I googled but couldn't find any explanation I could use.
I want to make it return a tuple of (original get_list int return value, Python list of entry Python objects based on the results buffer), but I don't know how to convert the C entry to a Python object in the argout code snippet.
I've managed to get thus far:
%typemap(argout) struct entry **
{
PyObject *o = PyList_New(0);
int i;
for(i=0; $1[i] ; i++)
{
PyList_Append(o, SWIG_HOW_TO_CONVERT_TO_PYOBJECT($1[i]));
}
$result = o;
}
What should I replace SWIG_HOW_TO_CONVERT_TO_PYOBJECT with?
The results argument is supposed to be a pointer to a (struct entry *), set to NULL before calling get_list, and should be set to an allocated array of struct entry * pointers. Maybe a small wrapper function could make that easier?
The struct entry array is allocated within the C function using malloc, after calculating (inside get_list) how many elements are needed, and ends with a NULL pointer to indicate the end of the array.
I'd also like to make sure it's freed somewhere :)
Thanks!
This should at least give you a starting point that works. I still wasn't sure how the data was returned, since to return an array of pointers so that the final one was NULL I'd think you'd need a struct entry ***, so I just set addr_str = NULL on the last one as a sentinel, and just put some dummy data partially based on the input string into the fields. Modify as needed to suit your needs:
%module example
// Insert the structure definition and function to wrap into the wrapper code.
%{
#include <stdlib.h>
#include <string.h>
struct entry {
char* addr_str;
char cc[2];
};
int get_list(const char* string, struct entry** results)
{
*results = malloc(3 * sizeof(struct entry));
(*results)[0].addr_str = malloc(10);
strcpy((*results)[0].addr_str,"hello");
(*results)[0].cc[0] = string[0];
(*results)[0].cc[1] = string[1];
(*results)[1].addr_str = malloc(10);
strcpy((*results)[1].addr_str,"there");
(*results)[1].cc[0] = string[2];
(*results)[1].cc[1] = string[3];
(*results)[2].addr_str = NULL;
return 0;
}
%}
%include <typemaps.i>
// Define the structure for SWIG
struct entry {
char* addr_str;
char cc[2];
};
// Define a set of typemaps to be used for an output parameter.
// This typemap suppresses requiring the parameter as an input.
// A temp variable is created and passed instead.
%typemap(in,numinputs=0) struct entry **OUTPUT (struct entry* temp) %{
$1 = &temp;
%}
// Build a list of tuples containing the two entries from the struct.
// Append the new Python list object to the existing "int" result.
%typemap(argout) struct entry **OUTPUT {
int i = 0;
PyObject* out = PyList_New(0);
while((*$1)[i].addr_str != NULL)
{
//PyObject* t = PyTuple_New(2);
//PyTuple_SET_ITEM(t,0,PyBytes_FromString((*$1)[i].addr_str));
//PyTuple_SET_ITEM(t,1,PyBytes_FromStringAndSize((*$1)[i].cc,2));
//PyList_Append(out,t);
//Py_DECREF(t);
PyObject* s = SWIG_NewPointerObj(*$1+i,$descriptor(struct entry*),0);
PyList_Append(out,s);
Py_DECREF(s);
++i;
}
$result = SWIG_AppendOutput($result,out);
}
// Since a Python object was created and the data copied for each entry struct,
// free the memory returned in the structure.
//%typemap(freearg) struct entry **OUTPUT {
// int i=0;
// while((*$1)[i].addr_str != NULL) {
// free((*$1)[i].addr_str);
// ++i;
// }
// free(*$1);
//}
// Apply the OUTPUT typemap set to the "results" parameter.
%apply struct entry **OUTPUT {struct entry** results};
// Finally, define the function for SWIG
int get_list(const char* string, struct entry** results);
Demo (Python 3.3):
>>> import example
>>> example.get_list('abcd')
[0, [(b'hello', b'ab'), (b'there', b'cd')]]
Hope that helps.
Edit:
I commented out the tuple creation and just save the entry* proxy instead. This doesn't leak Python objects, but the memory malloced for use by an entry* is not freed. I'm not sure where to put that, although I'm experimenting with %extend.
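For the unfreed memory, a small C helper that walks the sentinel-terminated array is one possible starting point. This is a sketch that assumes the layout used by the example get_list above (each addr_str malloc'ed, last element marked by addr_str == NULL). free_entry_list is a hypothetical name, and where to call it from (for instance an explicit cleanup function exposed to Python) is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct entry {
    char *addr_str;
    char cc[2];
};

/* Free an array laid out like the one the example get_list() returns:
   every addr_str is malloc'ed and the final element has addr_str == NULL.
   *plist is set to NULL afterwards so a second call is harmless.
   Returns the number of real entries that were released. */
size_t free_entry_list(struct entry **plist)
{
    assert(plist != NULL);
    struct entry *list = *plist;
    size_t n = 0;
    if (list == NULL)
        return 0;
    while (list[n].addr_str != NULL) {
        free(list[n].addr_str);
        n++;
    }
    free(list);
    *plist = NULL;
    return n;
}
```

Any Python proxy objects that still point into the array become invalid after this call, so it should only run once they are no longer in use.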
