Following the question here: Writing a Python script to print out an array of recs in lldb
I would like to be able to create a type summary for an array of a given struct in lldb. The problem is that I cannot access the array correctly through python-lldb: some of the data is incorrect.
I have the following test code in C:
#include <stdio.h>
#include <stdlib.h>
struct Buffer
{
struct Buffer* next;
struct Buffer* prev;
};
struct Base
{
struct Buffer* buffers;
int count;
};
void fill(struct Buffer* buf, int count)
{
for (int i = 0; i < count; ++i)
{
struct Buffer t = {(void*)0xdeadbeef,(void*)i};
buf[i] = t;
}
}
void foo(struct Base* base)
{
printf("break here\n");
}
int main(int argc, char** argv)
{
int c = 20;
void* buf = malloc(sizeof (struct Buffer) * c);
struct Base base = {.buffers = buf, .count = c};
fill(base.buffers, base.count);
foo(&base);
return 0;
}
In lldb:
(lldb) b foo
(lldb) r
(lldb) script
>>> debugger=lldb.debugger
>>> target=debugger.GetSelectedTarget()
>>> frame=lldb.frame
>>> base=frame.FindVariable('base')
>>> buffers=base.GetChildMemberWithName('buffers')
Now, buffers should point to array of struct Buffer and I should be able to access each and every Buffer via the buffers.GetChildAtIndex function, but the data is corrupted in the first 2 items.
>>> print buffers.GetChildAtIndex(0,0,1)
(Buffer *) next = 0x00000000deadbeef
>>> print buffers.GetChildAtIndex(1,0,1)
(Buffer *) prev = 0x0000000000000000
>>> print buffers.GetChildAtIndex(2,0,1)
(Buffer) [2] = {
next = 0x00000000deadbeef
prev = 0x0000000000000002
}
Only the buffers[2] and up items are ok.
Why does print buffers.GetChildAtIndex(1,0,1) point to the buffers[0].prev member instead of buffers[1]?
What am I doing wrong?
GetChildAtIndex is trying to be a little over-helpful for your purposes here. It is in accord with the help, which says:
Pointers differ depending on what they point to. If the pointer
points to a simple type, the child at index zero
is the only child value available, unless synthetic_allowed
is true, in which case the pointer will be used as an array
and can create 'synthetic' child values using positive or
negative indexes. If the pointer points to an aggregate type
(an array, class, union, struct), then the pointee is
transparently skipped and any children are going to be the indexes
of the child values within the aggregate type. For example if
we have a 'Point' type and we have a SBValue that contains a
pointer to a 'Point' type, then the child at index zero will be
the 'x' member, and the child at index 1 will be the 'y' member
(the child at index zero won't be a 'Point' instance).
So really, buffers.GetChildAtIndex(2,0,1) should have returned "No Value". Either that or passing 1 for the allow-synthetic argument should turn off this peek-through behavior. In either case, this is a bug, please file it with http://bugreporter.apple.com.
In the meantime you should be able to get the same effect by walking your array by hand and using SBTarget.CreateValueFromAddress to create the values. Start by getting the address of the array with buffers.GetAddress(), and the size of a Buffer by getting the type of buffers, getting its pointee type, and calling GetByteSize on that. Then step the address forward by that size, count times, to create all the values.
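The walk described above might look roughly like the following. This is only a sketch: walk_pointer_array is a made-up helper name, and it reads the address stored in the pointer with GetValueAsUnsigned() (rather than GetAddress()) on the assumption that the pointee address is what is wanted. It only calls methods on the objects passed in, so it needs no import inside the lldb script interpreter.

```python
def walk_pointer_array(target, pointer_value, count):
    """Return one value per element of the array that pointer_value points to.

    target        -- an lldb.SBTarget
    pointer_value -- an lldb.SBValue of pointer type (e.g. `buffers`)
    count         -- number of elements to read (assumed known, e.g. base.count)
    """
    elem_type = pointer_value.GetType().GetPointeeType()  # e.g. struct Buffer
    elem_size = elem_type.GetByteSize()
    base = pointer_value.GetValueAsUnsigned()  # address stored in the pointer
    values = []
    for i in range(count):
        # Turn the raw load address into an SBAddress for this target.
        addr = target.ResolveLoadAddress(base + i * elem_size)
        values.append(target.CreateValueFromAddress("[%d]" % i, addr, elem_type))
    return values

# Possible use from the (lldb) script prompt, after the session in the question:
#   n = base.GetChildMemberWithName('count').GetValueAsSigned()
#   for v in walk_pointer_array(target, buffers, n):
#       print(v)
```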
I have created a Python class with an attribute that is a Swig object (which happens to be a wrapper of a C structure). I want to be able to create copies of that class, e.g., by defining a __copy__ method, that contain independent copies of the Swig object (using the copy module's copy function just creates a pointer to the original object, and deepcopy fails).
Does anyone know if you can just copy chunks of memory in Python, and use this to copy the attribute containing the Swig object? Or, could I create a __copy__ or __deepcopy__ method in the Swig interface file that created the Swig object, which is able to use C's memcpy?
From looking at the __deepcopy__ implemented in the Swig interface for LAL, finding the Swig macros for allocating and deallocating memory, and looking at my own(!) example of extending the Swig interface to a C structure, I have figured out how to create a __deepcopy__ method for the Swig-wrapped structure.
Repeating my gist, and extending it to add a __deepcopy__ method is as follows:
Say you have some C code containing a structure like this:
/* testswig.h file */
#include <stdlib.h>
#include <stdio.h>
typedef struct tagteststruct{
double *data;
size_t len;
} teststruct;
teststruct *CreateStruct(size_t len);
where the structure will contain a data array of length len. The function CreateStruct() allocates
memory for an instantiation of the structure, and is defined as
/* testswig.c file */
#include "testswig.h"
/* function for allocating memory for test struct */
teststruct *CreateStruct(size_t len){
teststruct *ts = NULL;
ts = (teststruct *)malloc(sizeof(teststruct));
ts->data = (double *)malloc(sizeof(double)*len);
ts->len = len;
return ts;
}
If you wrap this with SWIG for use in python, then it might be useful to have some python list-like methods available to,
e.g., add or get items from the data array. To do this you can create the following SWIG interface file:
/* testswig.i */
%module testswig
%include "exception.i"
%{
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include "testswig.h"
static int teststructErr = 0; // flag to save test struct error state
%}
%include "testswig.h"
// set exception handling for __getitem__
%exception tagteststruct::__getitem__ {
assert(!teststructErr);
$action
if ( teststructErr ){
teststructErr = 0; // clear flag for next time
SWIG_exception(SWIG_IndexError, "Index out of bounds");
}
}
// set exception handling for __setitem__
%exception tagteststruct::__setitem__ {
assert(!teststructErr);
$action
if ( teststructErr ){
teststructErr = 0; // clear flag for next time
SWIG_exception(SWIG_IndexError, "Index out of bounds");
}
}
// set exception handling for insert()
%exception tagteststruct::insert {
assert(!teststructErr);
$action
if ( teststructErr ){
teststructErr = 0; // clear flag for next time
SWIG_exception(SWIG_IndexError, "Index out of bounds");
}
}
// "extend" the structure with various methods
%extend tagteststruct{
// add a __getitem__ method to the structure to get values from the data array
double __getitem__(size_t i) {
if (i >= $self->len) {
teststructErr = 1;
return 0;
}
return $self->data[i];
}
// add a __setitem__ method to the structure to set values in the data array
void __setitem__(size_t i, double value) {
if ( i >= $self->len ){
teststructErr = 1;
return;
}
$self->data[i] = value;
}
size_t __len__(){
return $self->len;
}
void insert(size_t i, double value) {
if ( i >= $self->len ){
teststructErr = 1;
return;
}
$self->data[i] = value;
}
%typemap(in, noblock=1) const void *memo "";
struct tagteststruct * __deepcopy__(const void *memo) {
// copy structure
struct tagteststruct * scopy = %new_copy(*$self, struct tagteststruct);
// copy array within the structure
scopy->data = %new_copy_array($self->data, $self->len, double);
return scopy;
}
%clear const void *memo;
}
In the above example, it adds the following methods to the structure:
__getitem__: this allows the structure's data array to be accessed like a list item in python, e.g., using x[0] returns the value in teststruct->data[0]
__setitem__: this allows the structure's data array values to be set like a list item in python, e.g., using x[0] = 1.2 sets the value in teststruct->data[0]
__len__: this returns the length of the data array when using len(x)
insert(): this sets the value at a particular index in the array, like __setitem__
__deepcopy__: this allows the use of deepcopy to create a copy of the structure.
The example also shows how to perform some exception checking for these methods, in particular, making sure the requested index does not exceed the size of the array.
To compile and use this example, you could do the following (see, e.g., SWIG's tutorial):
$ swig -python testswig.i
$ gcc -c testswig.c testswig_wrap.c -fPIC -I/usr/include/python2.7
$ ld -shared testswig.o testswig_wrap.o -o _testswig.so
where, in this case, the -I/usr/include/python2.7 flag points to the path containing the Python.h file. The
testswig_wrap.c file is generated by the swig command.
The structure can then be used in python as in the following example:
>>> from testswig import CreateStruct
>>> # create an instance of the structure with 10 elements
>>> x = CreateStruct(10)
>>> # set the 5th element of the data array to 1.3
>>> x[4] = 1.3
>>> # output the 5th element of the array
>>> print(x[4])
1.3
>>> # output the length of the array
>>> print(len(x))
10
>>> # create a copy
>>> import copy
>>> y = copy.deepcopy(x)
>>> print(len(y))
10
>>> print(y[4])
1.3
>>> y[4] = 3.4
>>> print(y[4])
3.4
>>> print(x[4]) # check x hasn't been altered
1.3
The Swig-wrapped structure could itself be an attribute of a class, e.g.:
from testswig import CreateStruct
class mystruct():
def __init__(self, size):
self.array = CreateStruct(size)
self.name = 'array'
def __len__(self):
return len(self.array)
def __getitem__(self, idx):
return self.array[idx]
def __setitem__(self, idx, val):
self.array[idx] = val
which we can test:
>>> x = mystruct(10)
>>> x[4] = 1.2
>>> print(x[4])
1.2
>>> import copy
>>> y = copy.deepcopy(x)
>>> print(y[4])
1.2
>>> y[4] = 3.4
>>> print(y[4])
3.4
>>> print(x[4]) # check it hasn't changed
1.2
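If you want explicit control over how the containing class is copied, rather than relying on the default deepcopy behaviour, a sketch along these lines may help. FakeStruct is a hypothetical pure-Python stand-in for the Swig-wrapped teststruct (so the example runs without building the module), and MyStruct is an illustrative name, not part of the code above.

```python
import copy

class FakeStruct:
    """Hypothetical stand-in for the Swig-wrapped teststruct."""
    def __init__(self, size):
        self.data = [0.0] * size
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.data[idx]
    def __setitem__(self, idx, val):
        self.data[idx] = val
    def __deepcopy__(self, memo):
        # Mirrors what the Swig __deepcopy__ does: copy the struct and its array.
        dup = FakeStruct(len(self.data))
        dup.data = list(self.data)
        return dup

class MyStruct:
    def __init__(self, size):
        self.array = FakeStruct(size)
        self.name = 'array'
    def __deepcopy__(self, memo):
        # Build the copy without re-running __init__, then copy each attribute.
        dup = MyStruct.__new__(MyStruct)
        memo[id(self)] = dup
        dup.array = copy.deepcopy(self.array, memo)  # calls FakeStruct.__deepcopy__
        dup.name = self.name
        return dup

x = MyStruct(10)
x.array[4] = 1.2
y = copy.deepcopy(x)
y.array[4] = 3.4
print(x.array[4], y.array[4])  # the two arrays are independent
```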
Suppose there is a class MyArray in C++. It implements an array of SomeType. In order to make a __getitem__ function for it in Python, I do something like this:
const SomeType& getitem(const MyArray *arr, PyObject *slice) {
// ???
}
BOOST_PYTHON_MODULE(mymodule)
{
class_<MyArray>("MyArray")
.def("__getitem__", &getitem)
// probably some other methods...
;
}
It is possible to get indices in slice by using these functions. However, "Boost::Python is designed with the idea in mind that users never touch a PyObject*".
Is there a better 'boost way' to do this?
Boost.Python is designed to minimize the need to interact with PyObject, and it often accomplishes this by:
Providing higher-level type wrappers.
Allowing access to the Python object's interface through the associated boost::python::object.
For example, one can access the Python object's interface through C++ in a similar manner as one would do in Python. The following demonstrates accessing the start attribute of a boost::python::object that refers to a Python slice instance:
namespace python = boost::python;
python::object slice = get_slice_object();
python::object start = slice.attr("start");
std::size_t start_index = !start.is_none()
? python::extract<std::size_t>(start) // Extract index.
: 0; // Default.
While this approach works, it tends to result in much boilerplate code: creating defaults when None is provided, handling zero-length slices, and converting negative indexes to positive indexes. In this case, Boost.Python provides a higher-level type wrapper boost::python::slice that has a get_indices() member-function that will remove much of the boilerplate code. Here is a complete minimal example:
#include <vector>
#include <boost/range/algorithm.hpp>
#include <boost/range/irange.hpp>
#include <boost/python.hpp>
#include <boost/python/slice.hpp>
/// @brief Mockup class that creates a range from 0 to N.
struct counter
{
counter(std::size_t n)
{
data.reserve(n);
boost::copy(boost::irange(std::size_t(0), n), std::back_inserter(data));
}
std::vector<int> data;
};
/// @brief Handle slicing for counter object.
boost::python::list spam_getitem(
const counter& self,
boost::python::slice slice)
{
namespace python = boost::python;
python::list result;
// Boost.Python will throw std::invalid_argument if the range would be
// empty.
python::slice::range<std::vector<int>::const_iterator> range;
try
{
range = slice.get_indices(self.data.begin(), self.data.end());
}
catch (const std::invalid_argument&)
{
return result;
}
// Iterate over fully-closed range.
for (; range.start != range.stop; std::advance(range.start, range.step))
{
result.append(*range.start);
}
result.append(*range.start); // Handle last item.
return result;
}
BOOST_PYTHON_MODULE(example)
{
namespace python = boost::python;
python::class_<counter>("Counter", python::init<int>())
.def("__getitem__", &spam_getitem)
;
}
Interactive usage:
>>> from example import Counter
>>> counter = Counter(5)
>>> assert(counter[:] == [0,1,2,3,4])
>>> assert(counter[:-2] == [0,1,2])
>>> assert(counter[-2:] == [3,4])
>>> assert(counter[::2] == [0,2,4])
>>> assert(counter[1::2] == [1,3])
>>> assert(counter[100:] == [])
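For comparison, the normalization that get_indices() takes care of (defaults for None, negative indexes, clamping to the sequence length) is roughly what Python's own slice.indices() does on the Python side. The sketch below uses only that built-in; it is not Boost.Python's implementation, and normalized is a made-up helper name. Note that slice.indices() describes a half-open range, while the C++ example above iterates a closed one.

```python
def normalized(s, length):
    """Return the list of indexes that slice `s` selects in a sequence of `length` items."""
    start, stop, step = s.indices(length)  # fills in defaults, clamps, fixes negatives
    return list(range(start, stop, step))

print(normalized(slice(None, -2), 5))      # [0, 1, 2]
print(normalized(slice(-2, None), 5))      # [3, 4]
print(normalized(slice(None, None, 2), 5)) # [0, 2, 4]
print(normalized(slice(100, None), 5))     # []
```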
I have a function that returns a char pointer called loop_p and I call it many times on my main_thread like this to pass it to the py_embed thread:
HANDLE handle;
SENDTOPY *cmd=new SENDTOPY();
char* msg=loop_p(ac);
char *argv[4]={"PythonPlugIn2","bridge","test_callsign",msg};
cmd->argc=4;
for(int i = 0; i < NUM_ARGUMENTS; i++ )
{
cmd->argv[i] = argv[i];
}
handle=(HANDLE) _beginthread(py_embed,0,(void*)cmd);}
where SENDTOPY is a struct:
typedef struct{
int argc;
char *argv[4];
}SENDTOPY;
The message is sent to Python like this, and Python receives it well:
SENDTOPY *arg=(SENDTOPY*)data;
pArgs2=Py_BuildValue("(s)",arg->argv[4]);
pValue2 = PyObject_CallObject(pFunc, pArgs2);
In order to avoid memory allocation problems, I changed loop_p into a function that returns a std::string. I then use that string in the main_thread with some modifications:
...
std::string msg_python=loop_p(ac);
const char * msg2=msg_python.data();
char *argv[3]={"PythonPlugIn2","bridge","test_callsign"};
cmd->argc=3;
cmd->msg=msg2;
for(...
...
and I modify the struct SENDTOPY to this:
typedef struct{
int argc;
char *argv[3];
const char* msg;
}SENDTOPY;
I print it to a text file in the main_thread, and the message is the same before and after the modifications. But in the py_embed thread the const char* is no longer what it was; it is just a bunch of gibberish. What am I doing wrong?
Thank you in advance.
Edit:
loop_p code
std::string CNewDisplay::loop_p(int ac){
std::string res("Number of Aircrafts\nHour of simulation\n\n");
for (...
....
//Route
textfile<<fp.GetRoute()<<endl;
std::string route=fp.GetRoute();
std::replace(route.begin(),route.end(),' ',',');
res+=route;
res.append(",\n");
res.append("\n\n");
};
return res;
}
It appears to me that you are storing a pointer to the internal buffer of a local string object created on the stack. Once that string goes out of scope it is destroyed and the pointer dangles, which would explain the gibberish seen in the py_embed thread. If you make the string static, its buffer will remain valid throughout program execution, and you can safely store a pointer to it:
static std::string msg_python; // survives beyond local scope
msg_python=loop_p(ac); // set string to loop_p return value
const char *msg2=msg_python.c_str(); // get ptr each time since it could change
Also, use .c_str() to get your C-style string pointer so that you are assured the string is null-terminated. Before C++11, .data() did not guarantee null termination.
I'm new to SWIG and I have the following function, which I can't get to work:
int get_list(IN const char * string, OUT struct entry ** results);
where struct entry is defined:
struct entry
{
char * addr_str;
char cc[2];
};
The entry struct is properly converted to a Python class.
I googled but couldn't find any explanation I could use.
I want to make it return a tuple of (the original get_list int return value, a Python list of entry objects built from the results buffer), but I don't know how to convert the C entry to a Python object in the argout code snippet.
I've managed to get thus far:
%typemap(argout) struct entry **
{
PyObject *o = PyList_New(0);
int i;
for(i=0; $1[i] ; i++)
{
PyList_Append(o, SWIG_HOW_TO_CONVERT_TO_PYOBJECT($1[i]));
}
$result = o;
}
What should I replace SWIG_HOW_TO_CONVERT_TO_PYOBJECT with?
The results argument is supposed to be a pointer to a (struct entry *), set to NULL before calling get_list, and get_list should set it to an allocated array of struct entry * pointers. Maybe a small wrapper function could make that easier?
The struct entry array is allocated within the C function using malloc, after calculating (inside get_list) how many elements are needed, and ends with a NULL pointer to indicate the end of the array.
I'd also like to make sure it's freed somewhere :)
Thanks!
This should at least give you a starting point that works. I still wasn't sure how the data was returned, since to return an array of pointers so that the final one was NULL I'd think you'd need a struct entry ***, so I just set addr_str = NULL on the last one as a sentinel, and just put some dummy data partially based on the input string into the fields. Modify as needed to suit your needs:
%module example
// Insert the structure definition and function to wrap into the wrapper code.
%{
#include <stdlib.h>
#include <string.h>
struct entry {
char* addr_str;
char cc[2];
};
int get_list(const char* string, struct entry** results)
{
*results = malloc(3 * sizeof(struct entry));
(*results)[0].addr_str = malloc(10);
strcpy((*results)[0].addr_str,"hello");
(*results)[0].cc[0] = string[0];
(*results)[0].cc[1] = string[1];
(*results)[1].addr_str = malloc(10);
strcpy((*results)[1].addr_str,"there");
(*results)[1].cc[0] = string[2];
(*results)[1].cc[1] = string[3];
(*results)[2].addr_str = NULL;
return 0;
}
%}
%include <typemaps.i>
// Define the structure for SWIG
struct entry {
char* addr_str;
char cc[2];
};
// Define a set of typemaps to be used for an output parameter.
// This typemap suppresses requiring the parameter as an input.
// A temp variable is created and passed instead.
%typemap(in,numinputs=0) struct entry **OUTPUT (struct entry* temp) %{
$1 = &temp;
%}
// Build a list of tuples containing the two entries from the struct.
// Append the new Python list object to the existing "int" result.
%typemap(argout) struct entry **OUTPUT {
int i = 0;
PyObject* out = PyList_New(0);
while((*$1)[i].addr_str != NULL)
{
//PyObject* t = PyTuple_New(2);
//PyTuple_SET_ITEM(t,0,PyBytes_FromString((*$1)[i].addr_str));
//PyTuple_SET_ITEM(t,1,PyBytes_FromStringAndSize((*$1)[i].cc,2));
//PyList_Append(out,t);
//Py_DECREF(t);
PyObject* s = SWIG_NewPointerObj(*$1+i,$descriptor(struct entry*),0);
PyList_Append(out,s);
Py_DECREF(s);
++i;
}
$result = SWIG_AppendOutput($result,out);
}
// Since a Python object was created and the data copied for each entry struct,
// free the memory returned in the structure.
//%typemap(freearg) struct entry **OUTPUT {
// int i=0;
// while((*$1)[i].addr_str != NULL) {
// free((*$1)[i].addr_str);
// ++i;
// }
// free(*$1);
//}
// Apply the OUTPUT typemap set to the "results" parameter.
%apply struct entry **OUTPUT {struct entry** results};
// Finally, define the function for SWIG
int get_list(const char* string, struct entry** results);
Demo (Python 3.3):
>>> import example
>>> example.get_list('abcd')
[0, [(b'hello', b'ab'), (b'there', b'cd')]]
Hope that helps.
Edit:
I commented out the tuple creation and just save the entry* proxy instead. This doesn't leak Python objects, but the memory malloced for use by an entry* is not freed. I'm not sure where to put that, although I'm experimenting with %extend.
In part of the code of a C module I am integrating with Python, I have a char** (array of strings) which is repeatedly allocated, filled with allocated strings, then freed and allocated again. The general pattern is that when a certain function is called (from Python) supplying the new contents of the array (as a list), it iterates through the array of strings, freeing each of them, then frees the array itself. It then allocates the array again to hold the contents of the new Python list, then allocates memory for each of the strings to hold.
All that to say that I am getting an error when attempting to free one of the strings in the list. This error is deterministic; it is always the same word from the same list of words at the same point in the program, but there is nothing extraordinary about that word or list of words. (It is just ["CCellEnv", "18", "34"], which is a similar format to many others) I tried adding some debug code to the loop that allocates the strings; here is the function that produces the error:
static PyObject* py_set_static_line(PyObject* self, PyObject* args)
{
int i;
//Free the old values of the allocated variables, if there are any
if (numStaticWords > 0)
{
for (i = 0; i < numStaticWords; i++)
{
printf("Freeing word %d = '%s'\n", i, staticWords[i]);
free(staticWords[i]);
}
free(staticWords);
free(staticWordMatches);
}
//Parse arguments
PyObject* wordList;
unsigned short numWords;
PyObject* wordMatchesList;
if (!PyArg_ParseTuple(args, "O!HO!", &PyList_Type, &wordList, &numWords, &PyList_Type, &wordMatchesList))
return NULL;
numStaticWords = numWords;
if (numStaticWords > 0)
{
staticWords = malloc(sizeof(char*) * numStaticWords);
staticWordMatches = malloc(sizeof(int) * numStaticWords);
PyObject* wordObj;
PyObject* matchObj;
char* word;
for (i = 0; i < numStaticWords; i++)
{
//wordList is the list of strings passed from Python
wordObj = PyList_GetItem(wordList, i);
word = PyString_AsString(wordObj); //word is "18" in the failing case
//staticWords is the char** array of strings, which has already been malloc'd
staticWords[i] = malloc(sizeof(char) * strlen(word));
//Test freeing the word to see if it crashes
free(staticWords[i]); //Crashes for one specific word
staticWords[i] = malloc(sizeof(char) * strlen(word));
strcpy(staticWords[i], word);
matchObj = PyList_GetItem(wordMatchesList, i);
if (matchObj == Py_None)
{
staticWordMatches[i] = -1;
}
else
{
staticWordMatches[i] = PyInt_AsLong(matchObj);
}
}
}
Py_RETURN_NONE;
}
So, somehow, always and only for this specific string, allocating the memory to put it in, then immediately freeing that memory causes an error. The actual text of the string is not even copied into the memory. What could be causing this mysterious behavior?
Here
staticWords[i] = malloc(sizeof(char) * strlen(word));
strcpy(staticWords[i], word);
you are not allocating room for the terminating 0 byte of the string. strcpy() then writes one byte past the end of the allocation, which is undefined behaviour and can corrupt the heap, so a later malloc() or free() may crash.
Do it this way:
{
int isNull = !word;
staticWords[i] = calloc((isNull ?0 :strlen(word)) + 1, sizeof(*staticWords[i]));
strcpy(staticWords[i], isNull ?"" :word);
}
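To see the extra byte from the Python side, here is a small sketch using the standard ctypes module (not part of the original extension code). ctypes.create_string_buffer() sizes the buffer one byte larger than the bytes object it is given, so that there is room for the terminator, which is the same len + 1 rule the C code needs to follow.

```python
import ctypes

word = b"18"  # the word from the failing case
buf = ctypes.create_string_buffer(word)  # allocates len(word) + 1 bytes

print(len(word))           # 2 characters
print(ctypes.sizeof(buf))  # 3 bytes: the characters plus the terminating 0
print(buf.raw)             # b'18\x00'
```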