I have statically declared a large structure in C, but I need to use this same data to do some analysis in Python. I'd rather not re-copy this data into Python, to avoid errors. Is there a way to access this data (read-only) directly from Python? I have looked at ctypes and SWIG, and neither one seems to provide what I'm looking for.
For example I have:
/* .h file */
typedef struct
{
    double data[10];
} NestedStruct;
typedef struct
{
    NestedStruct array[10];
} MyStruct;
/* .c file */
MyStruct the_data_i_want =
{
    {                     /* array */
        { {0} },          /* array[0] */
        { {1, 2, 3, 4} }, /* array[1] */
        { {0} },          /* array[2] */
    }
};
Ideally, I'd like something that would allow me to get this into Python and access it via the_data_i_want.array[1].data[2] or something similar. Any thoughts? I got SWIG to "work" in the sense that I was able to compile and import a .so created from my .c file, but I couldn't access any of it through cvars. Maybe there's another way? It seems like this shouldn't be that hard.
Actually, I figured it out. I'm adding this because my reputation does not allow me to answer my own question within 8 hours, and since I don't want to have to remember in 8 hours I will add it now. I'm sure there's a good reason for this that I don't understand.
Figured it out.
First, I compiled my .c file into a shared library:
Then, I used ctypes to define Python classes that would hold the data:
from ctypes import *
class NestedStruct(Structure):
    _fields_ = [("data", c_double*10)]
class MyStruct(Structure):
    _fields_ = [("array", NestedStruct*10)]
Then, I loaded the shared library into python:
my_lib = cdll.LoadLibrary("my_lib.so")
Then, I used the "in_dll" method to get the data:
the_data_i_want = MyStruct.in_dll(my_lib, "the_data_i_want")
Then I could access it as if it were C: the_data_i_want.array[1].data[2]
Note I may have messed up the syntax slightly here because my actual data structure is nested 3 levels and I wanted to simplify for illustration purposes here.
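The layout-matching idea behind this answer can be tried without building a shared library. The sketch below is only an illustration: it uses the same two ctypes classes, but instead of in_dll it fills them from a byte buffer created with struct.pack, which stands in for the memory of the C object. The buffer contents are made-up test data, not the real the_data_i_want.

```python
import ctypes
import struct

class NestedStruct(ctypes.Structure):
    _fields_ = [("data", ctypes.c_double * 10)]

class MyStruct(ctypes.Structure):
    _fields_ = [("array", NestedStruct * 10)]

# Stand-in for the bytes of the C object (hypothetical test data):
# 10 x 10 doubles, all zero except array[1].data[0..3] = 1, 2, 3, 4.
values = [0.0] * 100
values[10:14] = [1.0, 2.0, 3.0, 4.0]
raw = struct.pack("=100d", *values)

# The ctypes description must match the C layout byte for byte.
assert ctypes.sizeof(MyStruct) == len(raw)

# Copy the bytes into a ctypes instance and index it like the C struct.
the_data = MyStruct.from_buffer_copy(raw)
print(the_data.array[1].data[2])
```

With a real library, MyStruct.in_dll(my_lib, "the_data_i_want") takes the place of from_buffer_copy and reads the symbol's memory in place rather than copying it.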
You could also have read the data in C and written it to a JSON file, which you could then easily parse (there's usually a library that will do that for you; in Python, import json) and access from any platform, in almost any language you can think of. You would still be able to access your data in much the same way as in your original C code.
Just a suggestion. I think this would also make your data more portable and versatile, but you'll spend more time writing and parsing the JSON than if you just read the data directly from your C code into Python.
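As a rough sketch of the Python half of this suggestion: assume the C program has written the structure out as JSON with the same field names (the text below is a hypothetical, shortened dump, not output from real code). The standard json module then gives nested dictionaries and lists that are indexed almost like the C structure.

```python
import json

# Hypothetical text that a small C dump routine might have written.
json_text = '{"array": [{"data": [0, 0, 0, 0]}, {"data": [1, 2, 3, 4]}]}'

the_data_i_want = json.loads(json_text)

# Dictionary keys replace struct members; lists replace C arrays.
value = the_data_i_want["array"][1]["data"][2]
print(value)
```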
Related
I work with Python most of the time, for some reasons now I also need to use C++.
I find Python's import XXX as X very neat in the following way, for example:
import numpy as np
a = np.array([1,2,3])
where I'm very clear by looking at my code that the array() function is provided by the numpy module.
However, when working with C++, if I do:
#include<cstdio>
std::remove(filename);
It's not clear to me at first sight that remove() function under the std namespace is provided by <cstdio>.
So I'm wondering if there is a way to do it in C++ as the import XXX as X way in Python?
Nope.
It'll be slightly clearer if you write std::remove (which you should be doing anyway; there's no guarantee that the symbol is available in the global namespace) because then at least you'll know it comes from a standard header.
Beyond that, it's up to your memory. 😊
Some people try to introduce hacks like:
namespace SomeThing {
#include <cstdio>
}
// Now it's SomeThing::std::remove
That might work for your own headers (though I'd still discourage it even then). But it'll cause all manner of chaos with standard headers for sure and is not permitted:
[using.headers]/1: The entities in the C++ standard library are defined in headers, whose contents are made available to a translation unit when it contains the appropriate #include preprocessing directive.
[using.headers]/3: A translation unit shall include a header only outside of any declaration or definition, and shall include the header lexically before the first reference in that translation unit to any of the entities declared in that header. No diagnostic is required.
Recall that #include and import are fundamentally different things. C++ modules may go some way towards this sort of functionality, perhaps, but by including source code you are not even touching namespaces of symbols created by that code.
No, there is no way to force this syntax. The author of the code you include is free to organize it however they like. Generally, people split their code into namespaces, which can result in this syntax:
#include <MyLibrary.h>
int main()
{
MyLibrary::SayHello();
return 0;
}
But you have no guarantee about how the code in the header is written.
C++ #include<XXX.h> equivalent of Python's import XXX as X
There is no equivalent in C++.
When you include a file into another, you get every single declaration from the included file, and you have no option of changing their names.
You can add aliases for types and namespaces though, and references to objects, as well as write wrapper functions to do some of what the as X part does in Python.
It's not clear to me at first sight that remove() is provided by <cstdio>.
The std namespace at least tells you that it is provided by the standard library.
What I like to do, is document which header provides the used declarations:
#include<cstdio> // std::remove
std::remove(filename);
That said, most IDEs can show you where an identifier is declared when you ctrl-click or hover over it (although this doesn't always work well when there are overloads in different headers). My primary use for inclusion comments is checking which includes can be removed after refactoring.
I've been looking for a simple answer to this question, but it seems that I can't find one. I would prefer to stay away from any external libraries that aren't already included in Python 2.6/2.7.
I have 2 C header files that resemble the following:
//constants_a.h
const double constant1 = 2.25;
const double constant2 = -0.173;
const int constant3 = 13;
...
//constants_b.h
const double constant1 = 123.25;
const double constant2 = -0.12373;
const int constant3 = 14;
...
And I have a python class that I want to import these constants into:
#pythonclass.py
class MyObject(object):
    def __init__(self, mode):
        if mode == "a":
            # import from constants_a.h, like:
            # self.constant1 = constant1
            # self.constant2 = constant2
            pass
        elif mode == "b":
            # import from constants_b.h, like:
            # self.constant1 = constant1
            # self.constant2 = constant2
            pass
        ...
I have c code which uses the constants as well, and resembles this:
//computations.c
#include <stdio.h>
#include <math.h>
#include "constants_a.h"
// do some calculations, blah blah blah
How can I import the constants from the header file into the Python class?
The reason for the header files constants_a.h and constants_b.h is that I am using python to do most of the calculations using the constants, but at one point I need to use C to do more optimized calculations. At this point I am using ctypes to wrap the c code into Python. I want to keep the constants away from the code just in case I need to update or change them, and make my code much cleaner as well. I don't know if it helps to note I am also using NumPy, but other than that, no other non-standard Python extensions. I am also open to any suggestions regarding the design or architecture of this program.
In general, defining variables in a C header file is poor style. The header file should only declare objects, leaving their definition for the appropriate ".c" source code file.
One thing you may want to do is to declare the library-global constants like extern const whatever_type_t foo; and define (or "implement") them (i.e. assigning values to them) somewhere in your C code (make sure you do this only once).
Anyway, let's ignore how you do it. Just suppose you've already defined the constants and made their symbols visible in your shared object file "libfoo.so". Let us suppose you want to access the symbol pi, defined as extern const double pi = 3.1415926; in libfoo, from your Python code.
Now you typically load your object file in Python using ctypes like this:
>>> import ctypes
>>> libfoo = ctypes.CDLL("path/to/libfoo.so")
But then you'll see ctypes thinks libfoo.pi is a function, not a symbol for constant data!
>>> libfoo.pi
<_FuncPtr object at 0x1c9c6d0>
To access its value, you have to do something rather awkward -- casting what ctypes thinks is a function back to a number.
>>> pi = ctypes.cast(libfoo.pi, ctypes.POINTER(ctypes.c_double))
>>> pi.contents.value
3.1415926
In C jargon, this vaguely corresponds to the following thing happening: You have a const double pi, but someone forces you to use it only via a function pointer:
typedef int (*view_anything_as_a_function_t)(void);
view_anything_as_a_function_t pi_view = &pi;
What do you do with the pointer pi_view in order to use the value of pi? You cast it back as a const double * and dereference it: *(const double *)(pi_view).
So this is all very awkward. Maybe I'm missing something, but I believe this is by design of the ctypes module: it's there chiefly for making foreign function calls, not for accessing "foreign" data. And exporting a pure data symbol in a loadable library is arguably rare.
And this will not work if the constants are only C macro definitions. There's in general no way you can access macro-defined data externally. They're macro-expanded at compile time, leaving no visible symbol in the generated library file, unless you also export their macro values in your C code.
I recommend using regular expressions (re module) to parse the information you want out of the files.
Building a full C parser would be huge, but if you only use the variables and the file is reasonably simple/predictable/under control, then what you need to write is straightforward.
Just watch out for 'gotcha' artifacts such as commented-out code!
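A minimal sketch of the regular-expression approach, assuming every constant is written exactly as in the question (const double or const int, one per line, with a plain decimal literal). The header text is inlined here so the example is self-contained; parse_constants and strip_comments are illustrative names, and real code would read the .h file from disk. Comments are stripped first so that commented-out definitions are ignored.

```python
import re

# Inlined stand-in for the contents of constants_a.h (illustrative).
HEADER_TEXT = """
//constants_a.h
const double constant1 = 2.25;
const double constant2 = -0.173;
// const double constant2 = 99.0;
const int constant3 = 13; /* trailing comment */
"""

# Matches lines such as: const double name = value;
DECL = re.compile(r"^\s*const\s+(double|int)\s+(\w+)\s*=\s*([^;]+);", re.MULTILINE)

def strip_comments(text):
    """Remove /* ... */ and // comments (naive: ignores string literals)."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", text)

def parse_constants(text):
    """Return a {name: value} dict for the simple declarations found."""
    converters = {"double": float, "int": int}
    constants = {}
    for ctype, name, value in DECL.findall(strip_comments(text)):
        constants[name] = converters[ctype](value.strip())
    return constants

constants = parse_constants(HEADER_TEXT)
print(constants)
```

This deliberately handles only decimal literals; suffixes such as 13L, hexadecimal values, or expressions would need extra handling.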
I would recommend using some kind of configuration file readable by both Python and C program, rather than storing constant values in headers. E.g. a simple csv, ini-file, or even your own simple format of 'key:value' pairs. And there will be no need to recompile the C program every time you'd like to change one of the values :)
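One way the Python side of this could look, as a sketch: an ini-style file with one section per mode, read with the standard configparser module (Python 3; on Python 2 the module is named ConfigParser and lacks read_string). The section names and the inlined text are assumptions made for the example; the C program would need its own small reader for the same file.

```python
import configparser

# Inlined stand-in for a shared constants.ini file (illustrative values
# taken from the two headers in the question).
INI_TEXT = """
[a]
constant1 = 2.25
constant2 = -0.173
constant3 = 13

[b]
constant1 = 123.25
constant2 = -0.12373
constant3 = 14
"""

class MyObject(object):
    def __init__(self, mode, ini_text=INI_TEXT):
        parser = configparser.ConfigParser()
        parser.read_string(ini_text)
        section = parser[mode]  # raises KeyError for an unknown mode
        self.constant1 = section.getfloat("constant1")
        self.constant2 = section.getfloat("constant2")
        self.constant3 = section.getint("constant3")

obj_a = MyObject("a")
obj_b = MyObject("b")
print(obj_a.constant1, obj_b.constant3)
```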
I'd up-vote emilio, but I'm lacking rep!
Although you have requested to avoid other non-standard libraries, you may wish to take a look at Cython (Cython: C-Extensions for Python www.cython.org/), which offers the flexibility of Python coding and the raw speed of execution of C/C++-compiled code.
This way you can use regular Python for everything, but handle the expensive parts of the code using its built-in C types. You can then convert your Python code into .c files too (or just wrap external C libraries themselves), which can then be compiled into a binary. I've achieved up to 10x speed-ups doing so for numerical routines. I also believe NumPy uses it.
How do I use a C struct foo, defined in a header file as a datatype in my Python code?
(This document does not seem to address the issue.)
typedef struct {
PyObject_HEAD
/* Type-specific fields go here. */
struct api_query query; /* instead of PyObject * type here */
} api_Request;
Building an extension module is not a trivial task (the document you linked to does explain how to do it in detail). To wrap a C structure that way, you need to define the new type and a Python object's usual methods (constructor, destructor, access methods, etc.).
You might find the ctypes package is an easier way to go about it.
Better still, if all you need to do is create the structure and send it (to a socket, as you say), and if the structure is simple enough, then the struct.pack function might prove easiest.
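A small sketch of the struct.pack route. The fields of struct api_query are not shown in the question, so the layout below (a 32-bit id, a double, and an 8-byte name) is purely hypothetical, as are the function names; the format string would have to mirror the real C struct, including its padding and byte order.

```python
import struct

# Hypothetical wire layout: little-endian, no padding,
# uint32 id + double threshold + 8-byte NUL-padded name.
QUERY_FORMAT = "<Id8s"

def pack_query(query_id, threshold, name):
    """Serialize the fields into the fixed-size binary layout."""
    return struct.pack(QUERY_FORMAT, query_id, threshold, name.encode("ascii"))

def unpack_query(payload):
    """Inverse of pack_query, stripping the NUL padding from the name."""
    query_id, threshold, raw_name = struct.unpack(QUERY_FORMAT, payload)
    return query_id, threshold, raw_name.rstrip(b"\x00").decode("ascii")

payload = pack_query(7, 0.5, "status")
print(len(payload), unpack_query(payload))
```

The resulting bytes object can be handed straight to socket.send.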
I've been searching around the web with no luck. I have the following Python code:
from ctypes import Structure, POINTER, c_char_p, c_uint32
class LED(Structure):
    _fields_ = [
        ('color', c_char_p),
        ('id', c_uint32)
    ]
class LEDConfiguration(Structure):
    _fields_ = [
        ('daemon_user', c_char_p),
        ('leds', POINTER(LED)),
        ('num_leds', c_uint32)
    ]
Here is a simplified example function that uses these structures and returns an LEDConfiguration.
def parseLedConfiguration(path, board):
    lc = LEDConfiguration()
    for config in configs:
        if( config.attributes['ID'].value.lstrip().rstrip() == board ):
            lc.daemon_user = c_char_p('some_name')
            leds = []
            #Imagine this in a loop
            ld = LED()
            ld.color = c_char_p('red')
            ld.id = int(0)
            leds.append(ld)
            #end imagined loop
            lc.num_leds = len(leds)
            lc.leds = (LED * len(leds))(*leds)
    return lc
Now this is the C code I am using (I've stripped out everything involved with setting up Python, calling the "parseLedConfiguration" function, etc., but I can add it in if it is helpful).
/*Calling the python function "parseLedConfiguration"
pValue is the returned "LEDConfiguration" python Structure*/
pValue = PyObject_CallObject(pFunc, pArgs);
Py_DECREF(pArgs);
if (pValue != NULL)
{
int i, num_leds;
PyObject *obj = PyObject_GetAttr(pValue, PyString_FromString("daemon_user"));
daemon_user = PyString_AsString(obj);
Py_DECREF(obj);
obj = PyObject_GetAttr(pValue, PyString_FromString("num_leds"));
num_leds = PyInt_AsLong(obj);
Py_DECREF(obj);
obj = PyObject_GetAttr(pValue, PyString_FromString("leds"));
PyObject_Print(obj, stdout, 0);
My problem is figuring out how to access what is returned to the final "obj". The "PyObject_Print" on the "obj" shows this output:
<ConfigurationParser.LP_LED object at 0x7f678a06fcb0>
I want to get into a state where I can access that LP_LED object in the same way I'm accessing the above "LEDConfiguration" object.
EDIT 1
I guess another maybe more important question, is my python code correct? Is that how I should be storing a list or array of "Structure" inside another "Structure" so it can be accessed from the Python C API?
Thanks!
Since your EDIT 1 clarifies the underlying question, let me put that at top:
guess another maybe more important question, is my python code correct? Is that how I should be storing a list or array of "Structure" inside another "Structure" so it can be accessed from the Python C API?
No, that's how you should be storing an array of Structure inside another Structure so it can be accessed from non-Python-C-API C code. If you want it to be accessed from the Python C API, just use a Python list.
In general, if you're writing code in both Python and C, only one side has to bend over backward to work with the other one. The point of using ctypes Structures and POINTERs and the like in Python is to allow them to work directly in C, without having to go through the C API. Conversely, the point of using functions like PyList_GetItem is to allow you to use normal Python code, not ctypes Python code.
So, if you want to store a list inside a Structure to be accessed via the Python C API, just store a Python list—and you really don't need the Structure in the first place; use a normal Python class (possibly with __slots__). You can write this code without importing ctypes at all.
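A sketch of what the ctypes-free version could look like, using plain classes with __slots__ and an ordinary list. The constructor signatures, the snake_case function name, and the sample values are illustrative choices, not taken from the question.

```python
class LED(object):
    __slots__ = ("color", "id")

    def __init__(self, color, led_id):
        self.color = color
        self.id = led_id

class LEDConfiguration(object):
    __slots__ = ("daemon_user", "leds")

    def __init__(self, daemon_user, leds):
        self.daemon_user = daemon_user
        self.leds = list(leds)  # an ordinary Python list of LED objects

    @property
    def num_leds(self):
        # Derived from the list, so it cannot get out of sync.
        return len(self.leds)

def parse_led_configuration():
    # Stand-in for the real parsing loop.
    return LEDConfiguration("some_name", [LED("red", 0), LED("green", 1)])

lc = parse_led_configuration()
print(lc.daemon_user, lc.num_leds, lc.leds[0].color)
```

On the C side, such an object can be read with PyObject_GetAttrString for the attributes and PyList_Size / PyList_GetItem for the leds list.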
Conversely, if you want to store structures that can be used directly in C, you can do that with ctypes; then, in the C code, once you've gotten into Structure guts of the PyObject * you don't need the Python API anymore, because the structure is all C. This is usually the way you go when you have existing C code, and want to interface with it from Python, rather than when you're designing the C code from scratch, but there's no rule that says you can't use it the other way.
Meanwhile, if this is your first attempt at writing C and Python code that talk to each other, I'd suggest you use Cython. Then, once you're comfortable with that, if you want to learn ctypes, do a different project that uses Python with ctypes to talk to C code that knows nothing at all about Python. And then, a third project that uses the C API to talk to Python code that knows nothing about ctypes (like most C extension modules). Once you're familiar with all three, you'll be able to pick the right one for most projects in the future.
Now, to answer the specific problem:
First, when PyList_GetItem (or most other functions in the C API) returns NULL, this means there's an exception, so you should check the exception and log it. Trying to debug NULL return values in the C API without looking at the set exception is like trying to debug Python code without looking at the tracebacks.
Anyway, there are a few obvious reasons this function could fail: Maybe you're calling it with an out-of-bounds index, or maybe you're calling it on something that isn't a list at all.
In fact, the second one seems pretty obvious here. If printing out obj gives you this:
<ConfigurationParser.LP_LED object at 0x7f678a06fcb0>
Then you've got a (pointer to an) LED object, and LED objects aren't lists.
And if you look at your code, you don't seem to have a list of LED objects anywhere, at least not in the code you show us. You do have a POINTER(LED), which could point to the first element of a C array of LEDs, but that's not the same thing as a Python list of them. It's just a C array, which you use C array syntax to dereference:
PyObject *led = ledarray[i];
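The same point can be seen from the Python side: a POINTER(LED) field is indexed like a C array and carries no length of its own, which is why num_leds has to be stored next to it. This is a Python 3 sketch (so the strings are bytes), with made-up LED values.

```python
import ctypes

class LED(ctypes.Structure):
    _fields_ = [("color", ctypes.c_char_p), ("id", ctypes.c_uint32)]

class LEDConfiguration(ctypes.Structure):
    _fields_ = [("daemon_user", ctypes.c_char_p),
                ("leds", ctypes.POINTER(LED)),
                ("num_leds", ctypes.c_uint32)]

# A real C-layout array of two LED structs (illustrative values).
led_array = (LED * 2)(LED(b"red", 0), LED(b"green", 1))

lc = LEDConfiguration()
lc.daemon_user = b"some_name"
lc.num_leds = len(led_array)
lc.leds = ctypes.cast(led_array, ctypes.POINTER(LED))

# Index the pointer like a C array; the bound comes from num_leds,
# because the pointer itself does not know how long the array is.
colors = [lc.leds[i].color for i in range(lc.num_leds)]
print(colors)
```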
My application embeds python by dynamically loading it. I need to obtain the values from the dictionary of the script being executed.
pFnPyDict_GetItemString *pFGetItemString = NULL;
pFGetItemString = (pFnPyDict_GetItemString *)::GetProcAddress(hModulePython, "PyDict_GetItemString");
if (pFGetItemString)
{
    PyObject *pGet = pFGetItemString(pLocals, pVar);
    if (pGet)
    {
        // The following code will not work, as PyInt_Check is a macro
        pFnPyInt_Check *pIsInt = (pFnPyInt_Check *)::GetProcAddress(hModulePython, "PyInt_Check");
        if (PyInt_Check(pGet))
        {
        }
        // Therefore I am using PyObject_IsInstance
        pFnPyObject_IsInstance *pFIsInstance = (pFnPyObject_IsInstance *)::GetProcAddress(hModulePython, "PyObject_IsInstance");
        if (pFIsInstance)
        {
            int i = pFIsInstance(pGet, (PyObject *)&PyInt_Type); // ----> the problem is here. This call fails.
        }
    }
}
How do I specify the second parameter to PyObject_IsInstance? Here I want to check whether the value in pGet is of type int.
Do you only want to check for ints? If so you're better off using PyInt_Check instead.
Additional: Some advice that you didn't ask for but which might help you. :) Are you using C or C++? If it's the latter, consider using Boost.Python instead of the Python C API; it will make things a lot easier. Exposing functions and classes is trivial with Boost.
Surely the correct approach here is to include the header file and use PyInt_Check().
I assume that you have not included the Python C API header file because you don't want to use implicit linking. But you are making life hard for yourself by trying to work without the header file. Just because you include the header file, doesn't mean that the DLL functions will be implicitly linked to your program. This will only happen if you actually call some of the functions in the DLL.
If you want to be 100% sure that you don't implicitly link to the DLL then simply ensure that you don't link the .lib file.
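On the question of what the second parameter to PyObject_IsInstance should be: it is the type object itself, passed as a PyObject *. The sketch below only illustrates that calling convention from Python 3 through ctypes.pythonapi, using the built-in int type in place of Python 2's PyInt_Type; it does not reproduce the GetProcAddress setup from the question.

```python
import ctypes

# Bind the C API function from the running interpreter.
is_instance = ctypes.pythonapi.PyObject_IsInstance
is_instance.argtypes = [ctypes.py_object, ctypes.py_object]
is_instance.restype = ctypes.c_int

# Second argument: the type object to test against.
result_for_int = is_instance(5, int)
result_for_str = is_instance("five", int)
print(result_for_int, result_for_str)
```

The function returns 1 for a match, 0 otherwise, and -1 with an exception set on failure.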