How to use a .NET method which modifies in place in Python? - python

I am trying to use a .NET dll in Python. In a .NET language the method requires passing it 2 arrays by reference which it then modifies:
public void GetItems(
    out int[] itemIDs,
    out string[] itemNames
)
How can I use this method in Python using the Python for .NET module?
Edit: Forgot to mention this is in CPython not IronPython.
Additional info.
When I do the following:
itemIDs = []
itemNames = []
GetItems(itemIDs, itemNames)
I get an output like:
(None, <System.Int32[] at 0x43466c0>, <System.String[] at 0x43461c0>)
Do I just need to figure out how to convert these back into python types?

PythonNet doesn't document this quite as clearly as IronPython, but it does almost the same thing.
So, let's look at the IronPython documentation for ref and out parameters:
The Python language passes all arguments by-value. There is no syntax to indicate that an argument should be passed by-reference like there is in .NET languages like C# and VB.NET via the ref and out keywords. IronPython supports two ways of passing ref or out arguments to a method, an implicit way and an explicit way.
In the implicit way, an argument is passed normally to the method call, and its (potentially) updated value is returned from the method call along with the normal return value (if any). This composes well with the Python feature of multiple return values…
In the explicit way, you can pass an instance of clr.Reference[T] for the ref or out argument, and its Value field will get set by the call. The explicit way is useful if there are multiple overloads with ref parameters…
There are examples for both. But to tailor it to your specific case:
itemIDs, itemNames = GetItems()
Or, if you really want:
import clr
from System import Array, String

itemIDsRef = clr.Reference[Array[int]]()
itemNamesRef = clr.Reference[Array[String]]()
GetItems(itemIDsRef, itemNamesRef)
itemIDs, itemNames = itemIDsRef.Value, itemNamesRef.Value
CPython using PythonNet does basically the same thing. The easy way to do out parameters is to not pass them and accept them as extra return values, and for ref parameters to pass the input values as arguments and accept the output values as extra return values. Just like IronPython's implicit solution. (Except that a void function with ref or out parameters always returns None before the ref or out arguments, even if it wouldn't in IronPython.) You can figure it out pretty easily by inspecting the return values. So, in your case:
_, itemIDs, itemNames = GetItems()
Meanwhile, the fact that these happen to be arrays doesn't make things any harder. As the docs explain, PythonNet provides the iterable interface for all IEnumerable collections, and the sequence protocol as well for Array. So, you can do this:
for itemID, itemName in zip(itemIDs, itemNames):
    print itemID, itemName
And the Int32 and String objects will be converted to native int/long and str/unicode objects just as if they were returned directly.
If you really want to explicitly convert these to native values, you can. map or a list comprehension will give you a Python list from any iterable, including a PythonNet wrapper around an Array or other IEnumerable. And you can explicitly make a long or unicode out of an Int32 or String if you need to. So:
itemIDs = map(int, itemIDs)
itemNames = map(unicode, itemNames)
But I don't see much advantage to doing this, unless you need to, e.g., pre-check all the values before using any of them.
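Putting the pieces together, a minimal end-to-end sketch might look like the following. The assembly name ItemLib and the class ItemStore are placeholders for whatever your DLL actually exposes, and depending on your PythonNet version you may need to pass placeholder arguments (as in the question) rather than calling with no arguments:
import clr
clr.AddReference("ItemLib")      # hypothetical assembly name
from ItemLib import ItemStore    # hypothetical namespace and class

store = ItemStore()

# out parameters come back as extra return values; the void return is None
_, itemIDs, itemNames = store.GetItems()

for itemID, itemName in zip(itemIDs, itemNames):
    print itemID, itemName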

I have managed to use the method
bool XferData(ref byte[] buf, ref int len) from the C# library CyUSB.dll
with the following code:
>>> xferLen = 2
>>> outData = [10, 0]
>>> inData = []
>>> n, outData, xferLen = XferData(outData, xferLen)
>>> print n, outData[0], outData[1], xferLen
True 10 0 2
Hope this helps someone.

Related

The simplest way to pass pointer to contiguous data from Python to C

I am using ctypes to call a function in C. The function expects a pointer to the first element of contiguous data and the number of elements.
One thing that works is something like that
a=15 # a could be any number
temp = numpy.array([a]*14, dtype=numpy.int8)
c_function(temp.ctypes.data_as(ctypes.c_void_p), 14)
This is really cumbersome: it requires both numpy and ctypes. Is there a simpler way that works in both Python 2 and Python 3? (AFAIK bytes([a]*14) works, but only in Python 3.)
EDIT: More interestingly this also works (!)
a=15 # a could be any number
temp = chr(a)*14
c_function(temp, 14)
There were suggestions in other threads that one could pass something like a pointer to the first element of the contiguous data, as in Passing memoryview to C function, but I was just unable to make this work.
Preliminaries
Python does not have pointers. You cannot create a pointer in Python, though Python variables act in some ways like pointers. You can create Python objects that represent lower-level pointers, but what you actually seem to want is to feed your C function a pointer to Python-managed data, which is an altogether different thing.
What ctypes does for you
You seem to have settled on using ctypes for the actual function call, so the general question expressed in the question title is a little overbroad for what you actually want to know. The real question seems to be more like "How do I get ctypes to pass a C pointer to Python-managed data to a C function?"
According to the ctypes Python 2 docs, in Python 2,
None, integers, longs, byte strings and unicode strings are the only native Python objects that can directly be used as parameters in these function calls. None is passed as a C NULL pointer, byte strings and unicode strings are passed as pointer to the memory block that contains their data (char * or wchar_t *). [...]
(emphasis added).
It's more or less the same list in Python 3 ...
None, integers, bytes objects and (unicode) strings
... with the same semantics.
Note well that ctypes takes care of the conversion from Python object to corresponding C representation -- nowhere does Python code handle C pointers per se, nor even direct representations of them.
Relevant C details
In many C implementations, all object pointer types have the same representation and can be used semi-interchangeably, but pointers of type char * are guaranteed by the standard to have the same size and representation as pointers of type void *. These two pointer types are guaranteed to be interchangeable as function parameters and return values, among other things.
Synthesis
How convenient! It is acceptable to call your C function with a first argument of type char * when the function declares that parameter to be of type void *, and that is exactly what ctypes will arrange for you when the Python argument is a byte string (Python 2) or a bytes object (Python 3). The C function will receive a pointer to the object's data, not to the object itself. This provides a simpler and better way forward than going through numpy or a similar package, and it is basically the approach that you appended to your question. Thus, supposing that c_function identifies a ctypes-wrapped C function, you could do this (Python 3):
n = 15
c_function(b'0' * n, n)
Of course, you can also create a variable for the object and pass that, instead, which would allow you to afterward see whatever the C function has done with the contents of the object.
Do note, however, that
Byte strings and bytes objects are immutable as far as Python is concerned. You can get yourself in trouble if you use a C function to change the contents of a bytes object that other Python code assumes will never change.
The C side cannot determine the size of the data from a pointer to it. That is presumably the purpose of the second parameter. If you tell the function that the object is larger than it really is, and the function relies on that to try to modify bytes past the end of the actual data, then you will have difficult-to-debug trouble, from corruption of other data to memory leaks. If you're lucky, your program will crash.
It depends on what Python implementation you use, but typically the elements of a Unicode string are larger than one byte each. Save yourself some trouble and use byte strings / bytes instead.
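If the C function is meant to write into the buffer, the immutability problem above can be sidestepped with ctypes.create_string_buffer, which allocates a mutable block of bytes that ctypes passes as a pointer to its data. A minimal sketch, assuming a hypothetical shared library libfoo.so that exports the c_function(void *, int) from the question:
import ctypes

lib = ctypes.CDLL("./libfoo.so")       # hypothetical library name

n = 14
buf = ctypes.create_string_buffer(n)   # mutable, zero-filled buffer of n bytes
lib.c_function(buf, n)                 # ctypes passes a pointer to the buffer's data

print(buf.raw)                         # inspect whatever the C function wrote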

Python-like Coding in C for Pointers

I am transitioning from Python to C, so my question might appear naive. I am reading a tutorial on Python-C bindings, and it mentions that:
In C, all parameters are pass-by-value. If you want to allow a function to change a variable in the caller, then you need to pass a pointer to that variable.
Question: Why can't we simply re-assign the values inside the function and be free of pointers?
The following code uses pointers:
#include <stdio.h>

int i = 24;

int increment(int *j) {
    (*j)++;
    return *j;
}

int main(void) {
    increment(&i);
    printf("i = %d\n", i);
    return 0;
}
Now this can be replaced with the following code that doesn't use pointers:
#include <stdio.h>

int i = 24;

int increment(int j) {
    j++;
    return j;
}

int main(void) {
    i = increment(i);
    printf("i = %d\n", i);
    return 0;
}
You can only return one thing from a function. If you need to update multiple parameters, or you need to use the return value for something other than the updated variable (such as an error code), you need to pass a pointer to the variable.
Getting this out of the way first - pointers are fundamental to C programming. You cannot be “free” of pointers when writing C. You might as well try to never use if statements, arrays, or any of the arithmetic operators. You cannot use a substantial chunk of the standard library without using pointers.
“Pass by value” means, among other things, that the formal parameter j in increment and the actual parameter i in main are separate objects in memory, and changing one has absolutely no effect on the other. The value of i is copied to j when the function is called, but any changes to i are not reflected in j and vice-versa.
We work around this in C by using pointers. Instead of passing the value of i to increment, we pass its address (by value), and then dereference that address with the unary * operator.
This is one of the cases where we have to use pointers. The other case is when we track dynamically-allocated memory. Pointers are also useful (if not strictly required) for building containers (lists, trees, queues, stacks, etc.).
Passing a value as a parameter and returning its updated value works, but only for a single parameter. Passing multiple parameters and returning their updated values in a struct type can work, but is not good style if you’re doing it just to avoid using pointers. It’s also not possible if the function must update parameters and return some kind of status (such as the scanf library function, for example).
Similarly, using file-scope variables does not scale and creates maintenance headaches. There are times when it’s not the wrong answer, but in general it’s not a good idea.
So, imagine you need to pass large arrays or other data structures that need modification. If you apply the approach you used to increment an integer, you create a copy of that large array on every call to the function. That is obviously not memory-friendly; instead, we pass a pointer to the function and do the updates on the single array (or whatever the structure is) in place.
Plus, as the other answer mentioned, if you need to update many parameters, it is impossible to return them all the way your version does.

In C++, why is & needed for some parameters? [duplicate]

Is it better in C++ to pass by value or pass by reference-to-const?
I am wondering which is better practice. I realize that pass by reference-to-const should provide for better performance in the program because you are not making a copy of the variable.
It used to be generally recommended best practice[1] to use pass by const ref for all types, except for builtin types (char, int, double, etc.), for iterators and for function objects (lambdas, classes deriving from std::*_function).
This was especially true before the existence of move semantics. The reason is simple: if you passed by value, a copy of the object had to be made and, except for very small objects, this is always more expensive than passing a reference.
With C++11, we have gained move semantics. In a nutshell, move semantics permit that, in some cases, an object can be passed “by value” without copying it. In particular, this is the case when the object that you are passing is an rvalue.
In itself, moving an object is still at least as expensive as passing by reference. However, in many cases a function will internally copy an object anyway — i.e. it will take ownership of the argument.[2]
In these situations we have the following (simplified) trade-off:
We can pass the object by reference, then copy internally.
We can pass the object by value.
“Pass by value” still causes the object to be copied, unless the object is an rvalue. In the case of an rvalue, the object can be moved instead, so that the second case is suddenly no longer “copy, then move” but “move, then (potentially) move again”.
For large objects that implement proper move constructors (such as vectors, strings …), the second case is then vastly more efficient than the first. Therefore, it is recommended to use pass by value if the function takes ownership of the argument, and if the object type supports efficient moving.
A historical note:
In fact, any modern compiler should be able to figure out when passing by value is expensive, and implicitly convert the call to use a const ref if possible.
In theory. In practice, compilers can’t always change this without breaking the function’s binary interface. In some special cases (when the function is inlined) the copy will actually be elided if the compiler can figure out that the original object won’t be changed through the actions in the function.
But in general the compiler can’t determine this, and the advent of move semantics in C++ has made this optimisation much less relevant.
[1] E.g. in Scott Meyers, Effective C++.
[2] This is especially often true for object constructors, which may take arguments and store them internally to be part of the constructed object’s state.
Edit: New article by Dave Abrahams on cpp-next: Want speed? Pass by value.
Pass by value for structs where the copying is cheap has the additional advantage that the compiler may assume that the objects don't alias (are not the same objects). Using pass-by-reference the compiler cannot assume that always. Simple example:
foo *f;

void bar(foo g) {
    g.i = 10;
    f->i = 2;
    g.i += 5;
}
the compiler can optimize it into
g.i = 15;
f->i = 2;
since it knows that f and g don't share the same location. If g was a reference (foo &), the compiler couldn't have assumed that, since g.i could then be aliased by f->i and would have to end up with a value of 7, so the compiler would have to re-fetch the new value of g.i from memory.
For more practical rules, here is a good set found in the Move Constructors article (highly recommended reading).
If the function intends to change the argument as a side effect, take it by non-const reference.
If the function doesn't modify its argument and the argument is of primitive type, take it by value.
Otherwise take it by const reference, except in the following cases
If the function would then need to make a copy of the const reference anyway, take it by value.
"Primitive" above means basically small data types that are a few bytes long and aren't polymorphic (iterators, function objects, etc...) or expensive to copy. In that paper, there is one other rule. The idea is that sometimes one wants to make a copy (in case the argument can't be modified), and sometimes one doesn't want (in case one wants to use the argument itself in the function if the argument was a temporary anyway, for example). The paper explains in detail how that can be done. In C++1x that technique can be used natively with language support. Until then, i would go with the above rules.
Examples: To make a string uppercase and return the uppercase version, one should always pass by value: One has to take a copy of it anyway (one couldn't change the const reference directly) - so better make it as transparent as possible to the caller and make that copy early so that the caller can optimize as much as possible - as detailed in that paper:
my::string uppercase(my::string s) { /* change s and return it */ }
However, if you don't need to change the parameter anyway, take it by reference to const:
bool all_uppercase(my::string const& s) {
    /* check to see whether any character is uppercase */
}
However, if the purpose of the parameter is to write something into the argument, then pass it by non-const reference:
bool try_parse(T text, my::string &out) {
    /* try to parse, write result into out */
}
Depends on the type. You are adding the small overhead of having to make a reference and dereference it. For types with a size equal to or smaller than a pointer that use the default copy ctor, it would probably be faster to pass by value.
As it has been pointed out, it depends on the type. For built-in data types, it is best to pass by value. Even some very small structures, such as a pair of ints can perform better by passing by value.
Here is an example: assume you have an integer value and you want to pass it to another routine. If that value has been optimized to be stored in a register, then to pass it by reference it first must be stored in memory and then a pointer to that memory placed on the stack to perform the call. If it was being passed by value, all that is required is to push the register onto the stack. (The details are a bit more complicated than that, given different calling conventions and CPUs.)
If you are doing template programming, you are usually forced to always pass by const ref since you don't know the types being passed in. The penalty for passing something expensive by value is much worse than the penalty of passing a built-in type by const ref.
This is what I normally work by when designing the interface of a non-template function:
Pass by value if the function does not want to modify the parameter and the value is cheap to copy (int, double, float, char, bool, etc. Notice that std::string, std::vector, and the rest of the containers in the standard library are NOT).
Pass by const pointer if the value is expensive to copy and the function does not want to modify the value pointed to and NULL is a value that the function handles.
Pass by non-const pointer if the value is expensive to copy and the function wants to modify the value pointed to and NULL is a value that the function handles.
Pass by const reference when the value is expensive to copy and the function does not want to modify the value referred to and NULL would not be a valid value if a pointer was used instead.
Pass by non-const reference when the value is expensive to copy and the function wants to modify the value referred to and NULL would not be a valid value if a pointer was used instead.
Sounds like you got your answer. Passing by value is expensive, but gives you a copy to work with if you need it.
As a rule passing by const reference is better.
But if you need to modify your function argument locally, you are better off passing by value.
For some basic types the performance is generally the same whether you pass by value or by reference. A reference is internally represented by a pointer, so for a pointer-sized type you can expect both forms to perform the same, or passing by value to be even faster because it avoids a needless dereference.
Pass by value for small types.
Pass by const references for big types (the definition of big can vary between machines) BUT, in C++11, pass by value if you are going to consume the data, since you can exploit move semantics. For example:
class Person {
public:
Person(std::string name) : name_(std::move(name)) {}
private:
std::string name_;
};
Now the calling code would do:
Person p(std::string("Albert"));
And only one object would be created and moved directly into member name_ in class Person. If you pass by const reference, a copy will have to be made for putting it into name_.
As a rule of thumb, value for non-class types and const reference for classes.
If a class is really small it's probably better to pass by value, but the difference is minimal. What you really want to avoid is passing some gigantic class by value and having it all duplicated - this will make a huge difference if you're passing, say, a std::vector with quite a few elements in it.
Pass by reference is better than pass by value. I was solving the longest common subsequence problem on Leetcode. It was showing TLE for pass by value but accepted the code for pass by reference. Took me 30 mins to figure this out.
Simple difference: a function has input and output parameters. If the same parameter serves as both input and output, use call by reference; if the input and output parameters are different, it is better to use call by value.
Example: void amount(int account, int deposit, int total)
Input parameters: account, deposit
Output parameter: total
Input and output are different, so use call by value.
void amount(int total, int deposit)
Input: total, deposit
Output: total
Input and output are the same, so use call by reference.

Python C-API: Using `PySequence_Length` with dictionaries

I'm trying to use PySequence_Length to get the length of a Python dictionary in C. I realize I can use PyDict_Size, but I'm interested in using a more generic function in certain contexts.
PyObject* d = PyDict_New();
Py_ssize_t res = PySequence_Length(d);
printf("Result : %ld\n", res);
if (res == -1) PyErr_Print();
This fails, and prints the error:
TypeError: object of type 'dict' has no len()
My question is: why does this fail? Although Python dictionary objects don't support the Sequence protocol, the documentation for PySequence_Length says:
Py_ssize_t PySequence_Length(PyObject *o)
Returns the number of objects in sequence o on success, and -1 on failure. For objects that do not provide sequence protocol, this is equivalent to the Python expression len(o).
Since a dictionary type does have a __len__ attribute, and since the expression len(d) (where d is a dictionary) properly returns the length in Python, I don't understand why PySequence_Length should fail in C.
Any explanation? Is the documentation incorrect here?
The documentation is misleading, yes. A dict is not a sequence, even though it does implement some parts of the sequence protocol (for containment tests, which are part of the sequence protocol.) This particular distinction in the Python/C types API is unfortunate, but it's an artifact of a design choice made decades ago. The documentation reflects that distinction, albeit in an equally awkward way. What it tries to say is that for Python classes it's the same thing as len(o), regardless of what the Python class pretends to be. For C types, if the type does not implement the sequence version of the sizefunc, PySequence_Length() will raise an exception without even considering whether the type has the mapping version of the sizefunc.
If you are not entirely sure whether you have a sequence or not, you should use PyObject_Size() instead. In fact, there's very little reason to call PySequence_Length(); normally you either know the type (because you just created it, and you can call a type-specific length function or macro like PyList_GET_SIZE()) or you don't even know if it'll be a sequence.

pyobjc indexed accessor method with range

I'm trying to implement an indexed accessor method for my model class in Python, as per the KVC guide. I want to use the optional ranged method, to load multiple objects at once for performance reasons. The method takes a pointer to a C-array buffer which my method needs to copy the objects into. I've tried something like the following, which doesn't work. How do I accomplish this?
@objc.accessor  # I've also tried @objc.signature('v@:o^@')
def getFoos_range_(self, range):
    return self._some_array[range.location:range.location + range.length]
Edit: I finally found the type encodings reference after Apple moved all the docs. After reading that, I tried this:
@objc.signature('v@:N^@@')
def getFoos_range_(self, buf, range):
but this didn't appear to work either. The first argument is supposed to be a pointer to a C-array, but the length is unknown until runtime, so I didn't know exactly how to construct the correct type encoding. I tried 'v@:N^[1000@]@' just to see, and that didn't work either.
My model object is bound to the contentArray of an NSArrayController driving a table view. It doesn't appear to be calling this method at all, perhaps because it expects a different signature than the one the bridge is providing. Any suggestions?
You were close. The correct decorator for this method is:
@objc.signature('v@:o^@{_NSRange=QQ}')
NSRange is not an object, but a struct, and can't be specified simply as @; you need to include the members.[1]
Unfortunately, this is not the end of it. After a whole lot of experimentation and poring over the PyObjC source, I finally figured out that in order to get this method to work, you also need to specify metadata for the method that is redundant to this signature. (However, I still haven't puzzled out why.)
This is done using the function objc.registerMetaDataForSelector:
objc.registerMetaDataForSelector(b"SUPERCLASSNAME",
b"getKey:range:",
dict(retval=dict(type=objc._C_VOID),
arguments={
2+0: dict(type_modifier=objc._C_OUT,
c_array_length_in_arg=2+1),
2+1: dict(type=b'{_NSRange=II}',
type64=b'{_NSRange=QQ}')
}
)
)
Examples and some details of the use of this function can be found in the file test_metadata_py.py (and nearby test_metadata*.py files) in the PyObjC source.
N.B. that the metadata has to be specified on the superclass of whatever class you are interested in implementing get<Key>:range: for, and also that this function needs to be called sometime before the end of your class definition (calling it either before or inside the class statement itself seems to work). I haven't yet puzzled these bits out either.
I based this metadata on the metadata for NSArray getObjects:range: in the Foundation PyObjC.bridgesupport file,[2] and was aided by referring to Apple's BridgeSupport manpage.
With this worked out, it's also worth noting that the easiest way to define the method is (at least, IMO):
@objc.signature('v@:o^@{_NSRange=QQ}')
def get<#Key#>_range_(self, buf, inRange):
    #NSLog(u"get<#Key#>")
    return self.<#Key#>.getObjects_range_(buf, inRange)
I.e., using your array's built-in getObjects:range:.
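For the method in the question specifically, a filled-in version might look like the sketch below. The superclass name (here NSObject, assuming the model class inherits from it directly) and the foos array property are assumptions; substitute whatever your model actually uses:
import objc
from Foundation import NSObject

objc.registerMetaDataForSelector(b"NSObject",   # assumed direct superclass
    b"getFoos:range:",
    dict(retval=dict(type=objc._C_VOID),
         arguments={
             2+0: dict(type_modifier=objc._C_OUT,
                       c_array_length_in_arg=2+1),
             2+1: dict(type=b'{_NSRange=II}',
                       type64=b'{_NSRange=QQ}')
         }
    )
)

class Model(NSObject):
    @objc.signature('v@:o^@{_NSRange=QQ}')
    def getFoos_range_(self, buf, inRange):
        # hand off to the backing NSArray's own getObjects:range:
        return self.foos.getObjects_range_(buf, inRange)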
[1] On 32-bit Python, the QQ, meaning two unsigned long longs, should become II, meaning two unsigned ints.
[2] Located (on Snow Leopard) at: /System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/PyObjC/Foundation/PyObjC.bridgesupport
