Python-like Coding in C for Pointers - python

I am transitioning from Python to C, so my question might appear naive. I am reading a tutorial on Python-C bindings, and it mentions that:
In C, all parameters are pass-by-value. If you want to allow a function to change a variable in the caller, then you need to pass a pointer to that variable.
Question: Why can't we simply re-assign the values inside the function and be free from pointers?
The following code uses pointers:
#include <stdio.h>

int i = 24;

/* Increments the int that j points to and returns the new value. */
int increment(int *j) {
    (*j)++;
    return *j;
}

int main(void) {
    increment(&i);
    printf("i = %d\n", i);
    return 0;
}
Now this can be replaced with the following code that doesn't use pointers:
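#include <stdio.h>

int i = 24;

/* Works on a copy of the argument and returns the incremented copy. */
int increment(int j) {
    j++;
    return j;
}

int main(void) {
    i = increment(i);
    printf("i = %d\n", i);
    return 0;
}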

You can only return one thing from a function. If you need to update multiple parameters, or you need to use the return value for something other than the updated variable (such as an error code), you need to pass a pointer to the variable.
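As a minimal sketch of that second case (C-style code compiled here as C++; divide_checked and its parameter names are made up for illustration), the return value carries a status while two results come back through pointers:

```cpp
#include <cassert>

// Hypothetical helper: returns false on division by zero; otherwise writes
// the quotient and remainder through the two pointers and returns true.
bool divide_checked(int numerator, int denominator, int *quotient, int *remainder) {
    if (denominator == 0) {
        return false;                 // status only; outputs are left untouched
    }
    *quotient = numerator / denominator;
    *remainder = numerator % denominator;
    return true;
}

// Small self-check showing the calling convention.
void demo_divide_checked() {
    int q = 0, r = 0;
    bool ok = divide_checked(17, 5, &q, &r);
    assert(ok);
    assert(q == 3 && r == 2);

    ok = divide_checked(1, 0, &q, &r);
    assert(!ok);
    assert(q == 3 && r == 2);         // failure did not modify the outputs
}
```

A single return value could not carry the status and both results at once without wrapping them in a struct.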

Getting this out of the way first - pointers are fundamental to C programming. You cannot be “free” of pointers when writing C. You might as well try to never use if statements, arrays, or any of the arithmetic operators. You cannot use a substantial chunk of the standard library without using pointers.
“Pass by value” means, among other things, that the formal parameter j in increment and the actual parameter i in main are separate objects in memory, and changing one has absolutely no effect on the other. The value of i is copied to j when the function is called, but any changes to i are not reflected in j and vice-versa.
We work around this in C by using pointers. Instead of passing the value of i to increment, we pass its address (by value), and then dereference that address with the unary * operator.
This is one of the cases where we have to use pointers. The other case is when we track dynamically-allocated memory. Pointers are also useful (if not strictly required) for building containers (lists, trees, queues, stacks, etc.).
Passing a value as a parameter and returning its updated value works, but only for a single parameter. Passing multiple parameters and returning their updated values in a struct type can work, but is not good style if you’re doing it just to avoid using pointers. It’s also not an option if the function must update parameters and return some kind of status through its return value (as the scanf library function does).
Similarly, using file-scope variables does not scale and creates maintenance headaches. There are times when it’s not the wrong answer, but in general it’s not a good idea.
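To make the pass-by-value point concrete, here is a small sketch (C-style code compiled as C++; the function names are hypothetical) that compares the address of the parameter with the address of the caller's variable:

```cpp
#include <cassert>

// Receives a copy: j is a distinct object from the caller's variable.
bool same_object_by_value(int j, const int *caller_address) {
    j++;                              // changes only the local copy
    return &j == caller_address;      // false: j lives at its own address
}

// Receives the address (itself passed by value) and writes through it.
bool same_object_by_pointer(int *j, const int *caller_address) {
    (*j)++;                           // changes the caller's variable
    return j == caller_address;       // true: j holds the caller's address
}

void demo_pass_by_value() {
    int i = 24;
    bool aliased = same_object_by_value(i, &i);
    assert(!aliased);
    assert(i == 24);                  // the caller's i is unchanged

    aliased = same_object_by_pointer(&i, &i);
    assert(aliased);
    assert(i == 25);                  // the caller's i was updated
}
```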

Imagine you need to pass a large data structure that needs modification, such as a struct holding a big array. If you handle it the way you handled the integer, passing it by value and returning the updated value, every call to the function copies the whole structure. Copying like that wastes memory and time, so instead we pass a pointer to the function and do the updates on the single original object.
Plus, as the other answer mentioned, if you need to update several variables, you cannot return them all the way your second example does.
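A rough sketch of the difference, assuming a hypothetical Samples struct. A bare C array is never copied at a call, because it decays to a pointer, so the array is wrapped in a struct here to get real by-value copying:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "large" record: 1000 ints, far bigger than one pointer.
struct Samples {
    int values[1000];
};

// By value: the whole struct is copied into s; the caller's data is untouched.
void scale_copy(Samples s, int factor) {
    for (std::size_t k = 0; k < 1000; ++k) s.values[k] *= factor;
}

// By pointer: only an address is passed; updates land in the caller's struct.
void scale_in_place(Samples *s, int factor) {
    for (std::size_t k = 0; k < 1000; ++k) s->values[k] *= factor;
}

void demo_scale() {
    static_assert(sizeof(Samples) > sizeof(Samples *), "struct is larger than a pointer");
    Samples data = {};
    data.values[0] = 3;

    scale_copy(data, 2);
    assert(data.values[0] == 3);      // the copy was scaled, not data

    scale_in_place(&data, 2);
    assert(data.values[0] == 6);      // data itself was scaled
}
```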

Related

How does the Python @cache decorator work?

I recently learned about the cache decorator in Python and was surprised how well it worked and how easily it could be applied to any function. Like many others before me, I tried to replicate this behavior in C++ without success (I tried to recursively calculate the Fibonacci sequence). The problem was that the internal calls didn't get cached. This is not a problem if I modify the original function, but I want it to be a decorator so that it can be applied anywhere. I am trying to decipher the Python @cache decorator from the source code but couldn't make out a lot, so I can't tell how (or even if) this behavior can be replicated elsewhere.
Is there a way to cache the internal calls as well?
This is a simple way to add memoisation to the fib function. What I want is to build a decorator so that I can wrap any function, just like the one in Python.
#include <map>

class CacheFib {
public:
    CacheFib() {}
    unsigned long long fib(int n) {
        auto hit = cache_pool.find(n);
        if (hit != cache_pool.end()) {
            return hit->second;
        } else if (n <= 1) {
            return n;
        } else {
            auto miss = this->fib(n - 1) + this->fib(n - 2);
            cache_pool.insert({n, miss});
            return miss;
        }
    }
    // Value type matches fib's return type so large results are not truncated.
    std::map<int, unsigned long long> cache_pool;
};
This approach caches the actual call, meaning that if I call cachedFib(40) twice, the second time it will be O(1). It doesn't actually cache the internal calls to help with performance.
// A PROTOTYPE IMPLEMENTATION
#include <functional>
#include <map>

template <typename Func> class CacheDecorator {
public:
    CacheDecorator(Func fun) : function(fun) {}
    int operator()(int n) {
        auto hit = cache_pool.find(n);
        if (hit != cache_pool.end()) {
            return hit->second;
        } else {
            auto miss = function(n);
            cache_pool.insert({n, miss});
            return miss;
        }
    }
    std::function<Func> function;
    std::map<int, int> cache_pool;
};

int fib(int n) {
    if (n == 0 || n == 1) {
        return n;
    } else
        return fib(n - 1) + fib(n - 2);
}
//main
auto cachedFib = CacheDecorator<decltype(fib)>(fib);
cachedFib(**);
Also, any information on the @cache decorator or any C++ implementation ideas would be helpful.
So, as you're finding out, Python and C++ are different languages.
The key difference in this context is that in Python, the function name fib is looked up at run-time, even for the recursive call; meanwhile, in C++, the function name is looked up at compile-time, so by the time your CacheDecorator gets to it, it's too late.
A few possibilities:
Move the lookup of fib to run-time; you can do this either by using an explicit function pointer, or by making it a dynamic method. Either of those would mean coding the fib function differently.
Some sort of terrible, platform-dependent hack to overwrite either the function address table or the beginning of the function itself. This is going to be deep magic, particularly in the face of optimisations; the compiler might write out multiple copies of the function, or it might turn a recursive call into a loop, for example.
Move the implementation of the CacheDecorator to compile-time, as a pre-processor macro. That's probably the best way to preserve the intent of a Python decorator.
Ideally, write Python in Python and C++ in C++; the languages each have their own idioms, which don't generally translate to each other in a one-to-one fashion.
Trying to write Python-style code in C++ will always result in code that's somewhat alien, even in cases where it is possible. Much better to become fluent in the idioms of the language you're using.
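The first possibility (moving the lookup to run-time) could be sketched roughly like this. Memoized and fib_body are hypothetical names, and the function has to be written to recurse through the wrapper it is handed, which is exactly the "coding fib differently" cost mentioned above:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical memoizing wrapper: the wrapped function receives a reference
// to the wrapper itself and must make its recursive calls through it.
class Memoized {
public:
    using Body = std::function<unsigned long long(Memoized&, int)>;

    explicit Memoized(Body body) : body_(std::move(body)) {}

    unsigned long long operator()(int n) {
        auto hit = cache_.find(n);
        if (hit != cache_.end()) return hit->second;
        unsigned long long value = body_(*this, n);   // may recurse into operator()
        cache_.emplace(n, value);
        return value;
    }

    std::size_t cached_entries() const { return cache_.size(); }

private:
    Body body_;
    std::map<int, unsigned long long> cache_;
};

// Assumes n >= 0. Recursion goes through self, so inner calls hit the cache.
unsigned long long fib_body(Memoized& self, int n) {
    if (n <= 1) return static_cast<unsigned long long>(n);
    return self(n - 1) + self(n - 2);
}

void demo_memoized() {
    Memoized fib(fib_body);
    unsigned long long result = fib(50);
    assert(result == 12586269025ULL);
}
```

Usage is Memoized fib(fib_body); fib(50);. The indirection through self plays the role that Python's run-time name lookup plays for a decorated function.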
I think that internal calls are not being cached because once the CacheDecorator is invoked, the CacheDecorator::operator() is only invoked once. After that, the CacheDecorator::function is recursively invoked. This is problematic because you want to check the cache at every recursive call of CacheDecorator::function; however, this does not occur because your cache checking code is in CacheDecorator::operator(). Consequently, the first and only time CacheDecorator::operator() is invoked is when you invoke it in main, which is also the first and only time the cache is checked.
I also think that you would only encounter your issue if the passed in function uses recursion to compute its return value.
I may have made a mistake, but that's what I think the issue is.
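One way to check this explanation is to count how often the body of fib actually runs. The sketch below uses a non-template variant of the question's CacheDecorator (fixed to int(int) for brevity) and a hypothetical raw_calls counter:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

static long raw_calls = 0;            // counts every execution of fib's body

int fib(int n) {
    ++raw_calls;
    if (n == 0 || n == 1) return n;
    return fib(n - 1) + fib(n - 2);   // bound to the plain fib at compile time
}

// Same logic as the CacheDecorator in the question, fixed to int(int).
class CacheDecorator {
public:
    explicit CacheDecorator(std::function<int(int)> fun) : function(std::move(fun)) {}

    int operator()(int n) {
        auto hit = cache_pool.find(n);
        if (hit != cache_pool.end()) return hit->second;
        int miss = function(n);       // recursion inside never returns here
        cache_pool.insert({n, miss});
        return miss;
    }

private:
    std::function<int(int)> function;
    std::map<int, int> cache_pool;
};

void demo_cache_decorator() {
    raw_calls = 0;
    CacheDecorator cachedFib(fib);
    int first = cachedFib(20);
    long calls_after_first = raw_calls;
    int second = cachedFib(20);
    assert(first == 6765 && second == 6765);
    assert(calls_after_first > 20);           // far more than one call per n
    assert(raw_calls == calls_after_first);   // second call came from the cache
}
```

The first call still runs the full exponential recursion; only the repeated top-level call is served from the cache.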
I think that one way to fix this would be to accumulate and return a vector/map of computed values. Once your CacheDecorator::function is complete, you can then cache the returned vector/map. This would require you to modify your fib function, so this may not be a good solution. This modification also does not perfectly replicate Python's @cache decorator functionality since the programmer is essentially expected to store pre-computed values.

Pass Go Pointer via "cgo" as a Key

I am doing some interop between Go and Python. I am planning on creating objects in the Go land and only access them from Python through Go methods. For example, struct A below will only be created and destroyed in the Go land and only be accessed through method1. But Python code does need to know the handle/pointer to instances of struct A such that when it passes this handle/pointer to a Go function, the Go function knows to convert it to a Go pointer and calls its method. In other words, Python does not directly use the Go pointer *A as a pointer that references/dereferences memory, but rather uses it as a key to identify the Go objects.
The problem is that it is not valid to do int(a) for var a *A and I cannot directly pass *A between cgo methods.
What can I do so that I can convert *A to some blackbox key that is an integer and later convert it back to *A ?
type A struct {
	a int
}

func (a *A) method1() {
	fmt.Println("Hello world")
}
Short answer: you can't really.
Longer answer: because Go is garbage collected, you're able to pass pointers to C, but C must preserve this property: C should not store any Go pointers in Go memory, even temporarily, and may not keep a copy of a Go pointer after a call returns.
You can read much more about the handling and uses in the cgo documentation https://golang.org/cmd/cgo/#hdr-Passing_pointers , which may help you with your use case.

In C++, why is & needed for some parameters? [duplicate]

Is it better in C++ to pass by value or pass by reference-to-const?
I am wondering which is better practice. I realize that pass by reference-to-const should provide for better performance in the program because you are not making a copy of the variable.
It used to be generally recommended best practice [1] to use pass by const ref for all types, except for builtin types (char, int, double, etc.), for iterators and for function objects (lambdas, classes deriving from std::*_function).
This was especially true before the existence of move semantics. The reason is simple: if you passed by value, a copy of the object had to be made and, except for very small objects, this is always more expensive than passing a reference.
With C++11, we have gained move semantics. In a nutshell, move semantics permit that, in some cases, an object can be passed “by value” without copying it. In particular, this is the case when the object that you are passing is an rvalue.
In itself, moving an object is still at least as expensive as passing by reference. However, in many cases a function will internally copy an object anyway — i.e. it will take ownership of the argument. [2]
In these situations we have the following (simplified) trade-off:
We can pass the object by reference, then copy internally.
We can pass the object by value.
“Pass by value” still causes the object to be copied, unless the object is an rvalue. In the case of an rvalue, the object can be moved instead, so that the second case is suddenly no longer “copy, then move” but “move, then (potentially) move again”.
For large objects that implement proper move constructors (such as vectors, strings …), the second case is then vastly more efficient than the first. Therefore, it is recommended to use pass by value if the function takes ownership of the argument, and if the object type supports efficient moving.
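The trade-off can be observed with an instrumented type. In this sketch, Tracked, HolderByRef, and HolderByValue are hypothetical names introduced only to count copies and moves:

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type that counts its copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Option 1: take by const reference, then copy internally.
struct HolderByRef {
    explicit HolderByRef(const Tracked& t) : held(t) {}
    Tracked held;
};

// Option 2: take by value, then move the parameter into place.
struct HolderByValue {
    explicit HolderByValue(Tracked t) : held(std::move(t)) {}
    Tracked held;
};

void demo_sink_parameter() {
    Tracked::copies = Tracked::moves = 0;
    HolderByRef a{Tracked{}};
    assert(Tracked::copies == 1);     // an rvalue argument is still copied

    Tracked::copies = Tracked::moves = 0;
    HolderByValue b{Tracked{}};
    assert(Tracked::copies == 0);     // an rvalue argument is only moved

    Tracked::copies = Tracked::moves = 0;
    Tracked lvalue;
    HolderByValue c{lvalue};
    assert(Tracked::copies == 1 && Tracked::moves == 1);  // copy, then move
    (void)a; (void)b; (void)c;
}
```

For an rvalue argument the by-value version performs one or two moves, depending on whether the compiler elides the temporary, and no copy.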
A historical note:
In fact, any modern compiler should be able to figure out when passing by value is expensive, and implicitly convert the call to use a const ref if possible.
In theory. In practice, compilers can’t always change this without breaking the function’s binary interface. In some special cases (when the function is inlined) the copy will actually be elided if the compiler can figure out that the original object won’t be changed through the actions in the function.
But in general the compiler can’t determine this, and the advent of move semantics in C++ has made this optimisation much less relevant.
[1] E.g. in Scott Meyers, Effective C++.
[2] This is especially often true for object constructors, which may take arguments and store them internally to be part of the constructed object’s state.
Edit: New article by Dave Abrahams on cpp-next: Want speed? Pass by value.
Pass by value for structs where the copying is cheap has the additional advantage that the compiler may assume that the objects don't alias (are not the same objects). Using pass-by-reference the compiler cannot assume that always. Simple example:
foo * f;
void bar(foo g) {
    g.i = 10;
    f->i = 2;
    g.i += 5;
}
the compiler can optimize it into
g.i = 15;
f->i = 2;
since it knows that f and g don't share the same location. If g were a reference (foo &), the compiler couldn't assume that, since g.i could then be aliased by f->i and would have to end up with a value of 7. So the compiler would have to re-fetch the new value of g.i from memory.
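The two outcomes can be checked directly. This sketch assumes foo is a struct with a single int member i, and adds return values (not in the original snippet) so the results are observable:

```cpp
#include <cassert>

struct foo { int i; };
foo* f = nullptr;                     // global pointer, as in the snippet above

// g is a private copy, so the write through f cannot affect it.
int bar_by_value(foo g) {
    g.i = 10;
    f->i = 2;
    g.i += 5;
    return g.i;
}

// g may refer to the same object as *f, so the write through f can change g.i.
int bar_by_reference(foo& g) {
    g.i = 10;
    f->i = 2;
    g.i += 5;
    return g.i;
}

void demo_aliasing() {
    foo x{0};
    f = &x;                           // make f alias the argument
    int by_value = bar_by_value(x);
    int by_reference = bar_by_reference(x);
    assert(by_value == 15);
    assert(by_reference == 7);
}
```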
For more practical rules, here is a good set of rules found in the Move Constructors article (highly recommended reading).
If the function intends to change the argument as a side effect, take it by non-const reference.
If the function doesn't modify its argument and the argument is of primitive type, take it by value.
Otherwise take it by const reference, except in the following cases
If the function would then need to make a copy of the const reference anyway, take it by value.
"Primitive" above means basically small data types that are a few bytes long and aren't polymorphic (iterators, function objects, etc...) or expensive to copy. In that paper, there is one other rule. The idea is that sometimes one wants to make a copy (in case the argument can't be modified), and sometimes one doesn't want to (in case one wants to use the argument itself in the function, for example if the argument was a temporary anyway). The paper explains in detail how that can be done. In C++1x that technique can be used natively with language support. Until then, I would go with the above rules.
Examples: To make a string uppercase and return the uppercase version, one should always pass by value: One has to take a copy of it anyway (one couldn't change the const reference directly) - so better make it as transparent as possible to the caller and make that copy early so that the caller can optimize as much as possible - as detailed in that paper:
my::string uppercase(my::string s) { /* change s and return it */ }
However, if you don't need to change the parameter anyway, take it by reference to const:
bool all_uppercase(my::string const& s) {
    /* check whether every character is uppercase */
}
However, if the purpose of the parameter is to write something into the argument, then pass it by non-const reference:
bool try_parse(T text, my::string &out) {
    /* try to parse, write result into out */
}
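A filled-in sketch of the three signatures, assuming std::string in place of my::string and an int output for try_parse (both are substitutions made here for illustration, not part of the answer):

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// By value: the function needs its own modifiable copy anyway.
std::string uppercase(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// By reference to const: read-only access, no copy.
bool all_uppercase(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isupper(c) != 0; });
}

// By non-const reference: out is an output parameter.
bool try_parse(const std::string& text, int& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX) {
        return false;                 // out is left untouched on failure
    }
    out = static_cast<int>(value);
    return true;
}

void demo_parameter_styles() {
    assert(uppercase("abc") == "ABC");
    assert(all_uppercase("ABC"));
    int n = 0;
    bool ok = try_parse("42", n);
    assert(ok && n == 42);
}
```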
Depends on the type. You are adding the small overhead of having to make a reference and dereference. For types with a size equal or smaller than pointers that are using the default copy ctor, it would probably be faster to pass by value.
As it has been pointed out, it depends on the type. For built-in data types, it is best to pass by value. Even some very small structures, such as a pair of ints can perform better by passing by value.
Here is an example: assume you have an integer value and you want to pass it to another routine. If that value has been optimized to be stored in a register, then to pass it by reference it must first be stored in memory, and then a pointer to that memory is placed on the stack to perform the call. If it is passed by value, all that is required is pushing the register onto the stack. (The details are a bit more complicated than that, given different calling conventions and CPUs.)
If you are doing template programming, you are usually forced to always pass by const ref since you don't know the types being passed in. The penalty for passing something expensive by value is much worse than the penalty for passing a built-in type by const ref.
This is what I normally work by when designing the interface of a non-template function:
Pass by value if the function does not want to modify the parameter and the value is cheap to copy (int, double, float, char, bool, etc... Notice that std::string, std::vector, and the rest of the containers in the standard library are NOT).
Pass by const pointer if the value is expensive to copy and the function does not want to modify the value pointed to and NULL is a value that the function handles.
Pass by non-const pointer if the value is expensive to copy and the function wants to modify the value pointed to and NULL is a value that the function handles.
Pass by const reference when the value is expensive to copy and the function does not want to modify the value referred to and NULL would not be a valid value if a pointer was used instead.
Pass by non-const reference when the value is expensive to copy and the function wants to modify the value referred to and NULL would not be a valid value if a pointer was used instead.
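The pointer-versus-reference half of these rules might look like this in practice (all four function names are hypothetical, chosen only to illustrate the conventions):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Const pointer: nullptr is a legal input that the function handles.
std::size_t length_or_zero(const std::string* s) {
    return s != nullptr ? s->size() : 0;
}

// Const reference: the caller must supply a real object.
std::size_t length_of(const std::string& s) {
    return s.size();
}

// Non-const pointer: optional in/out argument; nullptr means "nothing to update".
void append_mark_if_present(std::string* s) {
    if (s != nullptr) *s += '!';
}

// Non-const reference: the in/out argument is mandatory.
void append_mark(std::string& s) {
    s += '!';
}

void demo_pointer_vs_reference() {
    std::string text = "hi";
    assert(length_or_zero(nullptr) == 0);
    assert(length_or_zero(&text) == 2);
    assert(length_of(text) == 2);
    append_mark_if_present(nullptr);  // safely ignored
    append_mark_if_present(&text);
    append_mark(text);
    assert(text == "hi!!");
}
```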
Sounds like you got your answer. Passing by value is expensive, but gives you a copy to work with if you need it.
As a rule passing by const reference is better.
But if you need to modify your function argument locally, you are better off passing by value.
For some basic types, performance is generally the same for passing by value and by reference. A reference is represented internally by a pointer, which is why you can expect, for instance, that for a pointer both ways of passing perform the same, or that passing by value is even faster because it avoids a needless dereference.
Pass by value for small types.
Pass by const references for big types (the definition of big can vary between machines) BUT, in C++11, pass by value if you are going to consume the data, since you can exploit move semantics. For example:
class Person {
public:
    Person(std::string name) : name_(std::move(name)) {}
private:
    std::string name_;
};
Now the calling code would do:
Person p(std::string("Albert"));
And only one object would be created and moved directly into member name_ in class Person. If you pass by const reference, a copy will have to be made for putting it into name_.
As a rule of thumb, value for non-class types and const reference for classes.
If a class is really small it's probably better to pass by value, but the difference is minimal. What you really want to avoid is passing some gigantic class by value and having it all duplicated - this will make a huge difference if you're passing, say, a std::vector with quite a few elements in it.
Pass by reference can be much faster than pass by value. I was solving the longest common subsequence problem on Leetcode. It was showing TLE (time limit exceeded) for pass by value but accepted the code for pass by reference. Took me 30 mins to figure this out.
A simple distinction: a function has input and output parameters. If the same parameter serves as both input and output, use call by reference; if the input and output parameters are different, call by value is the better choice.
Example: void amount(int account, int deposit, int total)
Input parameters: account, deposit
Output parameter: total
Input and output are different, so use call by value.
void amount(int total, int deposit)
Input parameters: total, deposit
Output parameter: total
Input and output are the same, so use call by reference.

Why can Python functions return locally declared arrays but C can't? [duplicate]

I was reading this question: Cannot return int array because I ran into the same problem.
It seems that data structures (because C can obviously return a locally declared variable) declared locally within a function cannot be returned, in this case an array.
However Python doesn't suffer from the same problem; as far as I can remember, it's possible to declare an array within a function and to return that array without having to pass it as an argument.
What is the difference "under the hood"? Is Python using pointers implicitly (using malloc within the function)?
For the record, Python's built-in mutable sequence type is called a list, not an array, but it behaves similarly (it's just dynamically resizable, like C++'s std::vector).
In any event, you're correct that all Python objects are implicitly dynamically allocated; only the references (roughly, pointers) to them are on the "stack" (that said, the Python interpreter stack and the C level stack are not the same thing to start with). Comparable C code would dynamically allocate the array and return a pointer to it (with the caller freeing it when done; different Python interpreters handle this differently, but the list would be garbage collected when no longer referenced in one way or another).
Python has no real concept of "stack arrays" (it always returns a single object, though that object could be a tuple to simulate multiple return values), so returns are always ultimately a single "pointer" value (the reference to the returned object).
It seems that data structures (because C can obviously return a locally declared variable) declared locally within a function cannot be returned, in this case an array.
You already have a good Python answer; I wanted to look at the C side a little more closely.
Yes, a C function returns a value. That value may be a primitive C type, or a struct or union type. Or, it may be a pointer type.
The C language syntax makes arrays and pointers seem very similar, which makes arrays special. In most expressions, the name of an array is converted to the address of its first element, so it cannot stand for anything else. In particular, an array name does not refer to the whole array (except as the operand of sizeof or unary &). Because any other use of an array name yields the address of the first element, attempting to return an array results in returning only that address.
Because it's a C function, that address is returned by value: namely, a value of a pointer type. So, when we say,
char *s = strdup("hello");
s is a pointer whose value is not "hello", but the address of the first element of the array that strdup allocates.
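Two common ways around this are sketched below (C-style code compiled as C++; make_squares, Triple, and make_triple are hypothetical names). One returns a pointer to heap memory that outlives the call, the other wraps a fixed-size array in a struct, which is returned by value like any other struct:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Heap allocation: the array outlives the call; the caller must free() it.
int* make_squares(std::size_t n) {
    int* a = static_cast<int*>(std::malloc(n * sizeof *a));
    if (a == nullptr) return nullptr;
    for (std::size_t i = 0; i < n; ++i) a[i] = static_cast<int>(i * i);
    return a;
}

// Struct wrapper: the array member is copied as part of the struct value.
struct Triple {
    int v[3];
};

Triple make_triple() {
    Triple t = {{1, 2, 3}};
    return t;                         // returns a copy of the whole struct
}

void demo_returning_arrays() {
    int* squares = make_squares(4);
    assert(squares != nullptr);
    assert(squares[3] == 9);
    std::free(squares);

    Triple t = make_triple();
    assert(t.v[0] == 1 && t.v[2] == 3);
}
```

The heap version is the closer analogue of what Python does for a list: the object lives in dynamically allocated memory and only a reference to it is returned.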
Python doesn't suffer from the same problem
When Y is a property of X, Y is a problem only if that property is, in the eyes of the beholder, undesirable. You can be sure the way C treats arrays is not accidental, and is often convenient.

How to use a .NET method which modifies in place in Python?

I am trying to use a .NET dll in Python. In a .NET language the method requires passing it 2 arrays by reference which it then modifies:
public void GetItems(
out int[] itemIDs,
out string[] itemNames
)
How can I use this method in Python using the Python for .NET module?
Edit: Forgot to mention this is in CPython not IronPython.
Additional info.
When I do the following:
itemIDs = []
itemNames = []
GetItems(itemIDs, itemNames)
I get an output like:
(None, <System.Int32[] at 0x43466c0>, <System.String[] at 0x43461c0>)
Do I just need to figure out how to convert these back into python types?
PythonNet doesn't document this quite as clearly as IronPython, but it does almost the same thing.
So, let's look at the IronPython documentation for ref and out parameters:
The Python language passes all arguments by-value. There is no syntax to indicate that an argument should be passed by-reference like there is in .NET languages like C# and VB.NET via the ref and out keywords. IronPython supports two ways of passing ref or out arguments to a method, an implicit way and an explicit way.
In the implicit way, an argument is passed normally to the method call, and its (potentially) updated value is returned from the method call along with the normal return value (if any). This composes well with the Python feature of multiple return values…
In the explicit way, you can pass an instance of clr.Reference[T] for the ref or out argument, and its Value field will get set by the call. The explicit way is useful if there are multiple overloads with ref parameters…
There are examples for both. But to tailor it to your specific case:
itemIDs, itemNames = GetItems()
Or, if you really want:
itemIDsRef = clr.Reference[Array[int]]()
itemNamesRef = clr.Reference[Array[String]]()
GetItems(itemIDsRef, itemNamesRef)
itemIDs, itemNames = itemIDsRef.Value, itemNamesRef.Value
CPython using PythonNet does basically the same thing. The easy way to do out parameters is to not pass them and accept them as extra return values, and for ref parameters to pass the input values as arguments and accept the output values as extra return values. Just like IronPython's implicit solution. (Except that a void function with ref or out parameters always returns None before the ref or out arguments, even if it wouldn't in IronPython.) You can figure it out pretty easily by inspecting the return values. So, in your case:
_, itemIDs, itemNames = GetItems()
Meanwhile, the fact that these happen to be arrays doesn't make things any harder. As the docs explain, PythonNet provides the iterable interface for all IEnumerable collections, and the sequence protocol as well for Array. So, you can do this:
for itemID, itemName in zip(itemIDs, itemNames):
    print itemID, itemName
And the Int32 and String objects will be converted to native int/long and str/unicode objects just as if they were returned directly.
If you really want to explicitly convert these to native values, you can. map or a list comprehension will give you a Python list from any iterable, including a PythonNet wrapper around an Array or other IEnumerable. And you can explicitly make a long or unicode out of an Int32 or String if you need to. So:
itemIDs = map(int, itemIDs)
itemNames = map(unicode, itemNames)
But I don't see much advantage to doing this, unless you need to, e.g., pre-check all the values before using any of them.
I have managed to use the method
bool XferData(ref byte[] buf, ref int len) from C# library CyUSB.dll
with the following code:
>>> xferLen = 2;
>>> outData=[10, 0]
>>> inData=[]
>>> n, outData, xferLen = XferData(outData, xferLen)
>>> print n, outData[0], outData[1], xferLen
True 10 0 2
Hope this helps someone.
