Python Style: nested vs extra function - python

I'm quite new to Python (2.7) and have a question about the most Pythonic way to do something. My code (part of a class) looks like this (a somewhat naive version):
def calc_pump_height(self):
    for i in range(len(self.primary_)):
        for j in range(len(self.primary_)):
            if self.connections_[i][j].sub_kind_ in [1,4]:
                self.calc_spec_pump_height(i,j)

def calc_spec_pump_height(self,i,j):
    pass
(obviously pass will be replaced by something else, manipulating attributes of the object of this class, without generating a return value)
I'd like to ask how I should do this: I could avoid the second function and write the extra code directly into the first function, getting rid of one function (Simple is better than complex), but creating a heavily nested function at the same time (Flat is better than nested).
I could also create some sort of list comprehension to avoid using a double loop, e.g.:
def calc_pump_height(self):
    ra = range(len(self.primary_))
    [self.calc_spec_pump_height(i,j) for i,j in zip(ra, ra)]
(I'd have to move the if condition into the 2nd function; this would also create a null-list but I don't care about this, since calc_spec_pump_height is supposed to manipulate the object, not return something useful)
In essence: I'm iterating over a 2D list, testing each object for a certain characteristic and then doing something with that object.
Which of the above methods is 'the best'? Or is there another way that I'm missing?

The key thing about functions/methods is that they should do one thing.
calc_pump_height implements two things: it finds elements in a 2D list that match some criteria, and then it calculates a value for each of those elements. It's OK for its purpose to be combining the other two operations, if that makes sense for the object's public API, but it's not OK for it to implement either or both.
Finding the elements that match the criteria is a discrete step; that should be a function.
Calculating your value is clearly a discrete step; that should be a function.
I would implement the element matcher as a (private) generator, that takes the test condition as an argument, and yields all matching elements. It's just an iterator over your data structure, masked by the logical test. You can wrap that in a named public method called get_1_4_subkinds() or something that makes more sense in your domain. That generalises the code and gives you the flexibility to implement other conditions in the future. Also, your i and j are tightly coupled, so it makes sense to pass them around as a single concept. Then your code becomes:
def calc_pump_height(self):
    for subkind_indices in self.get_1_4_subkinds():
        self.calc_spec_pump_height(subkind_indices)
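A minimal sketch of how the generator-based matcher might look. The attribute names primary_, connections_ and sub_kind_ come from the question; the Connection and Network classes, the _matching_indices helper and the visited list are hypothetical stand-ins added only so the example runs on its own:

```python
class Connection(object):
    """Stand-in for the question's connection objects (assumed shape)."""
    def __init__(self, sub_kind):
        self.sub_kind_ = sub_kind


class Network(object):
    """Hypothetical host class; only the attribute names come from the question."""
    def __init__(self, sub_kinds):
        self.primary_ = list(range(len(sub_kinds)))
        self.connections_ = [[Connection(k) for k in row] for row in sub_kinds]
        self.visited = []  # records which index pairs were processed

    def _matching_indices(self, predicate):
        """Yield (i, j) for every connection that satisfies predicate."""
        for i, row in enumerate(self.connections_):
            for j, connection in enumerate(row):
                if predicate(connection):
                    yield i, j

    def get_1_4_subkinds(self):
        """Named public wrapper around the generic matcher."""
        return self._matching_indices(lambda c: c.sub_kind_ in (1, 4))

    def calc_spec_pump_height(self, indices):
        i, j = indices
        self.visited.append((i, j))  # placeholder for the real calculation

    def calc_pump_height(self):
        for subkind_indices in self.get_1_4_subkinds():
            self.calc_spec_pump_height(subkind_indices)


net = Network([[1, 2], [4, 3]])
net.calc_pump_height()
print(net.visited)
```

Passing the test as a predicate keeps the iteration logic in one place, so a different condition only needs another thin wrapper.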

You have misunderstood “simplicity”:
write the extra code directly into the first function, getting rid of one function (Simple is better than complex)
That's not simple. Breaking complex sequences into discrete, focussed functions increases simplicity.
In that light, I would say that yes, you should definitely prefer calc_spec_pump_height as a separate function.

You can eliminate one level of nesting in your first function by using itertools.product to generate your i and j values at the same time: itertools.product(range(len(self.primary_)), repeat=2). The zip you use in your second version won't work correctly; it will only yield identical pairs: 0,0, 1,1, 2,2, etc.
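To make the difference concrete, here is a small comparison. The value n = 3 is an arbitrary stand-in for len(self.primary_):

```python
import itertools

n = 3  # stands in for len(self.primary_)
indices = range(n)

pairs_from_zip = list(zip(indices, indices))
pairs_from_product = list(itertools.product(indices, repeat=2))
pairs_from_nested_loops = [(i, j) for i in indices for j in indices]

print(pairs_from_zip)      # only the diagonal pairs
print(pairs_from_product)  # every (i, j) combination
print(pairs_from_product == pairs_from_nested_loops)
```

product visits the pairs in the same order as the nested loops, so it can replace them without changing behaviour.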
As for the overall design, you should not use a list comprehension if you don't care about the return value from the function you're calling. Use an explicit loop when it's the looping you want (rather than a list of computed values).
If there's a non-trivial amount of code that will go in calc_spec_pump_height, it makes perfect sense to make it as a separate method. If it's a one or two liner, then it might be OK to inline within calc_pump_height, but that method's loops and condition testing may be complicated enough already to justify factoring out the inner part of the algorithm.
You should usually think about splitting a big function up when it is too long to fit onto a single screen in your editor. That is about the limit of how many details (variable names, etc.) we can keep in our mind simultaneously. On the other hand, you shouldn't waste time (either your own programming time or function call overhead at run time) by factoring out every little piece of every problem. Factor part of a function out if you're using it from more than one place, or if you can't keep the details of the whole function in your head at once otherwise.
So, other than the (marginal) improvement of itertools.product and given the limited information you've provided about what calc_spec_pump_height will do, I think your code is already about as good as it can get!

Related

Is it computationally faster to change a list to a tuple before moving it to another function?

I create a list of 24 floats that is needed in a calling function. The calling function will not need to alter the floats, so it can work with a tuple. Is it computationally faster to change the list to a tuple, tuple(list), before returning it to the calling function rather than passing and then using the list the entire time?
A corollary to this is: Should I change a list to a tuple within a function if the function can work with the tuple? I have many instances of creating a list, then using it later in the same function where a tuple of the list would work.
I have several instances of this in my program, so any speed advantage would be helpful to overall performance.
I don't know how to time these things and cannot find a similar past question. I know tuples are about 3 times faster.
Code sample not needed.
Once the object has been created, it no longer matters. If you want to improve your code, decide before creation whether you need a tuple or a list. Be aware that tuples are fixed-size and lists are dynamic, so it depends on what you are trying to do.
After creation it does not matter, because accessing elements is not faster for one than the other, and converting a list to a tuple after creation will not make your code faster or more efficient. You can look here for more: link
However, to test the execution time you can use the timeit module:
import timeit
start = timeit.default_timer()
# your function or piece of code
end= timeit.default_timer()
print(end-start)
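For a comparison closer to the question, timeit.timeit can run each variant many times. This is only a sketch: the total function is a hypothetical stand-in for the calling function, and the timings it prints will vary by machine and interpreter:

```python
import timeit

values = [float(i) for i in range(24)]  # 24 floats, as in the question


def total(seq):
    """Stand-in for the calling function: it only reads the sequence."""
    return sum(seq)


# Time passing the list as-is versus converting it to a tuple first.
as_list = timeit.timeit(lambda: total(values), number=100000)
with_conversion = timeit.timeit(lambda: total(tuple(values)), number=100000)

print("list passed as-is:  %.4f s" % as_list)
print("converted to tuple: %.4f s" % with_conversion)
```

The second variant includes the cost of the conversion itself, which is the cost the question is asking about.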

Difference between Python methods that create a new variable and those that do not

Some methods don't need to make a new variable; for example, lists.reverse() works like this:
lists = [123, 456, 789]
lists.reverse()
print(lists)
This method reverses the list itself (without a new variable).
Why are there various ways to produce a variable in Python?
Some calls, like variable.method().method2().method3(), can be chained, but type(variable) and print() cannot. Why can't we write something like variable.print() or variable.type()?
Are there any philosophical reasons for this in Python?
You may be confused by the difference between a function and a method, and by three different purposes to them. As much as I dislike using SO for tutorial purposes, these issues can be hard to grasp from other documentation. You can look up function vs method easily enough -- once you know it's a (slightly) separate issue.
Your first question is a matter of system design. Python merely facilitates what programmers want to do, and the differentiation is common to many (most?) programming languages since ASM and FORTRAN crawled out of the binary slime pools in the days when dinosaurs roamed the earth.
When you design how your application works, you need to make a lot of implementation decisions: individual variables vs a sequence, in-line coding vs functions, separate functions vs encased functions vs classes and methods, etc. Part of this decision making is what each function should do. You've raised three main types:
(1) Process this data -- take the given data and change it, rearrange it, whatever needs doing -- but I don't need the previous version, just the improved version, so just put the new stuff where the old stuff was. This is used almost exclusively when one variable is getting processed; we don't generally take four separate variables and change each of them. In that case, we'd put them all in a list and change the list (a single variable). reverse falls into this class.
One important note is that for such a function, the argument in question must be mutable (capable of change). Python has mutable and immutable types. For instance, a list is mutable; a tuple is immutable. If you wanted to reverse a tuple, you'd need to return a new tuple; you can't change the original.
(2) Tell me something interesting -- take the given data and extract some information. However, I'm going to need the originals, so leave them alone. If I need to remember this cool new insight, I'll put it in a variable of my own. This is a function that returns a value. sqrt is one such function.
(3) Interact with the outside world -- input or output data permanently. For output, nothing in the program changes; we may present the data in an easy-to-read format, but we don't change anything internally. print is such a function.
Much of this decision also depends on the function's designed purpose: is this a "verb" function (do something) or a noun/attribute function (look at this data and tell me what you see)?
Now you get the interesting job for yourself: learn the art of system design. You need to become familiar enough with the available programming tools that you have a feeling for how they can be combined to form useful applications.
See the documentation:
The reverse() method modifies the sequence in place for economy of space when reversing a large sequence. To remind users that it operates by side effect, it does not return the reversed sequence.
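The kinds of function described above can be seen side by side in a short sketch. The variable names are illustrative only, and print syntax follows Python 3:

```python
import math

numbers = [123, 456, 789]

# (1) Mutates its object in place and returns None.
result = numbers.reverse()
print(numbers, result)

# A tuple is immutable, so reversing it has to build a new object.
original = (1, 2, 3)
flipped = tuple(reversed(original))
print(original, flipped)

# (2) Returns a new value and leaves its argument alone.
x = 16
root = math.sqrt(x)
print(x, root)

# Methods that return a value can be chained; reverse() cannot be,
# because it returns None.
shout = "  hello ".strip().upper().replace("H", "J")
print(shout)
```

Chaining works only because each string method returns a new string for the next call to operate on.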

How to revert the functionality of a lambda/function in python

I am trying to regenerate my original data from already manipulated data. The transformation takes five parameters: target, source, target_key, source_key, transformer
For example:
target = {}
source = {'first_name': 'tom'}
target_key = 'name'
source_key = 'first_name'
transformer = lambda value: value.title()
So, currently, I set first_name to name, and the response becomes {'name': 'Tom'}.
Now I am trying to reverse it. If I get {'name': 'Tom'}, it should result in {'first_name': 'tom'} using the same lambda or function. Similarly, there are many other keys with different transformers.
Is there any way or keyword to reverse the functionality of a lambda or any function?
Thanks,
Short answer: it is fundamentally impossible to construct an "inverse" function for an arbitrary given function.
You cannot derive the "reverse" function from a given function (whether it is a lambda expression is irrelevant). There are several aspects here:
First of all, it is possible that several inputs map on the same output. Take for instance the function lambda x : x.lower(). In that case both 'foo' and 'FOO' map to 'foo'. So even if you somehow could calculate input that maps on a given output, a question would be: "what input do you pick".
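The many-to-one problem is easy to see directly, both for lower() and for the title() transformer from the question:

```python
lower = lambda x: x.lower()
transformer = lambda value: value.title()  # from the question

inputs = ['foo', 'FOO', 'Foo']
lowered = {lower(s) for s in inputs}
print(lowered)  # three different inputs collapse to one output

names = ['tom', 'TOM', 'tOm']
titled = {transformer(s) for s in names}
print(titled)   # given only the output, the original case is lost
```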
Next, suppose we say that any matching input would suffice; one can then ask whether finding one is possible. It still is not, since the problem is also undecidable: if you provide as "expected output" a value that cannot be generated by the function, the hypothetical inverse function cannot know that. One can prove this using computability theory, since it would conflict with the fact that the emptiness problem E_TM is undecidable.
Is there a theoretical way to derive an object that maps to a given valid value? Yes: one could enumerate all possible inputs (the set is infinite but countable, hence enumerable), calculate the output for each, and then validate it. Furthermore, the evaluation of the function calls should happen in "parallel", since it is possible that one of them results in an infinite loop.
Nevertheless hoping that it is realistic to construct a real function that calculates the inverse is not advisable. In a practical sense the above sketched algorithm is unfeasible. It would require an enormous amount of memory to store all the simulations of these functions. Furthermore it is possible that these have side effects (like writing to a file). As a result you should make copies of everything that might have side effects. Furthermore in practice some side effects cannot be "virtualized" or "undone". If the function for instance communicates with a web server, you cannot "undo" the HTTP request. It can also take ages before a valid input structure is entered and evaluated.
As @JohnColeman says in his comment, the fact that a function is not (feasibly) invertible is sometimes desired behavior. In asymmetric encryption, for instance, the public key is usually publicly available. Nevertheless, we do not want a message encrypted with the public key to be (efficiently) recoverable from the ciphertext. A lot of today's cryptography and security depends on the fact that it is hard or impossible to perform the inverse operation of a function.
A final note is that it can of course be possible to construct an "inverse constructor" for certain families of functions. But in general (meaning an "inverse generator" that can take any kind of function as input), it is impossible.
To regenerate your data, you need to invert the mappings that were applied to it. There is no general function-inverse operator in Python or any other programming language, for the reasons that @Willem explained, but humans are pretty good at identifying and reversing simple manipulations. With enough work, it is possible to understand and reverse complicated manipulations too. This is part of how hackers reverse-engineer programs and algorithms, and it is what you need to do too if your data is worth the effort. (Of course you can partially automate the process, especially if you know the kinds of manipulations that have been applied, e.g. if you wrote them yourself.)
If you have the source code, it's relatively easy: inspect each function, write a suitable inverse (to the extent that they exist), and write a main loop that somehow determines which inverse to apply. If you don't have the source code but have the compiled program (.pyc or .pyo files), you can still disassemble them and puzzle out what they do. See the dis module (but it's not at all trivial):
>>> import dis
>>> dis.dis(transformer)
  1           0 LOAD_FAST                0 (value)
              3 LOAD_ATTR                0 (title)
              6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              9 RETURN_VALUE
So... the bottom line is, you have to do it yourself. Good luck with it.
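Doing it yourself could look roughly like the sketch below, where each transformer is stored next to a hand-written inverse. The RULES table and the forward and backward helpers are hypothetical names, not part of the question's code, and using lower() as the inverse of title() is an assumption that only holds if the original values were all lower case:

```python
# Each rule pairs a forward transformer with a hand-written inverse.
# ASSUMPTION: the inverse of str.title is taken to be str.lower, which
# is only correct if the original values were all lower case.
RULES = [
    # (source_key, target_key, transformer, inverse)
    ('first_name', 'name', lambda v: v.title(), lambda v: v.lower()),
]


def forward(source):
    """Apply every rule whose source key is present."""
    return {target_key: transformer(source[source_key])
            for source_key, target_key, transformer, _ in RULES
            if source_key in source}


def backward(target):
    """Undo every rule whose target key is present."""
    return {source_key: inverse(target[target_key])
            for source_key, target_key, _, inverse in RULES
            if target_key in target}


response = forward({'first_name': 'tom'})
print(response)
print(backward(response))
```

Keeping the inverse beside the transformer means a new key needs only one new row, and the places where the inverse is lossy stay visible in one table.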

Return a value coming directly from a function call vs. an intermediate variable

I have a function f(x), which does something and returns values (a tuple).
I have another function that calls this function after processing its parameters (what the function does overall is irrelevant to the question). I would like to know whether there is anything wrong with returning the function itself, versus running the function, storing the output in a variable and returning the variable.
A variable has a cost, and assigning a value to a variable has a cost; but besides that, is there anything happening behind the scenes that would make one better than the other?
def myfunction(self):
    # [do something]
    return f(x)
is the same as
def myfunction(self):
    # [do something]
    b = f(x)
    return b
or is one preferable to the other (and why)? I am asking purely from the OOP perspective, without considering that creating and assigning variables has a cost in terms of memory and CPU cycles.
That doesn't return the function. Returning the function would look like return f. You're returning the result of the function call. Generally speaking, the only reason to save that result before returning it is if you plan to do some other kind of processing on it before the return, in which case it's faster to just refer to a saved value rather than recomputing it. Another reason to save it would be for clarity, turning what might be a long one-liner with extensive chaining into several steps.
There's a possibility that those two functions might produce different results if you have some kind of asynchronous process that modifies your data in the background between saving the reference and returning it, but that's something you'll have to keep in mind based on your program's situation.
In a nutshell, save it if you want to refer to it, or just return it directly otherwise.
Those are practically identical; use whichever one you think is more readable. If the performance of one versus the other actually matters for you, perhaps Python is not the best choice ;).
The cost difference between these is utterly negligible: in the worst case, one extra dictionary store, one extra dictionary lookup and one extra string in memory. In practice it won't even be that bad, since CPython stores local variables in a C array, so it's more like two C-level pointer indirections.
As a matter of style, I would usually avoid the unnecessary variable, but it's possible that it might be better in particular cases. As a guideline, think about things like whether the amalgamated version leads to an excessively long line of code, whether the extra variable has a better name than e.g. result, and how clear it is that that function call is the result you need (and if it isn't, whether and how much a variable helps).
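If you want to see the difference for yourself, the dis module can print the bytecode of both versions. This is a sketch with a made-up f; the exact instructions depend on the interpreter version, so no output is shown:

```python
import dis


def f(x):
    """Made-up example function that returns a tuple."""
    return (x, x + 1)


def direct(x):
    return f(x)


def via_variable(x):
    b = f(x)
    return b


# Both versions return the same value.
print(direct(3) == via_variable(3))

# Compare the bytecode; the variable version typically adds a store
# and a load of the local name.
dis.dis(direct)
dis.dis(via_variable)
```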

Python documentation: iterable many times?

In documenting a Python function, I find it more Pythonic to say:
def Foo(i):
    """i: An iterable containing…"""
…rather than…
def Foo(i):
    """i: A list of …"""
When i really doesn't need to be a list. (Foo will happily operate on a set, tuple, etc.) The problem is generators. Generators typically only allow 1 iteration. Most functions are OK with generators or iterables that only allow a single pass, but some are not.
For those functions that cannot accept generators or other things that can only be iterated once, is there a clear, consistent Python term to say "thing that can be iterated more than once"?
The Python glossary entries for iterable and iterator seem to have a "once, but maybe more if you're lucky" definition.
I don't know of a standard term for this, at least not offhand, but I think "reusable iterable" would get the point across if you need a short phrase.
In practice, it's generally possible to structure your function so that you don't need to iterate over i more than once. Alternatively, you can create a list out of the iterable and then iterate over the list as many times as you want; or you can use itertools.tee to get multiple independent "copies" of the iterator. That lets you accept a generator even if you do need to use it more than once.
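Both workarounds are short. In this sketch, mean_and_spread is a hypothetical function that needs two passes over its input:

```python
import itertools


def mean_and_spread(values):
    """Needs two passes, so materialise the input once up front."""
    values = list(values)
    mean = sum(values) / float(len(values))
    spread = max(abs(v - mean) for v in values)
    return mean, spread


gen = (n * n for n in range(4))  # 0, 1, 4, 9: a one-shot generator
stats = mean_and_spread(gen)
print(stats)

# itertools.tee gives independent iterators over the same source.
first, second = itertools.tee(n * n for n in range(4))
total, biggest = sum(first), max(second)
print(total, biggest)
```

Note that tee buffers items that one iterator has consumed and the other has not, so when one copy is exhausted before the other starts, list() is usually the simpler choice.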
This is probably more a matter of style and preference than anything else, yet... I have a different take on my documentation: I always write the docstring according to the expected input in the context of the program.
Example: if I wrote a function that expects to go over the keys of a dictionary and ignore its values, I write:
arg : a dictionary of...
even if for e in arg: would work with other iterables. I chose to do so because, within the context of my code, I don't care if the function would still work... I care more that whoever reads the documentation understands how that function is meant to be used.
On the other hand, if I am writing a utility function that can cope with a wide spectrum of iterables by design, I go one of these two ways:
document what kind of exception will be raised under certain conditions [e.g. "Raise TypeError if the iterable can't be iterated more than once"]
perform some pre-emptive argument handling that will make the function compatible with 'once-only' iterables.
In other words, I try to either make my function solid enough to handle edge cases, or to be very outspoken on its limitations.
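One way the "raise an exception" option might be implemented is shown below. The function name require_reusable is hypothetical, and the check is a heuristic rather than a guarantee: it relies on the fact that an iterator returns itself from iter(), while containers return a fresh iterator each time.

```python
def require_reusable(iterable):
    """Raise TypeError if iterable looks like a one-shot iterator.

    An iterator returns itself from iter(); containers such as list,
    set or tuple return a new iterator on every call.
    """
    if iter(iterable) is iterable:
        raise TypeError("expected a reusable iterable, got an iterator")
    return iterable


print(require_reusable([1, 2, 3]))

try:
    require_reusable(n for n in range(3))
except TypeError as exc:
    print("rejected:", exc)
```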
Again: there's nothing wrong with the approach you want to take, but I consider this one of the cases in which "explicit is better than implicit": documentation that mentions a "reusable iterable" is definitely accurate, but the adjective could easily be overlooked.
HTH!
