I know this is super basic, and I have been searching everywhere, but I am still confused by everything I'm seeing. I am not sure of the best way to do this and am having a hard time wrapping my head around it.
I have a script with multiple functions. I would like the first function to pass its output to the second, the second to pass its output to the third, and so on. Each does its own step in an overall process applied to the starting dataset.
For example, very simplified with bad names but this is to just get the basic structure:
#!/usr/bin/python
# script called process.py
import sys

infile = sys.argv[1]

def function_one():
    # do things
    return function_one_output

def function_two():
    # take output from function_one, and do more things
    return function_two_output

def function_three():
    # take output from function_two, do more things
    return function_three_output  # or print it
I want this to run as one script and print the output/write to new file or whatever which I know how to do. Just am unclear on how to pass the intermediate outputs of each function to the next etc.
infile -> function_one -> (intermediate1) -> function_two -> (intermediate2) -> function_three -> final result/outfile
I know I need to use return, but I am unsure how to call this at the end to get my final output
Individually?
function_one(infile)
function_two()
function_three()
or within each other?
function_three(function_two(function_one(infile)))
or within the actual function?
def function_one():
    # do things
    return function_one_output

def function_two():
    input_for_this_function = function_one()
    # etc etc etc
Thank you friends, I am over complicating this and need a very simple way to understand it.
You could define a data-streaming helper function:
from functools import reduce

def flow(seed, *funcs):
    return reduce(lambda arg, func: func(arg), funcs, seed)

flow(infile, function_one, function_two, function_three)

# for example
flow('HELLO', str.lower, str.capitalize, str.swapcase)
# returns 'hELLO'
Edit:
I would now suggest that a more "pythonic" way to implement the flow function above is:
def flow(seed, *funcs):
    for func in funcs:
        seed = func(seed)
    return seed
As ZdaR mentioned, you can run each function, store the result in a variable, and then pass it to the next function.
def function_one(file):
    # do things on file
    return function_one_output

def function_two(myData):
    # doThings on myData
    return function_two_output

def function_three(moreData):
    # doMoreThings on moreData
    return function_three_output  # or print it

def Main():
    firstData = function_one(infile)
    secondData = function_two(firstData)
    function_three(secondData)
This is assuming your function_three would write to a file or doesn't need to return anything. Another method, if these three functions will always run together, is to call them inside function_three. For example...
def function_three(file):
    firstStep = function_one(file)
    secondStep = function_two(firstStep)
    # doThings on secondStep
    # return/print to file
Then all you have to do is call function_three in your main and pass it the file.
For safety, readability and debugging ease, I would temporarily store the results of each function.
def function_one():
    # do things
    return function_one_output

def function_two(function_one_output):
    # take function_one_output and do more things
    return function_two_output

def function_three(function_two_output):
    # take function_two_output and do more things
    return function_three_output  # or print it

result_one = function_one()
result_two = function_two(result_one)
result_three = function_three(result_two)
The added benefit here is that you can then check that each function is correct. If the end result isn't what you expected, just print the results you're getting or perform some other check to verify them. (Also, if you run the script in an interactive interpreter, the variables stay in the namespace after the script ends, so you can test them interactively.)
result_one = function_one()
print(result_one)
result_two = function_two(result_one)
print(result_two)
result_three = function_three(result_two)
print(result_three)
Note: I used multiple result variables, but as PM 2Ring notes in a comment, you could just reuse the name result over and over. That would be particularly helpful if the results are large objects.
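To make the name-reuse idea concrete, here is a minimal sketch. The three function bodies are invented purely for illustration (the original post leaves them as "do things"), so treat them as placeholders for your own steps.

```python
def function_one(text):
    # Hypothetical step 1: split the raw text into lines.
    return text.splitlines()

def function_two(lines):
    # Hypothetical step 2: strip whitespace and drop blank lines.
    return [line.strip() for line in lines if line.strip()]

def function_three(lines):
    # Hypothetical step 3: count the lines that are left.
    return len(lines)

# The same name is rebound at every step, so each intermediate value
# can be garbage-collected as soon as the next step has replaced it.
result = function_one("alpha\n\n beta \ngamma\n")
result = function_two(result)
result = function_three(result)
print(result)  # 3
```

The trade-off is that the intermediate values are no longer available for inspection afterwards, so this suits the case where the pipeline already works and memory matters more than debugging.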
It's always better (for readability, testability and maintainability) to keep your functions as decoupled as possible, and to write them so the output only depends on the input whenever possible.
So in your case, the best way is to write each function independently, i.e.:
def function_one(arg):
    do_something()
    return function_one_result

def function_two(arg):
    do_something_else()
    return function_two_result

def function_three(arg):
    do_yet_something_else()
    return function_three_result
Once you're there, you can of course directly chain the calls:
result = function_three(function_two(function_one(arg)))
but you can also use intermediate variables and try/except blocks if needed for logging / debugging / error handling etc:
r1 = function_one(arg)
logger.debug("function_one returned %s", r1)
try:
    r2 = function_two(r1)
except SomePossibleException as e:
    logger.exception("function_two raised %s for %s", e, r1)
    # either return, re-raise, ask the user what to do etc
    return 42  # when in doubt, always return 42 !
else:
    r3 = function_three(r2)
    print("Yay ! result is %s" % r3)
As an extra bonus, you can now reuse these three functions anywhere, each on its own and in any order.
NB : of course there ARE cases where it just makes sense to call a function from another function... Like, if you end up writing:
result = function_three(function_two(function_one(arg)))
everywhere in your code AND it's not an accidental repetition, it might be time to wrap the whole in a single function:
def call_them_all(arg):
    return function_three(function_two(function_one(arg)))
Note that in this case it might be better to decompose the calls, as you'll find out when you'll have to debug it...
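As a rough illustration of what a decomposed wrapper could look like, here is a runnable sketch. The step bodies, the name run_pipeline, and the choice to return None on failure are all assumptions made for the example, not part of the answer above.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def function_one(arg):
    # Hypothetical step 1: split a comma-separated string into fields.
    return arg.split(",")

def function_two(fields):
    # Hypothetical step 2: convert fields to integers.
    # int() raises ValueError for a non-numeric field.
    return [int(field) for field in fields]

def function_three(numbers):
    # Hypothetical step 3: add the numbers up.
    return sum(numbers)

def run_pipeline(arg):
    # Same chain as call_them_all, but with named intermediates so each
    # step can be logged and the risky step can be guarded on its own.
    r1 = function_one(arg)
    logger.debug("function_one returned %s", r1)
    try:
        r2 = function_two(r1)
    except ValueError:
        logger.exception("function_two failed for %s", r1)
        return None
    return function_three(r2)

print(run_pipeline("1,2,3"))  # 6
```

Because the three step functions stay independent, the wrapper is the only place that knows about their order, and it is also the only place you need to touch when debugging the chain.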
I'd do it this way:
def function_one(x):
    # do things
    output = x ** 1
    return output

def function_two(x):
    output = x ** 2
    return output

def function_three(x):
    output = x ** 3
    return output
Note that I have modified the functions to accept a single argument, x, and added a basic operation to each.
This has the advantage that each function is independent of the others (loosely coupled), which allows them to be reused in other ways. In the example above, function_two() returns the square of its argument, and function_three() the cube of its argument. Each can be called independently from elsewhere in your code, without being entangled in a hardcoded call chain such as you would have if you called one function from another.
You can still call them like this:
>>> x = function_one(3)
>>> x
3
>>> x = function_two(x)
>>> x
9
>>> x = function_three(x)
>>> x
729
which lends itself to error checking, as others have pointed out.
Or like this:
>>> function_three(function_two(function_one(2)))
64
if you are sure that it's safe to do so.
And if you ever wanted to calculate the square or cube of a number, you can call function_two() or function_three() directly (but, of course, you would name the functions appropriately).
With d6tflow you can easily chain together complex data flows and execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.
import d6tflow

class Function_one(d6tflow.tasks.TaskCache):
    def run(self):
        function_one_output = do_things()
        self.save(function_one_output)  # instead of return

@d6tflow.requires(Function_one)
class Function_two(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_one = self.inputLoad()  # load function input
        function_two_output = do_more_things()
        self.save(function_two_output)

@d6tflow.requires(Function_two)
class Function_three(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_two = self.inputLoad()
        function_three_output = do_more_things()
        self.save(function_three_output)

d6tflow.run(Function_three())  # executes all functions

function_one_output = Function_one().outputLoad()  # get function output
function_three_output = Function_three().outputLoad()
It has many more useful features, such as parameter management, persistence, and intelligent workflow management. See https://d6tflow.readthedocs.io/en/latest/
Calling function_three(function_two(function_one(infile))) this way would be best: you do not need global variables, and each function is completely independent of the others.
Edited to add:
I would also say that function_three should not print anything. If you want to print the returned result, use:
print(function_three(function_two(function_one(infile))))
or something like:
output = function_three(function_two(function_one(infile)))
print(output)
Use parameters to pass the values:
def function1():
    foo = do_stuff()
    return function2(foo)

def function2(foo):
    bar = do_more_stuff(foo)
    return function3(bar)

def function3(bar):
    baz = do_even_more_stuff(bar)
    return baz

def main():
    thing = function1()
    print(thing)
The code is very long so I won't type it in.
What I am confused about, as a beginner programmer, is function calling. I had a CSV file, and my function divided all of its contents (they were integers) by 95 to get the normalised scores.
I finished the function by returning the result: return studentp_file
Now I want to carry this new variable into another function.
This new function will get the average of studentp_file, so I made a new function. I'll add the other function as a template of what I'm doing.
def normalise(student_file, units_file):
    # ~ Do stuff here ~
    return studentp_file

def mean(studentp_file):
    mean()
What I get confused about is what to put in the mean(). Do I keep it or remove it? I understand you guys don't know the file I'm working with, but a little basic understanding of how functions and function calling work would be appreciated. Thanks.
When you call your function, you need to pass in the parameters it needs (based on what you specified in your def statement). So your code might look like this:
def normalise(student_file, units_file):
    # ~ Do stuff here ~
    return studentp_file

def mean(studentp_file):
    # ~ other stuff here ~
    return mean_value

# main code starts here
# Get the student file and units file from somewhere; I'll call them A and B.
# Get the resulting studentp file back from the function call and store it in variable C.
C = normalise(A, B)
# Now call the mean function using the file we got back from normalise and capture the result in variable my_mean.
my_mean = mean(C)
print(my_mean)
I assume that the normalise function is executed prior to the mean function? If so, try this structure:
def normalise(student_file, units_file):
    # do stuff here
    return studentp_file

def mean(studentp_file):
    # do stuff here

sp_file = normalise(student_file, units_file)
mean(sp_file)
Functions in Python (2 or 3) are made for reusability and to keep your code organized in blocks. A function may or may not return a value, based on the arguments you pass (if it accepts arguments). Think of functions as real-life factories: raw goods are fed into a factory so that it produces a finished product. Functions are like that too. :)
Now, notice that I assigned the value of the function call normalise(...) to a variable called sp_file. This function call accepts the parameters (student_file, units_file), which are the 'raw' goods fed into your function normalise.
return basically hands a value back to the point in your code that called the function. In this case, return hands the value of studentp_file back to sp_file. sp_file then holds studentp_file's value and can be passed to the mean() function.
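Here is a small runnable version of that structure. It is only a sketch: it assumes the scores are already loaded into a plain list (the original reads them from files that are not shown), keeps the divide-by-95 step from the question, and uses made-up sample numbers.

```python
def normalise(scores, max_score=95):
    # Divide every raw score by the maximum to get a value between 0 and 1.
    return [score / max_score for score in scores]

def mean(values):
    # Average of the values that were handed in as the argument.
    return sum(values) / len(values)

raw_scores = [95, 47.5, 0]             # made-up sample data
sp_scores = normalise(raw_scores)      # the returned list is bound to sp_scores
average = mean(sp_scores)              # ...and passed on as mean's input
print(average)  # 0.5
```

Note that mean never calls itself or normalise. It only works on whatever the caller passes in, which is what answers the "what do I put in mean()" question.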
/ogs
Well, it's unclear, but why not just (dummy example):
def f(a, b):
    return f2(3) + a + b

def f2(c):
    return c + 1
Call f2 inside f, and return the value from f2.
If the results from function one will always be passed to function two, you could do this:
def f_one(x, y):
    return f_two(x, y)

def f_two(x, y):
    return x + y

print(f_one(1, 1))
2
Or, just a thought: you could set up a variable z that works as a switch. If it's 1, the result of function one is passed to the next function; if it's 2, the result of function one is returned directly.
def f_one(x, y, z):
    result = x + y
    if z == 1:
        return f_two(result)
    elif z == 2:
        return result

def f_two(x):
    return x - 1

a = f_one(1, 1, 1)
print(a)
b = f_one(1, 1, 2)
print(b)
A bit of a general question that I cannot find the solution for,
I currently have two functions
def func1(*args, **kwargs):
    ...
    return a, b
and
def func2(x, y):
    ...
    return variables
I would like my code to evaluate
variables = func2(func1())
Python does not accept this: it says func2 requires two arguments when only one is given. My current solution is an intermediate dummy reassignment, but that makes my code extremely cluttered (my "func1" returns many values).
Is there an elegant solution to this?
def func1():
    return 10, 20

def func2(x, y):
    return x + y

results = func2(*func1())
print(results)
--output:--
30
A function can only return one thing, so func1() actually returns the tuple (10, 20). In order to get two things, you need to unpack the tuple with *.
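A short sketch of the unpacking options may help here. The function func3 and the dictionary it returns are additions for illustration only; func1 and func2 are as above.

```python
def func1():
    return 10, 20            # really one object: the tuple (10, 20)

def func2(x, y):
    return x + y

def func3():
    # Hypothetical variant that names its outputs in a dict.
    return {"x": 1, "y": 2}

pair = func1()
print(type(pair).__name__)   # tuple

# Option 1: unpack positionally at the call site.
positional_total = func2(*func1())

# Option 2: unpack into names first, then call.
x, y = func1()
named_total = func2(x, y)

# Option 3: unpack a dict as keyword arguments with **.
keyword_total = func2(**func3())

print(positional_total, named_total, keyword_total)  # 30 30 3
```

Option 3 only works when the dictionary keys match the parameter names of the receiving function, which can be easier to keep straight than position when a function returns many values.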
I want to write a testing function for an exercise, to make sure a function is implemented correctly.
So I got to wonder, is there a way, given a function "foo", to check if it is implemented recursively?
If it encapsulates a recursive function and uses it, it also counts. For example:
def foo(n):
    def inner(n):
        # more code
        inner(n-1)
    return inner(n)
This should also be considered recursive.
Note that I want to use an external test function to perform this check. Without altering the original code of the function.
Solution:
from bdb import Bdb
import sys

class RecursionDetected(Exception):
    pass

class RecursionDetector(Bdb):
    def do_clear(self, arg):
        pass

    def __init__(self, *args):
        Bdb.__init__(self, *args)
        self.stack = set()

    def user_call(self, frame, argument_list):
        code = frame.f_code
        if code in self.stack:
            raise RecursionDetected
        self.stack.add(code)

    def user_return(self, frame, return_value):
        self.stack.remove(frame.f_code)

def test_recursion(func):
    detector = RecursionDetector()
    detector.set_trace()
    try:
        func()
    except RecursionDetected:
        return True
    else:
        return False
    finally:
        sys.settrace(None)
Example usage/tests:
def factorial_recursive(x):
    def inner(n):
        if n == 0:
            return 1
        return n * factorial_recursive(n - 1)
    return inner(x)

def factorial_iterative(n):
    product = 1
    for i in range(1, n+1):
        product *= i
    return product

assert test_recursion(lambda: factorial_recursive(5))
assert not test_recursion(lambda: factorial_iterative(5))
assert not test_recursion(lambda: map(factorial_iterative, range(5)))
assert factorial_iterative(5) == factorial_recursive(5) == 120
Essentially test_recursion takes a callable with no arguments, calls it, and returns True if at any point during the execution of that callable the same code appeared twice in the stack, False otherwise. I think it's possible that it'll turn out this isn't exactly what OP wants. It could be modified easily to test if, say, the same code appears in the stack 10 times at a particular moment.
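The "same code appears N times" variation could be sketched as follows. To keep it short, this version does not subclass Bdb as the answer does; it swaps in sys.settrace directly and counts how many live frames share a code object. The names max_call_depth, fact_rec and fact_iter are made up for the example, and it targets Python 3.

```python
import sys
from collections import Counter

def max_call_depth(func):
    """Call func() and return the largest number of simultaneously active
    frames sharing one code object (1 means no recursion was observed)."""
    active = Counter()
    deepest = 0

    def local_tracer(frame, event, arg):
        # Called for events inside a traced frame; only exits matter here.
        if event == "return":
            active[frame.f_code] -= 1
        return local_tracer

    def global_tracer(frame, event, arg):
        nonlocal deepest
        if event == "call":
            active[frame.f_code] += 1
            deepest = max(deepest, active[frame.f_code])
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        func()
    finally:
        sys.settrace(previous)  # put back whatever tracer was installed
    return deepest

def fact_rec(n):
    return 1 if n == 0 else n * fact_rec(n - 1)

def fact_iter(n):
    product = 1
    for i in range(1, n + 1):
        product *= i
    return product

print(max_call_depth(lambda: fact_rec(5)))   # 6
print(max_call_depth(lambda: fact_iter(5)))  # 1
```

A caller can then pick its own threshold, for example treating a function as recursive only when the returned depth is at least 10. Like the original, this only sees Python-level frames in the current thread.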
from inspect import stack

already_called_recursively = False

def test():
    global already_called_recursively
    function_name = stack()[1].function
    if not already_called_recursively:
        already_called_recursively = True
        print(test())  # One recursive call, leads to Recursion Detected!
    if function_name == test.__name__:
        return "Recursion detected!"
    else:
        return "Called from {}".format(function_name)

print(test())  # Not Recursion, "father" name: "<module>"

def xyz():
    print(test())  # Not Recursion, "father" name: "xyz"

xyz()
The output is
Recursion detected!
Called from <module>
Called from xyz
I use the global variable already_called_recursively to make sure I only call it once, and as you can see, at the recursion it says "Recursion Detected", since the "father" name is the same as the current function, which means I called it from the same function aka recursion.
The other prints are the module-level call, and the call inside xyz.
Hope it helps :D
I have not yet verified for myself if Alex's answer works (though I assume it does, and far better than what I'm about to propose), but if you want something a little simpler (and smaller) than that, you can simply use sys.getrecursionlimit() to error it out manually, then check for that within a function. For example, this is what I wrote for a recursion verification of my own:
import sys

def is_recursive(function, *args):
    try:
        # Calls the function with arguments
        function(sys.getrecursionlimit()+1, *args)
    # Catches RecursionError instances (means function is recursive)
    except RecursionError:
        return True
    # Catches everything else (may not mean function isn't recursive,
    # but it means we probably have a bug somewhere else in the code)
    except Exception:
        return False
    # Return False if it didn't error out (means function isn't recursive)
    return False
While it may be less elegant (and more faulty in some instances), this is far smaller than Alex's code and works reasonably well for most instances. The main drawback here is that with this approach, you're making your computer process through every recursion the function goes through until reaching the recursion limit. I suggest temporarily changing the recursion limit with sys.setrecursionlimit() while using this code to minimize the time taken to process through the recursions, like so:
sys.setrecursionlimit(10)
if is_recursive(my_func, ...):
    # do stuff
else:
    # do other stuff
sys.setrecursionlimit(1000)  # 1000 is the default recursion limit
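One possible refinement, sketched below, is to restore whatever limit was in force before rather than hard-coding 1000, and to do so even if the check raises. The helper name recursion_limit is an invention for this example.

```python
import sys
from contextlib import contextmanager

@contextmanager
def recursion_limit(limit):
    """Temporarily set the interpreter's recursion limit (hypothetical helper)."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        # Runs even if the body raises, so the old limit always comes back.
        sys.setrecursionlimit(previous)

original = sys.getrecursionlimit()
with recursion_limit(original + 500):
    inside = sys.getrecursionlimit()
restored = sys.getrecursionlimit()
print(inside - original, restored == original)  # 500 True
```

The usage above raises the limit only so the sketch is safe to run anywhere. Lowering it works the same way, but sys.setrecursionlimit itself raises RecursionError if the new limit is too low for the current call depth, so a value as small as 10 may be rejected when the check is run from inside a test framework or other deep call stack.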
In Matlab, nargout is a variable that tells you if the output is assigned, so
x = f(2);
and
f(2);
can behave differently.
Is it possible to do similar in Python?
I have a function that plots to screen and returns a matplotlib figure object. If the output is assigned to a variable, I do not want it to plot to screen.
Here's a way you can do it (not that I'd advise it), but it has many cases where it won't work. To make it work you'd essentially need to parse the Python code in the line and see what it is doing, which would be possible down to some level, but there are likely always going to be ways to get around it.
import inspect, re

def func(x, noCheck=False):
    if not noCheck:
        # Get the line the function was called on.
        _, _, _, _, lines, _ = inspect.getouterframes(inspect.currentframe())[1]
        # Now we need to search through `line` to see how the function is called.
        line = lines[0].split("#")[0]  # Get rid of any comments at the end of the line.
        match = re.search(r"[a-zA-Z0-9]+ *= *func\(.*\)", line)  # Search for instances of `func` being called after an equals sign
        try:
            variable, functioncall = match.group(0).split("=")
            print(variable, "=", functioncall, "=", eval(functioncall.strip()[:-1] + ", noCheck=True)"))
        except Exception:
            pass  # print("not assigned to a variable")
    # Actually make the function do something
    return 3*x**2 + 2*x + 1

func(1)  # x = func(1)
x = func(1)
Another way to do it would be to examine all of the local variables that are set when you call the code, check whether any of them have been set to the result of your function, and then use that information to help parse the Python.
Or you could look at object IDs and try to do things that way, but that's not going to be straightforward, as not all objects work the same way (i.e. do a=10 and c=10 and then have a look at each object's ID: they're the same even though a and c are separate. The same happens with short strings too).
If you can think up a way to do this that would work universally, I'd be interested to know how you do it. I'd presume it will need to be done by digging around in inspect, though, rather than through parsing the actual code.
Others have mentioned that this is complex but can be done with inspect. You may want a simpler approach: have a separate function to plot it, or pass an extra argument that says whether to plot.
def create_plot(x):
    # create the plot
    return plot

def display(plot):
    # show the plot

x = create_plot(2)
display(x)
Plot variable
def plot(x, show=False):
    # create the plot
    if show:
        # show the plot

plot(2, True)
x = plot(2)
It is probably not worth the time and easier to just create the two functions.
Personally, I think this is ugly, nasty, and I do not believe that functionality should be based on something catching the return value. However, I was curious, and I found a way. You could probably turn this into a decorator if you want to use it in the future, but I still suggest that you use two separate methods instead of checking for an output.
import inspect

def f(val):
    has_output = False
    frame = inspect.currentframe()
    name = frame.f_code.co_name
    outer = inspect.getouterframes(frame)[1]  # may want to loop through available frames.
    for i in range(len(outer)):
        item = str(outer[i]).replace(" ", "")
        check = "=" + name + "("
        if check in item and "=" + check not in item:  # also check assignment vs equality
            # Your method has an output
            has_output = True
            break
    if has_output:
        print("Something catches the output")
    return val*val
# end f
In many cases this will not work either. You will have to write a really good regex for the check if you always want it to work.
import my_lib
x = my_lib.f(2)
Consider this Python segment:
def someTestFunction():
    if someTest:
        return value1
    elif someOtherTest:
        return value2
    elif yetSomeOtherTest:
        return value3
    return None

def SomeCallingFunction():
    a = someTestFunction()
    if a != None:
        return a
    # ... normal execution continues
Now, the question: the three-line segment in the beginning of SomeCallingFunction to get the value of the test function and bail out if it's not None, is repeated very often in many other functions. Three lines is too long. I want to shorten it to one. How do I do that?
I can freely restructure this code and the contents of someTestFunction however needed. I thought of using exceptions, but those don't seem to help in cutting down the calling code length.
(I've read a bit about Python decorators, but haven't used them. Would this be the place? How would it work?)
If you want to use a decorator, it would look like this:
def testDecorator(f):
    def _testDecorator():
        a = someTestFunction()
        if a is None:
            return f()
        else:
            return a
    return _testDecorator

@testDecorator
def SomeCallingFunction():
    # ... normal execution
When the module is first imported, it runs testDecorator, passing it your original SomeCallingFunction as a parameter. A new function is returned, and that gets bound to the SomeCallingFunction name. Now, whenever you call SomeCallingFunction, it runs that other function, which does the check, and returns either a, or the result of the original SomeCallingFunction.
I often use a hash table in place of a series of elifs:
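def someTestFunction(decorated_test):
    options = {
        'val1': return_val_1,
        'val2': return_val_2
    }
    return options[decorated_test]
You can set up options as a defaultdict(lambda: None), or look keys up with options.get(decorated_test), to default to None if a key isn't found. (A plain defaultdict(None) has no default factory and still raises KeyError.)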
If you can't get your tests in that form, then a series of if statements might actually be the best thing to do.
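For reference, the lookup-table idea with a None default could look like the sketch below. The keys and result strings are placeholders, and the function name is borrowed from the question in snake_case.

```python
from collections import defaultdict

def some_test_function(key):
    options = {
        "val1": "result for val1",
        "val2": "result for val2",
    }
    # dict.get returns None for a missing key instead of raising KeyError.
    return options.get(key)

# The defaultdict variant needs a factory that produces the default value.
options_with_default = defaultdict(lambda: None, {"val1": "result for val1"})

print(some_test_function("val1"))      # result for val1
print(some_test_function("unknown"))   # None
print(options_with_default["unknown"]) # None
```

One difference worth knowing: indexing a defaultdict with a missing key stores that key in the dictionary, whereas dict.get leaves the dictionary untouched.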
One small thing you can do to shorten your code is to use this:
if a: return a
Note that this also skips falsy results such as 0 or an empty string, unlike the a != None check. There may be other ways to shorten your code, but these are the ones I can come up with on the spot.
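On Python 3.8 or later, an assignment expression is another way to shorten the call-and-check while still comparing against None. This is only a sketch; the function names and the range rules inside them are invented for the example.

```python
def some_test_function(value):
    # Hypothetical checks standing in for the question's tests.
    if value < 0:
        return "negative"
    if value > 100:
        return "too large"
    return None

def some_calling_function(value):
    # Assign and test in one expression (requires Python 3.8+).
    if (a := some_test_function(value)) is not None: return a
    # ... normal execution continues
    return "processed {}".format(value)

print(some_calling_function(-5))  # negative
print(some_calling_function(42))  # processed 42
```

Unlike a bare truthiness check, this still bails out when the test function returns a falsy value such as 0 or an empty string, because the comparison is explicitly against None.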
I think this would do it:
UPDATE Fixed!
Sorry for yesterday, I rushed and didn't test the code!
def test_decorator(test_func):
    def tester(normal_function):
        def tester_inner():
            a = test_func()
            if a is not None:
                return a
            return normal_function()
        return tester_inner
    return tester

# usage:
@test_decorator(my_test_function)
def my_normal_function():
    # .... normal execution continue ...
It's similar to DNS's answer, but it allows you to specify which test function you want to use.
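A runnable variation on the same idea is sketched below. It differs from the code above in two assumed ways: arguments are forwarded to both the test function and the decorated function, and functools.wraps keeps the decorated function's name. The names guarded_by, reject_negative and square_root are made up for the example.

```python
import functools

def guarded_by(check_func):
    """Decorator factory: run check_func first and return its result
    instead of calling the decorated function when it is not None."""
    def decorator(normal_function):
        @functools.wraps(normal_function)
        def wrapper(*args, **kwargs):
            a = check_func(*args, **kwargs)
            if a is not None:
                return a
            return normal_function(*args, **kwargs)
        return wrapper
    return decorator

def reject_negative(x):
    # Hypothetical check: a message for bad input, None to continue.
    return "negative input" if x < 0 else None

@guarded_by(reject_negative)
def square_root(x):
    return x ** 0.5

print(square_root(9))        # 3.0
print(square_root(-1))       # negative input
print(square_root.__name__)  # square_root
```

Forwarding the arguments means the check can look at the same input as the function it guards, which the zero-argument version cannot do.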