5

I know this is super basic, and I have been searching everywhere, but I am still confused by everything I'm seeing. I'm not sure of the best way to do this and am having a hard time wrapping my head around it.

I have a script with multiple functions. I would like the first function to pass its output to the second, the second to pass its output to the third, and so on. Each does its own step in an overall process applied to the starting dataset.

For example, very simplified and with bad names, but this is just to show the basic structure:

#!/usr/bin/python
# script called process.py
import sys
infile = sys.argv[1]

def function_one():
    do things
    return function_one_output

def function_two():
    take output from function_one, and do more things
    return function_two_output

def function_three():
    take output from function_two, do more things
    return/print function_three_output

I want this to run as one script and print the output or write it to a new file, which I know how to do. I am just unclear on how to pass the intermediate output of each function to the next.

infile -> function_one -> (intermediate1) -> function_two -> (intermediate2) -> function_three -> final result/outfile

I know I need to use return, but I am unsure how to call the functions at the end to get my final output.

Individually?

function_one(infile)
function_two()
function_three()

or within each other?

function_three(function_two(function_one(infile)))

or within the actual function?

def function_one():
    do things
    return function_one_output

def function_two():
    input_for_this_function = function_one()

# etc etc etc

Thank you, friends. I am overcomplicating this and need a very simple way to understand it.

5
  • This method would work fine: function_three(function_two(function_one(infile))). However, it leaves no opportunity to perform error handling between intermediate steps, if you anticipate needing to do so. Commented Jun 3, 2015 at 14:31
  • 1
    You can simply catch the value returned by the function in a variable and pass it as an argument to another function. Commented Jun 3, 2015 at 14:31
  • 2
    You should really look at some tutorials on Python and programming in general before asking basic questions. You say you searched "everywhere", but if you had seen Python code anywhere, you would have found examples of this. Commented Jun 3, 2015 at 14:45
  • I know it seems otherwise but I'm actually decent with python and have some (IMO and from feedback from others) pretty good scripts. I just have a hard time wrapping my head around passing things like this from all the examples I've seen since they are all different. Commented Jun 3, 2015 at 15:18
  • I do need to read more documentation regarding classes, objects and functions, though. I need to better understand when to use stuff like main(), self etc. If you want I can show you some of my scripts but I'd rather not post them here publicly for various reasons. Commented Jun 3, 2015 at 15:28

8 Answers

6

You could define a data streaming helper function:

    from functools import reduce
    
    def flow(seed, *funcs):
        return reduce(lambda arg, func: func(arg), funcs, seed)

    flow(infile, function_one, function_two, function_three)

    #for example
    flow('HELLO', str.lower, str.capitalize, str.swapcase)
    #returns 'hELLO'

edit

I would now suggest that a more "pythonic" way to implement the flow function above is:

def flow(seed, *funcs):
    for func in funcs:
        seed = func(seed)
    return seed
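As a rough illustration of how the loop-based flow helper could be used on a small dataset, here is a minimal sketch. The three step functions (strip_blank_lines, to_upper, join_lines) and the sample data are hypothetical stand-ins, not part of the original script.

```python
def flow(seed, *funcs):
    """Feed seed through each function in turn, left to right."""
    for func in funcs:
        seed = func(seed)
    return seed


# Hypothetical stand-ins for the three processing steps.
def strip_blank_lines(lines):
    return [line for line in lines if line.strip()]


def to_upper(lines):
    return [line.upper() for line in lines]


def join_lines(lines):
    return "\n".join(lines)


raw = ["alpha", "", "beta", "   ", "gamma"]
final = flow(raw, strip_blank_lines, to_upper, join_lines)
print(final)
```

Each step only needs to accept what the previous step returns, so steps can be added, removed or reordered by changing the argument list of flow.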

5

As ZdaR mentioned, you can run each function and store the result in a variable then pass it to the next function.

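def function_one(file):
    function_one_output = ...  # placeholder: do things on file
    return function_one_output

def function_two(myData):
    function_two_output = ...  # placeholder: do things on myData
    return function_two_output

def function_three(moreData):
    function_three_output = ...  # placeholder: do more things on moreData
    return function_three_output  # or print it / write it to a file

def Main():
    # infile is the module-level name from the question (sys.argv[1])
    firstData = function_one(infile)
    secondData = function_two(firstData)
    function_three(secondData)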

This assumes your function_three writes to a file or doesn't need to return anything. Another method, if these three functions will always run together, is to call the first two inside function_three. For example...

def function_three(file):
    firstStep = function_one(file)
    secondStep = function_two(firstStep)
    result = ...  # placeholder: do things on secondStep
    return result  # or print it / write it to a file

Then all you have to do is call function_three in your main and pass it the file.
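A minimal runnable sketch of the first approach (one variable per intermediate result, all wired together in a main function) might look like the following. The processing inside each function is a made-up placeholder, and passing the data source in as a parameter instead of reading a global infile is an assumption for the sake of the example.

```python
def function_one(source):
    """Step 1: strip whitespace from each line of any iterable of
    strings (an open file object works as well as a list)."""
    return [line.strip() for line in source]


def function_two(lines):
    """Step 2 (placeholder processing): drop empty lines."""
    return [line for line in lines if line]


def function_three(lines):
    """Step 3 (placeholder processing): number the remaining lines."""
    return ["%d: %s" % (i, line) for i, line in enumerate(lines, 1)]


def main(source):
    first_data = function_one(source)
    second_data = function_two(first_data)
    return function_three(second_data)


print(main(["a\n", "\n", "b\n"]))
```

With a real input file, the call could be wrapped as `with open(path) as handle: main(handle)`, where path comes from sys.argv[1] as in the question.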

3

For safety, readability and debugging ease, I would temporarily store the results of each function.

def function_one():
    function_one_output = ...  # placeholder: do things
    return function_one_output

def function_two(function_one_output):
    function_two_output = ...  # placeholder: take function_one_output and do more things
    return function_two_output

def function_three(function_two_output):
    function_three_output = ...  # placeholder: take function_two_output and do more things
    return function_three_output  # or print it

result_one = function_one()
result_two = function_two(result_one)
result_three = function_three(result_two)

The added benefit here is that you can then check that each function is correct. If the end result isn't what you expected, just print the results you're getting or perform some other check to verify them. (Also, if you're running in the interpreter, they will stay in the namespace after the script ends, so you can test them interactively.)

result_one = function_one()
print(result_one)
result_two = function_two(result_one)
print(result_two)
result_three = function_three(result_two)
print(result_three)

Note: I used multiple result variables, but as PM 2Ring notes in a comment you could just reuse the name result over and over. That'd be particularly helpful if the results would be large variables.
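To make the name-reuse variant concrete, here is a small sketch in which a single name, result, is rebound after every step. The arithmetic inside the three functions is invented purely so the example runs.

```python
# Hypothetical step functions; only the calling pattern matters here.
def function_one(values):
    return [v + 1 for v in values]


def function_two(values):
    return [v * 2 for v in values]


def function_three(values):
    return sum(values)


result = function_one([1, 2, 3])
print(result)  # inspect the first intermediate value
result = function_two(result)
print(result)  # inspect the second intermediate value
result = function_three(result)
print(result)
```

Because each assignment rebinds result, the previous intermediate object is no longer referenced by that name and can be garbage-collected, which is why this pattern suits large intermediate values.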

3 Comments

Better and traditional way of doing it.
Sure. But it's not really necessary here to give those intermediate results separate names. result = f1(indata) result = f2(result) result = f3(result) (on separate lines) is perfectly clear & readable, IMHO. Of course, if you needed to pass both result_one and result_two to the final function it'd be a different story.
Yeah, I mostly used different ones for clarity's sake. But if they were going to be large variables to store then re-using a name is fine.
3

It's always better (for readability, testability and maintainability) to keep your functions as decoupled as possible, and to write them so the output depends only on the input whenever possible.

So in your case, the best way is to write each function independently, i.e.:

def function_one(arg):
    do_something()
    return function_one_result

def function_two(arg):    
    do_something_else()
    return function_two_result

def function_three(arg):    
    do_yet_something_else()
    return function_three_result

Once you're there, you can of course directly chain the calls:

result = function_three(function_two(function_one(arg)))

but you can also use intermediate variables and try/except blocks if needed for logging / debugging / error handling etc:

r1 = function_one(arg)
logger.debug("function_one returned %s", r1)
try:
    r2 = function_two(r1)
except SomePossibleException as e:
    logger.exception("function_two raised %s for %s", e, r1)
    # either return, re-raise, ask the user what to do etc
    return 42 # when in doubt, always return 42 !
else:
    r3 = function_three(r2)
    print("Yay ! result is %s" % r3)

As an extra bonus, you can now reuse these three functions anywhere, each on its own and in any order.
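The intermediate-variable pattern with logging and error handling can be sketched as a self-contained function. Everything specific here is an assumption made for illustration: the comma-separated input, the use of ValueError as the expected failure, and returning None on error.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


# Hypothetical steps: split a string, convert to ints, sum them.
def function_one(text):
    return text.split(",")


def function_two(fields):
    # int() raises ValueError for non-numeric fields
    return [int(field) for field in fields]


def function_three(numbers):
    return sum(numbers)


def run(arg):
    r1 = function_one(arg)
    logger.debug("function_one returned %s", r1)
    try:
        r2 = function_two(r1)
    except ValueError:
        # Log the traceback together with the input that caused it
        logger.exception("function_two failed for %s", r1)
        return None
    return function_three(r2)


print(run("1,2,3"))
print(run("1,x,3"))
```

Because r1 is kept in its own variable, the log message can report exactly which intermediate value made the second step fail, which a fully nested call cannot do.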

NB : of course there ARE cases where it just makes sense to call a function from another function... Like, if you end up writing:

result = function_three(function_two(function_one(arg)))

everywhere in your code AND it's not an accidental repetition, it might be time to wrap the whole in a single function:

def call_them_all(arg):
    return function_three(function_two(function_one(arg)))

Note that in this case it might be better to decompose the calls, as you'll find out when you have to debug it...

1

I'd do it this way:

def function_one(x):
    # do things
    output = x ** 1
    return output

def function_two(x):
    output = x ** 2
    return output

def function_three(x):
    output = x ** 3
    return output

Note that I have modified the functions to accept a single argument, x, and added a basic operation to each.

This has the advantage that each function is independent of the others (loosely coupled), which allows them to be reused in other ways. In the example above, function_two() returns the square of its argument, and function_three() the cube of its argument. Each can be called independently from elsewhere in your code, without being entangled in some hardcoded call chain, as you would have if you called one function from another.

You can still call them like this:

>>> x = function_one(3)
>>> x
3
>>> x = function_two(x)
>>> x
9
>>> x = function_three(x)
>>> x
729

which lends itself to error checking, as others have pointed out.

Or like this:

>>> function_three(function_two(function_one(2)))
64

if you are sure that it's safe to do so.

And if you ever wanted to calculate the square or cube of a number, you can call function_two() or function_three() directly (but, of course, you would name the functions appropriately).

0

With d6tflow you can easily chain together complex data flows and execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.

import d6tflow

class Function_one(d6tflow.tasks.TaskCache):
    def run(self):
        function_one_output = do_things()
        self.save(function_one_output)  # instead of return

@d6tflow.requires(Function_one)
class Function_two(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_one = self.inputLoad()  # load function input
        function_two_output = do_more_things(output_from_function_one)
        self.save(function_two_output)

@d6tflow.requires(Function_two)
class Function_three(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_two = self.inputLoad()
        function_three_output = do_more_things(output_from_function_two)
        self.save(function_three_output)

d6tflow.run(Function_three())  # executes all functions

function_one_output = Function_one().outputLoad()  # get function output
function_three_output = Function_three().outputLoad()

It has many more useful features, such as parameter management, persistence and intelligent workflow management. See https://d6tflow.readthedocs.io/en/latest/

-2

This way, function_three(function_two(function_one(infile))), would be the best: you do not need global variables, and each function is completely independent of the others.

Edited to add:

I would also say that function_three should not print anything. If you want to print the returned result, use:

print(function_three(function_two(function_one(infile))))

or something like:

output = function_three(function_two(function_one(infile)))
print(output)

11 Comments

How can you say that it is best? It would hinder debugging, and if the OP wants to use the values returned from the various functions, he won't be able to get them.
How in the world does it hinder debugging? Also, OP does not say he wants to use the intermediate values. I'm answering his question, not inferring a bunch of stuff.
The functions stated in the question are just for example purposes. The original code may have functions of 50-100 lines or maybe more, who knows, and if any one of the functions fails, the whole chain would return an error, I guess.
@mikeb It means that only the final result is called, there's no chance to check the output of function 1 or 2.
@ZdaR Or worse, it runs ok but doesn't produce the intended behaviour.
-3

Use parameters to pass the values:

def function1():
    foo = do_stuff()
    return function2(foo)

def function2(foo):
    bar = do_more_stuff(foo)
    return function3(bar)

def function3(bar):
    baz = do_even_more_stuff(bar)
    return baz

def main():
    thing = function1()
    print(thing)

2 Comments

This hardcodes the flow in the functions... You could just as well write everything in function1() here.
Sure, but that's what he said he wanted to do.
