Let's say you have a DataFrame in a Jupyter notebook called MainNotebook.ipynb and you pass it to an external Python function called testmath, defined in a file called testmath.py:

import pandas as pd
from testmath import testmath

sales = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
         {'account': 'Alpha Co',  'Jan': 200, 'Feb': 210, 'Mar': 215},
         {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 95 }]

mydf = pd.DataFrame(sales)

testmath(mydf)

Here's the code for testmath.py:

import pandas as pd

def testmath(inputdf):
    Feb = inputdf['Feb']
    inputdf['FebPesos'] = Feb * 12
    return inputdf, Feb

I'm trying to get the function to return BOTH the DataFrame mydf AND the variable Feb so that I can use them for later analysis.

However, when you run testmath(mydf) from MainNotebook.ipynb, the DataFrame comes back with the new column added, but the variable Feb is not accessible.

By this I mean that if you run the following from MainNotebook:

from testmath import testmath
import pandas as pd

sales = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
         {'account': 'Alpha Co',  'Jan': 200, 'Feb': 210, 'Mar': 215},
         {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 95 }]

mydf = pd.DataFrame(sales)

testmath(mydf)

print(Feb)

The print(Feb) call raises the error: NameError: name 'Feb' is not defined

Is there any way to retrieve the variables generated inside the function? Especially if you have a lot of them? (I would prefer a method that doesn't involve global variables, gulp)

I've already tried deleting pycache, and restarting the kernel and clearing the outputs. I also updated all of the conda packages, but still no luck.

  • Why shouldn't you be able to reference it? In my experience this works fine. Commented Aug 31, 2018 at 7:18
  • Could you please post the error message? Commented Aug 31, 2018 at 7:19
  • Just added the error text. Commented Aug 31, 2018 at 7:49
  • Your input variable to getmydata is myjsonfile, but internally, you read from a variable named inputpath (which I assume is defined somewhere in the notebook). This might be why when calling the function externally the dataframe is not found and therefore has no name 'apples'. Mind double checking with the different variables? Commented Aug 31, 2018 at 8:08
  • Even though it may be a bit difficult for pandas in Jupyter (and several Notebooks), please try to provide a minimal reproducible example, so that we know what error to tackle. Commented Aug 31, 2018 at 22:36

1 Answer

Since your function returns a tuple, you can use sequence unpacking:

mydf, Feb = testmath(mydf)

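The right-hand side returns a tuple of results, which are unpacked into the variables mydf and Feb. These variables can then be accessed like any other variable. The NameError occurs because Feb is a local name inside testmath: calling testmath(mydf) without assigning the result discards the returned tuple, so no Feb is ever created in the notebook. The new column still shows up on mydf only because the function modifies the DataFrame it is given in place.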

Equivalently, with pd.DataFrame.pipe:

mydf, Feb = mydf.pipe(testmath)