Return DataFrame from a function in another file

Question

I'm trying to set up two files: one that creates a series of DataFrames, and my main file, which imports those DataFrames.

Something like this:

load_data.py

def data_mean():
    import pandas as pd

    global mean_re5200, mean_re2000

    mean_re5200=pd.read_csv('mean_re5200.csv')
    mean_re2000=pd.read_csv('mean_re2000.csv')

main_project.py

from load_data import data_mean

When I run main_project.py and call data_mean() in the terminal, everything seems fine, but the DataFrames aren't saved as variables that I can use. I've seen similar questions here on Stack Overflow, but none of them were about saving DataFrames, only simple variables.

How can I proceed?

The data frames are "simple" variables, in that you have stored a scalar "handle" to the data frame. However, you still need to give them some external visibility from load_data, just as with any other local variable.
– Prune, Commented Apr 10, 2020 at 3:26
Well, I still don't know how to proceed. If I put "return variable" in load_data.py, the DataFrames appear in the console, but they still aren't saved.
– Alessandro Melo, Commented Apr 10, 2020 at 3:43
How does returning a value make the data frame appear in a console? Are you confused about the difference between return and print?
– Prune, Commented Apr 10, 2020 at 17:41
Maybe. When I add "return variable" at the end of the function and run it, the function executes and the results are printed in the console, but they aren't saved as variables. I achieved that using "global", as in the code I posted. But this doesn't work when I try to call the function from another file...
– Alessandro Melo, Commented Apr 10, 2020 at 19:14
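To illustrate the visibility point from the comments above: a name declared with "global" inside a function is bound in the namespace of the module where the function was defined, not in the file that calls it. The sketch below is only a single-file simulation. It builds a stand-in for load_data.py in memory with types.ModuleType, and a plain string stands in for the DataFrame (both are assumptions made so the example runs on its own).

```python
import types

# In-memory stand-in for load_data.py (assumption: the real file would
# call pd.read_csv instead of assigning a string).
source = '''
def data_mean():
    global mean_re5200
    mean_re5200 = "pretend this is a DataFrame"
'''
load_data = types.ModuleType("load_data")
exec(source, load_data.__dict__)

# Same effect as: from load_data import data_mean
data_mean = load_data.data_mean
data_mean()

# The name was not created in the calling file's namespace...
found_here = "mean_re5200" in globals()
print(found_here)  # False

# ...it lives on the module where data_mean was defined.
print(load_data.mean_re5200)  # pretend this is a DataFrame
```

So with "import load_data" in the main file, the value would be reachable as load_data.mean_re5200 after the call, although returning it from the function is usually easier to follow.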

rpanai · Accepted Answer · 2020-04-10 04:22:16Z


Why don't you simply try something like

load_data.py

import pandas as pd

df = pd.DataFrame({"a":list(range(10))})

main.py

from load_data import *

print(df)

or alternatively

load_data.py

import pandas as pd

def data_mean():
    df0 = pd.DataFrame({"a":list(range(10))})
    df1 = pd.DataFrame({"b":list(range(10))})
    return df0, df1

main.py

from load_data import data_mean

df1, df2 = data_mean()
print(df1)


2 Comments

Alessandro Melo Over a year ago

This helps, but what I really want to know is whether there's a way to get df1 and df2 without writing "df1, df2 = data_mean()". If I have many more DataFrames, this kind of assignment will be really impractical.

rpanai Over a year ago

Ciao Alessandro, this is the way it works for seaborn, scikit-learn, plotly and others. A function only returns a df, and its name is not globally defined.
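One possible way to avoid a long tuple assignment when there are many DataFrames is to return them in a dict keyed by name. This is only a sketch, not something the answer itself proposes: load_all and SOURCES are hypothetical names, and io.StringIO objects with made-up values stand in for the CSV files so the example runs without anything on disk.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the CSV files; in real code the values would be
# file paths such as 'mean_re5200.csv', which pd.read_csv also accepts.
SOURCES = {
    "mean_re5200": io.StringIO("x,y\n0,1.0\n1,2.0\n"),
    "mean_re2000": io.StringIO("x,y\n0,3.0\n1,4.0\n"),
}

def load_all(sources):
    """Read every source and return the DataFrames keyed by name."""
    return {name: pd.read_csv(src) for name, src in sources.items()}

frames = load_all(SOURCES)

# Each DataFrame is looked up by its key instead of a separate variable.
print(frames["mean_re2000"])
```

In the main file this would look like "from load_data import load_all" followed by a single assignment, however many DataFrames there are.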
