24

I am trying to create and return a data frame from a Python function

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
           'year': [2000,2001,2002,2001,2002],
           'pop': [1.5,1.7,3.6,2.4,2.9]}
    df = pd.DataFrame(data)
    return df
create_df()
df

I get an error saying that df is not defined. If I replace return with print, the data frame prints correctly. Is there a way to do this?

1
  • 3
    df is a local variable. You need to assign the result like df = create_df(). Commented Aug 8, 2017 at 23:37

6 Answers

38

When you call create_df(), Python calls the function but doesn't save the result in any variable. That is why you got the error.

Assign the result of create_df() to a new variable df like this:

df = create_df()
df
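For completeness, here is a self-contained sketch of the same fix. It assumes pandas is installed and adds the `import pandas as pd` line that the question's snippet leaves out:

```python
import pandas as pd

def create_df():
    # The name `df` below is local: it exists only while the function runs.
    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
    df = pd.DataFrame(data)
    return df

# Bind the returned object to a name in the calling scope.
df = create_df()
print(df.shape)  # (5, 3)
```

The outer `df` and the function's local `df` are separate names; any other name on the left-hand side would work just as well.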

17

I'm kind of late here, but what about creating a global variable within the function? It should save a step for you.

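import pandas as pd

def create_df():
    # Declare df as a module-level (global) name so the
    # assignment below is visible outside the function.
    global df

    data = {
        'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
        'year': [2000,2001,2002,2001,2002],
        'pop': [1.5,1.7,3.6,2.4,2.9]
    }

    df = pd.DataFrame(data)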

Then when you run create_df(), you'll be able to just use df.

Of course, be careful in your naming strategy if you have a large program so that the value of df doesn't change as various functions execute.

EDIT: I noticed I got some points for this. Here's another (probably worse) way to do this using exec. This also allows for multiple dataframes to be created, if desired.

import pandas as pd

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
           'year': [2000,2001,2002,2001,2002],
           'pop': [1.5,1.7,3.6,2.4,2.9]}
    df = pd.DataFrame(data)
    return df

### We'll create three dataframes for an example
for i in range(3):
    exec(f'df_{i} = create_df()')

Then, you can test them out:

Input: df_0

Output:

    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9

Input: df_1

Output:

    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9

Etc.
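If dynamically generated variable names are not a hard requirement, a plain dictionary is a common alternative to exec. This is a different technique from the one above, shown only as a sketch; the name `frames` is made up for illustration:

```python
import pandas as pd

def create_df():
    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
    return pd.DataFrame(data)

# Keys play the role of the generated variable names df_0, df_1, df_2.
frames = {f'df_{i}': create_df() for i in range(3)}

print(frames['df_0'].equals(frames['df_1']))  # True: same contents
print(frames['df_0'] is frames['df_1'])       # False: separate objects
```

You then look up `frames['df_0']` instead of a variable called `df_0`, and the set of frames can be looped over like any other dictionary.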

1 Comment

I would advise against using global variables as a Python beginner; your namespace can get messy very quickly.

2

You can return a DataFrame from a function by working on a copy of the input DataFrame, like this:

def my_function(dataframe):
    my_df = dataframe.copy()
    my_df = my_df.drop(0)
    return my_df

new_df = my_function(old_df)
print(type(new_df))

Output: <class 'pandas.core.frame.DataFrame'>
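The snippet above does not define `old_df`. Here is a runnable sketch with a made-up three-row input (the column name `a` is purely illustrative) that also checks the caller's frame is left untouched:

```python
import pandas as pd

def my_function(dataframe):
    my_df = dataframe.copy()   # work on a copy, not on the caller's object
    my_df = my_df.drop(0)      # drop the row whose index label is 0
    return my_df

old_df = pd.DataFrame({'a': [1, 2, 3]})  # hypothetical sample input
new_df = my_function(old_df)

print(len(old_df), len(new_df))  # 3 2
```

Note that `drop` without `inplace=True` already returns a new object, so the copy matters mainly when the function modifies the frame in place.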


2

A function can explicitly return two DataFrames:

import pandas as pd
import numpy as np

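def return_2DF():
    # Shared index: 20 consecutive days starting today
    date = pd.date_range('today', periods=20)
    # Two columns of random data, so exactly two column labels
    DF1 = pd.DataFrame(np.random.rand(20, 2), index=date, columns=list('xy'))
    # Four columns of random data labelled A to D
    DF2 = pd.DataFrame(np.random.rand(20, 4), index=date, columns='A B C D'.split())

    return DF1, DF2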

Call the function and unpack the two data frames:

one, two = return_2DF()
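What actually comes back is a single tuple, which the assignment unpacks. A small deterministic sketch of this (the function `make_two_frames` and its contents are made up for illustration):

```python
import pandas as pd

def make_two_frames():  # hypothetical helper, not from the answer above
    left = pd.DataFrame({'x': [1, 2]})
    right = pd.DataFrame({'A': [3, 4], 'B': [5, 6]})
    return left, right  # packed into one tuple

result = make_two_frames()
print(type(result).__name__, len(result))  # tuple 2

one, two = result  # tuple unpacking, same as: one, two = make_two_frames()
print(one.shape, two.shape)  # (2, 1) (2, 2)
```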

1 Comment

exactly what I needed!!
0

DataFrame_object.copy()

Return a deep copy (the default for .copy()) to avoid issues where one dataframe is just a reference to another. This matters most when a function in a module (or a separate file) returns a dataframe that is also referenced somewhere else, for example as a global or as an argument that was passed in. If you don't do return DataFrame_object.copy(), the function returns a reference to the same dataframe object, so changes made through one name show up through the other.

If you are using a function in the same file, you might not even realize this issue of deep copy / shallow copy if you are using a global variable in the function.


0

I have come across this issue before but solved it really easily by setting a variable outside the function to be the output of the function.

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
           'year': [2000,2001,2002,2001,2002],
           'pop': [1.5,1.7,3.6,2.4,2.9]}
    df = pd.DataFrame(data)
    return df

df = create_df()

1 Comment

Instead of posting another answer with the same solution, consider upvoting once you reach 15+ reputation points.
