OK, this might be a very silly question, but...

I have an object x, which contains a number of pandas dataframes (df1, df2, df3 for the sake of argument).

When the program has finished running, I want to export each of these dataframes to its own csv file. At the moment, I have a function within x that looks like this:

def export(self, path):
    self.df1.to_csv(path+"_df1.csv")
    self.df2.to_csv(path+"_df2.csv")
    self.df3.to_csv(path+"_df3.csv")

It works (I can just call x.export(somepath)), but I keep having to change it because the data I want to keep in the object changes. Can anybody tell me if there is a way of simply iterating over all of the variables within an object (they are all the same format) and, in this case, dumping each one to its own .csv file?

Thanks

  • Have an instance variable dataframes, which is a list holding the dataframes? Commented Jun 12, 2018 at 13:43
  • wouldn't I then have to manually add all of the different variables to this list? - that's what I'm trying to avoid. I've only used three in the example, but there are a lot more in the original code! Commented Jun 12, 2018 at 13:44
  • At some point you will have to specify which dataframes you want to export. Might as well keep them in a list. I have a sense that you are trying to do something weird and your question is lacking information. Commented Jun 12, 2018 at 13:45
  • @Will Then you could do self.dataframes = [pd.DataFrame(), pd.DataFrame(), ...] in __init__ and instead of accessing instance.df1 you would use instance.dataframes[0]. (Of course, you could also pop and append from that list.) Commented Jun 13, 2018 at 8:56
  • @Will of course a dict will work. Don't be afraid to try. Commented Jun 13, 2018 at 9:24
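
The container approach suggested in these comments could look roughly like the sketch below. It is only an illustration under assumptions: the class name Results, the attribute name dataframes, and the sample data are hypothetical and not taken from the question. The dict keys are reused as file name suffixes, so adding or removing a dataframe needs no change to export().

```python
import os
import tempfile

import pandas as pd


class Results:
    """Hypothetical container: all dataframes live in one dict."""

    def __init__(self):
        # Keys double as the file name suffixes used by export().
        self.dataframes = {
            "df1": pd.DataFrame([[1, 2], [3, 4]]),
            "df2": pd.DataFrame([[5, 6], [7, 8]]),
        }

    def export(self, path):
        """Write each stored dataframe to '<path>_<name>.csv'; return the file names."""
        written = []
        for name, frame in self.dataframes.items():
            filename = "{}_{}.csv".format(path, name)
            frame.to_csv(filename)
            written.append(filename)
        return written


# Usage: a dataframe added later is exported without touching export().
with tempfile.TemporaryDirectory() as tmp:
    results = Results()
    results.dataframes["df3"] = pd.DataFrame([[9, 10]])
    files = results.export(os.path.join(tmp, "run"))
    names = [os.path.basename(f) for f in files]
    print(names)
```

The trade-off is that the frames are reached as results.dataframes["df1"] rather than results.df1, but the export code never needs a type check.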

1 Answer

You can iterate over __dict__ to retrieve the attribute names and data frame objects. You can check the object types to avoid calling .to_csv on something that is not a data frame.

Here is some example code that prototypes the solution.

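import pandas as pd

class A:
    def __init__(self):
        self.x = pd.DataFrame([[1, 2], [3, 4]])
        self.y = pd.DataFrame([[5, 6], [7, 8]])

    def export(self, path):
        # Walk every instance attribute; skip anything that is not a DataFrame.
        for k, v in self.__dict__.items():
            if isinstance(v, pd.DataFrame):
                v.to_csv('{}_{}.csv'.format(path, k))

a = A()
a.export('dataframes')  # writes dataframes_x.csv and dataframes_y.csv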

10 Comments

I downvoted because this is - in my opinion - horrible and inefficient Python. Keep an iterable holding the dataframes. Create it in __init__.
@timgeb It doesn't matter if it's inefficient, the difference is less than milliseconds.
@timgeb, do give us an efficient method
@timgeb, cool, agreed with having a list of all the dataframes.
@cal97g First of all, you don't know how large the __dict__ of OP's objects can get, so your estimated time is pure speculation. Second, readability and the principles of duck typing are violated. We can avoid explicit instance checks here and make the code faster at the same time without any cost - so we should do it.