
With the following class:

class Test:
    a : str
    b : str

and the following data frame:

output = pd.DataFrame(columns=['a', 'b'])

How can I convert an array, or list, of Test instances into a pandas DataFrame with matching columns?


Edit:

Let me add a more concrete example:

class Test:
    a: int
    b: int

    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b

l = [Test(10, 20), Test(50, 60)]

output = pd.DataFrame(l,
                  columns=['a', 'b'],
                  index=range(len(l)))

and the error I get is:

ValueError: Shape of passed values is (2, 1), indices imply (2, 2)

  • Are you having trouble with the typical way to create a DataFrame? For instance, output = pd.DataFrame([test.a, test.b], columns=['a', 'b']), where test = Test() Commented Jul 24, 2019 at 22:59
  • Possible duplicate of Creating a Pandas DataFrame from a Numpy array: How do I specify the index column and column headers? Commented Jul 24, 2019 at 23:00
  • @PyNoob: I put a concrete example with the error Commented Jul 24, 2019 at 23:15
  • @tim: the questions may be related, but they're not exactly the same since the other question involves part of the list to become the header, which is not the case here Commented Jul 24, 2019 at 23:16
  • I'm not sure what you intended to do, but Test(10, 20) evaluates to <__main__.Test at 0x1db821405c0>, which is a single element; so pd.DataFrame(l) tells pandas to expect one column and two rows, while columns=['a', 'b'] implies two columns. Hence the error. Commented Jul 25, 2019 at 0:08
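
The shape mismatch described in the last comment can be reproduced with a minimal sketch like the one below. It reuses the Test class and the list l from the question; the names raw and error are only illustrative, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd


class Test:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b


l = [Test(10, 20), Test(50, 60)]

# Without column names, each Test object is stored as one opaque cell,
# so the frame has two rows and a single column.
raw = pd.DataFrame(l)
print(raw.shape)

# Asking for two columns conflicts with that one-column shape.
error = None
try:
    pd.DataFrame(l, columns=['a', 'b'])
except ValueError as exc:
    error = exc
    print(exc)
```

pandas does not look inside arbitrary objects for attributes, which is why the objects have to be turned into dicts (or lists of values) first.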

2 Answers


You can call vars to convert all the attributes of the class into a dict:

import pandas as pd

class Test:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b

tests = [Test(10, 20), Test(50, 60)]
# vars(t) returns the instance attributes of t as a dict, one dict per row
df = pd.DataFrame([vars(t) for t in tests])
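
To make the intermediate step visible, here is a small sketch that prints the list of dicts before handing it to pandas. The variable name rows is only illustrative, and passing columns=['a', 'b'] is an optional addition (not part of the answer above) that fixes the column order to match the question.

```python
import pandas as pd


class Test:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b


tests = [Test(10, 20), Test(50, 60)]

# Each instance becomes a dict mapping attribute names to values.
rows = [vars(t) for t in tests]
print(rows)  # [{'a': 10, 'b': 20}, {'a': 50, 'b': 60}]

# A list of dicts maps naturally onto rows; the dict keys become columns.
df = pd.DataFrame(rows, columns=['a', 'b'])
print(df)
```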

4 Comments

This is working, thanks! Can you explain the [vars(t) ...] part and why it works? I'm fairly new to Python, and very new to pandas (like 4 days :))
To clarify, I understand vars(t), but I don't understand the vars(t) for ... part; I would understand for t in tests: somelist.append(vars(t)).
It's called a list comprehension, basically a one-line loop. [vars(t) for t in tests] applies the function vars to every element in tests. Since vars(t) returns a dictionary, [vars(t) for ...] returns a list of dictionaries.
I didn’t know about list comprehensions; I’m reading about them now, thanks!
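
For readers with the same question as in the comments, the sketch below puts the explicit loop and the list comprehension side by side. The names rows_loop and rows_comp are made up for this illustration.

```python
class Test:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b


tests = [Test(10, 20), Test(50, 60)]

# Explicit loop: append one dict per object.
rows_loop = []
for t in tests:
    rows_loop.append(vars(t))

# List comprehension: the same loop written as a single expression.
rows_comp = [vars(t) for t in tests]

print(rows_loop == rows_comp)  # True
```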

Another way to achieve this is to read each instance's __dict__ attribute directly:

df = pd.DataFrame([test.__dict__ for test in tests])
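
As a quick check that this is the same approach as calling vars, the sketch below compares the two for ordinary instances like the ones in the question. The names df_vars and df_dict are illustrative only. Both approaches assume the objects actually have a __dict__, which would not hold for a class that defines __slots__.

```python
import pandas as pd


class Test:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b


tests = [Test(10, 20), Test(50, 60)]

# For a regular instance, vars(obj) returns the very same dict as obj.__dict__.
print(vars(tests[0]) is tests[0].__dict__)  # True

df_vars = pd.DataFrame([vars(test) for test in tests])
df_dict = pd.DataFrame([test.__dict__ for test in tests])

# Both frames hold identical data.
print(df_vars.equals(df_dict))  # True
```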
