0

I'm looking for a solution to convert a Pandas DataFrame with 3 subcategories to an JSON output without any lists.

This is the data-structure I got:

import pandas as pd

df = pd.DataFrame({
    'cat1': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'cat2': ['BB', 'BB', 'BC', 'BB', 'BB', 'BB', 'BC', 'BC'],
    'cat3': ['CC', 'CC', 'CD', 'CD', 'CD', 'CC', 'CD', 'CE'],
    'prod': ['P1', 'P2', 'P3', 'P1', 'P4', 'P1', 'P3','P6'],
    'amount': [132, 51, 12, 421, 55, 11, 123, 312]
})

And this is the desired JSON output:

{
"A": {
    "BB": {
        "CC": {
            "P1": 132,
            "P2": 51
        }
    },
    "BC": {
        "CD": {
            "P3": 12
        }
    }
},
"B": {
    "BB": {
        "CD": {
            "P1": 421,
            "P4": 55
        }
    }
},
"C": {
    "BB": {
        "CC": {
            "P1": 11
        }
    },
    "BC": {
        "CD": {
            "P3": 123
        },
        "CE": {
            "P6": 312
        }
    }
}

I had several attemts with the to_json method and tried to adapt this answer without any luck.

def f(x):
    return dict({k:v for k,v in zip(x.prod, x.amount)})

(
    df.groupby(['cat1','cat2', 'cat3'])
      .apply(f)
      .groupby(level=0)
      .apply(lambda x: x.tolist())
      .to_dict()
)

TypeError: zip argument #1 must support iteration

1 Answer 1

1

Credit to DSM but I modified the .ix to .iloc

def recur_dictify(frame):
         if len(frame.columns) == 1:
             if frame.values.size == 1: return frame.values[0][0]
             return frame.values.squeeze()
         grouped = frame.groupby(frame.columns[0])
         d = {k: recur_dictify(g.iloc[:,1:]) for k,g in grouped}
         return d
     
recur_dictify(df)
Out[211]: 
{'A': {'BB': {'CC': {'P1': 132, 'P2': 51}}, 'BC': {'CD': {'P3': 12}}},
 'B': {'BB': {'CD': {'P1': 421, 'P4': 55}}},
 'C': {'BB': {'CC': {'P1': 11}}, 'BC': {'CD': {'P3': 123}, 'CE': {'P6': 312}}}}
Sign up to request clarification or add additional context in comments.

2 Comments

Works like a charm. Nice recursive solution. Could you explain this solution a little bit? I got to check the squeeze function and how works g.iloc[:,1:] exactly?
@SvenRojek slice the position from 2nd row to the end

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.