2

I need to create a table with Pandas from nested documents stored in MongoDB.

This is my JSON:

{
    "CNPJ" : "65206503000163",
    "CNAE" : [
        {
            "codigoCNAE" : 7911200,
            "dataInicioCNAE" : 20000101
        },
        {
            "codigoCNAE" : 9999999,
            "dataInicioCNAE" : 2018101
        }
    ]
}

I need a simple table:

    CNPJ             codigoCNAE     dataInicioCNAE 
0   65206503000163   7911200        20000101      
1   65206503000163   9999999        2018101

Thanks

5 Answers

1

Assuming that you have only one such document, you can use the following code.

import pandas as pd

dict1 = { "CNPJ" : "65206503000163", "CNAE" : [{ "codigoCNAE" : 7911200, "dataInicioCNAE" : 20000101, }, { "codigoCNAE" : 9999999, "dataInicioCNAE" : 2018101, } ] }

# Build the rows from the nested list, then repeat the parent CNPJ on every row
df = pd.DataFrame(dict1['CNAE'])
df['CNPJ'] = dict1['CNPJ']

Output:

print(df)

   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163

For multiple documents, you can iterate through the documents, build one DataFrame per document as above, and combine them with pd.concat.
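A minimal sketch of that loop, assuming the documents have already been loaded into a Python list. The second document and the names docs, frames and part are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample: the first document is the one from the question,
# the second is invented to show the multi-document case.
docs = [
    {"CNPJ": "65206503000163",
     "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
              {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]},
    {"CNPJ": "00000000000191",
     "CNAE": [{"codigoCNAE": 6422100, "dataInicioCNAE": 19660801}]},
]

frames = []
for doc in docs:
    # One row per nested CNAE entry
    part = pd.DataFrame(doc["CNAE"])
    # Repeat the parent CNPJ on each of those rows
    part["CNPJ"] = doc["CNPJ"]
    frames.append(part)

# ignore_index=True renumbers the rows 0..n-1 across all documents
df = pd.concat(frames, ignore_index=True)
print(df)
```

Collecting the pieces in a list and calling pd.concat once at the end avoids re-copying the accumulated frame on every iteration.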


1

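Use json_normalize (exposed as pd.json_normalize since pandas 1.0; the older import from pandas.io.json is deprecated):

import pandas as pd

dict1 = { "CNPJ" : "65206503000163", 
          "CNAE" : [{ "codigoCNAE" : 7911200, 
                      "dataInicioCNAE" : 20000101, }, 
                    { "codigoCNAE" : 9999999, 
                      "dataInicioCNAE" : 2018101, } ] }

# record_path names the nested list to expand; meta repeats CNPJ on each row
df = pd.json_normalize(dict1, record_path='CNAE', meta='CNPJ')
print(df)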
   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163
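json_normalize also accepts a list of documents, which may be convenient when several records come back from a query. The sketch below assumes pandas 1.0 or later (where pd.json_normalize is available) and uses an invented second document; the name docs is hypothetical.

```python
import pandas as pd

# Hypothetical sample: the first document is the one from the question,
# the second is invented to show the multi-document case.
docs = [
    {"CNPJ": "65206503000163",
     "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
              {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]},
    {"CNPJ": "00000000000191",
     "CNAE": [{"codigoCNAE": 6422100, "dataInicioCNAE": 19660801}]},
]

# Expand every entry of each document's CNAE list into its own row
# and carry the parent CNPJ along as a metadata column.
df = pd.json_normalize(docs, record_path="CNAE", meta=["CNPJ"])
print(df)
```

With documents coming from MongoDB, the list would presumably be built from the query cursor first; that step is not shown here because it depends on the connection and collection in use.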


0

You need:

import pandas as pd 

x = {
"CNPJ" : "65206503000163",
"CNAE" : [ 
        {
            "codigoCNAE" : 7911200,
            "dataInicioCNAE" : 20000101,
        },
        {
            "codigoCNAE" : 9999999,
            "dataInicioCNAE" : 2018101,
        }
        ]
}

df = pd.DataFrame.from_dict(x, orient='columns')
df = pd.concat([df['CNAE'].apply(pd.Series), df['CNPJ']], axis=1)
print(df)

Output:

   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163


0

Simply make a DataFrame from the dict you have and separate it into two parts. Expand the CNAE column into its own columns with apply(pd.Series), then concat the result with the CNPJ column along axis 1.

import pandas as pd

x = {
    "CNPJ" : "65206503000163",
    "CNAE" : [ 
                {
                    "codigoCNAE" : 7911200,
                    "dataInicioCNAE" : 20000101,
                },
                {
                    "codigoCNAE" : 9999999,
                    "dataInicioCNAE" : 2018101,
                }
            ]
}

x_df = pd.DataFrame(x)

a_df = x_df['CNAE'].apply(pd.Series)
b_df = x_df['CNPJ']

df = pd.concat([b_df, a_df], axis=1)

df

# Output

             CNPJ  codigoCNAE  dataInicioCNAE
0  65206503000163     7911200        20000101
1  65206503000163     9999999         2018101


0

Use concat (here d is the dictionary from the question):

>>> import pandas as pd
>>> df = pd.DataFrame(d)
>>> pd.concat([df[['CNPJ']], pd.DataFrame(d['CNAE'])], axis=1)
             CNPJ  codigoCNAE  dataInicioCNAE
0  65206503000163     7911200        20000101
1  65206503000163     9999999         2018101
