I have this JSON file and want to convert it to CSV using pandas:

  {
        "partes": [
            {
                "processo": "1001824-89.2019.8.26.0493",
                "tipo": "Reqte: ",
                "nome": "Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel      ",
                "cnpj_cpf": "Não encontrado",
                "oab": "Não encontrado"
            },
            {
                "processo": "1001824-89.2019.8.26.0493",
                "tipo": "Reqda: ",
                "nome": "CLARO S/A   ",
                "cnpj_cpf": "Não encontrado",
                "oab": "Não encontrado"
            }
        ],
        "movimentacoes": [
            {
                "processo": "1001824-89.2019.8.26.0493",
                "data": "28/10/2019",
                "tem_anexo": "",
                "movimentacao": " Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)  "
            }
        ]
    }

When I use the read_json function, it raises this error: ValueError: arrays must all be same length

Here is my code:

import pandas as pd
import json
import os

os.chdir('C:\\Users\\Suporte\\Desktop\\AUT\\autonomation')


df = pd.read_json('file.json')

df_ = df.to_csv('file.csv', sep=';',index=False)

I don't know why it can't read the file.

  • Does this answer your question? Convert JSON to CSV with pandas Commented Oct 29, 2019 at 16:03
  • I already tried that post and it didn't work. Commented Oct 29, 2019 at 16:07
  • What do you expect the CSV to look like? Commented Oct 29, 2019 at 16:12
  • Separated by commas, in rows and columns. Commented Oct 29, 2019 at 16:15
  • And what are the columns? Commented Oct 29, 2019 at 16:21

1 Answer

  • Remember that pandas works with tabular data: every row in a table shares the same set of column headers.
  • The JSON presented here, as a whole, does not correspond to tabular data.
  • This JSON needs to be read one top-level key at a time, with each key becoming its own dataframe.
  • Alternatively, pd.read_json would only accept the whole file if partes and movimentacoes were the same length.
    • Here partes has 2 entries while movimentacoes has 1, which is what triggers the ValueError.
  • Given the following data, in a file named test1.json

Data:

{
    "partes": [{
            "processo": "1001824-89.2019.8.26.0493",
            "tipo": "Reqte: ",
            "nome": "Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel      ",
            "cnpj_cpf": "Não encontrado",
            "oab": "Não encontrado"
        }, {
            "processo": "1001824-89.2019.8.26.0493",
            "tipo": "Reqda: ",
            "nome": "CLARO S/A   ",
            "cnpj_cpf": "Não encontrado",
            "oab": "Não encontrado"
        }
    ],
    "movimentacoes": [{
            "processo": "1001824-89.2019.8.26.0493",
            "data": "28/10/2019",
            "tem_anexo": "",
            "movimentacao": " Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)  "
        }
    ]
}

Code:

from pathlib import Path
import pandas as pd
import json

# path to file
p = Path(r'c:\some_path_to_data\test1.json')

# read the JSON file in (assumes the file is UTF-8 encoded, since it contains accented characters)
with p.open('r', encoding='utf-8') as f:
    data = json.load(f)

# create the dataframe
df_partes = pd.DataFrame.from_dict(data['partes'])
print(df_partes)

                  processo     tipo                                                                  nome         cnpj_cpf              oab
 1001824-89.2019.8.26.0493  Reqte:   Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel        Não encontrado  Não encontrado
 1001824-89.2019.8.26.0493  Reqda:                                                           CLARO S/A     Não encontrado  Não encontrado

df_movimentacoes = pd.DataFrame.from_dict(data['movimentacoes'])
print(df_movimentacoes)

                  processo        data tem_anexo                                                                         movimentacao
 1001824-89.2019.8.26.0493  28/10/2019             Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)

# save to csv
df_partes.to_csv('partes.csv', index=False)
df_movimentacoes.to_csv('movimentacoes.csv', index=False)
  • If the JSON has many keys, consider making a dictionary of dataframes as follows:
df_dict = {key: pd.DataFrame.from_dict(data[key]) for key in data.keys()}

# Access a specific dataframe just like a regular dictionary
df_dict['partes']

# save to csv
for key in df_dict.keys():
    df_dict[key].to_csv(f'{key}.csv', index=False)
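
To see why reading the whole file at once fails while the per-key approach works, here is a minimal in-memory sketch. The `data` dict is a shortened stand-in for the parsed JSON (only two fields per record are kept), and the variable names `whole_file_error`, `frames`, and `shapes` are made up for illustration. The exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

# Shortened, hypothetical stand-in for the parsed JSON file.
data = {
    "partes": [
        {"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqte: "},
        {"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqda: "},
    ],
    "movimentacoes": [
        {"processo": "1001824-89.2019.8.26.0493", "data": "28/10/2019"},
    ],
}

# Building one frame from the whole dict treats each top-level key as a
# column, so the 2-element and 1-element lists cannot be aligned.
try:
    pd.DataFrame(data)
    whole_file_error = None
except ValueError as exc:
    whole_file_error = exc

# Building one frame per key works, because each list is tabular on its own.
frames = {key: pd.DataFrame(records) for key, records in data.items()}
shapes = {key: frame.shape for key, frame in frames.items()}

print(repr(whole_file_error))
print(shapes)
```

Each entry in `frames` can then be written out with `to_csv` exactly as in the loop above.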

2 Comments

I corrected it, but now it returns ValueError: The DataFrame builder was not called correctly!
@user158433 I can only help you with the information you've provided. The code has been tested against the JSON you provided, so I know the syntax is correct. However, if the entire JSON is significantly different than what's been provided, there may be issues I can't account for.
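
If the full file does differ from the sample, one possible cause of that second error (an assumption, since the complete JSON isn't shown) is that some top-level values are not lists of records; passing a plain string to the DataFrame constructor raises a ValueError, for instance. The sketch below checks each key before building a frame. The helper name `frames_from_record_lists` and the `"status"` field are hypothetical and only serve the illustration.

```python
import pandas as pd


def frames_from_record_lists(data):
    """Build a DataFrame for each key whose value is a list of dicts.

    Returns (frames, skipped), where skipped maps every other key to the
    type name of its value so the odd entries can be inspected by hand.
    """
    frames, skipped = {}, {}
    for key, value in data.items():
        if isinstance(value, list) and all(isinstance(item, dict) for item in value):
            frames[key] = pd.DataFrame(value)
        else:
            skipped[key] = type(value).__name__
    return frames, skipped


sample = {
    "partes": [{"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqte: "}],
    "status": "ok",  # hypothetical scalar field, not present in the posted JSON
}

frames, skipped = frames_from_record_lists(sample)
print(list(frames))
print(skipped)
```

Printing `skipped` for the real file would show which keys need different handling before they can be saved as CSV.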
