I am working with JSON files that store thousands of entries or more. First, I want to understand the data I am working with.

import json


with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)
print(json.dumps(data, indent=4))

This gives me an easy-to-read format, but some of the "keys" (I am not familiar with the JSON terminology, so I use the word "key" as for dict objects) have thousands of values, which makes the output hard to read as a whole.

I also tried:

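import json

import pandas as pd


with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)

df = pd.DataFrame.from_dict(data, orient="index")
# Note: df.info is not called here (no parentheses), so this prints the bound method.
print(df.info)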

but got

<bound method DataFrame.info of                                                   result error
chart  [{'meta': {'currency': 'USD', 'symbol': 'AAL',...  None>

This result shows some of the structure, but it is truncated with ..., so it does not show the whole picture.

My questions:

  1. Is there something that works like a NumPy array's .shape for JSON, dicts, or pandas objects and shows the shape of the structure?

  2. Is there a library better suited to inspecting a JSON file's structure?

Edit: Sorry, my wording of the problem may have been misleading. I tried pprint, and it gave me:

{ 'chart': { 'error': None,
             'result': [ { 'events': { 'dividends': { '1406813400': { 'amount': 0.1,
                                                                      'date': 1406813400},
                                                      '1414675800': { 'amount': 0.1,
                                                                      'date': 1414675800},
                                                      '1423146600': { 'amount': 0.1,
                                                                      'date': 1423146600},
                                                      '1430400600': { 'amount': 0.1,
                                                                      'date': 1430400600},
                                                      '1438867800': { 'amount': 0.1,
                                                                      'date': 1438867800},
                                                      '1446561000': { 'amount': 0.1,
                                                                      'date': 1446561000},
                                                      '1454941800': { 'amount': 0.1,
                                                                      'date': 1454941800},
                                                      '1462195800': { 'amount': 0.1,
                                                                      'date': 1462195800},
                                                      '1470231000': { 'amount': 0.1,
                                                                      'date': 1470231000},
                                                      '1478179800': { 'amount': 0.1,
                                                                      'date': 1478179800},
                                                      '1486650600': { 'amount': 0.1,
                                                                      'date': 1486650600},
                                                      '1494595800': { 'amount': 0.1,
                                                                      'date': 1494595800},
                                                      '1502371800': { 'amount': 0.1,
                                                                      'date': 1502371800},
                                                      '1510324200': { 'amount': 0.1,
                                                                      'date': 1510324200},
                                                      '1517841000': { 'amount': 0.1,
                                                                      'date': 1517841000},
                                                      '1525699800': { 'amount': 0.1,
                                                                      'date': 1525699800},
                                                      '1533562200': { 'amount': 0.1,
                                                                      'date': 1533562200},
                                                      '1541428200': { 'amount': 0.1,
                                                                      'date': 1541428200},
                                                      '1549377000': { 'amount': 0.1,
                                                                      'date': 1549377000},
                                                      '1557235800': { 'amount': 0.1,
                                                                      'date': 1557235800},
                                                      '1565098200': { 'amount': 0.1,
                                                                      'date': 1565098200},
                                                      '1572964200': { 'amount': 0.1,
                                                                      'date': 1572964200},
                                                      '1580826600': { 'amount': 0.1,
                                                                      'date': 1580826600}}},
                           'indicators': { 'adjclose': [ { 'adjclose': [ 18.19490623474121,
                                                                         19.326200485229492,
                                                                         19.05280113220215,
                                                                         19.80699920654297,
                                                                         20.268939971923828,
                                                                         20.891149520874023,
                                                                         20.928863525390625,
                                                                         21.28710174560547,
                                                                         20.88172149658203,
                                                                         20.93828773498535,
                                                                         20.721458435058594,
                                                                         20.514055252075195,
                                                                         20.466917037963867,
                                                                         20.994853973388672,
                                                                         20.81572914123535,
                                                                         20.2595157623291,
                                                                         20.155811309814453,
                                                                         19.816425323486328,
                                                                         20.702600479125977,
                                                                         21.032560348510742,
                                                                         20.740314483642578,
                                                                         21.0419864654541,
                                                                         21.26824951171875,
                                                                         22.531522750854492,
                                                                         23.266857147216797,
                                                                         23.587390899658203,
                                                                         25.9725284576416,
                                                                         26.27420997619629,
                                                                         27.150955200195312,
                                                                         27.273509979248047,
                                                                         27.7448787689209,
                                                                         29.507808685302734,
                                                                         30.92192840576172,
                                                                         31.4404239654541,
                                                                         31.817523956298828,
                                                                         31.940074920654297,
                                                                         31.676118850708008,
                                                                         32.354888916015625,
                                                                         31.157604217529297,
                                                                         30.158300399780273,
                                                                         30.63909339904785,
                                                                         31.148174285888672,
                                                                         30.969064712524414,
                                                                         31.496990203857422,
                                                                         31.01619529724121,
                                                                         31.666685104370117,
                                                                         32.31717300415039,
                                                                         32.31717300415039,
                                                                         30.497684478759766,
                                                                         31.69496726989746,
                                                                         32.006072998046875,
                                                                         31.7326717376709,
                                                                         31.940074920654297,
                                                                         31.826950073242188,
                                                                         31.346155166625977,
                                                                         31.61954689025879,
                                                                         ...
                                                                         ...
                                                                         ...
# This goes on and on for each of the "keys" in the JSON file, which means I have to scroll through thousands of lines to find out what kind of data I have.

What I am hoping to find is a solution that outputs something like the following: it does not show the data in full, only the "keys" and maybe some additional information. Some files may contain many GBs of data, which makes scrolling through them impractical.

# This is what I am hoping to achieve.
{
    "Name": {
        "title": <data_type=str, len=20>,
        "time_stamp": <data_type=list, len=3000>,
        "closing_price": <data_type=list, len=3000>,
        "high_price_of_the_day": <data_type=list, len=3000>,
        ...
        ...
        ...
    }
}
  • Would displaying a tree as follows be helpful? stackoverflow.com/questions/55926688/… Commented Nov 5, 2021 at 9:20
  • @JasonChia That may be a workable solution, but not out of the box, as I want to hide some of the "branches" to make visualizing the data easier. I will try to build on that suggestion and come back if it works. Commented Nov 5, 2021 at 11:58
  • Looks like you want to print out the keys as well as the types of values. You can do a simple recursive function for that with some additional 'ignore' rules for keys that you do not care about. Essentially, parsing a nested dict such that you get all keys and the final type(value) of each possible key. Commented Nov 5, 2021 at 12:49
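
The recursive approach described in the last comment could look roughly like the sketch below. It is only an illustration, not a library function: the names summarize, max_keys and max_depth are made up for this example, and the choice to inspect only the first element of a list of containers is an assumption that every element has the same layout. Dicts keep their keys (truncated after max_keys), while lists and scalars collapse to a short description with type and length.

```python
import json


def summarize(obj, max_keys=5, max_depth=8, _depth=0):
    """Return a nested summary of obj showing keys, types and lengths only.

    Hypothetical helper for illustration; not part of json or pandas.
    """
    if isinstance(obj, dict):
        if _depth >= max_depth:
            return f"<dict, len={len(obj)}>"
        keys = list(obj)
        # Keep only the first max_keys keys so huge mappings stay readable.
        out = {
            str(k): summarize(obj[k], max_keys, max_depth, _depth + 1)
            for k in keys[:max_keys]
        }
        if len(keys) > max_keys:
            out["..."] = f"<{len(keys) - max_keys} more keys>"
        return out
    if isinstance(obj, list):
        # Assumption: a list of containers is homogeneous, so the first
        # element is taken as representative of all of them.
        if obj and isinstance(obj[0], (dict, list)) and _depth < max_depth:
            label = f"<list, len={len(obj)}> first item"
            return {label: summarize(obj[0], max_keys, max_depth, _depth + 1)}
        item_types = sorted({type(x).__name__ for x in obj})
        return f"<list, len={len(obj)}, items={'|'.join(item_types) or 'none'}>"
    if isinstance(obj, str):
        return f"<str, len={len(obj)}>"
    return f"<{type(obj).__name__}>"


# Small inline sample shaped loosely like the file in the question.
sample = {
    "chart": {
        "error": None,
        "result": [
            {
                "meta": {"currency": "USD", "symbol": "AAL"},
                "timestamp": [1406813400, 1414675800, 1423146600],
                "indicators": {"adjclose": [{"adjclose": [18.19, 19.32]}]},
            }
        ],
    }
}
print(json.dumps(summarize(sample), indent=4))
```

To use it on a real file, pass the result of json.load(f) instead of sample. Note that this still loads the whole file into memory, so it does not by itself address multi-GB files.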

1 Answer

You have a few options for navigating this. If you want to render your data so you can make informed decisions quickly, there is the built-in pprint module for rendering dictionaries, but personally I recommend something that works out of the box without much configuration. I found pprintpp to be the ideal choice for any Python data structure. https://pypi.org/project/pprintpp/

Simply run in your terminal: pip3 install pprintpp. On Windows, the library should install under C:\Users\User\AppData\Local\Programs\Python\PythonXX\Lib\site-packages\pprintpp

After that, simply do this in your code:

import json
from pprintpp import pprint

with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)
pprint(data)

You can also use pprint(data, width=1) to guarantee that each dictionary key goes on its own line, even if the key is short. For example:

some_dict = {'a': 'b', 'c': {'aa': 'bb'}}
pprint(some_dict, width=1)

Outputs:

{
    'a': 'b',
    'c': {
        'aa': 'bb',
    },
}

Hope this helped! Cheers :)


1 Comment

Thank you for the pprint option, but I am looking for a way to display the structure of the data in a readable fashion, rather than the data itself.
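
As a partial step toward showing structure rather than data, the standard library's pprint accepts a depth argument that replaces anything nested deeper than the given level with an ellipsis. The sketch below uses a small inline sample (made up for this example) rather than the real file. One limitation to keep in mind: a long flat list that sits within the allowed depth is still printed in full, so this only hides nesting, not length.

```python
from pprint import pformat, pprint

# Small inline sample shaped loosely like the file in the question.
sample = {
    "chart": {
        "error": None,
        "result": [{"meta": {"symbol": "AAL"}, "timestamp": [1, 2, 3]}],
    }
}

# Show only the top two levels; deeper containers are elided.
pprint(sample, depth=2)

# pformat returns the same text as a string instead of printing it.
top_level_only = pformat(sample, depth=1)
print(top_level_only)
```

Raising depth one level at a time is a quick way to walk down an unfamiliar file without printing the leaf data.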
