
I am struggling to get one piece of JSON information to appear in a dataframe column.

The information is 'data_at': 1619293080600

Here is what I have so far:

import json
import requests

requestT = requests.get('https:............')
json_dataT = json.loads(requestT.text)

print(json_dataT)

Output:

{'data_at': 1619293080600, 'data': {'london_NW': {'loc_postcode': 'NW1', 'loc_name': 'camden_twn', 'ave_price': '1061227.00'}, 'london_SW': {'loc_postcode': 'SW1', 'loc_name': 'victoria', 'ave_price': '1878130.00'}}}

I then transform this into a dataframe via the following method:

import pandas as pd
from pandas import json_normalize  # top-level import works in pandas 1.0+

df = pd.DataFrame(json_dataT)
dfNormal = json_normalize(df['data'])

However, I lose the 'data_at' information which is a timestamp that I want in column 0. What I get is the following:

        loc_postcode          loc_name              ave_price
0                NW1        camden_twn             1061227.00
1                SW1          victoria             1878130.00

How can I get the 'data_at' (timestamp) to appear as the first column?

2 Answers


One way to get the desired result is to flatten json_dataT first, because the keys of json_dataT['data'] ('london_NW' and 'london_SW') do not appear anywhere in the desired output.

Specifically, drop that level of keys by replacing the dict with a list of its values:

>>> json_dataT['data'] = list(json_dataT['data'].values())  
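
To see what this flattening step does on its own, here is a minimal sketch using the payload printed in the question. The name flattened is made up for this example, and building a new dict instead of overwriting json_dataT is just a choice made here so the original stays available:

```python
# Sample payload copied from the question's printed output.
json_dataT = {
    "data_at": 1619293080600,
    "data": {
        "london_NW": {"loc_postcode": "NW1", "loc_name": "camden_twn", "ave_price": "1061227.00"},
        "london_SW": {"loc_postcode": "SW1", "loc_name": "victoria", "ave_price": "1878130.00"},
    },
}

# Copy the top level and replace the keyed dict under 'data' with a plain
# list of its values, which discards the 'london_NW' / 'london_SW' keys.
flattened = {**json_dataT, "data": list(json_dataT["data"].values())}

print(flattened["data"])
```

After this, 'data' is a list of two records, which is the shape json_normalize expects for record_path.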

Then apply json_normalize (available as pd.json_normalize in pandas 1.0 and later) to get the dataframe:

>>> df_normal = json_normalize(json_dataT, record_path='data', meta='data_at')
>>> df_normal
  loc_postcode    loc_name   ave_price        data_at
0          NW1  camden_twn  1061227.00  1619293080600
1          SW1    victoria  1878130.00  1619293080600
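
For anyone who wants to run this step in isolation, here is a rough self-contained version. It assumes pandas 1.0 or later (for pd.json_normalize) and starts from the already-flattened payload, hard-coded here from the question's data:

```python
import pandas as pd

# Payload after the flattening step: 'data' is a list of records.
flattened = {
    "data_at": 1619293080600,
    "data": [
        {"loc_postcode": "NW1", "loc_name": "camden_twn", "ave_price": "1061227.00"},
        {"loc_postcode": "SW1", "loc_name": "victoria", "ave_price": "1878130.00"},
    ],
}

# record_path picks the list whose items become rows; meta names the
# top-level fields to repeat on every row.
df_normal = pd.json_normalize(flattened, record_path="data", meta=["data_at"])

print(df_normal)
```

The meta column is appended after the record columns, which is why data_at ends up last at this point.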

Finally, reorder the columns to make data_at the first column:

>>> cols = df_normal.columns.tolist()
>>> cols = cols[-1:] + cols[:-1]
>>> df_normal = df_normal[cols]
>>> df_normal
         data_at loc_postcode    loc_name   ave_price
0  1619293080600          NW1  camden_twn  1061227.00
1  1619293080600          SW1    victoria  1878130.00
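
The slicing above relies on data_at being the last column. If you would rather not depend on its position, one alternative (a sketch, not part of the original answer) is to move the column to the front by name. The dataframe here is rebuilt by hand from the output shown above:

```python
import pandas as pd

# Same contents as the normalized dataframe, with data_at as the last column.
df_normal = pd.DataFrame({
    "loc_postcode": ["NW1", "SW1"],
    "loc_name": ["camden_twn", "victoria"],
    "ave_price": ["1061227.00", "1878130.00"],
    "data_at": [1619293080600, 1619293080600],
})

# Put the chosen column first and keep the others in their existing order.
first = "data_at"
cols = [first] + [c for c in df_normal.columns if c != first]
df_normal = df_normal[cols]

print(df_normal.columns.tolist())
```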

1 Comment

Just saw this a few minutes ago. It worked like a charm. Thanks buddy

Because of the way df is constructed, its index is Index(['london_NW', 'london_SW'], dtype='object'), which does not match the default integer index produced by your .json_normalize() operation.
To get around this you can use .reset_index(), and then concatenate horizontally using pd.concat() with axis=1:

df = df.reset_index(drop=True)
df_normal = pd.concat([df['data_at'], pd.json_normalize(df['data'])], axis=1)

Result:

In [63]: df_normal
Out[63]: 
         data_at loc_postcode    loc_name   ave_price
0  1619293080600          NW1  camden_twn  1061227.00
1  1619293080600          SW1    victoria  1878130.00
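
Here is the same idea as a self-contained sketch, using the payload from the question. Two things differ from the snippet above and are choices made for this example: the result of reset_index is assigned back instead of using inplace=True, and the 'data' column is converted with .tolist() so that pd.json_normalize receives a plain list of dicts:

```python
import pandas as pd

# Sample payload copied from the question's printed output.
json_dataT = {
    "data_at": 1619293080600,
    "data": {
        "london_NW": {"loc_postcode": "NW1", "loc_name": "camden_twn", "ave_price": "1061227.00"},
        "london_SW": {"loc_postcode": "SW1", "loc_name": "victoria", "ave_price": "1878130.00"},
    },
}

# The scalar data_at is broadcast to every row; the index comes from the
# keys of the 'data' dict ('london_NW', 'london_SW').
df = pd.DataFrame(json_dataT)

# Swap the string index for 0..n-1 so it lines up with json_normalize's output.
df = df.reset_index(drop=True)

# Place data_at next to the normalized records, column-wise.
df_normal = pd.concat(
    [df["data_at"], pd.json_normalize(df["data"].tolist())],
    axis=1,
)

print(df_normal)
```

Note that reset_index(drop=True, inplace=True) returns None, so writing df = df.reset_index(drop=True, inplace=True) leaves df as None. That might be what is behind the TypeError reported in the comment below, though the comment does not show the code, so this is only a guess.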

1 Comment

I get an error of TypeError: 'NoneType' object is not subscriptable
