This must be easy, but I can't get this DataFrame into the right shape.

import pandas as pd
df = pd.read_json('https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/user/Python_(programming_language)/daily/20210101/20210501')

The expected columns are:

project, article, granularity, timestamp, access, agent, user, views

2 Answers

>>> df = pd.read_json('https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/user/Python_(programming_language)/daily/20210101/20210501')
>>> pd.concat([df.drop(['items'], axis=1), df['items'].apply(pd.Series)], axis=1)
          project                        article granularity   timestamp      access agent  views
0    en.wikipedia  Python_(programming_language)       daily  2021010100  all-access  user   7238
1    en.wikipedia  Python_(programming_language)       daily  2021010200  all-access  user   8449
2    en.wikipedia  Python_(programming_language)       daily  2021010300  all-access  user   8669
3    en.wikipedia  Python_(programming_language)       daily  2021010400  all-access  user  10688
4    en.wikipedia  Python_(programming_language)       daily  2021010500  all-access  user  11383
..            ...                            ...         ...         ...         ...   ...    ...
116  en.wikipedia  Python_(programming_language)       daily  2021042700  all-access  user   6125
117  en.wikipedia  Python_(programming_language)       daily  2021042800  all-access  user   6184
118  en.wikipedia  Python_(programming_language)       daily  2021042900  all-access  user   5960
119  en.wikipedia  Python_(programming_language)       daily  2021043000  all-access  user   5489
120  en.wikipedia  Python_(programming_language)       daily  2021050100  all-access  user   4297

[121 rows x 7 columns]

2 Comments

Why the concat and drop? It seems that df['items'].apply(pd.Series) produces the same result.
It's just a common pandas pattern. It helps when you have columns, other than the nested one, that you want to keep.
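
To illustrate the reply above, here is a minimal sketch with a made-up frame (the "source" column and its values are hypothetical, not part of the Wikimedia response). It shows that expanding only the nested column drops any sibling columns, while the drop-and-concat pattern keeps them.

```python
import pandas as pd

# Hypothetical frame: one ordinary column ("source") next to a nested one ("items").
df = pd.DataFrame({
    "source": ["api", "api"],
    "items": [
        {"article": "A", "views": 10},
        {"article": "B", "views": 20},
    ],
})

# Expanding only the nested column loses "source".
expanded = df["items"].apply(pd.Series)

# Dropping the nested column and concatenating keeps "source"
# alongside the expanded fields.
combined = pd.concat([df.drop(["items"], axis=1), expanded], axis=1)

print(list(expanded.columns))
print(list(combined.columns))
```

In the question's data the frame has no column other than items, which is why both expressions give the same result there.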

You can use assign as well:

>>> df = pd.read_json('https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/user/Python_(programming_language)/daily/20210101/20210501')
>>> df.drop(columns='items').assign(**df['items'].apply(pd.Series))
          project                        article granularity   timestamp      access agent  views
0    en.wikipedia  Python_(programming_language)       daily  2021010100  all-access  user   7238
1    en.wikipedia  Python_(programming_language)       daily  2021010200  all-access  user   8449
2    en.wikipedia  Python_(programming_language)       daily  2021010300  all-access  user   8669
3    en.wikipedia  Python_(programming_language)       daily  2021010400  all-access  user  10688
4    en.wikipedia  Python_(programming_language)       daily  2021010500  all-access  user  11383
..            ...                            ...         ...         ...         ...   ...    ...
116  en.wikipedia  Python_(programming_language)       daily  2021042700  all-access  user   6125
117  en.wikipedia  Python_(programming_language)       daily  2021042800  all-access  user   6184
118  en.wikipedia  Python_(programming_language)       daily  2021042900  all-access  user   5960
119  en.wikipedia  Python_(programming_language)       daily  2021043000  all-access  user   5489
120  en.wikipedia  Python_(programming_language)       daily  2021050100  all-access  user   4297

[121 rows x 7 columns]
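
As an alternative that neither answer uses, pd.json_normalize may be worth a look, since it builds flat columns directly from a list of records. The sketch below assumes the response is a dictionary with a top-level "items" list, as the single items column above suggests. It works on a small in-memory payload that reuses the first two rows of the output shown, so no network call is made.

```python
import pandas as pd

# Sample payload shaped like the response (assumption: a top-level "items" list).
# The two records reuse the first two rows of the output shown above.
payload = {
    "items": [
        {"project": "en.wikipedia", "article": "Python_(programming_language)",
         "granularity": "daily", "timestamp": "2021010100",
         "access": "all-access", "agent": "user", "views": 7238},
        {"project": "en.wikipedia", "article": "Python_(programming_language)",
         "granularity": "daily", "timestamp": "2021010200",
         "access": "all-access", "agent": "user", "views": 8449},
    ]
}

# Each dictionary in the list becomes one row; its keys become the columns.
flat = pd.json_normalize(payload["items"])

print(flat.columns.tolist())
print(flat.shape)
```

With the real endpoint, the payload would first have to be fetched and parsed into a dictionary; that step is left out here.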
