I am extracting some tweets, and the API returns JSON (json_response) that looks something like this (I've added dummy IDs):

{
    "data": [
        {
            "author_id": "123456",
            "conversation_id": "7890",
            "created_at": "2020-03-01T23:59:58.000Z",
            "id": "12345678",
            "lang": "en",
            "public_metrics": {
                "like_count": 1,
                "quote_count": 2,
                "reply_count": 3,
                "retweet_count": 4
            },
            "referenced_tweets": [
                {
                    "id": "13664100",
                    "type": "retweeted"
                }
            ],
            "reply_settings": "everyone",
            "source": "Twitter for Android",
            "text": "This is a sample."
        }
],
"includes": {
        "users": [
            {
                "created_at": "2018-08-29T23:45:37.000Z",
                "description": "",
                "id": "7890123",
                "name": "Twitter user",
                "public_metrics": {
                    "followers_count": 1199,
                    "following_count": 1351,
                    "listed_count": 0,
                    "tweet_count": 52607
                },
                "username": "user_123",
                "verified": false
            }
]
}

I am trying to convert it into pandas dataframe using the following code:

import json
import pandas as pd

df = pd.DataFrame.from_dict(pd.json_normalize(json_response['data']), orient='columns')

This gives me a dataframe with the following header:

conversation_id | text | source | reply_settings | referenced_tweets | id | created_at | lang | author_id | public_metrics.retweet_count | public_metrics.reply_count | public_metrics.like_count | public_metrics.quote_count | in_reply_to_user_id

However, I also want username as a column in the df alongside the other columns, and I don't know how to add it. Any guidance, please?

1 Answer

IIUC, json_response['data'] and json_response['includes']['users'] are both lists of dictionaries. Why not build your own list of dictionaries from the two?

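import json
import pandas as pd

# response_raw is assumed to be the raw JSON string returned by the API
json_response = json.loads(response_raw)
your_dict_list = json_response['data']
# Pairs tweets and users by position, so both lists must have the same length
for i, user in enumerate(json_response['includes']['users']):
    your_dict_list[i]['username'] = user['username']

df = pd.json_normalize(your_dict_list)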

Output:

  author_id conversation_id                created_at        id lang  ...  username public_metrics.like_count public_metrics.quote_count public_metrics.reply_count public_metrics.retweet_count
0    123456            7890  2020-03-01T23:59:58.000Z  12345678   en  ...  user_123                         1                          2                          3                            4

3 Comments

It is giving me this error:
IndexError                                Traceback (most recent call last)
<ipython-input-40-7b211ebb88ef> in <module>()
      1 your_dict_list = json_response['data']
      2 for i, user in enumerate(json_response['includes']['users']):
----> 3     your_dict_list[i]['username'] = user['username']
      4 df = pd.json_normalize(your_dict_list)
IndexError: list index out of range
Are json_response['data'] and json_response['includes']['users'] the same length? How do you actually load your data? The JSON in your question has a missing bracket, and it's not a Python dictionary either (false instead of False). I assumed you would read your JSON as a string (response_raw in my example) and load it with json.loads.
Thank you for your response. It's strange that the length of json_response['data'] is 10 while json_response['includes']['users'] has 11. I am getting the data using these commands: url = create_url(keyword, start_time, end_time, max_results) and json_response = connect_to_endpoint(url[0], headers, url[1]), and then using json_response further.
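Since the two lists can differ in length (10 tweets vs. 11 users here), pairing them by position will either raise an IndexError or attach the wrong username. One plausible reason for the mismatch is that includes.users also lists accounts that are only referenced by the tweets, but that is a guess. Below is a minimal sketch of matching by ID instead. It assumes that each tweet's author_id equals the id of one entry in includes.users, which the dummy IDs in the question do not show, so check it against your real response. The sample data and the helper name attach_usernames are made up for illustration.

```python
import json

# Hypothetical response shaped like the one in the question. The IDs are
# invented so that each author_id matches a user id; there are 2 tweets
# and 3 users, so the lists deliberately differ in length and order.
response_raw = """
{
    "data": [
        {"id": "1", "author_id": "222", "text": "first"},
        {"id": "2", "author_id": "111", "text": "second"}
    ],
    "includes": {
        "users": [
            {"id": "111", "username": "user_a"},
            {"id": "222", "username": "user_b"},
            {"id": "333", "username": "user_c"}
        ]
    }
}
"""


def attach_usernames(json_response):
    """Return copies of the tweet dicts with a 'username' key added."""
    # Map user id -> username so list order and length no longer matter.
    usernames = {
        user["id"]: user["username"]
        for user in json_response["includes"]["users"]
    }
    rows = []
    for tweet in json_response["data"]:
        row = dict(tweet)  # shallow copy; the original response is not modified
        # .get() yields None when a tweet has no author_id or no matching user
        row["username"] = usernames.get(tweet.get("author_id"))
        rows.append(row)
    return rows


json_response = json.loads(response_raw)
rows = attach_usernames(json_response)
```

The resulting rows list can be passed to pd.json_normalize(rows) exactly like your_dict_list in the answer. Tweets without a matching user get None as the username rather than raising an error.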
