I have written a function that extracts certain information from a given website and stores it in a list. The output is a list of dicts, where each dict maps a column name to a list of values:
t = [{'title': ['title1',
'title2',
'title3'],
'link': ['link1',
'link2',
'link3'],
'promo_text': ['text1',
'text2',
'text3'],
'additional_info':['a',
'b',
'c']},
{'title': ['title4',
'title5',
'title6'],
'link': ['link4',
'link5',
'link6'],
'promo_text': ['text4',
'text5',
'text6'],
'additional_info': ['d',
'e',
'f']},
{'title': ['title7',
'title8',
'title9'],
'link': ['link7',
'link8',
'link9'],
'promo_text': ['text7',
'text8',
'text9'],
'additional_info': ['g',
'h',
'i']}]
I would like to convert this list into a pandas DataFrame where the columns are 'title', 'link', 'promo_text' and 'additional_info', with one row per list element, so that it looks like this:
| | title | link | promo_text | additional_info |
|---|---|---|---|---|
| 0 | title1 | link1 | text1 | a |
| 1 | title2 | link2 | text2 | b |
| 2 | title3 | link3 | text3 | c |
| 3 | title4 | link4 | text4 | d |
| 4 | title5 | link5 | text5 | e |
| 5 | title6 | link6 | text6 | f |
| 6 | title7 | link7 | text7 | g |
| 7 | title8 | link8 | text8 | h |
| 8 | title9 | link9 | text9 | i |
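Unfortunately, passing the list directly to the pd.DataFrame constructor does not give me the desired output. Each dict becomes a single row, and the lists end up as cell values:
import pandas as pd
t_df = pd.DataFrame(t)
t_df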
| | title | link | promo_text | additional_info |
|---|---|---|---|---|
| 0 | [title1, title2, title3] | [link1, link2, link3] | [text1, text2, text3] | [a, b, c] |
| 1 | [title4, title5, title6] | [link4, link5, link6] | [text4, text5, text6] | [d, e, f] |
| 2 | [title7, title8, title9] | [link7, link8, link9] | [text7, text8, text9] | [g, h, i] |
Is there a way to get one row per list element, with a continuous index from 0 to 8, using pandas? Any help is much appreciated!
Best, Sebastian
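
One possible approach, offered as a minimal sketch rather than the definitive answer: each dict in `t` is already a valid input for `pd.DataFrame` on its own (column name mapped to a list of equal length), so the per-dict frames can be built separately and stacked with `pd.concat`. This assumes every list within a dict has the same length, as in the sample data; the name `df` is just illustrative.

```python
import pandas as pd

# Sample data reproduced from the question: a list of dicts,
# each mapping a column name to a list of values.
t = [
    {"title": ["title1", "title2", "title3"],
     "link": ["link1", "link2", "link3"],
     "promo_text": ["text1", "text2", "text3"],
     "additional_info": ["a", "b", "c"]},
    {"title": ["title4", "title5", "title6"],
     "link": ["link4", "link5", "link6"],
     "promo_text": ["text4", "text5", "text6"],
     "additional_info": ["d", "e", "f"]},
    {"title": ["title7", "title8", "title9"],
     "link": ["link7", "link8", "link9"],
     "promo_text": ["text7", "text8", "text9"],
     "additional_info": ["g", "h", "i"]},
]

# Build one small DataFrame per dict, then stack them vertically.
# ignore_index=True replaces the repeated 0..2 indices with 0..8.
df = pd.concat([pd.DataFrame(d) for d in t], ignore_index=True)

print(df)
```

If the lists inside a single dict can differ in length, `pd.DataFrame(d)` raises a `ValueError`, so the data would need to be padded or cleaned first.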