flixoflax's answer should work and is nice and straightforward.
It does a big chunk of the work outside of pandas though; here is a pure pandas solution as an example (with the added bonus of retaining all the data if you decide you do want it after all).
The problem with doing the initial loop outside of pandas is that, while it works for the sample data, if the source data is in Excel then pandas can (and IMO should) be used to load it. In that case it makes little sense to load the data with pandas, loop over it outside of pandas, and then step back into pandas for the processing.
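If the source really is an Excel workbook, the load itself is a single call; something along these lines (just a sketch, the file and sheet names are placeholders rather than anything from the question):
import pandas as pd

# Hypothetical example - read the source workbook straight into a dataframe
# ('campaigns.xlsx' and 'Sheet1' are placeholder names)
raw = pd.read_excel('campaigns.xlsx', sheet_name='Sheet1')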
Import and load the data as normal:
import numpy as np
import pandas as pd
# Let's assume data is the result of the pandas Excel read
data = [
    {
        'other': False,
        'total': {
            'impressions': 346821,
            'taps': 12167,
            'installs': 7535,
            'newDownloads': 5364,
            'redownloads': 2171,
            'latOnInstalls': 1878,
            'latOffInstalls': 5657,
            'ttr': 0.0351,
            'avgCPA': {
                'amount': '1.8',
                'currency': 'GBP'
            },
            'avgCPT': {
                'amount': '1.1147',
                'currency': 'GBP'
            },
            'localSpend': {
                'amount': '123.456',
                'currency': 'GBP'
            },
            'conversionRate': 0.6193
        },
        'metadata': {
            'campaignId': 219752776,
            'campaignName': 'Campaign1',
            'deleted ': False
        }
    },
    {
        'other': False,
        'total': {
            'impressions': 346821,
            'taps': 12167,
            'installs': 7535,
            'newDownloads': 5364,
            'redownloads': 2171,
            'latOnInstalls': 1878,
            'latOffInstalls': 5657,
            'ttr': 0.0351,
            'avgCPA': {
                'amount': '1.8',
                'currency': 'GBP'
            },
            'avgCPT': {
                'amount': '1.1147',
                'currency': 'GBP'
            },
            'localSpend': {
                'amount': '123.456',
                'currency': 'GBP'
            },
            'conversionRate': 0.6193
        },
        'metadata': {
            'campaignId': 219752776,
            'campaignName': 'Campaign1',
            'deleted ': False
        }
    }
]
Load the data straight into a dataframe instead of looping over it and picking out a single key. This produces a dataframe that uses more memory, but it means every value is available and no initial loop is needed outside of pandas.
Also, on first load some columns will still contain dictionaries, and these will need to be cleaned up.
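You can see this by checking the dtypes after the initial load (a quick sketch using the data list above):
df = pd.DataFrame(data)
print(df.dtypes)                 # 'total' and 'metadata' come through as object columns
print(type(df.loc[0, 'total']))  # <class 'dict'> - the nested values are still plain dicts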
The necessary operations can be done step by step or all at once; for the sake of this example, I'll show both versions for comparison.
# Step by step
# Create dataframe
df = pd.DataFrame(data)
# Split out the 'total' column
df2 = df['total'].apply(pd.Series)
# Split out the 'localSpend' column
df3 = df2['localSpend'].apply(pd.Series)
# Merge the three dataframes back together
result = pd.concat([df, df2, df3], axis=1)
print(f"Total result is:{result['amount'].astype('float64').sum()}")
Or a more concise form, with the split and merges occurring together:
df = pd.DataFrame(data)
df = pd.concat([df, df['total'].apply(pd.Series)], axis=1)
df = pd.concat([df, df['localSpend'].apply(pd.Series)], axis=1)
print(f"Total result is:{df['amount'].astype('float64').sum()}")
Note that df['amount'].astype('float64') has been used because the column has been left at its default dtype (object). This would not be needed if you converted the column to a numeric type as flixoflax did:
df['amount'] = pd.to_numeric(df["amount"], downcast="float")
print(f"Total result is:{df['amount'].sum():.3f}")
The final version of the 'amount' column can also be split off into its own dataframe or Series, and converted to a float at the same time:
df2["amount"] = pd.to_numeric(df["amount"], downcast="float")
print(f"Total result is:{df2['amount'].sum():.3f}")