0

I have the following sample JSON, stored as strings in a pandas Series:

s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])

I don't know how to explode this kind of JSON into columns and would appreciate some help. I tried json_normalize, loads and so on, but I get empty results.

Wanted result would be something like this:

df = pd.DataFrame({'city':['Uberlândia','Uberlândia','Uberlândia'],'bot-origin':[None,None,None]}) # There are more columns, but you get the gist.

Since there were a lot of answers, I would appreciate it if someone showed me the most time-efficient way, as I have lots of rows.

3 Answers

1

Try this:

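import json
import pandas as pd

# Parse each JSON string into a dict, then build the DataFrame from those records
df = pd.DataFrame.from_records(s.map(json.loads))
print(df)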

Only part of the full DataFrame is printed here for presentation.

         city bot-origin                    campaign-source                                  lastState main-installation-date
0  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue                   None
1  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue                   None
2  Uberlândia       None  carrinho-abandonado-ecommerce-sms     productAvailabilityAddressConfirmation                   None
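
The question mentions getting empty results, and `json.loads` raises an error on missing or malformed rows. A minimal sketch of a tolerant variant follows; `safe_loads`, `raw` and `df_safe` are hypothetical names introduced here, and returning an empty dict for bad rows is an assumption about what is acceptable for your data.

```python
import json
import pandas as pd

def safe_loads(text):
    """Return the parsed JSON object, or an empty dict if the row is missing or invalid."""
    try:
        obj = json.loads(text)
    except (TypeError, ValueError):  # None/NaN input or malformed JSON
        return {}
    # Only JSON objects map cleanly onto columns; anything else becomes an empty record
    return obj if isinstance(obj, dict) else {}

# Hypothetical sample: one valid row, one missing row, one malformed row
raw = pd.Series(['{"city": "Uberlândia", "state": "MG"}', None, 'not json'])

records = [safe_loads(x) for x in raw]
df_safe = pd.DataFrame.from_records(records)
print(df_safe)
```

Rows that could not be parsed end up as all-NaN rows, so they can be dropped afterwards with `df_safe.dropna(how="all")` if that suits the use case.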

0

It looks like you have JSON objects stored as strings. First convert them to dicts with json.loads, then pass the result to pd.json_normalize.

import json
import pandas as pd

df1 = pd.json_normalize(s.map(json.loads))


         city bot-origin                    campaign-source                                  lastState  ... main-installation-period-day plan-value alternative-installation-period-day data-change
0  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue  ...                         None       None                                None       false
1  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue  ...                         None       None                                None       false
2  Uberlândia       None  carrinho-abandonado-ecommerce-sms     productAvailabilityAddressConfirmation  ...                         None       None                                None       false
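
For the flat objects in the question, `json_normalize` and `from_records` give the same columns. The difference shows up with nested objects, which the sample data does not contain. The sketch below uses a hypothetical nested `address` field (an assumption, not part of the original data) to illustrate it.

```python
import json
import pandas as pd

# Hypothetical nested record; the question's data is flat
nested = pd.Series([
    '{"city": "Uberlândia", "address": {"state": "MG", "postalcode": "38405328"}}',
])
parsed = nested.map(json.loads).tolist()

flat = pd.json_normalize(parsed)          # nested keys become dotted column names
plain = pd.DataFrame.from_records(parsed) # nested dict stays in a single column

print(sorted(flat.columns))   # ['address.postalcode', 'address.state', 'city']
print(sorted(plain.columns))  # ['address', 'city']
```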

0

The following should work:

import json
import pandas as pd

s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])

df = s.apply(lambda row: pd.Series(json.loads(row)))

Output:

city bot-origin campaign-source lastState main-installation-date userid full-name alternative-installation-date chosen-product bank postalcode due-date cpf origin-link payment state api-orders-hash-id email api-orders-error plan-name userphone plan-offer completed-address type-of-person onboarding-simplified type-of-product main-installation-period-day plan-value alternative-installation-period-day data-change
0 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG CPF Residencial false
1 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG CPF Residencial false
2 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityAddressConfirmation [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG CPF Residencial false

The null values are converted to None. If you want to convert them to NaN you can use .fillna(np.nan):

import json
import numpy as np

df = (
    s.apply(lambda row: pd.Series(json.loads(row)))
     .fillna(np.nan)
)
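
Since the question asks for the most time-efficient option, here is a rough way to compare the three approaches from the answers. It is only a sketch: the function names and the small synthetic `sample` Series are made up for illustration, and the timings depend on the machine, the pandas version and the shape of the real data. Building one `pd.Series` per row tends to be the slowest of the three, but it is worth measuring on your own rows.

```python
import json
import timeit
import pandas as pd

# Synthetic stand-in for the real data (assumption: flat JSON objects, one per row)
sample = pd.Series(['{"city": "Uberlândia", "state": "MG", "bank": null}'] * 2000)

def via_from_records(s):
    return pd.DataFrame.from_records(s.map(json.loads).tolist())

def via_json_normalize(s):
    return pd.json_normalize(s.map(json.loads).tolist())

def via_apply_series(s):
    # Constructs a pandas Series for every row, which adds per-row overhead
    return s.apply(lambda row: pd.Series(json.loads(row)))

for fn in (via_from_records, via_json_normalize, via_apply_series):
    seconds = timeit.timeit(lambda: fn(sample), number=3)
    print(f"{fn.__name__}: {seconds / 3:.4f} s per call")
```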
