1
data_df = pandas.read_csv('details.csv')
data_df = data_df.replace('Null', np.nan)
df = data_df.groupby(['country', 'branch']).count()
df = df.drop('sales', axis=1)  
df = df.reset_index()
print df

I would like to convert result of a Data Frame(df) to user defined json format that i mentioned below. After printing the result(df) i will get the result in the form

country     branch      no_of_employee     total_salary    count_DOB   count_email
  x            a            30                 2500000        20            25
  x            b            20                 350000         15            20
  y            c            30                 4500000        30            30
  z            d            40                 5500000        40            40
  z            e            10                 1000000        10            10
  z            f            15                 1500000        15            15

i would like to convert this to Json.My desired format is

x
   {
      a
        {
              no.of employees:30
              total salary:2500000
              count_email:25
         }
       b
         {
              no.of employees:20
              total salary:350000
              count_email:25

           }
     }

   y
     {

        c
         {
              no.of employees:30
              total salary:4500000
              count_email:30

           }
      }
   z
     {
       d
         {
              no.of employees:40
              total salary:550000
              count_email:40
         }
       e
         {
              no.of employees:10
              total salary:100000
              count_email:15

         }
        f
         {
              no.of employees:15
              total salary:1500000
              count_email:15

         }
    }

Please notice that i don't want all the fields in the data Frame Result in Json(eg:count_DOB)

1 Answer 1

2

You can use groupby with apply to_dict and last to_json:

  country branch  no_of_employee  total_salary  count_DOB  count_email
0       x      a              30       2500000         20           25
1       x      b              20        350000         15           20
2       y      c              30       4500000         30           30
3       z      d              40       5500000         40           40
4       z      e              10       1000000         10           10
5       z      f              15       1500000         15           15

g = df.groupby('country')[["branch", "no_of_employee","total_salary", "count_email"]]
                              .apply(lambda x: x.set_index('branch').to_dict(orient='index'))
print g.to_json()
{
    "x": {
        "a": {
            "total_salary": 2500000,
            "no_of_employee": 30,
            "count_email": 25
        },
        "b": {
            "total_salary": 350000,
            "no_of_employee": 20,
            "count_email": 20
        }
    },
    "y": {
        "c": {
            "total_salary": 4500000,
            "no_of_employee": 30,
            "count_email": 30
        }
    },
    "z": {
        "e": {
            "total_salary": 1000000,
            "no_of_employee": 10,
            "count_email": 10
        },
        "d": {
            "total_salary": 5500000,
            "no_of_employee": 40,
            "count_email": 40
        },
        "f": {
            "total_salary": 1500000,
            "no_of_employee": 15,
            "count_email": 15
        }
    }
}

I try print g.to_dict(), but JSON was invalid (check it here).

Sign up to request clarification or add additional context in comments.

5 Comments

while running this code i got TypeError: to_dict() got an unexpected keyword argument 'orient'
It doesnt work with this sample of with your real data?
Please chcek it with this df - df = pd.DataFrame({'count_email': {0: 25, 1: 20, 2: 30, 3: 40, 4: 10, 5: 15}, 'country': {0: 'x', 1: 'x', 2: 'y', 3: 'z', 4: 'z', 5: 'z'}, 'count_DOB': {0: 20, 1: 15, 2: 30, 3: 40, 4: 10, 5: 15}, 'branch': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f'}, 'total_salary': {0: 2500000, 1: 350000, 2: 4500000, 3: 5500000, 4: 1000000, 5: 1500000}, 'no_of_employee': {0: 30, 1: 20, 2: 30, 3: 40, 4: 10, 5: 15}}) . Still same error? If yes, what version of pandas use - print pd.__version__?
now everything works perfectly . what i have done is un install the pandas and install another version.now ,everything works perfectly
thank you very much. my current pandas version is 0.17.1. I tried with json Lint and the JSON is valid

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.