I want to create sub-categories from the columns of an existing DataFrame. The change is needed at the column level only; the data itself stays the same. The column names fall into three groups: those prefixed with halo_, those prefixed with delta_, and the remaining columns. The DataFrame columns look like this (sample table):
|payer_id|payer_name|halo_payer_name|delta_payer_name|halo_desc|delta_desc|halo_operations|delta_notes|halo_processed_data|delta_processed_data|extra|insurance_company|
The halo group should contain halo_payer_name|halo_desc|halo_operations|halo_processed_data|
The delta group should contain delta_payer_name|delta_desc|delta_notes|delta_processed_data|
The remaining columns should form one group, so that when the DataFrame is converted to JSON it comes out in this layout:

{
    "schema": {
        "fields": [{
                "payer_details": [{
                        "name": "payer_id",
                        "type": "string"
                    },
                    {
                        "name": "payer_name",
                        "type": "string"
                    },
                    {
                        "name": "extra",
                        "type": "string"
                    },
                    {
                        "name": "insurance_company",
                        "type": "string"
                    }
                ]
            },
            {
                "halo": [{
                        "name": "halo_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "halo_desc",
                        "type": "string"
                    },
                    {
                        "name": "halo_operations",
                        "type": "string"
                    },
                    {
                        "name": "halo_processed_data",
                        "type": "string"
                    }
                ]
            }, {
                "delta": [{
                        "name": "delta_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "delta_desc",
                        "type": "string"
                    },
                    {
                        "name": "delta_notes",
                        "type": "string"
                    },
                    {
                        "name": "delta_processed_data",
                        "type": "string"
                    }
                ]
            }
        ],
        "pandas_version": "1.4.0"
    },
    "masterdata": [{
        "payer_details": [{
            "payer_id": "",
            "payer_name": "",
            "extra": "",
            "insurance_company": ""
        }],
        "halo": [{
            "halo_payer_name": "",
            "halo_desc": "",
            "halo_operations": "",
            "halo_processed_data": ""
        }],
        "delta":[{
            "delta_payer_name": "",
            "delta_desc": "",
            "delta_notes": "",
            "delta_processed_data": ""
        }]
    }]
}

I couldn't find a solution for this situation, because it is grouping by column rather than grouping by data.

1 Answer

I came across a post today that helped with my situation: take the data from a DataFrame, loop over it to build one record per row, insert the records into a dict, and then convert the whole thing into a JSON file. The reference that helped me is the linked post. The solution for this question goes like this:

schema = {
    "schema": {
        "fields": [{
                "payer_details": [{
                        "name": "payer_id",
                        "type": "string"
                    },
                    {
                        "name": "payer_name",
                        "type": "string"
                    },
                    {
                        "name": "extra",
                        "type": "string"
                    },
                    {
                        "name": "insurance_company",
                        "type": "string"
                    }
                ]
            },
            {
                "halo": [{
                        "name": "halo_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "halo_desc",
                        "type": "string"
                    },
                    {
                        "name": "halo_operations",
                        "type": "string"
                    },
                    {
                        "name": "halo_processed_data",
                        "type": "string"
                    }
                ]
            }, {
                "delta": [{
                        "name": "delta_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "delta_desc",
                        "type": "string"
                    },
                    {
                        "name": "delta_notes",
                        "type": "string"
                    },
                    {
                        "name": "delta_processed_data",
                        "type": "string"
                    }
                ]
            }
        ],
        "pandas_version": "1.4.0"
    },
    "masterdata": []
}


I wrote out the schema above by hand in the layout I wanted, leaving masterdata empty so it can be filled in afterwards.
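If the column list is long, the schema could also be generated from the column names instead of being typed out. The following is only a sketch under some assumptions: the helper names `group_columns` and `build_schema` are made up for illustration, the group names and the `"string"` type are taken from the layout in the question, and grouping is done on the `halo_` / `delta_` prefixes.

```python
# Column names copied from the sample table in the question.
columns = [
    "payer_id", "payer_name", "halo_payer_name", "delta_payer_name",
    "halo_desc", "delta_desc", "halo_operations", "delta_notes",
    "halo_processed_data", "delta_processed_data", "extra", "insurance_company",
]


def group_columns(columns, prefixes=("halo", "delta"), default="payer_details"):
    """Map each group name to its columns, keeping the original column order.

    Hypothetical helper: a column goes to the first prefix it starts with,
    and to the default group if it matches none.
    """
    groups = {default: []}
    groups.update({prefix: [] for prefix in prefixes})
    for col in columns:
        for prefix in prefixes:
            if col.startswith(prefix + "_"):
                groups[prefix].append(col)
                break
        else:  # no prefix matched
            groups[default].append(col)
    return groups


def build_schema(groups, pandas_version="1.4.0"):
    """Build the schema dict; every field type is assumed to be "string"."""
    fields = [
        {group: [{"name": col, "type": "string"} for col in cols]}
        for group, cols in groups.items()
    ]
    return {
        "schema": {"fields": fields, "pandas_version": pandas_version},
        "masterdata": [],
    }


groups = group_columns(columns)
schema = build_schema(groups)
```

With a real DataFrame, `list(df.columns)` could be passed in place of the hard-coded `columns` list.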

# Build one nested record per DataFrame row; every value is stored as a string.
# Assumes df is the existing DataFrame and schema is the dict defined above.
payer_list = []
for i in df.index:
    case = {
        "payer_details": [{
            "payer_id": str(df.at[i, "payer_id"]),
            "payer_name": str(df.at[i, "payer_name"]),
            "extra": str(df.at[i, "extra"]),
            "insurance_company": str(df.at[i, "insurance_company"]),
        }],
        "halo": [{
            "halo_payer_name": str(df.at[i, "halo_payer_name"]),
            "halo_desc": str(df.at[i, "halo_desc"]),
            "halo_operations": str(df.at[i, "halo_operations"]),
            "halo_processed_data": str(df.at[i, "halo_processed_data"]),
        }],
        "delta": [{
            "delta_payer_name": str(df.at[i, "delta_payer_name"]),
            "delta_desc": str(df.at[i, "delta_desc"]),
            "delta_notes": str(df.at[i, "delta_notes"]),
            "delta_processed_data": str(df.at[i, "delta_processed_data"]),
        }],
    }
    payer_list.append(case)

# Attach the per-row records to the schema.
schema["masterdata"] = payer_list

I created an empty list, ran the loop to append one nested record per row to it, and then attached the list to the schema under masterdata.
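The same loop can be written without repeating every column name, by driving it from a mapping of group name to columns. This is a rough sketch, not the only way to do it: `GROUPS`, `build_masterdata` and the sample row are made up for illustration, and the function takes plain row dicts so it runs without pandas (with a DataFrame, `df.to_dict(orient="records")` would produce that list).

```python
import json

# Group name -> columns, as laid out in the question.
GROUPS = {
    "payer_details": ["payer_id", "payer_name", "extra", "insurance_company"],
    "halo": ["halo_payer_name", "halo_desc", "halo_operations", "halo_processed_data"],
    "delta": ["delta_payer_name", "delta_desc", "delta_notes", "delta_processed_data"],
}


def build_masterdata(records, groups):
    """Turn flat row dicts into nested records, one single-item list per group.

    Hypothetical helper; values are converted with str() to match the
    all-string layout in the question.
    """
    return [
        {
            group: [{col: str(row[col]) for col in cols}]
            for group, cols in groups.items()
        }
        for row in records
    ]


# Made-up sample row, only to show the shape of the output.
all_columns = [col for cols in GROUPS.values() for col in cols]
sample_records = [{col: f"{col}_value" for col in all_columns}]

masterdata = build_masterdata(sample_records, GROUPS)
print(json.dumps({"masterdata": masterdata}, indent=4))
```

The result can be assigned to `schema["masterdata"]` and written out with `json.dump` in the same way as the hand-written loop.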
