I have a DataFrame looking like this:
| | EMPLOYEE_ID | NAME | MANAGER_EMPLOYEE_ID |
|---|---|---|---|
| 0 | ID_1 | ABC1 | NaN |
| 1 | ID_2 | ABC2 | ID_1 |
| 2 | ID_3 | ABC3 | ID_1 |
| 3 | ID_4 | ABC4 | ID_3 |
| 4 | ID_5 | ABC5 | ID_2 |
The goal is to produce a tree-shaped JSON like this:
{
"employeeID":" ",
"name":" ",
"reportees":[
{
"employeeID":xx1,
"name":yy1,
"reportees":[
{
"employeeID":xx11,
"name":yy22
}
]
}
]
}
My first approach was to create a MultiIndex, something like this:
multi_df2 = df.set_index(["MANAGER_EMPLOYEE_ID", "EMPLOYEE_ID", "NAME"]).sort_index()
But I was unable to convert the resulting DataFrame into JSON, so I then used a groupby with a lambda:
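df4 = (df.groupby(["EMPLOYEE_ID"])[['MANAGER_EMPLOYEE_ID', 'NAME']]  # a list selects several columns
.apply(lambda x: x.to_dict('records'))  # 'records' spelled out; the 'r' shorthand is no longer accepted
.to_json()
)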
to generate the following JSON:
{
"employeeID":" ", # MANAGER
{
"employeeID":xx1,
"name":yy1
},
{
"employeeID":xx2,
"name":yy2
},
{
"employeeID":xx3,
"name":yy3
},
"employeeID":" ", # MANAGER2
{
"employeeID":xx12,
"name":yy12
},
{
"employeeID":xx13,
"name":yy13
},
{
"employeeID":xx14,
"name":yy14
}
}
I am not sure how to add further levels to the hierarchy from here. Any help is appreciated.
This solution is quite similar, but the selected answer says that this is not possible with pandas because "the data structure you're going after is recursive, not tabular," which I am unable to understand.
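To make the "recursive, not tabular" remark concrete, here is a minimal sketch of one way the nesting could be built outside pandas. It uses the sample data from the first table and plain Python dicts. The helper name `build_tree` is hypothetical. The sketch assumes every manager ID also appears in `EMPLOYEE_ID` and that a missing manager marks a root. Unlike the target JSON, leaf nodes here keep an empty `reportees` list.

```python
import json
import pandas as pd

# Sample data mirroring the first table in the question.
df = pd.DataFrame({
    "EMPLOYEE_ID": ["ID_1", "ID_2", "ID_3", "ID_4", "ID_5"],
    "NAME": ["ABC1", "ABC2", "ABC3", "ABC4", "ABC5"],
    "MANAGER_EMPLOYEE_ID": [None, "ID_1", "ID_1", "ID_3", "ID_2"],
})

def build_tree(frame):
    """Hypothetical helper: return the root employees as nested dicts."""
    # First pass: one node per employee, keyed by employee ID.
    nodes = {
        row.EMPLOYEE_ID: {
            "employeeID": row.EMPLOYEE_ID,
            "name": row.NAME,
            "reportees": [],
        }
        for row in frame.itertuples(index=False)
    }
    # Second pass: attach each node to its manager's "reportees" list.
    # Assumption: every non-null manager ID exists as an EMPLOYEE_ID.
    roots = []
    for row in frame.itertuples(index=False):
        node = nodes[row.EMPLOYEE_ID]
        if pd.isna(row.MANAGER_EMPLOYEE_ID):
            roots.append(node)  # no manager, so this is a top-level node
        else:
            nodes[row.MANAGER_EMPLOYEE_ID]["reportees"].append(node)
    return roots

tree = build_tree(df)
tree_json = json.dumps(tree[0], indent=2)
print(tree_json)
```

Because each node holds references to its reportees, the depth of the result follows the data and is not fixed by a number of columns or index levels, which may be what the quoted answer means.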
EDIT 1:
I performed a merge operation in order to obtain the Manager Name:
manager_df = merged_df3.merge(merged_df3[["EMPLOYEE_ID", "NAME"]], left_on="MANAGER_EMPLOYEE_ID", right_on="EMPLOYEE_ID", how='left').drop("EMPLOYEE_ID_y", axis=1)
The DataFrame now looks like this:
| EMPLOYEE_ID | NAME | MANAGER_EMPLOYEE_ID | MANAGER_NAME |
|---|---|---|---|
| 42 | S | 40 | G |
| 40 | G | nan | nan |
| T | M | 40 | G |
| 0c | H | 42 | S |
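For reference, a minimal sketch of a self-merge that yields these four columns directly. It is a variant of the merge above, not the exact call I ran: the lookup columns are renamed before merging, so no `_x`/`_y` suffixes appear. The small `df` below is rebuilt from the table and is an assumption about the dtypes (all IDs as strings).

```python
import pandas as pd

# Data rebuilt from the table above (IDs assumed to be strings).
df = pd.DataFrame({
    "EMPLOYEE_ID": ["42", "40", "T", "0c"],
    "NAME": ["S", "G", "M", "H"],
    "MANAGER_EMPLOYEE_ID": ["40", None, "40", "42"],
})

# Lookup table: every employee's ID and name, renamed to the manager role.
managers = df[["EMPLOYEE_ID", "NAME"]].rename(
    columns={"EMPLOYEE_ID": "MANAGER_EMPLOYEE_ID", "NAME": "MANAGER_NAME"}
)

# A left join keeps employees without a manager; their MANAGER_NAME is NaN.
manager_df = df.merge(managers, on="MANAGER_EMPLOYEE_ID", how="left")
print(manager_df)
```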