I have a DataFrame in the format below:

id, ref
101, [{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]
102, [{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}, 
      {'id': '750001', 'type': {'id': '342', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-X'}}]
103, [{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]
104, [{'id': '67543', 'type': {'id': '106', 'name': 'Other', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-BA'}}]

I am trying to extract the rows that contain an entry with name = Sales and return the output below:

101, Prod-A
102, Prod-X
103, Prod-B

I can extract the required data when the matching key-value pair is in the first element of the list, but not when it appears in a later element, as in the case of id = 102.

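# name and key of the first element of each list in 'ref'
df['names'] = df['ref'].str[0].str.get('type').str.get('name')
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')
# keep the key where the name is 'Sales', otherwise 0
df['output'] = np.where(df['names'] == 'Sales', df['value'], 0)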

Currently I am only able to get values for id = 101 and 103.
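
The reason only the first instance is found is that .str[0] always selects the first element of each list, so later elements are never inspected. A minimal sketch of scanning every element instead; the helper name first_sales_key and the sample rows (shortened from the data above) are illustrative assumptions, and it returns only the first match per row:

```python
def first_sales_key(refs, wanted="Sales"):
    """Return inwardIssue['key'] of the first entry whose type name matches, else None."""
    for entry in refs:
        if entry.get("type", {}).get("name") == wanted:
            return entry.get("inwardIssue", {}).get("key")
    return None


# Shortened sample rows keyed by the outer id (hypothetical, for illustration only).
rows = {
    101: [{"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-A"}}],
    102: [{"type": {"name": "Return"}, "inwardIssue": {"key": "Prod-C"}},
          {"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-X"}}],
    104: [{"type": {"name": "Other"}, "inwardIssue": {"key": "Prod-BA"}}],
}

result = {row_id: first_sales_key(refs) for row_id, refs in rows.items()}
print(result)  # {101: 'Prod-A', 102: 'Prod-X', 104: None}
```

On the DataFrame itself, the same helper could be applied with df['ref'].apply(first_sales_key), followed by dropping the rows where the result is missing.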

2 Answers

Use explode, but keep the index it produces so the extracted keys line up with the original rows:

e=df.ref.explode()
s=pd.DataFrame(e.tolist(),index=e.index)  # keep the original row labels
s=s.loc[s['type'].str.get('name').eq('Sales'),'inwardIssue'].str.get('key')
dfs=df.join(s,how='right')
    id                                                ref inwardIssue
0  101  [{'id': '74947', 'type': {'id': '104', 'name':...      Prod-A
1  102  [{'id': '74948', 'type': {'id': '105', 'name':...      Prod-X
2  103  [{'id': '74949', 'type': {'id': '106', 'name':...      Prod-B
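
explode repeats the original row label, and the other columns, for every list element it produces, so values derived from the exploded rows stay tied to the right id. A minimal sketch of that idea, with sample data shortened from the question and the column names name and key chosen here for illustration:

```python
import pandas as pd

# Shortened version of the question's data (only the fields used below).
df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "ref": [
        [{"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-A"}}],
        [{"type": {"name": "Return"}, "inwardIssue": {"key": "Prod-C"}},
         {"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-X"}}],
        [{"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-B"}}],
        [{"type": {"name": "Other"}, "inwardIssue": {"key": "Prod-BA"}}],
    ],
})

# One row per list element; 'id' is repeated next to each element.
exploded = df.explode("ref")

# Pull the nested fields out of each dict.
exploded["name"] = exploded["ref"].str.get("type").str.get("name")
exploded["key"] = exploded["ref"].str.get("inwardIssue").str.get("key")

# Keep only the Sales entries, with the outer id beside the key.
out = exploded.loc[exploded["name"].eq("Sales"), ["id", "key"]].reset_index(drop=True)
print(out)
```

If a row held several Sales entries, this would return one output row per entry rather than one per id.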

If you already have a dataframe in that format, you can convert it to a list of records with to_dict and use pd.json_normalize to flatten it, then filter the resulting flat dataframe.

df1 = pd.json_normalize(df.to_dict(orient='records'), 'ref')

The resulting flat dataframe df1:

Out[83]:
       id type.id type.name   type.inward type.outward inwardIssue.id  \
0   74947     104     Sales         Sales           PO          76560
1   74948     105    Return  Return Order           PO          76560
2  750001     342     Sales         Sales           PO          76560
3   74949     106     Sales  Return Order           PO          76560
4   67543     106     Other  Return Order           PO          76560

  inwardIssue.key
0          Prod-A
1          Prod-C
2          Prod-X
3          Prod-B
4         Prod-BA

Finally, filter df1 for the Sales rows:

df_final = df1.loc[df1['type.name'].eq('Sales'), ['type.id', 'inwardIssue.key']]

Out[88]:
  type.id inwardIssue.key
0     104          Prod-A
2     342          Prod-X
3     106          Prod-B
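
The flat frame above keeps only the inner id of each list element, so the outer row id (101, 102, ...) that the question asks for is not part of df_final. A minimal sketch of carrying it along with the meta argument of pd.json_normalize; meta_prefix is used because the nested records have their own id field, and the prefix row. and the shortened sample records are assumptions for illustration:

```python
import pandas as pd

# Shortened records in the shape produced by df.to_dict(orient='records').
records = [
    {"id": 101, "ref": [
        {"id": "74947", "type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-A"}},
    ]},
    {"id": 102, "ref": [
        {"id": "74948", "type": {"name": "Return"}, "inwardIssue": {"key": "Prod-C"}},
        {"id": "750001", "type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-X"}},
    ]},
]

# meta=['id'] copies the outer id onto every flattened row;
# meta_prefix avoids a name clash with the inner 'id' column.
flat = pd.json_normalize(records, record_path="ref", meta=["id"], meta_prefix="row.")

result = flat.loc[flat["type.name"].eq("Sales"), ["row.id", "inwardIssue.key"]]
print(result)
```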
