1

I have a dataframe in Python called "Item_Table" and it looks like the below:

--------------------------
item_id | item_description
--------------------------
N4      | Steel
M3      | Oil
B1      | Water
X9      | Coal
Z5      | Plastic
--------------------------

I also have a nested orders list which looks like below. Each list inside the nested list represents one order. One order can have multiple items

orders = [[Z5], [X9, Z5], [B1, Z5, N4], [B1, X9]]

Ideally, I would like to return a nested list with the respective item_descriptions from the "Item_Table" which looks like the below nested list:

orders_descriptions = [[Plastic], [Coal, Plastic], [Water, Plastic, Steel], [Water, Coal]]

I tried a solution where I converted orders list to a dataframe and tried to merge that with item_table to get item_descriptions but because one order has multiple items, I am unable to do the join.

1
  • try to solve for a non nested list: [X9 B1 Z5] ... after that it will be easier. Commented May 11, 2022 at 17:05

3 Answers 3

1

You can try export the item_id and item_description to dictionary then loop the orders

d = df.set_index('item_id')['item_description'].to_dict()

orders_descriptions = [[d[o] for o in os] for os in orders]
print(orders_descriptions)

[['Plastic'], ['Coal', 'Plastic'], ['Water', 'Plastic', 'Steel'], ['Water', 'Coal']]
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you so much!! I have the orders_descriptions nested list now. If I want to convert this nested list to one column in a dataframe, how would I do it? I am currently using df_new = pd.DataFrame(orders_descriptions) but this is giving me each item in a different column. For example Coal and Plastic are coming in two columns. I want both items in one column
@SriyaTR You can try pd.Series(orders_descriptions)
1

Here is a slightly different answer (it might be faster for large dataframes, but still to be checked) that still makes use of pandas indexes, but it does not loop through a dictionary.

item_frame.set_index("item_id", inplace=True)
order_descriptions = [item_frame.loc[o]["item_description"].tolist() for o in orders]

And if you want to create a DataFrame with the orders in a single column:

pd.DataFrame({"order_descriptions": order_descriptions})

Comments

0

Here's the brute force lookup method, but I think the "export to a dictionary" idea is a much better concept.

import pandas as pd
data = [
    ['N4','Steel'],
    ['M3','Oil'],
    ['B1','Water'],
    ['X9','Coal'],
    ['Z5','Plastic']
]
df = pd.DataFrame(data, columns=('item_id','item_description'))
print(df)

orders = [['Z5'], ['X9','Z5'], ['B1', 'Z5', 'N4'], ['B1', 'X9']]

orders_descriptions = []
for order in orders:
    orders_descriptions.append( [df[df['item_id'].eq(o)]['item_description'].iloc[0] for o in order] )

print(orders_descriptions)

2 Comments

Thank you so much!! I have the orders_descriptions nested list now. If I want to convert this nested list to one column in a dataframe, how would I do it? I am currently using df_new = pd.DataFrame(orders_descriptions) but this is giving me each item in a different column. For example Coal and Plastic are coming in two columns. I want both items in one column
As above, use pd.Series(orders_descriptions). However, working with nested lists in a column is not particularly convenient. You should figure out why you are doing that.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.