I have an optimization problem that can perhaps be solved with linear programming (with PuLP?). My experience in this area is limited, so another approach might be better.
The problem is as follows:
There are 37 items that need to be bought. Each item must be bought in a specific quantity and a specific color. Each item is sold by a variable number of stores; in total, about 8000 stores sell one or more of the 37 items, but no single store sells all 37. For each item it sells, a store has a variable quantity available and a variable price. Also, each store has a minimum-buy amount.
In Python I have two dataframes that should contain all the information I need (store names are 'blurred'):
    wanted.head()
       item_color_id  item_id  item_qty
    0             86    21837         1
    1              5     2431         2
    2             11     2444         6
    3             11     2476         4
    4              3     2654         2
    stores.head()
       item_color_id  item_id  store_min_buy store_name  store_price  store_qty
    0             86    21837          20.00        fda         0.18        100
    1             86    21837          10.00      asdfa         0.52         89
    2             86    21837          10.00      ghsde         0.55         64
    3             86    21837           9.14       j5rs         0.41         31
    4             86    21837          10.00      pjvds         0.44         26
The stores dataframe is already preprocessed, so it does not contain any NaN values. Note that store_min_buy is the minimum amount of money that needs to be spent in that store.
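To make the constraints concrete, here is a small sketch of a checker for a candidate purchase plan. It uses only the standard library and plain dicts instead of the dataframes; the function name `check_plan`, the dict layouts, and the assumption that the wanted quantities must be met exactly (no over-buying to reach a minimum) are illustrative choices, not something the data dictates. The example rows are the `fda` and `j5rs` offers shown above.

```python
from collections import defaultdict

def check_plan(wanted, offers, plan, tol=1e-9):
    """Return (total_cost, problems) for a candidate purchase plan.

    wanted: {(item_id, item_color_id): item_qty}
    offers: {(store_name, item_id, item_color_id):
             (store_price, store_qty, store_min_buy)}
    plan:   {(store_name, item_id, item_color_id): quantity bought}
    """
    problems = []
    bought = defaultdict(int)    # units bought per (item, color)
    spend = defaultdict(float)   # money spent per store
    min_buy = {}                 # minimum spend of every store that is used

    for key, qty in plan.items():
        if qty == 0:
            continue
        if key not in offers:
            problems.append(f"{key}: no such offer")
            continue
        store, item, color = key
        price, stock, store_min = offers[key]
        if qty > stock:
            problems.append(f"{key}: wants {qty}, only {stock} in stock")
        bought[(item, color)] += qty
        spend[store] += price * qty
        min_buy[store] = store_min

    # Assumption: every wanted quantity must be met exactly.
    for item_key in set(wanted) | set(bought):
        need, got = wanted.get(item_key, 0), bought.get(item_key, 0)
        if got != need:
            problems.append(f"item {item_key}: bought {got}, need {need}")

    # A store that is used at all must receive at least its minimum spend.
    for store, total in spend.items():
        if total + tol < min_buy[store]:
            problems.append(
                f"store {store}: spend {total:.2f} is below "
                f"the minimum buy of {min_buy[store]:.2f}"
            )
    return sum(spend.values()), problems

# Example with two rows from stores.head(): the cheapest offer is not usable
# on its own, because 0.18 is far below that store's minimum of 20.00.
wanted = {(21837, 86): 1}
offers = {
    ("fda", 21837, 86): (0.18, 100, 20.00),
    ("j5rs", 21837, 86): (0.41, 31, 9.14),
}
cost, problems = check_plan(wanted, offers, {("fda", 21837, 86): 1})
print(cost, problems)
```

The point of the example is that the cheapest price per item is not enough: a store only becomes usable once the combined spend on all items bought there reaches its minimum.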
The challenge is to minimize the total cost of buying the 37 items. In addition, I need the actual solution: which items need to be bought from which stores.
store_min_buy is the minimum amount of money that needs to be spent in that store. So a min_buy of 9.14 means that at least 9.14 euro/dollar needs to be spent at that store.
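One possible way to model this in PuLP is sketched below. Because a store's minimum only applies if the store is used at all, the model needs a binary "store is used" variable per store, which makes it a mixed-integer program rather than a pure linear program. The toy data (`wanted`, `min_buy`, `offers`, stores `s1` to `s3`, items `A` and `B`) is made up for illustration, and the sketch assumes that quantities must be met exactly, so a store whose minimum cannot be reached is simply not used.

```python
import pulp

# Hypothetical toy data, not taken from the real dataframes.
wanted = {"A": 2, "B": 1}                     # item -> required quantity
min_buy = {"s1": 5.0, "s2": 4.0, "s3": 0.5}   # store -> minimum spend
offers = {                                    # (store, item) -> (price, stock)
    ("s1", "A"): (1.0, 5),
    ("s2", "A"): (2.0, 5),
    ("s2", "B"): (3.0, 5),
    ("s3", "B"): (1.0, 5),
}

prob = pulp.LpProblem("cheapest_basket", pulp.LpMinimize)

# x[store, item]: integer number of units bought from that offer,
# bounded above by the store's stock.
x = {
    k: pulp.LpVariable(f"x_{k[0]}_{k[1]}", lowBound=0, upBound=stock, cat="Integer")
    for k, (price, stock) in offers.items()
}
# y[store]: 1 if anything is bought from the store, 0 otherwise.
y = {s: pulp.LpVariable(f"y_{s}", cat="Binary") for s in min_buy}

# Objective: total money spent.
prob += pulp.lpSum(offers[k][0] * x[k] for k in offers)

# Each item must be bought in exactly the wanted quantity.
for item, qty in wanted.items():
    prob += pulp.lpSum(x[k] for k in offers if k[1] == item) == qty

for s in min_buy:
    store_spend = pulp.lpSum(offers[k][0] * x[k] for k in offers if k[0] == s)
    # If the store is used (y = 1), the spend must reach its minimum.
    prob += store_spend >= min_buy[s] * y[s]
    # Nothing can be bought from a store that is not used (y = 0).
    for k in offers:
        if k[0] == s:
            prob += x[k] <= offers[k][1] * y[s]

prob.solve(pulp.PULP_CBC_CMD(msg=False))

status = pulp.LpStatus[prob.status]
total = pulp.value(prob.objective)
# The actual solution: which quantity of which item comes from which store.
plan = {k: int(round(v.varValue)) for k, v in x.items() if v.varValue and v.varValue > 0.5}
print(status, total, plan)
```

In this toy instance the cheapest offer for `A` (store `s1`) cannot be used, because buying both units there only comes to 2.0 against a minimum of 5.0. With the real data, the same structure would be built from the rows of `stores` and `wanted`, keyed by `(item_id, item_color_id)`; with about 8000 stores it would probably help to drop offers from stores that can never reach their minimum before building the model.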