I have a DataFrame imported from Excel:

>>> df

    Name Emp ID  Total Salary     A      B     C     D      E
0   Mike   A001         25000  5000  15000  3000     0   2000
1   John   A002         23000  5000  10000  3000  3000   2000
2    Bob   A003         21000  5000  15000     0  1000      0
3   Rose   A004         20000  5000  10000  2000  1000  20000
4  James   A005         10000  5000      0  3000     0   2000

For a given row, I find a subset of the component values that sums to Total Salary using the following code:

Code:

import pandas as pd
import numpy as np

df = pd.read_excel('tmp/test.xlsx')
val = df.drop(columns=['Name','Emp ID','Total Salary'])
test = np.array(val)

num = df['Total Salary'][0]
array = test[0]

def subsetsum(array,num):
    # No subset can reach a non-positive target or come from an empty array
    if num < 1:
        return None
    elif len(array) == 0:
        return None
    else:
        if np.isclose(array[0],num):
            return [array[0]]
        else:
            # Try including array[0] first; if that fails, skip it
            with_v = subsetsum(array[1:],(num - array[0]))
            if with_v:
                return [array[0]] + with_v
            else:
                return subsetsum(array[1:],num)

print('\nValues : ',array)
print('\nTotal Salary : ',num)
print('\nValues of Salary : ',subsetsum(array,num))

Output:

Values :  [ 5000 15000  3000     0  2000]

Total Salary :  25000

Values of Salary :  [5000, 15000, 3000, 0, 2000]

Now I need a way to link the salary values in the array to the corresponding column names in the DataFrame.

The output I would like is:

Output Required:

Values :  [ 5000 15000  3000     0  2000]

Total Salary :  25000

Values of Salary :  A - 5000 B - 15000 C - 3000 E - 2000
2 Answers

I would suggest rewriting your subsetsum function to return the indices of the chosen elements, rather than the elements themselves (or perhaps it could return both, if that works out to be better for you). For example,

subsetsum([5000, 15000, 3000, 0, 2000], 25000)

would return [0, 1, 2, 3, 4], or possibly [0, 1, 2, 4]. Then you can use these indices to access the corresponding column labels as well as the elements.
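
A minimal sketch of that suggestion, assuming the same recursive search as in the question. The name `subsetsum_idx` and its `start` parameter are hypothetical, not from the original post. Instead of slicing the array it advances a position, so the returned indices refer to the original row and can be used to look up column labels.

```python
import numpy as np

def subsetsum_idx(values, target, start=0):
    """Return positions in `values` (from `start` on) whose entries sum to `target`, or None."""
    if target < 1 or start >= len(values):
        return None
    if np.isclose(values[start], target):
        return [start]
    # Try including values[start]; if that fails, skip it.
    with_v = subsetsum_idx(values, target - values[start], start + 1)
    if with_v:
        return [start] + with_v
    return subsetsum_idx(values, target, start + 1)

# Hard-coded stand-ins for val.columns and the first row of the question's table.
columns = ['A', 'B', 'C', 'D', 'E']
row = np.array([5000, 15000, 3000, 0, 2000])

idx = subsetsum_idx(row, 25000)

# Assumption: zero-valued entries should be left out of the report.
labelled = ' '.join(f'{columns[i]} - {int(row[i])}' for i in idx if row[i] != 0)
print(idx)
print('Values of Salary : ', labelled)
```

For this row the search selects every position, including the zero in column D. The `if row[i] != 0` filter is what drops it from the printed line, and it is only a guess at how zero entries should be handled.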

4 Comments

I have tried a few things but failed. Could you please guide me on how to do it? It would be a great help!
@cgmaster What have you tried, and why did it fail?
I am unable to extract index values from the function. When I try to extract the values individually so that I can get the index, it gives me None [2000] [3000, 2000] [15000, 3000, 2000].
@cgmaster Honestly, that doesn't help me help you at all. I'm not sure what you mean by "extract index values from the function".
Using the information you provided, I checked this on my own machine. The easiest way to convert a DataFrame to a NumPy array is:

test = val.values
array = test[0]

You can always access the column names:

col = val.columns.values

Finally, match the names with the values:

link = list(zip(col, subsetsum(array,num)))
print(link)

# Output
[('A', 5000), ('B', 15000), ('C', 3000), ('D', 0), ('E', 2000)]

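zip() pairs the items of the two sequences by position and returns a zip object, stopping at the end of the shorter one. To print or iterate over the result more than once, convert it to a list first. Because the pairing is positional, the labels are only correct when subsetsum returns a value for every column, as it does for this row. I hope this helps!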
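
To illustrate how zip() behaves when the subset is shorter than the row, here is a small sketch using row 3 (Rose) of the question's table, where Total Salary is 20000. The function restates the question's logic in condensed form, and the lookup-based pairing at the end is only one possible workaround that assumes the values within a row are distinct.

```python
import numpy as np

def subsetsum(array, num):
    # Condensed restatement of the question's function; returns the chosen values.
    if num < 1 or len(array) == 0:
        return None
    if np.isclose(array[0], num):
        return [array[0]]
    with_v = subsetsum(array[1:], num - array[0])
    if with_v:
        return [array[0]] + with_v
    return subsetsum(array[1:], num)

col = ['A', 'B', 'C', 'D', 'E']
rose = np.array([5000, 10000, 2000, 1000, 20000])  # row 3 of the question's table

subset = subsetsum(rose, 20000)

# Positional pairing: the first label is attached to the first chosen value,
# regardless of which column that value came from.
by_position = [(c, int(v)) for c, v in zip(col, subset)]

# Workaround (assumes distinct values in the row): find each value's position first.
by_lookup = [(col[list(rose).index(v)], int(v)) for v in subset]

print(by_position)
print(by_lookup)
```

Here the only chosen value comes from column E, yet positional pairing labels it A. Returning indices from the search avoids the distinct-values assumption entirely.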
