
I am trying to generate a list from a pandas DataFrame based on conditions on its column values. My df looks something like this:

        df =      
                       48 150  39   0
        0    BE0974302342   0   0  21
        1    BE0974302342   3   3  19
        2    BE0974302342   F   2   2
        3    FR0000073843   0   0  22
        4    FR0000073843   3   3  20
        5    FR0000073843   F   2   2
        6    FR0000076861   0   0  21
        7    FR0000076861   3   3  18
        8    FR0000076861   F   1   3
        9    FR0000076861   F   2   3
        10   FR0000076887   0   0  13
        11   FR0000076887   3   3  11
        12   FR0000076887   8   8  19
        13   FR0000076887   F   2   2
        14   FR0000077562   0   0  22
        15   FR0000077562   3   3  19
        16   FR0000077562   F   2   3
        17   FR0000079147   0   0  20
        18   FR0000079147   3   3  16
        19   FR0000079147   F   1   1
        20   FR0000079147   F   2   4
        21   FR0004034072   0   0  14
        22   FR0004034072   3   3  12
        23   FR0004034072   8   8  21
        24   FR0004034072   F   2   2
        25   FR0004152874   0   0  22
        26   FR0004152874   3   3  20
        27   FR0004152874   F   1   1
        28   FR0004152874   F   2   2
        29   FR0004178572   0   0  21
        ...

Here the combination of columns 150 and 39 has a meaning, so I want to extract a count for each combination. There are 6 possible combinations:

    150 39
    0   0
    3   3
    4   4
    8   8
    F   1
    F   2

I want to build a final_list that holds the count of each of these combinations for every value in column 48.

For example, for 'BE0974302342': (150=0, 39=0) record count is 21, (150=3, 39=3) is 19, (150=4, 39=4) is 0, (150=8, 39=8) is 0, (150=F, 39=1) is 0, (150=F, 39=2) is 2.

So the final list would look something like:

[[BE0974302342,21,19,0,0,0,2], 
[FR0000073843,22,20,0,0,0,2],
[FR0000076861,21,18,0,0,1,3]...]

What I tried: I converted the df into a list of lists, then traversed each sublist and checked the combination of the 150 and 39 values. That only partially worked, and I would like a cleaner solution that works reliably. I would appreciate any help or a suggestion for the approach I should follow. Thanks in advance.

  • You can create a new column holding a tuple of the two desired columns and then simply compute a histogram, i.e. count the unique elements. Commented Apr 24, 2019 at 6:34
  • That seems like a nice approach, let me try it out. Thanks a lot :) Commented Apr 24, 2019 at 6:38
  • Alternatively, you can look at stackoverflow.com/questions/35584085/…, where the following solution is suggested: df.groupby(df.columns.tolist(),as_index=False).size() Commented Apr 24, 2019 at 6:39
  • Thank you, Peter :) Commented Apr 24, 2019 at 7:17
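
The idea from the comments (merge the two code columns into one key, then count occurrences) could be sketched roughly as follows. This is only an illustration: the sample frame is made up to resemble the question's data, the `combo` column name is hypothetical, and a string key is used in place of a tuple purely to keep the lookups simple.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's frame.
df = pd.DataFrame({
    48: ["BE0974302342"] * 3 + ["FR0000079147"] * 4,
    150: ["0", "3", "F", "0", "3", "F", "F"],
    39: [0, 3, 2, 0, 3, 1, 2],
    0: [21, 19, 2, 20, 16, 1, 4],
})

# Merge the two code columns into a single key such as "F-2".
df["combo"] = df[150].astype(str) + "-" + df[39].astype(str)

# Count the rows for each (column 48, combo) pair, then spread the
# combos into columns; pairs that never occur are filled with 0.
table = df.groupby([48, "combo"]).size().unstack(fill_value=0)
print(table)
```

Note that this counts rows per combination; it does not use the values stored in column 0.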

1 Answer


Use crosstab, then convert the resulting DataFrame to a list of lists:

import pandas as pd

# count the rows per value of column 48 for each (150, 39) combination
df1 = pd.crosstab(df[48], [df[150], df[39]])
# alternative solutions
# df1 = df.groupby([48, 150, 39]).size().unstack(level=[1, 2], fill_value=0)
# df1 = df.pivot_table(index=48, columns=[150, 39], aggfunc='size', fill_value=0)
print(df1)
150           0  3  8  F   
39            0  3  8  1  2
48                         
BE0974302342  1  1  0  0  1
FR0000073843  1  1  0  0  1
FR0000076861  1  1  0  1  1
FR0000076887  1  1  1  0  1
FR0000077562  1  1  0  0  1
FR0000079147  1  1  0  1  1
FR0004034072  1  1  1  0  1
FR0004152874  1  1  0  1  1
FR0004178572  1  0  0  0  0

L = df1.reset_index().values.tolist()
print (L)

[['BE0974302342', 1, 1, 0, 0, 1], 
 ['FR0000073843', 1, 1, 0, 0, 1], 
 ['FR0000076861', 1, 1, 0, 1, 1], 
 ['FR0000076887', 1, 1, 1, 0, 1], 
 ['FR0000077562', 1, 1, 0, 0, 1], 
 ['FR0000079147', 1, 1, 0, 1, 1], 
 ['FR0004034072', 1, 1, 1, 0, 1], 
 ['FR0004152874', 1, 1, 0, 1, 1], 
 ['FR0004178572', 1, 0, 0, 0, 0]]

If you also need the combinations, convert the MultiIndex in the columns to a list of tuples:

print (df1.columns.tolist())
[('0', 0), ('3', 3), ('8', 8), ('F', 1), ('F', 2)]