2

I have a pandas dataframe.

import pandas as pd
data = pd.DataFrame({
'a': [0,1,0,0,1,1,0,1], 
'b': [0,0,1,0,1,0,1,1], 
'c': [0,0,0,1,0,1,1,1],
'rate': [0,0.1,0.11,0.12,0.24,0.27,0.3,0.4]})

a,b,c are my channels and I am adding another column that shows the sum of row wise totals of these channels by writing:

data['total'] = data.a + data.b + data.c

data

    a   b   c   rate    total
1   1   0   0   0.10    1
2   0   1   0   0.11    1 
3   0   0   1   0.12    1
4   1   1   0   0.24    2
5   1   0   1   0.27    2
6   0   1   1   0.30    2
7   1   1   1   0.40    3

I would like to work on data where total = 1 and total = 2

reduced = data[(data.a == 1) & (data.total == 2)]
print(reduced)

     a  b  c  rate  total
  4  1  1  0  0.24      2
  5  1  0  1  0.27      2

I would like to add columns to this reduced dataframe that looks like this:

      a  b  c  rate  total  prob_a  prob_b  prob_c
   4  1  1  0  0.24      2     0.1    0.11    0
   5  1  0  1  0.27      2     0.1     0      0.12

In the first row of the reduced data-frame, prob_c is 0 since C is absent ( ABC => 110). In the second row of the reduced data-frame, prob_b is 0 since B is absent (ABC => 101)

where,

# Channel a alone occurs (ABC => 100)
prob_a = data['rate'][(data.a == 1) & (data.total == 1)]

# Channel b alone occurs (ABC => 010)
prob_b = data['rate'][(data.b == 1) & (data.total == 1)]

# Channel c alone occurs (ABC => 001)
prob_c = data['rate'][(data.c == 1) & (data.total == 1)]

I tried this:

reduced['prob_a'] = data['rate'][(data.a == 1) & (data.total == 1)]
reduced['prob_b'] = data['rate'][(data.b == 1) & (data.total == 1)]
reduced['prob_c'] = data['rate'][(data.c == 1) & (data.total == 1)]
print(reduced)

which results in this output:

   a  b  c  rate  total  prob_a  prob_b  prob_c
4  1  1  0  0.24      2     NaN     NaN     NaN
5  1  0  1  0.27      2     NaN     NaN     NaN

2 Answers 2

3

IIUC

reduced['prob_a'] = data.loc[(data.a == 1) & (data.total == 1),'rate'].values[0]
reduced['prob_b'] = data.loc[(data.b == 1) & (data.total == 1),'rate'].values[0]
reduced['prob_c'] = data.loc[(data.c == 1) & (data.total == 1),'rate'].values[0]

reduced[['prob_a','prob_b','prob_c']]=reduced[['prob_a','prob_b','prob_c']].mul(reduced[['a','b','c']].eq(1).values)
reduced
Out[698]: 
   a  b  c  rate  total  prob_a  prob_b  prob_c
4  1  1  0  0.24      2     0.1    0.11    0.00
5  1  0  1  0.27      2     0.1    0.00    0.12
Sign up to request clarification or add additional context in comments.

Comments

2

Break it into two steps

One
Calculate Probabilities

probs = data.query('a + b + c == 1').pipe(
    lambda d: d.drop('rate', 1).T.dot(d.rate)
)

probs

a    0.10
b    0.11
c    0.12
dtype: float64

How it works
Get just the total equals to one rows:

data.query('a + b + c == 1')

   a  b  c  rate
1  1  0  0  0.10
2  0  1  0  0.11
3  0  0  1  0.12

pipe allows us to pipeline the results into function. From the result above, I want to dot product the columns a, b, c with the column rate. The lambda via the pipe allows me to do that.

More Generalized Approach

df = data.copy()
rate = df.pop('rate')
mask = df.sum(1) == 1
probs = df[mask].T.dot(rate[mask])

probs

Two
The Rest

data.join(
    data.drop('rate', 1).mul(probs).add_prefix('prob_')
).query('a + b + c == 2 and a == 1')

   a  b  c  rate  prob_a  prob_b  prob_c
4  1  1  0  0.24     0.1    0.11    0.00
5  1  0  1  0.27     0.1    0.00    0.12

3 Comments

hey, can u please explain the first step - especially the lambda expression.
The example in the question talks about 3 channels at a time. But in general, if I have "N" channels (a,b,c,d, ... z) , how do i write a function that gives me the probabilities for all possible combinations of pairs. For example, if I have 5 channels - A,B,C,D,E. I need 5c2 = 10 rows in my output where each row shows presence of only 2 channels at a time.
I added a more general approach for you

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.