How to get the value by column and row name in pandas in python

Question

I am getting a co-occurrence matrix as follows using pandas.

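import numpy as np
import pandas as pd

lst = [
    ['a', 'b'],
    ['b', 'c', 'd', 'e'],
    ['a', 'd'],
    ['b', 'e']
]

# One-hot encode every item, then merge the duplicate column labels
u = (pd.get_dummies(pd.DataFrame(lst), prefix='', prefix_sep='')
       .groupby(level=0, axis=1)
       .sum())

# Co-occurrence counts; zero the diagonal (self-pairs)
v = u.T.dot(u)
v.values[(np.r_[:len(v)], ) * 2] = 0

print(v)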

My output is as follows.

   a  b  c  d  e
a  0  1  0  1  0
b  1  0  1  1  2
c  0  1  0  1  1
d  1  1  1  0  1
e  0  2  1  1  0

I want to get how many times e appears with d using the above matrix (i.e. 1) and divide it by the total count of co-occurrences (i.e. 9; since the matrix is symmetric, I only counted the upper part of the matrix to get the total).

So my output should be:

the co-occurrence count of e and d, which is 1,

and the total co-occurrence count, which is 9 (since the matrix is symmetric).

I would like to know if it is possible to do this in pandas.

I am happy to provide more details if needed.

moys · Accepted Answer · 2019-08-16 06:37:55Z

1

Will this work for you?

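a = v.loc['e', 'b']       # value at row 'e', column 'b' (v is the matrix from the question)
b = v.values.sum() / 2    # half the total, since the matrix is symmetric
print(a / b)

Inside the loc method, the first value is the row label and the second value is the column label. You can change them as needed (for example, v.loc['e', 'd'] for e and d).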
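As a minimal sketch of the label-based lookup described above, the snippet below hard-codes the matrix from the question as `v` (so it does not depend on how the matrix was built) and wraps the lookup in a helper. The function name `cooccurrence_share` is hypothetical and only used for illustration; `.at` is used here as a scalar-only alternative to `.loc`.

```python
import pandas as pd

# The co-occurrence matrix from the question, hard-coded for the example
labels = ["a", "b", "c", "d", "e"]
v = pd.DataFrame(
    [[0, 1, 0, 1, 0],
     [1, 0, 1, 1, 2],
     [0, 1, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [0, 2, 1, 1, 0]],
    index=labels, columns=labels,
)

def cooccurrence_share(matrix, row, col):
    """Hypothetical helper: return (pair count, total count, ratio)."""
    count = matrix.at[row, col]            # row label first, then column label
    total = matrix.to_numpy().sum() / 2    # each pair appears twice in a symmetric matrix
    return count, total, count / total

count, total, share = cooccurrence_share(v, "e", "d")
print(count, total, share)
```

For e and d this should give a count of 1, a total of 9, and a ratio of about 0.111.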

edited Aug 16, 2019 at 6:37

answered Aug 16, 2019 at 6:17

moys

jezrael · Accepted Answer · 2019-08-16 06:46:09Z

1

I believe you need to divide by the sum of the values in the upper triangle only, which is the sum of all values divided by 2:

v = v / (v.values.sum() / 2)
print(v)
          a         b         c         d         e
a  0.000000  0.111111  0.000000  0.111111  0.000000
b  0.111111  0.000000  0.111111  0.111111  0.222222
c  0.000000  0.111111  0.000000  0.111111  0.111111
d  0.111111  0.111111  0.111111  0.000000  0.111111
e  0.000000  0.222222  0.111111  0.111111  0.000000
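One way to check that halving the full sum really matches the upper-triangle total is to compute the latter directly with `np.triu`. This is only a sanity-check sketch on the matrix from the question, hard-coded here as `v`; the shortcut of dividing by 2 relies on the matrix being symmetric with a zero diagonal.

```python
import numpy as np
import pandas as pd

# The co-occurrence matrix from the question, hard-coded for the example
labels = ["a", "b", "c", "d", "e"]
v = pd.DataFrame(
    [[0, 1, 0, 1, 0],
     [1, 0, 1, 1, 2],
     [0, 1, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [0, 2, 1, 1, 0]],
    index=labels, columns=labels,
)

# k=1 keeps only the entries strictly above the diagonal
upper_total = np.triu(v.to_numpy(), k=1).sum()

# Shortcut used in the answer: half of the full sum
half_total = v.to_numpy().sum() / 2

print(upper_total, half_total)
```

Both values should come out as 9 for this matrix.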

For one value:

print(v.loc['d','e'] / (v.values.sum() / 2))
0.1111111111111111

If you need to assign back only one value:

v.loc['d','e'] = v.loc['d','e'] / (v.values.sum() / 2)
print(v)

   a  b  c  d         e
a  0  1  0  1  0.000000
b  1  0  1  1  2.000000
c  0  1  0  1  1.000000
d  1  1  1  0  0.111111
e  0  2  1  1  0.000000
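Note that the assignment above writes a float into an integer column, which is why only column e shows decimals. Depending on the pandas version, that upcast may happen silently or with a warning, so a cautious variant (a sketch, not part of the original answer) converts the matrix to float first and computes the total before modifying anything. The matrix is hard-coded here as `v`.

```python
import pandas as pd

# The co-occurrence matrix from the question, hard-coded for the example
labels = ["a", "b", "c", "d", "e"]
v = pd.DataFrame(
    [[0, 1, 0, 1, 0],
     [1, 0, 1, 1, 2],
     [0, 1, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [0, 2, 1, 1, 0]],
    index=labels, columns=labels,
)

# Compute the total on the untouched integer matrix
total = v.to_numpy().sum() / 2

# Convert every column to float so the assignment does not change a dtype
v = v.astype(float)

# Overwrite only the (row 'd', column 'e') cell; the mirrored cell is left as is
v.loc["d", "e"] = v.loc["d", "e"] / total

print(v)
```

Only the cell at row d, column e changes; the mirrored cell at row e, column d keeps its original count, as in the output shown above.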

edited Aug 16, 2019 at 6:46

answered Aug 16, 2019 at 5:58

jezrael

6 Comments

EmJ Over a year ago

Thanks a lot for the answer. I actually want to do it by specifying the column and row name, i.e. I give e and d and I get their co-occurrence count as 1. Next I get the total co-occurrence count separately from the matrix (i.e. 9), and later I divide them (i.e. 1/9 = 0.111111111). Is there a way to do this in pandas? :)

jezrael Over a year ago

@EmJ - I think I understand now: you need a scalar result? The answer was edited.

moys Over a year ago

@jezrael I think the expected output is just one value. The problem is to find the value at a particular location (for example, the intersection of row 'd' and column 'e' is 1, and the intersection of row 'e' and column 'b' is 2) and then divide this number by half the sum of the whole dataframe. Here the sum of the whole dataframe is 18, so half of it is 9. I have provided a solution; maybe you can provide an even better one.

moys Over a year ago

OK, got it. I saw the update only now. But the answer seems wrong: 1/9 should be 0.1111, not 0.027777777777777776. I double-checked with the calculator on my computer.

jezrael Over a year ago

@mohanys - You are right, I was missing the (). Now it works nicely.
