2

Consider this dataset:

cat > example_file.txt <<EOL
bins|1,1|2,2|3,4|5,6|7,9|10,12|13,18|19,24|25,36|37,54|55,81
1,1||431|501|651|861|1081|1461|1711|2071|2371|2531
2,2|||261|401|631|871|1291|1611|2001|2291|2431
3,4|||121|271|551|1011|1191|1511|1901|2211|2351
5,6||||101|361|901|1251|1301|1691|2011|2221
7,9|||||181|461|841|1151|1511|1821|2061
10,12||||||161|591|931|1291|1621|1821
13,18|||||||351|691|1091|1401|1571
19,24||||||||301|861|1451|1201
25,36|||||||||371|851|961
37,54||||||||||371|621
55,81|||||||||||351
EOL

which I import:

import pandas
example = pandas.read_csv('example_file.txt', sep = '|', index_col = 0)

Now, the row and column names of this table are actually interval bounds. I can have pandas recognize them as such:

col_bin = pandas.IntervalIndex.from_tuples([tuple(list(map(int, x.split(',')))) for x in example.columns])
row_bin = pandas.IntervalIndex.from_tuples([tuple(list(map(int, x.split(',')))) for x in example.index])
example.columns = col_bin
example.index   = row_bin

Now, I would like to find the cell value corresponding to value pairs such as (11, 13).

For example, for (11, 13) the cell value is 591.

This is because the row index value (11) belongs to the interval (10,12) [this is the 6th row] and the col index value (13) belongs to the interval (13,18) [this is the 7th column] and the value at (row, column) (6, 7) in the table is 591.

1 Answer 1

2

You need first add parameter closed='both' to IntervalIndex.from_tuples, because intevals are closed in both sides:

col_bin = pd.IntervalIndex.from_tuples([tuple(list(map(int, x.split(',')))) for x in example.columns], closed='both')
row_bin = pd.IntervalIndex.from_tuples([tuple(list(map(int, x.split(',')))) for x in example.index], closed='both')
example.columns = col_bin
example.index   = row_bin
print (example)
         [1, 1]  [2, 2]  [3, 4]  [5, 6]  [7, 9]  [10, 12]  [13, 18]  \
[1, 1]       NaN   431.0   501.0   651.0   861.0    1081.0    1461.0   
[2, 2]       NaN     NaN   261.0   401.0   631.0     871.0    1291.0   
[3, 4]       NaN     NaN   121.0   271.0   551.0    1011.0    1191.0   
[5, 6]       NaN     NaN     NaN   101.0   361.0     901.0    1251.0   
[7, 9]       NaN     NaN     NaN     NaN   181.0     461.0     841.0   
[10, 12]     NaN     NaN     NaN     NaN     NaN     161.0     591.0   
[13, 18]     NaN     NaN     NaN     NaN     NaN       NaN     351.0   
[19, 24]     NaN     NaN     NaN     NaN     NaN       NaN       NaN   
[25, 36]     NaN     NaN     NaN     NaN     NaN       NaN       NaN   
[37, 54]     NaN     NaN     NaN     NaN     NaN       NaN       NaN   
[55, 81]     NaN     NaN     NaN     NaN     NaN       NaN       NaN   

          [19, 24]  [25, 36]  [37, 54]  [55, 81]  
[1, 1]      1711.0    2071.0    2371.0      2531  
[2, 2]      1611.0    2001.0    2291.0      2431  
[3, 4]      1511.0    1901.0    2211.0      2351  
[5, 6]      1301.0    1691.0    2011.0      2221  
[7, 9]      1151.0    1511.0    1821.0      2061  
[10, 12]     931.0    1291.0    1621.0      1821  
[13, 18]     691.0    1091.0    1401.0      1571  
[19, 24]     301.0     861.0    1451.0      1201  
[25, 36]       NaN     371.0     851.0       961  
[37, 54]       NaN       NaN     371.0       621  
[55, 81]       NaN       NaN       NaN       351  

And then get positions by IntervalIndex.get_loc and select by DataFrame.iat:

tup = (11, 13)
pos1 = example.index.get_loc(tup[0])
pos2 = example.columns.get_loc(tup[1])

print (example.iat[pos1, pos2])
591.0
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks a lot, I could feel I was not far but I was missing the last mile;)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.