Finding index of a pandas DataFrame value

Question

I am trying to process some .csv data using pandas, and I am struggling with something that I am sure is a rookie move, but after spending a lot of time trying to make this work, I need your help.

Essentially, I am trying to find the index of a value within a dataframe I have created.

max = cd_gross_revenue.max()
# max value of cd_gross_revenue

print(max)
# prints the max value, no problem!

maxindex = cd_gross_revenue.idxmax()
print(maxindex)
# prints the index of the max value, what I wanted!

print(max.index)
# ERROR: AttributeError: 'numpy.float64' object has no attribute 'index'

The maxindex variable gets me the answer using idxmax(), but what if I am not looking for the index of the max value? How would I find the index of some arbitrary value instead? Clearly .index does not work for me here, because max is a plain number rather than a pandas object.

Thanks in advance for any help!

Does this dataframe have only one column, or do you know which column holds the max value? If you know the column, then df.loc[df.col == max].index will return the index.
– EdChum, Commented Oct 1, 2014 at 20:45
Hi EdChum, thanks for your answer. Doing this gives me the following error:
Traceback (most recent call last):
  File "psims2.py", line 81, in <module>
    print cd_gross_revenue.loc[cd_gross_revenue.col == max].index
  File "C:\Python27\lib\site-packages\pandas-0.14.1-py2.7-win32.egg\pandas\core\generic.py", line 1843, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'col'
– ploo, Commented Oct 1, 2014 at 20:50
I think you misunderstand: col was a generic name for your column of interest, so substitute the actual column name from your df. My question is how many columns this df has. Is there only one, or do you know which column has the max value? If so, substitute col with that name.
– EdChum, Commented Oct 1, 2014 at 20:55

Sociopath · Accepted Answer · 2019-02-11 04:43:42Z

4

Use a boolean mask to select the rows where the value equals the one you are looking for. Use that mask to index the dataframe or series, then read the .index attribute of the result. An example:

In [8]: import pandas as pd

In [9]: s = pd.Series(range(10,20))

In [10]: s
Out[10]:
0    10
1    11
2    12
3    13
4    14
5    15
6    16
7    17
8    18
9    19
dtype: int64

In [11]: val_mask = s == 13

In [12]: val_mask
Out[12]:
0    False
1    False
2    False
3     True
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool

In [15]: s[val_mask]
Out[15]:
3    13
dtype: int64

In [16]: s[val_mask].index
Out[16]: Int64Index([3], dtype='int64')
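
The same mask approach carries over to a DataFrame column. The sketch below is only an illustration: the frame, its quarter labels, and the values are made up, and only the column name is borrowed from the question. Since the question's values are floats, it also shows numpy.isclose as one possible way to build the mask when exact equality is too strict.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column name borrowed from the question, values invented.
df = pd.DataFrame(
    {"cd_gross_revenue": [120.5, 310.0, 87.25, 310.0]},
    index=["q1", "q2", "q3", "q4"],
)

target = 310.0

# Exact match: boolean mask on the column, then keep the index labels
# of the rows where the mask is True.
exact_labels = df.index[df["cd_gross_revenue"] == target].tolist()

# Tolerant match for floats that may differ by a small rounding error.
close_mask = np.isclose(df["cd_gross_revenue"], 87.25000001, atol=1e-6)
close_labels = df.index[close_mask].tolist()

print(exact_labels)
print(close_labels)
```

Every matching row is returned, so a value that occurs more than once yields more than one label.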

answered Oct 1, 2014 at 20:47 by Daniel
edited Feb 11, 2019 at 4:43 by Sociopath


Adam Hughes · Accepted Answer · 2019-02-11 05:06:07Z

4

s[s==13]

For example:

from pandas import Series

s = Series(range(10,20))
s[s==13]

3    13
dtype: int64
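
Note that s[s==13] returns the matching rows, not just their index. The sketch below shows one way to pull out the index labels and, if needed, the integer positions. It assumes a non-default index; the letter labels are invented here so that labels and positions visibly differ.

```python
import numpy as np
import pandas as pd

# Same values as above, but with hypothetical letter labels.
s = pd.Series(range(10, 20), index=list("abcdefghij"))

matches = s[s == 13]

# Index labels of the matching rows.
labels = matches.index.tolist()

# Integer positions of the matching rows, independent of the labels.
positions = np.flatnonzero((s == 13).to_numpy()).tolist()

print(labels)
print(positions)
```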

answered Oct 1, 2014 at 21:22 by Adam Hughes
edited Feb 11, 2019 at 5:06

b10n · Accepted Answer · 2014-10-01 20:59:59Z

1

When you called idxmax, it returned the key in the index that corresponds to the max value. To get the value itself, pass that key back to the dataframe with .loc.

max_key = cd_gross_revenue.idxmax()
max_value = cd_gross_revenue.loc[max_key]
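
A self-contained sketch of that round trip, assuming cd_gross_revenue is a Series (as the traceback in the comments suggests). The values and year labels below are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's data.
cd_gross_revenue = pd.Series([120.5, 310.0, 87.25], index=[2011, 2012, 2013])

max_key = cd_gross_revenue.idxmax()        # index label of the maximum
max_value = cd_gross_revenue.loc[max_key]  # value stored at that label

# The value looked up via the label matches the one from max().
assert max_value == cd_gross_revenue.max()

print(max_key, max_value)
```

idxmax returns only the first label when the maximum occurs more than once; a boolean mask is the way to get all of them.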

answered Oct 1, 2014 at 20:59 by b10n
