Assuming the data is always sorted (thanks @juanpa.arrivillaga), you can use the rank method of the pandas Series class. rank() takes several arguments; one of them is pct:
pct : boolean, default False
Computes percentage rank of data
Equal values (ties) can be ranked in different ways, and the choice changes the percentage rank they receive. This is controlled by the argument method:
method : {‘average’, ‘min’, ‘max’, ‘first’, ‘dense’}
You need the method "max":
max: highest rank in group
Let's look at the output of the rank() method with these parameters:
import pandas as pd
series = [1,2,2,2,2,2,2,2,2,2,2,5,5,6,7,8]
S = pd.Series(series)
percentage_rank = S.rank(method="max", pct=True)
print(percentage_rank)
This gives you the percentage rank of every entry in the Series, i.e. the fraction of entries that are less than or equal to it:
0 0.0625
1 0.6875
2 0.6875
3 0.6875
4 0.6875
5 0.6875
6 0.6875
7 0.6875
8 0.6875
9 0.6875
10 0.6875
11 0.8125
12 0.8125
13 0.8750
14 0.9375
15 1.0000
dtype: float64
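To see why "max" is the tie method used here, it can help to compare it with "min" and "average" on a tiny Series. This is only an illustrative sketch; the Series `s` and the dict `results` are made-up names, not part of the original data.

```python
import pandas as pd

# Small illustrative Series with one tie (the two 2s).
s = pd.Series([1, 2, 2, 5])

# Percentage rank under three different tie-breaking methods.
results = {
    method: s.rank(method=method, pct=True).tolist()
    for method in ("min", "average", "max")
}

for method, ranks in results.items():
    print(method, ranks)
```

With "max", every tied value gets the rank of the last member of its group, so its percentage rank equals the share of entries less than or equal to it, which is what the percentile lookup relies on.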
To retrieve the index for each of the three percentiles, look up the first element in the Series whose percentage rank is equal to or higher than the percentile you're interested in. Because the data is sorted, the index of that element is the index you need.
index25 = S.index[percentage_rank >= 0.25][0]
index50 = S.index[percentage_rank >= 0.50][0]
index75 = S.index[percentage_rank >= 0.75][0]
print("25 percentile: index {}, value {}".format(index25, S[index25]))
print("50 percentile: index {}, value {}".format(index50, S[index50]))
print("75 percentile: index {}, value {}".format(index75, S[index75]))
This gives you the output:
25 percentile: index 1, value 2
50 percentile: index 1, value 2
75 percentile: index 11, value 5
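If you need this for more than three percentiles, or cannot guarantee that the data is sorted, the lookup could be wrapped in a small helper. This is a sketch under assumptions: the function name `percentile_index` is hypothetical, and sorting with a stable sort first is my addition, not part of the approach above.

```python
import pandas as pd


def percentile_index(series, q):
    """Return the index label of the first element, in sorted order,
    whose percentage rank is >= q (with 0 < q <= 1)."""
    # Assumption: sort first so the helper also works on unsorted data.
    # mergesort is stable, so tied values keep their original order.
    ordered = series.sort_values(kind="mergesort")
    pct_rank = ordered.rank(method="max", pct=True)
    # First label whose percentage rank reaches the requested percentile.
    return pct_rank[pct_rank >= q].index[0]


S = pd.Series([1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 5, 6, 7, 8])
for q in (0.25, 0.50, 0.75):
    idx = percentile_index(S, q)
    print("{:.0%} percentile: index {}, value {}".format(q, idx, S[idx]))
```

For the sorted Series above this returns the same indices as before (1, 1 and 11). For unsorted input, the returned label refers to the original Series, not to a position in the sorted copy.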