1

I have list of indexes on a dataframe with 5 columns.

indexes = [5,9,44]

I want to create a pd.Series with the value of a certain columns between the indexes. So, if the

row#5 = ["a","b","c","d","e"]
row#9 = ["q","w","e","r","t"]
row#44= ["z","x","c","v","b"]

and the columns_idx is 3, I want a vector with "d" in indexes 0-5, "r" in indexes 5-9, and "v" in indexes 9-44, and nan in indexes 44+. What is the best way to do so?

1 Answer 1

3

Solution working if default RangeIndex - convert all values outside indexeses to NaN and back filling values:

indexes = [5,9,44]
columns_idx = 3

out = df.iloc[:, columns_idx].where(df.index.isin(indexes)).bfill()

Another solution:

df.iloc[:, columns_idx] = df.iloc[indexes, columns_idx]
out = df.iloc[:, columns_idx].bfill()

Sample:

np.random.seed(343)
df = pd.DataFrame(np.random.randint(100, size=(20, 5)), columns=list('abcde'))
print(df)
     a   b   c   d   e
0   63  18  56  22  50
1   22  12   4  38  23
2   65  19   2  13  51
3   43  90  46  44  36
4   74  43  92  34  18
5   39  24  62  21   5
6   19  90  51  79  77
7    3   0  51  26  22
8   36  54  25  22   4
9    5  66  76  98  65
10  42  95  89  49  98
11  84  82  88  87  35
12  63  58  17  72  92
13  78  47  71  23   3
14  37  60  82  96  66
15  18  83  93  82  82
16  34  77   2  50  17
17  63   8  78  74  89
18  25  86  83  42  77
19  82  35  77  46  42

indexes = [5,9,14]
columns_idx = 3

out = df.iloc[:, columns_idx].where(df.index.isin(indexes)).bfill()
print (out)
0     21.0
1     21.0
2     21.0
3     21.0
4     21.0
5     21.0
6     98.0
7     98.0
8     98.0
9     98.0
10    96.0
11    96.0
12    96.0
13    96.0
14    96.0
15     NaN
16     NaN
17     NaN
18     NaN
19     NaN
Name: d, dtype: float64

Explanation:

Compare index values for True per indices from indexes with isin:

print (df.index.isin(indexes))
[False False False False False  True False False False  True False False
 False False  True False False False False False]

Seelct column by position with DataFrame.iloc:

print (df.iloc[:, columns_idx])
0     22
1     38
2     13
3     44
4     34
5     21
6     79
7     26
8     22
9     98
10    49
11    87
12    72
13    23
14    96
15    82
16    50
17    74
18    42
19    46
Name: d, dtype: int32

Replace non matched values by where:

print (df.iloc[:, columns_idx].where(df.index.isin(indexes)))
0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5     21.0
6      NaN
7      NaN
8      NaN
9     98.0
10     NaN
11     NaN
12     NaN
13     NaN
14    96.0
15     NaN
16     NaN
17     NaN
18     NaN
19     NaN
Name: d, dtype: float64

And last back filling missing values:

print (df.iloc[:, columns_idx].where(df.index.isin(indexes)).bfill())
0     21.0
1     21.0
2     21.0
3     21.0
4     21.0
5     21.0
6     98.0
7     98.0
8     98.0
9     98.0
10    96.0
11    96.0
12    96.0
13    96.0
14    96.0
15     NaN
16     NaN
17     NaN
18     NaN
19     NaN
Name: d, dtype: float64
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.