I have:
- NewData, a pd.DataFrame to be populated from
- SourceData, a list of dataframes holding source data and
- source, a dataframe holding index values for which dataframe in SourceData is to be assigned.
- indexlen, an integer for the length of indexes in the dataframes
(Using dataframes because it's critical the indexes align.)
For instance, assume there are 1000 dataframes in SourceData and indexlen is 10,000. Starting at 10,000, I assign all columns from the dataframe that source points to in SourceData to NewData, moving up the index (all the dataframes share the same index) until source decrements, at which point I start assigning the values from all columns of the dataframe SourceData[999] to NewData, and so on.
I'm currently doing this with a loop:
    for j in range(1, indexlen + 1):
        NewData[j] = SourceData[source[j]].ix[j, :]
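To make the setup reproducible, here is a minimal sketch of the same loop on toy data. The sizes, dates, fill values and the contents of source are invented for illustration, and it uses 0-based positional access with .iloc in place of .ix, which is an assumption about what the loop above is meant to do (copy row j from the selected dataframe into row j of NewData).

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real data (sizes and values are invented for illustration)
n_frames, indexlen = 3, 6
index = pd.date_range("1975-02-13", periods=indexlen, freq="D")
columns = ["Open", "High", "Low", "Close", "Vol", "OI"]

# Every frame shares the same index and columns; frame k is filled with the value k
SourceData = [
    pd.DataFrame(float(k), index=index, columns=columns) for k in range(n_frames)
]

# source[j] says which frame supplies row j; it only ever decrements
source = [2, 2, 1, 1, 0, 0]

# Empty target with the same index and columns as the sources
NewData = pd.DataFrame(np.nan, index=index, columns=columns)

# Positional, 0-based version of the loop: row j comes from frame source[j]
for j in range(indexlen):
    NewData.iloc[j] = SourceData[source[j]].iloc[j]
```

With this toy data, each row of NewData holds the number of the frame it was taken from, which makes it easy to check any loop-free version against the loop.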
I would like to do this without using a loop, but I don't know how to broadcast this. I'm sure I'm missing something obvious, but any help would be appreciated. Thank you!
Edit: I made source a list, because I figured that was more efficient to access by element.
In response to a question about the dataframes, they are standard price data:
>>> SourceData[1].head()
bpz1975      Open   High    Low  Close  Vol  OI
1975-02-13  2.275  2.275  2.275  2.275    0  50
1975-02-14  2.275  2.275  2.275  2.275    0  50
1975-02-18  2.275  2.275  2.275  2.275    0  50
1975-02-19  2.290  2.290  2.290  2.290    0  50
1975-02-20  2.290  2.290  2.290  2.290    0  50
In this case, I am reading in all months of a futures contract and then applying roll logic to create a single series.