I'm new to pandas DataFrames, and I'm struggling to figure out how to access a particular cell so that I can use it in a calculation to fill a new cell.

I'd like to use apply to call an external function with data from the cell in the previous row (row - 1).

I managed to do that, but only by writing everything out to a simple list, and I'm pretty sure there is a better way to do it.

I build my DataFrame from a CSV with the following index:

from pandas.tseries.offsets import BDay  # business-day offset used as the frequency
DateIndex = pd.date_range(start="2005-1-1", end="2017-1-1", freq=BDay())

I'm positive my DataFrame is OK, as per the extract below:

2005-01-03    0.005742
2005-01-04    0.003765
2005-01-05   -0.005536
2005-01-06    0.001500
2005-01-07    0.007471
2005-01-10    0.002108
2005-01-11   -0.003195
2005-01-12   -0.003076
2005-01-13    0.005416
2005-01-14    0.003090

So, I would like to add 100 to the first entry and, for each of the other entries, add one and then multiply by the previous result.

I was able to do so in a list:

for i in range(0,len(df.index)):
    if i == 0:
        listV = [df.iloc[i] + 100]
    else:
        listV.append(listV[i-1] * (1 + df.iloc[i]))

Is there a way to do this and put the result directly in a new column of the DataFrame?

Thanks a lot, Regards, Julien

    Just do df['new column name'] = listV. You need to remove the square brackets in your if statement; otherwise it will turn the value into a list. That line should also be inside an append call, as in your else branch. Commented Feb 1, 2017 at 18:13
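One way to read the comment above is sketched here, under a few assumptions: the returns live in a Series (only the first five values of the extract are used), the list starts empty so that both branches can append, and `new_col` is a made-up column name.

```python
import pandas as pd

# First five returns from the question's extract, held in a Series (assumed layout).
returns = pd.Series(
    [0.005742, 0.003765, -0.005536, 0.001500, 0.007471],
    index=pd.to_datetime(
        ["2005-01-03", "2005-01-04", "2005-01-05", "2005-01-06", "2005-01-07"]
    ),
)

values = []  # start empty so both branches can append
for i in range(len(returns)):
    if i == 0:
        # First entry: the return plus 100, as described in the question.
        values.append(returns.iloc[i] + 100)
    else:
        # Later entries: previous result times (1 + current return).
        values.append(values[i - 1] * (1 + returns.iloc[i]))

# Put the returns and the loop result side by side in one DataFrame.
df = pd.DataFrame({"col": returns})
df["new_col"] = values  # "new_col" is a hypothetical column name
print(df)
```

Assigning a plain list works here because it has exactly one value per row; pandas matches the values to the rows by position.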

2 Answers


initialization

import pandas as pd

df = pd.DataFrame(
    dict(col=[0.005742, 0.003765, -0.005536, 0.0015, 0.007471,
              0.002108, -0.003195, -0.003076, 0.005416, 0.00309]),
    index=pd.to_datetime([
        '2005-01-03', '2005-01-04', '2005-01-05', '2005-01-06', '2005-01-07',
        '2005-01-10', '2005-01-11', '2005-01-12', '2005-01-13', '2005-01-14']),
)

print(df)

                 col
2005-01-03  0.005742
2005-01-04  0.003765
2005-01-05 -0.005536
2005-01-06  0.001500
2005-01-07  0.007471
2005-01-10  0.002108
2005-01-11 -0.003195
2005-01-12 -0.003076
2005-01-13  0.005416
2005-01-14  0.003090

comments
This looks like a series of returns. By adding 100 to the first observation, you shrink that first return relative to the base of 100, making it 0.57 basis points as opposed to 0.57 percent.
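The arithmetic behind that remark can be checked with a few lines. This is only an illustrative sketch: the variable names are made up, and the base of 100 is the starting level implied by the question.

```python
first_return = 0.005742  # first observation in the sample, about 0.57 percent

# Question's approach: add 100 to the return itself.
added = 100 + first_return
# Compounding approach: grow a base of 100 by the return.
compounded = 100 * (1 + first_return)

# Growth relative to the base of 100, in basis points (1 bp = 0.01% = 1e-4).
added_bp = (added / 100 - 1) * 1e4
compounded_bp = (compounded / 100 - 1) * 1e4

print(f"adding 100:  {added_bp:.2f} bp")
print(f"compounding: {compounded_bp:.2f} bp")
```

The first figure comes out at roughly 0.57 bp and the second at roughly 57 bp (0.57 percent), so adding 100 directly understates the first period's growth by a factor of 100.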

I believe what you want to do is add one to everything, then take the cumulative product, then multiply by 100.

This would show the cumulative growth of 100, which is what I believe you are after.

df.add(1).cumprod().mul(100)

                   col
2005-01-03  100.574200
2005-01-04  100.952862
2005-01-05  100.393987
2005-01-06  100.544578
2005-01-07  101.295746
2005-01-10  101.509278
2005-01-11  101.184956
2005-01-12  100.873711
2005-01-13  101.420043
2005-01-14  101.733431
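To connect this back to the question's request for a new column, here is a minimal sketch that stores the result in the frame and checks it against an explicit loop. The column name `growth` and the helper variables are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col": [0.005742, 0.003765, -0.005536, 0.001500, 0.007471,
             0.002108, -0.003195, -0.003076, 0.005416, 0.003090]},
    index=pd.to_datetime([
        "2005-01-03", "2005-01-04", "2005-01-05", "2005-01-06", "2005-01-07",
        "2005-01-10", "2005-01-11", "2005-01-12", "2005-01-13", "2005-01-14"]),
)

# Vectorized: cumulative growth of 100, stored in a new column.
df["growth"] = df["col"].add(1).cumprod().mul(100)

# Explicit recurrence for comparison:
# level starts at 100 and is multiplied by (1 + r) for every return r.
expected = []
level = 100.0
for r in df["col"]:
    level *= 1 + r
    expected.append(level)

# True when the vectorized column matches the loop up to floating-point tolerance.
print(np.allclose(df["growth"], expected))
```

The loop is there only to show which recurrence `cumprod` is computing; the single vectorized line is all that is needed in practice.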

df.add(1).cumprod().mul(100).plot()



2 Comments

Here's where the domain knowledge comes into play ;-)
Brilliant. That's exactly that ! I can't thank you enough.

Here's a better way to achieve the same thing:

col_copy = df.col.copy()   # work on a copy so the original column is untouched
col_copy.iloc[0] += 100    # add 100 to the first row
col_copy.iloc[1:] += 1     # add 1 to every remaining row

df.assign(new_col=col_copy.cumprod()) # cumulative product as a new column (returns a new DataFrame)

This yields the original DataFrame with an additional column, new_col, holding the cumulative product.
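As a quick check, the following sketch compares these steps with the loop from the question on the sample data. The frame is rebuilt inline so the snippet runs on its own, and the name `result` is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col": [0.005742, 0.003765, -0.005536, 0.001500, 0.007471,
             0.002108, -0.003195, -0.003076, 0.005416, 0.003090]},
    index=pd.to_datetime([
        "2005-01-03", "2005-01-04", "2005-01-05", "2005-01-06", "2005-01-07",
        "2005-01-10", "2005-01-11", "2005-01-12", "2005-01-13", "2005-01-14"]),
)

# Steps from this answer.
col_copy = df["col"].copy()
col_copy.iloc[0] += 100
col_copy.iloc[1:] += 1
result = df.assign(new_col=col_copy.cumprod())  # df itself is left unchanged

# The question's loop, kept as the reference.
listV = []
for i in range(len(df)):
    if i == 0:
        listV.append(df["col"].iloc[i] + 100)
    else:
        listV.append(listV[i - 1] * (1 + df["col"].iloc[i]))

print(np.allclose(result["new_col"], listV))  # same numbers as the loop
print("new_col" in df.columns)                # assign did not modify df
```

Because `assign` returns a new DataFrame, the result has to be bound to a name (for example `df = df.assign(...)`) if the column is meant to persist.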

Data:

Consider a DataFrame with a single column 'col', prepared as follows:

from io import StringIO

import pandas as pd

txt = StringIO(
"""
2005-01-03    0.005742
2005-01-04    0.003765
2005-01-05   -0.005536
2005-01-06    0.001500
2005-01-07    0.007471
2005-01-10    0.002108
2005-01-11   -0.003195
2005-01-12   -0.003076
2005-01-13    0.005416
2005-01-14    0.003090
""")

df = pd.read_csv(txt, sep=r'\s+', parse_dates=True, header=None,
                 index_col=['date'], names=['date', 'col'])
df.index.name = None
df

