Adding rows to Pandas dataframe

Question

I'm trying to use pandas to create a ledger of activity. My object will have a pandas DataFrame that tracks the balances and transactions associated with that object.

I'm struggling to append single rows of data to that DataFrame as orders get associated with the object. The most common answer seems to be "only create the frame once you have all the data", but I can't do that. I want to be able to compute on the fly as I add new data.

Here's my associated code (which fails):

self.ledger  = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-01')],
'qty' : [np.float64(startingBalance)],
'element_type' : [pd.Categorical(["startingBalance"])],
'avail_bal' : [np.float64(startingBalance)],
'firm_ind' : True,
'deleted_ind' : False,
'ord_id' : ["fooA"],
'parent_ord_id' : ["fooB"] },
columns=ledgerColumnList
)        

self.ledger.iloc[-1] = dict({'entry_date' : ['1900-01-02'],
'qty' : [startingBalance],
'element_type' : ["startingBalance"],
'avail_bal' : [startingBalance],
'firm_ind' : [True],
'deleted_ind' : [False],
'ord_id' : ["foofa"],
'parent_ord_id' : ["foofb"] })

Here's the error I'm getting:

File "C:\Users\MyUser\My Documents\Workspace\myscript.py", line 135, in __init__
'parent_ord_id' : ["foofb"] })
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 117, in __setitem__
self._setitem_with_indexer(indexer, value)
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 492, in _setitem_with_indexer
setter(item, v)
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 422, in setter
s._data = s._data.setitem(indexer=pi, value=v)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2843, in setitem
return self.apply('setitem', **kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2823, in apply
applied = getattr(b, f)(**kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 636, in setitem
values, _, value, _ = self._try_coerce_args(self.values, value)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2066, in _try_coerce_args
raise TypeError
TypeError

Thoughts?

1) How can I do this in Pandas?

or

2) Is there something else I should be using that would give me the built-in calculation tools of pandas but be better suited to my little-at-a-time data needs?

Joe T. Boka · Accepted Answer · 2016-03-10 22:27:21Z

3

You can also use df.loc[]

df = pd.DataFrame({'A': [1,2,3,4], 'B': [5,6,7,8], 'C': [9,10,11,12]})
df
    A   B   C
0   1   5   9
1   2   6   10
2   3   7   11
3   4   8   12
new_row = pd.DataFrame({'A': [35], 'B': [27], 'C': [43]})
new_row
     A  B   C
0   35  27  43
df.loc[4] = new_row.loc[0]
df
    A   B   C
0   1   5   9
1   2   6   10
2   3   7   11
3   4   8   12
4   35  27  43
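
If the new values are already at hand as plain Python values, the intermediate one-row DataFrame may not be needed. The following is a minimal sketch, assuming the frame keeps its default integer index (0, 1, 2, ...), so that `len(df)` is the next unused label:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})

# Assumption: the index is the default 0..n-1 range, so len(df) is a label
# that does not exist yet. Assigning to a missing label with .loc adds a row.
df.loc[len(df)] = [35, 27, 43]   # values in column order: A, B, C

print(df)
```

Note that each such assignment enlarges the frame, so this suits occasional single-row additions rather than tight loops.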

irene · Accepted Answer · 2016-03-31 08:20:08Z

2

You can also try to create a new dataframe for the new data, and then use concat.

For illustration purposes, let's take a simple dataframe:

import pandas as pd
df = pd.DataFrame({'a':[0,1,2],'b':[3,4,5]})
print(df)
>>    a  b
   0  0  3
   1  1  4
   2  2  5

Let's say you have new data coming in, with values a=4 and b=7. Create a new dataframe containing only the new data:

newresults = {'a':[4],'b':[7]}
_dfadd = pd.DataFrame(newresults)
print(_dfadd)
>>    a  b
   0  4  7

Then concatenate:

df = pd.concat([df,_dfadd]).reset_index(drop=True)
print(df)
>>    a  b
   0  0  3
   1  1  4
   2  2  5
   3  4  7
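
When several rows arrive before the frame is next needed, they can be collected first and concatenated in one call, so the existing data is copied once rather than once per row. This is only a sketch; the name `incoming` and its values are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]})

# Hypothetical batch of new records, one dict per row.
incoming = [{'a': 4, 'b': 7}, {'a': 5, 'b': 8}, {'a': 6, 'b': 9}]

# Build one small frame from the batch and concatenate once.
# ignore_index=True renumbers the rows 0..n-1, like reset_index(drop=True).
batch = pd.DataFrame(incoming)
df = pd.concat([df, batch], ignore_index=True)

print(df)
```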

Julien Spronck · Accepted Answer · 2016-03-10 21:41:03Z

1

One way is to use pandas.DataFrame.append():

self.ledger = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-01')],
                                  'qty' : [np.float64(startingBalance)],
                                  'element_type' : [pd.Categorical(["startingBalance"])],
                                  'avail_bal' : [np.float64(startingBalance)],
                                  'firm_ind' : [True],
                                  'deleted_ind' : [False],
                                  'ord_id' : ["fooA"],
                                  'parent_ord_id' : ["fooB"] },
                            columns=ledgerColumnList)

df = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-02')],
                        'qty' : [np.float64(startingBalance)],
                        'element_type' : ["startingBalance"],
                        'avail_bal' : [np.float64(startingBalance)],
                        'firm_ind' : [True],
                        'deleted_ind' : [False],
                        'ord_id' : ["foofa"],
                        'parent_ord_id' : ["foofb"] },
                  columns=ledgerColumnList)

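# append() returns a new DataFrame instead of modifying self.ledger in place,
# so the result has to be assigned back.
# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0;
# on those versions use pd.concat([self.ledger, df], ignore_index=True) instead.
self.ledger = self.ledger.append(df, ignore_index=True)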

1 Comment

NumericOverflow Over a year ago

Almost everything I've read advises against calling append often, if it can be avoided. Each call creates a new object containing the existing data plus the appended rows, so it carries quite a bit of overhead and is much slower than conventional in-place append operations.
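
One possible way to act on this concern is to keep incoming rows in a plain Python list, which is cheap to append to, and build the DataFrame only when a calculation is needed. The sketch below is not taken from any of the answers: the `Ledger` class, its method names, and the rule that `avail_bal` is the running sum of `qty` are all assumptions made for illustration.

```python
import pandas as pd


class Ledger:
    """Hypothetical ledger that buffers rows and builds a DataFrame on demand."""

    def __init__(self, starting_balance):
        self._rows = []      # plain list of dicts; list.append is cheap
        self._frame = None   # cached DataFrame, rebuilt only when stale
        self.add_entry('1900-01-01', starting_balance, 'startingBalance', 'fooA', 'fooB')

    def add_entry(self, entry_date, qty, element_type, ord_id, parent_ord_id):
        self._rows.append({
            'entry_date': pd.Timestamp(entry_date),
            'qty': float(qty),
            'element_type': element_type,
            'ord_id': ord_id,
            'parent_ord_id': parent_ord_id,
        })
        self._frame = None   # invalidate the cache

    def frame(self):
        # Rebuild the DataFrame only if rows were added since the last call.
        if self._frame is None:
            frame = pd.DataFrame(self._rows)
            # Assumed rule: available balance is the cumulative sum of qty.
            frame['avail_bal'] = frame['qty'].cumsum()
            self._frame = frame
        return self._frame


ledger = Ledger(100.0)
ledger.add_entry('1900-01-02', -25.0, 'order', 'foofa', 'foofb')
print(ledger.frame()[['entry_date', 'qty', 'avail_bal']])
```

The trade-off is that every call to `frame()` after new entries rebuilds the whole DataFrame, so this fits workloads where many rows are added between calculations.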
