46

I'm trying to add a new row to the DataFrame with a specific index name 'e'.

    number   variable       values
a    NaN       bank          true   
b    3.0       shop          false  
c    0.5       market        true   
d    NaN       government    true   

I have tried the following but it's creating a new column instead of a new row.

new_row = [1.0, 'hotel', 'true']
df = df.append(new_row)

I still don't understand how to insert a row with a specific index label. I would be grateful for any suggestions.

3
  • Possible duplicate of Pandas: Appending a row to a dataframe and specify its index label Commented Oct 7, 2017 at 15:34
  • @Zero I have read the answers in the link, but they discuss adding random values there. Commented Oct 7, 2017 at 16:12
  • Usually table-oriented APIs distinguish Update and Insert methods, but Pandas mixes them: Existing index labels are updated, new labels are inserted, assuming you don't need duplicates. You're left with convoluted concatenations to mimic the missing method. Commented May 9, 2024 at 10:14

5 Answers

75

You can use df.loc[_not_yet_existing_index_label_] = new_row.

Demo:

In [3]: df.loc['e'] = [1.0, 'hotel', 'true']

In [4]: df
Out[4]:
   number    variable values
a     NaN        bank   True
b     3.0        shop  False
c     0.5      market   True
d     NaN  government   True
e     1.0       hotel   true

PS: with this method you can't add a row whose index label already exists (a duplicate); the existing row with that label is updated instead.
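To make the insert-versus-update behaviour concrete, here is a minimal sketch. It rebuilds the frame from the question (using real booleans in the values column, which is my assumption about the intended data) and assigns to the same label twice; the row count shows that the second assignment overwrites rather than appends.

```python
import pandas as pd

# Rebuild the frame from the question (values taken from the post).
df = pd.DataFrame(
    {
        "number": [float("nan"), 3.0, 0.5, float("nan")],
        "variable": ["bank", "shop", "market", "government"],
        "values": [True, False, True, True],
    },
    index=list("abcd"),
)

# 'e' is not in the index yet, so this assignment adds a new row.
df.loc["e"] = [1.0, "hotel", True]
rows_after_insert = len(df)

# 'e' exists now, so the same syntax overwrites that row instead of duplicating it.
df.loc["e"] = [2.0, "motel", False]
rows_after_update = len(df)

print(df)
print(rows_after_insert, rows_after_update)
```

If you need an insert that fails on an existing label, check `label in df.index` before assigning.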


UPDATE:

This might not work in recent pandas/Python 3 if the index is a DatetimeIndex and the new row's index label doesn't exist yet.

It will work if we pass a correctly typed index value, for example one produced by pd.to_datetime.

Demo (using pandas: 0.23.4):

In [17]: ix = pd.date_range('2018-11-10 00:00:00', periods=4, freq='30min')

In [18]: df = pd.DataFrame(np.random.randint(100, size=(4,3)), columns=list('abc'), index=ix)

In [19]: df
Out[19]:
                      a   b   c
2018-11-10 00:00:00  77  64  90
2018-11-10 00:30:00   9  39  26
2018-11-10 01:00:00  63  93  72
2018-11-10 01:30:00  59  75  37

In [20]: df.loc[pd.to_datetime('2018-11-10 02:00:00')] = [100,100,100]

In [21]: df
Out[21]:
                       a    b    c
2018-11-10 00:00:00   77   64   90
2018-11-10 00:30:00    9   39   26
2018-11-10 01:00:00   63   93   72
2018-11-10 01:30:00   59   75   37
2018-11-10 02:00:00  100  100  100

In [22]: df.index
Out[22]: DatetimeIndex(['2018-11-10 00:00:00', '2018-11-10 00:30:00', '2018-11-10 01:00:00', '2018-11-10 01:30:00', '2018-11-10 02:00:00'], dtype='datetime64[ns]', freq=None)

10 Comments

Wow, super simple. Wish I had used that. It's all about timing.
@Bharathshetty, yeah, I use this method when I need to add a single row; if I need to add 2+ rows, I use your method (df.append(another_DF))
I added that in my answer. :)
df.append(pd.Series(new_row, index=df.columns, name='e')) -- a Series should do for a single row.
@yeliabsalohcin, it'll work - please see the updated answer
17

Use append, converting the list to a DataFrame, in case you want to add multiple rows at once, i.e.

df = df.append(pd.DataFrame([new_row],index=['e'],columns=df.columns))

Or for single row (Thanks @Zero)

df = df.append(pd.Series(new_row, index=df.columns, name='e'))

Output:

  number    variable values
a     NaN        bank   True
b     3.0        shop  False
c     0.5      market   True
d     NaN  government   True
e     1.0       hotel   true
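Note that DataFrame.append was removed in pandas 2.0, so on current versions the same idea has to go through pd.concat. Below is a minimal sketch of the multi-row case; the frame is a shortened copy of the one in the question, and the second new row ('f', 'school') is a made-up example value.

```python
import pandas as pd

# Shortened copy of the frame from the question.
df = pd.DataFrame(
    {
        "number": [3.0, 0.5],
        "variable": ["shop", "market"],
        "values": [False, True],
    },
    index=["b", "c"],
)

# Hypothetical extra rows; each inner list follows the order of df.columns.
new_rows = pd.DataFrame(
    [[1.0, "hotel", True], [2.5, "school", False]],
    index=["e", "f"],
    columns=df.columns,
)

# pd.concat returns a new frame and keeps the index labels of both inputs.
df = pd.concat([df, new_rows])
print(df)
```

Building one DataFrame of new rows and concatenating once is generally preferable to concatenating row by row in a loop, since every concat copies the data.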

3 Comments

df.append(pd.Series(new_row, index=df.columns, name='e')) -- a Series should do.
@Zero A Series was my very first thought, but I got a bit confused with name and index, so I went with the DataFrame approach. I've updated my answer. I wanted to be the first to answer.
This works in the case of a pandas dataframe with a DateTimeIndex when trying to add a row with a new datetime which doesn't exist in the index.
4

If it's the first row you need:

import pandas as pd
df = pd.DataFrame(columns=['number', 'variable', 'values'])
df.loc['e', ['number', 'variable', 'values']] = [1.0, 'hotel', 'true']


1
df.loc['e', :] = [1.0, 'hotel', 'true']

should be the correct implementation in case of conflicting index and column names.


1

DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False) was deprecated in pandas 1.4.0 and removed in pandas 2.0.

Source: Pandas Documentation

The documentation recommends using .concat().

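It would look like this if you wanted an empty row with only the added index label:

df = pd.concat([df, pd.DataFrame(index=['New index label'], columns=df.columns)])

If you wanted to add data (a list with one value per column), use this. The row is wrapped in a one-row DataFrame so that its values line up with the existing columns; a bare Series passed to pd.concat would end up as an extra column instead:

df = pd.concat([df, pd.DataFrame([data], index=['New index label'], columns=df.columns)])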
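If you want the concat route to behave as a strict insert, i.e. fail rather than silently create a duplicate label, pd.concat accepts verify_integrity=True. A minimal sketch follows; the helper name insert_row and the sample values are my own, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    {"number": [3.0, 0.5], "variable": ["shop", "market"]},
    index=["b", "c"],
)


def insert_row(frame, label, values):
    """Hypothetical helper: return `frame` plus one row, raising if `label` exists."""
    row = pd.DataFrame([values], index=[label], columns=frame.columns)
    # verify_integrity=True makes concat raise ValueError on duplicate index labels.
    return pd.concat([frame, row], verify_integrity=True)


df = insert_row(df, "e", [1.0, "hotel"])

# A second insert with the same label is rejected instead of creating a duplicate.
try:
    insert_row(df, "e", [9.9, "duplicate"])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(df)
print(duplicate_rejected)
```

This is the opposite of the df.loc behaviour, where assigning to an existing label updates the row.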

Hope that helps!
