I have a large dataframe and would like to update specific values at known row and column indices. I would like to do this without an explicit for loop.

For example:

import string                                                                                                                                  
import numpy as np                                                                                                                             
import pandas as pd                                                                                                                            
df = pd.DataFrame(np.random.rand(10, 10), index = range(10), columns = list(string.ascii_lowercase)[:10])    

I have arbitrary arrays of indexes, columns, and values that I would like to use to update df. For example:

update_values = [0,-2,-3]                                                                                                                       
update_index = [3,5,7]                                                                                                                          
update_columns = ["d","g","i"]     

I can loop over the arrays to update the original dataframe:

for i,j,v in zip(update_index, update_columns, update_values): 
    df.loc[i,j] = v 

but would like to use a technique not involving an explicit for loop.

  • You have a list of values and you want to use each item in the list to update your dataframe, so you have to use a loop! Commented Apr 21, 2019 at 15:32
  • @DeveshKumarSingh that's not actually how pandas works ;} Commented Apr 21, 2019 at 15:36
  • Ohh, Even I am learning pandas, could you point me to good resources which will help my understanding of pandas Commented Apr 21, 2019 at 15:38
  • @DeveshKumarSingh I believe constant training, official docs and stackoverflow is the best combo ;} Commented Apr 21, 2019 at 15:42
  • Makes sense! Thanks :) Commented Apr 21, 2019 at 15:43

2 Answers

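Use the underlying NumPy array. The column labels first have to be translated to integer positions; the row labels here already coincide with positions because the index is range(10):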

indexes = map(df.columns.get_loc, update_columns)
df.values[update_index, list(indexes)] = update_values
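Assigning through df.values only works when it returns a writable view of the data. That is not guaranteed: a frame with mixed dtypes returns a copy, and with Copy-on-Write enabled the array is read-only. A minimal sketch of a variant that avoids this, assuming the same setup as the question (the names rows, cols and arr are illustrative only):

```python
import string

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 10), index=range(10),
                  columns=list(string.ascii_lowercase)[:10])
update_values = [0, -2, -3]
update_index = [3, 5, 7]
update_columns = ["d", "g", "i"]

# Translate row and column labels to integer positions.
# get_indexer returns -1 for a label that is not found, so check for that
# if the labels are not known to exist.
rows = df.index.get_indexer(update_index)
cols = df.columns.get_indexer(update_columns)

# Work on an explicit, writable copy of the data.
arr = df.to_numpy(copy=True)

# Paired integer-array indexing: sets (rows[k], cols[k]) = update_values[k].
arr[rows, cols] = update_values

# Rebuild the frame with the original labels.
df = pd.DataFrame(arr, index=df.index, columns=df.columns)
```

This costs one extra copy of the data, but it does not depend on whether .values happens to be a view.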
2 Comments

This answers my question, thank you. I do still wonder whether there is anything in the pandas API to accomplish this without resorting to NumPy and .values.
pandas' indexing is different from NumPy's indexing. Take a look at the indexing docs of both libraries and you'll see the differences. The lists you have are just right for NumPy indexing, and there is probably no faster way to do this than using the underlying NumPy array.
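For the follow-up about staying within the pandas API, one possible sketch (not necessarily the fastest option) is to build a sparse "patch" frame and apply it with DataFrame.update, which overwrites only where the patch is not NaN. This assumes the same setup as the question; the name patch is illustrative only.

```python
import string

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 10), index=range(10),
                  columns=list(string.ascii_lowercase)[:10])
update_values = [0, -2, -3]
update_index = [3, 5, 7]
update_columns = ["d", "g", "i"]

# A Series keyed by (row label, column label) pairs...
pairs = pd.MultiIndex.from_arrays([update_index, update_columns])
# ...unstacked into a small frame that is NaN everywhere except at the
# requested (row, column) positions.
patch = pd.Series(update_values, index=pairs).unstack()

# update() aligns on labels and modifies df in place, skipping NaN cells.
df.update(patch)
```

Two limitations follow from this design: a value of NaN cannot be written this way, since update ignores NaN, and the (row, column) pairs have to be unique for unstack to succeed.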
Try using loc, which takes a list of index labels and a list of column names: loc[[index_names], [column_names]]

df.loc[[3,5,7], ["d","g","i"]] = [0,-2,-3]

1 Comment

This doesn't do what you think it does. Those indices will return a 3x3 dataframe, and you are updating each row to those values, rather than the individual entries.
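A quick check of the behaviour described in this comment, assuming the same setup as the question (the name block is illustrative only):

```python
import string

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 10), index=range(10),
                  columns=list(string.ascii_lowercase)[:10])

rows, cols = [3, 5, 7], ["d", "g", "i"]

# Two lists select the full 3x3 cross product of rows and columns,
# and the three values are broadcast across every selected row.
df.loc[rows, cols] = [0, -2, -3]

block = df.loc[rows, cols]
print(block)
```

Every one of the three rows ends up with d = 0, g = -2 and i = -3, so nine cells change rather than the three intended ones.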