1

I have a dataframe containing one missing value.

   exam_id   exam  
0        1   french   
1        2   italian 
2        3   chinese  
3        4   english  
4        3   chinese  
5        5   russian  
6        1   french       
7      NaN   russian   
8        1   french   
9        2   italian

I want to fill in the missing exam_id for the russian exam based on the existing information. Since the exam_id for russian is 5, I would like the same value assigned to the missing entry.

2
  • Just once, or for all missing values? Commented Mar 13, 2017 at 19:22
  • For all missing values! Commented Mar 13, 2017 at 19:22

2 Answers

3

You can group your data frame by exam, then do a ffill + bfill in case there are missing values before and after the existing value:

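# fill inside each exam group so values cannot leak between exams
df["exam_id"] = df.groupby("exam")["exam_id"].transform(lambda s: s.ffill().bfill())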
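
As a minimal sketch on the question's sample data, the grouped fill could look like the following. The DataFrame construction and the name `filled` are assumptions added for illustration. Both ffill and bfill run inside `transform`, so each exam is filled only from its own rows:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (row 7 has the missing exam_id).
df = pd.DataFrame({
    "exam_id": [1, 2, 3, 4, 3, 5, 1, np.nan, 1, 2],
    "exam": ["french", "italian", "chinese", "english", "chinese",
             "russian", "french", "russian", "french", "italian"],
})

# Work on a copy so the original frame stays available for comparison.
filled = df.copy()

# Forward-fill, then back-fill, separately within each exam group.
filled["exam_id"] = (
    filled.groupby("exam")["exam_id"]
          .transform(lambda s: s.ffill().bfill())
)

print(filled)
```

Because the column contained a NaN, exam_id is stored as float; you can cast it back with `astype(int)` once no missing values remain.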


1

Beware that this approach does more than fill missing values: it overwrites every exam_id. The upside is that it also corrects miscodings (e.g., "french" being coded as 3). Build a dictionary that maps each language to its ID, then apply it with map to create a new exam_id column. Note, however, that a language that doesn't appear in the dictionary (e.g., "French" with a capital F) will produce a missing value.

language_test_map = {'french': 1,
                     'italian': 2,
                     'chinese': 3,
                     'english': 4,
                     'russian': 5}

df['exam_id'] = df['exam'].map(language_test_map)
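
If you would rather fill only the gaps and leave existing values untouched, one option is to derive the lookup from the rows that already have an ID and pass the mapped values to fillna. This is a minimal sketch, assuming each exam has a single consistent exam_id in the data; the sample DataFrame and the names `lookup` and `result` are illustrative:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (row 7 has the missing exam_id).
df = pd.DataFrame({
    "exam_id": [1, 2, 3, 4, 3, 5, 1, np.nan, 1, 2],
    "exam": ["french", "italian", "chinese", "english", "chinese",
             "russian", "french", "russian", "french", "italian"],
})

# Build an exam -> exam_id lookup from the rows where the ID is known,
# keeping the first ID seen for each exam.
known = df.dropna(subset=["exam_id"]).drop_duplicates(subset="exam")
lookup = known.set_index("exam")["exam_id"]

# Fill only the missing IDs; existing values are left as they are.
result = df.copy()
result["exam_id"] = result["exam_id"].fillna(result["exam"].map(lookup))

print(result)
```

An exam that never appears with a known ID has no entry in the lookup, so its exam_id stays missing.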
