This question is related to "Adding rows per group in pandas / ipython if per group a row is missing", but is a bit more complicated.
I have a table like this:
ID  DEGREE     TERM   STATUS  GRADTERM
1   Bachelors  20111  1
1   Bachelors  20116  1
2   Bachelors  20126  1
2   Bachelors  20131  1
2   Bachelors  20141  1
3   Bachelors  20106  1
3   Bachelors  20111  1       20116
3   Masters    20116  1
3   Masters    20121  1
3   Masters    20131  1       20136
What I would like is to turn that into this (when run for term 20151):
ID DEGREE TERM STATUS
1 Bachelors 20111 1
1 Bachelors 20116 1
1 Bachelors 20121 0
1 Bachelors 20126 0
1 Bachelors 20131 0
1 Bachelors 20136 0
1 Bachelors 20141 0
1 Bachelors 20146 0
1 Bachelors 20151 0
2 Bachelors 20126 1
2 Bachelors 20131 1
2 Bachelors 20136 0
2 Bachelors 20141 1
2 Bachelors 20146 0
2 Bachelors 20151 0
3 Bachelors 20106 1
3 Bachelors 20111 1
3 Bachelors 20116 2
3 Bachelors 20121 2
3 Bachelors 20126 2
3 Bachelors 20131 2
3 Bachelors 20136 2
3 Bachelors 20141 2
3 Bachelors 20146 2
3 Bachelors 20151 2
3 Masters 20116 1
3 Masters 20121 1
3 Masters 20126 0
3 Masters 20131 1
3 Masters 20136 2
3 Masters 20141 2
3 Masters 20146 2
3 Masters 20151 2
In both tables, STATUS is 0 (not enrolled), 1 (enrolled), or 2 (graduated). TERM is the year followed by 1 for spring or 6 for fall.
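To make the term encoding concrete, here is a minimal sketch of stepping from one term code to the next. The helper names `next_term` and `term_range` are made up for illustration, and the sketch assumes every code ends in 1 or 6.

```python
def next_term(term):
    """Return the term code that follows `term` (spring 1 -> fall 6 -> next spring 1)."""
    year, season = divmod(term, 10)
    if season == 1:
        return year * 10 + 6       # spring -> fall of the same year
    return (year + 1) * 10 + 1     # fall -> spring of the next year


def term_range(first, last):
    """List every term code from `first` through `last`, inclusive."""
    terms = []
    term = first
    while term <= last:
        terms.append(term)
        term = next_term(term)
    return terms


print(term_range(20126, 20151))
# [20126, 20131, 20136, 20141, 20146, 20151]
```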
Missing TERM records should be added for each person between their first record and the current term (20151 in this case). Each added record gets a STATUS of 0, unless the last existing record has a STATUS of 2, in which case the 2 carries forward. That is, a person is either enrolled (STATUS=1) or not (STATUS=0 or 2).
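For reference, the rule above could be sketched roughly as follows. This is only one possible approach, not a definitive one. The sample input has no rows with STATUS 2, so the sketch assumes GRADTERM is the first term that should get STATUS 2, which is what the expected output shows. It also assumes each (ID, DEGREE) pair has at most one row per TERM. The function names `term_range`, `fill_terms` and `expand` are invented for this example.

```python
import pandas as pd

# Sample data from the question; GRADTERM is None where no graduation is recorded.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2, 3, 3, 3, 3, 3],
    "DEGREE": ["Bachelors"] * 7 + ["Masters"] * 3,
    "TERM": [20111, 20116, 20126, 20131, 20141,
             20106, 20111, 20116, 20121, 20131],
    "STATUS": [1] * 10,
    "GRADTERM": [None, None, None, None, None,
                 None, 20116, None, None, 20136],
})


def term_range(first, last):
    """List every term code from `first` through `last` (codes end in 1 or 6)."""
    terms, term = [], first
    while term <= last:
        terms.append(term)
        year, season = divmod(term, 10)
        term = year * 10 + 6 if season == 1 else (year + 1) * 10 + 1
    return terms


def fill_terms(group, current_term):
    """Build the full TERM/STATUS table for one (ID, DEGREE) group."""
    terms = term_range(int(group["TERM"].min()), current_term)
    # Index the existing STATUS values by TERM; terms with no record become 0.
    status = group.set_index("TERM")["STATUS"].reindex(terms, fill_value=0)
    # Assumption: GRADTERM marks the first term that carries STATUS 2.
    grad_term = group["GRADTERM"].max()  # NaN if the group never graduated
    if pd.notna(grad_term):
        status.loc[status.index >= grad_term] = 2
    return pd.DataFrame({"TERM": terms, "STATUS": status.to_numpy()})


def expand(df, current_term):
    """Apply fill_terms to every (ID, DEGREE) group and stack the results."""
    pieces = []
    for (student_id, degree), group in df.groupby(["ID", "DEGREE"]):
        filled = fill_terms(group, current_term)
        filled.insert(0, "DEGREE", degree)
        filled.insert(0, "ID", student_id)
        pieces.append(filled)
    return pd.concat(pieces, ignore_index=True)


result = expand(df, 20151)
print(result.to_string(index=False))
```

The key indexing step is `set_index("TERM")` followed by `reindex(terms, fill_value=0)`: it relabels the rows by term, then asks for the full list of terms and fills the ones that were absent with 0.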
I am using pandas in Python, but I am new to Python. I have been trying to figure out how the indexing for a DataFrame works, but that is a complete mystery at this point. Any guidance would be greatly appreciated.