I have a table such as:
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
employ.data <- data.frame(employee, salary, startdate)
employ.data$Salary.category <- NA
employ.data$Salary.next.year <- 0
I am looking to add more columns. The values in each new column will be a function of the salary.
I have created the following loop:
# Loop over rows; "i" is used as the index so it does not shadow the employee vector
for (i in seq_len(nrow(employ.data))) {
  # Skip rows with a missing salary (column 2)
  if (!is.na(employ.data[i, 2])) {
    # Column 4 is Salary.category, column 5 is Salary.next.year
    if (employ.data[i, 2] <= 22000) {
      employ.data[i, 4] <- "Sub 22k"
      employ.data[i, 5] <- employ.data[i, 2] * 1.20
    } else if (employ.data[i, 2] > 22000 && employ.data[i, 2] <= 23000) {
      employ.data[i, 4] <- "Sub 23k"
      employ.data[i, 5] <- employ.data[i, 2] * 1.10
    } else if (employ.data[i, 2] > 23000) {
      employ.data[i, 4] <- "Sub 24k"
      employ.data[i, 5] <- employ.data[i, 2] * 1.10
    }
  }
}
It works as intended; the resulting data frame is:
> employ.data
    employee salary  startdate Salary.category Salary.next.year
1   John Doe  21000 2010-11-01         Sub 22k            25200
2 Peter Gynn  23400 2008-03-25         Sub 24k            25740
3 Jolie Hope  26800 2007-03-14         Sub 24k            29480
The issue is that in the actual table I have about 5 columns to add across more than 1 million rows, and the loop takes about 2 hours to run. Is there a better way to build the additional columns, for example with the apply family of functions?
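For comparison, here is a minimal vectorised sketch of the same banding logic in base R. It assumes the salary bands and multipliers stay exactly as in the loop above. The names `band` and `multiplier` are introduced here purely for illustration. This is one possible direction rather than a definitive solution, and it has not been timed on the real 1-million-row table.

```r
# Rebuild the example data from the question.
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-01", "2008-03-25", "2007-03-14"))
)

# cut() assigns every salary to a band in a single vectorised call.
# Intervals are right-closed by default:
# (-Inf, 22000], (22000, 23000], (23000, Inf)
band <- cut(employ.data$salary,
            breaks = c(-Inf, 22000, 23000, Inf),
            labels = c("Sub 22k", "Sub 23k", "Sub 24k"))

# One multiplier per band, in the same order as the labels above
# (illustrative helper vector, not part of the original code).
multiplier <- c(1.20, 1.10, 1.10)

# Fill both new columns for all rows at once, with no explicit loop.
employ.data$Salary.category  <- as.character(band)
employ.data$Salary.next.year <- employ.data$salary * multiplier[as.integer(band)]
```

One difference from the loop: a row with a missing salary ends up with `NA` in `Salary.next.year` here, whereas the loop leaves the initial `0` in place. If the `0` matters, it would need to be restored in a separate step.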