I am trying to do the following in data.table, or to write a function that replaces a for loop, but I am not sure how to return two columns when one depends on the calculation of the other. The dataset contains sales and delivery units for each 'place' by month, but a starting inventory only for the first month. I need to calculate the beginning inventory of each period by first calculating the ending inventory of the previous month at that place. Ending inventory for each place is equal to the starting inventory minus sales units plus delivery units.

Here is how I am currently calculating it:

library(data.table)

data <- data.table(place = rep(c('a','b'), 6),
                 month = c(1,1,2,2,3,3,4,4,5,5,6,6),
                 sales = c(20,2,3,5,6,7,8,1,5,1,5,3),
                 delivery = c(1,1,1,1,1,1,1,1,1,1,1,1),
                 starting_inv = c(100,100,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                 ending_inv = c(81,99,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))

print(data)

   place month sales delivery starting_inv ending_inv
 1:     a     1    20        1          100         81
 2:     b     1     2        1          100         99
 3:     a     2     3        1           NA         NA
 4:     b     2     5        1           NA         NA
 5:     a     3     6        1           NA         NA
 6:     b     3     7        1           NA         NA
 7:     a     4     8        1           NA         NA
 8:     b     4     1        1           NA         NA
 9:     a     5     5        1           NA         NA
10:     b     5     1        1           NA         NA
11:     a     6     5        1           NA         NA
12:     b     6     3        1           NA         NA

dt <- data[order(place,month)]

print(dt)

    place month sales delivery starting_inv ending_inv
 1:     a     1    20        1          100         81
 2:     a     2     3        1           NA         NA
 3:     a     3     6        1           NA         NA
 4:     a     4     8        1           NA         NA
 5:     a     5     5        1           NA         NA
 6:     a     6     5        1           NA         NA
 7:     b     1     2        1          100         99
 8:     b     2     5        1           NA         NA
 9:     b     3     7        1           NA         NA
10:     b     4     1        1           NA         NA
11:     b     5     1        1           NA         NA
12:     b     6     3        1           NA         NA

for (i in seq_len(nrow(dt))) {
  # Month 1 already has its inventory; later months carry the previous row forward
  if (dt[i]$month != 1) {
    dt$starting_inv[i] <- dt[i - 1]$ending_inv
    dt$ending_inv[i] <- dt[i]$starting_inv - dt[i]$sales + dt[i]$delivery
  }
}

print(dt)

   place month sales delivery starting_inv ending_inv
 1:     a     1    20        1          100         81
 2:     a     2     3        1           81         79
 3:     a     3     6        1           79         74
 4:     a     4     8        1           74         67
 5:     a     5     5        1           67         63
 6:     a     6     5        1           63         59
 7:     b     1     2        1          100         99
 8:     b     2     5        1           99         95
 9:     b     3     7        1           95         89
10:     b     4     1        1           89         89
11:     b     5     1        1           89         89
12:     b     6     3        1           89         87

I would like to avoid the step that requires sorting the table by place and month. Running this on a table with much more data also takes too long, and I am having trouble turning it into a vectorized function.

2 Answers

The iteration is captured by a cumulative sum, and the rest can then be vectorised, so it should be fast.

data[, starting_inv := cumsum(shift(delivery-sales, fill = starting_inv[1])), place]
data[, ending_inv := starting_inv+delivery-sales]

data
#>     place month sales delivery starting_inv ending_inv
#>  1:     a     1    20        1          100         81
#>  2:     b     1     2        1          100         99
#>  3:     a     2     3        1           81         79
#>  4:     b     2     5        1           99         95
#>  5:     a     3     6        1           79         74
#>  6:     b     3     7        1           95         89
#>  7:     a     4     8        1           74         67
#>  8:     b     4     1        1           89         89
#>  9:     a     5     5        1           67         63
#> 10:     b     5     1        1           89         89
#> 11:     a     6     5        1           63         59
#> 12:     b     6     3        1           89         87
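
To see why a cumulative sum can stand in for the loop: each month's starting inventory is the initial inventory plus the net change (delivery minus sales) of every earlier month. A minimal base R sketch for place 'a' only, using the numbers from the question (the variable names `initial` and `net` are just for illustration):

```r
# Place 'a' from the question
sales    <- c(20, 3, 6, 8, 5, 5)
delivery <- rep(1, 6)
initial  <- 100                      # starting inventory of month 1

net <- delivery - sales              # net change within each month

# Starting inventory = initial + net change of all PREVIOUS months,
# so the net vector is shifted by one before accumulating
starting_inv <- initial + cumsum(c(0, head(net, -1)))
ending_inv   <- starting_inv + net

print(data.frame(month = 1:6, starting_inv, ending_inv))
```

The `shift(..., fill = starting_inv[1])` call in the data.table line plays the same role as `c(0, head(net, -1))` plus `initial` here, applied per place.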

This assumes the actual data you are dealing with is ordered by month within each place. If it is not, insert order(month) after the first square bracket in the first line, i.e. data[order(month), starting_inv := ...].
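
As a rough sketch of the unsorted case, assuming a small made-up table (`toy` is hypothetical, with rows deliberately shuffled), putting `order(month)` in `i` lets the grouped calculation see each place's rows in month order while the table itself keeps its original row order:

```r
library(data.table)

# Hypothetical shuffled data: months are NOT in order within each place
toy <- data.table(
  place        = c("a", "b", "a", "b", "a", "b"),
  month        = c(3, 1, 1, 2, 2, 3),
  sales        = c(6, 2, 20, 5, 3, 7),
  delivery     = 1,
  starting_inv = c(NA, 100, 100, NA, NA, NA)
)

# order(month) in i: within each place, rows are processed by month,
# so starting_inv[1] is the month-1 value used as the fill
toy[order(month),
    starting_inv := cumsum(shift(delivery - sales, fill = starting_inv[1])),
    by = place]
toy[, ending_inv := starting_inv + delivery - sales]

print(toy)
```

The result is written back to the original row positions, so no separate sort-and-reassign step is needed.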

Here is one option using accumulate2 from purrr:

library(purrr)
library(dplyr)
dt %>%
     group_by(place) %>%
     # accumulate2 returns n + 1 values (.init plus one per row); drop the last
     mutate(starting_inv = head(unlist(accumulate2(delivery, sales,
        ~ ..1 - ..3 + ..2, .init = first(starting_inv))), -1),
        ending_inv = starting_inv - sales + delivery)
# A tibble: 12 x 6
# Groups:   place [2]
#   place month sales delivery starting_inv ending_inv
#   <chr> <dbl> <dbl>    <dbl>        <dbl>      <dbl>
# 1 a         1    20        1          100         81
# 2 a         2     3        1           81         79
# 3 a         3     6        1           79         74
# 4 a         4     8        1           74         67
# 5 a         5     5        1           67         63
# 6 a         6     5        1           63         59
# 7 b         1     2        1          100         99
# 8 b         2     5        1           99         95
# 9 b         3     7        1           95         89
#10 b         4     1        1           89         89
#11 b         5     1        1           89         89
#12 b         6     3        1           89         87

This can also be used with data.table:

dt[, starting_inv := head(unlist(accumulate2(delivery, sales,
     function(x, y, z) x - z + y,
   .init = first(starting_inv))), -1L), place][, ending_inv :=
         starting_inv - sales + delivery]
