Create a loop for calculating values from a dataframe in R?

Question

Let's say I make a dummy dataframe with 6 columns and 10 observations:

X <- data.frame(a=1:10, b=11:20, c=21:30, d=31:40, e=41:50, f=51:60)

I need to create a loop that evaluates 3 columns at a time, adding the summed second and third columns and dividing this by the sum of the first column:

 (sum(b)+sum(c))/sum(a) ... (sum(e)+sum(f))/sum(d) ...

I then need to construct a final dataframe from these values. For example using the dummy dataframe above, it would look like:

        value
1.     7.454545
2.     2.84507

I imagine I need to use the next function to iterate within the loop, but I'm fairly lost! Thank you for any help.

Do you repeat the values? E.g. (sum(b)+sum(c))/sum(a) then (sum(d)+sum(c))/sum(a), or should it be (sum(d)+sum(c))/sum(b)?
– Onyambu, Commented Jul 30, 2020 at 16:00
Hi Onyambu, no, the values don't repeat; it's every 3 discrete columns. So (b+c)/a, then (e+f)/d, and so on.
– wammy, Commented Jul 30, 2020 at 19:37

IceCreamToucan · Accepted Answer · 2020-07-30 16:15:17Z

1

You can split your data frame into groups of 3 columns by creating a grouping vector with rep in which each group number repeats 3 times. Then, over the resulting list of sub data frames, (s)apply a function that sums the second and third columns, adds the two sums, and divides by the sum of the first column.

out_vec <- 
  sapply(
    split.default(X, rep(1:ncol(X), each = 3, length.out = ncol(X)))
    , function(x) (sum(x[2]) + sum(x[3]))/sum(x[1]))

data.frame(value = out_vec)
#      value
# 1 7.454545
# 2 2.845070
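To make the grouping step concrete, here is a small sketch that inspects the intermediate objects. The names grp and pieces are introduced only for illustration and do not appear in the answer's code.

```r
X <- data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40, e = 41:50, f = 51:60)

# Grouping vector: one label per column, in runs of 3
grp <- rep(1:ncol(X), each = 3, length.out = ncol(X))
grp
# [1] 1 1 1 2 2 2

# split.default() treats the data frame as a list of columns,
# so each piece is a data frame holding 3 of the original columns
pieces <- split.default(X, grp)
lapply(pieces, names)
```

The first piece holds columns a, b, c and the second holds d, e, f, which is why x[1], x[2], x[3] inside the sapply refer to the first, second, and third column of each group.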

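You could also sum all the columns up front with colSums before the sapply. This should be more efficient, because split then works on a short named vector of column sums instead of on the data frame itself.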

out_vec <- 
  sapply(
    split(colSums(X), rep(1:ncol(X), each = 3, length.out = ncol(X)))
    , function(x) (x[2] + x[3])/x[1])

data.frame(value = out_vec, row.names = NULL)
#      value
# 1 7.454545
# 2 2.845070
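Since the question asks for a loop, the same calculation can also be written as an explicit for loop over the column sums. This is only a sketch, not the answer's method: the names sums, starts, value, and loop_out are made up for illustration, and it assumes the number of columns is a multiple of 3.

```r
X <- data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40, e = 41:50, f = 51:60)

sums   <- colSums(X)               # a = 55, b = 155, c = 255, d = 355, e = 455, f = 555
starts <- seq(1, ncol(X), by = 3)  # index of the first column in each group: 1, 4
value  <- numeric(length(starts))  # preallocate the result vector

for (k in seq_along(starts)) {
  i <- starts[k]
  # (second + third) / first, using the precomputed column sums
  value[k] <- (sums[i + 1] + sums[i + 2]) / sums[i]
}

loop_out <- data.frame(value = value)
loop_out
```

With the dummy data this works out to 410/55 and 1010/355, matching the values above. No next call is needed; stepping through the starting indices with seq(..., by = 3) is enough.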

edited Jul 30, 2020 at 16:15

answered Jul 30, 2020 at 16:10


Onyambu · Accepted Answer · 2020-07-30 20:12:12Z

1

You could use tapply:

tapply(colSums(X), gl(ncol(X)/3, 3), function(x) sum(x[-1])/x[1])
#        1        2 
# 7.454545 2.845070
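As a sketch of how this could be packaged, the snippet below shows what gl produces and wraps the tapply call in a function that returns the data frame the question asks for. The function name ratio_by_triplet is hypothetical, and the divisibility check is an added assumption, since gl(ncol(X)/3, 3) only makes sense when the column count is a multiple of 3.

```r
X <- data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40, e = 41:50, f = 51:60)

# gl(n, k) builds a factor with n levels, each repeated k times
groups <- gl(ncol(X) / 3, 3)
as.integer(groups)
# [1] 1 1 1 2 2 2

# Hypothetical wrapper (not part of the answer): validate, compute, return a data frame
ratio_by_triplet <- function(df) {
  stopifnot(ncol(df) %% 3 == 0)  # assumption: columns come in complete groups of 3
  res <- tapply(colSums(df), gl(ncol(df) / 3, 3),
                function(x) sum(x[-1]) / x[1])
  data.frame(value = as.vector(res))
}

ratio_by_triplet(X)
```

Passing a data frame whose column count is not a multiple of 3 stops with an error instead of silently recycling or returning NA.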

answered Jul 30, 2020 at 20:12


akrun · Accepted Answer · 2020-07-30 23:03:31Z

0

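Here is an option with the tidyverse:

library(dplyr) # 1.0.0
library(tidyr)
X %>% 
     # sum every column (one-row result)
     summarise(across(everything(), sum)) %>% 
     # reshape to one row per column sum
     pivot_longer(everything()) %>% 
     # label the rows in consecutive groups of 3
     group_by(grp = as.integer(gl(n(), 3, n()))) %>% 
     # lead(value) is (second, third, NA), so this is (second + third) / first
     summarise(value = sum(lead(value)/first(value), na.rm = TRUE)) %>% 
     select(value)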
# A tibble: 2 x 1
#  value
#  <dbl>
#1  7.45
#2  2.85

answered Jul 30, 2020 at 23:03
