I have searched quite a bit and not found a question that addresses this issue, but if this has been answered, forgive me; I am still quite green when it comes to coding in general. I have a data frame with a large number of variables that I would like to combine to create new variables, based on names I've put in a second data frame, in a loop. The data frame formulas should create and call columns from the main data frame data:

USDb = c(1,2,3)
USDc = c(4,5,6)
EURb = c(7,8,9)
EURc = c(10,11,12)
data = data.frame(USDb, USDc, EURb, EURc)

Now I'd like to create a new column data$USDa as defined by

data$USDa = data$USDb - data$USDc

and so on for EUR and other variables. This is easy enough to do manually, but I'd like to create a loop that pulls the names from formulas, something like this:

a = c("USDa", "EURa")
b = c("USDb", "EURb")
c = c("USDc", "EURc")
formulas = data.frame(a,b,c)

for (i in 1:length(formulas[,a])){
    data$formulas[i,a] = data$formulas[i,b] - data$formulas[i,c]
    }

Obviously data$formulas[i,a] returns NULL, so I tried data$paste0(formulas[i,a]), which returns Error: attempt to apply non-function

How can I get these strings to be recognized as variables in this way? Thanks.

3 Answers

There are simpler ways to do this, but I'll stick to most of your code as a means of explanation. Your code should work so long as you edit your for loop to the following:

for (i in 1:length(formulas[,"a"])){
    data[formulas[i,"a"]] = data[formulas[i,"b"]] - data[formulas[i,"c"]]
}
  1. formulas[,a] won't work because a is already defined as the character vector c("USDa", "EURa"), so R looks for columns with those names in formulas instead of the column "a". Use formulas[, "a"] if you want all rows from column "a" in data.frame formulas.
  2. data$formulas literally searches for a column called "formulas" in the data.frame data. Instead you want to write data[formulas] (of course, knowing that you need to index formulas so that it yields a single column name as a string).
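To make the difference between `$` and string-based indexing concrete, here is a minimal sketch using the sample data from the question. It is a variant of the loop above rather than the same code: it uses `[[` to pull out plain vectors, and `seq_len(nrow(formulas))` instead of `1:length(...)`. The variable `col` and the explicit `stringsAsFactors = FALSE` are additions for illustration; the latter is an assumption to keep the names as character strings on R versions before 4.0.0.

```r
# Sample data from the question
data <- data.frame(USDb = c(1, 2, 3), USDc = c(4, 5, 6),
                   EURb = c(7, 8, 9), EURc = c(10, 11, 12))

# Keep names as character strings (the default from R 4.0.0 on;
# set explicitly here as an assumption for older versions)
formulas <- data.frame(a = c("USDa", "EURa"),
                       b = c("USDb", "EURb"),
                       c = c("USDc", "EURc"),
                       stringsAsFactors = FALSE)

col <- "USDb"        # hypothetical helper variable, for illustration only
is.null(data$col)    # TRUE: `$` looks for a column literally named "col"
data[[col]]          # 1 2 3: `[[` uses the string stored in col

# Same idea as the loop above, one row of formulas at a time
for (i in seq_len(nrow(formulas))) {
  data[[formulas$a[i]]] <- data[[formulas$b[i]]] - data[[formulas$c[i]]]
}
data$USDa            # -3 -3 -3
```

`seq_len(nrow(formulas))` also behaves sensibly if formulas happens to have zero rows, where `1:length(...)` would count down from 1 to 0.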

Logic: iterate over each row of formulas with apply (which is a for loop internally) and do the calculation based on the column names in that row.

x = apply(formulas, 1, function(x) data[[x[2]]] - data[[x[3]]])  # column b minus column c, as in the question
colnames(x) = formulas$a
x
#     USDa EURa
#[1,]   -3   -3
#[2,]   -3   -3
#[3,]   -3   -3

cbind(data, x)
#  USDb USDc EURb EURc USDa EURa
#1    1    4    7   10   -3   -3
#2    2    5    8   11   -3   -3
#3    3    6    9   12   -3   -3
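A possible shortcut, which is not part of the answer above: because subtraction between two data frames of the same shape works column by column, the new columns can probably be created in a single assignment without apply. The sketch below assumes the sample data from the question, character (not factor) names in formulas, and the question's definition of each new column as "b" minus "c".

```r
# Sample data from the question
data <- data.frame(USDb = c(1, 2, 3), USDc = c(4, 5, 6),
                   EURb = c(7, 8, 9), EURc = c(10, 11, 12))

# stringsAsFactors = FALSE is an assumption for R versions before 4.0.0
formulas <- data.frame(a = c("USDa", "EURa"),
                       b = c("USDb", "EURb"),
                       c = c("USDc", "EURc"),
                       stringsAsFactors = FALSE)

# Select all "b" columns and all "c" columns at once, subtract them
# element-wise, and store the result under the names listed in column "a"
data[formulas$a] <- data[formulas$b] - data[formulas$c]
data
```

Unlike cbind(data, x), this modifies data in place, so the new columns are available afterwards as data$USDa and data$EURa.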

1 Comment

yes, this solution worked on the actual data I have and to me seems more elegant than looping. Thank you for the guidance.

Another option is split with sapply

sapply(setNames(split.default(as.matrix(formulas[-1]), 
   row(formulas[-1])), formulas$a), function(x) Reduce(`-`, data[x]))  # column b minus column c
#     USDa EURa
#[1,]   -3   -3
#[2,]   -3   -3
#[3,]   -3   -3
