
I have the following code.

completemodel <- function(model, colnum)
{
  modlst = c()
  tuplenum = length(model)
  if(tuplenum != 0)
    for(i in 1:tuplenum)
      modlst = c(modlst, model[[i]])
  index = seq(0, colnum-1)
  inddiff = setdiff(index, modlst)
  inddifflen = length(inddiff)
  for(i in seq(length.out=inddifflen))
    model = append(model, inddiff[i])
  return(model)
}

##   Calculate number of parameters in model.
numparam <- function(mod, colnum)
  {
    library(RJSONIO)
    mod = fromJSON(mod)
    mod = completemodel(mod, colnum)
    totnum = 0
    for(tup in mod)
      totnum = totnum +(4**length(tup))
    return(totnum)
  }

x = cbind.data.frame(rownum=c(100, 100), colnum=c(10, 20), modeltrue=c("[]", "[]"), modelresult=c("[[1,2]]","[[1,3]]"), stringsAsFactors=FALSE)

> x
  rownum colnum modeltrue modelresult
1    100     10        []     [[1,2]]
2    100     20        []     [[1,3]]

How can I operate on x to get a data frame that looks like the following? Here, of course, an entry such as numparam("[]", 10) stands for the value returned by that call.

  rownum   colnum    numparam_modeltrue   numparam_modelresult
  100        10      numparam("[]", 10)   numparam("[[1,2]]", 10)
  100        20      numparam("[]", 20)   numparam("[[1,3]]", 20)

Some version of the apply function might work, but I am having problems finding the proper formulation.

UPDATE: It seems that if the (rownum, colnum) tuple is not unique, then one can do the following.

x = cbind.data.frame(id=c(1, 2, 3), rownum=c(100, 100, 100), colnum=c(10, 20, 20), modeltrue=c("[]", "[]", "[]"),
  modelresult=c("[[1,2]]","[[1,3]]","[[1,3, 4]]"), stringsAsFactors = FALSE)

##Then, create a data.table and set the key

library(data.table)
xDT <- as.data.table(x)
setkeyv(xDT, c("id", "rownum", "colnum"))

Is that the correct method?

  • @RomanLuštrik: I'd be happy to, but what kind of context do you need? The code given above is complete, I think. I just want to operate on the given data frame with the numparam function to obtain another data frame in the manner specified. What is unclear? This is the actual code I am using. I suppose I could come up with a simpler example to illustrate, though this one is not very complex. Commented Oct 9, 2012 at 8:01
  • numparam_modeltrue and numparam_modelresult are factors? Commented Oct 9, 2012 at 10:18
  • @RomanLuštrik: No, just strings. I've modified the call to cbind. Commented Oct 9, 2012 at 10:27
  • This might help: stackoverflow.com/questions/9236306/… Commented Oct 10, 2012 at 0:26

3 Answers

3

If you're open to it, you could use the data.table package.

First, create a data.table, add a unique identifier column id and set that as the key

library(data.table)
xDT <- as.data.table(x)
xDT[, id := seq_len(nrow(xDT))]
setkey(xDT, "id")

Then, using do.call, you can run your numparam function on the appropriate columns:

res1 <- xDT[, list(numparam_modeltrue = do.call(numparam, unname(.SD))),
  .SDcols = c(3, 2), by = key(xDT)]
res2 <- xDT[, list(numparam_modelresult = do.call(numparam, unname(.SD))),
  .SDcols = c(4, 2), by = key(xDT)]

Then combine the results into a data.table

xDT[res1][res2][, c("modeltrue", "modelresult") := NULL, with = FALSE]
   id rownum colnum numparam_modeltrue numparam_modelresult
1:  1    100     10                 40                   48
2:  2    100     20                 80                   88

EDIT:

As Matthew Dowle suggests, you could reach the same result without the merge at the end, as follows:

xDT[,numparam_modeltrue := do.call(numparam, unname(.SD)),
  .SDcols = c(3, 2), by = key(xDT)]
xDT[,numparam_modelresult := do.call(numparam, unname(.SD)),
  .SDcols = c(4, 2), by = key(xDT)]

And if you want to get rid of the columns modeltrue and modelresult,

xDT[,c("modeltrue", "modelresult") := NULL, with = FALSE]
# NOTE that with = FALSE shouldn't be necessary with data.table 1.8.3
# But I'm still with 1.8.2

11 Comments

+1 Could the res1<- and res2<- steps each be done in one := by group; e.g., xDT[, numparam_modeltrue := do.call(numparam, unname(.SD)), .SDcols = c(3, 2), by = key(xDT)] directly to save the res1[res2]?
@MatthewDowle, That would be a possibility, but the two functions refer to different sets of columns, and I defined .SDcols to correspond appropriately. An attempt at subsetting .SD didn't work out yet...
I edited my comment a few times, apols. I mean two :=-by-group, and no res1[res2].
Ah yes. Added your suggested alternative, @MatthewDowle
Cool. Nice to see you're up to speed with 1.8.3.
1

Alternative approach using sapply:

numparamvec <- function(rownum, colnum, modeltrue, modelresult)
  {
    totnum1 = numparam(modeltrue, as.integer(colnum))
    totnum2 = numparam(modelresult, as.integer(colnum))
    return(c(rownum = rownum, colnum = colnum,
      numparam_modeltrue = totnum1, numparam_modelresult = totnum2))
  }

val <- sapply(seq_len(nrow(x)),
  function(y) do.call(numparamvec, x[y, ]))

> as.data.frame(t(val))
  rownum colnum numparam_modeltrue numparam_modelresult
1    100     10                 40                   48
2    100     20                 80                   88

Alternative approach using vapply:

val <- t(vapply(seq_len(nrow(x)), function(y) do.call(numparamvec, x[y, ]),
  c(rownum = 0, colnum = 0, numparam_modeltrue = 0, numparam_modelresult = 0)))

> val
     rownum colnum numparam_modeltrue numparam_modelresult
[1,]    100     10                 40                   48
[2,]    100     20                 80                   88

2 Comments

Thanks for the update. I suggest merging your edit to my answer with this answer (and removing it from there), since they are so similar. I think I slightly prefer the vapply version, because, if I understand correctly, it does some validation on the input.
@FaheemMitha, Good Suggestion. Changes made.
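
As a small illustration of the check that vapply performs, here is a minimal sketch. Note that the check applies to the value returned by the function, which must match the FUN.VALUE template in type and length. The function row_summary is a hypothetical stand-in, not part of the code above.

```r
# Hypothetical per-index function returning a named numeric vector of length 2.
row_summary <- function(i) c(index = i, square = i^2)

# FUN.VALUE is a template: here, a numeric vector of length 2.
# The result is a matrix with one column per element of the input.
ok <- vapply(1:3, row_summary, c(index = 0, square = 0))

# A template of the wrong length makes vapply stop with an error
# instead of quietly returning a differently shaped result.
bad <- tryCatch(
  vapply(1:3, row_summary, c(index = 0)),
  error = function(e) "error"
)
```

Because the result is a matrix with one column per input element, it still has to be transposed (and wrapped in as.data.frame, if a data frame is wanted), as in the answer above.
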
1

The following code sort of works. It is not very pretty, though, and suggestions for improvement are welcome. In particular, it would be nice not to have to transpose the matrix and add the column names. Also, since it returns a matrix, there is still the annoying issue that the integers are converted to strings. Thanks to flodel for the tip regarding his answer to "Pass arguments to a function from each row of a matrix".

completemodel <- function(model, colnum)
{
  modlst = c()
  tuplenum = length(model)
  if(tuplenum != 0)
    for(i in 1:tuplenum)
      modlst = c(modlst, model[[i]])
  index = seq(0, colnum-1)
  inddiff = setdiff(index, modlst)
  inddifflen = length(inddiff)
  for(i in seq(length.out=inddifflen))
    model = append(model, inddiff[i])
  return(model)
}

##   Calculate number of parameters in model.
numparam <- function(mod, colnum)
  {
    library(RJSONIO)
    mod = fromJSON(mod)
    print(paste("mod", mod))
    mod = completemodel(mod, colnum)
    totnum = 0
    for(tup in mod)
      totnum = totnum +(4**length(tup))
    return(totnum)
  }

numparamvec <- function(rownum, colnum, modeltrue, modelresult)
  {
    totnum1 = numparam(modeltrue, as.integer(colnum))
    totnum2 = numparam(modelresult, as.integer(colnum))
    return(c(rownum, colnum, totnum1, totnum2))
  }

x = cbind.data.frame(rownum=c(100, 100), colnum=c(10, 20), modeltrue=c("[]", "[]"), modelresult=c("[[1,2]]","[[1,3]]"), stringsAsFactors=FALSE)
val = t(apply(x, 1, function(x)do.call(numparamvec, as.list(x))))
colnames(val) = c("rownum", "colnum", "numparam_modeltrue", "numparam_modelresult")
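
One possible way to avoid both the transpose and the string conversion is to walk the relevant columns in parallel with mapply and build the data frame directly. The sketch below is only an illustration under assumptions: toy_numparam is a hypothetical stand-in so that the example runs without RJSONIO, and it would be replaced by the real numparam.

```r
# Hypothetical stand-in for numparam() so the sketch runs without RJSONIO;
# swap in the real numparam(mod, colnum) in actual use.
toy_numparam <- function(mod, colnum) nchar(mod) + colnum

x <- data.frame(rownum = c(100, 100), colnum = c(10, 20),
                modeltrue = c("[]", "[]"),
                modelresult = c("[[1,2]]", "[[1,3]]"),
                stringsAsFactors = FALSE)

# mapply() calls the function once per row, pairing the i-th model
# string with the i-th colnum. Numeric columns are never coerced.
res <- data.frame(
  rownum = x$rownum,
  colnum = x$colnum,
  numparam_modeltrue =
    mapply(toy_numparam, x$modeltrue, x$colnum, USE.NAMES = FALSE),
  numparam_modelresult =
    mapply(toy_numparam, x$modelresult, x$colnum, USE.NAMES = FALSE)
)
```

Since each column is passed as its own vector, no row is ever flattened into a single character vector, so rownum and colnum stay numeric and the column names come from the data.frame call.
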

5 Comments

@BenBarnes: The resulting data frame, however, has integers as strings. Should one just run a converter over the data frame, or is there a better way to handle this?
The problem with conversion of the integers to character happens when using the apply function, which calls as.matrix on 2-D objects. If there are any non-numeric, -complex, or -logical data in the data.frame, as.matrix coerces your data to character. vapply allows you to specify the format of the output. (I'll add an example to your post.)
And the result is a matrix, not a data.frame!
@BenBarnes: Thanks for the improved version. I had trouble with the FUN.VALUE argument. It seems from the (scant) documentation that this gives the type and length of the return value from FUN. However, the rules are not precisely stated. I tried using data.frame(rownum = 0, colnum = 0, numparam_modeltrue = 0, numparam_modelresult = 0) but got an error message. The idea was to get a data frame returned instead of a matrix.
@BenBarnes: I suggest you split off your version into a separate answer. That way people can upvote it, and I might choose it as my preferred answer. Except for having to transpose and convert to data frame, it looks good.
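
The coercion described in the comments above can be seen in a minimal sketch. The data frame mixed is made up for illustration and is not part of the original code.

```r
# Illustrative data frame with one numeric and one character column.
mixed <- data.frame(n = c(10, 20), label = c("a", "b"),
                    stringsAsFactors = FALSE)

# apply() first converts the data frame with as.matrix(); a single
# character column is enough to turn every cell into a string.
m <- as.matrix(mixed)

# Each row therefore reaches the function as a character vector.
row_types <- apply(mixed, 1, function(r) class(r))

# Iterating over row indices instead keeps each column's own type.
n_types <- vapply(seq_len(nrow(mixed)),
                  function(i) class(mixed$n[i]), character(1))
```

This is why the sapply and vapply versions, which index rows of the data frame rather than passing it through apply, return numbers instead of strings.
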
