Apply function on multiple rows of matrix

Question

Given a matrix, is there a way to apply a function to its rows in groups, a fixed number of rows at a time?

As an example: I might want to solve a least-squares problem using QR decomposition for every ten of my thousand rows. This might look like:

set.seed(128)
f <- function(x) x^2 -x + 1
x <- runif(1000, -1, 1)
y <- f(x) + rnorm(1000, 0, 0.2)

morpheus <- cbind(1,x,x^2)
# apply qr.solve(morpheus, y) 100 times on 10 rows at a time 
# in such way that the correspondence between morpheus and y is not broken

Would anybody know how this problem could be solved? If possible, I'd prefer an approach using some form of apply or another functional solution, but any help is welcome.

DatamineR · Accepted Answer · 2015-11-14 20:18:15Z

3

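Using dplyr. The data must first be put into a data frame with named columns, because group_by() and the .$ column references do not work on the matrix from the question. Note that rep(1:10, 100) forms 10 interleaved groups of 100 rows each; use rep(1:100, each = 10) to get 100 blocks of 10 consecutive rows.

library(dplyr)
# x and y as defined in the question; const and x_sq are the intercept and squared terms
morpheus <- data.frame(const = 1, x = x, x_sq = x^2, y = y)
morpheus %>% group_by(rep(1:10, 100)) %>% do(as.data.frame(rbind(qr.solve(cbind(.$const, .$x, .$x_sq), .$y))))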
Source: local data frame [10 x 4]
Groups: rep(1:10, 100)

   rep(1:10, 100)        V1         V2        V3
1               1 1.0410480 -0.9616138 0.8777193
2               2 0.9883532 -0.9751688 1.0431504
3               3 1.0263414 -1.0053184 0.8811848
4               4 1.0114099 -1.0024364 0.9341063
5               5 1.0059417 -0.9694164 0.9322200
6               6 1.0501467 -1.0186771 0.9048468
7               7 0.9748101 -1.0045796 1.0932815
8               8 0.9784629 -0.9572418 1.0008312
9               9 0.9559010 -1.0271767 1.0823086
10             10 0.9435522 -1.0583352 1.0804009
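
The output above has 10 groups of 100 rows, whereas the question asks for 100 groups of 10 rows. A minimal sketch of that variant follows, reusing the same do() pattern. The column names const, x_sq and block are illustrative assumptions, not names from the question.

```r
library(dplyr)

set.seed(128)
f <- function(x) x^2 - x + 1
x <- runif(1000, -1, 1)
y <- f(x) + rnorm(1000, 0, 0.2)

# Illustrative data frame: block labels rows 1-10 as 1, rows 11-20 as 2, and so on
dat <- data.frame(const = 1, x = x, x_sq = x^2, y = y,
                  block = rep(1:100, each = 10))

# One least-squares fit per block; each fit becomes one row of the result
coefs <- dat %>%
  group_by(block) %>%
  do(as.data.frame(rbind(qr.solve(cbind(.$const, .$x, .$x_sq), .$y))))
```

Because y is stored as a column of the same data frame, each block's response values stay aligned with its design-matrix rows.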

answered Nov 14, 2015 at 20:18

Zbynek · Accepted Answer · 2015-11-15 09:18:23Z

2

I think the simplest solution, apart from a for loop, would be to use by:

f <- function(x) x^2 -x + 1
x <- runif(1000, -1, 1)
y <- f(x) + rnorm(1000, 0, 0.2)

morpheus <- cbind(1,x,x^2,y, rep(1:100,each=10))

by(morpheus[,1:4], morpheus[,5], function(x)qr.solve(x[,1:3],x[,4]))

INDICES: 1
        V1          x         V3 
 1.1359248 -0.7800506  0.6642460 
--------------------------------------------------------------------------------- 
INDICES: 2
        V1          x         V3 
 0.9156199 -1.0999112  1.0019637 
--------------------------------------------------------------------------------- 
INDICES: 3
        V1          x         V3 
 0.9901892 -0.8275427  1.2576495 

### etc.

UPDATE: you can use do.call to get the results into a matrix for further use:

do.call('rbind',as.list(
  by(morpheus[,1:4], morpheus[,5], function(x){
    qr.solve(x[,1:3],x[,4])
  })
))

# results:

          V1          x        V3
1   0.9445907 -1.0655362 0.9471155
2   1.0370279 -0.8100258 0.7440526
3   0.9681344 -0.7442517 0.9108040
### etc.
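
For comparison, here is a base R sketch that returns the coefficient matrix directly, without going through by. It splits the row indices into blocks and uses each block to subset both the design matrix and y. The names block and rows_by_block, and the column names const and x_sq, are assumptions made for illustration.

```r
set.seed(128)
f <- function(x) x^2 - x + 1
x <- runif(1000, -1, 1)
y <- f(x) + rnorm(1000, 0, 0.2)

morpheus <- cbind(const = 1, x = x, x_sq = x^2)

# Label consecutive blocks of 10 rows, then split the row indices by block
block <- rep(seq_len(nrow(morpheus) / 10), each = 10)
rows_by_block <- split(seq_len(nrow(morpheus)), block)

# Solve one least-squares problem per block; the same indices select
# rows of morpheus and elements of y, so the two stay aligned
coefs <- t(vapply(
  rows_by_block,
  function(i) qr.solve(morpheus[i, , drop = FALSE], y[i]),
  numeric(3)
))
```

vapply returns one column per block, so the result is transposed to give one row per block and one column per coefficient.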

edited Nov 15, 2015 at 9:18

answered Nov 14, 2015 at 19:57

2 Comments

Pierre L Over a year ago

Don't use by, it is usually useless for further operations as it is designed to be the last output.

Zbynek Over a year ago

OK, but that could be solved, e.g. with do.call (see the updated answer)

gented · Accepted Answer · 2015-11-14 19:26:51Z

0

If you have an additional variable that labels the set of rows you want to independently apply your function on, you may want to try

 library('data.table')
 iris <- as.data.table(iris)
 iris[,
      apply(.SD,1, mean),
      by = Species
      ]

       Species    V1
  1:    setosa 2.550
  2:    setosa 2.375
  3:    setosa 2.350
  4:    setosa 2.350
  5:    setosa 2.550
 ---                
146: virginica 4.300
147: virginica 3.925
148: virginica 4.175
149: virginica 4.325
150: virginica 3.950

Replace mean with any other function of your choice. The by = argument names the grouping variable, which here would be one that labels each block of ten rows.
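
Applied to the least-squares problem from the question, the same by = grouping might look like the sketch below. The column names const, x_sq and grp are assumptions for illustration; grp labels each block of ten consecutive rows.

```r
library(data.table)

set.seed(128)
f <- function(x) x^2 - x + 1
x <- runif(1000, -1, 1)
y <- f(x) + rnorm(1000, 0, 0.2)

# Illustrative table: design-matrix columns, response, and a block label
dt <- data.table(const = 1, x = x, x_sq = x^2, y = y,
                 grp = rep(1:100, each = 10))

# Inside j, the column names refer to the rows of the current group only;
# as.list() turns the three coefficients into three result columns
coefs <- dt[, as.list(qr.solve(cbind(const, x, x_sq), y)), by = grp]
```

The result has one row per group, with the grouping column followed by one column per coefficient.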

answered Nov 14, 2015 at 19:26

2 Comments

Zbynek Over a year ago

In that case I would suggest aggregate, or the dplyr or plyr packages, which are faster

gented Over a year ago

@Zbynek aggregate, as well as dplyr and plyr, are by no means faster: see github.com/Rdatatable/data.table/wiki/Benchmarks-:-Grouping.
