1

I have a large data set (200 rows and 5 columns) in .csv format. Here is part of the data set:

 4.1   1.2    47.3   10954  51
 3.4   1.5    0.5    1      5316
 0.3   30.1   1.2    10     875
 0.2   0.4    119    0      0
 0     52.6   0.1    0      3.1
 0     0.3    880    0      0
 0     0.1    148    180    0
 0     0.1    490.2  0      0.4
 0     1.1    0.2    0.6    0.9
 0     0      0      0      0

I want to write code that reads each block of 10 rows separately and stores it in a 10-by-5 matrix using a for loop, so that at the end I have 20 matrices, each 10 by 5. This is my code:

all.data   <- read.csv("C:\\Users\\Desktop\\myarray.csv",header=FALSE)#read whole data
for (k in 1:20){   
data_temp.k <- array(NA, dim=c(10,5))
  for( i in 1:10 ){
    for( j in 1:5 ) {
        data_temp.k[i,j] <- all.data[(k-1)*10:k*10,j]
    }
  }
}
write.csv(data_temp.k,"mymatrix.k")

I know the problem is somehow related to "k" and its dual function here as both matrix index and counter.

3 Answers

5

Don't use a loop for this; use row indexing:

## Sample data
set.seed(1)
m <- matrix(rnorm(1000),nrow=200,ncol=5)
## Generate indices to keep
indices <- seq(1,nrow(m), by=10)
## Subset matrix rows
m[indices,]
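
Note that m[indices,] keeps only every tenth row (rows 1, 11, 21, ...). If the goal is blocks of ten consecutive rows, one possible extension is to treat the same indices as block start positions and collect the blocks in a list. This is only a sketch: the names starts and chunks are made up for illustration and are not part of the answer above.

```r
## Sample data, same as above
set.seed(1)
m <- matrix(rnorm(1000), nrow = 200, ncol = 5)

## Start row of each block: 1, 11, 21, ..., 191
starts <- seq(1, nrow(m), by = 10)

## One 10 x 5 matrix per start row; drop = FALSE keeps the matrix shape
chunks <- lapply(starts, function(s) m[s:(s + 9), , drop = FALSE])

length(chunks)     # 20
dim(chunks[[1]])   # 10 5
```

chunks[[k]] then plays the role of the k-th matrix, so no separately named objects are needed.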

1 Comment

I like your answer, but as I mentioned, I need an index number for each matrix, so that at the end I have m.1, m.2, ..., m.20, each of which stores one chunk of the original matrix.
2

This probably doesn't add much, other than being a nice demonstration of how you can use array() and aperm() to split a matrix into chunks and reshape it, all with base R vectorised functions. You can then apply functions over any dimension of the array using apply.

#  Sample data
m <- matrix( 1:16 , 4 , 4 )
#     [,1] [,2] [,3] [,4]
#[1,]    1    5    9   13
#[2,]    2    6   10   14
#[3,]    3    7   11   15
#[4,]    4    8   12   16

# Use array() to reshape into a 3D array and aperm() to permute its dimensions into the result you expect
out <- aperm( array( t(m) , c(4,2,2) ) , c(2,1,3) )
#, , 1
#     [,1] [,2] [,3] [,4]
#[1,]    1    5    9   13
#[2,]    2    6   10   14

#, , 2
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    4    8   12   16

You can apply functions over the third dimension, e.g. using 'apply'

#  Sum all the elements in each of the third dimension of your arrays
apply( out , 3 , sum )
#[1] 60 76
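
For the 200 x 5 case in the question, the same idea would presumably look like the sketch below. The input matrix here is a made-up stand-in filled with 1:1000 so that the result is easy to check; only the dimensions passed to array() change.

```r
#  Stand-in for the real data: 200 rows, 5 columns (assumed)
m <- matrix(seq_len(1000), nrow = 200, ncol = 5)

#  t(m) lays the values out row by row, so each 5 x 10 slice holds
#  10 consecutive rows of m; aperm() flips each slice back to 10 x 5
out <- aperm(array(t(m), c(5, 10, 20)), c(2, 1, 3))

dim(out)
#[1] 10  5 20

#  Slice k is rows (k-1)*10 + 1 to k*10 of the original matrix
identical(out[, , 2], m[11:20, ])
#[1] TRUE
```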

5 Comments

This is cool, but at the end of the day I want to have, for example, 20 different matrices in this format: Out.1, Out.2, ..., Out.20, each with a different index and each holding one chunk of the original matrix.
@user2607526 I am compelled to ask... why? Almost always it is better to have 20 matrices residing in a single list (or an array) which you can operate on rather freely (once you know how) than cluttering up your workspace with 20 objects. What do you want to do with them?
@SimonO101 I want to use another for loop to read each of them and convert them to netcdf format. That's why I need them to have an index, so that I can recall them in another for loop. Do you know a better way to recall them?
@user2607526 If you put them all in a list, you can call them by their position in the list rather than by name (which is sort of inefficient)...it would also then be very easy to use something like lapply to do your conversion instead of another loop.
Sure. I like for loops for writing files out, because I think of for loops as being for the situation where you expect a side effect rather than a return value (the side effect is the file being written to disk). So if you happen to have an array like above... for( i in 1:dim(out)[3]){ netcdf( out[,,i] , .... ) } so out[,,i] will refer to each of the matrices (and netcdf is your file writing function, which I am not familiar with).
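
The list-based approach suggested in these comments could be sketched roughly as follows. Everything here is illustrative: the data frame is a made-up stand-in for the CSV, write.csv takes the place of the netcdf writer (which is not shown in this thread), and the files go to a temporary directory.

```r
# Stand-in for the real data: 200 rows, 5 columns (assumed, as in the question)
all.data <- as.data.frame(matrix(seq_len(1000), nrow = 200, ncol = 5))

# Label each row with its block number (1 repeated 10 times, then 2, ..., 20)
block_id <- rep(seq_len(20), each = 10)

# split() returns a list of 20 data frames, one per block
chunks <- split(all.data, block_id)

# A for loop used purely for its side effect: one file per block.
# write.csv is a placeholder for whatever writer is actually needed.
out_dir <- tempdir()
for (k in seq_along(chunks)) {
  out_file <- file.path(out_dir, paste0("mymatrix.", k, ".csv"))
  write.csv(chunks[[k]], out_file, row.names = FALSE)
}
```

chunks[[k]] is addressed by position, so the second loop needs no objects named Out.1, Out.2 and so on.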
0

If you insist on using a for loop, though, you can at least use a single loop instead of three nested ones.

You don't need j, because you want to keep all columns in each matrix. E.g. mat[1,] selects row 1 and all columns; you don't need to write mat[1,1:ncol(mat)].

Also, the way you use i is unnecessary, because on every pass you select more than one row (using (k-1)*10 etc.) and try to assign them all to the single row i.

Finally, if you're trying to save each of the 20 matrices, you might need paste.

This should work (not tested):

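for(k in 1:20)
 {
  # Rows (k-1)*10 + 1 to k*10 form the k-th block of 10 rows
  data_temp.k <- all.data[((k-1)*10 + 1):(k*10),]

  # paste() builds a different file name for each block: mymatrix.1, ..., mymatrix.20
  write.csv(data_temp.k, paste("mymatrix", k, sep = "."))
 }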
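
As a side note on why the row index in the question misbehaves: in R the : operator binds more tightly than *, so (k-1)*10:k*10 is read as (k-1) * (10:k) * 10 rather than as a range of rows. A small check, using k <- 2 as an arbitrary example value:

```r
k <- 2

# As written in the question: parsed as (k - 1) * (10:k) * 10
wrong <- (k - 1) * 10:k * 10
print(wrong)
# [1] 100  90  80  70  60  50  40  30  20

# With explicit parentheses and the + 1 offset: rows 11 to 20
right <- ((k - 1) * 10 + 1):(k * 10)
print(right)
# [1] 11 12 13 14 15 16 17 18 19 20
```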
