I have a list consisting of 3 elements:

datalist=list(a=datanew1,b=datanew2,c=datanew3)

datalist$a :

      Inv_ret Firm size  leverage        Risk  Liquidity Equity
17  0.04555968  17.34834 0.1323199 0.011292273 0.02471489      0
48  0.01405835  15.86315 0.6931730 0.002491093 0.12054914      0
109 0.04556252  16.91602 0.1714068 0.006235836 0.01194579      0
159 0.04753472  14.77039 0.3885720 0.007126830 0.06373028      0
301 0.03941040  16.94377 0.1805346 0.005450653 0.01723319      0

datalist$b :

      Inv_ret Firm size   leverage        Risk  Liquidity      Equity
31  0.04020832  18.13300 0.09326265 0.015235240 0.01579559 0.005025379
62  0.04439078  17.84086 0.11016402 0.005486982 0.01266566 0.006559096
123 0.04543250  18.00517 0.12215307 0.011154742 0.01531451 0.002282790
173 0.03960613  16.45457 0.10828643 0.011506857 0.02385191 0.009003780
180 0.03139643  17.57671 0.40063094 0.003447233 0.04530395 0.000000000

datalist$c :

   Inv_ret Firm size   leverage       Risk   Liquidity      Equity
92  0.03081029  19.25359 0.10513159 0.01635201 0.025760806 0.000119744
153 0.03280746  19.90229 0.11731517 0.01443786 0.006769735 0.011999005
210 0.04655847  20.12543 0.11622403 0.01418010 0.003125632 0.003802365
250 0.03301018  20.67197 0.13208234 0.01262499 0.009418828 0.021400052
282 0.04355975  20.03012 0.08588316 0.01918129 0.004213846 0.023657440

I am trying to run cor.test on every pair of columns in each element of the datalist above:

Cor.tests=sapply(datalist,function(x){ 
  for(h in 1:length(names(x))){

    for(i in 1:length(names(x$h[i]))){
      for(j in 1:length(names(x$h[j]))){
      cor.test(x$h[,i],x$h[,j])$p.value 


    }}}})

But I get an error :

Error in cor.test.default(x$h[, i], x$h[, j]) : 
  'x' must be a numeric vector

Any suggestions about what I am doing wrong?

P.S. If I simply have one dataframe, datanew1 :

      Inv_ret Firm size  leverage        Risk  Liquidity Equity
17  0.04555968  17.34834 0.1323199 0.011292273 0.02471489      0
48  0.01405835  15.86315 0.6931730 0.002491093 0.12054914      0
109 0.04556252  16.91602 0.1714068 0.006235836 0.01194579      0
159 0.04753472  14.77039 0.3885720 0.007126830 0.06373028      0
301 0.03941040  16.94377 0.1805346 0.005450653 0.01723319      0

I use this loop :

results=matrix(NA,nrow=6,ncol=6)
for(i in 1:length(names(datanew1))){
  for(j in 1:length(names(datanew1))){
    results[i,j]<-cor.test(datanew1[,i],datanew1[,j])$p.value 


}}

And the output is:

results :
             [,1]         [,2]         [,3]         [,4]         [,5]        [,6]
[1,] 0.000000e+00 7.085663e-09 3.128975e-10 3.018239e-02 4.806400e-10 0.475139526
[2,] 7.085663e-09 0.000000e+00 2.141581e-21 0.000000e+00 2.247825e-20 0.454032499
[3,] 3.128975e-10 2.141581e-21 0.000000e+00 2.485924e-25 2.220446e-16 0.108643838
[4,] 3.018239e-02 0.000000e+00 2.485924e-25 0.000000e+00 5.870007e-15 0.006783324
[5,] 4.806400e-10 2.247825e-20 2.220446e-16 5.870007e-15 0.000000e+00 0.558827862
[6,] 4.751395e-01 4.540325e-01 1.086438e-01 6.783324e-03 5.588279e-01 0.000000000

Which is exactly what I want. But I want to get 3 matrices, one for each element of the datalist above.

EDIT: If I do as Joran says:

Cor.tests=lapply(datalist,function(x){ 

  results=matrix(NA,nrow=6,ncol=6)
  for(i in 1:length(names(x))){
    for(j in 1:length(names(x))){
      results[i,j]<-cor.test(x[,i],x[,j])$p.value 
    }}})

I get:

$a
NULL

$b
NULL

$c
NULL

2 Answers

This can be done without for loops.

1) A solution with base R:

lapply(datalist,
       function(datanew) outer(seq_along(datanew),
                               seq_along(datanew),
                               Vectorize(function(x, y)
                                            cor.test(datanew[ , x],
                                                     datanew[ , y])$p.value)))

2) A solution with the package psych:

library(psych)
lapply(datalist, function(datanew) corr.test(datanew)$p)

A modified version of the approach in the question:

lapply(datalist, function(x) { 
                    results <- matrix(NA,nrow=6,ncol=6)
                    for(i in 1:6){
                       for(j in 1:6){
                          results[i,j]<-cor.test(x[,i],x[,j])$p.value 
                       }
                    }
                    return(results)
                 })

There were two major problems in these commands:

  1. The matrix results was not returned. I added return(results) to the function.

  2. You want to have a 6 by 6 matrix whereas your data frames have seven columns. I replaced 1:length(names(x)) with 1:6 in the for loops.
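The first point is the reason for the NULL results: a for loop in R evaluates to NULL, and a function returns the value of its last evaluated expression. A minimal sketch of the same fix that takes the matrix size from ncol(x) instead of hard-coding it follows. The names demo_list, make_df and p_matrix are made up for illustration, and the data are random numbers, not the data from the question.

```r
# Synthetic stand-in for datalist (assumption): three data frames,
# each with six numeric columns of random values.
set.seed(1)
make_df <- function(n = 20, k = 6) {
  as.data.frame(matrix(rnorm(n * k), nrow = n, ncol = k))
}
demo_list <- list(a = make_df(), b = make_df(), c = make_df())

# Pairwise cor.test p-values for all columns of one data frame.
p_matrix <- function(x) {
  k <- ncol(x)                          # size taken from the data
  results <- matrix(NA_real_, nrow = k, ncol = k,
                    dimnames = list(names(x), names(x)))
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      results[i, j] <- cor.test(x[, i], x[, j])$p.value
    }
  }
  results                               # last expression, so this is returned
}

p_values <- lapply(demo_list, p_matrix)
str(p_values, max.level = 1)            # a list of three 6 x 6 matrices
```

Without the final results line, the last expression in the function body would be the for loop, and each list element would come back as NULL.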


4 Comments

Thanks for the alternative solution Sven! But I am trying to understand why the lapply solution in my question does not work. datalist is of length 3. Therefore, I am trying to create a list of length 3, where each element is cor.test for the 3 elements of datalist. Any suggestions?
Thanks Sven! Great answer! But my data frames have only 6 columns, though.
By the way Sven, do you have some blog about R? If you do, please give me the link, because I would love to read it!:)
@user1665355 I don't have a blog. But if I start a blog, I will let you know.

I'm not going to attempt to provide you with working code, but hopefully what follows will help explain why what you're trying isn't working.

Let's look at the first few lines of your sapply call:

Cor.tests=sapply(datalist,function(x){ 
  for(h in 1:length(names(x))){
    for(i in 1:length(names(x$h[i]))){

Let's stop here and think for a moment about x$h[i]. At this point, x is the argument passed to your anonymous function in sapply (presumably either a data frame or a matrix; I can't be sure from your question which it is).

At this point in your code, what is h? h is the index variable in the previous for loop, so initially h has the value 1. The $ operator is for selecting items from an object by name. Is there something in x named h? I think not.
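To make the difference concrete, here is a small sketch. The data frame is a made-up two-column example (not your data), and h is set by hand rather than by a loop.

```r
# Hypothetical data frame for illustration only.
x <- data.frame(Inv_ret = c(0.04, 0.01, 0.05),
                Risk    = c(0.011, 0.002, 0.006))
h <- 1

x$h       # looks for a column literally named "h": there is none, so NULL
x[[h]]    # uses the value of h (1): the first column, Inv_ret
x[, h]    # the same column via matrix-style indexing
```

So $ never looks at the value stored in the variable h; [[ ]] or [ , ] is what you want when the column is chosen by an index variable.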

But then things get even worse as you attempt to select the ith element within this nonexistent thing named h inside x. I'm honestly not even sure what R's interpreter will do with that, since you're referencing the variable i in the expression that is supposed to define the range of values for i. Circular, anyone?

If you simply remove all attempts at the third for loop, you should have more luck. Just take the working version, plop it down in the body of the anonymous function, and replace every occurrence of datanew1 with x.

Good luck.

(PS - You might be happier with the output of lapply rather than sapply)
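A quick sketch of why, using placeholder inputs rather than your data (demo_list and make_matrix are invented names): when every call returns a matrix of the same size, sapply simplifies the results by flattening each matrix into one column, while lapply keeps them as separate matrices.

```r
demo_list <- list(a = 1, b = 2, c = 3)        # placeholder inputs
make_matrix <- function(v) matrix(v, nrow = 6, ncol = 6)

as_list   <- lapply(demo_list, make_matrix)   # list of three 6 x 6 matrices
flattened <- sapply(demo_list, make_matrix)   # one matrix, one column per element

dim(as_list$a)    # 6 6
dim(flattened)    # 36 3
```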

1 Comment

I tried it, joran, but I get a NULL result for each of the datalist elements... What am I missing?
