Background
I have some financial data (1.5 years of S&P 500 stocks) that I have manipulated into a wide format using the data.table package. After following the whole data.table course on Datacamp, I'm starting to get the hang of the basics, but after searching for hours I'm at a loss as to how to do this.
The Problem
The data contains columns with financial data for each stock. I need to delete columns that contain two consecutive NAs.
My guess is that I have to use rle() and lapply() to find consecutive values, and DT[, x := NULL] to delete the columns.
I read that rle() doesn't treat consecutive NAs as a single run, so I changed them to Inf instead (which is why the example data below shows Inf rather than NA).
I just don't know how to combine the functions so that I can efficiently remove a few columns among the 460 that I have.
An answer using data.table would be great, but anything that works well is very much appreciated.
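For reference, a minimal sketch of how these pieces might be combined. It assumes the missing values are left as NA (rle() is run on the logical vector from is.na(), so no Inf placeholder is needed); the table name `DT`, the helper `has_na_run`, and the toy values are hypothetical stand-ins for the real data.

```r
library(data.table)

# Toy data mirroring the example (NA marks a missing return; `DT` is a hypothetical name)
DT <- data.table(
  date    = as.Date(c("2012-07-02", "2012-07-03", "2012-07-05",
                      "2012-07-06", "2012-07-09")),
  `10104` = c(0.003199, 0.005873, NA, NA, -0.002742),
  `10107` = c(NA, 0.006545, -0.001951, -0.016775, -0.006129),
  `10138` = c(0.001112, 0.001428, -0.011090, -0.009612, -0.001294),
  `10145` = c(-0.012178, NA, NA, NA, 0.005830)
)

# Hypothetical helper: TRUE if x has a run of at least n consecutive NAs.
# rle() sees only TRUE/FALSE here, so NA handling inside rle() is not an issue.
# If numeric columns hold Inf placeholders instead, is.infinite(x) could be used.
has_na_run <- function(x, n = 2L) {
  r <- rle(is.na(x))
  any(r$values & r$lengths >= n)
}

# Flag the offending columns, then drop them by reference
flagged   <- vapply(DT, has_na_run, logical(1))
drop_cols <- names(DT)[flagged]
if (length(drop_cols) > 0L) DT[, (drop_cols) := NULL]

names(DT)
```

On the toy data this keeps date, 10107 and 10138, since 10107 only has a single isolated NA.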
Alternatively, I would love to know how to remove columns containing at least one NA.
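A minimal sketch of that simpler variant, assuming missing values are stored as NA; the table `DT` and its columns are made up for illustration.

```r
library(data.table)

# Hypothetical toy table: column a is complete, b and c contain NAs
DT <- data.table(a = c(1, 2, 3), b = c(1, NA, 3), c = c(NA, NA, 3))

# anyNA() returns TRUE as soon as a column contains at least one NA
na_cols <- names(DT)[vapply(DT, anyNA, logical(1))]

# Drop those columns by reference (skipped when nothing is flagged)
if (length(na_cols) > 0L) DT[, (na_cols) := NULL]

names(DT)
```

Only column a should remain in this example.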
Example data
> test[1:5,1:5,with=FALSE]
date 10104 10107 10138 10145
1: 2012-07-02 0.003199 Inf 0.001112 -0.012178
2: 2012-07-03 0.005873 0.006545 0.001428 Inf
3: 2012-07-05 Inf -0.001951 -0.011090 Inf
4: 2012-07-06 Inf -0.016775 -0.009612 Inf
5: 2012-07-09 -0.002742 -0.006129 -0.001294 0.005830
> dim(test)
[1] 377 461
Desired outcome
date 10107 10138
1: 2012-07-02 Inf 0.001112
2: 2012-07-03 0.006545 0.001428
3: 2012-07-05 -0.001951 -0.011090
4: 2012-07-06 -0.016775 -0.009612
5: 2012-07-09 -0.006129 -0.001294
PS. This is my first question and I have tried to adhere to the rules. If I need to change anything, please let me know.