2

I have a dataframe like this:

 M2 <- matrix(c(1,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0),nrow=7, 
  dimnames=list(LETTERS[1:7],NULL)) 

I would like to select rows based on the values in multiple columns. For instance, to select rows based on only two columns, I did:

 ans <- M2[which(M2[,1]==0 & M2[,2]==0), ]

But how do I select only those rows that are zero in three or four columns, say columns 1, 3, and 4, or columns 1, 2, 3, and 4?

    You write you have a data.frame, but show a matrix. You don't need which here. Is ans<-M2[M2[,1]==0 & M2[,2]==0 & M2[,4]==0,] sufficient for your needs? Commented Oct 29, 2013 at 18:17

3 Answers

10

Just for fun a solution that works for a data.frame and could be used for a large number of columns:

DF <- as.data.frame(M2)
DF[rowSums(sapply(DF[,c(1,2,4)],`!=`,e2=0))==0,]
#  V1 V2 V3 V4
#B  0  0  0  0
#F  0  0  0  0
#G  0  0  0  0

What happens here?

  1. sapply loops over the columns of the subset DF[,c(1,2,4)]. It applies the function != (not equal) to each column of the subset and compares with 0 (e2 is the second argument of the != function). The result is a matrix of logical values (TRUE/FALSE).
  2. rowSums takes the sum of each row of this logical matrix. Logical values are automatically coerced to 1/0.
  3. We then test if these row sums are 0 (i.e. no value in the row differs from 0, so every selected column is 0 in that row).
  4. The resulting logical vector is used for subsetting the rows.
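The four steps above can be unrolled into separate statements, which makes the intermediate objects easier to inspect. This is only a sketch of the same one-liner; the names `cols`, `nonzero`, `n_nonzero`, `keep` and `result` are made up for illustration.

```r
M2 <- matrix(c(1,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0), nrow = 7,
             dimnames = list(LETTERS[1:7], NULL))
DF <- as.data.frame(M2)

cols <- c(1, 2, 4)                            # columns to test (illustrative name)

# Step 1: one logical column per selected column, TRUE where the value is not 0
nonzero <- sapply(DF[, cols], `!=`, e2 = 0)

# Step 2: count the non-zero entries per row (TRUE is coerced to 1, FALSE to 0)
n_nonzero <- rowSums(nonzero)

# Step 3: a row qualifies when it has no non-zero entry in the selected columns
keep <- n_nonzero == 0

# Step 4: subset the rows with the logical vector
result <- DF[keep, ]
result
```

Printing `nonzero` is a quick way to see why a given row was kept or dropped.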

Of course this is easier and faster with a matrix:

M2[rowSums(M2[,c(1,2,4)] != 0) == 0,]
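One thing to watch with the matrix version: if only a single column is selected, `M2[, 1]` is dropped to a plain vector and `rowSums` refuses it. A possible way around that, assuming you want a reusable function, is to subset with `drop = FALSE`. The helper name `rows_all_zero` is hypothetical and not part of any package.

```r
M2 <- matrix(c(1,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0), nrow = 7,
             dimnames = list(LETTERS[1:7], NULL))

# Hypothetical helper: rows of matrix m that are zero in every column listed in cols
rows_all_zero <- function(m, cols) {
  sub <- m[, cols, drop = FALSE]              # stays a matrix even for one column
  m[rowSums(sub != 0) == 0, , drop = FALSE]   # stays a matrix even for one matching row
}

rows_all_zero(M2, c(1, 2, 4))   # same test as the one-liner above
rows_all_zero(M2, 1)            # a single column also works
```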

4

You could use rowSums:

M2[rowSums(M2[,c(1,2,3,4)]) == 0,]

This gives you all rows where columns 1, 2, 3 and 4 are all zero:

  [,1] [,2] [,3] [,4]
B    0    0    0    0
F    0    0    0    0
G    0    0    0    0

Please note that this won't work if you have both positive and negative numbers in your matrix, because non-zero values in a row can cancel out to a sum of zero.
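To make the caveat concrete, here is a small made-up matrix (not the data from the question) where one row sums to zero without being all zeros. Comparing against 0 before summing, as in `rowSums(M != 0) == 0`, avoids the problem. The names `M`, `by_sum` and `by_count` are just for this illustration.

```r
# Toy data: row "a" contains 1 and -1, row "b" contains only zeros
M <- rbind(a = c(1, -1),
           b = c(0,  0))

# Summing the raw values: row "a" cancels out to 0 and is selected by mistake
by_sum <- rownames(M)[rowSums(M) == 0]

# Counting the non-zero entries instead: only the true all-zero row is selected
by_count <- rownames(M)[rowSums(M != 0) == 0]

by_sum     # "a" "b"
by_count   # "b"
```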

2 Comments

@SeñorO: Thanks, I simplified it.
Try M2[rowSums(M2[,c(1,2,3,4)] != 0) == 0,].
0

Your question is not quite clear to me, but is this what you are looking for?

To select based on the values of columns 1 to 4, you will do the following:

ans <- M2[M2[,1]==0 & M2[,2]==0 & M2[,3]==0 & M2[,4]==0,]

 #> ans
 #  [,1] [,2] [,3] [,4]
 #B    0    0    0    0
 #F    0    0    0    0
 #G    0    0    0    0

This will result in the subset of M2 for which all columns 1 to 4 are zero.
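If the list of columns gets longer, the chain of `&` conditions can be built from a vector of column indices instead of being typed out. One possible sketch uses `lapply` and `Reduce`; the names `cols`, `is_zero` and `keep` are made up here, and the example uses columns 1, 3 and 4 from the question.

```r
M2 <- matrix(c(1,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0), nrow = 7,
             dimnames = list(LETTERS[1:7], NULL))

cols <- c(1, 3, 4)                                   # columns that must all be zero

# One logical vector per column: TRUE where that column is 0
is_zero <- lapply(cols, function(j) M2[, j] == 0)

# Combine them pairwise with &, exactly like the hand-written condition
keep <- Reduce(`&`, is_zero)

ans <- M2[keep, , drop = FALSE]
ans
```

Changing `cols` to `1:4` reproduces the condition written out by hand above.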

1 Comment

I think the point of the question was requesting a method that didn't require a column by column approach.
