I have a pair of binary variables (1s and 0s), and my professor wants me to create a new binary variable that takes the value 1 if both of the original variables equal 1 (i.e., x = 1 and y = 1) and 0 otherwise.

How would I do this in R?

Thanks! JMC

1 Answer

Here's one example with some sample data to play with:

set.seed(1)
A <- sample(0:1, 10, replace = TRUE)
B <- sample(0:1, 10, replace = TRUE)
A
#  [1] 0 0 1 1 0 1 1 1 1 0
B
#  [1] 0 0 1 0 1 0 1 1 0 1

as.numeric(A + B == 2)
#  [1] 0 0 1 0 0 0 1 1 0 0

as.numeric(rowSums(cbind(A, B)) == 2)
#  [1] 0 0 1 0 0 0 1 1 0 0

as.numeric(A == 1 & B == 1)
#  [1] 0 0 1 0 0 0 1 1 0 0
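
If the two variables live in a data frame rather than in separate vectors, the same comparison can be assigned straight to a new column. This is only a minimal sketch: the data frame `df` and the column names `x`, `y`, and `both` are hypothetical, chosen to match the wording of the question.

```r
# Hypothetical data frame covering all four combinations of two binary columns
df <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 0, 1))

# New column: 1 where both x and y equal 1, 0 otherwise
df$both <- as.numeric(df$x == 1 & df$y == 1)

df$both
# [1] 0 0 0 1
```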

Update (to introduce some more alternatives and share a link and a benchmark)

set.seed(1)
A <- sample(0:1, 1e7, replace = TRUE)
B <- sample(0:1, 1e7, replace = TRUE)

fun1 <- function() ifelse(A == 1 & B == 1, 1, 0)
fun2 <- function() as.numeric(A + B == 2)
fun3 <- function() as.numeric(A & B)
fun4 <- function() as.numeric(A == 1 & B == 1)
fun5 <- function() as.numeric(rowSums(cbind(A, B)) == 2)

library(microbenchmark)
microbenchmark(fun1(), fun2(), fun3(), fun4(), fun5(), times = 5)
# Unit: milliseconds
#   expr       min        lq    median        uq        max neval
# fun1() 4842.8559 4871.7072 5022.3525 5093.5932 10424.6589     5
# fun2()  220.8336  220.9867  226.1167  229.1225   472.4408     5
# fun3()  440.7427  445.9342  461.0114  462.6184   488.6627     5
# fun4()  604.1791  613.9284  630.4838  645.2146   682.4689     5
# fun5()  373.8088  373.8532  373.9460  435.0385  1084.6227     5

As the benchmark shows, ifelse is much slower than the other approaches mentioned here: its median time is roughly 8 to 22 times longer. See this SO question and answer for some more details about the efficiency of ifelse.
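
Before choosing the fastest variant, it may be worth confirming that all five expressions return the same result. The check below is a rough sketch on a small hand-made input; the names `results` and `all_same` are made up for illustration and are not part of the benchmark.

```r
# Small input covering all four combinations of 0 and 1
A <- c(0, 0, 1, 1)
B <- c(0, 1, 0, 1)

# The five expressions used in fun1() to fun5()
results <- list(
  ifelse(A == 1 & B == 1, 1, 0),
  as.numeric(A + B == 2),
  as.numeric(A & B),
  as.numeric(A == 1 & B == 1),
  as.numeric(rowSums(cbind(A, B)) == 2)
)

# TRUE only if every result is identical to the first one
all_same <- all(vapply(results, identical, logical(1), results[[1]]))
all_same
# [1] TRUE
```

Note that `A & B` treats any non-zero value as TRUE, so that variant agrees with the others only when the inputs really are restricted to 0 and 1.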

9 Comments

Thanks! I wasn't familiar with the rowSums() function, which made the problems far easier.
It's interesting to me that you approach the issue in this way. I've seen this style of solution a number of times on SO. I think ifelse() is more intuitive, especially if someone has done some basic programming before coming to R (not a criticism). Is there a simple reason you prefer these types of solutions?
@gung, because R is vectorized and these approaches are likely to be faster :-)
Isn't ifelse() vectorized? I thought it was. I know that if(){...} else {...} is not vectorized, & I try not to use it.
@gung, I'm out right now, but will update later with some benchmarks. I seem to remember a question about this too, so I'll try to track that down for you.