R - Function to make a binary variable

Question

I have some variables that take values between 1 and 5. I would like to recode them as 0 if the value is between 1 and 3 (inclusive) and as 1 if the value is 4 or 5.

My dataset looks like this

var1    var2        var3
1       1            NA
4       3            4
3       4            5
2       5            3

So I would like it to be like this:

var1    var2        var3
0       0            NA
1       0            1
0       1            1
0       1            0

I tried to write a function and call it:

making_binary <- function (var){
  var <- factor(var >= 4, labels = c(0, 1))
  return(var)
}


df <- lapply(df, making_binary)

But I got an error: incorrect labels : length 2 must be 1 or 1

Where did I go wrong? Thank you very much for your answers!

Ronak Shah · Accepted Answer · 2020-06-30 07:48:25Z

4

You can use:

df[] <- +(df == 4 | df == 5)
df
#  var1 var2 var3
#1    0    0   NA
#2    1    0    1
#3    0    1    1
#4    0    1    0

The comparison df == 4 | df == 5 returns logical values (TRUE/FALSE); the unary + then converts them to integers (1/0). NA values stay NA.
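To see what the unary plus does in isolation, here is a minimal sketch on a standalone logical vector and matrix. The names flags and m are made up for this illustration and are not part of the question's data.

```r
# Hypothetical logical vector, used only for illustration.
flags <- c(TRUE, FALSE, NA)

# Unary plus coerces logical to integer; NA stays NA.
plus_result <- +flags

# as.integer() gives the same values for a plain vector.
cast_result <- as.integer(flags)

# On a matrix, + keeps the dimensions while as.integer() drops them.
m <- matrix(c(TRUE, FALSE, FALSE, TRUE), nrow = 2)
plus_matrix <- +m
```

Keeping the dimensions is what makes + convenient on a whole data frame comparison, since the result can be assigned straight back with df[] <- ....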

If you want to apply this for selected columns you can subset the columns by position or by name.

cols <- 1:3 #Position
#cols <- grep('var', names(df)) #Name
df[cols] <- +(df[cols] == 4 | df[cols] == 5)
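If the data frame also holds columns that must stay untouched, selecting by name keeps the recoding confined to the intended ones. The sketch below assumes a toy data frame df2 with an id and a note column, both invented here, and uses >= 4 as a shorthand for "4 or 5".

```r
# Toy data: id and note are hypothetical columns that must not change.
df2 <- data.frame(id = 101:104,
                  var1 = c(1L, 4L, 3L, 2L),
                  var2 = c(1L, 3L, 4L, 5L),
                  note = c("a", "b", "c", "d"))

# Pick the columns to recode by name; value = TRUE returns the names themselves.
cols <- grep("^var", names(df2), value = TRUE)

# Only the selected columns are overwritten.
df2[cols] <- +(df2[cols] >= 4)
```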

As for your function, you can do:

making_binary <- function (var){
  var <- as.integer(var >= 4)
  # a faster way to get the same 0/1 values as
  # var <- ifelse(var >= 4, 1, 0)
  return(var)
}

df[] <- lapply(df, making_binary)
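As for why the original factor() call failed: a likely cause (an assumption, since the full data set is not shown) is that at least one column produced only one distinct value for var >= 4, for example a column with no 4s or 5s. factor() then finds a single level and cannot attach two labels to it. Declaring both levels explicitly avoids this. A sketch, where only_low and making_binary_factor are names invented for this example:

```r
# Hypothetical column with no 4s or 5s: var >= 4 is FALSE everywhere.
only_low <- c(1L, 2L, 3L, 2L)

# Reproduce the failure: one level found, two labels supplied.
err <- tryCatch(factor(only_low >= 4, labels = c(0, 1)),
                error = function(e) conditionMessage(e))

# Declaring both levels up front means the two labels always match.
making_binary_factor <- function(var) {
  factor(var >= 4, levels = c(FALSE, TRUE), labels = c(0, 1))
}

fixed <- making_binary_factor(only_low)
```

Note that this version returns a factor with labels "0" and "1" rather than a numeric column, so as.integer(var >= 4) remains the simpler choice when plain numbers are needed.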

data

df <- structure(list(var1 = c(1L, 4L, 3L, 2L), var2 = c(1L, 3L, 4L, 
5L), var3 = c(NA, 4L, 5L, 3L)), class = "data.frame", row.names = c(NA, -4L))

edited Jun 30, 2020 at 7:48

answered Jun 30, 2020 at 7:41


5 Comments

Emeline Over a year ago

I cannot really do that because I have lots of other variables which I do not want to change

Stéphane Laurent Over a year ago

Interesting. Could you please explain what this leading + means?

Nico Over a year ago

@Emeline If you only want to change the first and second columns, change df[] to df[, c(1:2)].

Emeline Over a year ago

Thank you for answering many of my questions and always making it simple for a beginner to understand! I am really improving thanks to you (and others from Stack Overflow!)

Ronak Shah Over a year ago

@Emeline There are ways to apply the function to selected columns only. See the edit to the answer, which shows a couple of them.

Eyayaw · Accepted Answer · 2020-06-30 08:13:44Z

1

I think ifelse would fit the problem well:

df[] <- lapply(df, function(x) ifelse(x >= 1 & x <= 3, 0, x))
df
#  var1 var2 var3
#1    0    0   NA
#2    4    0    4
#3    0    4    5
#4    0    5    0

df[] <- lapply(df, function(x) ifelse(x >= 4 & x <= 5, 1, x))
df
#  var1 var2 var3
#1    0    0   NA
#2    1    0    1
#3    0    1    1
#4    0    1    0

If you need to do the two steps at once, you can look at dplyr::case_when() or data.table::fcase().
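A minimal sketch of the one-step version, assuming the same sample data as in the question. The base R form needs no packages; the dplyr::case_when() form is guarded so it only runs when dplyr is installed. The names one_step and with_case_when are introduced here for illustration.

```r
df <- data.frame(var1 = c(1L, 4L, 3L, 2L),
                 var2 = c(1L, 3L, 4L, 5L),
                 var3 = c(NA, 4L, 5L, 3L))

# One pass in base R: values up to 3 become 0, larger values become 1, NA stays NA.
one_step <- df
one_step[] <- lapply(df, function(x) ifelse(x <= 3, 0, 1))

# The same recoding with dplyr::case_when(), run only if dplyr is available.
# Values matching neither condition (including NA) come out as NA.
if (requireNamespace("dplyr", quietly = TRUE)) {
  with_case_when <- df
  with_case_when[] <- lapply(df, function(x) {
    dplyr::case_when(x %in% 1:3 ~ 0, x %in% 4:5 ~ 1)
  })
}
```

The base R form relies on the values being limited to 1 through 5, as stated in the question; the case_when() form spells out both ranges, so anything outside them would become NA instead of 1.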

answered Jun 30, 2020 at 8:13


1 Comment

Emeline Over a year ago

Thank you! This is a nice easy way to do it!

GKi · Accepted Answer · 2020-06-30 08:59:18Z

1

You can simply test whether each value is larger than 3, which returns TRUE or FALSE, and then cast the result to a number:

+(x>3)
#     var1 var2 var3
#[1,]    0    0   NA
#[2,]    1    0    1
#[3,]    0    1    1
#[4,]    0    1    0

In case you want this only for some columns, you have to subset them. E.g. for column 1 and 2:

+(x[1:2]>3)
#+(x[c("var1","var2")]>3)  #Alternative
#     var1 var2
#[1,]    0    0
#[2,]    1    0
#[3,]    0    1
#[4,]    0    1
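The outputs above are matrices rather than data frames, as the [1,] style row labels show. If the result should replace the original columns, one option is to assign it back into the data frame. A sketch on a copy named y (a name introduced for this example):

```r
x <- data.frame(var1 = c(1L, 4L, 3L, 2L), var2 = c(1L, 3L, 4L, 5L),
                var3 = c(NA, 4L, 5L, 3L))

# +(x > 3) is an integer matrix, not a data frame.
m <- +(x > 3)

# Assigning into y[] keeps the data frame structure and column names.
y <- x
y[] <- m
```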

Data:

x <- data.frame(var1 = c(1L, 4L, 3L, 2L), var2 = c(1L, 3L, 4L, 5L)
              , var3 = c(NA, 4L, 5L, 3L))

answered Jun 30, 2020 at 8:59

