I have a numeric column of weights (df$weight0) in a df. I want to create a new column df$weight1 which is a factor based on the values in df$weight0.

If the value in df$weight0 is less than or equal to 170, the corresponding value in df$weight1 should be 1; if the value in df$weight0 is greater than 170, the corresponding value in df$weight1 should be 2.

The code below is what I have tried, but it gives a single value, not a vector.

for (i in df$weight0){
  if (i<=170){
    i==1
  }else{
    i==2
  }
}
    df$weight1 <- (df$weight0>170)+1 Commented Feb 26, 2020 at 12:16
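
The one-liner in this comment works because a comparison returns a logical vector, and arithmetic coerces FALSE/TRUE to 0/1. A minimal sketch on made-up weights (the data frame and the `is_heavy` name are hypothetical, not the asker's data):

```r
# Hypothetical example data; the values are made up for illustration
df <- data.frame(weight0 = c(150, 170, 171, 200))

# The comparison yields one logical value per row
is_heavy <- df$weight0 > 170

# Adding 1 coerces FALSE/TRUE to 0/1, so the result is 1 or 2
df$weight1 <- is_heavy + 1
print(df)
```

The result is a numeric column; wrapping it in factor() would give the factor the question asks for.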

4 Answers


Use cut() for discretizing continuous variables by intervals

For this kind of interval categorization, the function cut() is very useful.

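nums <- runif(100, min=0, max=300) # n = 100 random numbers between 0 and 300
factorized_num <- cut(nums, c(-Inf, 170, +Inf)) # intervals (-Inf,170] and (170,Inf]
# you can name the categories as you want:
levels(factorized_num) <- c(1, 2) # first interval 1, next interval 2
# right=TRUE (the default) closes each interval on the right, so 170 itself falls in category 1;
# right=FALSE closes intervals on the left instead.
# include.lowest=TRUE additionally includes the outermost break value, which is otherwise excluded.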
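
To check how the boundary value is treated, here is a small sketch on made-up weights that include exactly 170. The vector `weights` and the variable `groups` are names chosen for illustration only; the labels are passed directly through the `labels` argument instead of being assigned afterwards.

```r
# Made-up weights, including the boundary value 170
weights <- c(120, 170, 170.5, 250)

# right = TRUE is the default, giving (-Inf, 170] and (170, Inf]
groups <- cut(weights, breaks = c(-Inf, 170, Inf), labels = c("1", "2"))

print(groups)         # a factor with levels "1" and "2"
print(table(groups))  # counts per category
```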

Or use Vectorize() to vectorize non-vectorized functions

# define for one case:
categorize <- function(i) if (i<=170) 1 else 2
# then vectorize it:
categorize <- Vectorize(categorize)

Now you can use it:

categories <- categorize(nums)
head(categories) ## e.g. 1 2 1 1 ... (the actual values depend on the random draw)

In my experience, I much prefer this to ifelse() because you have full control over how each single case is handled.
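
As a sketch of what that per-case control can look like, the scalar function below adds an explicit branch for missing values before it is vectorized. The names `categorize_one` and `categorize_all` and the sample values are hypothetical.

```r
# Hypothetical scalar function: handles one value at a time
categorize_one <- function(w) {
  if (is.na(w)) return(NA_integer_)  # explicit special case for missing weights
  if (w <= 170) 1L else 2L
}

# Vectorize() wraps the scalar function so it accepts a whole vector
categorize_all <- Vectorize(categorize_one)

result <- categorize_all(c(150, NA, 170, 180))
print(result)
```

Vectorize() calls the function once per element under the hood, so on large vectors it is likely to be slower than cut() or ifelse().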


2 Comments

Thank you all. I tried the cut method: wcgs$w8t <- cut(wcgs$weight0, breaks = c(78, 170, 320), labels = c("<= 170 lbs", "> 170 lbs"), include.lowest = TRUE)
Nice! It is best to include and name all limits (-Inf and +Inf are useful when there are no natural limits).
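
One reason the choice of outer limits matters: cut() returns NA for any value outside the breaks. A small sketch with made-up weights, two of which fall outside the 78 to 320 range used in the comment above (the vector and variable names are hypothetical):

```r
# Made-up weights; 60 and 400 lie outside the finite breaks 78..320
weights <- c(60, 150, 200, 400)
lbls <- c("<= 170 lbs", "> 170 lbs")

# Finite outer limits: values outside [78, 320] become NA
finite_limits <- cut(weights, breaks = c(78, 170, 320), labels = lbls,
                     include.lowest = TRUE)

# Open-ended limits: every value lands in one of the two categories
open_limits <- cut(weights, breaks = c(-Inf, 170, Inf), labels = lbls)

print(finite_limits)
print(open_limits)
```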

ifelse() is vectorized, so it works on the whole column at once:

df$weight1 <- ifelse(df$weight0<=170,1,2)
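
Note that ifelse() returns a numeric vector here, while the question asks for a factor. A minimal sketch of one way to get a factor, assuming made-up data and illustrative level labels:

```r
# Hypothetical example data; the values are made up for illustration
df <- data.frame(weight0 = c(150, 170, 171, 200))

# Wrap the ifelse() result in factor(); the labels are only an example
df$weight1 <- factor(ifelse(df$weight0 <= 170, 1, 2),
                     levels = c(1, 2),
                     labels = c("<= 170", "> 170"))

str(df$weight1)
```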


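You were checking the value of i, not the values in your df. The assignment to your new column was also missing (== compares, <- assigns). Note that the loop has to run over the row indices, not over the weights themselves. Try the following.

# create the new column first, then loop over row indices
wcgs$weight1 <- NA
for (i in seq_along(wcgs$weight0)){
  if (wcgs$weight0[i]<=170){
    wcgs$weight1[i] <- 1
  }else{
    wcgs$weight1[i] <- 2
  }
}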

1 Comment

Thanks. So I use wcgs$weight1[i] to change the value at that index.

Using case_when() from the dplyr package:

library(dplyr)
# inside mutate(), refer to columns by bare name; assign the result back to keep it
df <- df %>% mutate(weight1 = case_when(weight0 <= 170 ~ 1,
                                        weight0 > 170 ~ 2))

case_when() expresses a chain of if-else conditions, and mutate() adds or modifies columns.
