I have a data frame with several numeric variables along with factors. I want to loop over the numeric variables and replace their negative values with NA, but I couldn't get that to work.

My alternative idea was to write a function that takes a data frame and a variable name and does the replacement for that one column. That didn't work either.

My code is:

NegativeToMissing = function(df,var)
{
  df$var[df$var < 0] = NA
}

Error in $<-.data.frame(`*tmp*`, "var", value = logical(0)) : replacement has 0 rows, data has 40 

What am I doing wrong?

Thank you.

  • Read ?Extract. I think what you need in there is df[[var]][ df[[var]] < 0 ] <- NA. When you use df$var, it expects a column literally named var in the frame, which is not the case here. To reference a column indirectly, the only methods are [ (which should return a single-column frame) or [[ (which always returns a vector). Commented Dec 13, 2018 at 9:47
  • Please provide a reproducible example along with expected output. Commented Dec 13, 2018 at 10:08
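
To make the suggestion in the first comment concrete, here is a minimal sketch of the function rewritten with [[. It assumes the column name is passed as a string. In the original, df$var evaluates to NULL because there is no column called var, so NULL < 0 is logical(0), which is where the "replacement has 0 rows" error comes from. The original also never returns the modified data frame. The example data (scores, x, g) is made up for illustration.

```r
# Replace negative values in one column with NA.
# `var` is assumed to be a column name given as a character string.
NegativeToMissing <- function(df, var) {
  df[[var]][df[[var]] < 0] <- NA
  df  # return the modified copy; the caller's object is not changed in place
}

# Hypothetical example data: one numeric column and one factor
scores <- data.frame(x = c(-1, 2, -3),
                     g = factor(c("a", "b", "c")))

# Assign the result back, otherwise the change is lost
scores <- NegativeToMissing(scores, "x")
scores
```

Because R functions work on a copy of their arguments, the result has to be assigned back to a name, as in the last step.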

3 Answers

Here is an example with some dummy data.

df1 <- data.frame(col1 = c(-1, 1, 2, 0, -3),
                  col2 = 1:5,
                  col3 = LETTERS[1:5])
df1
#  col1 col2 col3
#1   -1    1    A
#2    1    2    B
#3    2    3    C
#4    0    4    D
#5   -3    5    E

Now find columns that are numeric

numeric_cols <- sapply(df1, is.numeric)

And replace negative values

df1[numeric_cols] <- lapply(df1[numeric_cols], function(x) replace(x, x < 0 , NA))
df1
#  col1 col2 col3
#1   NA    1    A
#2    1    2    B
#3    2    3    C
#4    0    4    D
#5   NA    5    E

You could also do

df1[df1 < 0] <- NA

2 Comments

I understand why you used sapply at first, but why lapply later. Doesn't it return a list ?
@user2899944 Yes. A data.frame is a list too (with elements of equal length) so we can use lapply here to replace the values of the numeric_cols. But you could also use sapply or vapply instead.
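
As a small illustration of the point above, the following sketch checks that a data frame really is a list and that assigning the list returned by lapply back into the selected columns leaves a data frame. It uses vapply for the column test purely as a type-stable alternative to sapply; that swap is my choice, not part of the answer.

```r
df1 <- data.frame(col1 = c(-1, 1, 2, 0, -3),
                  col2 = 1:5,
                  col3 = LETTERS[1:5])

is.list(df1)  # a data frame is a list of equal-length columns

# vapply always returns a logical vector here, one entry per column
numeric_cols <- vapply(df1, is.numeric, logical(1))

# lapply returns a plain list with one element per selected column
cleaned <- lapply(df1[numeric_cols], function(x) replace(x, x < 0, NA))
class(cleaned)

# Assigning that list into df1[numeric_cols] replaces the columns one by one,
# so df1 itself is still a data frame afterwards
df1[numeric_cols] <- cleaned
class(df1)
```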

With tidyverse, we can make use of mutate_if

library(tidyverse)
df1 %>%
    # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
    mutate_if(is.numeric, ~ replace(., . < 0, NA))
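
For readers on a recent dplyr, here is a sketch of the same idea written with across() and where(), which superseded the scoped mutate_if() verb. It assumes dplyr 1.0.0 or later is installed and reuses the df1 dummy data from the first answer.

```r
library(dplyr)

df1 <- data.frame(col1 = c(-1, 1, 2, 0, -3),
                  col2 = 1:5,
                  col3 = LETTERS[1:5])

# where(is.numeric) selects the numeric columns;
# the lambda is applied to each selected column in turn
result <- df1 %>%
  mutate(across(where(is.numeric), ~ replace(.x, .x < 0, NA)))
result
```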


If you still want to change only one selected variable, a solution with dplyr would be to use non-standard evaluation:

library(dplyr)
NegativeToMissing <- function(df, var) {
  quo_var = quo_name(var)
  df %>% 
    mutate(!!quo_var := ifelse(!!var < 0, NA, !!var))

}

NegativeToMissing(data, var=quo(val1)) # use quo() function without ""
#   val1 val2
# 1    1    1
# 2   NA    2
# 3    2    3

Data used:

data <- data.frame(val1 = c(1, -1, 2),
                   val2 = 1:3)
data
#   val1 val2
# 1    1    1
# 2   -1    2
# 3    2    3
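
As a possible alternative, the sketch below uses the embrace operator {{ }} together with across(), so the caller can pass a bare column name without wrapping it in quo(). This is not the code from the answer above; it assumes dplyr 1.0.0 or later and rlang 0.4.0 or later.

```r
library(dplyr)

NegativeToMissing <- function(df, var) {
  df %>%
    # {{ var }} forwards the caller's bare column name into across()
    mutate(across({{ var }}, ~ replace(.x, .x < 0, NA)))
}

data <- data.frame(val1 = c(1, -1, 2),
                   val2 = 1:3)

result <- NegativeToMissing(data, val1)  # bare name, no quo() and no quotes
result
```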
