
I have a data frame that I construct as such:

> yyz <- data.frame(a = c("1","2","n/a"), b = c(1,2,"n/a"))

> apply(yyz, 2, class)
      a           b 
"character" "character"

I am attempting to convert the last column to numeric while still maintaining the first column as a character. I tried this:

> yyz$b <- as.numeric(as.character(yyz$b))
> yyz
    a  b
1   1  1
2   2  2
3 n/a NA

But when I run apply with class again, it still reports both columns as character.

> apply(yyz, 2, class)
      a           b 
"character" "character"

Am I setting up the data frame wrong? Or is it the way R is interpreting the data frame?

    Note that class(yyz$b) yields "numeric" in this example. Therefore the column is in fact numeric. As pointed out by @akrun, the apparent mismatch of classes results from the use of apply(). Commented Jun 8, 2016 at 15:46

1 Answer


If we need only one column to be numeric:

yyz$b <- as.numeric(as.character(yyz$b))

But if all the columns need to be changed to numeric, use lapply to loop over the columns and convert each one to numeric, first converting it to character because the columns are factors.

yyz[] <- lapply(yyz, function(x) as.numeric(as.character(x)))
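The reason for going through as.character first can be seen in a small sketch. The factor f below is a made-up example, not the OP's data, and it is built explicitly so the result does not depend on the R version:

```r
# Hypothetical factor, built explicitly so the behaviour is the same on any R version.
f <- factor(c("10", "20", "n/a"))

# Direct conversion returns the internal level codes, not the labels.
codes <- as.numeric(f)

# Converting to character first recovers the labels; "n/a" becomes NA
# (the coercion warning is suppressed here only to keep the output quiet).
values <- suppressWarnings(as.numeric(as.character(f)))

print(codes)
print(values)
```

Calling as.numeric directly on a factor silently gives the level codes, which is why the intermediate as.character step matters.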

Both columns in the OP's post are factors because of the string "n/a". This could easily be avoided when reading the file by using na.strings = "n/a" in read.table/read.csv, or, if we are using data.frame, we can keep character columns with stringsAsFactors = FALSE (the default was stringsAsFactors = TRUE before R 4.0.0).
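As a rough sketch of the na.strings route, the inline text below stands in for a file on disk (that text, the csv_text name, and the use of colClasses are assumptions for illustration, not something from the OP's post):

```r
# Inline CSV text stands in for a real file (an assumption for illustration).
csv_text <- "a,b\n1,1\n2,2\nn/a,n/a"

# na.strings marks "n/a" as missing while reading;
# colClasses keeps column a as character and makes column b numeric.
yyz <- read.csv(text = csv_text, na.strings = "n/a",
                colClasses = c(a = "character", b = "numeric"))

classes <- sapply(yyz, class)
print(classes)
```

Note that with this approach "n/a" becomes NA in column a as well, since na.strings applies to every column.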


Regarding the use of apply: it converts the data frame to a matrix, and a matrix can hold only a single type, so every column is reported as character. To check the class of each column, use

lapply(yyz, class)

Or

sapply(yyz, class)

Or check

str(yyz)
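To see the difference side by side, here is a minimal sketch that rebuilds a data frame like the OP's after the conversion (stringsAsFactors = FALSE is set explicitly so it behaves the same across R versions; the via_apply and via_sapply names are only for illustration):

```r
# A data frame like the OP's after column b has been converted to numeric.
yyz <- data.frame(a = c("1", "2", "n/a"), b = c(1, 2, NA),
                  stringsAsFactors = FALSE)

# apply() coerces the data frame to a matrix first; with mixed columns
# that matrix is character, so both columns report "character".
via_apply <- apply(yyz, 2, class)

# sapply() visits each column of the data frame directly.
via_sapply <- sapply(yyz, class)

print(via_apply)
print(via_sapply)
```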

5 Comments

This converts both columns to numeric. I only want column b to be numeric. If I specify yyz$b <- lapply(yyz$b, function(x) as.numeric(as.character(x))), it turns the column into a list.
@Dexstrum It is because you are assigning a list to a column. If we need only a single column as numeric, use the same syntax as you did yyz$b <- as.numeric(as.character(yyz$b))
Please look again at what I posted. I already tried that, and it did not change the column to numeric.
After running sapply(yyz, class) it shows up as numeric. Thank you.
@Dexstrum It should change the column to numeric, as the character elements are changed to NA. If sapply(yyz, class) shows it as numeric, isn't that what you wanted?
