
My dataframe has a column called "LandType" of characters, either "Rural" or "Urban" for a bunch of samples. All I want to do is convert them to 1's and 0's, where "Rural" is 1, and "Urban" is 0.

I thought it would be as simple as:

data$LandType[data$LandType == "Rural"] <- 1
data$LandType[data$LandType == "Urban"] <- 0

But after running this with no errors and then checking my data frame, only "Rural" had changed to 1; "Urban" still remained as a string. I tried different numbers, but the same thing happened: only "Rural" changed to the value I assigned.

7
  • Maybe Urban is written differently in data$Landtype? Check it, e.g. with table(data$Landtype) or unique(data$Landtype). Commented Sep 6, 2021 at 15:33
  • Okay I checked and it comes up as "Rural" and "Urban", and there are 72 "Rural" and 93 "Urban" samples. Commented Sep 6, 2021 at 15:34
  • Does is.factor(data$Landtype) give FALSE? Commented Sep 6, 2021 at 15:38
  • Yes it does because that column is a string of characters. I'll try converting it but not sure why then Rural would change to a value but Urban would not... Commented Sep 6, 2021 at 15:39
  • In one place you use LandType, in another Landtype. Commented Sep 6, 2021 at 15:45
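To illustrate the spelling point from the last comment: R column names are case-sensitive, so a misspelled name silently returns NULL instead of raising an error. A minimal sketch, assuming a small made-up data frame whose column is spelled LandType (the asker's real data is not shown in the thread):

```r
# Hypothetical data for illustration; the column is spelled "LandType".
data <- data.frame(LandType = c("Rural", "Urban", "Rural"),
                   stringsAsFactors = FALSE)

wrong <- data$Landtype            # lowercase "t": no such column
is.null(wrong)                    # TRUE, and no error or warning is raised
is.factor(wrong)                  # FALSE, because NULL is not a factor

"LandType" %in% names(data)       # TRUE: the exact spelling exists
"Landtype" %in% names(data)       # FALSE
```

Checking names(data) first is a cheap way to rule this out before looking for anything more exotic.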

4 Answers

4

Just use ifelse

# Example data
data <- data.frame(Landtype = c("Rural", "Urban", "Rural", "Urban"))
# Vectorized condition: 1 where the value is "Rural", 0 otherwise
data$Landtype <- ifelse(data$Landtype == "Rural", 1, 0)

2 Comments

This worked! It is still incredibly odd to me that Urban would not change. I guess this is a nice workaround for an odd situation like this. I also tried using "Urban" in the ifelse statement, and that seemed to work. Running the two assignments individually, as I wrote above, does not work.
This also avoids the problem of the 1's and 0's being stored as character, as they were in the original. A data frame column can hold only one type, so when you replace only half the values, R is forced to coerce the 1 to "1".
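The coercion described in the comment above can be seen step by step. A minimal sketch, assuming a four-row example data frame (not the asker's real data) with stringsAsFactors = FALSE so the column is plain character on any R version:

```r
# Hypothetical example data; the column starts out as character.
data <- data.frame(LandType = c("Rural", "Urban", "Rural", "Urban"),
                   stringsAsFactors = FALSE)

# Replace only the "Rural" entries with the number 1.
data$LandType[data$LandType == "Rural"] <- 1
after_first <- data$LandType
class(after_first)    # "character": the column type did not change
after_first           # "1" "Urban" "1" "Urban" -- the 1 was stored as "1"

# Replace the remaining entries; the column is still character.
data$LandType[data$LandType == "Urban"] <- 0
class(data$LandType)  # "character"

# An explicit conversion is needed to get real numbers.
data$LandType <- as.integer(data$LandType)
class(data$LandType)  # "integer"
```

So even when the two-step approach works, it needs a final as.integer() or as.numeric(), which ifelse() makes unnecessary.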
2

A tidyverse option using recode()

library(dplyr)

mutate(data, Landtype = recode(Landtype, Rural = 1, Urban = 0))

# # A tibble: 4 x 1
# Landtype
# <dbl>
# 1        1
# 2        0
# 3        1
# 4        0

Data

data <- tibble(Landtype = c("Rural", "Urban", "Rural", "Urban"))


0

We could use as.integer

data$Landtype <- as.integer(data$Landtype == "Rural")

Data

data = data.frame(Landtype = c("Rural", "Urban", "Rural", "Urban"))


0

In my case your way works:

set.seed(42)
data <- data.frame(LandType = sample(c(rep("Rural", 72), rep("Urban", 93))))

data$LandType[data$LandType == "Rural"] <- 1
data$LandType[data$LandType == "Urban"] <- 0

table(data$LandType)
# 0  1 
#93 72

To get binary values, I would recommend using the logical type (TRUE or FALSE).

data$LandType <- data$LandType == "Rural"

If 0 and 1 are needed, just add a unary +:

data$LandType <- +(data$LandType == "Rural")
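A quick way to see what the two forms produce, using a made-up vector rather than the asker's data: the comparison yields a logical vector, and the unary + turns it into integers.

```r
# Hypothetical example vector.
land <- c("Rural", "Urban", "Rural")

is_rural <- land == "Rural"
class(is_rural)         # "logical"
is_rural                # TRUE FALSE TRUE

as_number <- +is_rural  # unary plus coerces logical to integer
class(as_number)        # "integer"
as_number               # 1 0 1

# Logical vectors already behave like 0/1 in arithmetic.
sum(is_rural)           # 2
```

Because sum() and mean() treat TRUE as 1, the logical form is often enough for counting or modelling.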

2 Comments

I know - there must be some weird error on my end that is not showing up in R as any warning/error. I checked the raw csv file as well and nothing seems out of place. Maybe a hidden character somewhere? That is my only conclusion.
Maybe dput your data to see where it comes from.
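The thread never confirms the cause, but the hidden-character guess is easy to test. A minimal sketch, assuming the culprit is a trailing space after "Urban" (purely an assumption for illustration): dput() exposes it, and trimws() removes it before recoding.

```r
# Hypothetical data with a trailing space after "Urban" (assumed cause).
data <- data.frame(LandType = c("Rural", "Urban ", "Rural", "Urban "),
                   stringsAsFactors = FALSE)

# The comparison silently matches nothing, so no value is replaced.
n_match <- sum(data$LandType == "Urban")
n_match                       # 0

# dput() prints the exact strings, including the stray space.
dput(unique(data$LandType))   # c("Rural", "Urban ")

# Strip leading/trailing whitespace, then recode.
data$LandType <- trimws(data$LandType)
data$LandType <- as.integer(data$LandType == "Rural")
data$LandType                 # 1 0 1 0
```

Printing with table() or unique() can hide trailing whitespace, which is why dput() or nchar() is the more reliable check.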
