I have imported some data (originally in a .csv file) into R and have the following data frame with only one variable--V1. There are tens of thousands of elements (rows) with the composition shown below. V1 is a character variable, but it contains both words and numbers, which I would like to separate into three variables as shown at the bottom.

V1
"Tigers"
"Africa"
"23"
"North America"
"15"
"Asia"
"276"
"Elephants"
"Africa"
"233"
"North America"
"0"
"Asia"
"554"

This is what I would like the complete df to look like--three variables with the names Animal, Continent, Value. Value must be a numeric (or integer) variable, and the other two variables may be either factors or characters.

    Animal     Continent Value
    Tigers        Africa    23
    Tigers North America    15
    Tigers          Asia   276
 Elephants        Africa   233
 Elephants North America     0
 Elephants          Asia   554

Thanks for any help. I do not want to do this manually.

  • What does the .csv look like before you import it? Commented Sep 30, 2020 at 21:52

2 Answers

1

I think this should work:

library(data.table)

v <- c("Tigers",
"Africa",
"23",
"North America",
"15",
"Asia",
"276",
"Elephants",
"Africa",
"233",
"North America",
"0",
"Asia",
"554")

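# Each animal occupies a block of 7 elements: its name, then n = 3 (continent, value) pairs
n <- 3
block <- 1 + 2 * n
animal_idx <- seq(1, length(v), by = block)
# Repeat every animal name once per continent (works for any number of animals)
Animal <- rep(v[animal_idx], each = n)
# What remains alternates continent, value, continent, value, ...
rest <- v[-animal_idx]
Continent <- rest[seq(1, length(rest), by = 2)]
Value <- rest[seq(2, length(rest), by = 2)]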

df <- data.table(
  Animal = Animal, 
  Continent = Continent, 
  Value = as.numeric(Value) 
)
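
If the number of continents per animal is not always three, the fixed block length of 7 above no longer holds. Here is a minimal base R sketch of a variant that relies on position instead. It rests on some assumptions: the first element is an animal, every value is immediately preceded by its continent, and no animal or continent name parses as a number. The names `is_num`, `owner` and `df_alt` are only illustrative.

```r
# Same sample data as in the answer above
v <- c("Tigers", "Africa", "23", "North America", "15", "Asia", "276",
       "Elephants", "Africa", "233", "North America", "0", "Asia", "554")

# TRUE where the element parses as a number; the coercion warning is expected
is_num <- !is.na(suppressWarnings(as.numeric(v)))

val_idx    <- which(is_num)                      # positions of the values
cont_idx   <- val_idx - 1                        # a continent sits right before its value
animal_idx <- setdiff(which(!is_num), cont_idx)  # remaining text entries are animals

# For each value, look up the nearest animal name that precedes it
owner <- animal_idx[findInterval(val_idx, animal_idx)]

df_alt <- data.frame(
  Animal    = v[owner],
  Continent = v[cont_idx],
  Value     = as.numeric(v[val_idx]),
  stringsAsFactors = FALSE
)
```

On the sample data this should give the same six rows as the fixed-block version, but it is worth checking the result against a few known records before trusting it on the full column.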

3 Comments

If this does resolve your question, once you're certain of it please come back and accept the answer. (I only say this now because you're new, and it's courtesy/etiquette on SO that many new users don't initially recognize.)
Hello. I tried to upvote the answer, but the following pop-up message appeared: "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." Is that what is meant by "accepting the answer"? I can't find an obvious way to accept the response. Thanks!
Hi, usually as the one who asked the question you should be able to click on the gray tick symbol, which would then become green. It should be under the Up/Downvote symbol. Regarding the upvote, SO requires users to have a certain amount of reputation to do certain things. That includes the ability to upvote answers. You can click on the trophy symbol in the top right corner and then on privileges to find out more. Thank you for letting me know, I will do my part to increase your reputation with an upvote :)
0

I don't fully understand the question, and I could not tell you how to reconstruct a df from a single column of seemingly random data. I would suggest you check your data import methods. That said, if you are interested in extracting some information from your data, I would simply extract the subjects into vectors:

# Given
`%!in%` <- Negate(`%in%`)

df <- data.frame(V1 = c("Tigers", "Africa", "23", "North America", "15", "Asia", "276", 
                        "Elephants", "Africa", "233", "North America", "0", "Asia", "554"), 
                 stringsAsFactors = FALSE)

# Vector to ID continents
cont <- c("Africa", "Asia", "Europe", "North America", "South America", "Oceania", "Antarctica")


# subset continents
Continent <- df$V1[df$V1 %in% cont]

# Extract numeric entries (non-numeric strings become NA and are dropped)
Values <- suppressWarnings(as.numeric(df$V1))
Values <- Values[!is.na(Values)]

# Remove "continents" and "values" and you are left with "animals"
Animal <- df$V1[df$V1 %!in% Values & df$V1 %!in% cont]

# I do not recommend binding into a dataframe as it will be meaningless...
df2 <- cbind(Animal, Continent, Values)

I would not recommend binding these data into a data frame:

  • the vectors will likely differ in lengths
  • a df assumes the values which comprise an observation (row) are related.
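
To make the two points above concrete, here is a small sketch that compares the vector lengths before binding. The vectors are written out by hand to match what the extraction yields for the sample data, and `lens`, `same_length` and `m` are only illustrative names.

```r
# Vectors as the extraction produces them for the sample data
Animal    <- c("Tigers", "Elephants")
Continent <- c("Africa", "North America", "Asia", "Africa", "North America", "Asia")
Values    <- c(23, 15, 276, 233, 0, 554)

# Compare lengths before combining anything
lens <- lengths(list(Animal = Animal, Continent = Continent, Values = Values))
same_length <- length(unique(lens)) == 1

# cbind() recycles the shorter vector without a warning here (6 is a multiple of 2),
# so the animal names alternate row by row instead of following the original blocks.
# The result is also a character matrix, not a data frame.
m <- cbind(Animal, Continent, Values)
```

Here `same_length` is `FALSE`, which is the signal that the rows of `m` do not represent real observations.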
