I have a dataframe such as

Groups Names
G1     SP1
G1     SP2
G1     SP3
G2     SP1
G2     SP4
G3     SP2
G3     SP1 

and I would like to transform it into:

  Names G1 G2 G3
  SP1   1  1  1
  SP2   1  0  1  
  SP3   1  0  0 
  SP4   0  1  0

where the columns are the Groups and each cell is 1 (present) or 0 (absent).

Here is the data in dput format:

structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))
  • 1> with(your_data_frame, table(Names, Groups)); 2> xtabs(~ Names + Groups, your_data_frame); 3> xtabs(~ Names + Groups, your_data_frame, sparse = TRUE) Commented Jul 2, 2022 at 14:22
  • It is rather frustrating. My comment reached you first, and I posted an answer later. Yet you accepted another answer which is covered by my comment. Commented Aug 11, 2022 at 5:44

3 Answers

Use table:

table(df$Names, df$Groups)
     
      G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

Expanding comment to an answer.

This is known as a contingency table, and it can be computed in several ways in base R, without any extra packages.

dat <- structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))

mat1 <- with(dat, table(Names, Groups))
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0

mat2 <- xtabs(~ Names + Groups, dat)
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0

Such a table is essentially a matrix. If you want a data frame, coerce it using:

data.frame(unclass(mat1))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0

data.frame(unclass(mat2))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
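
The coerced data frames above keep the names as row names, whereas the requested output has Names as an ordinary column. A minimal sketch of one way to get that layout, assuming base R only; the object names tab, wide and res are made up for illustration:

```r
# same data as in the question
dat <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1")
)

tab <- with(dat, table(Names, Groups))

# as.data.frame.matrix() keeps the wide layout; plain as.data.frame() on a
# table would return the long (Names, Groups, Freq) form instead
wide <- as.data.frame.matrix(tab)

# move the row names into an ordinary column called Names
res <- cbind(Names = rownames(wide), wide)
rownames(res) <- NULL
res
```

Whether a Names column or row names is preferable depends on what the result is used for next; row names are convenient for matrix-style indexing, a column is easier to merge or export.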

Remark:

In your case, the data frame should have no duplicated rows; otherwise the contingency table will contain counts greater than 1 instead of just 0 and 1. In this sense, computing a contingency table is actually overkill. An algorithmically simpler way (although with more lines of code) is:

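m1 <- unique(dat$Names)   # row labels
m2 <- unique(dat$Groups)  # column labels
mat <- matrix(0, length(m1), length(m2), dimnames = list(m1, m2))
# index by (row name, column name) pairs; this assumes character columns,
# as in the dput above (wrap factors in as.character() first)
mat[with(dat, cbind(Names, Groups))] <- 1
mat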
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
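
If duplicated rows cannot be ruled out, the table-based approaches can still be reduced to presence/absence. A small sketch on made-up data (the data frame dup and the names counts, pa1 and pa2 are hypothetical, not from the question):

```r
# hypothetical data in which the (G1, SP1) row appears twice
dup <- data.frame(
  Groups = c("G1", "G1", "G2", "G1"),
  Names  = c("SP1", "SP2", "SP1", "SP1")
)

counts <- with(dup, table(Names, Groups))  # the SP1/G1 cell holds 2

# option 1: drop duplicated rows before tabulating
pa1 <- with(unique(dup), table(Names, Groups))

# option 2: tabulate first, then collapse any positive count to 1
pa2 <- (counts > 0) + 0L

pa2
```

Both options should give the same 0/1 table here; option 1 avoids counting altogether, while option 2 is handy when the count table already exists.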


You can use table on df, either by

> t(table(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

or

> table(rev(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0
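
Both calls above rely on df having exactly the two columns Groups and Names, in that order. A hedged variant that selects the columns by name, which should also work when the data frame carries other columns (the Extra column here is hypothetical, added only to illustrate the point):

```r
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"),
  Extra  = seq_len(7)  # hypothetical additional column
)

# pick the two columns explicitly; the first becomes rows, the second columns
tab <- table(df[c("Names", "Groups")])
tab
```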
