Transform dataframe to binary dataframe in R

Question

I have a data frame such as:

Groups Names
G1     SP1
G1     SP2
G1     SP3
G2     SP1
G2     SP4
G3     SP2
G3     SP1

I would like to transform it into:

  Names G1 G2 G3
  SP1   1  1  1
  SP2   1  0  1  
  SP3   1  0  0 
  SP4   0  1  0

The columns are the Groups, and each cell is 1 if the name is present in that group and 0 if it is absent.

Here is the data in dput format:

structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))

1> with(your_data_frame, table(Names, Groups)); 2> xtabs(~ Names + Groups, your_data_frame); 3> xtabs(~ Names + Groups, your_data_frame, sparse = TRUE)
– Zheyuan Li, Commented Jul 2, 2022 at 14:22
It is rather frustrating. My comment reached you first, and I posted an answer later. Yet you accepted another answer which is covered by my comment.
– Zheyuan Li, Commented Aug 11, 2022 at 5:44

Chris Ruehlemann · Accepted Answer · 2022-07-02 14:41:29Z

3

Use table:

table(df$Names, df$Groups)
     
      G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

Zheyuan Li · Answer · 2022-07-02 14:56:16Z

Expanding my comment into an answer.

This is known as a contingency table, and it can be computed in several ways in base R, without any add-on packages.

dat <- structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))

mat1 <- with(dat, table(Names, Groups))
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0

mat2 <- xtabs(~ Names + Groups, dat)
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0

Such a table is a matrix. If you want a data frame, coerce it using:

data.frame(unclass(mat1))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0

data.frame(unclass(mat2))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
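
The desired output in the question has Names as an ordinary column rather than as row names. A minimal sketch of one way to get there, assuming the same data as above (the object names `wide` and `out` are only illustrative):

```r
dat <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1")
)

mat1 <- with(dat, table(Names, Groups))

# as.data.frame.matrix() keeps the wide layout; plain as.data.frame() on a
# table would instead return a long data frame with a Freq column.
wide <- as.data.frame.matrix(mat1)

# Move the row names into a regular first column called Names and
# reset the row names to the default 1, 2, 3, ...
out <- data.frame(Names = rownames(wide), wide, row.names = NULL)
out
```

Passing `row.names = NULL` explicitly is what stops `data.frame()` from carrying the old row names over.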

Remark:

In your case, the data frame should have no duplicated rows; otherwise the contingency table will contain counts greater than 1 rather than just 0 and 1. In this sense, computing a contingency table is actually overkill. An algorithmically simpler way (although with more lines of code) is:

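# Row and column labels, in order of first appearance
m1 <- unique(dat$Names)
m2 <- unique(dat$Groups)
# Start from an all-zero matrix with those labels as dimnames
mat <- matrix(0, length(m1), length(m2), dimnames = list(m1, m2))
# Index by a two-column character matrix of (row name, column name) pairs;
# this needs character columns, as in the dput above, not factors
mat[with(dat, cbind(Names, Groups))] <- 1
mat
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0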
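
If the data might contain duplicated rows after all, the counts can still be reduced to 0/1. A small sketch with a made-up input (`dup` is hypothetical and not part of the question's data):

```r
# Hypothetical input in which the G1/SP1 row appears twice.
dup <- data.frame(
  Groups = c("G1", "G1", "G1", "G2"),
  Names  = c("SP1", "SP1", "SP2", "SP1")
)

counts <- with(dup, table(Names, Groups))
# The SP1/G1 cell now holds a count of 2 rather than 1.

# Option 1: threshold the counts; TRUE/FALSE become 1/0 and dimnames are kept.
binary1 <- (counts > 0) * 1L

# Option 2: drop duplicated rows before tabulating.
binary2 <- with(unique(dup), table(Names, Groups))
```

Both options give the same presence/absence table here; the second one avoids ever computing counts larger than 1.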

ThomasIsCoding · Accepted Answer · 2022-07-02 16:40:12Z

1

You can apply table to df directly, either with

> t(table(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

or

> table(rev(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0
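
Both forms work because table() applied to a data frame cross-tabulates its columns in column order, so Groups would otherwise end up in the rows. Either transposing the result or reversing the column order first puts Names in the rows. A quick check, assuming `df` holds the question's data:

```r
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1")
)

a <- t(table(df))    # tabulate Groups x Names, then transpose
b <- table(rev(df))  # reverse the columns to Names, Groups, then tabulate

# Compare the two results cell by cell.
same_cells <- all(a == b)
same_cells
```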
