3

I have a data frame such as:

   Groups   Value
    G1  NA
    G1  NA
    G1  NA
    G1  23
    G2  NA
    G2  NA
    G2  NA
    G2  NA
    G2  NA
    G2  NA
    G3  34
    G3  21
    G4  NA
    G4  NA
    G5  NA
    G5  45

I'm looking for R code that produces another data frame with binary values: 1 if at least one Value in the group is >= 1, and 0 if the group contains only NA values.

The new data frame should look like:

G1  G2  G3  G4  G5
1   0   1   0   1

Thanks for your help.

4 Answers

3

We can do this with table from base R. Convert the 'Value' column to a logical vector (!is.na), cross-tabulate it against 'Groups', check whether the count of TRUE in each group is greater than 0, and convert the resulting logical vector to binary with as.integer or unary +

+(table(df1$Groups, !is.na(df1$Value))[,2] > 0)
# G1 G2 G3 G4 G5 
# 1  0  1  0  1 
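The one-liner above packs several steps together. Here is a minimal sketch that unpacks them and then reshapes the result into the one-row layout shown in the question. The intermediate names (has_value, tab, flags, wide) are only illustrative, and the one-row layout is my reading of the desired output, not something the original answer produces.

```r
# Sample data from the question, rebuilt compactly
df1 <- data.frame(
  Groups = rep(c("G1", "G2", "G3", "G4", "G5"), times = c(4, 6, 2, 2, 2)),
  Value  = c(NA, NA, NA, 23, rep(NA, 6), 34, 21, NA, NA, NA, 45)
)

# Step 1: TRUE where a value is present, FALSE where it is NA
has_value <- !is.na(df1$Value)

# Step 2: cross-tabulate groups against presence; columns are "FALSE" and "TRUE"
tab <- table(df1$Groups, has_value)

# Step 3: a group gets 1 if its "TRUE" count is above zero, otherwise 0
flags <- +(tab[, "TRUE"] > 0)

# Step 4: one row, one column per group
wide <- as.data.frame(t(flags))
wide
#   G1 G2 G3 G4 G5
# 1  1  0  1  0  1
```

Selecting the column by name ("TRUE") rather than by position is a small safety measure: if the data had no NA at all, the table would have a single column and [, 2] would fail.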

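Or using rowsum from base R. rowsum returns the number of non-NA values per group (2 for G3), so compare it with 0 to get a binary result

+(rowsum(+!is.na(df1$Value), df1$Groups) > 0)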

NOTE: Both the above methods are base R - No packages used
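For the rowsum approach, it helps to see the intermediate result: rowsum adds up the 0/1 indicator within each group, so what comes back is a per-group count that still needs a threshold to become binary. A small sketch, with present, counts and binary as illustrative names:

```r
# Sample data from the question, rebuilt compactly
df1 <- data.frame(
  Groups = rep(c("G1", "G2", "G3", "G4", "G5"), times = c(4, 6, 2, 2, 2)),
  Value  = c(NA, NA, NA, 23, rep(NA, 6), 34, 21, NA, NA, NA, 45)
)

present <- +!is.na(df1$Value)           # 1 where a value exists, 0 where NA
counts  <- rowsum(present, df1$Groups)  # one-column matrix, one row per group

counts[, 1]
# G1 G2 G3 G4 G5
#  1  0  2  0  1

binary <- +(counts[, 1] > 0)            # threshold the counts to get 0/1
binary
# G1 G2 G3 G4 G5
#  1  0  1  0  1
```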


Or using tidyverse

library(tidyverse)
df1 %>% 
  group_by(Groups) %>%
  summarise_all(list(~ as.integer(sum(!is.na(.)) > 0)))
# A tibble: 5 x 2
#  Groups Value
#   <chr>  <int>
#1 G1         1
#2 G2         0
#3 G3         1
#4 G4         0
#5 G5         1

Or with data.table (note that setDT converts df1 to a data.table by reference)

library(data.table)
setDT(df1)[, .(Value = +(sum(!is.na(Value)) > 0)), by = Groups]

data

df1 <- structure(list(Groups = c("G1", "G1", "G1", "G1", "G2", "G2", 
"G2", "G2", "G2", "G2", "G3", "G3", "G4", "G4", "G5", "G5"), 
    Value = c(NA, NA, NA, 23L, NA, NA, NA, NA, NA, NA, 34L, 21L, 
    NA, NA, NA, 45L)), class = "data.frame", row.names = c(NA, 
-16L))

1

We can use aggregate from base R. The question asks for values >= 1, so that is the condition tested here

aggregate(Value >= 1 ~ Groups, df1, any, na.rm = TRUE, na.action = na.pass)

#  Groups Value >= 1
#1     G1       TRUE
#2     G2      FALSE
#3     G3       TRUE
#4     G4      FALSE
#5     G5       TRUE

If you need 1/0 values instead of TRUE/FALSE, you could do

aggregate(Value ~ Groups, df1, function(x) 
           +(any(x >= 1, na.rm = TRUE)), na.action = na.pass)

#  Groups Value
#1     G1     1
#2     G2     0
#3     G3     1
#4     G4     0
#5     G5     1
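The answers on this page use two slightly different criteria: "the group has any non-NA value" and "the group has a value >= 1". They agree on the question's data but not in general. A minimal sketch on made-up data (the edge data frame and the two result names are hypothetical, chosen only to show the difference):

```r
# Hypothetical data: group A holds only a zero, group B only NA, group C a 5
edge <- data.frame(
  Groups = c("A", "A", "B", "B", "C"),
  Value  = c(0, NA, NA, NA, 5)
)

by_group <- split(edge$Value, edge$Groups)

# Criterion 1: at least one non-missing value in the group
non_missing <- vapply(by_group, function(x) as.integer(any(!is.na(x))), integer(1))

# Criterion 2: at least one value >= 1 in the group
at_least_one <- vapply(by_group, function(x) as.integer(any(x >= 1, na.rm = TRUE)), integer(1))

rbind(non_missing, at_least_one)
#              A B C
# non_missing  1 0 1
# at_least_one 0 0 1
```

Group A is where the two diverge, so which criterion to use depends on whether zeros (or other values below 1) can occur in the real data.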


1

With dplyr, you can also do:

library(dplyr)

# df is the data frame from the question
df %>%
 group_by(Groups) %>%
 summarise(Value = as.integer(any(!is.na(Value))))

  Groups Value
  <chr>  <int>
1 G1         1
2 G2         0
3 G3         1
4 G4         0
5 G5         1

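Or, to test the values themselves rather than just their presence (any() with na.rm = TRUE returns FALSE for an all-NA group, whereas max() would return -Inf with a warning):

df %>%
 group_by(Groups) %>%
 summarise(Value = as.integer(any(Value > 0, na.rm = TRUE)))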


0

The same result can be obtained with a loop over the groups.

data

data <- data.frame(Groups = rep(c("G1", "G2"), each = 4), Value = c(NA, NA, NA, 23, NA, NA, NA, NA))

Loop

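data$new_value <- NA_integer_
for (i in unique(data$Groups)) {
  in_group <- data$Groups == i
  # 1 if the group has at least one value >= 1, otherwise 0
  data$new_value[in_group] <- as.integer(any(data$Value[in_group] >= 1, na.rm = TRUE))
}

data1 <- unique(data[, c("Groups", "new_value")])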

Groups new_value
  G1         1
  G2         0
