I am trying to build a binary matrix, but first I need to convert multiple string columns to binary values (0 and 1). I tried this in both R and Python, but my code didn't work. Could someone help me?

I have a matrix of 29,584 rows x 982 columns, similar to this:

  G       X4646466.555  X564737373.455  X737347474.56   
0 add     bp_ggfgfgg    
1 fgr     bb_jhfjfjf    bpp_fhfhfhf     bb_jfjfjf
2 dfr
3 tth                   bb_jdjfjdd
4 dee     bp_djdjdd
5 ee                    bp_dhsdhdh
6 ff                    bb_hfhfhf       bpp_dfhdhdhd
...

Each column whose name starts with X holds various string values. These values start with bb_, bpp_ or bp_. There are also missing values (blank cells). In every column that starts with X (that is, every column except G), I would like to replace each string value with 1 and each missing value with 0.

I am attaching an image of the data frame.

1
  • Are you looking for an answer in python or in R? Commented Aug 3, 2021 at 18:18

2 Answers

2

We can use across() from dplyr to turn every column that starts with X into 0/1:

library(dplyr)

# !is.na(.) is TRUE where a value is present; unary + coerces TRUE/FALSE to 1/0
df2 <- df1 %>%
    mutate(across(starts_with('X'), ~ +(!is.na(.))))
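
One caveat: is.na() only catches cells that are actually NA. If the blanks were read in as empty strings (for example, read.csv() without na.strings = "" leaves blank character fields as ""), they would be counted as 1. Below is a minimal base R sketch that treats both NA and blank strings as 0. The toy data frame df and its values are made up for illustration and are not the asker's real data.

```r
# Toy data mirroring the question's layout (values are illustrative only).
df <- data.frame(
  G = c("add", "fgr", "dfr"),
  X4646466.555 = c("bp_ggfgfgg", "bb_jhfjfjf", ""),
  X564737373.455 = c(NA, "bpp_fhfhfhf", ""),
  stringsAsFactors = FALSE
)

# Logical index of the columns whose names start with "X".
x_cols <- startsWith(names(df), "X")

# 1 where a non-empty string is present, 0 where the cell is NA or blank.
df[x_cols] <- lapply(df[x_cols], function(col) {
  as.integer(!is.na(col) & trimws(col) != "")
})

print(df)
```

If the data really contains NA everywhere a value is missing, the shorter dplyr version is enough; the extra trimws() check only matters when blanks survive as "" or whitespace.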


1

We could use mutate across with case_when:

library(dplyr)
df %>% 
    dplyr::mutate(across(starts_with("X"), ~case_when(is.na(.) ~ 0,
                                                TRUE ~ 1)))
# A tibble: 7 x 5
  G     X4646466.555 X564737373.455 X737347474.56    X5
  <chr>        <dbl>          <dbl>         <dbl> <dbl>
1 add              1              0             0     0
2 fgr              1              1             1     0
3 dfr              0              0             0     0
4 tth              1              0             0     0
5 dee              1              0             0     0
6 ee               1              0             0     0
7 ff               1              1             0     0

Or, using modify() from purrr (loaded with the tidyverse):

library(tidyverse)
df1 <- df[,-1] %>% 
    modify(~ ifelse(is.na(.), 0,1))
    
cbind(df[,1],df1)
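
Since the stated goal is a binary matrix rather than a data frame, the 0/1 columns can be converted at the end. Here is a small base R sketch, assuming the G labels should become row names. The object names bin and m and the toy values are hypothetical.

```r
# Toy 0/1 data frame shaped like the result above (values are illustrative).
bin <- data.frame(
  G = c("add", "fgr", "dfr"),
  X4646466.555 = c(1, 1, 0),
  X564737373.455 = c(0, 1, 0)
)

# Drop the label column, convert the remaining columns to a numeric matrix,
# and keep the G labels as row names.
m <- as.matrix(bin[, -1])
rownames(m) <- bin$G

print(dim(m))
print(is.numeric(m))
```

Leaving G inside the matrix would coerce every cell to character, which is why it is moved to the row names here.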
