1

I want to create a matrix of 0s and 1s from a vector in which each string contains the two names I want to map to the matrix. For example, if I have the following vector

vector_matrix <- c("A_B", "A_C", "B_C", "B_D", "C_D")

I would like to transform it into the following matrix

  A B C D
A 0 1 1 0
B 0 0 1 1
C 0 0 0 1
D 0 0 0 0

I am open to any suggestion, but a built-in function that handles this would be best. I am trying to do something very similar at a much larger scale, where the resulting matrix will have 25 million cells.

I would prefer R code, but a Python solution is fine too :)

Edit: So when I say "A_B", I want a "1" in row A, column B. It doesn't matter if it is the other way around (column A, row B).

Edit: I would like to have a matrix whose rownames and colnames are the letters.

2
  • Are the 1s and 0s chosen randomly or not? Commented Jun 28, 2021 at 14:05
  • If we have "A_B", I want a "1" in row A and column B. Commented Jun 28, 2021 at 14:09

2 Answers

3

Create a two-column data frame d from the data, compute the levels, convert each column of d to a factor with those levels, and finally run table. The second line sorts each row; it isn't needed for the input shown and could be omitted, but you might need it for other data if B_A is to be treated as A_B.

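# Split each "X_Y" string into a two-column data frame (V1, V2)
d <- read.table(text = vector_matrix, sep = "_")
# Sort within each row so that e.g. "B_A" is treated as "A_B" (optional here)
d[] <- t(apply(d, 1, sort))
# Give both columns the same factor levels so the table comes out square
tab <- table( lapply(d, factor, levels = levels(factor(unlist(d)))) )
tab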

giving this table:

   V2
V1  A B C D
  A 0 1 1 0
  B 0 0 1 1
  C 0 0 0 1
  D 0 0 0 0
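
For readers following along in Python, the sorting and level-collection steps described above can be sketched roughly as follows. This is only an illustration, assuming each string holds exactly two names separated by a single underscore; `normalize_pairs` is a hypothetical helper name and is not part of the R answer.

```python
def normalize_pairs(strings, sep="_"):
    """Split each "X_Y" string and sort the two names, so "B_A" becomes ("A", "B")."""
    pairs = []
    for s in strings:
        first, second = s.split(sep)  # assumes exactly one separator per string
        pairs.append(tuple(sorted((first, second))))
    return pairs


# Two entries are deliberately reversed to show the effect of sorting
vector_matrix = ["A_B", "C_A", "B_C", "D_B", "C_D"]
pairs = normalize_pairs(vector_matrix)

# The sorted set of all names plays the role of the shared factor levels
levels = sorted({name for pair in pairs for name in pair})
print(pairs)
print(levels)
```

Sorting each pair means every entry lands in the upper triangle of the matrix, which is what the R line with `sort` does for the data frame rows.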


heatmap(tab[nrow(tab):1, ], NA, NA, col = 2:3, symm = TRUE)

[screenshot: heatmap of tab]

library(igraph)
g <- graph_from_adjacency_matrix(tab, mode = "undirected")
plot(g)

[screenshot: plot of graph g]


7 Comments

Hi, I would like each letter to be treated as a column, and the same for rows.
Yes, but I want to obtain a matrix that can be manipulated afterward. This means I want that exact matrix assigned to a variable so that I can access each row and column, e.g. d[1,2] (row 1, column 2, which would return 1).
Okay, your solution works, but I am really interested in getting a matrix where rownames and colnames are the letters. My idea is to make a heatmap with that matrix, so I need it in that form.
Hi, sorry for not understanding from the start. Your answer was right from the very beginning; I was the one who didn't understand it. Thank you!
Have added heatmap and graph of edges and nodes to show how it can be used.
1

The following should work in Python. It splits the input data into two tuples of names, converts the characters to indexes and sets the corresponding entries of a matrix to 1. It assumes every name is a single uppercase letter.

import numpy as np

vector_matrix = ("A_B", "A_C", "B_C", "B_D", "C_D")

# Split the data into two tuples: row names and column names
rows, cols = zip(*(s.split("_") for s in vector_matrix))
print(rows, cols)
# ('A', 'A', 'B', 'B', 'C') ('B', 'C', 'C', 'D', 'D')

# Convert single uppercase letters to 0-based indexes ("A" -> 0, "B" -> 1, ...)
# With inspiration from: https://stackoverflow.com/a/5706787/10603874
row_idxs = np.array([ord(char) - ord("A") for char in rows])
col_idxs = np.array([ord(char) - ord("A") for char in cols])
print(row_idxs, col_idxs)
# [0 0 1 1 2] [1 2 2 3 3]

# Use one size for both axes so the matrix is square, as in the question
n = max(row_idxs.max(), col_idxs.max()) + 1
print(n)
# 4

mat = np.zeros((n, n), dtype=int)
mat[row_idxs, col_idxs] = 1
print(mat)
# [[0 1 1 0]
#  [0 0 1 1]
#  [0 0 0 1]
#  [0 0 0 0]]
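
The index conversion above relies on `ord`, so it only handles names that are single uppercase letters. A rough sketch of a more general variant, assuming arbitrary string names separated by one underscore, maps each distinct name to a position instead. The function name `adjacency_from_pairs` and the `uint8` dtype are choices made here for illustration, not part of the answer.

```python
import numpy as np


def adjacency_from_pairs(strings, sep="_"):
    """Return (labels, mat) with mat[i, j] == 1 for each "labels[i]_labels[j]" string."""
    pairs = [s.split(sep) for s in strings]  # assumes one separator per string
    labels = sorted({name for pair in pairs for name in pair})
    position = {name: i for i, name in enumerate(labels)}
    row_idxs = np.array([position[a] for a, _ in pairs])
    col_idxs = np.array([position[b] for _, b in pairs])
    # Square matrix; one byte per cell keeps 25 million cells at about 25 MB
    mat = np.zeros((len(labels), len(labels)), dtype=np.uint8)
    mat[row_idxs, col_idxs] = 1
    return labels, mat


labels, mat = adjacency_from_pairs(["A_B", "A_C", "B_C", "B_D", "C_D"])
position = {name: i for i, name in enumerate(labels)}
print(labels)
print(mat)
print(mat[position["A"], position["B"]])  # look up a cell by row and column name
```

If named rows and columns are needed, the result could be wrapped as `pandas.DataFrame(mat, index=labels, columns=labels)`, assuming pandas is available.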
