3

I have a list as follows:

yel <- list(c(1,3,5,7,9),
        c(1,2,6,9),
        c(2,4,5,6,7,8,9))

And I want to transform the list into a dataframe like the one below:

  1 2 3
1 1 1 0
2 0 1 1
3 1 0 0
4 0 0 1
5 1 0 1
6 0 1 1
7 1 0 1
8 0 0 1
9 1 1 1

To give an idea of how I arrived at that list: I have a dataframe with 2 columns, namely "id" and "text". The "text" column is a list of characters. I found the unique words in the character list and created the list "yel", where the first element represents the "id" which has "text1", the second element represents the "id" which has "text2", and so on. (An "id" in my dataset is, for example, 7170325.) Thank you very much in advance!

3 Answers

5

tabulate might be handy here:

setNames(data.frame(lapply(yel, tabulate)), seq_along(yel) )
#  1 2 3
#1 1 1 0
#2 0 1 1
#3 1 0 0
#4 0 0 1
#5 1 0 1
#6 0 1 1
#7 1 0 1
#8 0 0 1
#9 1 1 1
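
One caveat, in case it helps: tabulate sizes its result by the largest value in each vector, so list elements with different maxima give columns of different lengths, and data.frame will then either fail or recycle the shorter ones. A minimal sketch of a workaround, assuming the values are positive integers, is to fix nbins at the overall maximum (yel2, n, and res are names made up for this example):

```r
# Hypothetical list whose elements have different maxima (5, 9 and 7)
yel2 <- list(c(1, 3, 5),
             c(1, 2, 6, 9),
             c(2, 4, 7))

# Use the overall maximum as the number of bins for every element
n <- max(unlist(yel2))

# Each column now has exactly n rows, so data.frame() lines them up
res <- setNames(data.frame(lapply(yel2, tabulate, nbins = n)),
                seq_along(yel2))
res
```

With the original yel this makes no difference, since all three vectors already have a maximum of 9.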

3 Comments

More wizardry from a master.
@thelatemail Thank you for your response. I tried to input my data and got the following error: Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: In addition: Warning messages: 1: Unknown column 'i' 2: Unknown column 'i'
A similar alternative would be to use table to make the tabulation once: table(unlist(yel), rep(1:length(yel), lengths(yel)))
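
Since the question mentions that the real data holds words rather than numbers, here is a rough sketch of the table approach on character vectors. The words list is invented for illustration and is not the asker's data; as.data.frame.matrix is used only to turn the table into a plain data frame.

```r
# Hypothetical word lists standing in for the real "text" column
words <- list(c("apple", "pear"),
              c("pear", "plum", "fig"),
              c("apple"))

# One row per unique word, one column per list element
tab <- table(unlist(words), rep(seq_along(words), lengths(words)))

# Convert the two-way table into an ordinary data frame
df <- as.data.frame.matrix(tab)
df
```

Note that table counts occurrences, so a word repeated within one element would show a value greater than 1 rather than a 0/1 indicator.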
4

We can use mtabulate

library(qdapTools)
t(mtabulate(yel))
#   [,1] [,2] [,3]
#1    1    1    0
#2    0    1    1
#3    1    0    0
#4    0    0    1
#5    1    0    1
#6    0    1    1
#7    1    0    1
#8    0    0    1
#9    1    1    1

3 Comments

Thank you very much! This worked like a charm on my huge dataset.
Can we change the order of the rows? In this case it's sorted, but if I want it in a particular order, can I do that using mtabulate?
@ManishRanjan Yes, you can convert the list elements to factor and specify the levels in the order you want. Here I am using sample for illustration: t(mtabulate(lapply(yel, factor, levels = sample(1:9))))
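
For anyone without qdapTools, the same factor-levels idea can be sketched in base R. This is only an illustration, not what mtabulate does internally; ord is a made-up row order and m is a plain matrix rather than a data frame.

```r
yel <- list(c(1, 3, 5, 7, 9),
            c(1, 2, 6, 9),
            c(2, 4, 5, 6, 7, 8, 9))

# Hypothetical row order chosen for this example
ord <- c(9, 1, 5, 2, 3, 4, 6, 7, 8)

# Tabulate each element against the same factor levels, in the order of ord
m <- vapply(yel,
            function(x) as.vector(table(factor(x, levels = ord))),
            integer(length(ord)))

# Label the rows with the values they stand for
rownames(m) <- ord
m
```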
2

Obtain the maximum value in the list, which is the number of rows. Then, for each vector in the list, use %in% to check which of the values from 1 to that maximum are present in it. This gives a logical vector, which you can convert to numeric.

Incorporating comment from @thelatemail

setNames(as.data.frame(lapply(yel, function(x)
                as.numeric(1:max(unlist(yel)) %in% x))), 1:length(yel))
#  1 2 3
#1 1 1 0
#2 0 1 1
#3 1 0 0
#4 0 0 1
#5 1 0 1
#6 0 1 1
#7 1 0 1
#8 0 0 1
#9 1 1 1

3 Comments

Two minor points in addition to my +1 - max_num could just be max(unlist(yel)) and lapply for the second step would probably make more sense so you are not going list --> matrix --> data.frame but rather list --> data.frame.
@thelatemail,thank you! I edited my answer to include your suggestion
@d.b Thank you very much!
