I have a data.table / data frame with a list column. I would like to make a box or violin plot of the values, with one violin/box per row of my data set, but I can't figure out how.

Example:

library(data.table)
library(ggplot2)
test.dt <- data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3))
ggplot(data = test.dt, aes(x = as.factor(id), y = v1)) + geom_boxplot()

I get the following message:

Warning message: Computation failed in stat_boxplot(): 'x' must be atomic

So my guess is that I should somehow split the lists of values into rows. That is, the row with id a would become 3 rows (matching the length of the vector in v1), all with the same id, each holding one of the values.

First, I don't know how to transform the data.table this way; second, I don't know whether this would solve the problem at all.

2 Answers

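Indeed, you need to unnest your dataset before plotting, so that each value in v1 gets its own row:

library(tidyverse)

# cols = v1 names the list column; tidyr >= 1.0.0 warns if it is omitted
unnest(test.dt, cols = v1) %>% 
ggplot(data = ., aes(x = as.factor(id), y = v1)) + geom_boxplot()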
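
Since the question also mentions violin plots, here is a minimal sketch of the same idea with geom_violin(). It assumes tidyr 1.0.0 or later (for the cols argument), and the name long.dt is only illustrative. ggplot2 cannot estimate a density from a single value, so a row such as id c will typically be dropped from the violin layer with a warning; the extra geom_point() layer keeps it visible.

```r
library(data.table)
library(tidyr)
library(ggplot2)

test.dt <- data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3))

# One row per value: 3 rows for a, 5 for b, 1 for c
long.dt <- unnest(test.dt, cols = v1)

# One violin per id, with the raw points overlaid so that
# single-value groups still show up on the plot
p <- ggplot(long.dt, aes(x = as.factor(id), y = v1)) +
  geom_violin() +
  geom_point()
p
```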

4 Comments

You can also put everything in a dplyr chain.
The explanation was simpler in separate steps. But indeed, it's more elegant: unnest(test.dt) %>% ggplot(data = ., aes(x = as.factor(id), y = v1)) + geom_boxplot()
@Luminita Edited my question, thanks for pointing it out. I was trying to make a boxplot, indeed!
In case the aim is a histogram of v1 (a histogram is a univariate tool, as it plots the frequency of a variable), plot the unnested data: ggplot(data = unnest(test.dt, cols = v1), aes(x = v1)) + geom_histogram(). Otherwise, one may be interested in histograms by group (here, ids), which can be achieved with faceting: ggplot(data = unnest(test.dt, cols = v1), aes(x = v1)) + geom_histogram() + facet_wrap(~id)

I believe what you are looking for is the very handy unnest() function. The following code works:

library(data.table)
library(tidyverse)

test.dt <- data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3))
test.dt <- test.dt %>% unnest(cols = v1)  # name the list column; tidyr >= 1.0.0 warns if cols is omitted

ggplot(test.dt, aes(x = as.factor(id), y = v1)) + 
  geom_boxplot()

If you don't want to load the whole tidyverse, the unnest() function is from the tidyr package (which also re-exports the %>% pipe), and ggplot() is from ggplot2, so loading those two is enough.

This is what unnest() does with example data:

> data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3))
   id        v1
1:  a   1, 0,10
2:  b 1,2,3,4,5
3:  c         3
> data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3)) %>% unnest()
   id v1
1:  a  1
2:  a  0
3:  a 10
4:  b  1
5:  b  2
6:  b  3
7:  b  4
8:  b  5
9:  c  3
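
The same reshaping can also be sketched without tidyr, staying in data.table. This is an alternative to the unnest() call shown above, not something taken from it: rep() repeats each id once per element of its list, and unlist() flattens the values. The name long.dt is only illustrative.

```r
library(data.table)

test.dt <- data.table(id = c('a','b','c'), v1 = list(c(1,0,10),1:5,3))

# lengths(v1) is 3, 5, 1, so each id is repeated once per value in its list;
# unlist(v1) flattens the list column into a single numeric vector
long.dt <- test.dt[, .(id = rep(id, lengths(v1)), v1 = unlist(v1))]
```

long.dt then has one row per value and can be passed to the same ggplot() call as the unnested table.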
