I want to create boxplots comparing the analyte concentrations, grouping the samples by which donor they came from (D1 to D4), which virus they contained (VEH, HCV, or HIV), and whether or not they were incubated with CO2 (+ or - CO2), ALL of which can be determined from the sample name. For example, the first sample, D1VEH+CO2, came from Donor 1, had the virus "VEH" (which technically isn't a virus, but that's beside the point), and was incubated with CO2. I don't have to do all of these at once - I'll create a series of different boxplots. The thing I'm struggling with is isolating the different groups within the mappings. For example, see the command below:

ggplot(data = df, mapping = aes(x = AnalyteSample, y = A)) + geom_boxplot()

Now this gives me many boxplots of ALL the samples. What if I only want the boxplots of the samples containing the virus HIV? How do I filter the AnalyteSample column within a ggplot command?

structure(list(AnalyteSample = c("D1VEH+CO2", "D1HCV+CO2", "D1VEH-CO2", 
"D1HCV-CO2", "D2VEH+CO2", "D2HCV+CO2", "D2VEH-CO2", "D2HCV-CO2", 
"D3VEH+CO2", "D3HCV+CO2", "D3VEH-CO2", "D3HCV-CO2", "D4VEH+CO2", 
"D4VEH-CO2"), A = c("4190", "6665", "7435", "2052", "783", "322", 
"199", "90", "46", "17", "8", "3", "3", NA), B = c("11569", "6677", 
"3852", "983.88", "589", "359", "203", "68", "33", "12", "6", 
NA, "4", NA), C = c("20453", "7699", "2499", "707.98", "412", 
"328", "156", "88", "39", "27", "17", NA, NA, NA), D = c("7893", 
NA, "1623", "685.64", "321", "644", "112", "65", "35", "29", 
"9", "5", NA, NA), E = c("320", "15444", "2049", "1065", "389", 
"365", "145", "77", "38", "16", "9", "6", NA, NA), F = c("7438", 
NA, "3472", "1057", "563", "401", "167", "89", "46", "19", "6", 
NA, NA, NA), G = c(7345, 9001, 2473, 1138, 516, 403, 134, 81, 
37, 17, 8, 6, 4, 3), H = c("9004", "3998", "2299", "964.88", 
"499", "341", "112", "88", "39", "32", NA, NA, NA, NA), I = c("8434", 
"8700", "2217", "1263", "567", "352", "153", "80", "43", "18", 
"9", "2", "3", NA), J = c("7734", "6733", "2092", "1115", "637", 
"332", "155", "82", "37", "17", "10", "4", "1", NA), K = c(NA, 
NA, "2118", "862.13", "426", "355", "143", "78", "44", "22", 
"11", NA, NA, NA), L = c(6345, 7688, 2311, 1195, 647, 366, 177, 
83, 41, 20, 8, 6, 3, 2), M = c("4222", NA, "1846", "814.61", 
"422", "314", "154", "86", "41", "27", "21", NA, NA, NA), N = c("6773", 
"8934", "2381", "1221", "677", "356", "146", "89", "40", "17", 
"10", "5", "2", NA), O = c(NA, NA, NA, "564.5", "226", "476", 
"111", "60", "32", "36", "18", NA, NA, NA)), row.names = c(NA, 
-14L), class = "data.frame")

1 Answer

It's far easier if you separate your AnalyteSample column into its component parts. (Thanks to Tjebo for pointing out this is better than using substring.)

library(ggplot2)
library(dplyr)

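# Split AnalyteSample after the 2nd and 5th characters, e.g. "D1" | "VEH" | "+CO2"
df %>% tidyr::separate(AnalyteSample, c("Donor", "Virus", "CO2"), sep = c(2, 5)) %>%
  # Analyte A is stored as character, so convert it before plotting
  ggplot(mapping = aes(x = Donor, y = as.numeric(A))) + 
  geom_boxplot() +
  # One panel per CO2 condition
  facet_grid(.~CO2)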
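
To see what the split produces before plotting anything, here is a minimal sketch on two sample names taken from the question's data. The data frame name `samples` is just made up for illustration. With a numeric `sep`, the values are character positions, so every character counts, including `+` and `-`.

```r
library(tidyr)

# Toy data frame with two sample names from the question (name `samples` is illustrative)
samples <- data.frame(AnalyteSample = c("D1VEH+CO2", "D2HCV-CO2"))

# Numeric `sep` values are positions: cut after the 2nd and after the 5th character
parts <- separate(samples, AnalyteSample,
                  into = c("Donor", "Virus", "CO2"),
                  sep = c(2, 5))

# Donor holds "D1"/"D2", Virus holds "VEH"/"HCV", CO2 holds "+CO2"/"-CO2"
parts
```

Note that the sign stays attached to the last piece, which is why the CO2 facets end up labelled `+CO2` and `-CO2`.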

# Same split as above, but with one panel per virus instead of per CO2 condition
df %>% tidyr::separate(AnalyteSample, c("Donor", "Virus", "CO2"), sep = c(2, 5)) %>%
  ggplot(mapping = aes(x = Donor, y = as.numeric(A))) + 
  geom_boxplot() +
  facet_grid(.~Virus)
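
If you only want the boxplots for a single virus, one option is to filter the rows after the split and before the data reaches `ggplot()`. The sketch below is only an illustration: `df_small` is a trimmed copy of the question's data (first eight rows, analyte A only), and it keeps `"HCV"` because the posted data contains no HIV samples. With the full data you would swap in whichever virus label you need.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Trimmed copy of the question's data: first eight rows, analyte A only
# (the name `df_small` is made up for this example)
df_small <- data.frame(
  AnalyteSample = c("D1VEH+CO2", "D1HCV+CO2", "D1VEH-CO2", "D1HCV-CO2",
                    "D2VEH+CO2", "D2HCV+CO2", "D2VEH-CO2", "D2HCV-CO2"),
  A = c("4190", "6665", "7435", "2052", "783", "322", "199", "90")
)

hcv_only <- df_small %>%
  # Split the sample name into its three parts by character position
  separate(AnalyteSample, c("Donor", "Virus", "CO2"), sep = c(2, 5)) %>%
  # Keep only the rows for the virus of interest
  filter(Virus == "HCV") %>%
  # The analyte column is stored as character, so convert it once here
  mutate(A = as.numeric(A))

# Boxplots per donor, one panel per CO2 condition, HCV samples only
p <- ggplot(hcv_only, aes(x = Donor, y = A)) +
  geom_boxplot() +
  facet_grid(. ~ CO2)
p
```

In this trimmed data each donor has a single HCV value per CO2 condition, so each box collapses to a line. If you would rather not split the column at all, `filter(grepl("HCV", AnalyteSample))` should select the same rows.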

6 Comments

your column split looks a bit too complicated maybe? Something like tidyr::separate would do a similar job with less code, I guess... ?
Thanks @Tjebo. As I was writing this I thought "it would be nice if tidyr::separate took numeric values to split on". I didn't realise it could! Now I know :)
What are the c(2, 5) for? Please excuse me if this is a silly question - I'm still new to R
@ReeNadeau these are the number of characters after which we split the string. "ABCDEFG" would split into "AB", "CDE", "FG"
Ohhh okay. Do underscores or dashes count as characters?